What arrives in our in-boxes these days is becoming progressively richer and fatter. The content includes HTML-formatted rich text, hyperlinks and attachments of various types, including Office documents, databases, images and videos. It is now estimated that more than 5 per cent of emails contain images.
Most companies that employ content security have an email/Internet policy document. Email users must read this document and agree to abide by its rules. Email misuse can be deliberate or accidental, so it is important to have the ability to enforce the rules in the policy to prevent a policy breach. The ability to analyze messages and detect unacceptable images is a very important part of this policy enforcement.
Pornography is a big issue in many parts of the world, but acceptance of pornography varies by region and culture. In the workplace it is unacceptable in most regions. Most companies would not find it acceptable for employees to bring pornographic magazines into work and start reading them at their desk during working hours. The same type of content is available via email and this should be treated with the same attitude.
There have been many high-profile cases involving pornography in email. There are often large numbers of people involved, simply because it is so easy to forward email to large groups of people. Terminating employees and/or suspending them on full pay, for a lengthy investigation, constitutes an enormous cost to business in terms of lost revenue and damage to reputation.
Pornography is not the only threat
There are hundreds of ‘joke image’ web sites available, with thousands of joke images, many of which could be considered blasphemous, racist, sexist, pornographic or otherwise offensive. Some office jokers would be easily tempted to click on the ‘email to a friend’ button on these sites, but not everyone has the same sense of humor. An image that is a joke to one person may well be offensive to another. After September 11, 2001, and the subsequent U.S. military action, a great number of ‘joke’ emails were circulated containing images of Osama bin Laden. Many people found these images offensive.
Mail can easily be misdirected; it is very easy to send the wrong message to the wrong person. Email clients may auto-complete the email address from address books for you and this is often the source of mistakes. Most email users have either sent or received a misdirected email at some point. Employing a policy-based content security solution can help reduce the risk of misdirected content.
Protecting the value of images
Images can contain confidential information. These images could be photographs from medical or legal records, confidential designs such as silicon chip designs, or the shape and styling of a new prototype car. It may be completely acceptable, or even necessary, for these images to circulate within a company, but it is so easy for them to be accidentally or deliberately forwarded to the wrong person outside the company. One of our customers in East Asia is a car manufacturer. The company’s new car designs were stored in the Clearswift software as images unacceptable for transmission out of the company. A company insider attempted to send the designs to a competitor and the image management software successfully prevented this.
Litigation related to content security is on the increase. In most cases the organization is responsible and liable for the actions of its employees in the workplace, including employees’ use of email and all the information transmitted from their systems. Legal issues can arise if an employee sends an image that depicts other companies or individuals in an unfavorable light.
It is easy to see how unacceptable images in email can lead to legal problems – for example, if employees receive emailed pornography and their employer has made no effort to prevent this. In many countries, companies have a legal responsibility to protect their employees from exposure to content threats, such as racist, sexist, pornographic and other offensive material.
Images can be subject to copyright. Forwarding an email that contains such an image could constitute copyright infringement. The issues of digital rights management are becoming better understood, and organizations will progressively seek to formalize the description, identification, trading, protection, monitoring and tracking of image assets, including the management of rights holders’ relationships.
Mishandling of images can slow the company down
Images add a considerable size overhead to email. A typical text email could be around 1KB. Adding one JPEG (typically about 40KB) to a small email could make the email over 40 times its original size. An email with five attached images could be more than 200 times larger than an email containing just text. If on average there is one image for every ten emails, then on these figures ten emails grow from about 10KB to about 50KB, roughly a fivefold increase in the volume of email traffic.
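The arithmetic is easy to rerun with your own averages. The short Python sketch below uses the paragraph's illustrative sizes (1KB of text, 40KB per JPEG); the function name and the defaults are assumptions for illustration, not measurements.

```python
TEXT_KB = 1    # typical text-only email (figure quoted in the article)
JPEG_KB = 40   # typical JPEG attachment (figure quoted in the article)

def size_multiplier(images_per_email, text_kb=TEXT_KB, jpeg_kb=JPEG_KB):
    """Size of an email with images, relative to the same email as text only.

    `images_per_email` may be fractional, so it can also express an average
    across all traffic (0.1 means one image for every ten emails).
    """
    return (text_kb + images_per_email * jpeg_kb) / text_kb

print(size_multiplier(1))    # one attached JPEG
print(size_multiplier(5))    # five attached JPEGs
print(size_multiplier(0.1))  # one image per ten emails, averaged over all mail
```

With these inputs the three calls give 41, 201 and 5 times the text-only size respectively. The real effect depends entirely on an organization's actual message and attachment sizes.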
This has implications for network bandwidth and storage. Emails with images take up more space on email servers, and copies of the same email can be stored many times on the same server. For example, a cartoon email received by one person may be forwarded to colleagues, and then on to others. It is not uncommon to see some joke emails multiple times from different sources – “Oh no, not that one again.” This is a waste of bandwidth and storage resources.
Advanced content security products should be able to extract images from reports, spreadsheets, presentations and many other types of documents. Documents are often distributed using email, both internally within an organization, and externally, to customers and partners. Many of these documents contain images. For content security software to be effective, it is essential that rich documents can be decomposed to extract the images within them. These images can then be passed to an image analysis component.
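As one illustration of decomposition, the newer Office file formats (.docx, .xlsx, .pptx) are ZIP archives that normally keep embedded pictures in a media folder. The Python sketch below pulls those entries out so that they could be handed to an analysis step. The function name and the list of extensions are assumptions made for this example; older binary formats, PDFs and nested attachments would need different handling.

```python
import io
import zipfile

# Extensions treated as images here; an illustrative list, not an exhaustive one.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")

def extract_embedded_images(document_bytes):
    """Return {archive_path: image_bytes} for pictures inside a ZIP-based Office file."""
    images = {}
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for name in archive.namelist():
            # Word, Excel and PowerPoint keep pictures under
            # word/media/, xl/media/ and ppt/media/ respectively.
            if "/media/" in name and name.lower().endswith(IMAGE_EXTENSIONS):
                images[name] = archive.read(name)
    return images
```

A caller would read the attachment into memory, pass the bytes to `extract_embedded_images`, and send each returned image on to whatever analysis component is in use.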
Reducing the incidence of ‘false positives’
Image recognition software will never be 100 per cent accurate when trying to detect ‘unknown’ porn images. Therefore there will be false positives (innocent images detected as pornography) and false negatives (pornography passing through undetected).
Both types of error are important to consider when looking at detection rates. False negatives mean that ‘unacceptable’ images are delivered; false positives mean that potentially business-critical mail is held up. From a business continuity point of view, false positives can have the bigger impact: business email could be held in a quarantine area because it has been incorrectly identified as containing pornography.
Some companies will wish to implement a policy temporarily or intermittently. A company introducing content security or refining its policy may start by simply monitoring to get an idea of what the current situation is. This monitoring policy may deliver all emails, but gather information that will help with the implementation of policy.
Pornography can be detected in images by examining a variety of image attributes such as shape, color, tone gradients, position and recognizable body parts. This type of image analysis is processor intensive. It is also possible to create signatures of ‘known images’ and use them to block known unacceptable images or pass known acceptable images. Signature matching of this kind is very fast.
Performing sophisticated analysis adds a processing overhead to every message containing images that can be processed. This overhead may affect other processes on the host and delay email. Combining image analysis with comparison against the hashes of known images is much more efficient. For example, known company logos can be added to the list of ‘acceptable images’ to avoid false positives and reduce the processing required.
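The ordering matters: check the cheap list of known images first, and run the expensive analysis only on images that are not on it. The Python sketch below shows that ordering. Using SHA-256 as the signature is an assumption made here for simplicity and is not necessarily how any product computes signatures; it matches byte-identical copies only, so a resized or re-encoded image would fall through to analysis. The `analyze` argument is a hypothetical stand-in for a real analysis engine.

```python
import hashlib

def fingerprint(image_bytes):
    """SHA-256 hex digest of the image; matches byte-identical copies only."""
    return hashlib.sha256(image_bytes).hexdigest()

def classify(image_bytes, acceptable, unacceptable, analyze):
    """Consult the known-image lists first; analyze only unknown images."""
    digest = fingerprint(image_bytes)
    if digest in unacceptable:
        return "unacceptable"
    if digest in acceptable:
        return "acceptable"
    # Unknown image: pay for the processor-intensive analysis.
    # `analyze` is caller-supplied and returns "acceptable" or "unacceptable".
    return analyze(image_bytes)
```

A company logo whose fingerprint has been placed in `acceptable` never reaches `analyze`, which removes both the processing cost and the chance of a false positive for that image.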
Keeping good images in and bad images out
Many companies may already have some form of content security, but few content security solutions have the ability to manage images. There are many options for implementing policy for images. These include monitoring images in email; deleting all inbound and outbound mail containing images; removing images from all email, but passing the email on without the images; manually validating all email containing images; manual validation with an ‘acceptable list’; and using an automatic process to validate images in emails based on analysis of the image content.
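To make one of these options concrete, the Python sketch below removes image parts from a message and leaves the rest to be passed on. It uses the standard library `email` package; the function name is an assumption for this example, and it handles only images carried as parts of a multipart message, not images embedded inside other documents or a message whose entire body is a single image.

```python
from email.message import EmailMessage

def strip_images(msg):
    """Remove every image/* part from `msg` in place; return how many were removed."""
    removed = 0
    if msg.is_multipart():
        kept = []
        for part in msg.get_payload():
            if part.get_content_maintype() == "image":
                removed += 1                    # drop this part
            else:
                removed += strip_images(part)   # recurse into nested multiparts
                kept.append(part)
        msg.set_payload(kept)
    return removed

# Usage: a text message with one attached image.
message = EmailMessage()
message["Subject"] = "Quarterly figures"
message.set_content("Report attached.")
message.add_attachment(b"not-a-real-png", maintype="image", subtype="png",
                       filename="chart.png")
removed_count = strip_images(message)
```

In practice a gateway would probably also add a note telling the recipient that images were removed, so that the missing content is not mistaken for a delivery fault.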
The ideal solution would combine monitoring, automatic analysis, manual validation and validation from ‘acceptable’ and ‘unacceptable’ lists. Automatic analysis performs most of the validation. This validation can be manually reviewed and corrected. The ‘acceptable’ and ‘unacceptable’ lists are used to correct the automatic analysis and reduce the processing and administration workload. Once an image has been manually validated, it can be permanently added to a pre-classification database (as ‘acceptable’ or ‘unacceptable’) and will not need to be manually validated again.
Problem images not identified during automatic analysis can be added to the ‘unacceptable’ database, as can images that are confidential or intellectual property. The database will contain only the image fingerprints, not the images themselves. Once a problem image is found, it should be very easy to block emails containing that image. Fingerprints can be added from monitoring and, over time, the pre-classification database will contain many image fingerprints, making the solution accurate and efficient, and reducing costs.
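A pre-classification database of this kind can be pictured as a simple mapping from fingerprint to verdict. The Python sketch below is a minimal in-memory version: the class and method names are hypothetical, and SHA-256 is an assumed fingerprint that recognizes exact copies only. A reviewer's decision is recorded once, only the digest is kept, and later lookups answer without further review.

```python
import hashlib

class PreClassificationStore:
    """Maps image fingerprints to a verdict; the image bytes are never stored."""

    VERDICTS = ("acceptable", "unacceptable")

    def __init__(self):
        self._verdicts = {}  # hex digest -> verdict

    @staticmethod
    def _fingerprint(image_bytes):
        return hashlib.sha256(image_bytes).hexdigest()

    def record(self, image_bytes, verdict):
        """Store a reviewer's decision so the image need not be reviewed again."""
        if verdict not in self.VERDICTS:
            raise ValueError(f"unknown verdict: {verdict!r}")
        self._verdicts[self._fingerprint(image_bytes)] = verdict

    def lookup(self, image_bytes):
        """Return the stored verdict, or None if the image still needs review."""
        return self._verdicts.get(self._fingerprint(image_bytes))
```

A confidential design would be registered with `record(design_bytes, "unacceptable")`; any outbound message carrying an identical copy could then be stopped on the strength of `lookup` alone.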
Paul Rutherford is chief marketing officer for Clearswift (www.clearswift.com).