Battle of wills

Do you want to buy Vi@gra? Need to en1arge part of your anatomy? Anyone who uses email will be familiar with phrases like these, distorted to fool spam filters by marketers who want us to buy their products online.

Designing spam has become an art form, constantly evolving to beat filtering systems. The most basic way of doing this is to use "digit words" where numbers or symbols replace letters in the middle of a word. They are designed to fool lexical analysis tools that examine the word content of an email and recognize common "spam" terms.

However, as the most common digit words (such as Vi@gra) are now recognized by most lexical filters, foreign characters and accented letters are starting to appear in their place.
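To make the idea concrete, here is a minimal sketch of how a lexical filter might undo digit words and accented letters before matching. The substitution table, the term list and the function names are all hypothetical, chosen only for illustration; a real filter would need far broader coverage.

```python
import unicodedata

# Hypothetical look-alike table; real filters would need a much larger one.
LOOKALIKES = str.maketrans({"@": "a", "1": "l", "0": "o", "3": "e", "$": "s", "!": "i"})

# Illustrative term list only.
SPAM_TERMS = {"viagra", "enlarge"}


def normalize(word: str) -> str:
    """Fold accents and common digit/symbol substitutions back to plain letters."""
    # NFKD splits an accented letter into the base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", word)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Note: this also rewrites genuine digits (e.g. "10" becomes "lo"),
    # so the result is only suitable for matching, not for display.
    return stripped.lower().translate(LOOKALIKES)


def contains_spam_term(text: str) -> bool:
    """Return True if any whitespace-separated token contains a listed term."""
    return any(
        term in normalize(token)
        for token in text.split()
        for term in SPAM_TERMS
    )
```

The trade-off in this sketch is that aggressive folding catches more variants but can also mangle legitimate text, which is one reason such a table is always playing catch-up with new substitutions.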

More sophisticated (but still common) techniques include HTML obfuscation, which places HTML tags that are invisible to the reader in the middle of words. For example, if a spam filter is set up to recognize the word "enlargement," or digit words like it, a spammer might try to avoid this by inserting an HTML tag partway through the word. Once the email appears on the recipient's screen, the tag becomes invisible so the reader will only see the word "enlargement." This technique is used to fool lexical analysis filters.
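One possible countermeasure is to reduce the HTML to the text a reader would actually see before running any lexical check. The sketch below uses Python's standard `html.parser` module; the class and function names are hypothetical, and the empty pair of bold tags in the usage note is an assumption chosen for illustration, not a tag named in this article.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores tags and comments."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; tags themselves never reach here.
        self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the document text with all tags and comments removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)
```

With this in place, `visible_text("en<b></b>largement")` yields the single word the filter is looking for. The sketch does not treat script or style content specially, so it approximates what is displayed rather than reproducing it exactly.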

One of the techniques growing in popularity at the moment is "hash busting" – including text in emails that is not relevant to the email itself. This is designed to confuse Bayesian filters, which use statistical probability analysis to identify spam trends. Random groups of words or lines of text are added to the end of emails so that Bayesian filters struggle to identify spam patterns. Sometimes, this text is nearly invisible because it is written in a tiny font.
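A toy calculation shows why padding can work. The sketch below combines per-word spam probabilities in the style of a naive Bayes filter; the probabilities, the neutral value for unknown words and the function name are all invented for illustration, and it assumes the padding happens to include words the filter associates with legitimate mail.

```python
import math

# Hypothetical per-word spam probabilities, as if learned from training mail.
WORD_SPAM_PROB = {
    "viagra": 0.99,
    "winner": 0.95,
    "meeting": 0.05,
    "report": 0.10,
    "thanks": 0.08,
}

# Assumption: a word the filter has never seen counts as neutral evidence.
UNKNOWN = 0.5


def spam_score(text: str) -> float:
    """Combine per-word probabilities in log-odds space and return a 0..1 score."""
    log_odds = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word, UNKNOWN)
        # Each word adds its own evidence for (positive) or against (negative) spam.
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


plain = "viagra winner"
padded = plain + " meeting report thanks"
```

Under these made-up numbers, `spam_score(padded)` comes out well below `spam_score(plain)`, because each innocuous word pulls the combined evidence back toward legitimate mail.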

Shifting servers and domain names to host spam image content is also becoming more common. The content behind the URLs in a spam email is moved every couple of days to a new server that has a number of different domain names directed at it. This evades filters that blacklist URLs known to be used by spammers to host content, and it works whenever the blacklists are not constantly updated.
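The weakness is easy to see in a sketch of a host-based blacklist check. The host names below are placeholders under the reserved `.example` domain and the function name is hypothetical; the point is only that a lookup like this can match nothing it has not already been told about.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries; a real list would be updated continuously.
BLACKLISTED_HOSTS = {"spam-images.example"}


def is_blacklisted(url: str) -> bool:
    """Return True if the URL's host is a listed host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == bad or host.endswith("." + bad)
        for bad in BLACKLISTED_HOSTS
    )
```

Here `is_blacklisted("http://spam-images.example/a.gif")` is true, while the same image served from a freshly registered domain passes until someone adds that domain to the list.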

Similarly, spammers are starting to use automatic redirects. So, if you click on a URL link, you may find that you are redirected several times before finally reaching the destination website.
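A filter that wants to judge the final destination has to walk the whole chain. The sketch below does this offline against a dictionary standing in for the redirects a server would return; the function name, the hop limit and the URLs are assumptions for illustration, and a real implementation would have to make network requests instead.

```python
def resolve_chain(url: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow redirects from url and return every URL visited, in order."""
    chain = [url]
    # Stop when the current URL does not redirect or the hop limit is reached.
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in chain:
            # A redirect loop: stop rather than spin forever.
            break
        chain.append(nxt)
    return chain


# Hypothetical redirect table standing in for live server responses.
example_redirects = {
    "http://a.example/": "http://b.example/",
    "http://b.example/": "http://c.example/",
}
```

Calling `resolve_chain("http://a.example/", example_redirects)` returns all three URLs, so each hop, not just the one printed in the email, can be checked.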

Where does spam go from here? The key to fooling filters is to change patterns of behavior constantly. As soon as a pattern is established it can be tracked and blocked, so spammers are always looking for new ways to get through.

The most enduring method to date is word obfuscation (those digit words), which puts obscure characters in place of letters and is quick and easy to implement. These emails are the easiest for filters to spot, so those that do get through tend to be so distorted that they are almost impossible to read. New techniques already make greater use of dynamic HTML to create spam content.

The next step, assuming that the spam email has successfully evaded filters and been received at its destination, is for the spammer to persuade users to open the email. This is mostly done by appealing to natural curiosity, greed or insecurity – maybe this really is a once-in-a-lifetime chance of winning a million dollars, shedding those extra pounds or overcoming impotence.

What many people do not realize is that, by using "web bugs," a spammer can confirm a recipient's email address before the mail is opened. Displaying an email in the preview pane is enough to send a message back to the spammer confirming that the email address is in use – and once you are on a spammer's list, the emails will not stop coming.
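A web bug is commonly a tiny remote image whose URL identifies the recipient, so that merely fetching it tells the sender the message was displayed. The sketch below flags such images with Python's standard `html.parser`; the class and function names are hypothetical, and the one-pixel heuristic is an assumption that will miss bugs disguised in other ways.

```python
from html.parser import HTMLParser


class WebBugFinder(HTMLParser):
    """Collects the URLs of remote images declared as 0 or 1 pixel in size."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        # Heuristic (assumption): both dimensions declared as 0 or 1 pixel.
        tiny = (
            attributes.get("width") in ("0", "1")
            and attributes.get("height") in ("0", "1")
        )
        # Only images fetched from a remote server can report back to the sender.
        remote = src.lower().startswith(("http://", "https://"))
        if remote and tiny:
            self.suspects.append(src)


def find_web_bugs(html: str) -> list[str]:
    """Return the src URLs of images that look like tracking pixels."""
    finder = WebBugFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects
```

A mail client that declines to load remote images by default sidesteps the problem without needing any such detection.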
