Why should anti-virus products employ heuristic detection?

Once upon a time, in a galaxy not so far away, there were only a few computer viruses. Little more than a curiosity to most, they were relatively rare, slow-spreading and, in general, more of a mythological threat than a real one. The initial anti-virus products created to deal with them were perfectly adequate; with updates once a month or so, customers could be fairly sure that they would be protected. Fast-forward to today’s interconnected and always-on world, and the scale of the problem is orders of magnitude above where it was back in those early days.

New viruses can infect hundreds or thousands of machines within hours or, in cases such as the SQL Slammer worm, mere minutes.

Traditionally, anti-virus programs are reactive; they work on the basis of scanning for things they already know about. They can't detect a new virus until they have been given a "signature" for it. AV companies work as fast as possible, but inevitably the process of providing these updated signatures can take anything from a couple of hours to a couple of days. This delay period is generally referred to as the 'window of vulnerability' for obvious reasons.
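The reactive model described above can be illustrated with a minimal sketch. It assumes that a signature is simply a byte pattern searched for in a file; the signature names and patterns below are invented for illustration, and real products use far more sophisticated matching.

```python
# Hypothetical signature database: detection name -> byte pattern.
# These entries are made up for illustration; they are not real signatures.
SIGNATURES = {
    "Example.Worm.A": b"\xde\xad\xbe\xef\x13\x37",
    "Example.Virus.B": b"EXAMPLE-VIRUS-B-MARKER",
}


def signature_scan(data, signatures=SIGNATURES):
    """Return the name of the first signature found in data, or None."""
    for name, pattern in signatures.items():
        if pattern in data:
            return name
    return None


# A file containing a known pattern is identified by name.
known_file = b"header" + b"\xde\xad\xbe\xef\x13\x37" + b"payload"
print(signature_scan(known_file))

# A brand-new worm is invisible until a signature for it is shipped.
new_worm = b"header" + b"NEW-WORM-C-MARKER" + b"payload"
print(signature_scan(new_worm))

# After an "update", the same file is detected.
updated = dict(SIGNATURES, **{"Example.Worm.C": b"NEW-WORM-C-MARKER"})
print(signature_scan(new_worm, updated))
```

The gap between the second and third calls is the window of vulnerability: the file does not change, only the scanner's knowledge of it.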

The anti-virus user who is relying on a reactive, update-based system to detect such malware has to hope they're using a product that will detect everything within an hour of its discovery. There is usually a grace period of around two hours before a worm really starts to spread fast; this is the optimum time to update the product before the user is exposed to the worm. However, there's no guarantee that they won't be one of the unlucky ones who sees it first and gets infected within the first couple of hours.

Of course, with very aggressive viruses, the timeframe in which updates must be made available is even shorter, and for all of that time the window of vulnerability is wide open.

There are various sites which track virus outbreaks and statistics; one of the more comprehensive is Virus-radar (https://www.virus-radar.com). Such sites can be used to demonstrate the speed with which email-borne worms explode into the wild. With some worms, it has been possible to track the first trickle of infections, and then the flood of mail as critical mass is reached (usually around the two-hour mark).

Because closing the two-hour window of opportunity is so important, a really robust defence against attack means that users need something more than the traditional, reactive model of anti-virus scanners that depend on signature updates.

Enter the advanced heuristic scanner.

Heuristic analysis is a predictive method of malware evaluation; it attempts to decide what a piece of code is going to do, and based on that, makes a decision whether it is undesirable or not.

Heuristic methods have been around for quite some time, but the implementations in many cases have not been very successful or reliable. However, this is beginning to change, and there are now a small number of virus scanners that employ advanced heuristic methods for analysis of new files. The best of these use some sort of virtualisation, or 'sandboxing', technique. This allows the file being scanned to run in a secure environment which appears to be a regular PC. The scanner can then determine what the program is attempting to do, and make a decision as to whether it is a virus or not.
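The decision step can be sketched as a simple weighted score over the behaviours a sandbox reports. This is only an illustration under stated assumptions: the behaviour names, weights and threshold are hypothetical, and the sandbox itself is not modelled; the observed behaviours are simply passed in as a list.

```python
# Hypothetical weights for behaviours a sandbox might report.
# Names and values are assumptions for illustration only.
BEHAVIOUR_WEIGHTS = {
    "writes_to_system_directory": 3,
    "modifies_startup_entries": 3,
    "harvests_email_addresses": 4,
    "sends_mass_email": 5,
    "opens_network_listener": 2,
    "reads_own_config": 0,
}


def heuristic_score(observed, weights=BEHAVIOUR_WEIGHTS):
    """Sum the weights of the distinct behaviours observed; unknown ones count as 0."""
    return sum(weights.get(behaviour, 0) for behaviour in set(observed))


def is_suspicious(observed, threshold=8):
    """Flag the sample when its score reaches the (assumed) threshold."""
    return heuristic_score(observed) >= threshold


worm_like = ["harvests_email_addresses", "sends_mass_email", "modifies_startup_entries"]
benign = ["reads_own_config", "opens_network_listener"]

print(heuristic_score(worm_like), is_suspicious(worm_like))
print(heuristic_score(benign), is_suspicious(benign))
```

Note that no signature is consulted: the verdict depends only on what the program tried to do, which is why a previously unseen worm can still be caught.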

Advances in this kind of technology mean that the customer using a product combining a heuristic approach with signature-based scanning has far more robust protection against new viruses.

Well, that's the good news; most antivirus companies have woken up to the threat posed by today's rapidly spreading malware, and are implementing some form of heuristic detection method. However, good news or not, the need for signature-based detection is still far from over. There are two very important reasons for this. The first is that heuristic analysis is not an exact science; the very best products can detect around 90% of new viruses heuristically, while others detect a far smaller proportion, so it is imperative that updates continue to be released and anti-virus products are consistently updated.

The second reason is one that is often overlooked by those who predict the death of signature-based anti-virus (something that has happened on a regular basis since the inception of the antivirus industry). Signature-based detection is still usually required to precisely identify a virus or worm so that it can be cleaned correctly, without damaging the host system.

The lack of an exact identification method has always been a major disadvantage of a product that relies entirely on generic detection (as heuristic analysis necessarily does), especially if the system is compromised by malware that it did not detect. The nature of some modern worms is such that a heuristic approach to cleaning is becoming possible, but in many cases, it remains important to have exact identification.

Another disadvantage of the heuristic approach is the increased possibility of creating "false positives", that is, deciding that a legitimate file is malware. This can have a range of negative effects, from simple annoyance right through to system instability or, in the worst case, data loss. In most cases, less aggressive modes of heuristics are used on the desktop, and the more paranoid settings are reserved for scanning email and web traffic.
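The trade-off between aggressiveness and false positives can be shown with a small sketch. It assumes each file has already been given a numeric suspicion score by some heuristic engine; the scores, the two profile names and their thresholds are all hypothetical values chosen for illustration.

```python
# Hypothetical thresholds: a lower value means a more aggressive ("paranoid") setting.
PROFILES = {
    "desktop": 10,  # conservative, to avoid breaking legitimate software
    "gateway": 5,   # aggressive, for email and web traffic
}

# Hypothetical samples: name -> (suspicion score, actually malicious?)
SAMPLES = {
    "legitimate installer": (6, False),
    "mass-mailing worm": (12, True),
    "stealthy dropper": (7, True),
}


def verdict(score, profile):
    """Return 'block' if the score reaches the profile's threshold, else 'allow'."""
    return "block" if score >= PROFILES[profile] else "allow"


def count_errors(profile):
    """Return (false_positives, missed_malware) for a profile over SAMPLES."""
    false_positives = missed = 0
    for score, malicious in SAMPLES.values():
        blocked = verdict(score, profile) == "block"
        if blocked and not malicious:
            false_positives += 1
        elif malicious and not blocked:
            missed += 1
    return false_positives, missed


for profile in PROFILES:
    print(profile, count_errors(profile))
```

With these made-up numbers, the aggressive gateway profile catches the stealthy dropper but also blocks the legitimate installer, while the conservative desktop profile does the reverse.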

In many cases, heuristic scanning can have a significant effect on system performance. Because the most advanced heuristics-based anti-virus programs use some sort of virtualisation or sandboxing technique, there is a significant overhead in terms of performance. However, given the constant increase in speed and power of modern systems, this performance hit becomes less and less important, particularly when using a well-optimised, high-performance scanner.

It may be that the traditional anti-virus scanner, based only on signature updates, will eventually disappear completely in favour of products that have strong heuristic engines, but anti-virus companies will still be required to provide signature-based detection for the foreseeable future. It seems most likely that heuristics-based scanners will be used more and more to protect the entry points (email, web, FTP, IRC, IM, etc.) to help prevent the initial infiltration, and that the traditional scanner will remain firmly in place as the last bastion of good desktop protection.

Andrew Lee is CTO at Eset Software