The Case for Behavioral Analysis

Behavioral detection relies on observing the execution of a program (the sample) and inferring malicious intent from those observations. This is usually done in a contained, instrumented environment such as a sandbox, but it can also occur during real-time execution on an actual endpoint host. Most network security solutions do not need to observe program execution on each endpoint they protect; they rely only on an embedded sandbox to extract breadcrumbs of execution, known as traces, while the program runs. Sandboxes come in several types: partial emulation, full system emulation, or true execution with binary translation. For the sake of simplicity, this article does not discuss the differences, since all of them end up with some number of traces that the solution needs to analyze.

Based on the observed traces of execution, a security solution applies any number of methods to infer malicious intent. The goal of most methods is to render a verdict: the sample is either safe or malicious. Some methods shy away from rendering a verdict and instead provide a probability of maliciousness, shifting the responsibility of setting a malware verdict threshold to the user.
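To make the difference between a verdict and a probability concrete, here is a minimal Python sketch. The names render_verdict and Verdict and the default threshold of 0.5 are assumptions chosen for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str    # "malicious" or "safe"
    score: float  # probability of maliciousness reported by the analysis


def render_verdict(score: float, threshold: float = 0.5) -> Verdict:
    """Turn a maliciousness probability into a binary verdict.

    The threshold is the knob that a probability-only method leaves
    to the user; 0.5 is an arbitrary illustrative default.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    label = "malicious" if score >= threshold else "safe"
    return Verdict(label, score)
```

With the same score of 0.7, a user who sets the threshold at 0.5 gets a malicious verdict, while a more conservative user who sets it at 0.9 gets a safe one.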

In contrast, static detection methods do not need to execute or simulate a sample to infer malicious intent. They extract static attributes from the sample and apply some analysis method to render a verdict. As with behavioral analysis, that method can be based on heuristic rules or on machine learning.

To appreciate the challenges each approach has to overcome to be successful, consider two examples.

The first example is armoring. Armoring is a set of techniques that malware authors implement to thwart automated behavioral analysis and thereby prevent the security solution from issuing a malware verdict. There is a wide range of such techniques. Some rely on timing or chance: the sample executes its malicious code only if the host computer has been up for some period of time, or a compromised website delivers malware to only one in a thousand visitors. Most malware, however, relies on detecting an analysis environment and then either does nothing (exits) or performs some harmless activity to throw the analysis off track. The quality of a behavioral detection solution depends heavily on how it counters these armoring techniques. This doesn't mean the malware needs to be duped into executing its malicious payload: the evasion attempt is itself a telltale behavior. After all, how often do you see a legitimate application try to detect the presence of a debugger and exit if it thinks one is present?
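The debugger example can be sketched as a simple heuristic over sandbox traces. This is only an illustration: it assumes a trace is an ordered list of API call names, and the function name looks_armored and the window size are hypothetical. The Windows API names are real, but a production rule set would cover far more checks.

```python
# Assumed trace format: an ordered list of API call names recorded by the sandbox.
DEBUGGER_CHECKS = {"IsDebuggerPresent", "CheckRemoteDebuggerPresent"}
EXIT_CALLS = {"ExitProcess", "TerminateProcess"}


def looks_armored(trace, window=3):
    """Return True if a debugger check is followed by a process exit
    within the next `window` calls of the trace."""
    for i, call in enumerate(trace):
        if call in DEBUGGER_CHECKS:
            following = trace[i + 1 : i + 1 + window]
            if any(c in EXIT_CALLS for c in following):
                return True
    return False


evasive = ["LoadLibraryA", "IsDebuggerPresent", "ExitProcess"]
ordinary = ["CreateFileW", "WriteFile", "CloseHandle", "ExitProcess"]
```

Here looks_armored(evasive) is true even though no payload ever ran, while looks_armored(ordinary) is false. The sample gives itself away by trying to hide.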

The second example is packing. This technique is implemented by malware authors to avoid detection by static analysis solutions. Packing is the process of packaging an executable payload inside another executable whose only role is to install the inner payload. This technique gives malware operators the ability to store the malware payload in a way that makes it virtually unique and hides all static attributes of the malware. One can no longer rely on hashes or string patterns to identify malware since each new variant is basically a new binary data blob. That said, recent research has shown that even in these cases, some attributes remain detectable by deep learning methods when packing does not involve encryption. Otherwise, all bets are off.
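The effect on hashes and string patterns can be shown with a toy example in Python. The single-byte XOR "packer" below is a deliberately simplistic stand-in invented for this illustration (real packers compress or encrypt and ship an unpacking stub), and the payload string is made up.

```python
import hashlib


def toy_pack(payload: bytes, key: int) -> bytes:
    """XOR every payload byte with a one-byte key and prepend the key.

    The prepended key stands in for the unpacking stub of a real packer.
    """
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes([key]) + bytes(b ^ key for b in payload)


def toy_unpack(packed: bytes) -> bytes:
    """Recover the original payload from a toy-packed blob."""
    key = packed[0]
    return bytes(b ^ key for b in packed[1:])


# Made-up payload containing a string a signature might look for.
payload = b"connect to evil.example.com and download module"
signature = b"evil.example.com"

variant_a = toy_pack(payload, 0x5A)
variant_b = toy_pack(payload, 0xC3)

hash_a = hashlib.sha256(variant_a).hexdigest()
hash_b = hashlib.sha256(variant_b).hexdigest()
```

The two variants have different hashes and neither contains the signature string, yet toy_unpack returns the identical payload for both. Changing one byte of the key is enough to produce a "new" sample.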

Some malware relies on known packers for which AV companies have developed unpackers (on the order of a few dozen packers), but more sophisticated threats use custom packers for which no known unpackers exist.

Most malware goes after the mass market: it's a numbers game. As long as the authors build a bigger botnet, or infect a large number of endpoints with ransomware or keyloggers, they couldn't care less whether the malware is detected by a behavioral analysis solution. Since behavioral analysis tends to be more expensive to implement and to operate, it is usually not present on endpoints, especially non-enterprise endpoints. This drives the vast majority of malware to employ techniques that defeat static analysis tools, while only the more sophisticated malware attempts to thwart behavioral analysis tools.

Case-in-Point – Real-Life Examples

Let's analyze how static antivirus (AV) engines and Cyphort's behavioral detection engine fare with some relatively well-known malware families.

We will use the Trojan Dynamer as an example. Dynamer has evolved from a run-of-the-mill Trojan into a fairly sophisticated banking Trojan. It is capable of downloading modules to update itself and perform new tasks.

Here is a table that shows the progression of AV engine detection on VirusTotal, from the first time a sample of this family is uploaded to the last scan some time later.

This example illustrates that, over several days, a pattern of detection emerges. As a new sample of this family is discovered in the wild, AV engines struggle to detect it on the first day, then catch up 24 hours later as they develop new signatures. But the very next day, a new sample is discovered and detection for it is again low. This cycle keeps repeating day after day.

On the flip side, a well-implemented behavioral analysis has no problem consistently detecting all these variants. This is because the behavior of each sample when executed in a sandbox remains largely the same. Malware authors have neither the interest nor the time to vastly change the behavior of a malware family every day, but it is trivial for them to change the packing technique or encryption parameters to create new samples that evade signature detection. So as long as the sandbox is well implemented and the analysis of its traces uses a solid method, detecting variants causes no problem.
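One simple way to picture why stable behavior makes variants easy to catch is to compare sets of trace events. The Python sketch below uses Jaccard similarity, which is an assumption chosen for illustration and not a description of how Cyphort's engine works. The event tuples are invented placeholders, not behavior observed from Dynamer.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of trace events (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty traces are trivially identical
    return len(a & b) / len(a | b)


# Invented placeholder events; a real sandbox would record far richer traces.
known_family = {
    ("file_write", "appdata_executable"),
    ("registry_set", "run_key"),
    ("process", "code_injection"),
    ("network", "http_post"),
}
# A repacked variant: same behavior plus one extra event.
new_variant = known_family | {("file_delete", "self")}
# An unrelated benign program.
benign_app = {("file_write", "documents"), ("network", "http_get")}

variant_score = jaccard(known_family, new_variant)
benign_score = jaccard(known_family, benign_app)
```

The repacked variant shares four of five events with the known family, so its score is 0.8, while the benign program shares none and scores 0.0. A new hash does not change what the sample does once it runs.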