Taming your "unknown unknowns" through network traffic analysis

Former Secretary of Defense Donald Rumsfeld once remarked, "There are also unknown unknowns -- the ones we don't know we don't know."

Although Rumsfeld was not talking about securing a network, the point is still a valid one. Nowadays, most organizations have perimeter defenses and some form of network monitoring. Unfortunately, many organizations look for traffic on their network using only static, signature-based alerts. 

For example, many organizations run intrusion detection systems (IDS) or intrusion prevention systems (IPS). Unfortunately, an IDS or IPS looks only for the traffic it already knows about or that we have told it to look for.

In other words, organizations are looking for known threats that might be on their network -- "known knowns." This is extremely important and must be done, but how can an organization find the threats to its network that it doesn't know it doesn't know about -- the unknown unknowns?

Network traffic analysis is one way of getting a handle on this question. Studying and analyzing the traffic transiting the network can help us better understand the network, which, in turn, can help us identify unknown unknowns and turn them into known knowns.

I consider network traffic analysis to be a toolbox of methods that together can be used to understand the traffic transiting a network. The approach I use is similar to the Unix philosophy -- a number of simple tools that together can be used to accomplish some very powerful things. 

In the Unix world, one can pipe together two or more simple commands to produce a more sophisticated output. For example, by piping ls -1 to wc -l [ls -1 | wc -l], one can very easily count the number of files in a directory. This is a simple example of how combining two simple tools can create a more powerful tool that can be used in scripts and elsewhere.

Similarly, in the network traffic analysis world, one can pipe together two or more simple queries to produce a more sophisticated query that is more likely to yield analytically actionable results. For example, instead of searching network flow data for just an IP address, one can search that same network flow data for an IP address communicating on unusual/unexpected protocols, which will yield more interesting and actionable results. 
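As a rough illustration of chaining simple queries, here is a minimal Python sketch. Everything in it is an assumption made for the example: the Flow record layout, the helper names (involving_ip, unexpected_protocol, pipeline), the list of expected protocol/port pairs, and the sample records, which use documentation-only IP addresses. It is not tied to any particular flow collection tool.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record (real flow data has many more fields)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str


FlowFilter = Callable[[Iterable[Flow]], Iterator[Flow]]


def involving_ip(ip: str) -> FlowFilter:
    """Simple query 1: keep flows where the address is the source or destination."""
    def apply(flows: Iterable[Flow]) -> Iterator[Flow]:
        return (f for f in flows if ip in (f.src_ip, f.dst_ip))
    return apply


def unexpected_protocol(expected: Set[Tuple[str, int]]) -> FlowFilter:
    """Simple query 2: keep flows whose (protocol, port) pair is not on the expected list."""
    def apply(flows: Iterable[Flow]) -> Iterator[Flow]:
        return (f for f in flows if (f.protocol, f.dst_port) not in expected)
    return apply


def pipeline(flows: Iterable[Flow], *filters: FlowFilter) -> List[Flow]:
    """Chain the filters left to right, much like a Unix pipe."""
    for flt in filters:
        flows = flt(flows)
    return list(flows)


# Assumed list of what is "expected" on this made-up network.
EXPECTED = {("tcp", 80), ("tcp", 443), ("udp", 53)}

# Made-up sample records for illustration only.
SAMPLE_FLOWS = [
    Flow("192.0.2.5", "203.0.113.10", 443, "tcp"),
    Flow("192.0.2.5", "198.51.100.7", 6667, "tcp"),
    Flow("192.0.2.9", "198.51.100.7", 6667, "tcp"),
    Flow("192.0.2.5", "192.0.2.1", 53, "udp"),
]

hits = pipeline(SAMPLE_FLOWS, involving_ip("192.0.2.5"), unexpected_protocol(EXPECTED))
for flow in hits:
    print(flow)
```

Each filter is deliberately trivial on its own. The analytic value comes from the combination, and further filters (time of day, byte counts, and so on) could be added to the chain in the same way.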

Since most networks are extremely complicated, I essentially treat the entire network as a black box. I seek to interrogate the data through queries specially crafted to exploit certain nuances of the data on a quest for the unknown unknowns. The point of this approach is to look at the black box that is the network from many angles -- which I call jumping-off points -- to ascertain the best picture of what is truly going on inside the box. 

Jumping-off points provide a tangible starting point for network monitoring professionals to latch onto suspicious activity and analyze it until ground truth is reached.

This tangible starting point makes all the difference in my experience. With this approach, an interesting phenomenon occurs. As time moves on, one begins to know the network through an iterative process of learning and interacting with the data. 

I've seen this occur firsthand in multiple security operations center (SOC) settings. Why is this important? Because once you understand what belongs on your network, you can begin to look for the opposite of that behavior.

Naturally, knowing what is normal for your network is easier said than done. Getting to a point where you're comfortable identifying what is normal and what is anomalous or abnormal can take some time and is an iterative learning process. 
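One very simplified way to picture "learn what is normal, then look for the opposite" is sketched below in Python. The (host, protocol, port) tuples, the min_count threshold, the function names, and the sample data are all assumptions made for illustration. A real baseline would draw on far more history and far more context than this.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical summary of a flow: (internal host, protocol, destination port).
Behavior = Tuple[str, str, int]


def build_baseline(history: Iterable[Behavior], min_count: int = 2) -> Set[Behavior]:
    """Treat behavior seen at least min_count times in the history as normal."""
    counts = Counter(history)
    return {behavior for behavior, n in counts.items() if n >= min_count}


def find_anomalies(baseline: Set[Behavior], observed: Iterable[Behavior]) -> List[Behavior]:
    """Return each observed behavior that is absent from the baseline, once, in first-seen order."""
    seen: Set[Behavior] = set()
    anomalies: List[Behavior] = []
    for behavior in observed:
        if behavior not in baseline and behavior not in seen:
            seen.add(behavior)
            anomalies.append(behavior)
    return anomalies


# Made-up history and new observations for illustration only.
history = [
    ("192.0.2.5", "tcp", 443),
    ("192.0.2.5", "tcp", 443),
    ("192.0.2.5", "udp", 53),
    ("192.0.2.5", "udp", 53),
    ("192.0.2.9", "tcp", 22),  # seen only once, so not yet considered normal
]
today = [
    ("192.0.2.5", "tcp", 443),
    ("192.0.2.5", "tcp", 6667),
    ("192.0.2.9", "tcp", 22),
    ("192.0.2.5", "tcp", 6667),
]

baseline = build_baseline(history)
anomalies = find_anomalies(baseline, today)
print(anomalies)
```

The flagged items are jumping-off points for an analyst rather than verdicts. As the analyst confirms what is benign, it can be folded back into the baseline, which mirrors the iterative learning process described above.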

I've been fortunate enough to have spent more than a decade analyzing traffic transiting real, live networks. That experience has led me to identify useful methods for analyzing vast quantities of network traffic data. I have found that an organized, well-structured approach to network traffic analysis goes a long way toward tackling what is a very difficult challenge.

Finding a way to make sense of the vast quantities of network traffic data collected on operational networks is a key component of a successful network monitoring program.

Why guess at what your network is actually doing?  Analyzing the traffic transiting your network can give you a firm and scientific understanding of what your network is up to. 


For more insights by Josh Goldfarb relating to analytical methods, check out his personal blog.