Breaking the Security Logjam

Log files are an important security resource containing a wealth of information.

Used effectively, they provide benefits like surveillance, attack detection, prosecution support and damage assessment, as they act like a security camera on your network. Recording events from servers, applications, routers and firewalls can catch troublemakers in the act and show historical trends to help prevent future misuse. Detailed, historical log data also complements and significantly raises the value of systems that centralize and process alerts.

Host and network intrusion detection systems (IDS) are infamous for their alert volume, which includes both legitimate alerts and false positives. Even if an IDS vendor had the perfect mechanism to reduce false positives to zero (don't hold your breath), most enterprises would still be overwhelmed by the volume of legitimate alerts. It is not uncommon for a large enterprise to generate terabytes of useful log data in a week.

The problem with logs is that there are so darn many of them. Issues with centralizing, normalizing and processing the huge volume of log data are what we call the security logjam. To break the logjam you have to understand and overcome these issues so that you can effectively leverage the valuable information within your logs.

Security information management products

Security information management (SIM) products, also known as security management consoles, are one way of dealing with the high alert volumes and false positives that distract administrators. SIMs aggregate data from various sources, including intrusion detection systems, firewalls, routers and servers, and filter the events to detect patterns of misuse.

Most SIMs offer event correlation that can tie multiple events together to detect suspicious patterns. SIMs are usually configured out of the box with policies for filtering, correlating and prioritizing alerts. These policies must be modified to address your enterprise-specific network traffic and business rules. To an internet service provider, a barrage of SSH connections from various IP addresses outside the company could be a sign of a healthy business. For a bank, though, it could be indicative of a problem.

Due to scalability issues, there is a limit to the number of incoming events that a SIM can process. This means many enterprises will be forced to filter events before even looking at them, and most will not maintain a history of security alerts or of the relevant network, host and application events. Lacking access to this historical data leaves enterprises with the following weaknesses:

Vulnerable to 'low and slow' long-term attacks. Sometimes, what real-time SIMs miss is far more important than what they catch. Once attackers break in, they will usually establish back doors that avoid most monitoring systems and return repeatedly to your network. They will both put the security of your data at risk and increase your liability as they use your network as a launching point for attacks into other companies and networks.
Poor damage assessment. An IDS is like a watchdog barking. Human operators still need to inspect individual computers on the network manually, looking for compromised databases, missing files or other changes. The dirty secret at many security operations centers is that operators often have little information with which to respond to the security events. Many times, they find archived information so cumbersome that they don't complete a thorough investigation.
No idea what's normal. Without detailed historical log data, it's impossible to accurately determine what's normal and what's anomalous on your network. Operators must often configure baselines and thresholds for their alerting systems by guesswork, setting them too high or too low and, as a result, missing important company-specific patterns that should be implemented into signatures for monitoring.
Poor behavioral analysis. A social engineering attack that results in a successful login with a legitimate account is not going to set off any alerts in an IDS. However, the tracks are still there and may be evident through behavioral analysis of database query logs. Regular reports on things like new account creations and database access, checked for usage outside normal business patterns, can call attention to an intruder. Unfortunately, this data exists in the detailed logs that are currently ignored due to volume.
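
The baseline problem described above can be illustrated with a small Python sketch. It assumes you have already extracted daily event counts (for example, new account creations) from archived logs. The function names, the three-standard-deviation rule and the sample counts are hypothetical choices made for illustration, not a method prescribed by any particular product.

```python
from statistics import mean, pstdev

def baseline_threshold(daily_counts, k=3.0):
    """Return mean + k standard deviations of the historical daily counts."""
    return mean(daily_counts) + k * pstdev(daily_counts)

def anomalous_days(history, recent, k=3.0):
    """Return (day, count) pairs in `recent` that exceed the threshold
    learned from `history`. Both arguments map a day label to a count."""
    limit = baseline_threshold(list(history.values()), k)
    return [(day, n) for day, n in sorted(recent.items()) if n > limit]

# Hypothetical counts of new account creations per day.
history = {"day-01": 4, "day-02": 6, "day-03": 5,
           "day-04": 5, "day-05": 4, "day-06": 6}
recent = {"day-07": 5, "day-08": 31}

print(anomalous_days(history, recent))
```

With more history, the threshold reflects your own network rather than a guess, which is the point of keeping the detailed data. A real deployment would likely need to account for weekly cycles and other seasonality that this sketch ignores.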

Many security managers are aware of these weaknesses, yet the sheer volume of event logs on modern enterprise networks leads them to filter or discard this essential information.

Detailed logs complement real-time monitors

A recent poll of information security professionals at Global 2000 firms found 80 percent of them consider more flexible reporting on their log data a number one security concern. Financial and government networks, which often have the most to lose to break-ins, are starting to discuss ways to use detailed log histories to better secure their operations. Logs are starting to be used as the critical security resource they are as opposed to the albatross they have been. Effective use of log analysis reaps the following benefits:

Improve SIM effectiveness using trending information. A detailed log archive lets you understand what is normal on your own network. By identifying historical trends, you can establish baselines against which to look for anomalies. Historical log analysis eliminates the guesswork from understanding your organization's specific requirements.
Detect long-term attacks. Intrusion detection systems watch for specific events at a point in time. Many attacks happen over days, weeks, or even months. Suspicious long-term patterns can be detected, such as multiple attempted logins with the same username on multiple computers, or an abnormal percentage of connections from a geographic area. Log data can be used to identify these suspicious trends.
Tip-off. You can't watch all of the people all of the time, but you can record their activities in your logs to identify those who may require closer monitoring, and prioritize their IP addresses or usernames.
Damage assessment. Intruders may have created back doors in your network or left other weaknesses. Damage assessment is about determining the extent of the compromise in an economical way. Consider that if you have 50 machines, and two of them are compromised, it will probably cost you more to determine if the other 48 are compromised than it will to recover from the two you've identified. Detailed log analysis can significantly reduce these costs and provide the information you need to contain and repair the damage.
Prosecution support. When you catch a hacker, logs provide the most detailed evidence possible of his or her activities. You'll need this information to prosecute your case against the culprit.
Manage high-volume traffic and attacks. An attack often causes a flood of events and alerts faster than they can be handled. Archiving events in detail lets you take a proactive approach to attacks, rather than simply reacting to the attacks.
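
One of the long-term patterns mentioned above, failed logins with the same username on multiple computers, could be checked against an archive roughly as follows. This is a minimal sketch: the event tuple layout, the "fail" outcome label, the host names and the three-host cutoff are all assumptions made for the example.

```python
from collections import defaultdict

def usernames_across_hosts(events, min_hosts=3):
    """Find usernames with failed logins on at least `min_hosts` distinct hosts.

    `events` is an iterable of (day, host, username, outcome) tuples,
    a hypothetical normalized form of archived login records.
    Returns {username: sorted list of hosts}.
    """
    hosts_by_user = defaultdict(set)
    for _day, host, user, outcome in events:
        if outcome == "fail":
            hosts_by_user[user].add(host)
    return {user: sorted(hosts)
            for user, hosts in hosts_by_user.items()
            if len(hosts) >= min_hosts}

# Hypothetical events spread over several weeks.
events = [
    ("day-01", "web01", "alice", "fail"),
    ("day-09", "db02", "alice", "fail"),
    ("day-23", "mail01", "alice", "fail"),
    ("day-23", "web01", "bob", "fail"),
    ("day-24", "web01", "bob", "ok"),
]

print(usernames_across_hosts(events))
```

Because the attempts are weeks apart, a monitor that only looks at a short time window would likely treat each one as an isolated failure. The pattern only appears when the whole history is available to scan.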

Finding and fixing the logjam

You'll recognize the logjam by three telltale signs. This is what they are and what you can do about them.

Are you throwing away data you want? In an ideal world, you would store every bit of log data and throw away nothing. While this may be an unrealistic goal, you may be able to capture and keep more of the event data from your network and its hosts than you do now. Reconsider the value of specific logs or fields within them, and look for ways to store them efficiently and cost-effectively.

Is accessing your logs so hard no one does it? When an alert comes up, do your staff investigate all available data for signs of intrusion or damage? Often, the information needed to know with certainty is on one or more servers, but digging it up and sifting through it is too complicated and time-consuming. If your administrators must manually find, unpack and correlate differently formatted logs from different sources every time they need to create a report, they may look for reasons to avoid the task instead. Normalizing and centralizing logs will help alleviate this issue.
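
To make normalization concrete, here is a minimal Python sketch that maps two log formats onto one common record. Both formats are invented for the example (a syslog-like text line and a comma-separated firewall line), as are the field names and the assumption that all timestamps are in UTC. Real sources will differ.

```python
import re
from datetime import datetime, timezone

# Hypothetical format A: "<ISO time> <host> <app>: <message>"
SYSLOG_LIKE = re.compile(
    r"^(?P<time>\S+) (?P<host>\S+) (?P<app>[^:]+): (?P<msg>.*)$")

def parse_syslog_like(line):
    m = SYSLOG_LIKE.match(line)
    if m is None:
        return None
    try:
        when = datetime.fromisoformat(m["time"])
    except ValueError:
        return None
    if when.tzinfo is None:
        # Assumption: timestamps without a zone are in UTC.
        when = when.replace(tzinfo=timezone.utc)
    return {"time": when, "source": m["host"],
            "event": f"{m['app']}: {m['msg']}"}

# Hypothetical format B: "<epoch seconds>,<host>,<action>,<src ip>,<dst ip>"
def parse_firewall_csv(line):
    fields = line.strip().split(",")
    if len(fields) != 5:
        return None
    epoch, host, action, src_ip, dst_ip = fields
    try:
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    except ValueError:
        return None
    return {"time": when, "source": host,
            "event": f"{action} {src_ip} -> {dst_ip}"}

def normalize(line):
    """Try each known parser in turn; return a common record or None."""
    for parser in (parse_syslog_like, parse_firewall_csv):
        record = parser(line)
        if record is not None:
            return record
    return None

for raw in ["2003-04-01T12:00:05 web01 sshd: Failed login for alice",
            "86400,fw01,DENY,10.0.0.5,192.168.1.10"]:
    print(normalize(raw))
```

Once every source produces the same record shape, a report can sort and filter across servers, firewalls and applications without anyone unpacking formats by hand.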

Does it take a long time to run a report? Stuffing logs into a relational database is not a scalable solution. Because log data is read-only and sequential, there's usually no need to introduce the overhead of a relational database management system for it. Whether you use products or homegrown solutions, look for techniques that use effective and scalable methods.
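
As one illustration of treating logs as sequential, write-once data, the following Python sketch appends events to a compressed flat file and answers a query with a single sequential scan. It uses only the standard gzip module. The file name, the sample lines and the helper names are hypothetical, and this is a homegrown sketch rather than a description of how any product stores logs.

```python
import gzip
import os
import tempfile

def append_events(path, lines):
    """Append lines to a gzip archive; existing data is never rewritten."""
    with gzip.open(path, "at", encoding="utf-8") as f:
        for line in lines:
            f.write(line.rstrip("\n") + "\n")

def scan(path, predicate):
    """Stream the archive from start to end, yielding matching lines."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if predicate(line):
                yield line

# Hypothetical archive in a temporary directory.
archive = os.path.join(tempfile.mkdtemp(), "events.log.gz")
append_events(archive, ["fw01 DENY 10.0.0.5", "fw01 ALLOW 10.0.0.7"])
append_events(archive, ["fw02 DENY 10.0.0.9"])

denied = list(scan(archive, lambda line: "DENY" in line))
print(denied)
```

Each append adds a new compressed member to the end of the file, so writes stay cheap and the data stays compressed on disk. The trade-off is that every query reads the whole file, so a larger archive would probably be split by day or by source to limit how much each report has to scan.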

The first step to breaking the security logjam is an honest appraisal of which detailed log data would complement your SIM system. Then, you need to look for cost-effective ways to make that information available for reporting. Advanced compression techniques, plus the ever-falling cost of disk space and processor power, make this problem more manageable as time passes. As the log traffic on your network grows, so should your ability to take advantage of it - and break the logjam.

Mark Searle is co-founder and CEO of Addamark Technologies (www.addamark.com).