All along the watchtower
Most information security teams run an abundance of security detection tools. A recent report by the Cloud Security Alliance and SkyHigh Networks says that 50% of organizations are using six or more security tools that generate constant alerts. Between antivirus, firewalls, IDS/IPS, SIEM, endpoint detection, and much more, there’s a lot to manage on a daily basis. No one wants to miss an important alert and end up as the next Target, but the consequence of so many tools is dealing with an overwhelming number of alerts, and ultimately, alert fatigue.
Security teams can see hundreds or thousands of alerts per day (tens of thousands per week, millions per year), and all but the best-resourced companies have security staff pulling double duty, meaning that triaging alerts is not their sole responsibility. Even companies that are able to staff a full team of dedicated analysts are unlikely to employ enough experienced people to look into every alert. Depending on which security survey you read, between 60% and 80% of alerts are false positives, and of the remaining 20% to 40% that may be indicative of a problem, about 40% lack the context that allows analysts to make informed, actionable decisions or recommendations.
There must be some kind of way outta here
The result of this abundance of tools and their accompanying alerts, so many of which are known-benign, is desensitization. There are so many “needs attention” tasks on the average security practitioner’s plate that we’ve become somewhat apathetic when anything but a “critical” alert is triggered. Even among “high” or “severe” alerts, it’s not uncommon for one to be missed or dismissed by a busy security team. A 2016 article from Hexadite says that “only 1% of ‘severe’ security alerts are ever investigated,” and a 2014 post from Carbon Black (we’re going back into the archives a bit here) says that “missed alerts are normal and happen to everyone.” Normal and happen to everyone? Maybe true, yet when that missed or skipped alert becomes a live incident, very few management teams are likely to say, “Oh well. It happens to everyone.”
What are security teams to do, then, when the output from the profusion of detection tools becomes prohibitive? No one wants to be the person who fails to act on a critical alert, but hiring more bodies to manage alerts is not necessarily (or feasibly) the answer.
Said the joker to the thief
Automation and Orchestration
With plenty of buzz around machine learning and artificial intelligence, some professionals are starting to worry that “outsourcing” job responsibilities to technology is a risk they don’t want to take. When it comes to triaging alerts, however, automation and orchestration are no threat. Some of the next-gen detection tools on the market allow security analysts to parse vast amounts of data quickly, but the ultimate analysis portion still needs to be done by a human.
Manually collecting, categorizing, and correlating data from sometimes-disparate systems is an undertaking no security pro wants, and given the amount of data churning from those systems, very few teams can realistically keep up with what they’ve got. For the sake of efficiency, the various tools should be connected to reduce alert redundancy and streamline detection, response, and remediation. Doing so allows the security and/or operations team to simplify tool management (and the corresponding alerts). Once orchestration is achieved, automating alert triage based on company-specific rules frees up analyst time, allowing team members to focus on top-tier alerts and, without significant worry, ignore those that do not bubble to the top.
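As a rough illustration of what rule-based triage can look like once alerts from different tools land in one place, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Alert` fields, the four-level severity scale, the idea that two alerts are duplicates when they share a rule and a host, and the rule that alerts on business-critical hosts always reach an analyst. A real deployment would encode its own company-specific rules.

```python
from dataclasses import dataclass

# Hypothetical severity scale; real tools each use their own.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Alert:
    source: str    # tool that raised the alert, e.g. "ids" or "edr"
    rule: str      # detection rule or signature name
    host: str      # affected asset
    severity: str  # one of the keys in SEVERITY_RANK


def deduplicate(alerts):
    """Collapse alerts that share a rule and host, keeping the highest severity."""
    best = {}
    for alert in alerts:
        key = (alert.rule, alert.host)
        current = best.get(key)
        if current is None or SEVERITY_RANK[alert.severity] > SEVERITY_RANK[current.severity]:
            best[key] = alert
    return list(best.values())


def triage(alerts, critical_hosts, min_severity="high"):
    """Split alerts into an analyst queue and a log-only pile.

    An alert is queued if it meets the severity floor or touches a host
    the organization has flagged as business-critical (assumed rule).
    """
    queue, log_only = [], []
    for alert in deduplicate(alerts):
        escalate = (
            SEVERITY_RANK[alert.severity] >= SEVERITY_RANK[min_severity]
            or alert.host in critical_hosts
        )
        (queue if escalate else log_only).append(alert)
    # Most severe alerts first, so analysts start at the top.
    queue.sort(key=lambda a: SEVERITY_RANK[a.severity], reverse=True)
    return queue, log_only
```

The point of the sketch is the shape of the workflow, not the specific rules: redundant alerts are merged first, then a small set of explicit, reviewable rules decides what a human sees.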
Source Validation
It may sound overly simplistic, but operations and security teams need to know which sources of alerts are trustworthy. Commercial detection tools try to be a little bit of everything to everybody, producing bountiful alerts so that no customer can point a finger back at the vendor when a mega breach occurs because a critical alert wasn’t reported (at all or in a timely fashion). No tool aims to be overly noisy, but reducing the noise, and thus the number of false positives and unactionable alerts, requires security and ops teams to learn which sources are reliable and which ones produce larger amounts of less-useful information.
Needless to say, whatever tools security teams are buying “off the shelf” need to be configured to the organization’s specific environment. (Further, if the tool does not allow any customization, it should not pass the PoC stage.) Unfortunately, many security tools are purchased based on larger-than-life marketing promises and don’t receive the proper care and feeding that is mandatory for them to be effective in anything but the most elementary environment. When this happens, these tools are prone to throwing off copious alerts, adding to alert fatigue.
The solution is knowing one’s environment, understanding tool capabilities, and regularly tuning controls to match the desired outcomes of the organization.
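One way to learn which sources are reliable is to measure it from past investigations. The Python sketch below assumes a hypothetical export of closed alerts as `(source, was_true_positive)` pairs; the thresholds (`min_alerts`, `max_precision`) are illustrative defaults, not recommendations.

```python
from collections import Counter


def source_stats(dispositions):
    """Return {source: (alert_count, precision)} from closed alerts.

    `dispositions` is an iterable of (source, was_true_positive) pairs,
    a hypothetical export format from a ticketing or case system.
    """
    total, confirmed = Counter(), Counter()
    for source, was_true_positive in dispositions:
        total[source] += 1
        if was_true_positive:
            confirmed[source] += 1
    return {s: (total[s], confirmed[s] / total[s]) for s in total}


def tuning_candidates(stats, min_alerts=50, max_precision=0.2):
    """List sources with enough history whose precision is below the floor."""
    return sorted(
        source
        for source, (count, precision) in stats.items()
        if count >= min_alerts and precision < max_precision
    )
```

Sources that show up in `tuning_candidates` are the ones to tune first; sources with too little history are left out rather than judged on a handful of alerts.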
Baselining
Along the same lines as knowing your toolset, security teams have to understand what’s normal and what’s acceptable in the organization’s environment. For some companies, seeing a high volume of traffic from China or Russia in the middle of the night U.S. time would raise a red flag. For others that conduct legitimate business and/or have partners or employees in those regions, seeing one hundred connections at 2:00 AM ET is no big thing. Remember the case of the hacked Illinois water pump facility? This was a prime example of an organization jumping to conclusions without properly investigating an event or having additional data on hand to support or disprove the incident.
Baselines are about standard operating procedures, establishing usual or “normal” activity, and identifying when behavior becomes anomalous. Applying an industry standard or other type of framework to your environment won’t accomplish the task of proper baselining. The only way to have a baseline, and thus understand when an action is suspicious, is to develop baselines internally, then check and re-check them from time to time as situations and authorized parties change.
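To make the idea concrete, here is a minimal Python sketch of an internally developed baseline: it learns the typical number of connections for each hour of the day from the organization’s own history and flags an observation that strays too far from it. The input format, the use of a z-score, and the threshold of 3 standard deviations are all assumptions for the example; real baselines usually account for weekdays, seasonality, and more.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Map each hour of day to (mean, standard deviation) of past counts.

    `history` is {hour_of_day: [connection counts from previous days]};
    each list needs at least two observations for stdev() to work.
    """
    return {hour: (mean(counts), stdev(counts)) for hour, counts in history.items()}


def is_anomalous(baseline, hour, observed, threshold=3.0):
    """Flag a count more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline[hour]
    if sigma == 0:
        # No variation in the history: any different value stands out.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# An organization with overseas partners: ~100 connections at 2 AM is normal,
# while the same count at 2 PM is far outside its usual handful.
history = {2: [95, 100, 105, 98, 102], 14: [5, 4, 6, 5, 5]}
baseline = build_baseline(history)
print(is_anomalous(baseline, 2, 104))   # within the 2 AM norm
print(is_anomalous(baseline, 14, 100))  # far above the 2 PM norm
```

Rebuilding the baseline on a schedule is the code equivalent of checking and re-checking it as the environment changes.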
Contextualization
A frequent complaint from security pros is the lack of context included in alerts or threat data. Because the typical organization produces hundreds or thousands of alerts every day, checking into each one is impossible, but the desire to be diligent leads security/ops staff to feel overwhelmed. Alternatively, after chasing down a high number of red herrings based solely on a “severe” or “critical” categorization, security pros may become desensitized to alert ratings, leading them to ignore or put off investigation.
Alerts are just one piece of the security puzzle. Security teams need to enrich data with external sources (e.g., OSINT or HUMINT) and internal telemetry from other network detection tools. This information will color in the spaces where context is missing and allow security analysts to take the most appropriate actions and provide the best recommendations to the organization.
Orchestration and automation are a big piece of the contextualization puzzle: a “single pane of glass” helps analysts improve the signal-to-noise ratio by applying (some) enrichment before an alert is triggered and by consolidating workflow.
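A small Python sketch of the enrichment step might look like the following. The alert fields, the `asset_inventory` lookup (standing in for internal telemetry), and the `known_bad_ips` set (standing in for an external intelligence feed) are all hypothetical; the sketch only shows how context can be attached to an alert before an analyst sees it.

```python
def enrich(alert, asset_inventory, known_bad_ips):
    """Return a copy of `alert` with internal and external context attached.

    alert:            dict with at least "host"; "remote_ip" is optional
    asset_inventory:  {host: {"owner": str, "critical": bool}} (internal data)
    known_bad_ips:    set of IP strings from a threat feed (external data)
    """
    enriched = dict(alert)  # leave the original alert untouched
    asset = asset_inventory.get(alert["host"], {})
    enriched["owner"] = asset.get("owner", "unknown")
    enriched["business_critical"] = asset.get("critical", False)
    enriched["remote_ip_known_bad"] = alert.get("remote_ip") in known_bad_ips
    return enriched
```

With fields like these attached, a “severe” rating is no longer the only thing an analyst has to go on: the same alert reads very differently on a critical database talking to a known-bad address than on a test machine.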
There’s too much confusion
It’s unlikely that security teams are going to see fewer alerts over time: the amount of data organizations collect continues to grow, cybersecurity is slowly becoming a top-line business risk, and the threat landscape multiplies, seemingly day by day. Because very few organizations will be able to staff the SOC with hundreds of analysts, managing alerts requires a streamlined and strategic approach. Haphazardly responding to every “high” level alert is not tenable. However, orchestrating disparate tools and automating alert triage, learning your trustworthy sources, baselining network and authorized user behavior, and enriching alerts with contextual information will allow security and ops teams to handle critical alerts more seamlessly.