Crying wolf: Combatting cybersecurity alert fatigue

Not only must security pros contend with ever-increasing attacks on their networks, they also must finagle the tool sets guarding their systems to make certain settings are as they should be, reports Greg Masters.

No wonder alert fatigue has become an unwelcome part of the mitigation process. With red lights constantly triggered – could be a legitimate intrusion, could be a false positive – IT security administrators, charged with keeping the data flowing without malware or any other pollutant getting into the operation, face a formidable obstacle.

At the top of the list is desensitization – with so many bells and whistles going off, how is one supposed to remain alert to what is truly necessary? The point is, security personnel can grow weary of the notices their equipment is throwing back at them, as the alarms go off so frequently that the humans monitoring the systems can only handle so much. In other words, so many of the notifications are set off by minor infractions that they lose urgency.

The results of an April 2017 study, "A Day in the Life of a Cyber Security Pro," an Enterprise Management Associates info brief written by David Monahan for Bay Dynamics, illustrate the challenge: Respondents identified that they have to deal with a large number of vulnerabilities in their organizations. On average, 10 vulnerabilities exist per system. In fact, nearly three-quarters of security teams stated they were overwhelmed by the volume of vulnerability maintenance work assigned to them.

And, when security teams were queried about contending with threat alerts, an even bigger percentage (79%) said they were overwhelmed by the volume.

One issue is that alerting systems, such as security information and event management (SIEM) systems, often don't come equipped with the data required for security pros to make informed decisions, the EMA study found. "This creates a situation where too many alerts are created, with the highest priority then requiring additional work by analysts to make a proper reprioritization."

Those queried for the survey said they have to manually reprioritize over half of the threat alerts they receive. Obviously, this creates more work and adds considerably to the stress factor, the report said.

Kevin Reid, VP of national security and CIO at KeyLogic Systems, says alert fatigue is like the boy who cried wolf: "If there are too many similar alerts that end up being empty threats, eventually IT security teams will just ignore the warnings," he says.

A great example of alert fatigue resulting in a widespread attack is the Target breach, he points out. "Leading up to this attack, the security team was consistently seeing the same, empty malware alerts, so they grew numb to the notifications and ignored the warning when there was a real intrusion."

Another challenge Reid sees is the bulk of data, which, he predicts, is only going to continue increasing. "Even if an alert isn't received, security teams get a report of abnormalities within their system that need to be analyzed. However, as the amount of data grows, so do these reports, making them more difficult to examine for threats."

Security teams can have all the best tools available, but if they aren't being implemented correctly, their networks are still at risk, Reid says.

Alert fatigue is the threshold at which it becomes too difficult for a security analyst to recognize the important alerts from the stream of everything that they receive, says Maxine Holt, principal analyst at the Information Security Forum (ISF). "Analysts must review each alert to decide if it really is suspicious or another ‘false positive,' when they all appear similar at first."

When magnified by multiple systems/software delivering alerts, it quickly becomes apparent that there are so many alerts it is difficult to see the ‘true positives' among the false ones, Holt says. "Furthermore, some alerts require aggregation before the combination can be confirmed as a ‘true positive' and potential business impact assessed."

When a security analyst suspects a true positive there are a number of processes to follow to initiate an information security incident, Holt adds. "Some analysts are concerned about making mistakes if they report too many alerts, leading to a potentially costly mistake if a true positive isn't dealt with in a timely manner."

For Dan Lohrmann, CSO and chief strategist at Security Mentor, alert fatigue challenges can be grouped into the traditional buckets of people, process and technology: The people part of the pie involves the long hours doing the same role and functions, he says. Along with that comes improperly trained staff with not enough experience or not knowing how to use tools.

The team issue begins with weak partnerships, where communication is lacking to/from/between all levels of management. Caffeine or drugs are sometimes used to help overcome tiredness or provide more attention, but these ‘solutions' can sometimes lead to health issues, Lohrmann says.

It all factors into becoming desensitized to important alerts causing an increase in response times or missing items.

With process issues, Lohrmann points to an improper distribution of workload and/or alerts, an improper categorization of specific alerts types, and problems with alert levels or classifications, such as too many high level alerts.

He says that the notification and escalation process is flawed. Proper help may not be available when needed or management may be called in too often – thus causing issues when real problems emerge.

As far as the technology piece, Lohrmann points to wrong tools (old legacy, too many alerts, not enough alerts, etc.), multiple tools that don't work together, and not a wide or specific enough view of data, threats, etc. That is, lacking is a national or global perspective that comes with Information Sharing & Analysis Centers (ISACs) and other global data trend information.

Tools today produce too many false positives or incidents for the system analysts to review, Lohrmann says. We use technology to process as many of these as possible without human intervention – such as looking at system logs and network alerts being reported to management consoles.

The problem for Sam McLane, head of security engineering at Arctic Wolf, is that many security operations operate under the 'work harder not smarter' mode, which, he says, is unsustainable.

"When you have too many alerts, the solution is not to work harder to get through them all. Eventually, your staff will be desensitized, and their diligence will wane."

Lenny Zeltser, vice president of products at Minerva, points out that the global shortage of IT security personnel results in many teams tasked with handling the alerts being understaffed and overworked. "This factor, combined with the overwhelming number of alerts that need to be handled on an ongoing basis, creates an imbalance. In turn, many important alerts go unnoticed or are disregarded even though they could be the indicator of an actual attack."

While most shops have tools to sense threats and alert security professionals – these alerts typically lack meaningful context to understand the impact or potential impact, says Druce MacFarlane, vice president of products and marketing for Bricata. In other words, he says, security is deluged by alerts that are often technically true, but largely irrelevant.

"This requires IT security to investigate such alerts, but the volume and vectors have grown beyond the finite resource of most organizations. Consequently, some alerts start to slip and go uninvestigated."

The Sony breach of 2015 demonstrated this challenge, MacFarlane points out. "While the tools were able to identify the malicious activity, those alerts were lost in a sea of 40,000 other alerts that same month. With a limited security staff, some malicious activity went uninvestigated until the inevitable happened."

A good security team will want to collect as much information as possible about the systems they protect, says Chris Simpson, academic program director, BS Cybersecurity, National University, School of Engineering and Computing. However, he adds, this is a double-edged sword: they collect more data than they require, so more resources are needed to understand the data.

"Alerting is used to bring important information to the attention of a security team," Simpson says. "Alert fatigue can occur when a system generates so many alerts that the operator can't prioritize or respond to all of the alerts. For example, an alert can be generated if a user has a failed login attempt, and when many of the alerts are false positive this causes the security team to miss valid alerts."

A recent survey conducted by the Cloud Security Alliance highlights the large number of alerts that organizations deal with, Simpson says. The survey noted that 2.7 billion events were generated by the average enterprise using cloud services. Of these events, 2,542 on average were anomalous, of which 23 were actual threats. The survey also noted that 32% of the respondents ignored alerts due to the large number of false positives.
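To put those survey figures in proportion, the short calculation below works out the ratios they imply. It is only arithmetic on the numbers quoted above; the variable names are illustrative and nothing here comes from the survey's own methodology.

```python
# Figures as quoted from the survey in the paragraph above.
total_events = 2_700_000_000   # events generated by the average enterprise
anomalous_events = 2_542       # events flagged as anomalous, on average
actual_threats = 23            # anomalous events that were actual threats

# Share of anomalous events that turned out to be actual threats.
threat_share = actual_threats / anomalous_events

# Anomalous events reviewed, on average, for each actual threat found.
anomalies_per_threat = anomalous_events / actual_threats

# Share of all events that were flagged as anomalous in the first place.
anomaly_share = anomalous_events / total_events

print(f"Threats among anomalous events: {threat_share:.1%}")
print(f"Anomalous events per actual threat: {anomalies_per_threat:.1f}")
print(f"Anomalous events among all events: {anomaly_share:.7%}")
```

In other words, fewer than one in a hundred anomalous events was a real threat, which helps explain why so many respondents said they ignore alerts.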

The Target data breach is an example of alert fatigue that allowed a data breach to go undetected, Simpson explains. "Target had the right technology in place and received valid alerts that malware was inside their system. However, because the system was new and they had received excessive alerts they were unable to properly handle the alerts. This affected millions of customers, cost Target millions of dollars and lowered consumer confidence."

Many assume the biggest and only challenge IT security personnel face in dealing with alert fatigue is the overlooked threats that get by among the sea of alerts, says May Wang, CTO and co-founder, ZingBox. "Unfortunately, that is not the only damaging result of alert fatigue. Due to the sheer volume of alerts, many IT staff are forced to define their own unique criteria of what's worth the time to investigate and what is not. An organization's exposure to specific threats can vary greatly from hour to hour based on the shift of the IT staff. It can also vary greatly across organizations even when they employ the same security solutions. The inconsistent security coverage resulting from this practice can often pose a bigger risk than the few threats that may get overlooked."

Unfortunately, alert fatigue will not go away any time soon, Wang says, adding that many organizations have come to expect false positives as a sign of comprehensive security coverage during proof of concept. "When presented with X number of threats, detection of anything less than X number of threats is frowned upon. However, detection of more than X number of threats, as long as the specific threats are detected, is often considered a successful evaluation. Some security vendors are leveraging this unfortunate misconception and very much focus on 'lighting it up' during product evaluations."

Tools in place

There are all kinds of tools available today, but it all comes down to having the right IT network monitoring, including security monitoring capabilities, fault management, configuration, performance and security management, says Reid. Finding the tools for your organization isn't the challenge, he explains. Many have all the necessary solutions, but are still at risk due to poor implementation.

"Parameters need to be set on all intrusion alert tools to get rid of false alarms and ensure real threats don't go unnoticed," Reid says. "This idea of a monitoring philosophy must be defined from the start, outlining thresholds and triggers so that security teams know alerts are actually alerts, not just some sort of vulnerability or malfunction within the tool."

Additionally, with the increase of data, businesses can't expect security teams to capture and monitor everything; there just isn't enough time or people to do it, Reid says. "With this in mind, security tools need to be updated to leverage AI that can support the security professionals in monitoring for accurate and real threats."

Holt advocates for security information and event management (SIEM) software that is already being used by many organizations to help. "This provides security analysts with a holistic view of alerts from multiple event logs," she says.

However, depending on reach, SIEMs can be costly to deploy and complex to operate and manage, she admits. "To combat this, some organizations use SIEM service providers, which are paid for on a usage basis. However, SIEM can still result in alert fatigue."

Although it may be beneficial to outsource the identification of potential attacks (i.e. review security alerts), few organizations outsource the investigation of these alerts, Holt says. "This is because outsource providers do not generally have the intimate, overarching view the organization has over its environment."

Lohrmann adds that there are a wide variety of vendor tools that can help categorize, prioritize and assist in dealing with alerts. Also, cloud-based tools can help in this process to administer alerts. Other orchestration tools can also help move toward the “holy grail” of “one pane of glass,” he says, which many vendors promise – but no one has truly delivered.

As more organizations invest in their security programs, deploying additional tools to bolster their layers of defense, they create more visibility to the security posture of their entire network, says Nathan Wenzler, chief security strategist at AsTech Consulting. "This, of course, is most commonly done through the huge volume of events and alerts which are generated by the activity detected from endpoints to network infrastructure devices to applications and user activities."

He too points to security information and event management (SIEM) tools, which, he says, were meant to aggregate the millions upon millions of events generated and allow security professionals and administrators a way to filter through the noise to find the most important, high-priority events that needed attention. "But, even these tools can struggle to bring only the most pertinent items up to the attention of those who need it, leading to huge volumes of alerts that must be reviewed and dealt with almost constantly."

Wenzler says that now a new wave of tools promises to fix this problem, too: behavioral analysis tools, machine learning tools, predictive analysis, and so on. "However, at the end of the day, what administrators and security pros need to focus on is getting a better handle on the basics of managing their environment to create a stronger baseline for what they expect to happen."

To illustrate his point, he offers the example of properly managing administrator credentials everywhere in the environment. This should, he says, create a scenario where very few of these types of credentials are ever used, are only used from authorized endpoints by authorized users and for specific activities. "If controlled to that level, it means that any other activity by an unexpected credential or user is going to be a valid event that needs more immediate attention, and fewer alerts need to be generated for analysis by whatever tools are in use."
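Wenzler's credential example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the event fields, the `AUTHORIZED` allow-list and the sample names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdminUse:
    """One observed use of an administrator credential (hypothetical schema)."""
    credential: str
    user: str
    endpoint: str
    activity: str


# Hypothetical allow-list: for each admin credential, the users, endpoints
# and activities that have been approved for it.
AUTHORIZED = {
    "dom-admin": {
        "users": {"alice"},
        "endpoints": {"jump-host-01"},
        "activities": {"patching", "account-reset"},
    },
}


def needs_attention(event: AdminUse) -> bool:
    """Return True when an admin credential is used outside its approved scope."""
    policy = AUTHORIZED.get(event.credential)
    if policy is None:
        return True  # an admin credential nobody registered is itself a red flag
    return not (
        event.user in policy["users"]
        and event.endpoint in policy["endpoints"]
        and event.activity in policy["activities"]
    )


expected = AdminUse("dom-admin", "alice", "jump-host-01", "patching")
unexpected = AdminUse("dom-admin", "alice", "laptop-77", "patching")
print(needs_attention(expected), needs_attention(unexpected))
```

Because expected use is defined so narrowly, anything that falls outside it is worth a look, and routine use produces no alert at all.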

McLane says admins need to work smarter and not harder to combat alert fatigue. These personnel need to put in place filters and processes to create a framework that ignores the false positives and flags the real security alerts. "This requires a combination of human and machine intelligence and constant tuning."

Zeltser says it's necessary to assess whether the security architecture of the enterprise can be strengthened to stop adversarial actions earlier in the attack process. "This might involve supplementing existing defenses with a layer that increases the effectiveness of preventative controls without overlapping with the tools already in place."

If the enterprise can automatically stop more threats before they warrant a human's intervention and investigation, the organization will decrease the number of alerts and related events that IT personnel will need to handle, Zeltser says. "The result? Less noise in the alert stream, and more time for the team to dig into the alerts that truly warrant attention."

Some point to security analytics as the answer, but the challenge remains nested in the source data being fed into the analytics tool, says MacFarlane at Bricata. "As the saying goes, garbage in, equals garbage out."

What security analysts need, he says, is context around the alerts. Context can provide two important aspects when trying to identify which events require the greatest degree of attention.

First, MacFarlane explains, additional context can help prioritize; that is, differentiate the “technically true but largely irrelevant” events from the critical events. As an example, he says that if one identifies a malicious Windows executable downloaded to a Linux or OSX PC, it will probably be a lower priority as it has decreased chances of compromising the threat target.

MacFarlane's second point is that additional context also helps provide valuable information needed to correlate alerts from one's complete ecosystem of security solutions. "For example, imagine an analytics tool that could identify cancer, but the only attribute data being fed is biological gender. The tool might conclude men are more likely to get cancer than women. However, if you start feeding the tool additional attributes – diet, exercise, tobacco use, and family history – the analysis gets a whole lot more accurate."

This is what IT needs in cybersecurity, he emphasizes, a way to look at the same threat from different perspectives in order to understand context. The more data one has about each alert, the more information is available to correlate and paint a larger picture of the problem.

"When equipped with these different perspectives, security alerts are enriched with the most contextually relevant information around assets, attacks, attackers, attack campaigns, targets, exploits and other attributes that analytics tools can slice and dice to separate important alerts from the noise," MacFarlane explains.
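MacFarlane's earlier example of a Windows executable landing on a non-Windows machine shows how a single piece of context can change an alert's priority. The sketch below is one hedged way to express that idea; the `Alert` fields, the 1-to-5 priority scale and the two-level downgrade are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    description: str
    payload_platform: str  # platform the malicious file targets, e.g. "windows"
    host_platform: str     # platform of the machine that received the file
    base_priority: int     # 1 (low) to 5 (critical), as set by the detection tool


def contextual_priority(alert: Alert) -> int:
    """Lower the priority when the payload is not built for the host's platform."""
    if alert.payload_platform != alert.host_platform:
        # Assumed policy: drop two levels, but never below the lowest level.
        return max(1, alert.base_priority - 2)
    return alert.base_priority


alerts = [
    Alert("Malicious .exe downloaded", "windows", "linux", 4),
    Alert("Malicious .exe downloaded", "windows", "windows", 4),
]

# Highest contextual priority first, so analysts see the riskier host at the top.
triaged = sorted(alerts, key=contextual_priority, reverse=True)
for alert in triaged:
    print(alert.host_platform, contextual_priority(alert))
```

Both alerts are "technically true," but only the one on the Windows host keeps its original priority.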

Improving accuracy goes a long way to lessen the possibility of alert fatigue, says ZingBox's Wang (her company's focus is to secure IoT devices, such as connected medical devices, manufacturing robots, HVAC systems, surveillance systems, etc.). "By using behavior analytics, we can determine if a particular device is compromised or being attacked. To increase the accuracy of detection, we employ patent-pending three-tier profiling."

She explains that many people assume the detection of a type of device, such as an IV pump along with its manufacturer and model number, is sufficient to model the behavior of the device. However, these two parameters assume that the device is used in the exact same manner and frequency in all deployments.

"By factoring in the usage behavior of individual devices, organizations do not get alerted when their IoT device behaves abnormally as determined simply by the device function," she says. "Instead, the organization only gets alerted when their IoT device behaves abnormally compared to how the device had been used by that particular organization."

Strategies and processes

To manage the challenges, Reid at KeyLogic Systems says that from the continuous monitoring perspective, IT professionals should establish a baseline behavior of activity for which they're monitoring. This will help flag when activity steps out of the norm, and prevent false detections that often occur during “normal behavior.” Not having an established baseline from the beginning is a common theme among companies that experience breaches, he says.
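As a rough illustration of what a baseline can look like in practice, the sketch below flags a measurement only when it falls far outside the range seen during normal operation. The metric (hourly outbound connections), the sample history and the three-standard-deviation threshold are assumptions chosen for the example, not recommendations from Reid.

```python
from statistics import mean, stdev


def build_baseline(history: list[float]) -> tuple[float, float]:
    """Summarize normal activity as an average and a sample standard deviation."""
    return mean(history), stdev(history)


def is_out_of_norm(value: float, baseline: tuple[float, float],
                   threshold: float = 3.0) -> bool:
    """Flag a value that sits more than `threshold` deviations from the average."""
    average, spread = baseline
    if spread == 0:
        return value != average
    return abs(value - average) / spread > threshold


# Hypothetical hourly outbound-connection counts recorded during normal operation.
history = [120, 135, 128, 110, 142, 131, 125, 138]
baseline = build_baseline(history)

print(is_out_of_norm(140, baseline))  # within the usual range
print(is_out_of_norm(400, baseline))  # far outside the usual range
```

A simple statistical threshold like this is only one option; the point is that "normal" has to be measured before anything can be called abnormal.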

"Companies looking to combat alert fatigue also need to have an escalation matrix," Reid says. For instance, if there is a minor threat, an admin will receive the alert and can vet and handle the issue. If there's a larger alert, the notification moves up the chain to security personnel. And, then, if there is concern of a major breach, IT ports are shut down. "This helps reduce the number of alerts each individual is receiving and therefore frees up time for each group to look at threats applicable to them."
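An escalation matrix of the kind Reid describes can be written down as a simple lookup table. The sketch below follows the three tiers in his example; the severity names and the owner assigned to the major-breach tier are assumptions, since the article does not say who shuts the ports down.

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1
    ELEVATED = 2
    MAJOR = 3


# Each severity maps to (who is notified, what happens next).
ESCALATION_MATRIX = {
    Severity.MINOR: ("system administrator", "vet and handle the issue"),
    Severity.ELEVATED: ("security personnel", "investigate the alert"),
    # Assumed owner for the top tier; the action comes from Reid's example.
    Severity.MAJOR: ("incident response lead", "shut down affected ports"),
}


def route(severity: Severity) -> tuple[str, str]:
    """Look up who receives an alert of this severity and the expected action."""
    return ESCALATION_MATRIX[severity]


for level in Severity:
    owner, action = route(level)
    print(f"{level.name}: notify {owner}, then {action}")
```

Keeping the matrix in one place makes it easy to see, and to audit, which group is expected to see which alerts.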

Most important, though, Reid says, is that there needs to be some sort of reporting at every layer of a company's infrastructure – from network to IP and even physical. "If you are just monitoring and reporting activity on the top layer and a hacker comes in from a back door, you won't see the threat until it's too late."

Ultimately it comes down to security best practices, he says. "If your organization has the right processes in place and is following cybersecurity rules of thumb, there will be fewer vulnerabilities and ultimately fewer alerts. From there, if layers of monitoring are put in place and roles are clearly defined, the appropriate parties can look at the most important alerts and hand off the rest – reducing fatigue."

As with many repetitive tasks, Holt at ISF says it is better to have multiple people reviewing alerts, to help pick up the true positives. "Also, rotating people around different information security jobs can help, so they are not reviewing security alerts month after month. If possible, it is worth varying the ways in which alerts are notified, as humans are very good at focusing on changes."

Frequently, she adds, alerts require fine-tuning to turn down the noise from false positives. This should be undertaken regularly, particularly when it becomes clear that false positives are increasing in quantity.

Finally, she suggests admins consider offering incentives to security analysts for correctly identifying both false and true positives. "This can help not only refine the skills of a security analyst but also protect the organization from a true positive being missed."

For McLane at Arctic Wolf, a key strategy is to not overly rely on technology to solve this problem. "Security technology is great when there is a known signature," he says. "Any security framework that does not involve a smart security expert looking at the data is bound to fail."

There are many process challenges, says Security Mentor's Lohrmann. "You need to address each of these with best practice and detailed operational guidelines that get everyone on the same page," he says. Also, he says that new tools can automate the processing of incidents and take action on many alerts, thus reducing the number of items that analysts need to see. "Use technology to process as many of these as possible without human intervention – such as looking at system logs and network alerts being reported to management consoles."

There are many other fundamental pieces of a security program that, if well-managed, can also reduce the number of less important or even truly false positive events that ultimately require humans to review and take action upon, says Wenzler at AsTech Consulting. "Things like strong patch management programs, secure coding processes and security assessments for application vulnerabilities, and enforcing consistent least use privilege access concepts across the board will collectively decrease the number of potentially malicious events that would need to be reviewed."

These strategies make any alerts triggered by the various monitoring systems much more likely to be a true positive event that requires attention, requiring less time and human resources to review, sort through and determine if it is a priority or not, he explains. "As time goes on, many of the machine learning or behavioral analysis tools will get more sophisticated and find more accurate programmatic ways to filter through these events. But until then, dialing in the fundamentals that can reduce false positives is an ideal way to reduce the fatigue to admins and security personnel who have to sift through the events and alerts generated."

Alert fatigue is a global problem across an organization's entire security ecosystem, says MacFarlane at Bricata. He offers two key ways to reduce the strain with regard to alert quality and security analytics.

First, security should strive to improve alert enrichment. "This means implementing process and tools that include as much information as possible into each alert to allow analysts to rapidly triage and prioritize each alert as rapidly as possible."

Second, he says, most security organizations use a SIEM as a primary interface, or other solution that combines and correlates alerts into a single pane of glass. "The goal here is to provide the best source of data – through best of breed tools – to that SIEM solution to provide the information necessary to produce better correlation for other tools that work with that SIEM as well."

The tools put in place need to be flexible enough to support the process because these things must work together, says MacFarlane. "For example, given the dynamic nature of threat intelligence, this means cybersecurity policy management must be easily adaptable." What that must lead to, he says, is offering a simplified way to manage policies and the associated workflow, which in turn directs finite attention on the real threats.

In order to avoid alert fatigue, organizations need to put the right technology and processes in place, says Simpson at National University.

"Build context into the alerting process," he says. For example, alerting on every failed login attempt isn't practical due to the large number of automated login attempts that systems receive when connected to the internet. This data should still be collected, he says, but only alert if there was a successful login attempt from an IP address that was unusual (i.e. outside of the country or outside of the company network).
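Simpson's login rule translates fairly directly into code. The following is a minimal sketch, assuming a hypothetical event format in which the country has already been resolved by a separate geolocation lookup; the network range and home country are placeholder values, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass

# Placeholder values for illustration only.
COMPANY_NETWORK = ipaddress.ip_network("203.0.113.0/24")
HOME_COUNTRY = "US"


@dataclass
class LoginEvent:
    user: str
    source_ip: str
    country: str   # assumed to come from a separate geolocation lookup
    success: bool


collected: list[LoginEvent] = []  # every attempt is kept for later analysis


def should_alert(event: LoginEvent) -> bool:
    """Record every login attempt, but alert only on a successful, unusual one."""
    collected.append(event)
    if not event.success:
        return False  # failed attempts are logged, not alerted on
    outside_network = ipaddress.ip_address(event.source_ip) not in COMPANY_NETWORK
    outside_country = event.country != HOME_COUNTRY
    return outside_network or outside_country


failed_abroad = LoginEvent("jdoe", "198.51.100.7", "FR", success=False)
normal_login = LoginEvent("jdoe", "203.0.113.25", "US", success=True)
unusual_login = LoginEvent("jdoe", "198.51.100.7", "FR", success=True)

for event in (failed_abroad, normal_login, unusual_login):
    print(event.source_ip, event.success, should_alert(event))
```

Only the third event raises an alert, yet all three remain in `collected` should an investigation need them later.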

Simpson also advises that IT security administrators streamline and source alerts. "Assign team members to specifically monitor alerts and source them to another team member designated to handle that specific type of alert. Clear communication and division of tasks can help reduce alert fatigue from one individual responsible for receiving, assessing and responding to each alert."

Zeltser at Minerva adds that when evaluating security products, pay attention to the amount of noise they will contribute to the workload of your IT security staff. "For instance, some products will typically generate more false positives than others and, therefore, will contribute toward alert fatigue. In addition, be sure to tune the product or the way in which you filter its output to avoid meaningless alerts that consume time and energy without providing useful and actionable information."

Combatting alert fatigue and enhancing security accuracy is an on-going process, says ZingBox's Wang. "To continuously improve security accuracy, ZingBox encourages its customers to provide additional details regarding the IoT devices discovered in their network as well as how they use such devices on a regular basis."

By leveraging crowdsourcing of all sorts and enabling processes for its customers to easily provide these insights, her company's solution continuously improves its accuracy from both product innovation as well as customer use cases.

When it comes down to it, Reid says that if admins don't establish multiple layer monitoring upfront, they are going to have alert overloads. "If you do a poor man, quick implementation, which many organizations have done, you are going to receive a lot of alerts and you will pay for it down the road."

Automation will continue to have an impact on combatting alert fatigue, says Holt. "Repeatable and mundane tasks can be performed quickly via automated systems and on a much greater scale than a fallible human analyst."

There are, however, limitations, in that overreliance on automation can lead to problems if simple automated analysis is applied inappropriately to more complex tasks, she says. "Simple aspects can be automated, but the more complex aspects requiring judgment cannot. For now, technology should be used to support and enhance human activities, automating only where appropriate."

Alert fatigue can happen in many areas beyond cybersecurity and technology roles, says Lohrmann, pointing to the many case studies and solutions offered in medical and other professions. "Solutions in other fields may help you gain a wider perspective on this topic."

No vendor can solve the cybersecurity challenge alone and no customer wants to be entirely dependent on a single provider, says MacFarlane at Bricata. "As a result, CISOs will increasingly demand integration and interoperability among the vendors in their security ecosystem."