Inside New York City's Cyber Command. A new feature from malware scanning site VirusTotal is designed to help security operations teams triage security alerts for false positives. (Credit: New York University)

VirusTotal is seeking to curb the scourge of security alert fatigue by rolling out a new feature called Known Distributors, which lets users trace submitted files back to the company or product line that originally distributed them.

In a blog post, threat intelligence strategist Vicente Dìaz said many people already use VirusTotal for automatic security telemetry enrichment, and VirusTotal has “seen how the inclusion of AI/ML in detection engines has led to more false positives, and, most importantly to increasing lack of context.”

“Many detections these days do not include any malware family/toolkit label, and since they are ML-powered, the analyst is provided with no additional information beyond a red flag, which in some cases might be misleading,” Dìaz wrote.

The main use case for this tool is helping security operations centers (SOCs) identify and discard false positive alerts. Known Distributors was developed to make this use case more “straightforward”: users submit a file object, such as the hash of a flagged file, to the website, which analyzes it and determines which company or companies distribute that file through their products.
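For teams that automate this kind of lookup, the triage step might be sketched roughly as follows. This is an illustration, not VirusTotal's own tooling: it assumes the public VirusTotal API v3 file endpoint and assumes distributor names appear under a `known_distributors` attribute in the file report. The helper names, the trusted-distributor allow-list and the two triage labels are hypothetical.

```python
import json
import urllib.request

# VirusTotal API v3 file-report endpoint, keyed by file hash.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_lookup_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a file-report request for a hash."""
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash),
        headers={"x-apikey": api_key},
    )


def fetch_report(file_hash: str, api_key: str) -> dict:
    """Send the request and parse the JSON report (needs network and a key)."""
    with urllib.request.urlopen(build_lookup_request(file_hash, api_key)) as resp:
        return json.load(resp)


def known_distributors(report: dict) -> list[str]:
    """Pull distributor names out of a parsed report, if any are present."""
    attrs = report.get("data", {}).get("attributes", {})
    # Assumption: names live at attributes["known_distributors"]["distributors"].
    return list(attrs.get("known_distributors", {}).get("distributors", []))


def triage(report: dict, trusted: set[str]) -> str:
    """Hypothetical policy: a trusted distributor suggests a false positive."""
    if any(name in trusted for name in known_distributors(report)):
        return "likely_false_positive"
    return "needs_review"
```

A label like `likely_false_positive` would only lower an alert's priority, not close it outright, since a known distributor adds context about a file rather than proving it is safe.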

The feature pulls data from a number of sources, including the National Software Reference Library (NSRL) and partnerships with “key software vendors” to help tag and sign files for classification. In addition to using the NSRL, VirusTotal’s Monitor service, Hash DB and the Trusted Source project, Dìaz said VirusTotal is also working with developers of software download portals to centralize author and distribution metadata for the software they submit.

It’s the latest attempt to grapple with the big data realities of modern threat intelligence. Many security teams are relying more on newer tools like EDR and SIEM to ingest massive amounts of telemetry coming from their IT assets to detect threats or ongoing attacks. Those tools in turn have increasingly come to rely on machine learning to create complex rules that allow security operations centers to automate that process and conduct initial classification of security alerts.

But pulling data from an increasingly diverse range of sources creates all kinds of classification, formatting and tuning problems that make it easy to confuse these machine learning algorithms, resulting in the creation of (a lot) more false positives.

This has created a dynamic where, in order to cover a broad range of different threats, security professionals have to use tools with automation to keep up. However, in reducing the human burden on one work stream, that automation creates a whole new layer of work: sifting through the haystack of hundreds or thousands of false alerts it generates daily to find the few needles that represent true threats to an organization’s network.

That has led vendors to steadily add more data sources and make automation a bigger feature, but as one analyst told SC Media, this is a “very fragile form of automation” that creates a lot of noise and still requires significant human input to manage and use effectively.

Tools like SIEM “need an approach that doesn’t rely on this excessive amount of data coming in to perform these detections, because ultimately what they’ve built is a big data problem that is a very challenging thing to solve,” Forrester analyst Allie Mellen told SC Media last month.

Dìaz said VirusTotal, which is already plugged into most commercial antivirus and detection software and holds a wealth of malware information, is one of the organizations positioned to make that kind of alert triage easier to do.

“By incorporating the Known Distributors details along with VirusTotal’s wealth of contextual information, security teams can overcome the shortcomings of noisy detection mechanisms that have yet to mature,” he wrote.