Critical for who? The triumph and tragedy of CVSS as a risk rating tool

Within the cybersecurity community, the Common Vulnerability Scoring System, or CVSS, is the de facto standard for distilling the significance of a bug. But a debate among security professionals has some questioning the practical value of the ubiquitous scores.

The CVSS score – more accurately, the CVSS base score – is a useful tool for comparing vulnerabilities in the abstract. But it was not designed to evaluate risk or to be the end of the conversation on vulnerability prioritization. And yet, that is often how CVSS is used.

"I do think CVSS is good at what it is supposed to be good at, which sounds weird since I spent all that time criticizing it," said Allan Liska of Recorded Future's computer security incident response team, following a talk about alternatives to CVSS last week at the RSA Conference. "But what CVSS is supposed to do is provide an apples-to-apples comparison of vulnerabilities across all types of technologies.  it can be really hard to compare a vulnerability in a protocol to a vulnerability in an application."

The CVSS base score does not, for example, tell you whether a vulnerability is likely to be used by hackers. Research conducted by the Cyentia Institute, the RAND Corporation and Virginia Tech shows only around 5.5 percent of vulnerabilities are ever seen in the wild. Nor does the score say whether a vulnerability is in 15 systems or 150,000, or whether it sits on a publicly accessible server.

"If you're using only the CVSS score to prioritize your patching, then you're not adequately managing risks," said Liska.

The number commonly referred to as a CVSS score is really just a base score. The CVSS framework itself is fairly upfront about the base score's limitations, and offers two additional calculations, known as temporal and environmental scores, that network defenders can use to gauge a vulnerability's current relevance and its applicability to their own environments.
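As a rough illustration of how one of those extra calculations works: the CVSS v3.1 temporal score multiplies the base score by factors for exploit code maturity, remediation level and report confidence, then rounds up to one decimal. The sketch below uses the multiplier values published in the v3.1 specification, with a simplified version of the spec's roundup function (the real one includes a floating-point guard).

```python
import math

# CVSS v3.1 temporal metric multipliers, per the v3.1 specification.
# "X" means Not Defined, which leaves the base score unchanged.
EXPLOIT_CODE_MATURITY = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}
REMEDIATION_LEVEL     = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}
REPORT_CONFIDENCE     = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}

def roundup(score):
    """Simplified CVSS Roundup: smallest one-decimal value >= score."""
    return math.ceil(score * 10) / 10

def temporal_score(base, ecm="X", rl="X", rc="X"):
    return roundup(base
                   * EXPLOIT_CODE_MATURITY[ecm]
                   * REMEDIATION_LEVEL[rl]
                   * REPORT_CONFIDENCE[rc])

# A 9.8 base score where only proof-of-concept exploit code exists
# and an official fix is available drops to 8.8:
print(temporal_score(9.8, ecm="P", rl="O", rc="C"))
```

The point of the exercise: the headline 9.8 is not the whole story even within CVSS itself, but only if defenders bother to compute the extra scores.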

"That is the biggest problem that most people have with CVSS. They want, the base score to be generic," said Jorge Orchilles, chief technology officer at Scythe and a voting member of the CVSS working group for version 3.0 and 3.1.

Orchilles is keenly aware that CVSS scores don't tell the entire story. He headed Citi's offensive security team for a decade prior to joining Scythe, where he said he made prioritization decisions based on limited information and without enough resources to patch everything immediately. Often, he said, that decision came down to experience in guessing which vulnerability would be exploited.

Whether or not a vulnerability is likely to be exploited is not reflected in the CVSS score. Neither is whether the vulnerability is already being seen in the wild. Yet both may be a better indication of which vulnerabilities are most critical to patch. Orchilles notes that threat intelligence services can be critical in determining which vulnerabilities are being used by hackers or have exploits available, a key indicator of what will be used. Liska notes that certain products are more likely to be targeted — a CVSS 10 in Microsoft Exchange is more likely to catch a hacker's eye than one in a smaller competitor like SquirrelMail.

Further confounding matters, CVSS evaluates the usefulness of each vulnerability as a stand-alone means to attack. Vulnerabilities that need to be chained together for an attack score lower than ones that don't, even if a chain collectively can be devastating.

"The score doesn't matter as much as as whether or not it's being seen in the wild," said Liska.

There are several alternatives to CVSS, each taking a different approach to incorporating risk management and prioritization into the system. These include the Exploit Prediction Scoring System, designed to look specifically at the likelihood a vulnerability will be exploited, and Carnegie Mellon CERT's Stakeholder-Specific Vulnerability Categorization, or SSVC, which focuses on dangers to specific networks. The biggest downside, said Liska, is that none of them are widely adopted. In fact, he said, since many of them are bespoke systems used by vendors like Tenable or Rapid7, choosing a scoring alternative might require vendor lock-in.

But, said Nathan Dyer, director of product marketing at Tenable, vendors' services can come with added value. Tenable, he said, offers behind-the-scenes data science, machine learning and threat intelligence services, and continuously updates its scores as facts on the ground change.

Taking into account threat intelligence can make a tremendous difference in scoring. Dyer points to a recent Amazon Linux bug rated 9.8 on CVSS as an example.

"It can expose sensitive credentials in the HTTP header to unintended hosts, and so theoretically this type of vulnerability is extremely critical and should be nearly at the top of the mediation queue for most organizations," he said. "But if you take a look the actual threat landscape, it isn't being leveraged in the wild, the exploit code maturity is rather low, and really no threat actors are talking about the CVE through the dark web or online channels. And so we at Tenable rated this vulnerability a 5.9."

Conversely, he said, threat intelligence around another Linux bug named DirtyCOW led Tenable to rate that bug two points higher than its CVSS score.

If CVSS comes off looking a little like a punching bag, it shouldn't, said Dyer, Liska, and Orchilles. Even Dyer, who represents an ostensible competitor, said "CVSS is really a good foundation for understanding vulnerabilities, or getting into the technical details of what the vulnerability is, what the potential implications to an organization are if it were to be exploited. But it's just being used in the wrong way by the industry. It was never really meant to be a risk prioritization tool."

And while people make negative comments about the usefulness of CVSS, "when we have the open working group meetings, you don't see them there," said Orchilles.
