A house gets burglarized, the owners buy a fancy alarm system. A hurricane knocks a house down, it gets rebuilt bigger and stronger.
Indeed, there are a slew of analogies that could be applied to SolarWinds, which just might now be among the most secure software companies in the tech universe. But really, such explanations oversimplify what transpired during the last year and a half, since the company was at the center of arguably the most significant security breach in U.S. history.
“Companies often get religion after they’ve been hacked; I used to call it the ‘conversion experience,’” said Jim Lewis, senior vice president and director of the strategic technologies program at the Center for Strategic and International Studies. “And given how important it is for SolarWinds, I’m not surprised they made a big effort to upgrade security.”
But within a threat landscape where even the most security-aware companies get hacked (remember, FireEye was among the initial victims to come forward), how is success measured? And what constitutes a legitimate effort to improve, versus investment for the sake of PR and crisis management?
SC Media spoke with Tim Brown, chief information security officer at SolarWinds, and Chip Daniels, the company's head of government affairs, to dig deeper into the company's response to the Sunburst attack and the long-term implications the attack had on its own security posture and that of the software market at large.
“We are hoping to be a poster child for a new model,” Brown said. “This idea is that a response of transparency won't crush you as a company. We got beat up at the beginning, but we haven't been beat up as much anymore. The more that we get other entities to realize that you can go through it and be transparent and still survive, the more it will help us across the board.”
Response in the wake of Sunburst
Of course, transparency is in the eye of the beholder. Much has been reported about the actions taken in the immediate aftermath of the breach, first discovered in December 2020 and described about a year later by Microsoft President Brad Smith as “the largest and most sophisticated attack the world has ever seen.” Some criticism has been tied to the executive response, with certain allegations leading to a lawsuit. Other criticism has been tied to disclosures. Did the company share too much? Did it share too little?
Brown concedes it’s a challenging balance. You start by asking what you are legally obligated to disclose, then transition to what is reasonable to disclose.
“There was a decision made very early on to be open and transparent,” he said. “But you do have dozens of lawyers, and they're absolutely looking out for the future, because every word matters. We tried to push as much out as possible.”
More important in retrospect was what was happening behind the scenes during those initial weeks. Engineering development of new features paused and did not restart for about seven months. During that period (and since), the company worked to respond effectively to the incident and to make sure the security gaps that allowed the attack to happen in the first place were addressed.
Indeed, the effort in the initial months ultimately led to the transformation of DevOps within the company. And that transformation began with the discovery by Brown's team that attackers didn’t inject code into the source control system — a tactic Brown said the company would have detected immediately. Rather, they injected code into a transient virtual machine that was part of the build system.
How could a team deal with that? Ultimately, the decision was made to transition to a two-way build, which was executed roughly six weeks after discovery.
“That meant we went from source code to product, to install, to decompile and then linked it back to the source control system,” Brown said. “So, we knew we had that linkage. That was stage one.”
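One way to picture the "linkage" Brown describes is as a reconciliation step: list what is actually in the shipped artifact, and flag anything that cannot be traced back to source control. The sketch below is only an illustration of that idea, not SolarWinds' tooling. The function name, the notion of comparing symbol names, and the sample data are all assumptions made for the example.

```python
def unexpected_symbols(source_symbols, decompiled_symbols):
    """Return symbols present in the shipped artifact but absent from source control.

    Both arguments are hypothetical: iterables of fully qualified names, one set
    taken from the source tree and one recovered by decompiling the installed
    product.
    """
    return sorted(set(decompiled_symbols) - set(source_symbols))


if __name__ == "__main__":
    # Made-up names, for illustration only.
    in_source = {"app.core.start", "app.core.stop"}
    in_artifact = {"app.core.start", "app.core.stop", "app.core.injected"}
    # Anything listed here was added somewhere between source and product.
    print(unexpected_symbols(in_source, in_artifact))  # ['app.core.injected']
```

A check of this kind looks only at the finished product, so it would flag code added inside a build machine even when source control itself was never touched.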
Stage two was to move everything to AWS, recreating build environments entirely. So all ephemeral environments — temporary deployments created for individual features — would disappear.
“The build systems don't last for a long time. They just are built when we need them, they break down when we don't need them,” Brown said. “And being in code means that under five people have access to be able to write to that code that does these builds.”
The third stage was a triple build that is deterministic, meaning repeatable and producing the same output no matter how many times it’s run.
“What we were able to do is make the deterministic builds of the Orion platform, which allowed us to run triple pipelines,” Brown said. “So, I build a development build, I build a security build, I build a validation build. They should all compare before I ship. And no one person has access to all three. So then in order to affect my build, you would need collusion between three people.”
The approach demonstrated such potential, in fact, “we open-sourced it — we made it available to the world to say, 'Hey, here's a different model to build.'”
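The release check Brown describes for the triple build can be sketched as a simple gate: because the builds are deterministic, the three outputs should be byte-for-byte identical, so comparing their hashes is enough to detect tampering in any one pipeline. This is a minimal sketch under that assumption and is not the code SolarWinds open-sourced. The pipeline names come from Brown's description, while `release_gate`, `sha256_hex` and the sample bytes are hypothetical.

```python
import hashlib

# Pipeline names follow Brown's description; everything else is hypothetical.
REQUIRED_PIPELINES = ("development", "security", "validation")


def sha256_hex(artifact):
    """Hash the raw bytes of one build output."""
    return hashlib.sha256(artifact).hexdigest()


def release_gate(artifacts):
    """Return True only if all three pipelines produced byte-identical output.

    `artifacts` maps a pipeline name to the bytes that pipeline produced.
    """
    missing = [name for name in REQUIRED_PIPELINES if name not in artifacts]
    if missing:
        raise ValueError(f"missing build output from: {missing}")
    digests = {sha256_hex(artifacts[name]) for name in REQUIRED_PIPELINES}
    return len(digests) == 1


if __name__ == "__main__":
    clean = {name: b"same bytes" for name in REQUIRED_PIPELINES}
    tampered = dict(clean, security=b"same bytes plus injected code")
    print(release_gate(clean))     # True
    print(release_gate(tampered))  # False
```

The comparison is only meaningful if the builds really are deterministic; otherwise harmless differences such as embedded timestamps would make the gate fail on every release.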
The SolarWinds internal operational shift
Beyond the development process, SolarWinds reevaluated its own security protocols, adopting what Brown described as an “assume breach” model throughout the environment, from internal IT infrastructure to the engineering and development organizations to the security team.
“After the incident, we really wanted more and more and more visibility and more eyes,” Brown said.
The organization moved to a multi-tiered, multi-factor authentication solution using YubiKey, first for administrators, but with plans to ultimately roll it out to all. The company also went from a single security operations center run by Brown’s team to three SOCs: CrowdStrike handles threat hunting and management of the post-breach environment, instrumenting and monitoring workstations and servers; a secondary managed security service provider monitors that information as well as the firewall and the Azure and AWS environments; and the internal team manages the tertiary SOC, aiming for as much visibility across the entire environment as possible.
Prior to the incident, SolarWinds had a part-time red team; post-incident, a full-time red team was put in place, focusing first on the build systems, then on secondary items around what Brown described as “outside edges” and infrastructure.
“I also added an internal audit function to the security team, to look at things like how we audit every step from a line of code all the way through to a product being shipped or a service being run,” he said. “We believe that we're going to see more requirements from an audit perspective in that area. So, we're just doing that before people are asking.”
Indeed, since the Sunburst attack, customers are asking for more — not only from SolarWinds, but from vendors in general. In the early days after the attack, for example, Homeland Security Department CISO Ken Bible asked Brown to answer 12 questions “in depth,” covering development, protection of networks, and an array of other specifics tied to management of the company’s comprehensive security posture. SolarWinds turned its responses into a document published to its website.
The responses were a far cry from the typical check marks on a compliance document.
“That’s what the expectation of vendors is starting to become,” Brown said. “And no, we're not the only ones getting asked those questions. It’s gone from expectation of generic to expectation of very specific.”
Operational changes went beyond technology as well. In the aftermath of the attack, the security team sent emails to every customer email address it had on file to notify customers of the exposure. Unfortunately, many of those went to salespeople. Now SolarWinds keeps a security contact for every customer in its Salesforce system. When a new release includes a security fix, Brown’s team sends an email directly to that security mailing list so the information lands in the most appropriate inbox.
All of this attention has resulted in a cultural shift in how security is considered among the technology team at SolarWinds, perhaps most notably among developers.
“The attitude was, ‘Wow, this happened on us. It happened to my product,’” Brown said. “There’s a lot of ownership, and they don't want anything like that to happen again. It becomes more than just somebody telling you what to do. It's emotionally ingrained.”
Market response, SolarWinds forgiveness?
No operational improvements will amount to much if a company can’t survive a breach — which many saw as a very real possibility for SolarWinds in the wake of the Sunburst attack. SolarWinds estimates the actual number of customers hacked through Sunburst to be fewer than 100, including nine federal agencies. Many of them either paused usage of SolarWinds or ripped it out entirely in the days and weeks after the attack was discovered.
Fast forward to today. The company just returned renewal rates to the low 90s, only a couple of percentage points lower than pre-Sunburst. The company touts about a dozen federal customers, including all nine agencies that were impacted by the attack.
“Now that doesn't mean the entire fill-in-the-blank department now exclusively uses SolarWinds. That's not what I'm saying,” said Daniels, who came on board after the Sunburst campaign to help in communications with government partners and customers. “But there are elements of all of them that have either never left or have come back.”
He draws a couple of conclusions based on the timing. News of the breach emerged in December 2020. Customers started returning in January and February 2022.
“They had basically gone with somebody else for a year. And then it was time to look at renewal and they wanted to talk again,” Daniels said. “So Tim and I have been on this campaign of assuring folks, ‘Hey, here's what we've done over the last year. It's OK to come back.’”
Indeed, Brown and Daniels are on every call with inquiring CIOs and CISOs, explaining the state of play. And truth be told, much like that house that is knocked to the ground in the hurricane, many of those customers believe SolarWinds can deliver superior security because it was backed into a corner and forced to change.
Had the breach not happened, “would we have stopped development for six months and focused on security? No. Would we have spent millions of dollars for inspection of the internal environment? Probably not,” Brown said. “So, when you look at a comparison of one entity to another, you have to say that entity that went through it ends up bringing less risk, from a practical perspective, than the one that didn't. And that's what we get a lot of CSOs telling us, ‘I'm glad it was you and not me. But you're also safer than the other guy because you went through it.’”
Of course, that rationale won’t last forever. The unknown is whether SolarWinds, and other vendors that perhaps learned from watching, will continue such diligence. And any company lucky enough to win back customers and recover some semblance of trust after a breach of this magnitude will remain under the spotlight for a lengthy period of time.
“The best-case scenario outcome of a breach is that it spurs the organization to action so they invest resources and time into security,” said Allie Mellen, a senior analyst with Forrester. “Even if they do, it’s a difficult road. There are often systemic challenges, processes, and cultures that require upheaval that can take years to address.”