Amid high-profile outages, automated certificate management offers a solution

An expired certificate caused an hour-long outage last summer on Spotify. Today’s columnist, Abul Salek of Sectigo, offers some tips on how automation can help security teams more effectively manage digital certificates. (CC BY-NC 2.0)

High-profile outages caused by expired digital identity certificates have dominated the headlines of late, as widely used platforms have suffered serious disruptions. Although the industry commonly uses the term "certificate outage," the term fails to describe what actually happens: the failure lies not with the certificates themselves, but with an organization’s failure to properly manage the lifecycles of those certificates.

In recent months, the industry has reduced the maximum lifetime of publicly trusted TLS/SSL certificates to 398 days. As a result, organizations must renew certificates more frequently. Doing so manually adds to the burden on IT teams and increases the risk of human error when rolling out and renewing certificates. Thankfully, today’s advanced automation lets enterprises easily and reliably manage their certificate lifecycles, saving time and preventing outages before disaster can strike.

Outages are more common than people think

Much of the public at large may not even be aware that digital identity certificates exist, but they are certainly aware of what happens when organizations fail to properly manage them. Client systems cannot establish a secure connection with a site whose certificate has expired, which blocks access to the service. For example, if a user goes to their banking site to make a transaction and the site’s certificate has expired, the browser will warn the user and block the site. From the end user’s perspective it’s an outage, even though the bank’s backend servers may still be up and running in its data center.

Within the past several months alone, a number of major companies have suffered service outages because of certificate mismanagement. In February, Microsoft Teams went down for roughly three hours because of an expired certificate, temporarily crippling the service for those who rely upon it for teleconferences. Months later, music streaming service Spotify met a similar fate, leaving users unable to listen to their favorite songs. Although the service was only down for an hour, it wasn’t until a Cloudflare engineer noticed that an important TLS certificate had expired that Spotify could rectify the problem.

While these service outages were brief and relatively inconsequential, we are not always that lucky. In 2018, customers of European mobile provider O2 experienced a nearly day-long outage. The problem was eventually traced to an expired certificate managed by Ericsson, and O2 later negotiated an estimated £100 million ($132.8 million) damage settlement from the European networking giant. Although this was not the first major issue caused by the failure to renew an expired certificate, it was the incident that awoke many to the potentially serious consequences—both financial and reputational—of such lapses.

Recently, an even more concerning example made headlines, as California discovered that the state had accidentally underreported its number of COVID-19 cases because of a system backlog caused by an expired certificate that had prevented data uploads. Certificates are essential for proper website and device security, and failure to properly manage them can have serious, real-world consequences.

Modernizing certificate management

The Microsoft Teams outage was a concern for different reasons. Organizations may have a massive number of certificates in use at any given time, and that number continues to grow larger as IoT and other connected devices enter the market en masse. With countless individuals accessing corporate networks from unfamiliar personal devices amid the COVID-19 crisis, the need for effective certificate and identity management has never been clearer. People are prone to human error under the best of conditions—let alone when they are tired, worried, or distracted by children or pets. Let’s break down a few reasons why automation will become the future of certificate management:

  • Managing certificates the old-fashioned way does not make sense. When businesses were dealing with relatively small numbers of certificates, it may have been possible to manage the renewal process via spreadsheets. But as the number of certificates in use for a given organization climbs into the hundreds, thousands, or even tens of thousands, that approach no longer works.

  • Automation isn’t just faster; it reduces the likelihood of human error. Combing through spreadsheets was always a problematic approach, and one that carried a high likelihood of human error. That error might come in the form of forgetting to renew an expiring certificate, failing to correctly provision a new certificate, or any number of other mistakes, and each carries potentially damaging consequences. At the current scale of certificates, automation remains the only feasible way to take human fallibility out of the equation.

  • Certificate automation tools are becoming more common. Microsoft itself offers a built-in certificate manager, Microsoft Active Directory Certificate Services (ADCS), in its family of products, highlighting the integral role that the company expects automation to play moving forward. In fact, many organizations may already use this simple form of automation without realizing it. However, it’s important to note that the Microsoft Teams outage indicates that there’s still a long way to go when it comes to widespread adoption of automation—even within organizations that most would expect to handle it properly.

  • Third-party tools and protocols have made automation accessible to all. Today, a new breed of tools can manage both private and public certificates via a single platform, streamlining not only the discovery of all certs across an enterprise, but also provisioning, renewal, and even revocation. Most certificate managers expose or integrate with Representational State Transfer (REST) APIs, and many offer integration with modern business necessities like DevOps platforms and public cloud capabilities. And using certificate management tools that leverage popular protocols such as Automated Certificate Management Environment (ACME) can eliminate a number of different problems. It’s possible to set expiring certificates to renew automatically, preventing costly errors with the click of a button.
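
To make the idea of automated renewal concrete, here is a minimal Python sketch of the kind of check an automation tool might run against a certificate inventory. It is an illustration under stated assumptions, not how any particular product or ACME client works: the inventory format, the 30-day renewal window, and the function names are all hypothetical. The only library call it relies on is the standard library's ssl.cert_time_to_seconds, which parses the "notAfter" date string found in a certificate.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical policy: renew anything that expires within 30 days.
RENEW_WINDOW_DAYS = 30


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return days between `now` and a certificate's notAfter date.

    `not_after` uses the textual form found in certificates,
    e.g. "Jun 15 00:00:00 2021 GMT". A negative result means
    the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def certificates_due_for_renewal(inventory, now, window_days=RENEW_WINDOW_DAYS):
    """List hosts whose certificates are expired or inside the renewal window.

    `inventory` is an assumed mapping of host name -> notAfter string,
    standing in for whatever a discovery scan would produce.
    """
    return sorted(
        host
        for host, not_after in inventory.items()
        if days_until_expiry(not_after, now) <= window_days
    )


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    inventory = {
        "shop.example": "Jun 15 00:00:00 2021 GMT",
        "api.example": "Dec 15 00:00:00 2021 GMT",
    }
    now = datetime(2021, 6, 1, tzinfo=timezone.utc)
    for host in certificates_due_for_renewal(inventory, now):
        print(f"{host}: schedule renewal")
```

In a real deployment the renewal step itself would be handed to a certificate manager or an ACME client rather than printed; the point is only that a scheduled job, unlike a spreadsheet, does not forget to look.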

Don’t wait: automate

Manually managing identity certificates has become increasingly difficult as the number of devices connecting to corporate networks has multiplied, and it will likely only become harder as time goes on. The rollout of IoT shows no sign of slowing, and in today’s era of distributed workforces, the number of users, devices, applications, and servers in need of authentication will only rise. For too long, businesses have put themselves at the mercy of human error, and the results have made headlines across the globe. At a time when automated certificate management has become easier than ever, why risk a damaging service outage? Security teams will rely on automation in the future, and automatic certificate management will become one of the most visible and valuable places to implement it.

Abul Salek, director of product management, Sectigo
