Sep. 30 ‘internet blackout’ showed the importance of security certificate encryption

What happened on September 30 in words of (mostly) one syllable.
11 October 2021

On September 30 this year, multiple websites and services stopped working for many users due to a problem with a security certificate expiry. Many online shoppers found that they couldn’t complete their purchases, and companies using Xero’s or Intuit QuickBooks’s web services also found their work interrupted.

There were issues involving the CDN Cloudflare, and older versions of iOS were unable to connect securely to websites and online services. All over the internet, browsers and apps either silently failed to connect to their destinations, or flagged up errors saying that a safe connection couldn’t be established.

Even for seasoned technology professionals, let alone the majority of the internet-using population, the whole subject of certificates, certificate authorities, root certificates, SSL, TLS, and encryption in general is a dark art. Other than the display of a padlock on a web browser’s URL bar, or a vague knowledge that https://site.com is somehow safer than http://site.com, few people have any idea of the layers of technology that work constantly to keep the internet safe.

For those that want to know the reasons for what happened on September 30, how the safety net of encryption is spread over much of the internet, and what happens when the silicon entities stop trusting each other, read on. Here’s the business professional’s guide to certificates, and how some very clever mathematics helps keep internet traffic safe.

You own google.com (or goooogle.com)

Imagine you own a domain — let’s call it goooogle.com. When people want to surf your website at www.goooogle.com, their web browser sends you a message asking “is it really you? Can I trust you?” To reassure them, you send them a copy of your digital ID, which is a certificate. Bound up in the certificate is a complex code, which is a key given out to help the public ascertain that you are really who you say you are. It’s called a public key.

The people wanting to browse your website look at the certificate, which contains a record of who issued it. If its issuer (called a Certificate Authority, or CA) doesn’t appear on the list of CAs baked into the web browser, the person sitting at the browser gets an error message. (The length of the list of acceptable CAs varies depending on whether you use Firefox, Chrome, Edge, Safari and so on.) You may have seen one of these: something along the lines of “This certificate issuer is unknown. Are you sure you want to continue?” For now, however, let’s imagine all is well, and the certificate authority is a recognized one.
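For readers who like to see this in practice, here is a minimal sketch using Python’s standard ssl module. It is not what any particular browser does internally, but a default client context behaves in a similar way: it loads the operating system’s list of trusted CAs and refuses a connection whose certificate doesn’t lead back to one of them. The function names make_strict_context and issuer_of are made up for this illustration.

```python
import socket
import ssl


def make_strict_context() -> ssl.SSLContext:
    """Build a client context that behaves like a cautious browser."""
    # create_default_context() loads the system's trusted CA list and
    # switches on both certificate checking and hostname checking.
    return ssl.create_default_context()


def issuer_of(host: str, port: int = 443) -> dict:
    """Connect to host and return the issuer fields of its certificate.

    Raises ssl.SSLCertVerificationError during the handshake if the
    certificate does not chain to a CA in the trusted list.
    """
    context = make_strict_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # 'issuer' is a tuple of groups of (field, value) pairs; flatten it.
    return {field: value for group in cert["issuer"] for field, value in group}
```

Calling issuer_of with a real hostname needs a network connection. A site with an unrecognized issuer raises an error rather than returning a result, which is the programmatic version of the browser warning.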

The web browser then checks the certificate and verifies, mathematically, that it was indeed issued by the certificate authority that it claims. If it all checks out, the browser and the website decide to chat. Between them, they come up with a session key, which both sides of the conversation use to encrypt what they say to each other.
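The “mathematically” part can be sketched with a toy version of RSA signing. This is an illustration only: the key below is built from two tiny primes and could be broken by hand, whereas real CA keys are thousands of bits long, and the names ca_sign and browser_verify are invented for the example. The idea it shows is real, though: the CA signs with a private number only it knows, and anyone can check the result with the matching public number.

```python
import hashlib

# Toy CA key pair built from the primes 61 and 53. Illustration only.
CA_MODULUS = 61 * 53          # n = 3233, part of the CA's public key
CA_PUBLIC_EXPONENT = 17       # e, published by the CA
CA_PRIVATE_EXPONENT = 2753    # d, kept secret; (17 * 2753) % 3120 == 1


def digest(cert_body: str) -> int:
    """Hash the certificate text and reduce it into the toy key's range."""
    full = int.from_bytes(hashlib.sha256(cert_body.encode()).digest(), "big")
    return full % CA_MODULUS


def ca_sign(cert_body: str) -> int:
    """The CA signs the digest with its private exponent."""
    return pow(digest(cert_body), CA_PRIVATE_EXPONENT, CA_MODULUS)


def browser_verify(cert_body: str, signature: int) -> bool:
    """The browser undoes the signature with the public exponent and
    compares the result with its own digest of the certificate."""
    return pow(signature, CA_PUBLIC_EXPONENT, CA_MODULUS) == digest(cert_body)


body = "subject=www.goooogle.com; issuer=Example CA"
signature = ca_sign(body)
print(browser_verify(body, signature))
```

A forged or altered signature fails the same check, because only the holder of the private exponent can produce a number that the public exponent turns back into the right digest.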

Impersonation isn’t easy

The internet is a free and open place, and anyone can issue a security certificate that can be used to encrypt traffic between a website they’ve put together and the people that want to browse it safely. The difference between self-certification and getting a certificate from a CA is twofold. Firstly, anyone browsing to a site that has a self-certified certificate will get the message mentioned above, namely, “This certificate is from an authority I don’t recognize. Are you really really sure?” The second difference is that to get a certificate from a recognized certificate authority (one of the hundred or so baked into the web browser), organizations have to jump through some hoops, much as anyone applying for a passport or national identity card would have to.

When an organization applies for a recognized security certificate it has to at least prove that it owns the domain, by responding to an email sent to that domain and maybe passing a couple of other checks behind the scenes. The more assurance a site wants to offer about who is behind it, the greater the number of hoops it will need to jump through. The most stringent process is called extended validation, and ends in the issuing of an EV certificate.

So what went wrong on September 30?

One of the most-used certificate authorities is Let’s Encrypt. It is commonly used by organizations that want one of the quick, less stringently-checked certificates. Rather than extended validation certificates, these are known as domain-validated (DV) certificates. Like EV certificates, they follow the snappily-titled X.509 standard for digital certificates. For application developers, they’re a real boon, because they offer a decent degree of security yet are quick to obtain and install. They’re used a great deal to provide an HTTPS connection to a content-driven website that holds no sensitive data, or to an API gateway between applications passing data to one another.

However, at the core of the certificate authorities’ workings are what are known as root certificates. Each CA has at least one, and in fact, there’s a copy of each of them baked into the average web browser: they are what make up the list of approved issuers mentioned above. The CA uses its root certificate to sign at least one intermediate certificate, which in turn is used to sign the certificates issued to websites. That creates a chain of trust: the website’s certificate can be checked against the CA’s intermediate certificate, which can be checked against the root certificate. And like all certificates, root certificates expire and are replaced, although, it has to be said, not very often.
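The chain of trust can be pictured as a short walk from the website’s certificate up to a root the browser already knows. The sketch below is a simplified model with invented names (Cert, chain_is_trusted, “Example Root CA”): it only follows issuer names, whereas a real browser also verifies the signature and the validity dates at every link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str   # who the certificate identifies
    issuer: str    # who signed it


def chain_is_trusted(leaf, intermediates, trusted_roots, max_depth=8):
    """Follow issuer links upwards until a trusted root is reached."""
    by_subject = {cert.subject: cert for cert in intermediates}
    current = leaf
    for _ in range(max_depth):
        if current.issuer in trusted_roots:
            return True            # reached a root baked into the browser
        parent = by_subject.get(current.issuer)
        if parent is None:
            return False           # the chain breaks: issuer is unknown
        current = parent
    return False                   # chain too long or looping


site = Cert(subject="www.goooogle.com", issuer="Example Intermediate CA")
intermediate = Cert(subject="Example Intermediate CA", issuer="Example Root CA")

print(chain_is_trusted(site, [intermediate], {"Example Root CA"}))
print(chain_is_trusted(site, [intermediate], set()))
```

The first call succeeds because the walk ends at a root in the trusted list. The second fails because the same chain leads to a root the device has never heard of.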

The nature of technology systems is such that many older operating systems, browsers and other apps that use the internet remain in production in the wild (for instance, older Android phones). The root certificate that Let’s Encrypt certificates had relied on expired on September 30, and some older systems had never been updated with its replacement. On those devices the chain of trust was broken: certificates issued by Let’s Encrypt led back to a root that was no longer valid, so they couldn’t be mathematically verified as kosher.
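A rough way to picture the difference between an updated device and a stale one is to compare their lists of trusted roots on either side of the expiry date. In the sketch below the root names, the helper usable_roots, and the later expiry date are all hypothetical. Only the September 30, 2021 date is taken from the events described here, and the exact time of day is ignored.

```python
from datetime import datetime, timezone

# Hypothetical expiry dates for two roots from the same CA.
OLD_ROOT_EXPIRES = datetime(2021, 9, 30, tzinfo=timezone.utc)
NEW_ROOT_EXPIRES = datetime(2035, 1, 1, tzinfo=timezone.utc)

# An updated device knows both roots; a stale one only knows the old root.
UPDATED_DEVICE = {"Old Root": OLD_ROOT_EXPIRES, "New Root": NEW_ROOT_EXPIRES}
STALE_DEVICE = {"Old Root": OLD_ROOT_EXPIRES}


def usable_roots(trust_store, now):
    """Return the names of the roots that have not yet expired."""
    return {name for name, expires in trust_store.items() if now <= expires}


day_after = datetime(2021, 10, 1, tzinfo=timezone.utc)
print(usable_roots(UPDATED_DEVICE, day_after))
print(usable_roots(STALE_DEVICE, day_after))
```

The day after the expiry, the updated device still has a root it can anchor the chain to, while the stale device has none left, so every certificate from that CA fails to verify.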

Let’s Encrypt published multiple notices of the expiry of its root security certificate, and even offered several workarounds to nudge older systems into updating their root certificates. But for all sorts of reasons (developers moving on from job to job, human mistakes, and so on) there were plenty of instances where the certificate updates never happened.

The issue on September 30 was not a systemic failure, but a procedural one. Additionally, it affected only one of the root authorities (the certificate authorities that hold root certificates — Let’s Encrypt, in this case), so the outages were not widespread.

It’s easy enough for articles like this to end with a call to arms to ensure systems are up to date, but the nature of most technological systems is that they comprise many, many moving parts. The areas of key pairs, encryption, digital signing and certificates can be highly complex, even for mathematicians with PhDs in Computing Science. After all, business professionals might not understand how a car’s airbags actually work — it’s just good to know they’re there, and that occasionally (unfortunately) it’s proven they work.