What caused the outage which brought down GitHub?

"No server is currently available to service your request. Sorry about that. Please try refreshing."

The famous angry unicorn which presages doom for developers who rely on GitHub

In the wee hours of this morning, GitHub suffered a major outage which brought down the site and prevented access to its services.

When they tried to visit the site or access services, users were shown a picture of an angry unicorn and a message which said: "No server is currently available to service your request. Sorry about that. Please try refreshing and contact us if the problem persists."

Now the platform has explained why it suffered the outage.

In a statement, it blamed a configuration change for the issues, which made GitHub services inaccessible for all users on August 14, 2024 between 23:02 and 23:38 UTC.

The first announcement GitHub made about the outage came less than ten minutes after users began reporting problems.

It said: "We are investigating reports of degraded availability for Actions, Pages and Pull Requests."

The GitHub status page then revealed a battle to get services up and running, first identifying "degraded availability" or "degraded performance" across Git Operations, Copilot, Codespaces, Packages, Pages, Webhooks, Actions, Pull Requests and Issues.

Status updates provided details of every minor and major victory as GitHub "rolled-back the changes to database infrastructure and mitigated the impact."

At about 1am UK time, the battle appeared to have been won.

GitHub wrote: "All GitHub services were inaccessible for all users. This was due to a configuration change that impacted traffic routing within our database infrastructure, resulting in critical services unexpectedly losing database connectivity. There was no data loss or corruption during this incident.

"We mitigated the incident by reverting the change and confirming restored connectivity to our databases. At 23:38 UTC, traffic resumed and all services recovered to full health. Out of an abundance of caution, we continued to monitor before resolving the incident at 00:30 UTC on August 15th, 2024.

"We will provide more details as our investigation proceeds and will post additional updates in the coming days."

The outage first prompted shock, as developers were forced to down tools or simply go to bed if they were working in a time zone where it was the middle of the night. Then it sparked anger.

On Hacker News, one commenter wrote: "I've never seen an outage this big. Even the homepage doesn't load. We've had recurrent issues with Actions not running, but this seems a lot bigger."

The incident also prompted inevitable criticism of Microsoft, which owns GitHub, as well as warnings about dependency on the platform, which could create a major point of failure.

"While other industries, such as in the oil and gas industry, are well versed in creating systems which resilience in the failure of any part of an infrastructure, unfortunately IT often has single points of failure," wrote Professor Bill Buchanan of Edinburgh Napier University's School of Computing. "So, if your company have a GitHub respository as its sole provision of its code, be a little worried."

On LinkedIn, Kevin Inman, Senior Director DevOps SRE at the US mortgage provider Fannie Mae, wrote: "With GitHub going down today we all witnessed productivity nosedive. Nothing like major outages with critical infrastructure to make you stop and retrace for single points of failure, again.

"Because tech changes rapidly and our internal pipelines and stacks do as well. A couple of years go by and you may have introduced another single point of failure along the way and not even realised it.

"It’s always good to audit ourselves. Being proactive is never a bad thing."


THIS STORY IS BEING UPDATED
