An update on GitHub availability

github.blog

377 points by salkahfi a day ago


bartread - an hour ago

> I wanted to give an update on GitHub’s availability in light of two recent incidents.

[Emphasis mine]

Vlad, you are living in a very different world to me.

GitHub has suffered dozens and dozens of outages since the beginning of the year. It is notably less available and reliable than it was even as recently as last year. People have created dashboards and heatmaps showing how bad GitHub has become. At least one of those has made it to the front page of Hacker News. In fact, its unreliability and persistent availability issues have become a frequent topic of conversation across sites and communities frequented by its users - of which HN and Reddit are two obvious examples. At this point GitHub's unreliability risks becoming a meme, if it hasn't already done so.

The only thing your post makes clear is that your priorities ARE NOT clear.

> Our priorities are clear: availability first, then capacity, then new features.

WRONG!

Your priorities are:

1. Availability
2. Availability
3. Availability

You have NO OTHER PRIORITIES.

If you want other priorities, focus on AVAILABILITY for 6 months and then come back and we can all have a serious conversation about something else.

In the meantime, you need to understand that GitHub's reliability over months and months - not just in April - has been completely unacceptable.

Focus on fixing that and on nothing else.

embedding-shape - 21 hours ago

Hah, love that now they say "Our priorities are clear: availability first, then capacity, then new features" when 6 months ago, it was seemingly exactly the same except Azure supposedly was gonna save them:

> GitHub Will Prioritize Migrating to Azure Over Feature Development - GitHub is working on migrating all of its infrastructure to Azure, even though this means it'll have to delay some feature development.

> In a message to GitHub’s staff, CTO Vladimir Fedorov notes that GitHub is constrained on capacity in its Virginia data center. “It’s existential for us to keep up with the demands of AI and Copilot, which are changing how people use GitHub,” he writes.

https://thenewstack.io/github-will-prioritize-migrating-to-a...

So the currently delayed feature development is now gonna be further delayed, yet almost every week we see new features and changes, just the other day the single issues view was changed, as just one example. And it was "existential" 6 months ago yet they keep stumbling on the exact same issue today?

Even if they're focused exclusively on reliability and uptime, we get the experience that we have today. Kind of incredible how a company with the resources of Microsoft is seemingly unable to stop continuously shooting itself in the foot. It's kind of impressive, actually. As icing on the cake, they've decided to buy up all the popular developer services and then migrate them all to the same platform, great idea too.

maccard - a day ago

It's kind of hard to read this with a straight face.

The unlabelled graph with big numbers on top, the priorities that don't match with what we're experiencing, and a list of things that they're doing without a real acknowledgement of the _dire_ uptime over the last 12 months....

mijoharas - a day ago

> we started working on path to multi cloud.

Is this microsoft stating that they aren't able to get acceptable reliability from Azure? (I mean, I think a lot of us have heard that, but it's interesting to hear it from microsoft themselves).

s_ting765 - 21 hours ago

> Vladimir Fedorov is GitHub's Chief Technology Officer .... He currently serves on the board of Codepath.org, an organization dedicated to reprogramming higher education to create the first AI-native generation of engineers, CTOs, and founders.

I think I found the issue.

BlackFingolfin - 21 hours ago

GitHub stability has been bad for me. And recently even the data they show me on the web has been unreliable.

Since yesterday, me and several colleagues noticed that the pull request lists on the website are incomplete, across many repositories. For example, on https://github.com/gap-system/gap/pulls it says "Pull requests 78" in the "tab list", but the PR list view reports "35 open" (the number 78 is correct, and confirmed by e.g. `gh pr list`)

And that despite <https://www.githubstatus.com> reporting "all systems operational".

darkwater - a day ago

Glad that they released some data about new repo/issues/commits over the last years. It confirms what everyone else already believed from the outside: agents are putting a lot of extra, sudden pressure on GitHub. It's like a startup that is growing exponentially, with the difference that they already have a large user base to serve - and that keeps them in the bullseye - and probably a not-so-fast-moving organization when it comes down to changes. On the other side of the coin, they also have a lot of talent, infra and money a startup might not have yet.

LiamPowell - 21 hours ago

I cannot figure out what on Earth they've done with these graphs. It almost seems like these are an artist's impression of a graph.

Looking at the commit graph: Why do commits have big steps followed by slow rolloffs? Why do the steps not happen at uniform points? Why do larger steps sometimes have less of a slope than smaller steps, but not all the time?

Then looking at the other graphs there's completely different effects going on.

icy - a day ago

I'm biased (founder of tangled.org), but the future really should be federated forges. Host repositories on sovereign infra with global identity + federated "metadata" (issues, pulls, etc.).

Global indices for this should be trivial to spin up so availability is never a concern (we're working towards this!).

frangonf - a day ago

What are we doing?

Stop subsidizing tokens now that we extracted enough training data from you and we have enough agentic junkies business to keep the flywheel going up and cut on the loss leaders. [0]

[0] https://news.ycombinator.com/item?id=47923357

latexr - 21 hours ago

> The main driver is a rapid change in how software is being built. Since the second half of December 2025, agentic development workflows have accelerated sharply.

GitHub instability started way before that. I understand it’s too much to ask of a trillion-dollar corporation to consider the impact of their own actions, but perhaps they should’ve thought of that before forcing LLM development down everyone’s throats.

torben-friis - 21 hours ago

Not enough attention is being paid to the production/delivery mismatch.

GitHub is claiming they require 30x scale due to the giant increase in repository creation, PRs, commits, etc.

I have not seen a single product improve in features or quality as an end user, nor have any significant new products come out in this period (other than the LLMs themselves).

Where is all this code going?

zamalek - 16 hours ago

> Our priorities are clear: availability first, then capacity, then new features.

No mention of Copilot/slopiffication. Probably an intentional omission as Microsoft only has one true priority across all of its products.

jftuga - 21 hours ago

Some interesting tidbits:

* we had to resolve a variety of bottlenecks that appeared faster than expected from moving webhooks to a different backend (out of MySQL)

* redesigning user session cache to redoing authentication and authorization flows to substantially reduce database load.

* we accelerated parts of migrating performance or scale sensitive code out of Ruby monolith into Go.

I'd like to know what database backend they migrated to. I was also surprised to read that the migration from Ruby to a more performant language had not already been completed. I assume this is because it is a large code base with many moving parts, etc.

danra - 3 hours ago

When it's down to brass tacks, the most common GitHub action, actions/checkout, is not taking contributions due to "focus [...] on strategic areas" [0] despite having years-old issues - here's one[1] that soon celebrates its sixth birthday, despite having an available PR!

[0] https://github.com/actions/checkout#note

[1] https://github.com/actions/checkout/issues/270#issue-6289677...

clvx - 19 hours ago

With this prioritization, GitHub IPv6 support is gonna happen in the next decade.

himata4113 - 21 hours ago

so what they're saying is that Co-Authored-By claude@anthropic.com is overloading their systems?

and that azure cannot scale fast enough to handle the load so they're embracing multi-cloud as a company... owned by microsoft?

woah. what am I reading.

baq - a day ago

openai, anthropic, google and a plethora of chinese models all end up pushing code into github. you can discuss whether gpt 5.5 is better than opus 4.7, but for github it doesn't matter: they'll be receiving the code no matter which llm spits it out.

amazing on one hand, quite scary on the other for github and all other forges if this continues and there is no reason why it wouldn't.

mrhottakes - 17 hours ago

LLMs have helped us invent websites that only work sometimes. We're truly living in the future.

pluc - a day ago

There are no words that Microsoft can use that would make me trust Microsoft.

sikozu - 21 hours ago

This latest incident was the nail in the coffin for me. I've been on GitHub since 2012 but I'm feeling the pull to migrate out to Gitea/Forgejo. Has anybody done this recently? How'd it go?

steve1977 - 21 hours ago

I know that I'm simplifying (probably too much), but it seems like things were fine when GitHub was still a Ruby on Rails monolith and all the rigmarole with microservices etc. only made things worse.

jcattle - a day ago

When there's a gold rush invest in *checks notes* jewellery makers?

eolgun - 21 hours ago

The AI agent growth explanation is interesting but also a bit of a deflection. If a meaningful portion of your traffic is now automated agents, your capacity planning model is fundamentally different: you're no longer scaling for human-paced workflows but for burst patterns that look nothing like historical load.

The unlabeled graphs don't help the credibility case. When you are already in the hole on trust, shipping a post that requires readers to assume favorable baselines is exactly the wrong move.

dangoodmanUT - 20 hours ago

Two incidents? Just two?

In seriousness, looking at their scale, this is an insane engineering challenge.

Especially if they’re moving databases. That's never easy, and certainly not at that scale.

zinodaur - 18 hours ago

> posts graphs without way to determine scale of y axis

Now that’s the kind of excellence I expect from the GitHub engineering team

otar - 21 hours ago

I had to postpone a call with developers (in 2 different countries) because I didn't have access to the issues board, which is a single source of truth for us.

I understand the rapid growth (because of AI agents), but if such a critical software service becomes unstable then it's time to migrate? Thinking about self-hosting GitLab.

saghm - 16 hours ago

Given what "An Update on <XYZ>" usually means, I can only assume this means that Github has decided to no longer provide availability. Not particularly surprising given current trends I guess

mendyberger - 20 hours ago

I wonder if this mess has anything to do with talent loss resulting from layoffs after the pandemic

guidoiaquinti - a day ago

> While we were already in progress of migrating out of our smaller custom data centers into public cloud, we started working on path to multi cloud. This longer-term measure is necessary to achieve the level of resilience, low latency, and flexibility that will be needed in the future.

Wild

TuxPowered - 18 hours ago

The availability of GitHub is still at 0% - it can't be reached over IPv6.

cedws - 21 hours ago

I wonder if they’ll end the free lunch we’ve been having since the MS takeover. There’s been a deluge of spam and crapware projects due to the LLM wave which is visible in that graph. Can’t see them sustaining being a public dustbin for low value projects forever.

nraynaud - a day ago

So I gather that nobody is working on a search that stays on the current branch?

GS_Projects - 20 hours ago

The bit nobody covers in these write-ups: small teams without dual-cloud failover budget. Last big GitHub outage cost me a deploy day. Not catastrophic but the kind of thing you don't budget for when GitHub is your single source of truth.

Status page is also still doing that thing where every component is green but in practice clone is hanging, push is timing out, actions are stuck. Per-service uptime is a managed number. The user-experience number is the one that matters and it's not in the post-mortem.

throwatdem12311 - 21 hours ago

> The main driver is a rapid change in how software is being built.

Leopard, meet face.

Too little too late, yesterday was the straw that broke the camel’s back for us and we’ve started a migration to a self-hosted GitLab.

Waterluvian - 20 hours ago

I have a hard time believing anything that's said in a blog post where a graph lacks axis labels/scale. It tells me that nobody who cares about correctness had any say on the content of the post. Maybe I'm being 8am cranky and pedantic, but I'm sticking with it.

> availability first, then capacity, then new features.

I'd love to experience first-hand a leadership team who says, "stop accepting new paying customers until we've got availability sorted out!"

sltr - 21 hours ago

One thing is clear: an LLM wrote this.

dzonga - 20 hours ago

blame MySQL. Blame Ruby.

on another note - is the exponential growth from 'agentic' workflows actually resulting in productive software in the wild? Or is it just noise? On my end I haven't seen the software I use getting better.

rootnod3 - 21 hours ago

> Our priorities are clear: availability first

That's a delayed April Fools' joke, right?

pier25 - 18 hours ago

Github has been having availability issues for years now.

fontain - a day ago

Personally, I’m sympathetic. We know that GitHub did a huge amount of work over the last decade to make Git scale, which has benefited us all. These new scaling challenges are real challenges, 30x growth would be a nightmare for any system that was already pushing the limits of what was possible, I think we are being far too hard on GitHub, they deserve a little grace.

perbu - 19 hours ago

fwiw, I've had good luck scaling git, especially doing clones, in the HTTP layer, using Varnish. this was CI bringing GitHub Enterprise to its knees.

BigTTYGothGF - 18 hours ago

LLMs and vibe coding ruining it for the rest of us.

lousken - 21 hours ago

Availability is the priority? Does not seem like it: https://mrshu.github.io/github-statuses/

jameskilton - 20 hours ago

Nice, they have availability numbers now on their status page, but they aren't aggregating.

If you multiply all current numbers together (as of Apr 28), you find out that GitHub has a 97.26% uptime.

One ... single ... 9.

They can do better.
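
For anyone who wants to redo that arithmetic, here is a minimal Python sketch. The per-service figures in it are made-up placeholders, not the numbers from the status page, and the function names are only illustrative. Multiplying availabilities like this also assumes that services fail independently and that a user needs every one of them up at the same time, which is a simplification.

```python
from math import prod


def combined_availability(percentages):
    """Multiply per-service availabilities (in percent) into one percentage."""
    return 100.0 * prod(p / 100.0 for p in percentages)


def downtime_hours(percent, period_hours=720.0):
    """Hours of downtime implied by an availability over a period (default: 30 days)."""
    return (1.0 - percent / 100.0) * period_hours


if __name__ == "__main__":
    # Hypothetical per-service uptimes, for illustration only.
    services = [99.7, 99.5, 99.8, 99.6, 99.4, 99.9, 99.3, 99.7, 99.6, 99.5]
    total = combined_availability(services)
    print(f"combined availability: {total:.2f}%")
    print(f"implied downtime per 30 days: {downtime_hours(total):.1f} h")
```

The point of the product is that ten services that each look fine on their own can still add up to a noticeably worse number for someone who depends on all of them.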

bananapub - 21 hours ago

anyone who's actually worked there, could you explain why they're finding scalability and reliability so hard? naively it seems like 'repo groups', ie clusters of repositories linked by being mutual forks, would be fairly isolated for the whole git storage layer, and everything else feels pretty easily parallelisable (issues, actions, etc, modulo taking locks now and then to submit results or whatever). and given that, surely you can incrementally deploy changes across those many shards to avoid most big outages?

are there big conceptual serialisations that I've missed? is it just not well factored? was the move to Azure just a catastrophically bad idea? some other thing?

imrozim - 21 hours ago

As a solo dev, GitHub going down is scary: all my code, all my history, one platform. This makes me want to keep local backups more seriously.
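
One way to do that, as a rough sketch rather than a recommendation: keep a bare mirror of each repository and refresh it on a schedule. `git clone --mirror` and `git remote update --prune` are standard git commands; the URL, the backup path and the helper names below are hypothetical. A mirror only covers git data (branches, tags, history), not issues or pull request discussions.

```python
import subprocess
from pathlib import Path


def mirror_command(remote_url, backup_dir):
    """Return the git command to run: a mirror clone the first time, a refresh after that."""
    backup_dir = Path(backup_dir)
    if backup_dir.exists():
        # Assumes the existing directory is a mirror created by an earlier run.
        return ["git", "-C", str(backup_dir), "remote", "update", "--prune"]
    return ["git", "clone", "--mirror", remote_url, str(backup_dir)]


def backup(remote_url, backup_dir):
    """Run the clone or refresh; raises CalledProcessError if git fails."""
    subprocess.run(mirror_command(remote_url, backup_dir), check=True)


if __name__ == "__main__":
    # Hypothetical repository URL and backup location, for illustration only.
    print(mirror_command("https://example.invalid/me/project.git",
                         "backups/project.git"))
```

Run from cron or a systemd timer, this keeps a local copy that still works when the forge is unreachable.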

devmor - 17 hours ago

Microsoft has been an abysmal steward of GitHub - the few nice features it has over self-hosting just aren't worth losing an hour or more to CI/CD downtime during daylight hours every week.

Yesterday was the last straw for me - I've begun migrating my personal private projects and my contracting firm's projects off of github.

OutOfHere - 19 hours ago

> we accelerated parts of migrating performance or scale sensitive code out of Ruby monolith into Go.

I am surprised that Microsoft is allowed to use Go. How long will it be before a bean counter forces a rewrite to a Microsoft favored language?

everfrustrated - 21 hours ago

So they haven't even finished migrating from their datacenters to Azure and have now started a project to add another cloud provider ("multi cloud")? Madness.

JimmaDaRustla - 18 hours ago

AS IF THEY POST THIS WHILE THEIR SEARCH IS BROKEN, what a circus

agluszak - 19 hours ago

Regarding their image with stats (https://github.blog/wp-content/uploads/2026/04/record-accell...) - what exactly are the ranges on y-axes? I doubt they had close to 0 PRs merged in 2023 ;)

yieldcrv - 21 hours ago

Ruby catching strays

Good chuckle out of this post, it’s crazy that neither Atlassian (Bitbucket) nor GitLab is capturing value out of this same agentic coding boom. I wish GitHub was separately publicly traded outside of Microsoft.

Nowhere to get exposure to this

000ooo000 - 19 hours ago

Load from paying customers vs. load from nonpaying users would be interesting to know. No doubt omitted deliberately.

dangus - 13 hours ago

Notice how the graphs have no Y axis. That's how you know it's manipulative.

This company is owned by one of the major causes of the AI boom and is hiding behind difficulty scaling, despite its parent company also being a premier source of scaling solutions.

GitHub: don't gaslight your customers.

It is not your customers' problem that you're having trouble scaling. Nobody cares. Give us the service we are paying you for and make it reliable, or else we'll choose something else.

After the words "Both of those incidents are not acceptable" the blog post should have been over. Nobody needs to hear a sob story about how your service is too popular.

huijzer - a day ago

I’m pretty sure my Forgejo instance on a Raspberry Pi is outperforming GitHub on reliability. It’s faster, that’s for sure.