Incident Report: Railway Blocked by Google Cloud (Resolved)

blog.railway.com

506 points by aarondf 13 hours ago


https://status.railway.com/incident/I23M92U0

r721 - 5 hours ago

>We have resolved this incident and a post mortem is available here.

>https://blog.railway.com/p/incident-report-may-19-2026-gcp-a...

>May 20, 07:57 UTC

https://status.railway.com/incident/I23M92U0

tardwrangler - 8 hours ago

Everyone is eager to point a finger at Google, but I've been a user of Railway for a while now, and I've seen enough nonsense to want to hear what GCP has to say about this before drawing any conclusions. Let's just say Railway has had problems like this before, and the way their team handles them does not inspire any confidence.

Regardless of how it happened, for me, this is the straw that broke the camel's back.

dangoodmanUT - 12 hours ago

It has been 0 days since GCP has taken down a startup (again).

You see this at least once a year. Never heard of this from AWS or Azure.

In all seriousness, this is why we don't use them. They have the most ergonomic cloud of the big three, then absolutely murder it by having this kind of reputation.

valgaze - 9 hours ago

May 2024 UniSuper incident: https://cloud.google.com/blog/products/infrastructure/detail...

https://www.unisuper.com.au/about-us/media-centre/2024/a-joi...

A joint statement from UniSuper CEO Peter Chun and Google Cloud CEO Thomas Kurian

8 May 2024

UniSuper and Google Cloud understand the disruption to services experienced by members has been extremely frustrating and disappointing. We extend our sincere apologies to all members.

While supporting UniSuper to bring its systems back online, Google Cloud has been conducting a root cause analysis.

Thomas Kurian has confirmed that the disruption arose from an unprecedented sequence of events, where an inadvertent misconfiguration during provisioning of UniSuper’s Private Cloud services ultimately resulted in the deletion of UniSuper’s Private Cloud subscription.

This is described as an isolated, “one-of-a-kind occurrence” that has never before occurred with any Google Cloud client globally. This should not have happened. Google Cloud has identified the sequence of events and taken measures to ensure it does not happen again.

Why did the outage last so long?

UniSuper had duplication across two geographies as protection against outages and data loss. However, the deletion of the Private Cloud subscription triggered deletion across both geographies.

Restoring the Private Cloud required significant coordination and effort between UniSuper and Google Cloud, including recovery of hundreds of virtual machines, databases, and applications.

binarycleric - 11 hours ago

How the heck do these things happen, especially with companies with huge monthly spend? At my last job we had some suspicious workloads running on AWS and our TAM reached out to us before taking any action. Who wants to bet this was some AI automation gone wrong and because GCP seems to be allergic to actually contacting a human to get a response, this just sits in some support queue that outsourced workers look at after a few hours just to give a canned response?

BitWiseVibe - 11 hours ago

As someone who runs some public APIs, the amount of spam from Railway IPs is insane. They have horrible abuse prevention. Hopefully this encourages them to improve their operations.

fjni - 12 hours ago

Wait… railway runs on GCP? Didn’t they make a whole thing about not “building a cloud on top of another cloud?”

Or did they just mean that they’re not renting VPSs but only metal from the cloud provider?

In my mind I was so excited that there was another provider not just paying one of the hyperscalers but at a minimum colocating and owning more of their stack. https://blog.railway.com/p/heroku-walked-railway-run

chatmasta - 10 hours ago

I thought Railway was building their own data centers? [0]

> The fact of the matter is, you simply cannot build a cloud on someone else’s cloud.

Indeed…

[0] https://blog.railway.com/p/launch-week-02-welcome

ksajadi - 7 hours ago

When you sign up for Railway, they have an uncommon way of making sure you have read and understood their T&C regarding abuse of their systems, including crypto mining, etc.

My guess is that many are abusing their free tier, causing them trouble with their service providers.

I take no joy in seeing Railway take a hit like this, even as a competitor, but free compute attracts all sorts of strange users. We've been there and decided early on to avoid free compute even if it costs us our top of the funnel.

danpalmer - 22 minutes ago

7 minutes from bug filing to account restoration. This shouldn't have happened in the first place, but that's an excellent response time from the support team.

eoswald - 12 hours ago

Sorry, I have a hard time blaming Google for this, when Railway seems to be having increasing trouble keeping the platform stable. Something like this should NOT take down an ENTIRE service. There should be a backup when literally your business is about being the reliable backend. This just seems like poor planning to me.

brokenodo - 11 hours ago

Well, as a 2 week tenured and very happy Railway customer until now, I am now a Render customer. Somehow DNS cut over within 1 min(!) and live after about 30 minutes of work. Not bad!

faangguyindia - 13 hours ago

Google Cloud also locked out a Korean government organization recently. The guy posted on the GCP subreddit.

Google really needs to improve their support team. It's strange that such a big corp can't even afford to have a proper support team.

UrbanNorminal - 11 hours ago

Is Google allergic to humans or something? Can't they just send an email or call the company before taking a wrecking ball to the entire company's infra? Are they stupid?

codegeek - 11 hours ago

This is bad. Even their own website is down at railway.com. Looks like total dependency on google cloud. Surprising for a company of their scale with all this VC money.

whh - 11 hours ago

This could kill a startup. I really don't like Google's automated and silent account murder functionality.

Avicebron - 12 hours ago

Isn't Railway the "the API key to delete the backups is in the prod database, because that's where the backups live, duh" guys?

throwaranay4933 - 14 hours ago

This screenshot from Discord suggests that the outage was caused by an automated GCP account ban: https://x.com/acgfbr/status/2056866780866351323

bearjaws - 11 hours ago

I will never leverage GCP in an enterprise setting, it's honestly amazing how hard they fumble the bag. It will be interesting to see when GCP support started working with them; from the updates, there was an hour and change between when they identified the issue and when GCP support was confirmed.

In the cloud space it seems like AWS does nothing and wins.

usernametaken29 - 10 hours ago

I didn’t know Railway, so with this misleading headline I thought a Google Cloud data centre was being built in the way of a railroad. That would have been a funny story to read.

enahs-sf - 12 hours ago

I respect what railway is doing but also would never run my business on such a platform.

cube00 - 6 hours ago

Railway "What we know so far: May 19th 2026": https://station.railway.com/community/what-we-know-so-far-ma...

padolsey - 11 hours ago

Does anyone know how this even happens inside the walls of google? Is it an automated process? How is such a (presumably) high revenue account just magically blocked without human intervention? I'm quite perplexed.

TheTaytay - 12 hours ago

I’ve seen a few smug “all your eggs in one basket” comments here.

I’m aware of some companies hosting their own metal and infra, but I’m not aware of large companies mitigating risk by hosting on separate cloud providers as a fallback mechanism. We might disagree with cloud provider choice, or think they should have been hosting their own metal, but that’s still an “all your eggs in one basket” choice, right?

Heck, they might even have multi-region fallback with GCP, but if GCP bans your account, that doesn’t matter.

Are there good examples of running a company of railway’s size so redundantly that their host could nuke one of their accounts and they’d just keep on trucking?

jkogara - 4 hours ago

Interestingly, upon logging in this morning I was presented with a new terms and conditions banner that required me to agree to not deploy a list of, to varying degrees, nefarious things (bots, torrents, "anything illegal", etc.). Is it likely that some of these workloads resulted in the auto restriction from GCP?

whh - 3 hours ago

There's that "automated action" again. Regardless of the architectural decision, it makes me incredibly uneasy relying on GCP if these types of things can happen.

mjy78 - 9 hours ago

All in on cloud so we don’t need to worry about backups. Now your subscription is the single point of failure.

zx8080 - 5 hours ago

For those who opened this link to read news about the real railway (with trains), it's not about it. Thank you for wasting my time!

jefborges - 11 hours ago

Railway is back, but I’m not sure if I can trust keeping my projects there, so I’m going to migrate to another company.

hnburnsy - 10 hours ago

From their founder on X...

"Absolutely. The Railway network is a mesh ring between AWS, GCP, and Metal

So: - High availability interconnects - High availability path routing between clouds - Database itself is high availability

However, Google's VPC itself is not. So we will add a shard to Metal and AWS"

jaspanglia - 8 hours ago

Cloud platform dependencies are becoming a huge single point of failure

sammy2255 - 10 hours ago

The 3-2-1 backup rule is pretty outdated in the world of cloud. You could have 3 complete copies of your data in different S3 buckets, but if they're all under the same account you've lost your blast radius protection

gnabgib - 13 hours ago

Dupe - join the discussion started an hour ago instead of query string work (12 points, 4 comments) https://news.ycombinator.com/item?id=48200827

thrownthatway - 9 hours ago

Huh.

Railway dot com

Has nothing to do with railways.

I wish software people would get their own words.

brunooliv - 5 hours ago

Having tried many of these hosting services to host/play with toy apps, DigitalOcean and Fly.io are both unparalleled GOATs.

Mengkudulangsat - 12 hours ago

That explains why all my vibe-coded hobby projects are down.

Thank God I'm not dealing with any public-facing sites! Would have been an expensive lesson for a newbie coder if my job depended on this.

orliesaurus - 11 hours ago

I wonder if someone has exploited a weird Google-safety automated process to report something on Railway which caused Google to block the whole thing.

yomismoaqui - 4 hours ago

Remember, the cloud is someone else's computer.

If that person turns it off you're screwed.

mattbee - 6 hours ago

The risk of an "upstream cloud provider" is not something you need to tolerate in your supplier of internet infrastructure!

zelon88 - 10 hours ago

Wild to me that any tech sector business would want to rent an operating environment to park their entire infrastructure into. This is the equivalent to traveling shoe salesmen setting up a tent in the parking lot of a strip mall.

r_lee - 11 hours ago

seriously, is it possible to trust GCP with critical data/services at this point if you're not a billion dollar company?

I'm exaggerating but someone said they got "auto banned"

what if that happens to a small account which hosts some really important data/services there?

pavelevst - 8 hours ago

Avoid vendor lock-in, have backups, and keep a disaster recovery standby (or a plan for quick recovery elsewhere).

tux - 11 hours ago

At this point you can’t trust Google anymore, it keeps breaking things. Imagine having Google AI do these things automatically. We'd have an apocalypse in a day.

dwa3592 - 11 hours ago

Wait, I thought railway was a cloud provider like AWS, GCP but better and more agile. At least that's the impression i got from their website.

leventhan - 10 hours ago

What's a good alternative to Railway?

steve1977 - 8 hours ago

Lesson learned: don't rely on a single hyperscaler, even (or especially) as a startup.

brokenodo - 12 hours ago

I’m a new customer and have been falling in love with Railway over the last 2 weeks, but this is quite the wake up call.

bilalq - 9 hours ago

Building a startup on GCP (or even Google Workspace) is an existential risk.

redanddead - 11 hours ago

one of the many reasons companies are cloud agnostic and dont want to get locked in

dlcarrier - 9 hours ago

This is the kind of outage worthy of a Kevin Fang video.

koolhead17 - 9 hours ago

Let's blame some rogue AI agent at GCP for causing this.

parineum - 11 hours ago

There's a lot of, what seems to me, unfounded blame being directed at Google for this. Isn't railway the company that just blamed Anthropic for deleting their prod database?

jujube3 - 10 hours ago

If you buy a cloud-on-a-cloud, you're a clown-on-a-clown.

ryanisnan - 13 hours ago

Yikes. I was wondering why my TLS certs were coming up as invalid.

eezing - 9 hours ago

“Deletion of private cloud subscription…”

Who deleted it?

bshack0 - 12 hours ago

so....what are we switching to y'all? cloud-run ? ;P

isninkhamiss - 11 hours ago

github got way more noise for less

ChrisArchitect - 11 hours ago

Earlier: https://news.ycombinator.com/item?id=48200827

jamwise - 8 hours ago

There goes a 9

WhereIsTheTruth - 6 hours ago

When your cloud depends on another cloud

All these companies are frauds

Drew-Aetherwave - 11 hours ago

It is killing me...

mcontrerazCL - 13 hours ago

All my fkn Postgres DBs are in Railway! What do I do now?

shevy-java - 9 hours ago

Do not become dependent on Google. Ever.

Osborn_Ojure - 11 hours ago

compute recovered, get ready boys!

iloveplants - 14 hours ago

seems like it's every day

fnord77 - 10 hours ago

wish I knew what "railway" is

rvz - 11 hours ago

Let me guess… Googler running AI agent in production that blocked this startup’s account.

paganel - 4 hours ago

Apparently this has nothing to do with real-world trains or the real-world rail system. At first, reading the title alone, I thought that some trains might have gotten stuck somewhere because of an IT (Google Cloud) failure. It's just another SaaS story.

rekabis - 12 hours ago

TL;DR: putting all your eggs into one basket is bad, man.
