Google Safe Browsing incident

statichost.eu

225 points by ericselin 3 days ago


dang - 3 days ago

PSA: Submitted title was 'PSA: Always use a separate domain for user content'. We've changed it per https://news.ycombinator.com/newsguidelines.html. Might be worth knowing for context.

SquareWheel - 3 days ago

It's generally good advice, but I don't see that Safe Browsing did anything wrong in this case. First, it sounds like they actually were briefly hosting phishing sites:

> All sites on statichost.eu get a SITE-NAME.statichost.eu domain, and during the weekend there was an influx of phishing sites.

Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
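For anyone who hasn't looked at it: the PSL is just a plain-text file of rules, one suffix per line, with comment lines identifying the requester. A hypothetical private-section entry for this host would look something like:

    // statichost.eu : https://statichost.eu
    // Submitted by the statichost.eu operators <contact address goes here>
    statichost.eu

Once browsers and reputation systems pick that up, each SITE-NAME.statichost.eu is treated as its own registrable domain rather than as part of one big one.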

From my reading, Safe Browsing did its job correctly in this case, and they restored the site quickly once the threat was removed.

kbolino - 3 days ago

Putting user content on another domain and adding that domain to the public suffix list is good advice.

So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced; it belongs with the author's own ignorance.

ericselin - 3 days ago

Since there's a lot of discussion about the Public Suffix List, let me point out that it's not just a web form where you can add any domain. There's a whole approval process, where one very important criterion is that the domain to be added has a large enough user base. When you have a large enough user base, you generally have scammers as well. That's what happened here.

It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.

In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.
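Concretely, "scoped to my own subdomain" just means setting cookies with no Domain attribute; a sketch (cookie name and value illustrative):

    Set-Cookie: session=abc123; Secure; HttpOnly; SameSite=Lax

With the Domain attribute omitted the cookie is host-only, so nothing served from a SITE-NAME.statichost.eu site ever sees it.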

Hence, the PSA. :)

ArnoVW - 3 days ago

As a CISO I am happy with many of the protections that Google creates. They are in a unique position, and probably the only ones to be able to do it.

However, I think the issue is that with great power comes great responsibility.

They are better than most organisations, and they work under many constraints that we cannot always imagine.

But several times a week we get a false "this mail is phishing" incident, where a mail from a customer or prospect is put in "Spam" with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that block all mail passing through an e-mail scanning product. These products wrap URLs so they can scan them when the mail is read; when they fail to detect a virus, they thus become de facto purveyors of viruses, and their entire domain is tagged as dangerous.
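To illustrate with a made-up vendor and URL scheme: a link in an incoming mail gets rewritten from

    https://example.com/report.pdf

to something like

    https://links.scanner-vendor.example/scan?url=https%3A%2F%2Fexample.com%2Freport.pdf

so every link in every customer's mail now points at the vendor's shared domain, and one piece of malware that slips through is enough to get that domain, and with it everyone's mail, flagged.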

I raised this with Google in May (!) and have been exchanging mail on a nearly daily basis: pointing out a new security product that has been blacklisted, explaining the situation to a new agent, etc.

Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.

So far the issue is not resolved (we are in October now!) and recently they have stopped responding. I appreciate that our organisation is not the US Government, but still, we pay upwards of $20K/year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.

If someone within Google reads this: you need to fix this.

SerCe - 2 days ago

What this post might be missing is that it's not just Google that can block your website. A whole variety of actors can, and any service that hosts user-generated content, not just HTML (a single image is enough), is at risk; really, any service is at risk. I've had to deal with many such cases: ISPs mistakenly blocking large IP prefixes, DPI software killing the traffic, random antivirus software blocking your JS chunk because of a hash collision, even small single-town ISPs sinkholing your domain because of auto-reports, and many more.

In the author’s case, he was at least able to reproduce the issues. In many cases, though, the problem is scoped to a small geographic region, but for large internet services, even small towns still mean thousands of people reaching out to support while the issue can’t be seen on the overall traffic graph.

The easiest steps you can take to be able to react to these issues are: 1. Set up NEL logging [1] that reports to completely separate infrastructure (example headers below). 2. Use RIPE Atlas and similar services in the hope of reproducing the issue and grabbing a traceroute.
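For reference, NEL is enabled with two response headers; a minimal sketch, assuming a hypothetical reporting endpoint (which, per point 1, should live on completely separate infrastructure):

    Report-To: { "group": "network-errors", "max_age": 2592000,
                 "endpoints": [{ "url": "https://nel-reports.example.net/ingest" }] }
    NEL: { "report_to": "network-errors", "max_age": 2592000, "failure_fraction": 1.0 }

Supporting browsers will then POST the DNS, TCP, TLS, and HTTP failures they hit on your domain to that endpoint, including failures your own servers never see.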

I’ve even attempted to create a hosted service for collecting NEL logs, but it seemed to be far too niche.

[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Net...

freehorse - 3 days ago

I don't see how a separate domain would solve the main issue here. If something on that separate domain was flagged, it would still affect all user content on that domain. If your business is about serving such user content, the main service of your business would be down, even though your main domain would still be up.

duxup - 3 days ago

It feels like unless you're one of the big social media companies, accepting user content is slowly becoming a larger and larger risk.

shadowgovt - 3 days ago

Not sure who changed the HN headline, but I appreciate the change. Especially since the concept in the headline is buried at the bottom of the post.

Post author is throwing a lot of sand at Google for a process that has (a) been around for, what, over a decade now and (b) works. The fact of the matter is that this hosting provider was too open, several of its users put up content intended to attack people, and as far as Google (or anyone else on the web) is concerned, the registrable domain is where the buck stops for that kind of behavior. This is one of the reasons why you host user-generated content off your main domain; several providers have gotten the memo, and it is unfortunate statichost.eu had not yet.

I'm sorry this domain admin had to learn an industry lesson the hard way, but at least they won't forget it.

andy_xor_andrew - 3 days ago

> In order to limit the impact of similar issues in the future, all sites on statichost.eu are now created with a statichost.page domain instead.

This read like a dark twist in a horror novel - the .page TLD is controlled by Google!

https://get.page/

veeti - 3 days ago

It can happen to anyone and poses a reputational risk. Once upon a time $workplace had a Zoho Form that would be blacklisted by Google Safe Browsing or Microsoft Edge for arbitrary periods of time, presumably because someone used Zoho to make a phishing site, leading to some very confused calls.

bluesmoon - 3 days ago

GitHub discovered the same thing a long, long time ago, which is why you now have the github.io domain.

sire-vc - 3 days ago

I am a solo developer. I recently created a new web app for a client. Google has marked it as phishing, so they can't use it. Obviously I can't do anything about it except report the error and wait. I'm worried that if I move it to a new domain, that one will get marked as well. Not sure what to do TBH.

kyledrake - 3 days ago

Google has some sort of internal flag for treating subdomains as different origins on some platforms. We don't get a complete takedown of Neocities every time a spam site is reported. It is likely that they were not on that list, but perhaps they have been manually added to whatever that internal list is at this point.

The public suffix list (https://publicsuffix.org/) is good, and if I were to start from scratch I would do it that way (with a different root domain), but it's not absolutely required; the search engines can and do make exceptions that don't exclusively rely on the PSL. You'll hit a few bumps in the road before that gets established, though.

Ultimately Google needs to have a search engine that isn't full of crap, so moving user content to a root domain on the PSL that is infested with phishing attacks isn't going to save you. You need to do prolific and active moderation to root out this activity or you'll just be right back on their shit list. Google could certainly improve this process by providing better tooling (a Safe Browsing report/response API would be extremely helpful), but ultimately the burden is on platforms to weed out malicious activity and prevent it from happening, and it's a 24/7 job.

BTW the PSL is a great example of the XKCD "one critical person doing thankless unpaid work" comic, unless that has changed in recent years. I am a strong advocate of moving PSL management to an annual-fee-driven structure (https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...); the maintainer deserves compensation for his work, and requiring the fee would allow the many abandoned domains on the list to drop off of it.

progbits - 3 days ago

Hosts phishing sites, gets blocked by anti-phishing mechanism. Works as expected from my point of view.

Get yourself on the public suffix list or get better moderation. But of course just moaning about bad Google is easier.

seanw265 - 3 days ago

I’ve got a random subdomain hosting a little internal tool. About twice a year, Google Safe Browsing decides it’s phishing and flags it. Sometimes they flag the whole domain for good measure.

Search Console always points to my internal login page, which isn’t public and definitely isn’t phishing.

They clear it quickly when I appeal, and since it’s just for me, I’ve mostly stopped worrying about it.

thehyperflux - 3 days ago

Google services simply behaved the way I would expect them to here. Who knows... they may even have saved some users from coming to harm.

junar - 3 days ago

I was curious how other browsers handle this. Apparently Safari and Firefox delegate to Google.

https://www.apple.com/legal/privacy/data/en/safari/

https://support.mozilla.org/en-US/kb/how-does-phishing-and-m...

Microsoft seems to do its own thing for Edge, though.

https://learn.microsoft.com/en-us/deployedge/microsoft-edge-...

dynm - 3 days ago

This is a bit of a tangent, but the whole concept of "domain reputation" can be infuriating. For example, my blog has been marked as suspicious by spamhaus.org: https://check.spamhaus.org/results?query=dynomight.net

As a result, some ISPs apparently block the domain. Why is it listed? I have no idea. There are no ads, there is no user content, and I've never sent any email from the domain. I've tried contacting spamhaus, but they instantly closed the ticket with a nonsensical response to "contact my IT department" and then blocked further communication. (Oddly enough, my personal blog does not have an IT department.)

Just like it's slowly become quasi-impossible for an individual to host their own email, I fear the same may happen with independent websites.

rgj - 3 days ago

So… you were hosting user-generated content on the same domain as your website, without using the PSL, and you blamed G when things went south?

By putting UGC on the same domain you also put your own security at risk, so they basically did you a favor…

kijin - 3 days ago

It's also good from a security perspective.

Anyone who can upload HTML pages to subdomain.domain.com can read and write cookies for *.domain.com, unless you declare yourself a public suffix and enough time has passed for all the major browsers to have updated themselves.
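To make that concrete, a minimal sketch (host and cookie names illustrative): a page an attacker uploads to evil.domain.com can respond with

    Set-Cookie: panel_session=attacker-chosen-value; Domain=domain.com; Path=/

and browsers will send that cookie to panel.domain.com as well. If domain.com were a registered public suffix, browsers would refuse to set a Domain=domain.com cookie in the first place.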

I've seen web hosts in the wild who could have their control panel sessions trivially stolen by any customer site. Reported the problem to two different companies. One responded fairly quickly, but the other one took several years to take any action. They eventually moved customers to a separate domain, so the control panel is now safe. But customers can still execute session fixation attacks against one another.

n1try - 2 days ago

Super interesting article! We were facing the exact same problem last Monday. I also wrote a couple of words about the incident: https://muetsch.io/how-google-accidentally-took-us-off-the-i....

sarathyweb - 3 days ago

Does anyone know if adding our domains to the Public Suffix List will prevent incidents like this?

Retr0id - 3 days ago

I don't like or trust Google, but "Use your own judgement and hard-earned Internet street smarts" doesn't work either, because the median internet user does not have anything resembling internet street smarts.

fukka42 - 3 days ago

Still not sure why it's legal for Google to slander companies like this. They often have no proof, or it's a false positive; meanwhile, they're screaming about how malicious you are.

johnwheeler - 3 days ago

Seems like a reasonable trade-off. I mean, six hours is not the worst thing in the world. What if you were hosting mission-critical such-and-such? Were you?

gwbas1c - 3 days ago

> To be fair, many or even most sites on the Google Safe Browsing blacklist are probably unworthy. But I’m pretty sure this was not the first false positive.

The bigger issue is that the internet needs governance. And, in the absence of regulation, someone has stepped in and done it in a way that the author didn't like.

Perhaps we could start by requiring that Google provide ways to contact a living, breathing human. (Not an AI bot that they claim is equivalent.)

oefrha - 3 days ago

Honestly, this is extremely basic stuff in hosting, not only because of Safe Browsing but also, and more importantly, because of cookie safety, etc. If a hosting provider didn't know (already bad enough) and turns to whining after being hit, then

> Static site hosting you can trust

is more like amateur hour static site hosting you can’t trust. Sorry.

haktan - 3 days ago

If user1.statichost.page gets blacklisted now, will it affect user2.statichost.page as well?

lucb1e - 3 days ago

I have the same issue. Think of my site as WeTransfer, but instead of only files, you can also use it as a link shortener or pastebin. Abuse works the same as on every other site or service: I do spot checks and users can report content. This was fine until uBlock Origin decided the website was malicious, per one of the lists that is default-enabled for everyone.

That list doesn't have a clear way to get off of it. I would be happy to give them a heads-up that their users are complaining about a website being broken, but there is no such thing, neither for users nor for me. In looking around, there are many "sources" that allegedly independently decided around the same day that my site needs to not work anymore, so now there's a dozen parties I need to talk to, and more popping up the further you look. Netcraft started sending complaints to the registrar (which got the whole domain put on hold), some other list said they sent abuse reports to the IP space owner (my ISP), public resolvers have started delisting the domain (pretending "there is no such domain" by returning NXDOMAIN), as well as the mentioned adblockers.

There's only one person who hasn't been contacted: the owner. I could actually do something about the abusive content...

It's like the intended path is that users start to complain "your site doesn't work" (works for me, wdym?) and you need to figure out what software it is they're using, what DNS resolver they use, what antivirus, what browser, whether a DoH provider is enabled... to find out who it might be that's breaking the site. People don't know how many blocklists they're using, and the blocklists don't give a shit if you're not a brand name they recognize. That's the only difference between my site and a place like GitHub: if I report "github.com hosts malware", nobody thinks "oh, we need to nuke that malicious site asap!"

I'd broaden the submitted post to say that it's not only Google that has too much power: these blocklists have no notification mechanism or general recourse method. It's a whack-a-mole situation which, as an open source site with no profit model (intentionally so), I will never win. Big tech is what wins. Idk if these lists do a trademark registration check or how they decide who's okay and who's not, but I suspect it's simply a brand name thing and your reviewer needs to know you.

> Luckily, Google provided me with a helpful list of the offending sites

Google is doing better than most others with that feature. Most "intelligence providers", which other blocklists (e.g. Quad9) use, are secretive about why they're listing you, or never even respond at all.

IKnowThings - 3 days ago

I have recently had the pleasure of speaking with Google senior leadership involved in the Safe Browsing product on the topic of getting my SaaS product placed on their "naughty list". The platform was down for six or so hours due to a false positive hit for phishing.

I have read A LOT of blogs/rants/incidents on social media about startups, small businesses, and individuals getting screwed by large companies in similar capacities. I am VERY sympathetic to those cries into the sky, shaking fists at clouds, knowing very well that we are all very small and that the large providers seem not to care. With that in mind, I am not blind to the privilege my organization has in being able to rope in Google to discuss root causes for incidents.

I am writing about it here because I believe most people will never be able to pull a key Google stakeholder into a 40-minute video call to deeply discuss the RCAs. The details of the discussion are probably protected by NDA, so I'll be speaking in general terms.

Google has a product called Web Risk (https://cloud.google.com/web-risk/docs/overview), I hear it's mostly used by Google Enterprise customers in regulated verticals and some large social media orgs. Web Risk protects the employees of these enterprise organizations by analyzing URLs for indicators of risk, such as phishing, brand impersonation, etc.

My SaaS platform is well established and caters mostly to large enterprise. I provide enterprise customers with optional branded SSO landing pages. Customers can either use sign-in from the branded site (SP-initiated) or redirect from their own internal identity provider to sign-in (IdP-initiated). The SSO site branding is directed by the customer, think along the lines of what Microsoft does for Entra ID branded sign-in pages. Company logo(s), name, visual styling, and other verbiage may be included. The branded/vanity FQDN is (company).productname.mydomain.com.

You may be able to see where I'm headed at this point... Why was my domain blocked? For suspected phishing.

A mutual enterprise customer was subscribed to Google's Web Risk. When their employees navigated to their SSO site, Google scanned it. Numerous heuristics flagged the branded SSO site as phishing and we were blocked by Safe Browsing across all major web browsers (Safari, Chrome, Firefox, Edge, and probably others). Google told us that had our customer put the SSO site on their Web Risk allow-list, we wouldn't have been blocked.

I'm no spring chicken; I cannot rely on, nor expect, a customer to do that, so I pressed for more, which led to a lengthy conversation on risk and the seemingly (from my perspective) arbitrary decisions made by a black box without any sort of feedback loop.

I was provided a tiny bit of insight into the heuristic secret sauce, which led to prescribed guidance on what could be done to greatly reduce the risk of getting a false positive phishing flag again. Those specifics I assume I cannot detail here; however, the overall gist of it is domain reputation. Google was unable to positively ascertain my domain's reputation.

My recommendation for those of you out there in the inter-tubes who have experienced false positive Safe Browsing blocks: think about what you can do to increase your domain's public reputation. Also, get a GCP account, so that if you do get blocked, you can open a ticket from that portal. I was told it would be escalated to the appropriate team and actioned within 10-15 minutes.

oxguy3 - 3 days ago

Another day, another IT company learning the hard way about the public suffix list, or well-known URIs, or some other well-documented-but-niche security technology.

I love that IT is a field where there's no required formal education track and you can succeed with any learning path, but we definitely need some better way to make sure new devs are learning about some of these gotchas.

npodbielski - 3 days ago

Wow, 19€ for web hosting. I pay like 8 for a whole VPS. Crazy.

bawolff - 2 days ago

How dare Google tag my website unsafe just because I'm hosting a bunch of phishing sites!

Like, I get that Google has a lot of power, but you'd think they would use a case where Google was actually in the wrong.

NitpickLawyer - 3 days ago

The PSA is good; the article is meh. There is too much misdirected anger towards Google here, IMO. I agree it sucks to be the false positive, but it'd suck even more to unknowingly be part of phishing campaigns.

On top of that, it is also recommended to serve user content from another domain for security reasons; it's much easier to avoid entire classes of exploits this way. For the site admins: treat it as a learning experience instead of lashing out at goog. In the long run you'll be better off, having learned a good lesson.