A human postmortem of the 1996 AOL outage

ngrok.com

62 points by EndEntire 3 days ago


jyounker - 9 minutes ago

Its first paragraph leaves me disappointed. It's 1996. Worries over a tech bubble are a few years away. And there are no real tensions with Russia, because Americans didn't give a damn about the first Chechen war.

dobermanz - 9 hours ago

It's 1996 - AOHell is loading, Aphex Twin blasts over 28.8, cordless phones hidden, a pizza is on the way…

nikau - 2 hours ago

In a touch of irony, this blog is throwing some null error

knuckleheads - 9 hours ago

Something that I have started doing lately is asking ChatGPT et al to check usenet for reactions from users about events (if it is the right 80's/90's time period). Sure enough, aol.sucks on usenet had some choice words about the outage:

>What does Cisco stand for?? Case's Internet System Crapped Out. That's right, Steve Case and his AOL pig fell victim to some mickey mouse networking equipment. Unfortunatly for AOL, they were the first ISP to feel real pain from using equipment made by Cisco Systems.

https://groups.google.com/g/alt.aol-sucks/c/iqjd7crtPs4 https://groups.google.com/g/alt.aol-sucks/c/K75nltM31Bw https://groups.google.com/g/alt.aol-sucks/c/vVup-HvlPWM

Here's a reporter asking for comments and getting laughed at and trolled: https://groups.google.com/g/alt.aol-sucks/c/mStonlu_H8E

Some more serious reactions over on comp.risks: https://catless.ncl.ac.uk/Risks/18/30#subj2 https://catless.ncl.ac.uk/Risks/18/31#subj3 https://catless.ncl.ac.uk/Risks/18/41#subj3

>Yesterday morning, I got a call because their mail system was backing up heavily. It took a while to discover the cause, but it turned out to be AOL. Because AOL's incoming mail from the Internet runs on relatively slow systems, and because they receive hundreds of thousands of Internet messages a day, they have 30 systems to receive incoming mail, all pointed at from the AOL.COM name. That means that any mail system trying to send mail to AOL would have to individually try all 30 addresses before giving up. Translate that to a 60 second (typical) wait for a connection timeout, and you've got a 30 minute time-in-queue for an AOL message.
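
The arithmetic in that quote can be checked with a quick sketch. It assumes the sending host tries each of the 30 addresses once, in sequence, and that every attempt runs to the full 60-second timeout; the function name is hypothetical, and real mail software may retry or order hosts differently.

```python
# Worst-case wait for one message when every receiving host times out.
# The figures (30 hosts, 60 s timeout) come from the quoted comp.risks post;
# the function name is made up for illustration.

def worst_case_queue_seconds(num_hosts: int, timeout_seconds: int) -> int:
    """Total wait if each host is tried once, sequentially, and each times out."""
    return num_hosts * timeout_seconds

total = worst_case_queue_seconds(num_hosts=30, timeout_seconds=60)
print(f"{total} seconds = {total // 60} minutes")
```

That gives 1800 seconds, which matches the 30 minutes of time-in-queue per message described in the post.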

The nanog archive on seclists was an interesting read too: https://seclists.org/nanog/1996/Aug/51

Flamewar over sendmail not handling the outage well:

> Remember the AOL outage? One host built up a backlog of 2000 messages for AOL---but, because it was running qmail, it didn't even slow down. Meanwhile, sendmail users were choking on much smaller queues.

https://groups.google.com/g/comp.mail.sendmail/c/TeNdv2laT94

ThrowawayTestr - 2 hours ago

The bit about Steve Schalchlin really affected me. The idea that someone's whole life could have been different, or much shorter, if they hadn't seen a piece of info at the right time. Gives me chills.

stigz - 8 hours ago

> We, ngrok, have sponsored Mac to write this post because we think it’s an underexplored perspective on the topic of reliability.

Uh, okay. Were there any reliability perspectives gained from this 30-year-old postmortem that would help us in the modern age? After reading the article, I feel the answer is "none". Not that I'm complaining; I love this era of the internet. But I fail to see any importance here.