What's up with all those equals signs anyway?

lars.ingebrigtsen.no

579 points by todsacerdoti 13 hours ago


kstrauser - 7 hours ago

For context, this is the Lars Ingebrigtsen who wrote the manual for Gnus[0], a common Emacs package for reading email and Usenet. It’s clever, funny, and wildly informative. Lars has probably forgotten more about email parsing than 99% of us here will ever have learned.

The manual itself says[1]:

> Often when I read the manual, I think that we should take a collection up to have Lars psycho-analysed.

0: https://www.gnu.org/software/emacs/manual/html_mono/gnus.htm...

1: https://www.gnus.org/manual.html

ruhith - 11 hours ago

The real punchline is that this is a perfect example of "just enough knowledge to be dangerous." Whoever processed these emails knew enough to know emails aren't plain text, but not enough to know that quoted-printable decoding isn't something you hand-roll with find-and-replace. It's the same class of bug as manually parsing HTML with regex: it works right up until it doesn't, and then you get congressional evidence full of mystery equals signs.
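To make the find-and-replace point concrete, here is a minimal Python sketch. The sample bytes and `naive_decode` are invented for illustration and are not what anyone is known to have run; `quopri` is the standard library's quoted-printable codec.

```python
import quopri

# Invented sample: "=3D" is a literal "=", "=C3=A9" is UTF-8 "é",
# and "=\r\n" is a soft line break that must vanish on decode.
raw = b"caf=C3=A9 costs 2 =3D two, and this line is wrapp=\r\ned here"


def naive_decode(data: bytes) -> bytes:
    """Hypothetical hand-rolled decoder: only knows the escapes its author thought of."""
    for escaped, plain in ((b"=3D", b"="), (b"=20", b" "), (b"=\r\n", b"")):
        data = data.replace(escaped, plain)
    return data


naive = naive_decode(raw)            # leaves "=C3=A9" untouched
proper = quopri.decodestring(raw)    # decodes every =XX escape and soft break

print(naive)
print(proper.decode("utf-8"))
```

The hand-rolled version looks fine on ASCII-only mail and falls over on the first escape that isn't in its replacement table.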

tiborsaas - 12 hours ago

> We see that that’s a quite a long line. Mail servers don’t like that

Why do mail servers care about how long a line is? Why don't they just let the client reading the mail worry about wrapping the lines?
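For reference, the limits come from the specs: RFC 5322 caps a line at 998 characters excluding the CRLF (and recommends 78), and quoted-printable (RFC 2045) caps encoded lines at 76 characters, using a trailing "=" as a soft line break. A small sketch with Python's standard `quopri` module, on an invented 199-byte line, shows the wrapping:

```python
import quopri

# Invented input: a single 199-byte line with no newlines in it.
long_line = b" ".join([b"word"] * 40)

encoded = quopri.encodestring(long_line)
lines = encoded.split(b"\n")

for line in lines:
    # Every line except the last ends in "=", the soft line break marker.
    print(len(line), line)

# Decoding removes the soft line breaks and restores the original line.
restored = quopri.decodestring(encoded)
```

The receiving client is expected to strip those "=" line endings again, which is the step that went wrong in the published emails.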

heikkilevanto - 11 hours ago

I thought the article would be about the various meanings of operators like = == === .=. <== ==> <<== ==>> (==) => =~=

TazeTSchnitzel - 8 hours ago

The most interesting thing to me wasn't the equals signs, which I knew are from quoted-printable, but the fact that when an equals sign appears, a letter that should have been preceding or following it is missing. It's as if an off-by-one error has occurred, where instead of getting rid of the equals sign, it's gotten rid of part of the actual text. Perhaps the CRLF/LF thing is part of it.
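One way to get that kind of off-by-one, sketched in Python. This is an assumption about the failure, not a reconstruction of the actual tool: `fixed_width_decode` is a hypothetical decoder that treats every soft line break as exactly three bytes ("=", CR, LF), and the sample text is invented. It reproduces the eaten letter, but on its own it does not explain why the "=" survives in the published text.

```python
import quopri

# Invented sample: one word split by a QP soft line break, CRLF line endings.
original = b"this is signi=\r\nficant"

# Suspected first step: CRLF normalised to LF before decoding.
normalised = original.replace(b"\r\n", b"\n")


def fixed_width_decode(data: bytes) -> bytes:
    """Hypothetical decoder that assumes a soft break is always '=' + CR + LF."""
    out = bytearray()
    i = 0
    while i < len(data):
        # A line-final "=" is recognised with either kind of line ending...
        if data[i:i + 1] == b"=" and data[i + 1:i + 2] in (b"\r", b"\n"):
            i += 3  # ...but three bytes are always skipped, as if CRLF followed
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(fixed_width_decode(original))    # CRLF input: the word is rejoined intact
print(fixed_width_decode(normalised))  # LF input: the "f" after the break is lost
print(quopri.decodestring(normalised)) # stdlib decoder accepts a bare LF soft break
```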

xg15 - 11 hours ago

I'm just wondering why this problem shows up now. Why do lots of people suddenly post their old emails with a defective QP decoder?

> For some reason or other, people have been posting a lot of excerpts from old emails on Twitter over the last few days.

At the risk of having missed the latest meme or social media drama: does anyone know what this "some reason or other" is?

Edit: Question answered.

thedanbob - 11 hours ago

I wrote my own email archiving software. The hardest part was dealing with all the weird edge cases in my 20+ year collection of .eml files. For being so simple conceptually, email is surprisingly complicated.

cachius - an hour ago

I'd like a good .eml viewer that undoes the quoted-printable transformation for the contained plain and HTML text. Useful for mails downloaded from Outlook.
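Not a viewer, but Python's standard `email` package already undoes the transfer encoding. A minimal sketch, assuming a simple message; `eml_text` and the sample message are made up for illustration, and real-world .eml files will have more edge cases than this handles:

```python
from email import message_from_bytes, policy


def eml_text(raw: bytes) -> str:
    """Return the decoded text body of a raw .eml message (plain preferred, then HTML)."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    # get_content() reverses quoted-printable/base64 and applies the declared charset.
    return body.get_content() if body is not None else ""


# Invented sample; for a file on disk, pass pathlib.Path(name).read_bytes() instead.
sample = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"1 + 1 =3D 2, caf=C3=A9, and a long line that was wrapp=\r\n"
    b"ed by the sender.\r\n"
)

text = eml_text(sample)
print(text)
```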

beejiu - 12 hours ago

> So what’s happened here? Well, whoever collected these emails first converted from CRLF (i.e., “Windows” line ending coding) to “NL” (i.e., “Unix” line ending coding). This is pretty normal if you want to deal with email. But you then have one byte fewer:

I think there is a second possible conclusion, which is that the transformation happened historically. Everyone assumes these emails are an exact dump from Gmail, but isn't it possible that Epstein was syncing emails from Gmail to a third party mail server?

Since the Stack Overflow post details the exact situation in 2011, I think we should be open to the idea that we're seeing data collected from a secondary mail server, not Gmail directly.

Do we have anything to discount this?

(If I'm not mistaken, I think you can also see the "=" issue simply by applying the Quoted-Printable encoding twice, not just by mishandling the line-endings, which also makes me think two mail servers. It also explains why the "=" symbol is retained.)
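The double-encoding idea is easy to try with Python's standard `quopri` module. A small sketch on invented text; it only shows that encoding twice and decoding once leaves "=" artifacts behind, not that this is what happened to these emails:

```python
import quopri

# Invented text, long enough that the encoder must insert soft line breaks.
text = (b"key=value " * 12).strip()

once = quopri.encodestring(text)    # "=" becomes "=3D", long lines end in "="
twice = quopri.encodestring(once)   # the "=" signs themselves get encoded again

decoded_once = quopri.decodestring(twice)

# A single decode only peels off one layer: the result is still QP-encoded,
# so "=3D" sequences and line-final "=" signs remain visible.
print(decoded_once)

# A second decode recovers the original text.
decoded_twice = quopri.decodestring(decoded_once)
```

Note that in this scenario no letters go missing, so it would account for the stray "=" signs but not for the dropped characters mentioned elsewhere in the thread.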

maartin0 - 11 hours ago

Fun how the archive.today article near the top has this exact issue

https://pastes.io/correspond

https://news.ycombinator.com/item?id=46843805

anthk - 31 minutes ago

Dear GNU's: rewrite the fetching core so it gets performant enough to not crawl under 10000 headers from either Usenet or email. No, not even native compilation is fast enough.

jojomodding - 13 hours ago

https://web.archive.org/web/20260203094902/https://lars.inge...

Did the site get the HN kiss of death?

JKCalhoun - 9 hours ago

(The title of the blog reminded me of the late Bob Pease [1], who had the signature, "What's all this XXX stuff, anyhow?" [2] where XXX might be "noise gain", "capacitor leakage"…)

[1] https://en.wikipedia.org/wiki/Bob_Pease

[2] https://www.qsl.net/n9zia/pease/index.html

lordnacho - 13 hours ago

I love how HN always floats up the answers to questions that were in my mind, without occupying my mind.

I, too, was reading about the new Epstein files, wondering what text artifact was causing things to look like that.

quibono - 13 hours ago

CRLF vs LF strikes again. Partly at least.

I wonder why even have a max line length limit in the first place? I.e. is this for a technical reason or just display related?

ErigmolCt - 6 hours ago

What's funny is that the failure mode here is so quietly destructive.

voxelghost - 11 hours ago

My main takeaway from this article is that I want to know what happened to the modified pigs with non-cloven hoofs.

lucb1e - 11 hours ago

    cat title | sed 's/anyway/in email/'

would save a click for those already familiar with =20 etc.

noduerme - 12 hours ago

Great. Can't wait for equal signs to be the next (((whatever this is))). Maybe it's a secret code. j/k

On a side note: There are actually products marketed as kosher bacon (it's usually beef or turkey). And secular Jews frequently make jokes like this about our kosher bros who aren't allowed to eat the real stuff for some dumb reason like it has too many toes.

MarginalGainz - 10 hours ago

It’s a fascinating case of 'Abstraction Leak'.

We’ve become so accustomed to modern libraries handling encoding transparently that when raw data surfaces (like in these dumps), we often lack the 'Digital Archeology' skills to recognize basic Quoted-Printable.

These artifacts (=20, =3D) are effectively fossils of the transport layer. It’s a stark reminder that underneath our modern AI/React/JSON world, the internet is still largely held together by 7-bit ASCII constraints and protocols from the 1980s.

seydor - 13 hours ago

TLDR "=\r\n" was converted to "=\n"

VoodooJuJu - 10 hours ago

[dead]

ValveFan6969 - 9 hours ago

[flagged]

brador - 12 hours ago

Could be worsened by inaccurate optical character recognition in some cases.

Back in those days optical scanners were still used.

zabzonk - 11 hours ago

People posting Excel formulae?

ccppurcell - 12 hours ago

Rock dots? You mean diacritics? Yeah someone invented them: the ancient Greeks, idiöt.