When your hash becomes a string: Hunting Ruby's million-to-one memory bug

mensfeld.pl

130 points by phmx 6 days ago


Animats - 16 hours ago

So they turned on GC after every allocation ("GC stress"), and:

"With GC.stress = true, the GC runs after every possible allocation. That causes immediate segfaults because objects get freed before Ruby can even allocate new objects in their memory slots."

That would seem to indicate a situation so broken that you can't expect anything to work reliably. The wrong-value situation would seem to be a subset of a bigger problem. It's like finding C code that depends on use-after-free working and which fails when you turn on buffer scrubbing at free.
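For anyone who has not used it, here is a minimal sketch of the GC stress switch the quote refers to. `GC.stress` is a real Ruby setting; the `with_gc_stress` helper and the small workload are hypothetical, made up for illustration. Healthy pure-Ruby code should run through this slowly but correctly, so a crash under it points at memory the GC is not being told about.

```ruby
# Hypothetical helper (not from the article): run a block with GC.stress
# enabled, then restore whatever the setting was before.
def with_gc_stress
  previous = GC.stress
  GC.stress = true   # GC now runs at every allocation opportunity
  yield
ensure
  GC.stress = previous
end

# Illustrative workload: allocate a handful of strings while GC runs constantly.
result = with_gc_stress do
  Array.new(20) { |i| "item-#{i}" }.map(&:upcase)
end

puts result.first
puts result.length
```

Restoring the previous value in `ensure` matters because leaving stress mode on makes everything after it extremely slow.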

lifthrasiir - 13 hours ago

> Million-to-one bugs are real, not theoretical. They happen during initialization and restart, not runtime. When they trigger, they cascade - 2,500 errors from one root cause. In high-restart environments, rare becomes routine.

Million-to-one bugs are not only real but frequent enough to matter, depending on which million. Many years ago I had a rare bug that corrupted timestamps in the logs, with an empirical probability of about one in 3 to 5 million (IIRC). It turned out that this seemingly benign bug was connected to a critical data corruption issue with real consumer complaints. (I have described this bug in detail in the past, see my past comment for details.)
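The "rare becomes routine" point is plain arithmetic: if each trial fails independently with probability p, the chance of at least one failure in n trials is 1 - (1 - p)^n. A rough sketch, where the function name and the trial counts are illustrative assumptions rather than figures from the article:

```ruby
# Probability of seeing at least one failure across `trials` independent
# attempts, each failing with probability `per_trial_probability`.
def chance_of_at_least_one(per_trial_probability, trials)
  1.0 - (1.0 - per_trial_probability)**trials
end

p_bug = 1.0 / 1_000_000 # the "million-to-one" rate

# Assumed trial counts, purely for illustration.
[1_000, 100_000, 1_000_000, 5_000_000].each do |trials|
  chance = chance_of_at_least_one(p_bug, trials)
  puts format("%9d trials: %.1f%% chance of at least one hit", trials, chance * 100)
end
```

At a million trials the chance is already close to two in three, which is why "which million" is the part that matters.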

mwkaufma - a day ago

A little strange to write up a bug hunt where the bug had already been fixed upstream in ffi rather than by the hunt itself. OP didn't fix the bug, though identifying that the upgrade was relevant is of some interest. The writing could have been clearer.

dmix - 13 hours ago

A good example of why everyone should learn a bit of C and low level memory management

fleshmonad - 21 hours ago

LLM slop. Why do people (presumably) take the time to debug something like this, do tests and go to great lengths, but are too lazy to do a little manual writeup? Maybe the hour saved makes up for being associated with publishing AI slop under your own name? Like there is no way the author would have written a text that reads more convoluted than what we have here.