When your hash becomes a string: Hunting Ruby's million-to-one memory bug
mensfeld.pl | 130 points by phmx 6 days ago
So they turned on GC after every allocation ("GC stress"), and:
"With GC.stress = true, the GC runs after every possible allocation. That causes immediate segfaults because objects get freed before Ruby can even allocate new objects in their memory slots."
That would seem to indicate a situation so broken that you can't expect anything to work reliably. The wrong-value situation would seem to be a subset of a bigger problem. It's like finding C code that depends on use-after-free working and which fails when you turn on buffer scrubbing at free.
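For anyone who wants to try stress mode locally, here is a minimal sketch of switching it on around a suspect code path. `GC.stress` is a real Ruby API; the `with_gc_stress` helper and the sample allocations are made up for illustration and are not taken from the article or from ffi.

```ruby
# Minimal sketch: enable GC.stress only around a block of code.
# `with_gc_stress` is a hypothetical helper, not part of Ruby or ffi.
def with_gc_stress
  previous = GC.stress
  GC.stress = true # run the GC at every allocation opportunity (very slow)
  yield
ensure
  GC.stress = previous # always restore the earlier setting
end

result = with_gc_stress do
  # Pure-Ruby allocations should survive stress mode unchanged. A native
  # extension holding a reference the GC cannot see is where trouble would
  # be expected to show up.
  Array.new(50) { |i| { "key#{i}" => i.to_s } }
end

puts result.all? { |h| h.is_a?(Hash) }
```

Keeping the stressed region small matters in practice, since every allocation inside it triggers a collection.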
That’s exactly what it was. He discovered the customer was using a version of ffi that had this “use-after-free” (ish) bug, but the question “is this actually what my customer was seeing or is there _another_ bug lurking” still needed to be answered.
> Million-to-one bugs are real, not theoretical. They happen during initialization and restart, not runtime. When they trigger, they cascade - 2,500 errors from one root cause. In high-restart environments, rare becomes routine.
Million-to-one bugs are not only real but frequent enough to matter, depending on which million. Many years ago I had a rare bug that corrupted timestamps in the logs, with an empirical probability of about one in 3-5 million (IIRC). Turned out that that seemingly benign bug was connected to a critical data corruption issue with real consumer complaints. (I have described this bug in detail in the past, see my past comment for details.)
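To put rough numbers on "rare becomes routine": a back-of-the-envelope sketch, assuming each restart is an independent trial with a one-in-a-million chance of hitting the bug. The restart counts are hypothetical and only there to show the shape of the curve.

```ruby
# Chance of at least one failure in n independent trials, each failing
# with probability prob: 1 - (1 - prob)^n.
def prob_at_least_one(prob, trials)
  1.0 - (1.0 - prob)**trials
end

bug_prob = 1.0 / 1_000_000 # "million-to-one", taken from the quote

# Hypothetical restart counts, for illustration only.
[1_000, 100_000, 1_000_000, 5_000_000].each do |restarts|
  chance = prob_at_least_one(bug_prob, restarts) * 100
  printf("%9d restarts -> %.1f%% chance of at least one hit\n", restarts, chance)
end
```

Under these assumptions a fleet that accumulates a million restarts has roughly a 63% chance of having seen the bug at least once.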
A little strange to write up a bug hunt that was resolved by the ffi upstream already, and not by the hunt itself. OP didn't fix the bug, though identifying that the upgrade was relevant is of some interest. Writing could have been clearer.
The bug that was fixed in upstream manifested differently than what he was experiencing so the journey was to validate it for his case.
OTOH I'm a bit surprised he didn't pull back earlier and suggest to his user to update to the latest version though and let him know.
15 or so years ago I had a similar journey - a single python interpreter "impossible" segfault in production that turned out to be a bug in glibc realloc, that had already been fixed in an update, we just didn't figure out to even look for one until we'd narrowed it down that far. (We were shipping custom Debian installs on DVD, a fair number of our customer installs weren't internet accessible so casual upgrades were both impossible and unwanted, but it was also a process mistake on my part to not notice the existence of the upgrade sooner.)
Never wrote it up externally because it was already solved and "Debian updates to existing releases are so rare that you really want to pay attention to all of them" (1) was already obvious (2) was only relevant to a really small set of people (3) this somewhat tortured example wasn't going to reach that small set anyway. (Made a reasonable interview story, though.)
A good example of why everyone should learn a bit of C and low level memory management
LLM slop. Why do people (presumably) take the time to debug something like this, do tests and go to great lengths, but are too lazy to do a little manual writeup? Maybe the hour saved makes up for being associated with publishing AI slop under your own name? Like there is no way the author would have written a text that reads more convoluted than what we have here.
> Why do people […] take the time to debug […] but are too lazy to do a little manual writeup[?]
They like to code. They don’t like to write.
I’m not excusing it, but after you asked the question the conclusion seems logical.
> They like to code. They don’t like to write.
People like reading LLM slop less than either of those. So it should become a common understanding not to waste your (or our) time to "write" this. It's frustrating to give it a chance then get rug-pulled with nonsense and there's really no reason to excuse it.
I read it just fine and everything made sense in it.
I would spend similar time debugging this if I were the author. It's a pretty serious bug, a non obvious issue, and would be impossible to connect to the ffi fix unless you already knew the problem.
I have no idea whether the text was generated from an LLM, but “slop” it absolutely is not - it’s clearly a very logically ordered walkthrough about a very thorough debugging process.
If you call anything that comes out of a model “slop” the term loses all meaning.
Sorry, why is this LLM slop? I only got about halfway through because I don’t care about this enough to finish the read, but I don’t see the “obvious LLM” signal you do.
It's clearest in the conclusion.
I still don’t see it.
I feel like the “this is AI” crowd is getting ridiculous. Too perfect? Clearly AI. Too sloppy? That’s clearly AI too.
Rarely is there anything concrete that the person claiming AI can point to. It’s just “I can tell”. Same confident assurance that all the teachers trusting “AI detectors” have.
I came to this thread hoping to read an interesting discussion of a topic I don’t understand well; instead it’s this
I have opened a wager re: detecting LLM/AI use in blogs: https://dkdc.dev/posts/llm-ai-blog-challenge/
I feel like it’s on every other article now. The “this is ai” comments detract way more from the conversation than whatever supposed ai content is actually in the article.
These ai hunters are like the transvestigators who are certain they can always tell who’s trans.
No. These articles are annoying to read, the same dumb patterns and structures over and over again in every one. It's a waste of time; the content gives off a generic tone and it's not interesting.
Are we reading the same article?
Also, you do realize that writing is taught in an incredibly formulaic way? I can't speak to English as second language authors, but I imagine it doesn't make it easier.
say that! that’s independent of whether AI/LLM tools were used to write it and more valuable (“this was boring and repetitive” vs “I don’t like the tool I suspect you may have used to write this”)
So is the vast majority of comments on HN (and in any comment section of any website), and that was true well before LLMs came into being, yet we give them the benefit of the doubt. Users on forums tend to behave in a starkly bot-like way, often having a very limited set of responses pertaining to their particular hobby horses, so much so that others could easily predict how the most prolific users would react to any topic and in what precise words.
Now, apparently, we have a generation of "this is AI slop!" "bots".
> I will make a bet for $1,000,000!
> I won't actually make this bet!
> But if I did make this bet, I would win!
???
if two parties put up $1,000,000 each and I get a large cut I’ll do the work! one commenter already wagered $1,000, which I’d easily win, but I suspect this would take me idk at least a few days of work (not worth the time). and, again, for a million dollars I’d make sure I win
see other comment though, the point is that assessing quality of content on whether AI was used is stupid (and getting really annoying)
I don't have a million dollars but I'll take you up on it for like a grand. I'm serious, email me.
the problem is it’s a lot of work (not actually worth it for me for a thousand dollars) — but you cannot win
just one scenario, I write 100 rather short, very similar blog posts. run 50 through Claude Code with instructions “copy this file”. have fun distinguishing! of course that’s an extreme way to go about it, but I could use the AI more and end up at the same result trivially
This is so childish and pathetic it doesn't deserve a response.
why? LLM/AI use doesn’t denote anything about style or quality of a blog, that’s the point — and why this type of commentary all over Hacker News and elsewhere is so annoying.
obviously if a million dollars are on the line I’m going to do what I can to win. I’m just pointing out how that can be taken to the extreme, but again I can use the tools more in the spirit of the challenge and (very easily) end up with the same results