Ask HN: We just had an actual UUID v4 collision...

135 points by mittermayr 10 hours ago


I know what you're thinking... and I still can't believe it, but...

This morning, our database flagged a duplicate UUID (v4). I checked, thinking it may have been a double-insert bug or something, but no.

The original UUID was from a record added in 2025 (about a year ago), and today the system inserted a new document with a fresh UUIDv4 and it came up with the exact same one:

b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd

We're using this: https://www.npmjs.com/package/uuid

I thought this is technically impossible, and it will never happen, and since we're not modifying the UUIDs in any way, I really wonder how that.... is possible!? We're literally only calling:

import { v4 as uuidv4 } from "uuid";

const document_id = uuidv4();

... and then insert into the database, that's it.

Additionally, the database only has about 15,000 records, and now one collision. Statistically... impossible.

Has that ever happened to anyone?! What in the...

jandrewrogers - an hour ago

This is surprisingly common.

The security of UUIDv4 is based on the assumption of a high-quality entropy source. This assumption is invalidated by hardware defects, normal software bugs, and developers not understanding what "high-quality entropy" actually means and that it is required for UUIDv4 to work as advertised.

It is relatively expensive to detect when an entropy source is broken, so almost no one ever does. They find out when a collision happens, like you just did.

UUIDv4 is explicitly forbidden for a lot of high-assurance and high-reliability software systems for this reason.

throwaway_19sz - 7 hours ago

Funny story no one will believe, but it’s true. A good friend of mine joined a startup as CTO 10 years ago, high growth phase, maybe 200 devs… In his first week he discovered the company had a microservice for generating new UUIDs. One endpoint with its own dedicated team of 3 engineers …including a database guy (the plot thickens). Other teams were instructed to call this service every time they needed a new ‘safe’ UUID. My pal asked wtf. It turned out this service had its own DB to store every previously issued UUID. Requests were handled as follows: it would generate a UUID, then ‘validate’ it by checking its own database to ensure the newly generated UUID didn’t match any previously generated UUIDs, then insert it, then return it to the client. Peace of mind I guess. The team had its own kanban board and sprints.

e12e - an hour ago

Some discussion here:

https://github.com/uuidjs/uuid/issues/546

Eg:

> FWIW, I just tested crypto.getRandomValues() behavior on googlebot and it is also deterministic(!)

dweez - an hour ago

Good moment to revisit this fun article: https://jasonfantl.com/posts/Universal-Unique-IDs/

If the entire universe were turned into a giant computer and did nothing but generate uuids until its heat death, how many bits would you need for the ID space?

juancn - 4 hours ago

Something off on how the RNG is initialized? Lack of entropy?

If the rng is not customized it will use:

    const rnds8 = new Uint8Array(16);
    export default function rng() {
        return crypto.getRandomValues(rnds8);
    }

getRandomValues doesn't specify a minimum amount of entropy.

adyavanapalli - 8 hours ago

What you're talking about is so extremely rare that it's much more likely that the entire Earth is destroyed by an asteroid right this inst...

mdavid626 - 11 minutes ago

Or there is some other explanation, eg. somebody messed with the request manually, or with the db.

merlindru - 4 hours ago

Gotta be a seeding issue. If it's not, and you can prove it, you're about to be a little famous probably :P

sqquima - 20 minutes ago

Meta, but if I had a question like this, I'd likely have asked on Twitter or Reddit first. I'll keep in mind using HN as an alternative Q&A site.

danfritz - 18 minutes ago

Always let your db generate UUIDs. On Postgres this is easy: since v18 it supports UUID v7!

There is no need to set uuids through javascript or node imo

mittermayr - 10 hours ago

I fully agree. It makes no sense. Yet...

My only guess is that we originally generated UUIDv4s on a user's phone before sending them to the database, while the UUID generated this morning that collided was created on an Ubuntu server.

I don't fully know how UUIDv4s are generated and what (if anything) about the machine they are generated on is part of the algorithm, but that's really the only change I can think of: they used to be generated on-device by users, and for many months now they have been generated on the server.
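
To make that question concrete, here is a minimal sketch of how a version 4 UUID is typically built from 16 random bytes. `sketchUuidV4` is a hypothetical helper written for illustration, not the `uuid` package's actual source. Nothing about the machine goes in: 122 bits come straight from the random source and 6 bits are fixed (version and variant), so two devices that receive the same random bytes produce the same UUID.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper for illustration; not the uuid package's implementation.
function sketchUuidV4(bytes: Uint8Array = randomBytes(16)): string {
  if (bytes.length !== 16) throw new Error("need exactly 16 bytes");
  const b = Uint8Array.from(bytes); // copy so the caller's buffer is not modified
  b[6] = (b[6] & 0x0f) | 0x40; // version 4 in the high nibble of byte 6
  b[8] = (b[8] & 0x3f) | 0x80; // variant bits "10" at the top of byte 8
  const hex = Array.from(b, (x) => x.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Identical input bytes always give an identical UUID.
console.log(sketchUuidV4(new Uint8Array(16)));
```

The colliding ID in the post has `4` and `b` in those two fixed positions, which is consistent with a well-formed v4 value; everything else depends only on the quality of the random bytes.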

Geee - 7 hours ago

According to the many-worlds interpretation of quantum mechanics, there's bound to be one branch of universe where every UUID is the same. Can you imagine what those guys are thinking?

jbverschoor - an hour ago

Most plausible cause: the uuid package depends on some random number generator package, which has recently been compromised in order to make “random” numbers predictable. As a result, many crypto (SSL + currency) projects are compromised due to a supply-chain attack.

sudb - an hour ago

This is the first time I have experienced some vindication that choosing CUID2[1] for one of my projects was actually a good idea.

1. https://github.com/paralleldrive/cuid2

tumdum_ - 8 hours ago

Poorly seeded prng.

baq - an hour ago

the vm you're running on virtualized all the entropy away.

sbuttgereit - an hour ago

> I thought this is technically impossible

No, very technically possible... though, with good randomness, very, very unlikely.

But nothing technically prevents a UUIDv4 from generating a duplicate value.

shortercode - 32 minutes ago

Fun thing about random is that these things happen. UUIDv7 is less prone to this as it includes both a time component and a random one. I’ve been using ULID in a few projects; it has similar attributes to UUIDv7 but is more space-efficient.

leni536 - 7 hours ago

It's not happening by chance, there is a bug somewhere.

From what I skimmed, the package should just call the JS runtime's crypto.randomUUID(). I think it should always be properly seeded.

I think it is extremely unlikely that the runtime has a bug here, but who knows? What js runtime do you use?

jordiburgos - 9 hours ago

Please, do not use b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd, I checked my database and I was using it already.

serf - 10 hours ago

1 in 4.73 × 10²⁸

1 in 47.3 octillion.

I'd suspect a race condition or some other naive mistake; otherwise I'd be stocking up on lottery tickets.

(lol at the other user posting at the same time about the lottery ticket.. great minds and all that.)
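
For anyone who wants to check the figure, a rough sketch using the birthday approximation, assuming the roughly 15,000 records from the original post and the 122 random bits of a UUIDv4. `collisionProbability` is a name made up for this example.

```typescript
// Birthday approximation: p ≈ (number of pairs) / (size of the ID space).
// Only valid while p is tiny, which it is here.
function collisionProbability(n: number, randomBits: number): number {
  const pairs = (n * (n - 1)) / 2; // distinct pairs among n IDs
  return pairs / 2 ** randomBits;
}

const p = collisionProbability(15_000, 122);
console.log(`about 1 in ${(1 / p).toExponential(2)}`);
```

This lands at roughly 1 in 4.7 × 10²⁸ for any collision among all 15,000 IDs, assuming the random bits really are uniform and independent. That assumption is exactly what a broken entropy source violates.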

glaslong - 8 hours ago

Buy some lava lamps

NKosmatos - 7 hours ago

> I thought this is technically impossible

Actually it's not impossible, but very very improbable.

P.S. You should play a lottery/powerball ticket

P.P.S. Whenever I use the word improbable, the https://hitchhikers.fandom.com/wiki/Infinite_Improbability_D... comes to mind

beardyw - 10 hours ago

Just a stupid question, but why not append the date, even in seconds, as hex? It's just a few bytes and would guarantee that everything that is OK now will be OK in the future.

wg0 - 9 hours ago

Would UUID v7 be more collision-proof? Hard to say: it takes time into account, but the number of entropy bits is reduced, so UUIDs generated at exactly the same time have a higher chance of a collision because the random part covers a much smaller space.

Thoughts?
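
A sketch of the basic UUIDv7 layout from RFC 9562 may help frame the trade-off: a 48-bit Unix timestamp in milliseconds, 4 version bits, 2 variant bits, and 74 random bits in the simplest form (the RFC also permits using some of those bits for a counter or sub-millisecond time). `sketchUuidV7` is a hypothetical helper for illustration, not a library API.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper following the RFC 9562 field layout; not a library API.
function sketchUuidV7(
  nowMs: number = Date.now(),
  rand: Uint8Array = randomBytes(10),
): string {
  if (rand.length !== 10) throw new Error("need exactly 10 random bytes");
  const b = new Uint8Array(16);
  // Bytes 0..5: 48-bit big-endian Unix timestamp in milliseconds.
  let ts = nowMs;
  for (let i = 5; i >= 0; i--) {
    b[i] = ts % 256;
    ts = Math.floor(ts / 256);
  }
  // Bytes 6..15: random, then 6 bits are overwritten, leaving 74 random bits.
  b.set(rand, 6);
  b[6] = (b[6] & 0x0f) | 0x70; // version 7
  b[8] = (b[8] & 0x3f) | 0x80; // variant bits "10"
  const hex = Array.from(b, (x) => x.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

console.log(sketchUuidV7());
```

So within a single millisecond there are fewer random bits than in v4 (74 versus 122), but two IDs can only collide if they share the same millisecond. With a broken random source the timestamp still keeps IDs from different milliseconds apart.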

lyfeninja - 7 hours ago

Although incredibly rare, it's not impossible, so it's probably best to just plan for collisions. A simple retry should suffice. But I agree, I feel like something is going on somewhere else ...
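
A minimal sketch of what such a retry could look like. `InMemoryStore` is a made-up stand-in for a table with a unique constraint on the ID, and `insertWithRetry` is a hypothetical helper; a real database would signal the duplicate through its own error, which is an assumption left out here.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical stand-in for a table with a unique constraint on id.
class InMemoryStore {
  private rows = new Map<string, unknown>();

  // Returns false instead of throwing when the id already exists.
  insert(id: string, doc: unknown): boolean {
    if (this.rows.has(id)) return false;
    this.rows.set(id, doc);
    return true;
  }
}

// Generate a fresh id per attempt and give up after maxAttempts duplicates.
function insertWithRetry(
  store: InMemoryStore,
  doc: unknown,
  generateId: () => string = () => randomUUID(),
  maxAttempts = 3,
): string {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const id = generateId();
    if (store.insert(id, doc)) return id;
  }
  throw new Error(`no unique id after ${maxAttempts} attempts; check the entropy source`);
}
```

Giving up after a few attempts is deliberate: repeated duplicates point at a broken generator rather than bad luck, and that is worth surfacing as an error.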

not_math - 7 hours ago

Reminds me of some code I saw running in production. Every time we added a new entry, we were pulling all the UUIDs from this table, generating a new UUID, and checking for collisions up to 10 times.

OutOfHere - 5 hours ago

This is why I prefer to use a random base32 string over UUID. At least you get a proper 128 bits of entropy instead of just 122 bits as with UUIDv4. That's a 64x difference in collision probability. I always thought UUIDs were a toy, not for serious use. If you control the strings, you can even make a longer ID.

Also, numerous applications that use a unique ID per record frequently need to check for ID collisions. I know I do for a short URL generator.
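
A sketch of one way to generate such an ID, assuming Node's `crypto.randomBytes` as the random source; `randomBase32Id` and the lowercase alphabet are choices made for this example. Each character carries 5 bits, so 26 characters give 130 bits, and the 64x figure is 2^(128 − 122).

```typescript
import { randomBytes } from "node:crypto";

// RFC 4648 base32 alphabet, lowercased (a choice made for this sketch).
const ALPHABET = "abcdefghijklmnopqrstuvwxyz234567";

// Hypothetical helper: one random byte per character, keeping the low 5 bits.
// 256 is a multiple of 32, so masking does not bias the result.
function randomBase32Id(length = 26): string {
  const bytes = randomBytes(length);
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += ALPHABET[bytes[i] & 31];
  }
  return out;
}

console.log(randomBase32Id()); // 26 chars * 5 bits = 130 random bits
```

The extra bits only help if the random source is sound; a generator that repeats its output repeats these IDs just as it would repeat UUIDs.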

AndreyK1984 - 8 hours ago

Why not have a timestamp-UUID instead?

naikrovek - 9 hours ago

The chance of a UUIDv4 collision is very low, but it is never zero.

If everything is done properly, then this is very likely the one and only time anyone involved in the telling or reading of this account will ever experience this.

ares623 - 8 hours ago

Buy a lottery ticket

sublinear - 7 hours ago

> We're using this: https://www.npmjs.com/package/uuid

Why? There's a built-in for this.

https://nodejs.org/api/crypto.html#cryptorandomuuidoptions
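
For reference, a minimal sketch of the built-in call, assuming a Node.js version that ships `crypto.randomUUID()`; the variable name mirrors the original post.

```typescript
import { randomUUID } from "node:crypto";

// Built-in version 4 UUID generator; no third-party package needed.
const documentId: string = randomUUID();
console.log(documentId);
```

Switching the call site removes a dependency, but it does not by itself explain or prevent a collision: either way the result is only as good as the random source underneath.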
