Ask HN: We just had an actual UUID v4 collision...

135 points by mittermayr 10 hours ago

I know what you're thinking... and I still can't believe it, but...

This morning, our database flagged a duplicate UUID (v4). I checked, thinking it may have been a double-insert bug or something, but no.

The original UUID was from a record added in 2025 (about a year ago), and today the system inserted a new document with a fresh UUIDv4 and it came up with the exact same one:

b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd

We're using this: https://www.npmjs.com/package/uuid

I thought this is technically impossible, and it will never happen, and since we're not modifying the UUIDs in any way, I really wonder how that.... is possible!? We're literally only calling:

import { v4 as uuidv4 } from "uuid";

const document_id = uuidv4();

... and then insert into the database, that's it.

Additionally, the database only has about 15,000 records, and now one collision. Statistically... impossible.

Has that ever happened to anyone?! What in the...
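
For scale, the odds by pure chance can be estimated with the birthday bound. This is a minimal sketch that assumes an ideal random source (which is exactly what is in question here); `collisionProbability` is a hypothetical helper name, not part of the uuid package:

```javascript
// Approximate probability of at least one collision among n random IDs
// drawn uniformly from 2^bits values (birthday bound: p ≈ n^2 / 2^(bits+1)).
function collisionProbability(n, bits) {
  return (n * n) / Math.pow(2, bits + 1);
}

// UUIDv4 has 122 random bits (6 of the 128 are fixed version/variant bits).
const p = collisionProbability(15000, 122);
console.log(p.toExponential(2)); // on the order of 1e-29
```

With a healthy generator, a hit at 15,000 records is far less likely than a bug, which is why the replies focus on the entropy source.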

jandrewrogers - an hour ago

This is surprisingly common.

The security of UUIDv4 is based on the assumption of a high-quality entropy source. This assumption is invalidated by hardware defects, normal software bugs, and developers not understanding what "high-quality entropy" actually means and that it is required for UUIDv4 to work as advertised.

It is relatively expensive to detect when an entropy source is broken, so almost no one ever does. They find out when a collision happens, like you just did.

UUIDv4 is explicitly forbidden for a lot of high-assurance and high-reliability software systems for this reason.

LocalH - 8 minutes ago

This is why Cloudflare has done what they did with the lava lamp wall. Not that the wall is such a great source of entropy on its own - I'm sure it's not their only source, but you can never have too many sources of entropy - but it makes it visible in a way that can grab those who don't fully understand the concepts of RNGs and how entropy plays into that.
The more sources of entropy, the more closely you approach "perfect" randomization. And a large chunk of those entropy sources need to be non-deterministic. Even at the small scale, local applications running on local systems (like games) can use things like the mouse coordinates, the timings between button presses, or the exact frame count between game start and the player pressing Start to greatly enhance randomness while still using PRNGs under the hood.
Yes, for the latter, that's technically deterministic (and the older the game considered, the more deterministic it is; see TAS runs of old games obliterating the "RNG"). But when you have fifty different parameters feeding into the initial seed, that's fifty things an attacker would have to perfectly predict or replay (and there are other ways to avoid replay attacks that can be layered on top).
If Cloudflare had fewer than 100 different sources of entropy, I'd be disappointed. And that's assuming their algorithm for blending those entropy sources into a single seed value is good.
thecloud - an hour ago

Thanks for the insight! Mind expanding on what alternatives are being used in high reliability systems instead of UUIDv4?
- filcuk - 37 minutes ago
  
  The latest UUID (7?) uses half random gen, half timestamp. This not only makes it sortable by creation, but would also make a collision like this impossible.
  - stanmancan - 16 minutes ago
    
    It's still possible in most implementations of UUIDv7.
    UUIDv7 assigns the first 48 bits for the timestamp in milliseconds. You can generate a lot of UUIDs in a millisecond though!
    Then you have another 12 bits that you can use as you wish; "rand_a". The spec has a few methods they suggest on how to use these bits including 12 bits of random data, using it for sub-millisecond timestamps, or creating a monotonic counter, but each has its downsides:
    - Purely random data means you can still run into collisions and anything within the same millisecond is unordered
    - With sub-millisecond timestamps you can still run into collisions
    - Monotonic counters can overflow before the next tick, and they're only monotonic to the system that's generating the UUID
    You can steal some of the 62 bits in rand_b if you want as well; you can use rand_a for sub-millisecond accuracy, and then use a few bits of rand_b for a monotonic counter. There's still a chance of collision here, but it's exceedingly low at the expense of less truly random data at the end.
    If you want truly collision free, you'd also need to assign a couple of bits to identify the subsystem generating the UUID so that the monotonic counter is unique to that subsystem. You lose the ordering part of the monotonic counter this way though, but I guess you could argue that in nearly 100% of cases the accuracy of sub-millisecond order in a distributed system is a lie anyways.
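
The field layout described above can be sketched as follows. This is an illustration only: `buildUuidV7` is a hypothetical helper, and `randA` and `randB` are passed in (rather than drawn from a CSPRNG or a counter) purely so the layout is easy to inspect:

```javascript
// Assemble a UUIDv7-style value from its fields:
// 48-bit Unix ms timestamp | 4-bit version | 12-bit rand_a | 2-bit variant | 62-bit rand_b.
function buildUuidV7(unixMs, randA, randB) {
  const ts = BigInt(unixMs) & ((1n << 48n) - 1n);
  const a = BigInt(randA) & 0xfffn;               // 12 bits
  const b = BigInt(randB) & ((1n << 62n) - 1n);   // 62 bits
  const value =
    (ts << 80n) | (0x7n << 76n) | (a << 64n) | (0b10n << 62n) | b;
  const hex = value.toString(16).padStart(32, "0");
  return [
    hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
    hex.slice(16, 20), hex.slice(20),
  ].join("-");
}

const ms = 1700000000000;
// Same millisecond and same rand bits give the same ID: the timestamp alone
// does not rule out a collision.
console.log(buildUuidV7(ms, 1, 42n) === buildUuidV7(ms, 1, 42n)); // true
// Using rand_a as a counter keeps IDs within one millisecond distinct and ordered.
console.log(buildUuidV7(ms, 1, 42n) < buildUuidV7(ms, 2, 42n));   // true
```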
  - ffsm8 - 22 minutes ago
    
    Considering the context I think it's worth pointing out that it's technically not impossible - it's just even less likely.
    Everything in crypto is always a probability - never a certainty
    
    nitsky - 16 minutes ago
    
    True, but it makes the specific collision the post observed completely impossible.
    
    stanmancan - 5 minutes ago
    
    I left a more detailed comment on the parent, but it's definitely not impossible!
- lazide - an hour ago
  
  Sequences, generally.
perching_aix - an hour ago

How is UUIDv4 to blame for a broken source of entropy? Or am I misinterpreting your words?
- hmry - 25 minutes ago
  
  I wouldn't say it's "to blame", but it is more susceptible to bad RNG.
  If the RNG is bad, you'll get more benefit from adding non-random bits than you would from additional badly RNG'd bits.
  The probability for collisions also rises even more as you generate more IDs. If you incorporate non-random bits, you can alleviate that:
  - timestamps make the collision probability not grow over time as you have more existing UUIDs that could collide
  - known-distinct machine IDs make the collision probability not grow as you add more machines
- hombre_fatal - an hour ago
  
  Presumably they mean using randomness as unique IDs.

throwaway_19sz - 7 hours ago

Funny story no one will believe, but it’s true. A good friend of mine joined a startup as CTO 10 years ago, high growth phase, maybe 200 devs… In his first week he discovered the company had a microservice for generating new UUIDs. One endpoint with its own dedicated team of 3 engineers …including a database guy (the plot thickens). Other teams were instructed to call this service every time they needed a new ‘safe’ UUID. My pal asked wtf. It turned out this service had its own DB to store every previously issued UUID. Requests were handled as follows: it would generate a UUID, then ‘validate’ it by checking its own database to ensure the newly generated UUID didn’t match any previously generated UUIDs, then insert it, then return it to the client. Peace of mind I guess. The team had its own kanban board and sprints.

wongarsu - 5 hours ago

At some point someone optimizes the system to a global company-wide incrementing 128 bit counter. Instead of needing a costly database lookup against a growing database the microservice just fetches the current counter, increments it by one and hands out the new value. Easy, fast O(1) operation.
This even allows you to shard the service to provide high availability and distribute the service globally to reduce latency. Just give each instance a dedicated id range it can hand out. I'd suggest reserving some of the high bits to indicate data center id, and a couple more bits for id-generator instance within that dc.
Wait a second, this starts to look familiar ... does Twitter still do that, or did they eventually switch?
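
A sketch of the sharded counter being described, in the Snowflake style. The 41/5/5/12 bit split is the commonly cited Snowflake layout and is an assumption here, as are the epoch and the name `makeIdGenerator`:

```javascript
// Snowflake-style 64-bit ID: timestamp | datacenter | worker | sequence.
function makeIdGenerator(datacenterId, workerId, now = () => Date.now()) {
  const EPOCH_MS = 1600000000000; // arbitrary custom epoch (assumption)
  let lastMs = -1;
  let sequence = 0;
  return function nextId() {
    let ms = now();
    if (ms < lastMs) throw new Error("clock moved backwards");
    if (ms === lastMs) {
      sequence = (sequence + 1) & 0xfff;
      if (sequence === 0) {
        // 4096 IDs already issued this millisecond: wait for the next tick.
        while ((ms = now()) <= lastMs) { /* spin */ }
      }
    } else {
      sequence = 0;
    }
    lastMs = ms;
    return (
      (BigInt(ms - EPOCH_MS) << 22n) |
      (BigInt(datacenterId & 0x1f) << 17n) |
      (BigInt(workerId & 0x1f) << 12n) |
      BigInt(sequence)
    );
  };
}

const nextId = makeIdGenerator(1, 2);
console.log(nextId().toString(), nextId().toString());
```

Uniqueness here comes from the distinct datacenter and worker bits plus the per-instance counter, not from randomness, so a weak RNG cannot cause a collision.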
- kuratkull - an hour ago
  
  Define a random 128 bit key that you will never change. Use that key to encrypt 128 bit integers in sequence using AES-128, each one comes out as a, for all practical purposes, unique unpredictable ID.
- throw0101c - an hour ago
  
  > At some point someone optimizes the system to a global company-wide incrementing 128 bit counter.
  Some UUID versions include time, so there's a bit of a counter in that.
- sheept - an hour ago
  
  Twitter snowflakes haven't changed. Most of the bits go to the timestamp, which I guess is a global incrementing counter as you described
roryirvine - 5 hours ago

I've seen similar, buried deep within a major SV tech co.
Their process was a bit more complex because the master list of in-use UUIDs was stored in an external CMDB service run by a different department. They got a daily dump of that db, so were able to check that when generating a "provisional" id. Only once it had been properly submitted to the CMDB did it become "confirmed".
They had guardrails in place to prevent "provisional" ids being used in production, and a process for recycling unused "confirmed" ids. Oh, and they did regular audits which were taken very seriously by management.
Last I heard, they were 18 months into a 6 month project to move their local database cache to Zookeeper...
mrbonner - 34 minutes ago

We have had a service to add two numbers. What makes you think this is not realistic? :-)
- morkalork - 29 minutes ago
  
  I too have witnessed an "add two numbers" service! Turns out you can be too extreme with rules for isolating out business logic..
giancarlostoro - an hour ago

I can believe it, and I've often wondered, "can I win the UUID misfortune lottery?" I wonder if this is equally common with Microsoft's flavor, aka GUIDs.
- tracker1 - 9 minutes ago
  
  GUIDs and UUIDs are effectively the same thing... the issues often come down to the means of generation and storage... where UUIDs have versions with specific implementation details that aren't always followed, MS has internal implementations that also aren't always followed. Also worth being aware of are COMB, SequentialIDs (MS-SQL) and other serialization approaches, as well as how they affect indexes in practice.
  Alternatives include sequential number generator services, or sequence services that may be entirely sequential, etc, but may lead to out-of-order inserts in practice.
  Also, generally worth considering UUIDv7, assuming your storage and indexing use the time portion at the front of the index process.
franktankbank - 6 hours ago

Who has the balls to form that team? Were they disbanded?
- giancarlostoro - an hour ago
  
  I will gladly assume that this team was formed after several collisions with UUIDs; my assumption is that they had a tremendous amount of data and enough revenue to justify all of this, at least financially. I would have re-evaluated the UUID version used, or whether adopting Snowflakes would be better, at some point.
ryandvm - 5 hours ago

Pffft - they didn't need to store the whole UUID, just a hash. Dummies.
- dd8601fn - 4 hours ago
  
  They thought of that, but they were still working on hiring a team to maintain the hashing microservice.
  - mstaoru - 3 hours ago
    
    Hashing microservice deployment was blocked by random generator microservice stuck in Pending because it needed a UUID from UUID microservice which was blocked by hashing.
    
    alserio - an hour ago
    
    "Learned a lot today, love Galactus"
- mrsvanwinkle - 3 hours ago
  
  already laughing from parent comment this is well done
- _3u10 - 36 minutes ago
  
  one hash is insufficient, they need k-hashes.
  i get the joke, but seriously a Bloom filter would be a good idea.
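
For anyone curious what that would look like, here is a minimal Bloom filter sketch. The sizes, the FNV-1a-style hash, and the class name are all assumptions chosen for brevity, not a tuned implementation:

```javascript
// Minimal Bloom filter: k bit positions per item, derived from two
// FNV-1a-style hashes via double hashing. It can report false positives
// but never false negatives.
class BloomFilter {
  constructor(bitCount = 1 << 16, hashCount = 4) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }
  static hash(text, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }
  positions(item) {
    const h1 = BloomFilter.hash(item, 0);
    const h2 = (BloomFilter.hash(item, 0x9e3779b9) | 1) >>> 0; // odd step
    const out = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push(((h1 + i * h2) >>> 0) % this.bitCount);
    }
    return out;
  }
  add(item) {
    for (const pos of this.positions(item)) this.bits[pos >> 3] |= 1 << (pos & 7);
  }
  mightContain(item) {
    return this.positions(item).every(
      (pos) => (this.bits[pos >> 3] & (1 << (pos & 7))) !== 0
    );
  }
}

const issued = new BloomFilter();
issued.add("b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd");
console.log(issued.mightContain("b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd")); // true
```

A "maybe seen" answer would still need a lookup in the real store; only a "definitely not seen" answer lets you skip it.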

e12e - an hour ago

Some discussion here:

https://github.com/uuidjs/uuid/issues/546

Eg:

> FWIW, I just tested crypto.getRandomValues() behavior on googlebot and it is also deterministic(!)

dweez - an hour ago

Good moment to revisit this fun article: https://jasonfantl.com/posts/Universal-Unique-IDs/

If the entire universe were turned into a giant computer and did nothing but generate uuids until its heat death, how many bits would you need for the ID space?

juancn - 4 hours ago

Something off on how the RNG is initialized? Lack of entropy?

If the rng is not customized it will use:

    const rnds8 = new Uint8Array(16);
    export default function rng() {
        return crypto.getRandomValues(rnds8);
    }

getRandomValues doesn't specify a minimum amount of entropy.

Hizonner - 4 hours ago

It's a near certainty that something is badly wrong with the RNG, and, yes, probably in how it's seeded.
It's probably messing up the cryptography, too.
- Onavo - an hour ago
  
  But defaults should be sane and safe. RNG isn't the sort of thing you want to be messing up. Every JS dev was taught that Math.random is not safe by default, but the crypto package is.

adyavanapalli - 8 hours ago

What you're talking about is so extremely rare that it's much more likely that the entire Earth is destroyed by an asteroid right this inst...

thomasmg - an hour ago

It is not quite as rare. I calculated it to be less common than being hit by a meteorite, and added a section about that and the Birthday Paradox to Wikipedia, to the article about UUIDs. It got removed / replaced a few years ago however. (If my source was correct, there was actually a woman hit by a meteorite, but she survived, with a leg injury.)
If you do have a UUID collision, chances are extremely high that it's either a software bug, or glitch in the computer. It could be a cosmic ray. Cosmic rays messing with the computer memory or CPU are actually relatively common.
delichon - 5 hours ago

About as rare as an asteroid typing an ellipsis and clicking the add comment button.
- throw0101c - an hour ago
  
  Well, this joke dates back to (at least) the dial-up days where {#`%${%&`+'${`%& NO CARRIER
- xerox13ster - an hour ago
  
  That’s just a result of jounce from localized gravity effects and atmospheric pressure disturbances in the moments before impact.
  Think the ultrasonic typing hacking scene in Pantheon combined with the keyboard bouncing due to rumbling.
sebazzz - 4 hours ago

Well it would be statistically even rarer for that UUID collision to happen and the earth to be destroyed by an asteroid.
crazylogger - 3 hours ago

For a single database using UUIDs, yes, it's astronomically rare. But it's quite a different thing to say that no computer system on Earth has ever experienced a UUID collision. The number of systems out there is also astronomical.
- nathanmills - an hour ago
  
  >The number of systems out there is also astronomical.
  Not even close

mdavid626 - 11 minutes ago

Or there is some other explanation, e.g. somebody messed with the request manually, or with the db.

merlindru - 4 hours ago

Gotta be a seeding issue. If it's not, and you can prove it, you're about to be a little famous probably :P

sqquima - 20 minutes ago

Meta, but if I had a question like this, I'd likely have asked on Twitter or Reddit first. I'll keep in mind using HN as an alternative Q&A site.

danfritz - 18 minutes ago

Always let your db generate uuids. On Postgres this is easy: since v18 it supports UUID v7!

There is no need to set uuids through javascript or node imo

hx8 - 10 minutes ago

There's plenty of reasons to set a unique identifier before database save, or to want a unique identifier that doesn't have a 1-to-1 relationship with your object.
For example, in the idempotent kafka consumer pattern we set a unique ID in the header of every kafka message at the time of message publishing. We then have our consumers do a quick check of the ID against their data store to see if they have processed the message before or not. This way there is no impact if a consumer sees the same message twice. This allows us more flexibility during rebalancing events or replaying old offsets.
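
The consumer-side check can be sketched in a few lines. This is a simplified illustration with no Kafka client involved: the `Set` stands in for the consumer's data store, and `makeConsumer` and the message shape are hypothetical:

```javascript
// Idempotent consumer sketch: each message carries a unique ID assigned at
// publish time; the consumer skips IDs it has already processed.
function makeConsumer(handler) {
  const processed = new Set();
  return function consume(message) {
    if (processed.has(message.id)) return false; // duplicate delivery, ignore
    handler(message);
    processed.add(message.id);
    return true;
  };
}

const handled = [];
const consume = makeConsumer((msg) => handled.push(msg.payload));
consume({ id: "msg-1", payload: "invoice created" });
consume({ id: "msg-1", payload: "invoice created" }); // redelivered after a rebalance
console.log(handled.length); // 1
```

The ID has to exist before anything is saved, which is why it cannot come from the database in this pattern.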

mittermayr - 10 hours ago

I fully agree. It makes no sense. Yet...

The only guess I have is that we originally generated UUIDv4s on a user's phone before sending them to the database, and the UUID generated this morning that collided was created on an Ubuntu server.

I don't fully know how UUIDv4s are generated and what (if anything) about the machine it's being generated on is part of the algorithm, but that's really the only change I can think of: it used to be generated on-device by users and, for many months now, has been generated on the server.
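
For reference, nothing about the machine goes into a v4 UUID: it is 16 random bytes with six bits overwritten to mark the version and variant. A minimal sketch of that layout, where `formatUuidV4` is a hypothetical helper and not the uuid package's code:

```javascript
// Turn 16 bytes into a v4-style UUID string.
// Only the version nibble (4) and the two variant bits (10) are fixed;
// the remaining 122 bits come straight from the input bytes.
function formatUuidV4(bytes) {
  if (bytes.length !== 16) throw new RangeError("need exactly 16 bytes");
  const b = Uint8Array.from(bytes);   // copy so the caller's buffer is untouched
  b[6] = (b[6] & 0x0f) | 0x40;        // version 4
  b[8] = (b[8] & 0x3f) | 0x80;        // RFC variant
  const hex = Array.from(b, (x) => x.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
    hex.slice(16, 20), hex.slice(20),
  ].join("-");
}

// All-zero input makes the fixed bits easy to spot.
console.log(formatUuidV4(new Uint8Array(16)));
// → 00000000-0000-4000-8000-000000000000
```

So the quality of the ID depends entirely on where those 16 bytes come from, on the phone or on the server.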

AntiUSAbah - 8 hours ago

You let users generate a UUID?
To be honest, the chance that you are doing something weird is probably higher than you experiencing a real UUID conflict.
How did your database 'flag' that conflict?
- wongarsu - 5 hours ago
  
  If it's UUIDv4 and you validate that the UUID is valid and not conflicting I don't really see the issue with user-generated UUIDs. Being able to generate unique keys in an uncoordinated manner is the main selling point of UUIDs
  Sure, it's something I'd flag in any design to spend two minutes to talk about potential security implications. But usually there aren't any
  - AntiUSAbah - 5 hours ago
    
    Validation etc.: everything that should not be controlled by a user will not be controlled by a user.
- mittermayr - 8 hours ago
  
  user-generated (as in: on the user's phone) was only at the very early stages of this product, and we've since moved to on-server. It's a cash-register type of app, where the same invoice must not be stored twice. So we used to generate a fresh invoice_id (uuidv4) on the user's device for each new invoice, and a double-send of that would automatically be flagged server-side (same id twice). This has since moved on to a server-only mechanism.
  The database flagged it simply by having a UNIQUE key on the invoice_id column. First entry was from 2025, second entry from today.
  - bitsandbits - a minute ago
    
    If the server or the user's phone had the wrong time and if the date is used in generating the ID...
wongarsu - 5 hours ago

If it was two on-device generated UUIDs I could see a collision happening. There have been instances of cheap end devices not properly seeding their random number generators, leading to colliding "random" values. And cases of libraries using cheap RNGs instead of a proper cryptographic RNG, making it even worse
But on a server that shouldn't happen, especially not in 2026 (in the past, seeding the rngs of VMs used to be a bit of an issue). Even if one UUID was badly generated, a truly random UUID statistically shouldn't collide with it. You'd need an issue in both generators
stubish - 9 hours ago

The UUIDv4 collision is statistically extremely unlikely. What is more likely is both systems used the same seed. This might be just a handful of bytes, increasing the chance of collision to one in billions or even millions.
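
The effect is easy to reproduce with any seeded generator. A small sketch using a mulberry32-style PRNG (not cryptographic, and not what the uuid package uses; `seededRandom` and `idFromSeed` are hypothetical names):

```javascript
// A small seeded PRNG (mulberry32-style), used only to show that the same
// seed always yields the same "random" ID.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Build a 32-hex-digit ID from 16 PRNG bytes.
function idFromSeed(seed) {
  const next = seededRandom(seed);
  let hex = "";
  for (let i = 0; i < 16; i++) {
    hex += Math.floor(next() * 256).toString(16).padStart(2, "0");
  }
  return hex;
}

// Two "machines" that happen to start from the same seed.
console.log(idFromSeed(12345) === idFromSeed(12345)); // true
```

However many bits the output has, the number of distinct IDs can never exceed the number of distinct seeds.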
lazyjones - 7 hours ago

Better check what crypto.js is actually doing in your exact setup. Weak polyfills exist...

Geee - 7 hours ago

According to the many-worlds interpretation of quantum mechanics, there's bound to be one branch of universe where every UUID is the same. Can you imagine what those guys are thinking?

BobaFloutist - an hour ago

Not only that, there's vastly more where every UUID except one is the same, but they never got to that one because they didn't ever use them.
Or where the first two are unique, but every following one is one of the first two.
zeeveener - an hour ago

"Huh, this is just an identity function. Cool. Let's move on."
nyantaro1 - 5 hours ago

This is why I am not a fan of the Everett approach

jbverschoor - an hour ago

Most plausible cause: uuid package depends on some random number generator package, which has recently been compromised in order to make “random” numbers predictable. As a result, many crypto (ssl + currency) projects are compromised due to a supply-chain attack.

jbverschoor - an hour ago

Changed 3 weeks ago:

uuid/src/rng.ts: the random array is const, so every call shares the same buffer. A subsequent call will overwrite the bytes you got earlier, so if you generated something important... good luck

The old code used to do a slice() which creates a new copy.

Might be unintentional. Although I have no idea how this would pass any tests, as you would think to test generating 2 random numbers and hope they are not the same.

jbverschoor - an hour ago

Didn't actually want to write a test myself... but Miss Claudia confirmed it. Pretty concerning.

Synchronous / serial calls:

   import rng from './rng';
   
   const a = rng();
   console.log('a after first call: ', Array.from(a));
   
   const b = rng();
   console.log('a after second call:', Array.from(a));
   console.log('b after second call:', Array.from(b));
   
   console.log('a === b (same reference)?    ', a === b);
   console.log('a equals b (same contents)?  ', a.every((v, i) => v === b[i]));

output:

   a after first call:  [
     101, 193, 125,  19, 142,
     136, 181, 140, 209, 224,
     176, 153, 179, 248, 246,
     166
   ]
   a after second call: [
       4,  29, 48, 215, 162,  60,
      64,  23, 78, 137,   2, 186,
     230, 249, 70, 224
   ]
   b after second call: [
       4,  29, 48, 215, 162,  60,
      64,  23, 78, 137,   2, 186,
     230, 249, 70, 224
   ]
   a === b (same reference)?     true
   a equals b (same contents)?   true

and asynchronous calls:

   import rng from './rng';
   
   async function getId(label) {
      const bytes = rng();
      console.log(`${label} captured: `, Array.from(bytes));
      await new Promise(r => setTimeout(r, 0)); // yield to the event loop
      console.log(`${label} after await:`, Array.from(bytes));
      return Array.from(bytes);
   }
   
   const [id1, id2] = await Promise.all([getId('id1'), getId('id2')]);
   console.log('---');
   console.log('final id1:', id1);
   console.log('final id2:', id2);
   console.log('identical?', id1.every((v, i) => v === id2[i]));

output:

   id1 captured:  [
      61, 116, 151,  35, 153,
      75, 105,  15,  59, 235,
     162, 215, 224, 115,  31,
     122
   ]
   id2 captured:  [
      13,  3,  84,  28, 22, 176,
     160, 70,  67, 246,  1,  37,
      38, 61, 171,  23
   ]
   id1 after await: [
      13,  3,  84,  28, 22, 176,
     160, 70,  67, 246,  1,  37,
      38, 61, 171,  23
   ]
   id2 after await: [
      13,  3,  84,  28, 22, 176,
     160, 70,  67, 246,  1,  37,
      38, 61, 171,  23
   ]
   ---
   final id1: [
      13,  3,  84,  28, 22, 176,
     160, 70,  67, 246,  1,  37,
      38, 61, 171,  23
   ]
   final id2: [
      13,  3,  84,  28, 22, 176,
     160, 70,  67, 246,  1,  37,
      38, 61, 171,  23
   ]
   identical? true

jbverschoor - an hour ago

https://github.com/uuidjs/uuid/blob/e1f42a354593093ba0479f0b...
became
https://github.com/uuidjs/uuid/blob/f2c235f93059325fa43e1106...
Welp.. time to patch and update everything again. Another day, another npm-package headache. Very odd()
Attack vector: call rng() and send the result somewhere. You have now overwritten someone else's "random number" and know about it. The fun things you can do with those numbers!
- jbverschoor - 33 minutes ago
  
  Seems to be "safe" because it's not exported, and the results get used in a different way. Still a bug in my book.

sudb - an hour ago

This is the first time I have experienced some vindication that choosing CUID2[1] for one of my projects was actually a good idea.

1. https://github.com/paralleldrive/cuid2

tumdum_ - 8 hours ago

Poorly seeded prng.

jdthedisciple - 8 hours ago

most likely the culprit indeed
- nswango - 8 hours ago
  
  But I used nonstandard nonces!

baq - an hour ago

the vm you're running on virtualized all the entropy away.

Imustaskforhelp - 28 minutes ago

This seems very likely to be the case.
Something tangentially cool which is related: https://eu.mouser.com/new/leetronics/leetronics-infinite-noi...

sbuttgereit - an hour ago

> I thought this is technically impossible

No, very technically possible... though, with good randomness, very, very unlikely.

But nothing technically prevents a UUIDv4 from generating a duplicate value.

shortercode - 32 minutes ago

Fun thing about random is that these things happen. UUIDv7 is less prone to this as it includes both a time component and random. I’ve been using ULID in a few projects, which has similar attributes to UUIDv7 but is more space efficient.

leni536 - 7 hours ago

It's not happening by chance, there is a bug somewhere.

From what I skimmed, the package should just call the JS runtime's crypto.randomUUID(). I think it should always be properly seeded.

I think it is extremely unlikely that the runtime has a bug here, but who knows? What js runtime do you use?

jordiburgos - 9 hours ago

Please, do not use b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd, I checked my database and I was using it already.

rich_sasha - 6 hours ago

I always thought generating UUIDs at random was insane. I now only use LLMs. The prompt is: "generate a UUID. Make sure no one ever used it anywhere in their code or database. Check your work and think hard about each step. Do not output any reasoning or plain English, only the UUID itself".
You're welcome.
- mh2753 - 3 hours ago
  
  Actually asking ChatGPT this query led to it giving me this UUID "550e8400-e29b-41d4-a716-446655440000" which happens to be a very common example UUID
  - wolttam - an hour ago
    
    The LLM is mechanistically unable to pick something actually random and outside of its training distribution, so... yep.
    
    antonvs - 4 minutes ago
    
    If you ask it to construct a UUID character by character you should get a somewhat random one, just because of temperature.
mittermayr - 9 hours ago

I knew it, we're all getting the same cheap UUIDs and the good ones are reserved for the big dogs.
- antonvs - 4 minutes ago
  
  You mean you’re not already entropymaxxing? n00b
- Galanwe - 8 hours ago
  
  uuid.uuidv4() recently switched to "adaptive entropy" instead of "xmax entropy" in an effort to save costs on non-premium users.
robshep - 8 hours ago

I'm using 16b55183-1697-496e-bc8a-854eb9aae0f3 and probably some more too. I suppose if we all post our list here, then we can all check for duplicates?
- jsnell - 8 hours ago
  
  You can check https://everyuuid.com/ for collisions.
- mittermayr - 8 hours ago
  
  We should all send our already-generated UUIDs to a shared database, we could just put it on Supabase with a shared username/password posted on HN, so we can all ensure that after generating a UUIDv4 locally, it's not used by anyone else. If it's in the database, we know it's taken.
  It's a super simple mechanism, check in common worldwide UUID database, if not in there, you can use it. Perhaps if we use a START TRANSACTION, we could ensure it's not taken as we insert. But that's all easy, I'll ask Claude to wire it up, no problem.
  - broken-kebab - 8 hours ago
    
    But then I will claim I have already used all the UUIDs in my spreadsheets, and my lawyer will send cease&desist letters to every database.
- volemo - 8 hours ago
  
  A site previously posted here could be useful: https://everyuuid.com/
classified - 7 hours ago

That UUID should have my name sticker on it. Don't your UUIDs have name stickers?

serf - 10 hours ago

1 in 4.72 × 10²⁸

1 in 47.3 octillion.

I'd be suspecting a race condition or some other naive mistake, otherwise I'd be stocking up on lottery tickets.

(lol at the other user posting at the same time about the lottery ticket.. great minds and all that.)

petee - 7 hours ago

I've always looked at it the other way - being that lucky would mean you have even less chance of something else lucky happening, good time to save your money
k4rli - 5 hours ago

The lottery ticket part makes no sense. Statistically if such an improbable event just happened to him, then chance of it happening again should be even more improbable.
- sowbug - an hour ago
  
  This is probably (ha) a troll thread, but in case anyone here is among today's lucky (ha) 10,000, https://en.wikipedia.org/wiki/Independence_(probability_theo...
- jaccola - an hour ago
  
  The chance of him winning the lottery is identical to before, however the reward if he wins is slightly greater.
  He would win the lottery money + he gets to tell people who don’t understand independence this incredible story!
- angoragoats - an hour ago
  
  No, the events are independent. If you have a UUID collide, your chance of winning the lottery is exactly the same as it was before the UUID collision.

glaslong - 8 hours ago

Buy some lava lamps

NKosmatos - 7 hours ago

> I thought this is technically impossible

Actually it's not impossible, but very very improbable.

P.S. You should play a lottery/powerball ticket

P.P.S. Whenever I use the word improbable, the https://hitchhikers.fandom.com/wiki/Infinite_Improbability_D... comes to mind

sebazzz - 4 hours ago

> P.S. You should play a lottery/powerball ticket
Actually, they should not. That collision and winning the lottery would be even rarer.
- lgeorget - an hour ago
  
  Assuming they are independent events, OP is no more or less likely to win the lottery now than before running into the collision. I actually have more questions if you claim the events in question are NOT independent!
rithdmc - 5 hours ago

Inconceivable!

beardyw - 10 hours ago

Just a stupid question, but why not append the date, even in seconds as hex. It's just a few bytes and would guarantee that everything OK now will be OK in the future?

flohofwoe - 9 hours ago

You can just use a different UUID variant which includes timestamp data instead (e.g. v1 or v7), there are also variants which include the MAC address.
itsyonas - an hour ago

Might as well just use uuidv7
mittermayr - 9 hours ago

yeah, any sort of additional semi-random data could've helped prevent this, I'm sure. That, however, is also kind of the idea of UUIDv4, it has lots of randomness and time built in already.
- flohofwoe - 9 hours ago
  
  UUID v4 consists of only random bits, no timestamp info.
  - mittermayr - 9 hours ago
    
    oh, interesting, I didn't know that and this could possibly be part of the problem perhaps depending on what's used as the seed.
- beardyw - 6 hours ago
  
  But surely hashing the date still allows for a future collision. Leaving the date as is means it will never collide after that one second has passed.
  - kayodelycaon - an hour ago
    
    UUID 7 does not hash the date. It uses 48 bits to store a millisecond resolution timestamp. This allows you to sort uuids by time.
pan69 - 8 hours ago

> but why not append the date
And use uuid v5 to hash it :)

wg0 - 9 hours ago

Would UUID v7 be more collision-proof? Hard to say: it takes time into account, but the number of entropy bits is reduced, so UUIDs generated at exactly the same time have a higher chance of collision because the random bits cover a much smaller space.

Thoughts?

AntiUSAbah - 8 hours ago

You open up a new block every millisecond. Should be even more unlikely.

lyfeninja - 7 hours ago

Although incredibly rare, it's not impossible so probably best to just plan for collisions. A simple retry should suffice. But I agree I feel like something is going on somewhere else ...
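
A retry along those lines might look like this. It is a sketch under stated assumptions: `store` is a `Map` standing in for a table with a UNIQUE key, and `generateId` is whatever ID function the application already uses:

```javascript
// Insert with retry: if the generated ID already exists, generate another.
function insertWithRetry(store, record, generateId, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const id = generateId();
    if (!store.has(id)) {
      store.set(id, record);
      return id;
    }
    // A collision is far more likely a bug than chance, so make it visible.
    console.warn(`ID collision on attempt ${attempt}: ${id}`);
  }
  throw new Error(`no unique ID after ${maxAttempts} attempts`);
}

// Simulate a generator that repeats itself once.
const queue = ["dup", "dup", "fresh"];
const store = new Map();
insertWithRetry(store, { total: 10 }, () => queue.shift());
console.log(insertWithRetry(store, { total: 20 }, () => queue.shift())); // fresh
```

Logging the collision matters more than the retry itself, since repeated hits point at the generator rather than at bad luck.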

not_math - 7 hours ago

Reminds me of some code I saw running in production. Every time we added a new entry, we were pulling all the UUIDs from this table, generating a new UUID, and checking for collisions up to 10 times.

OutOfHere - 5 hours ago

This is why I prefer to use a random base32 string over UUID. At least you get a proper 128 bit entropy instead of just a 122 bit entropy as with UUIDv4. That's a 64x difference in collision probability. I always thought UUIDs were a toy, not for serious use. If you control the strings, you can even make a longer ID.
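
The 64x figure is 2^(128-122) = 2^6. Encoding 16 random bytes as base32 can be sketched as below; `toBase32` is a hypothetical helper using the RFC 4648 alphabet without padding, and the bytes are passed in so the source of randomness stays the caller's choice:

```javascript
// Encode bytes as RFC 4648 base32 without padding.
// 16 bytes (128 bits) become 26 characters.
function toBase32(bytes) {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  let bits = 0;   // number of bits currently buffered
  let value = 0;  // buffered bits, kept small to stay within 32-bit ops
  let out = "";
  for (const byte of bytes) {
    value = (value << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += alphabet[(value >>> (bits - 5)) & 31];
      bits -= 5;
    }
    value &= (1 << bits) - 1; // drop bits already emitted
  }
  if (bits > 0) out += alphabet[(value << (5 - bits)) & 31];
  return out;
}

console.log(toBase32(new Uint8Array(16)).length); // 26
```

In real use the input would come from something like `crypto.getRandomValues(new Uint8Array(16))`, so a weak entropy source hurts this scheme exactly as much as it hurts UUIDv4.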

Also, numerous applications that use a unique ID per record frequently need to check for ID collisions. I know I do for a short URL generator.

AndreyK1984 - 8 hours ago

Why not have a timestamp-uuid instead?

dgellow - 7 hours ago

How confident are you that your machines' clocks are in perfect sync? What about the risk of clock drift + correction, or hardware issues?
- croon - 6 hours ago
  
  Not GP, but: not confident. How confident would I be of avoiding a (slightly lower entropy) UUID collision that also coincides with a clock desync landing on the exact same logged millisecond? Very, which is how confident I was about not encountering a UUID collision before this thread, so very++ I guess.

naikrovek - 9 hours ago

The chance of a UUIDv4 collision is very low, but it is never zero.

If everything is done properly, then this is very likely the one and only time anyone involved in the telling or reading of this account will ever experience this.

dalmo3 - 9 hours ago

Classic gambler's fallacy!
- jaccola - an hour ago
  
  Ironically, one of the few comments in this thread that isn’t necessarily the gambler's fallacy!
  The chance that anyone involved saw or heard about the first one was near zero. Now that they’ve seen this one, the chance they see another is still near zero (i.e. unchanged).

ares623 - 8 hours ago

Buy a lottery ticket

sublinear - 7 hours ago

> We're using this: https://www.npmjs.com/package/uuid

Why? There's a built-in for this.

https://nodejs.org/api/crypto.html#cryptorandomuuidoptions

OptionOfT - an hour ago

That's what the package uses. And if `crypto.randomUUID()` doesn't exist, it falls back to `crypto.getRandomValues()`, which per the documentation isn't AS strong:
https://developer.mozilla.org/en-US/docs/Web/API/Crypto/getR...
So by using the package you actually lose visibility of cases where `crypto.randomUUID()` would fail.
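
One way to keep that visibility is to call the native function directly and fail loudly when it is missing. A minimal sketch, assuming a runtime that exposes `globalThis.crypto` (recent Node.js versions and browsers do). `strictRandomUUID` is a hypothetical wrapper and says nothing about how the `uuid` package is implemented internally.

```javascript
// Hypothetical wrapper: use the platform's randomUUID or throw.
// The crypto object is a parameter so the behavior can be exercised with stubs.
function strictRandomUUID(cryptoImpl = globalThis.crypto) {
  if (!cryptoImpl || typeof cryptoImpl.randomUUID !== "function") {
    // No silent fallback: surface the missing capability to the caller.
    throw new Error("crypto.randomUUID is unavailable in this runtime");
  }
  return cryptoImpl.randomUUID();
}
```

Calling `strictRandomUUID()` with no arguments uses the runtime's own `crypto` object. An error at startup is then the signal that the environment lacks the native generator.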

ESAM_C - 10 hours ago

[dead]

samdhar - 10 hours ago

[flagged]

uncircle - 8 hours ago

Statistically speaking, does extremely unlikely mean impossible? If it were replicable I'd raise my eyebrow, otherwise it's fair game, no?
As someone who enjoys the interminable complaints about RNG in the video game scene, I would never trust any human's rationalization of random outcomes.
- mschild - 8 hours ago
  
  > Statistically speaking, does extremely unlikely mean impossible?
  No, it means extremely unlikely. Collisions can occur, as OP just found out, but the chances are so vanishingly small that most people don't care.
  In every application I have worked on, I have had a pre-save check to see whether the UUID was already present and to generate a new one if it was. I don't think it ever triggered unless a bug was introduced somewhere, but it's good practice anyway.
- nubg - 8 hours ago
  
  You are replying to an AI bot
  - harperlee - 8 hours ago
    
    Would be cool to have a plugin that shows % of bot per user, based on their history of comments.
ashleyn - 8 hours ago

There could be a problem with the way the system generates entropy for randomness.
nubg - 8 hours ago

Question to fellow HNers, do you recognize that this comment was written by AI?
- prakka - 8 hours ago
  
  No, to be honest. However, as soon as it was pointed out, I checked again and it made sense.
  In my opinion, these kinds of intuitions have to grow over time. And every time it’s pointed out, you learn. So please, keep pointing it out :).
- tirutiru - 8 hours ago
  
  I did not. Post-conditioning by your comment and the other one, I can see some signs, such as attempting to be unusually comprehensive. The 'atoms in your liver' could be an awkward human trying to be poetic about scales.
  I still don't see idiomatic markers of AI so that's scary if your claim is correct.
- uncircle - 8 hours ago
  
  I guess not, and I feel dirty now. I'm logging off for the day.
- nottorp - 7 hours ago
  
  Interestingly enough, I skipped it when scrolling through the comments the first time. I think I instinctively do that to most karma-whoring comments, no matter if manual or LLM-generated.
  Only noticed it because I did another pass and saw the replies talking about "AI".
- piva00 - 7 hours ago
  
  Yes, but as a feeling (hunch?), not as something my brain analysed and reached a conclusion about.
  Weird how I'm already somewhat conditioned to spot it on an intuitive level.
- mschild - 8 hours ago
  
  Kind of. It reads a bit too much like tech support you'd get when asking one for help.
- ssenssei - 7 hours ago
  
  when it started going on about all the different cases in the second bullet point... yeah
- speedgoose - 7 hours ago
  
  Yes, stupid comparison with atoms in the liver and a bullet list below? I stopped reading.