UUID collision probability
Short answer: no, you don’t need to worry. A newly generated v4 UUID has a 1-in-2¹²² chance of duplicating any given value. Even at planet-scale generation rates, you will never see a collision in your career. The interesting cases are when the answer changes.
The math
A v4 UUID has 122 bits of randomness (128 minus 4 version bits and 2 variant bits). That’s 2^122 ≈ 5.32 × 10^36 possible values.
The relevant question is the birthday paradox: how many UUIDs do you need to generate before two of them are the same with probability ≥ 50%?
The approximation is:
N ≈ √(2 · S · ln(1/(1−p)))
where S = 2^122 is the population size and p is the desired collision probability. For p = 0.5:
N ≈ √(2 · 2^122 · 0.693) ≈ 2^61.2 ≈ 2.7 × 10^18
You’d need to generate about 2.7 quintillion v4 UUIDs before having a 50% chance that any two of them are the same.
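The approximation is easy to evaluate directly. A minimal Python sketch (the function name `birthday_threshold` is only illustrative) that computes N for a given number of random bits and a target probability:

```python
import math

def birthday_threshold(random_bits: int, p: float = 0.5) -> float:
    """Approximate number of random IDs needed to reach collision probability p."""
    space = 2.0 ** random_bits  # S, the number of possible values
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

n = birthday_threshold(122)
print(f"{n:.2e} UUIDs, about 2^{math.log2(n):.1f}")
```

Passing a different `p` (say `0.001`) shows how many UUIDs it takes to reach a one-in-a-thousand chance instead of a coin flip.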
Concrete numbers
| Number of v4 UUIDs generated | Probability of any collision |
|---|---|
| 1 billion (10⁹) | ≈ 9.4 × 10⁻²⁰ |
| 1 trillion (10¹²) | ≈ 9.4 × 10⁻¹⁴ |
| 1 quadrillion (10¹⁵) | ≈ 9.4 × 10⁻⁸ |
| 1 quintillion (10¹⁸) | ≈ 0.090 |
For perspective: generating 1 quintillion UUIDs at one billion per second takes about 32 years.
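Probabilities like these come from the same birthday bound solved for p instead of N. A small Python sketch, assuming the standard approximation p ≈ 1 − exp(−n² / 2S); the names `V4_SPACE` and `collision_probability` are illustrative:

```python
import math

V4_SPACE = 2.0 ** 122  # number of distinct v4 UUIDs

def collision_probability(n: float, space: float = V4_SPACE) -> float:
    """Birthday-bound approximation: p ≈ 1 − exp(−n² / (2·S))."""
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-(n * n) / (2.0 * space))

for n in (1e9, 1e12, 1e15, 1e18):
    print(f"{n:.0e} UUIDs -> p ≈ {collision_probability(n):.2g}")
```

For small probabilities this reduces to n² / (2S); the exponential form only starts to matter once p reaches a few percent.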
What this means in practice
- A web service generating 1,000 UUIDs per second for 10 years: ~3.15 × 10¹¹ UUIDs. Collision probability ~10⁻¹⁴. Don’t worry.
- A SaaS with a million customers each generating 10 UUIDs per second for a decade: ~3.15 × 10¹⁵ UUIDs. Collision probability ~10⁻⁶. Still essentially zero. Don’t worry.
- AWS-scale, generating UUIDs across every service worldwide: estimated billions per second, but still nowhere near 10¹⁸. They don’t worry, and they’re using v4 everywhere.
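Each scenario is just rate × duration fed into the birthday bound. A hedged Python sketch for plugging in your own numbers; the function name `scenario` and the two example rates are illustrative, not measurements:

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.156 × 10^7
V4_SPACE = 2.0 ** 122

def scenario(rate_per_second: float, years: float) -> tuple[float, float]:
    """Return (total UUIDs generated, approximate collision probability)."""
    total = rate_per_second * years * SECONDS_PER_YEAR
    p = -math.expm1(-(total * total) / (2.0 * V4_SPACE))
    return total, p

for label, rate in [("1,000 per second", 1_000), ("1M customers x 10 per second", 10_000_000)]:
    total, p = scenario(rate, years=10)
    print(f"{label}: {total:.2e} UUIDs over 10 years, p ≈ {p:.1e}")
```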
When the answer changes
There are scenarios where the naive math doesn’t apply.
1. Bad RNG
If the random source isn’t actually random, all bets are off. The classic case: generating UUIDs at process startup before /dev/urandom is seeded, on an embedded device or VM. The result can be UUIDs that look random but cluster in a tiny subspace.
Mitigation: use crypto.randomUUID() (browser, Node 19+, Bun) or the platform’s CSPRNG-backed UUID library. Don’t use Math.random() or unseeded language-default RNGs.
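The same contrast can be sketched in Python, where `uuid.uuid4()` draws from the operating system's CSPRNG (`os.urandom`) and the `random` module is a seedable, non-cryptographic generator. The helper `weak_uuid4` is hypothetical and exists only to show the failure mode:

```python
import random
import uuid

def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """Anti-example: a v4-shaped UUID built from a non-cryptographic RNG."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Two "independent" generators that start from the same internal state,
# for example two VMs cloned from one snapshot.
a = weak_uuid4(random.Random(42))
b = weak_uuid4(random.Random(42))
print(a == b)  # True: identical state, identical UUID

# uuid.uuid4() reads from os.urandom, so there is no application-level
# seed that two processes can accidentally share.
print(uuid.uuid4() == uuid.uuid4())
```

Both values from `weak_uuid4` look like perfectly valid v4 UUIDs, which is what makes this class of bug hard to spot by inspection.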
2. v3/v5 collisions on adversarial inputs
v3 (MD5) and v5 (SHA-1) are deterministic hashes. For honest inputs they’re collision-free in practice. For adversarially-chosen inputs:
- MD5 has practical collision attacks. Two different names can produce the same v3 UUID if an attacker controls them.
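- SHA-1 is broken for collision resistance: the SHAttered attack (2017) produced a practical collision. Collisions on attacker-chosen v5 names should therefore be treated as feasible, although they are far more expensive than MD5 collisions.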
If your namespace contains user-supplied strings and a collision could be exploited (e.g. account takeover via duplicate ID), don’t use v3 or v5 — use v4 instead.
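The underlying difference is that a v5 ID is a pure function of its input, while a v4 ID ignores the input entirely. A minimal Python sketch; `APP_NAMESPACE`, `user_id_v5`, and `user_id_v4` are hypothetical names for illustration:

```python
import uuid

# Hypothetical application namespace; a real system would pin this as a constant.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

def user_id_v5(username: str) -> uuid.UUID:
    """Deterministic: whoever controls the username controls the resulting ID."""
    return uuid.uuid5(APP_NAMESPACE, username)

def user_id_v4() -> uuid.UUID:
    """Random: the ID does not depend on any user-supplied value."""
    return uuid.uuid4()

print(user_id_v5("alice") == user_id_v5("alice"))  # True
print(user_id_v4() == user_id_v4())
```

If the application still needs to look records up by name, one option is to store the name alongside a v4 ID rather than deriving the ID from it.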
3. v1 with deterministic node ID
v1’s uniqueness depends on the node ID being unique per generator. Two VMs that boot from the same image can collide if they share a node ID and clock sequence and generate a UUID in the same timestamp tick (v1 timestamps count 100-nanosecond intervals).
Mitigation: use a randomized node ID (the default in modern libraries) or, better, use v7 instead.
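For cases where v1 has to stay, a randomized node ID can be passed explicitly. A sketch using Python's standard `uuid.uuid1`; the helper `random_node_id` is hypothetical, and it sets the multicast bit as the UUID RFC recommends for node IDs that are not real MAC addresses:

```python
import secrets
import uuid

def random_node_id() -> int:
    """48-bit random node ID with the multicast bit (bit 40) set."""
    return secrets.randbits(48) | (1 << 40)

node = random_node_id()          # generate once per process and reuse
u = uuid.uuid1(node=node)        # clock sequence is chosen by the library
print(u.version, hex(u.node))
```

Generating the node ID at process start, rather than baking it into an image, is what keeps cloned VMs from sharing it.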
4. Truncating UUIDs
The “I’ll just use the first 8 hex characters” pattern. You now have 32 random bits. The birthday-paradox 50% threshold drops to ~77,000 generations (√(2 · 2^32 · 0.693)). Don’t truncate UUIDs unless you’ve done the math for your specific volume.
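Doing the math for a given prefix length takes a few lines. A Python sketch, assuming every retained hex digit is fully random (the function name `threshold_for_hex_chars` is illustrative):

```python
import math

def threshold_for_hex_chars(hex_chars: int, p: float = 0.5) -> float:
    """IDs needed for collision probability p when keeping `hex_chars` random hex digits."""
    bits = 4 * hex_chars  # assumption: each kept hex digit carries 4 random bits
    return math.sqrt(2.0 * (2.0 ** bits) * math.log(1.0 / (1.0 - p)))

for chars in (8, 12, 16):
    n = threshold_for_hex_chars(chars)
    print(f"{chars} hex chars: ~{n:,.0f} IDs for a 50% collision chance")
```

The assumption matters: the 13th hex digit of a v4 UUID is the fixed version digit, so a 16-character prefix carries 60 random bits rather than 64, and the leading digits of a v7 UUID are a timestamp, not random data.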
v7 collision probability
UUID v7 has 74 bits of randomness (after the 48-bit timestamp and version/variant bits). Within a single millisecond, the birthday-paradox 50% threshold is:
N ≈ √(2 · 2^74 · 0.693) ≈ 2^37.2 ≈ 1.6 × 10^11
You’d need ~160 billion v7 UUIDs in the same millisecond before collision becomes likely. At realistic application rates, v7 is just as collision-free as v4 in practice.
(Some v7 implementations add a monotonic counter for the same-millisecond case, which rules out same-millisecond collisions within a single generator.)
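To see where the 74 random bits sit, here is an illustrative Python sketch of the v7 bit layout as defined in RFC 9562 (48-bit millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The function `uuid7_sketch` is hypothetical and meant for reading, not production; it has no monotonic counter, and a maintained library is the better choice in real code:

```python
import secrets
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a v7-shaped UUID by hand to show its layout."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand_a = secrets.randbits(12)   # 12 random bits after the version
    rand_b = secrets.randbits(62)   # 62 random bits after the variant
    value = (unix_ms & ((1 << 48) - 1)) << 80   # bits 127..80: timestamp
    value |= 0x7 << 76                          # bits 79..76: version 7
    value |= rand_a << 64                       # bits 75..64: rand_a
    value |= 0b10 << 62                         # bits 63..62: variant
    value |= rand_b                             # bits 61..0:  rand_b
    return uuid.UUID(int=value)

print(uuid7_sketch())
```

The random portion is 12 + 62 = 74 bits, which is the figure used in the threshold calculation.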
So when should I worry?
In order of seriousness:
- You wrote your own UUID generator. Stop. Use the standard library.
- You’re using Math.random() or another non-cryptographic RNG. Switch to a CSPRNG.
- You’re truncating UUIDs to 8/12/16 characters. Re-do the math.
- You have user-controlled inputs flowing into v3/v5. Use v4.
- You’re operating at AWS-scale and worried about v4 across a multi-decade horizon. Set up unique constraints in your database; the universe will still run out of heat before you collide.
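The unique-constraint safety net from the last item can be sketched with Python's built-in `sqlite3`. The `items` table, the `insert_item` helper, and the retry count are assumptions made for illustration; the same pattern applies to any database that enforces a primary key or unique index:

```python
import sqlite3
import uuid

def insert_item(conn: sqlite3.Connection, payload: str, max_attempts: int = 3) -> str:
    """Insert a row under a fresh v4 UUID, retrying if the key already exists."""
    for _ in range(max_attempts):
        new_id = str(uuid.uuid4())
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO items (id, payload) VALUES (?, ?)",
                    (new_id, payload),
                )
            return new_id
        except sqlite3.IntegrityError:
            continue  # duplicate key: generate a new UUID and try again
    raise RuntimeError("could not insert a row with a unique id")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT)")
first_id = insert_item(conn, "hello")
print(first_id)
```

The retry branch should essentially never run with a healthy CSPRNG; its real value is turning a bad-RNG or cloned-state bug into a loud, recoverable error instead of silent data corruption.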
For everyone else: pick v4 for one-off IDs, v7 for primary keys, and move on.