UUID as primary key
On this page
- The classic case against UUIDs
- The case for UUIDs
- How v7 changes the math
- Decision matrix
- Performance numbers (typical PostgreSQL workload)
- Common compromise: separate public ID
- What about clustered indexes (SQL Server)?
- What about MySQL / InnoDB?
- What about NoSQL (DynamoDB, Mongo, Firestore)?
- When NOT to use UUIDs
- Migration: existing v4 → v7
- Try the tools
- Related across the network
For 20 years the conventional wisdom was: “Don’t use UUIDs as primary keys.” That advice was right at the time. It’s now mostly wrong, because of UUID v7.
This article explains the trade-offs, the v4-vs-v7 difference, and when an auto-increment integer is still the right call.
The classic case against UUIDs
Three concerns, in decreasing order of severity:
1. Index fragmentation
Random UUIDs (v4) inserted into a B-tree index land at random positions. Each insert touches a different leaf page, dirties it, and may cause a page split. On a large table:
- More disk writes per insert
- Worse cache locality during reads
- Larger index size due to page splits
- Slower bulk inserts (no buffered writes)
For a 100M-row table, this can be 2–5× slower than the same table with monotonic integer keys.
2. Storage overhead
A bigint is 8 bytes. A uuid is 16 bytes. Every foreign key column is twice as wide. Every index is twice as wide. For a deeply-related schema this adds up — though typically to single-digit-percent overhead at scale, not catastrophe.
3. Readability
/users/42 is human-comprehensible. /users/0e6f1b8c-2c33-4f1f-9c0b-2a3d4e5f6a7b is not. Logs, error messages, and ad-hoc SQL queries are all marginally harder to scan. (This is also a privacy advantage — see below.)
The case for UUIDs
1. No coordination required
Two services, two databases, two offline mobile clients can all generate IDs that never collide. With auto-increment, you need a central authority — a single sequence service, or careful manual partitioning.
2. No information leak
/users/42 tells someone there are 41 users before me. /orders/1000123 tells a competitor your daily order volume. UUIDs reveal nothing about magnitude or order.
3. Mergeable systems
Acquiring a company? Importing data from a competitor? Their UUIDs slot into your tables without renumbering. Their auto-increment IDs collide on day one.
4. Stable across environments
A row created in dev keeps the same ID when copied to staging or prod. Auto-increment IDs do not.
How v7 changes the math
UUID v7 puts a 48-bit Unix-millisecond timestamp at the front, followed by 74 random bits. Sequentially-generated v7 UUIDs are roughly monotonic — they cluster on the rightmost B-tree leaf, the same way auto-increment integers do.
This eliminates concern #1 above. Insert performance with v7 is within ~10% of bigserial/bigint on Postgres for typical workloads. The “UUIDs fragment indexes” critique applied to v4. It does not apply to v7.
v4 inserts: random distribution → page splits everywhere → slow
v7 inserts: monotonic prefix → rightmost-page-only → ~native speed
The other concerns (storage, readability) still apply, but they’re orders of magnitude smaller than the index-fragmentation hit.
Decision matrix
| Constraint | Recommendation |
|---|---|
| Single Postgres instance, single app, no offline clients | bigint identity is fine |
| Multiple services or databases | UUID v7 |
| Public-facing URLs reveal IDs | UUID (any version) |
| Mobile app with offline ID creation | UUID v7 |
| Inserts are < 1000/sec, table < 10M rows | Either is fine |
| Inserts are >100K/sec, table >100M rows | UUID v7 or bigint identity |
| Greenfield, no specific reason either way | UUID v7 (it’s the boring future-proof choice) |
Performance numbers (typical PostgreSQL workload)
These are order-of-magnitude — your numbers will vary based on hardware, schema, and access pattern:
| Approach | Insert rate (rows/sec) | Index size at 100M rows |
|---|---|---|
bigint identity | 100% (baseline) | 100% |
uuid with uuidv7() | ~92% | ~150% |
uuid with gen_random_uuid() (v4) | ~30–50% | ~180–220% (with bloat) |
uuid with v4, repacked weekly | ~40–60% | ~160% |
The v4 row is the one that gave UUIDs a bad name. Use v7 instead.
Common compromise: separate public ID
Some teams use a bigint primary key internally and add a UUID for external exposure:
CREATE TABLE orders (
id bigserial PRIMARY KEY, -- internal, fast joins
public_id uuid UNIQUE NOT NULL DEFAULT uuidv7() -- external URLs
);
Pros: peak insert performance, clean URLs. Cons: two indexes, two columns to keep in sync, two ways to mistakenly query.
This is a fine choice for very high-throughput tables. For most apps, just use uuid v7 as the primary key and skip the second column.
What about clustered indexes (SQL Server)?
SQL Server clusters tables by their primary key by default. Random v4 Guids as a clustered index are particularly painful — every insert physically rewrites a page. With v7 (Guid.CreateVersion7() in .NET 9+) or NEWSEQUENTIALID(), the clustered index stays sorted.
-- Bad on SQL Server clustered index
CREATE TABLE Orders (Id uniqueidentifier PRIMARY KEY DEFAULT NEWID(), ...);
-- Good — sequential variant
CREATE TABLE Orders (Id uniqueidentifier PRIMARY KEY DEFAULT NEWSEQUENTIALID(), ...);
-- Good — v7 from app code (.NET 9+)
INSERT INTO Orders (Id, ...) VALUES (@id, ...); -- @id from Guid.CreateVersion7()
If you can’t use v7 yet (old .NET), use NEWSEQUENTIALID() as a default only. Generated v4 Guids passed in from the app are still bad.
What about MySQL / InnoDB?
InnoDB clusters by primary key just like SQL Server. The same advice applies: UUID v7 ✅, v4 ❌. Store as BINARY(16) (not CHAR(36)) for half the storage.
-- Recommended
CREATE TABLE orders (
id BINARY(16) PRIMARY KEY, -- v7 generated in app code
...
);
-- Helper functions for human-readable I/O
SELECT BIN_TO_UUID(id) FROM orders;
What about NoSQL (DynamoDB, Mongo, Firestore)?
These don’t have B-tree clustered primary keys, so the v4 fragmentation argument doesn’t apply. Use whatever ID your team prefers. UUIDs are the safe default; ULID and nanoid are reasonable for shorter strings.
When NOT to use UUIDs
- Tiny lookup tables. A
categoriestable with 50 rows doesn’t need 16-byte keys. Useint. - Composite natural keys. If
(year, month, code)is already unique and meaningful, don’t add a synthetic UUID. - Pure event-log inserts where order matters more than identity. A
bigserialis simpler and sortable. - You really need readable URLs and you control all clients. Public-facing slugs (
/u/sam) beat UUIDs every time. (You can still use UUID internally — see the “compromise” pattern above.)
Migration: existing v4 → v7
You don’t need to backfill old rows. Just change the default for new ones:
ALTER TABLE orders ALTER COLUMN id SET DEFAULT uuidv7();
Existing v4 IDs stay v4 forever. New rows get v7. The table will be a mix until old data ages out, which is fine — the version is encoded in each ID, so anything that needs to know can check.
Try the tools
- UUID v7 generator — for new primary keys
- PostgreSQL UUID guide — DB-specific patterns
- C# / .NET Guid guide —
Guid.CreateVersion7()in .NET 9 - UUID validator — confirm a UUID is the version you expect
Related across the network
- epoch.tooljo.com — UUID v7 encodes a millisecond timestamp; paste a v7 to decode the time bits.
- hash.tooljo.com/which-hash-function — UUID v3 (MD5) and v5 (SHA-1) are deterministic hashes; the article explains when those are safe.
- date.tooljo.com — for the time math when comparing UUID v7 IDs across timezones.