In modern distributed systems, choosing the right identifier strategy is a foundational architectural decision. For decades, RFC 4122 UUIDv4 (fully random) was the industry standard for decentralized ID generation. However, UUIDv4’s randomness introduced a severe database performance penalty: index fragmentation in B-Tree indexed databases.
To solve this, “lexicographically sortable” or “time-ordered” identifiers emerged. This article provides a comprehensive comparison between the two leading modern standards: ULID (Universally Unique Lexicographically Sortable Identifier) and UUIDv7 (Universally Unique Identifier version 7).
1. The Core Problem: Why Not UUIDv4?
Most relational databases (such as PostgreSQL, MySQL, and SQL Server) use B-Trees or B+Trees for primary key indexing. B-Trees perform optimally when new keys are inserted in sequential order.
When you insert records with fully random UUIDv4 keys:
- New keys are scattered uniformly across the index space.
- This causes frequent page splits as the database attempts to force keys into already-full index pages.
- Disk I/O rises, cache hit ratios drop, and write throughput degrades steadily once the index no longer fits in memory.
To prevent this, we need identifiers that are time-ordered (roughly increasing over time) yet still globally unique and built from secure randomness, without requiring a central coordinator.
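To make the contrast between scattered and sequential keys concrete, the sketch below counts how many keys in an insertion stream are smaller than a key that arrived earlier. Such keys cannot be appended at the right edge of a sorted index and must go into an interior page. This is only a rough proxy, not a B-Tree model, and the function name `count_mid_index_inserts` is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts keys that arrive out of order, i.e. keys
 * smaller than the largest key seen so far. In a sorted index those
 * inserts land in an interior page instead of the rightmost one. */
size_t count_mid_index_inserts(const uint64_t *keys, size_t n) {
    assert(keys != NULL || n == 0);
    size_t mid = 0;
    uint64_t max_seen = 0;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && keys[i] < max_seen) {
            mid++;                 /* must be placed before existing keys */
        } else {
            max_seen = keys[i];    /* new right edge: a cheap append */
        }
    }
    return mid;
}
```

For a time-ordered stream such as {1, 2, 3, 4} the count is 0. For a shuffled stream such as {5, 3, 9, 1, 10}, two of the five inserts (3 and 1) land mid-index, and with uniformly random keys nearly every insert does.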
2. Under the Hood: Structural Breakdown
Both UUIDv7 and ULID are 128-bit identifiers that combine a 48-bit millisecond-precision timestamp with random entropy. However, they allocate and format those bits differently.
ULID Structure
ULID was designed as an alternative to UUIDv4 with readability in mind. It consists of:
- Timestamp (48 bits): Milliseconds since Unix Epoch (2^48 milliseconds allows representation up to the year 10889 AD).
- Entropy (80 bits): Secure random data, which can optionally be incremented to guarantee strict monotonicity within the same millisecond.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32-bit Timestamp                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       16-bit Timestamp        |        16-bit Entropy         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        32-bit Entropy                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        32-bit Entropy                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
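The optional monotonic mode mentioned above can be sketched as follows: when a new ID falls in the same millisecond as the previous one, the 80-bit entropy is treated as a big-endian integer and incremented by one instead of being redrawn. The `ulid_state_t` type and the `ulid_next` function are names invented for this sketch, and the caller is assumed to supply the current time and fresh random bytes. Treating a clock that moves backwards like a repeated millisecond is a design assumption of the sketch, not something stated in the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generator state: the previous timestamp and entropy. */
typedef struct {
    uint64_t last_ms;
    uint8_t  entropy[10];
} ulid_state_t;

/* Add 1 to an 80-bit big-endian integer. Returns 0 if it wrapped around. */
static int increment_be80(uint8_t e[10]) {
    for (int i = 9; i >= 0; i--) {
        if (++e[i] != 0) {
            return 1;          /* no carry out of this byte: done */
        }
    }
    return 0;                  /* all 80 bits overflowed */
}

/* Writes the next ULID (16 bytes, big-endian) into out.
 * Returns 0 on success, -1 if the entropy overflowed within one millisecond. */
int ulid_next(ulid_state_t *st, uint64_t now_ms,
              const uint8_t fresh[10], uint8_t out[16]) {
    assert(st != NULL && fresh != NULL && out != NULL);
    if (now_ms <= st->last_ms) {
        /* Same millisecond (or the clock went backwards): bump the entropy. */
        if (!increment_be80(st->entropy)) {
            return -1;
        }
    } else {
        st->last_ms = now_ms;
        memcpy(st->entropy, fresh, 10);
    }
    for (int i = 0; i < 6; i++) {
        out[i] = (uint8_t)(st->last_ms >> (40 - 8 * i));
    }
    memcpy(out + 6, st->entropy, 10);
    return 0;
}
```

Because the increment carries from the last byte toward the first, two IDs from the same millisecond still compare in generation order under a plain byte-wise comparison.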
UUIDv7 Structure (RFC 9562)
Published in May 2024, RFC 9562 officially standardized UUIDv7 to bring time-ordering to the standard UUID specification while maintaining backwards compatibility with the layout of previous RFC 4122 versions.
A UUIDv7 allocates:
- Timestamp (48 bits): Milliseconds since Unix Epoch.
- Version (4 bits): Hardcoded to 0111 (binary for 7).
- Variant (2 bits): Hardcoded to 10 (standard IETF variant).
- Entropy (74 bits): Random data. The RFC also allows part of these bits to carry a counter, or extra clock precision, to guarantee monotonicity within a single millisecond.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32-bit Timestamp                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       16-bit Timestamp        |  Ver  |      12-bit Rand      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Var|                        62-bit Rand                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   (62-bit Rand, continued)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3. Key Commonalities (The “Likes”)
Despite their structural nuances, UUIDv7 and ULID share several core properties:
- 128-Bit Storage Footprint: Both occupy exactly 16 bytes in binary form.
- K-Sortability: Because both start with a 48-bit Unix timestamp, IDs generated at different times will naturally sort chronologically. Within the same millisecond, sorting defaults to random order (or sequential order if monotonicity features are used).
- Millisecond Resolution: Both roll over to the next timestamp step every 1 millisecond.
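- Collision Resistance: A collision requires two IDs generated in the same millisecond to draw identical random bits. With $b$ random bits (74 for UUIDv7, 80 for ULID) and $n$ IDs sharing a millisecond, the probability follows the birthday bound and is negligible at realistic generation rates.
$$P_{collision} \approx \frac{n(n-1)}{2^{b+1}}$$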
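To put a number on this, the birthday approximation can be evaluated directly: n IDs in one millisecond form n(n-1)/2 pairs, and each pair collides with probability 2^-b when every ID carries b uniformly random bits. The helper below is a sketch of that estimate. Its name is invented here, and it assumes no bits are given up to a counter.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate probability that at least two of n IDs sharing one
 * millisecond collide, given `bits` uniformly random bits per ID.
 * Uses the birthday approximation p ~= n(n-1)/2 / 2^bits, which is
 * only meaningful while the result is much smaller than 1. */
double collision_probability(uint64_t n, unsigned bits) {
    assert(bits <= 128);
    double p = (double)n * (double)(n > 0 ? n - 1 : 0) / 2.0;
    for (unsigned i = 0; i < bits; i++) {
        p *= 0.5;              /* divide by 2^bits without needing libm */
    }
    return p;
}
```

With 1,000 IDs in a single millisecond, this gives about 4.1e-19 for ULID's 80 bits and about 2.6e-17 for UUIDv7's 74 bits.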
4. Architectural Differences
| Feature | UUIDv7 (RFC 9562) | ULID |
|---|---|---|
| Standardization Status | Official IETF Standard (RFC 9562) | De facto community specification |
| Text Representation | Standard hex-and-hyphen (8-4-4-4-12) | Crockford’s Base32 (no hyphens) |
| Length (Text) | 36 characters | 26 characters |
| Case Sensitivity | Case-insensitive (typically lowercase) | Case-insensitive (typically uppercase) |
| Entropy Bits | 74 bits | 80 bits |
| Monotonicity | Optional (counter or extra clock precision in the random bits) | Optional (specification describes incrementing the random component) |
| URL Safety | Yes, but longer | Yes, shorter and hyphen-free |
| Database Compatibility | Fits native UUID column types | Stored as 16 raw bytes or a 26-character string |
Text Formatting Comparison
- UUIDv7: 018f8e5b-b9d2-7c3a-8b1a-2895fc7bf321 (36 characters with hyphens)
- ULID: 01HWGQ7EFJDGZ8P6D8JPWVQWS1 (26 alphanumeric characters)
ULID’s Crockford’s Base32 encoding omits the letters I, L, and O, which are easily confused with 1 and 0, and also omits U to avoid accidental obscenities. This reduces transcription errors and makes ULIDs well suited to user-facing contexts such as invoice numbers or URL routes.
5. Optimal Use Cases
When to choose UUIDv7:
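- Enterprise & Legacy Databases: PostgreSQL (uuid) and Microsoft SQL Server (uniqueidentifier) already provide native 16-byte UUID column types, and Oracle deployments commonly store UUIDs as RAW(16). A UUIDv7 fits these existing columns with no schema migration and reduces B-Tree fragmentation for newly inserted rows. One caveat: SQL Server compares uniqueidentifier values in a non-lexicographic byte order, so confirm that inserts are actually sequential there.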
- Standards-Driven Environments: For environments with strict compliance guidelines where only officially sanctioned standards (like IETF/RFCs) are permitted.
- Interoperability: When your APIs and services already consume and validate standard 36-character UUID patterns.
When to choose ULID:
- User-Facing Identifiers: If IDs appear in URLs, customer portals, or emails. ULIDs are shorter, lack intimidating hyphens, and prevent character confusion.
- NoSQL Ecosystems: In key-value and document stores such as MongoDB, DynamoDB, or Redis, string keys are common. A 26-character ULID string sorts lexicographically in creation order, which suits sort keys and time-range scans.
- Strict Monotonicity Requirements: If you require sequential sorting of events generated within the exact same millisecond on a single worker node.
6. Implementation in C
The following C program is a self-contained implementation of both UUIDv7 and ULID generation, complete with string serialization. It uses gettimeofday from the POSIX header <sys/time.h> to read the system time with millisecond precision, and rand() as a placeholder for a secure random source.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <string.h>

// Struct representing a 128-bit identifier
typedef struct {
    uint8_t bytes[16];
} id128_t;

// Crockford's Base32 Alphabet (used by ULID)
static const char BASE32_ALPHABET[] = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Helper: Get Unix epoch timestamp in milliseconds
uint64_t get_current_time_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return ((uint64_t)tv.tv_sec * 1000ULL) + ((uint64_t)tv.tv_usec / 1000ULL);
}

// Helper: Generate pseudo-random bytes
// NOTE: For production environments, use a cryptographically secure source like /dev/urandom
void fill_random_bytes(uint8_t *buffer, size_t length) {
    for (size_t i = 0; i < length; i++) {
        buffer[i] = (uint8_t)(rand() & 0xFF);
    }
}

// Generates an RFC 9562 compliant UUIDv7
id128_t generate_uuid_v7(void) {
    id128_t uuid;
    uint64_t ts = get_current_time_ms();

    // 1. 48-bit timestamp (bytes 0 to 5), big-endian
    uuid.bytes[0] = (uint8_t)((ts >> 40) & 0xFF);
    uuid.bytes[1] = (uint8_t)((ts >> 32) & 0xFF);
    uuid.bytes[2] = (uint8_t)((ts >> 24) & 0xFF);
    uuid.bytes[3] = (uint8_t)((ts >> 16) & 0xFF);
    uuid.bytes[4] = (uint8_t)((ts >> 8) & 0xFF);
    uuid.bytes[5] = (uint8_t)(ts & 0xFF);

    // 2. Fill remaining 10 bytes with random entropy
    fill_random_bytes(&uuid.bytes[6], 10);

    // 3. Set UUIDv7 version bits (0111) in byte 6 (bits 4-7)
    uuid.bytes[6] = (uuid.bytes[6] & 0x0F) | 0x70;

    // 4. Set UUID variant bits (10xx) in byte 8 (bits 6-7)
    uuid.bytes[8] = (uuid.bytes[8] & 0x3F) | 0x80;

    return uuid;
}

// Generates a ULID compliant binary payload
id128_t generate_ulid(void) {
    id128_t ulid;
    uint64_t ts = get_current_time_ms();

    // 1. 48-bit timestamp (bytes 0 to 5), big-endian
    ulid.bytes[0] = (uint8_t)((ts >> 40) & 0xFF);
    ulid.bytes[1] = (uint8_t)((ts >> 32) & 0xFF);
    ulid.bytes[2] = (uint8_t)((ts >> 24) & 0xFF);
    ulid.bytes[3] = (uint8_t)((ts >> 16) & 0xFF);
    ulid.bytes[4] = (uint8_t)((ts >> 8) & 0xFF);
    ulid.bytes[5] = (uint8_t)(ts & 0xFF);

    // 2. 80-bit entropy (bytes 6 to 15)
    fill_random_bytes(&ulid.bytes[6], 10);

    return ulid;
}

// Format a 128-bit ID as an RFC 4122/9562 UUID string.
// out_str must have room for 37 bytes (36 characters plus the terminator).
void format_uuid_string(const id128_t *id, char *out_str) {
    snprintf(out_str, 37,
        "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x",
        id->bytes[0], id->bytes[1], id->bytes[2], id->bytes[3],
        id->bytes[4], id->bytes[5],
        id->bytes[6], id->bytes[7],
        id->bytes[8], id->bytes[9],
        id->bytes[10], id->bytes[11], id->bytes[12], id->bytes[13], id->bytes[14], id->bytes[15]);
}

// Format a 128-bit ID as a 26-character Crockford Base32 ULID string.
// out_str must have room for 27 bytes (26 characters plus the terminator).
void format_ulid_string(const id128_t *id, char *out_str) {
    // ULID string encoding operates on 5-bit chunks.
    // 26 characters hold 130 bits, so the first character carries only
    // 3 bits of the 128-bit value (its top 2 bits are unused).
    uint8_t temp[26];

    // Timestamp: bytes 0-5 (48 bits) become characters 0-9
    temp[0] = (id->bytes[0] & 224) >> 5;
    temp[1] = id->bytes[0] & 31;
    temp[2] = (id->bytes[1] & 248) >> 3;
    temp[3] = ((id->bytes[1] & 7) << 2) | ((id->bytes[2] & 192) >> 6);
    temp[4] = (id->bytes[2] & 62) >> 1;
    temp[5] = ((id->bytes[2] & 1) << 4) | ((id->bytes[3] & 240) >> 4);
    temp[6] = ((id->bytes[3] & 15) << 1) | ((id->bytes[4] & 128) >> 7);
    temp[7] = (id->bytes[4] & 124) >> 2;
    temp[8] = ((id->bytes[4] & 3) << 3) | ((id->bytes[5] & 224) >> 5);
    temp[9] = id->bytes[5] & 31;

    // Entropy: bytes 6-15 (80 bits) become characters 10-25
    temp[10] = (id->bytes[6] & 248) >> 3;
    temp[11] = ((id->bytes[6] & 7) << 2) | ((id->bytes[7] & 192) >> 6);
    temp[12] = (id->bytes[7] & 62) >> 1;
    temp[13] = ((id->bytes[7] & 1) << 4) | ((id->bytes[8] & 240) >> 4);
    temp[14] = ((id->bytes[8] & 15) << 1) | ((id->bytes[9] & 128) >> 7);
    temp[15] = (id->bytes[9] & 124) >> 2;
    temp[16] = ((id->bytes[9] & 3) << 3) | ((id->bytes[10] & 224) >> 5);
    temp[17] = id->bytes[10] & 31;
    temp[18] = (id->bytes[11] & 248) >> 3;
    temp[19] = ((id->bytes[11] & 7) << 2) | ((id->bytes[12] & 192) >> 6);
    temp[20] = (id->bytes[12] & 62) >> 1;
    temp[21] = ((id->bytes[12] & 1) << 4) | ((id->bytes[13] & 240) >> 4);
    temp[22] = ((id->bytes[13] & 15) << 1) | ((id->bytes[14] & 128) >> 7);
    temp[23] = (id->bytes[14] & 124) >> 2;
    temp[24] = ((id->bytes[14] & 3) << 3) | ((id->bytes[15] & 224) >> 5);
    temp[25] = id->bytes[15] & 31;

    // Map each 5-bit index to its Base32 character
    for (int i = 0; i < 26; i++) {
        out_str[i] = BASE32_ALPHABET[temp[i]];
    }
    out_str[26] = '\0';
}

int main(void) {
    // Seed PRNG
    srand((unsigned int)time(NULL));

    printf("--- Generating UUIDv7 and ULID Identifiers ---\n\n");

    for (int i = 0; i < 5; i++) {
        id128_t uuid = generate_uuid_v7();
        id128_t ulid = generate_ulid();
        char uuid_str[37];
        char ulid_str[27];

        format_uuid_string(&uuid, uuid_str);
        format_ulid_string(&ulid, ulid_str);

        printf("Generation Iteration [%d]:\n", i + 1);
        printf(" UUIDv7 string : %s\n", uuid_str);
        printf(" ULID string : %s\n\n", ulid_str);
    }
    return 0;
}
7. Conclusion
Ultimately, the choice between UUIDv7 and ULID depends on your target platform rather than on any difference in architectural merit.
- Go with UUIDv7 if your platform relies heavily on relational databases (PostgreSQL, SQL Server, etc.) and you want to maintain broad compatibility with standard UUID validators and schemas.
- Go with ULID if your priorities involve compact URL representations, NoSQL ecosystems, and user-facing ID readability.