
ML-KEM-512 to ML-KEM-1024: Cache Every Size

May 1, 2026 | 16 min read | Engineering

ML-KEM (Module-Lattice Key Encapsulation Mechanism, FIPS 203) is the post-quantum key exchange standard. Every TLS 1.3 connection that wants post-quantum forward secrecy uses ML-KEM. Chrome ships ML-KEM-768. Firefox ships ML-KEM-768. The post-quantum TLS handshake includes an ML-KEM ciphertext that is 1,088 bytes at Level 3 -- and that ciphertext, along with the derived shared secret, needs to be cached for session resumption. This post walks through all three ML-KEM parameter sets, measures their cache impact at 100K to 10M sessions, and explains why in-process caching is the only viable option at ML-KEM-1024 scale.

ML-KEM-512 public key: 800 B | ML-KEM-768 public key: 1,184 B | ML-KEM-1024 public key: 1,568 B

ML-KEM Parameter Sets: The Complete Picture

ML-KEM defines three parameter sets at NIST security levels 1, 3, and 5. Each set increases the module dimension, which increases key and ciphertext sizes but also increases security against both classical and quantum attacks. The three sets are designed to match the classical security of AES-128, AES-192, and AES-256 respectively, under the assumption that a cryptographically relevant quantum computer exists.

| Parameter | ML-KEM-512 | ML-KEM-768 | ML-KEM-1024 |
| --- | --- | --- | --- |
| NIST security level | 1 (128-bit) | 3 (192-bit) | 5 (256-bit) |
| Module dimension (k) | 2 | 3 | 4 |
| Public key size | 800 B | 1,184 B | 1,568 B |
| Ciphertext size | 768 B | 1,088 B | 1,568 B |
| Shared secret size | 32 B | 32 B | 32 B |
| Private key size | 1,632 B | 2,400 B | 3,168 B |
| Encapsulation time | ~35 us | ~52 us | ~72 us |
| Decapsulation time | ~42 us | ~63 us | ~88 us |
| FIPS standard | FIPS 203 | FIPS 203 | FIPS 203 |
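The sizes in the table follow directly from the FIPS 203 encodings; for example, the public (encapsulation) key is 384k bytes of packed polynomial coefficients plus a 32-byte seed. A minimal Rust sketch capturing the parameter sets as constants, useful for sizing cache buffers:

```rust
/// ML-KEM parameter-set sizes in bytes, per FIPS 203.
#[derive(Clone, Copy)]
struct MlKemParams {
    k: usize,           // module dimension
    public_key: usize,  // encapsulation key
    ciphertext: usize,
    private_key: usize, // decapsulation key
}

const ML_KEM_512: MlKemParams = MlKemParams { k: 2, public_key: 800, ciphertext: 768, private_key: 1_632 };
const ML_KEM_768: MlKemParams = MlKemParams { k: 3, public_key: 1_184, ciphertext: 1_088, private_key: 2_400 };
const ML_KEM_1024: MlKemParams = MlKemParams { k: 4, public_key: 1_568, ciphertext: 1_568, private_key: 3_168 };

/// The shared secret is 32 bytes for every parameter set.
const SHARED_SECRET: usize = 32;

fn main() {
    for p in [ML_KEM_512, ML_KEM_768, ML_KEM_1024] {
        // Public key = 12-bit-packed t vector (384 B per ring element) + 32-byte seed for A.
        assert_eq!(p.public_key, 384 * p.k + 32);
        println!("k={}: pk={} B, ct={} B, secret={} B", p.k, p.public_key, p.ciphertext, SHARED_SECRET);
    }
}
```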

All three parameter sets use the same algebraic structure: a module lattice over the polynomial ring Z_q[x]/(x^256 + 1) with q = 3329. The module dimension k determines the security level. ML-KEM-512 uses k=2 (a 2x2 matrix over the ring), ML-KEM-768 uses k=3 (3x3), and ML-KEM-1024 uses k=4 (4x4). The public key consists of the matrix A (regenerated from a 32-byte seed that is included in the public key, so the full matrix is never transmitted) and the vector t = As + e, where s and e are short error vectors. The ciphertext consists of two components: u = A^T r + e1 and v = t^T r + e2 + encode(message), where r is a random ephemeral vector.

The shared secret is always 32 bytes regardless of the parameter set. This is the output of a key derivation function applied to the encapsulated message. From a caching perspective, the shared secret itself is tiny -- the cache cost is dominated by the ciphertext and public key, which are needed for session resumption.

What Gets Cached in TLS 1.3

TLS 1.3 session resumption using pre-shared keys (PSK) is the primary use case for ML-KEM caching. When a client completes a full TLS handshake with ML-KEM key exchange, the server issues a session ticket that allows the client to resume the connection without a full handshake. The session ticket contains (or references) the pre-shared key derived from the original handshake, along with metadata about the negotiated parameters.

The server must cache the session state to validate resumption. The cached session state includes the pre-shared key (32 bytes, derived from the ML-KEM shared secret), the ML-KEM ciphertext (768-1,568 bytes, depending on the parameter set), the negotiated cipher suite and protocol version (a few bytes), the client certificate (if mutual TLS, variable size), the session ticket identifier (32 bytes), and metadata (creation time, expiration time, SNI, etc., approximately 100-200 bytes).

The total cached session state for ML-KEM-768 (the most commonly deployed parameter set) is approximately 1,088 (ciphertext) + 32 (PSK) + 32 (ticket ID) + 150 (metadata) = 1,302 bytes of session data, plus 72 bytes of cache overhead, for a total of 1,374 bytes per session entry.

For ML-KEM-512, the ciphertext is 768 bytes, so the total is approximately 768 + 214 + 72 = 1,054 bytes per entry. For ML-KEM-1024, the ciphertext is 1,568 bytes, so the total is approximately 1,568 + 214 + 72 = 1,854 bytes per entry.
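These per-entry figures are straightforward to reproduce. A small sketch using the document's estimates (the ~150-byte metadata and 72-byte cache-overhead figures are deployment-dependent approximations):

```rust
// Per-entry cache cost for the ciphertext-caching scenario:
// ciphertext + PSK + ticket ID + metadata + cache overhead.
const PSK: usize = 32;       // pre-shared key derived from the ML-KEM shared secret
const TICKET_ID: usize = 32; // session ticket identifier
const METADATA: usize = 150; // estimate: timestamps, SNI, cipher suite, etc.
const OVERHEAD: usize = 72;  // estimate: cache bookkeeping per entry

fn entry_size(ciphertext: usize) -> usize {
    ciphertext + PSK + TICKET_ID + METADATA + OVERHEAD
}

fn main() {
    assert_eq!(entry_size(768), 1_054);   // ML-KEM-512
    assert_eq!(entry_size(1_088), 1_374); // ML-KEM-768
    assert_eq!(entry_size(1_568), 1_854); // ML-KEM-1024
}
```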

Why Caching the Ciphertext Matters

You might wonder why the ciphertext needs to be cached at all. The shared secret has already been derived -- why not just cache the 32-byte PSK and discard the ciphertext? The answer depends on your session resumption architecture.

In the standard TLS 1.3 PSK mode, you only need the PSK. The ciphertext can be discarded after decapsulation. The cached entry is 32 (PSK) + 32 (ticket ID) + 150 (metadata) + 72 (overhead) = 286 bytes. This is small and scales well: 10 million sessions at 286 bytes each is 2.86 GB.

However, in architectures that support early data (0-RTT) with post-quantum replay protection, the server may need the original ciphertext to verify that a 0-RTT request is a legitimate replay of the original handshake and not a quantum-attacker replay. In this case, the full ciphertext must be cached alongside the PSK. This is the scenario where ML-KEM parameter set choice has the biggest impact on cache size.

Additionally, some architectures cache the full handshake transcript for audit logging, compliance, or post-incident forensics. In these cases, the ML-KEM public key (sent by the server) and the ciphertext (sent by the client) are both part of the cached transcript. The cache entry includes both the public key and the ciphertext: 800+768 = 1,568 bytes for ML-KEM-512, 1,184+1,088 = 2,272 bytes for ML-KEM-768, and 1,568+1,568 = 3,136 bytes for ML-KEM-1024.

Memory Math at Scale

The following tables show cache memory requirements for each ML-KEM parameter set at various session counts. We show three scenarios: PSK-only caching (minimal, 286 bytes/entry), ciphertext caching (1,054-1,854 bytes/entry), and full transcript caching (1,640-3,208 bytes/entry).

Scenario 1: PSK-Only Caching (286 bytes/entry)

| Sessions | ML-KEM-512 | ML-KEM-768 | ML-KEM-1024 |
| --- | --- | --- | --- |
| 100,000 | 28.6 MB | 28.6 MB | 28.6 MB |
| 1,000,000 | 286 MB | 286 MB | 286 MB |
| 10,000,000 | 2.86 GB | 2.86 GB | 2.86 GB |

In PSK-only mode, the ML-KEM parameter set has zero impact on cache size because the ciphertext is discarded. The cache cost is identical across all three parameter sets. This is the most memory-efficient approach and is appropriate when 0-RTT replay protection and transcript auditing are not required.

Scenario 2: Ciphertext Caching

| Sessions | ML-KEM-512 (1,054 B) | ML-KEM-768 (1,374 B) | ML-KEM-1024 (1,854 B) |
| --- | --- | --- | --- |
| 100,000 | 105 MB | 137 MB | 185 MB |
| 1,000,000 | 1.05 GB | 1.37 GB | 1.85 GB |
| 10,000,000 | 10.5 GB | 13.7 GB | 18.5 GB |

Scenario 3: Full Transcript Caching

| Sessions | ML-KEM-512 (1,640 B) | ML-KEM-768 (2,344 B) | ML-KEM-1024 (3,208 B) |
| --- | --- | --- | --- |
| 100,000 | 164 MB | 234 MB | 321 MB |
| 1,000,000 | 1.64 GB | 2.34 GB | 3.21 GB |
| 10,000,000 | 16.4 GB | 23.4 GB | 32.1 GB |
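The scenario tables above are plain multiplication (sessions times bytes per entry), reported in decimal gigabytes. A quick sketch to reproduce any cell:

```rust
// Total cache size = sessions * bytes/entry, in decimal GB as used in the tables.
fn total_gb(sessions: u64, entry_bytes: u64) -> f64 {
    (sessions * entry_bytes) as f64 / 1e9
}

fn main() {
    // Scenario 2 (ciphertext caching), 10M sessions, ML-KEM-768:
    println!("{:.1} GB", total_gb(10_000_000, 1_374)); // 13.7 GB
    // Scenario 3 (full transcript), 10M sessions, ML-KEM-1024:
    println!("{:.1} GB", total_gb(10_000_000, 3_208)); // 32.1 GB
}
```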

At 10 million sessions with full transcript caching, ML-KEM-1024 requires 32.1 GB -- more than the available RAM on many standard server configurations. Even ML-KEM-768 at 23.4 GB is a significant memory commitment. ML-KEM-512 at 16.4 GB is the most feasible for single-process in-memory caching, but it still requires a large-memory instance.

Browser Adoption: ML-KEM-768 Is the Default

As of early 2026, both Chrome and Firefox use ML-KEM-768 in their TLS 1.3 implementations. The hybrid key exchange combines X25519 (32-byte public key, 32-byte shared secret) with ML-KEM-768 (1,184-byte public key, 1,088-byte ciphertext, 32-byte shared secret). The client's ClientHello key share is 32 + 1,184 = 1,216 bytes (X25519 public key + ML-KEM-768 encapsulation key), and the server's key share is 32 + 1,088 = 1,120 bytes (X25519 public key + ML-KEM-768 ciphertext). The hybrid key exchange totals 2,336 bytes, roughly 2,272 bytes more than the 64 bytes exchanged in an X25519-only handshake.
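The byte accounting for the hybrid exchange can be sketched as follows (the encapsulation key travels in one direction, the ciphertext in the other, and each side also carries a 32-byte X25519 share):

```rust
// Byte cost of the X25519MLKEM768 hybrid key exchange in TLS 1.3.
const X25519_SHARE: usize = 32;     // X25519 public key, each direction
const ML_KEM_768_EK: usize = 1_184; // ML-KEM-768 encapsulation (public) key
const ML_KEM_768_CT: usize = 1_088; // ML-KEM-768 ciphertext

fn main() {
    let ek_side = X25519_SHARE + ML_KEM_768_EK; // 1,216 B: the encapsulation-key share
    let ct_side = X25519_SHARE + ML_KEM_768_CT; // 1,120 B: the ciphertext share
    let hybrid_total = ek_side + ct_side;       // 2,336 B on the wire
    let classical_total = 2 * X25519_SHARE;     // 64 B for an X25519-only exchange
    println!("hybrid adds {} bytes over X25519-only", hybrid_total - classical_total);
}
```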

The choice of ML-KEM-768 (NIST Level 3) over ML-KEM-512 (Level 1) reflects a conservative security posture: browsers want a comfortable security margin against potential improvements in lattice attacks, even if ML-KEM-512 is believed to be sufficient today. The choice over ML-KEM-1024 (Level 5) reflects a practical bandwidth concern: the additional 480 bytes of ciphertext per handshake (1,568 vs 1,088 bytes), plus 384 more bytes of public key, matters at browser scale, where millions of handshakes per second aggregate to gigabytes of additional bandwidth.

The Hybrid Mode: X25519 + ML-KEM-768

The hybrid key exchange (X25519MLKEM768 in TLS terminology) combines a classical key exchange (X25519) with a post-quantum key exchange (ML-KEM-768). The shared secret is the concatenation of both shared secrets, fed through a key derivation function. This hybrid provides security against both classical attackers (if ML-KEM is broken) and quantum attackers (if X25519 is broken). It is the defense-in-depth approach recommended by NIST and adopted by all major browsers.

For caching, the hybrid mode means the session state includes both the X25519 and ML-KEM components. The X25519 shared secret is 32 bytes, and the ML-KEM shared secret is 32 bytes. If you are caching the full ciphertext, you cache the ML-KEM ciphertext (1,088 bytes for ML-KEM-768) plus the X25519 public key (32 bytes). The X25519 public key is small enough to be irrelevant to the cache size calculation. The ML-KEM ciphertext dominates.

The combined hybrid ciphertext for session caching is 32 (X25519) + 1,088 (ML-KEM-768) = 1,120 bytes. With session metadata and cache overhead, the total per-entry cost is approximately 1,120 + 214 + 72 = 1,406 bytes. At 10 million sessions, this is 14.1 GB -- a significant but manageable amount for a dedicated session cache server with 32 GB RAM.

Why Redis Fails at ML-KEM-1024 Scale

Consider a deployment that caches 10 million TLS session states with ML-KEM-1024 full transcript caching (3,208 bytes per entry). The total data size is 32.1 GB. Redis can handle this data volume -- a Redis instance with 48 GB RAM would hold the dataset with room for fragmentation and overhead. The problem is not storage. The problem is latency and throughput.

A Redis GET for a 3,208-byte value takes approximately 200 microseconds on a same-AZ deployment. At 100,000 session resumptions per second (a reasonable load for a large web service), the Redis cluster processes 100,000 GETs per second, each returning 3.2 KB. The total data throughput is 320 MB/s (about 2.6 Gbps), roughly a quarter of a 10 Gbps network link. The average latency of 200 microseconds adds directly to the session resumption time.

Compare this to an in-process DashMap. The same 100,000 lookups per second take 31 nanoseconds each, for a total of 3.1 milliseconds of CPU time per second. The data stays in-process -- no network round-trip, no serialization, no TCP overhead. The throughput is limited only by CPU speed and memory bandwidth, both of which are orders of magnitude above what Redis can deliver over the network.

The in-process approach does require 32.1 GB of RAM per process, which means each process that handles session resumption must be deployed on a high-memory instance. But the alternative -- Redis with 200-microsecond latency -- adds 200 microseconds to every session resumption. For a TLS handshake that should complete in 1-2 milliseconds, adding 200 microseconds (10-20% of the total) for a cache lookup is a significant overhead. The in-process lookup at 31 nanoseconds adds 0.0031% overhead.
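The overhead comparison above can be made concrete with a one-liner. The 1 ms handshake budget and the latency figures are the illustrative numbers quoted in this post, not fresh benchmarks:

```rust
// Cache-lookup overhead as a percentage of a TLS resumption handshake budget.
fn overhead_pct(lookup_s: f64, handshake_s: f64) -> f64 {
    lookup_s / handshake_s * 100.0
}

fn main() {
    let handshake = 1.0e-3; // 1 ms resumption handshake budget
    let redis = overhead_pct(200e-6, handshake);  // 200 us network round-trip -> 20%
    let in_proc = overhead_pct(31e-9, handshake); // 31 ns in-process lookup -> 0.0031%
    println!("redis: {redis}%  in-process: {in_proc}%");
}
```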

// TLS session cache with ML-KEM support (sketch).
// SESSION_CACHE is assumed to be a process-global concurrent map,
// e.g. a DashMap<[u8; 32], SessionEntry> behind a LazyLock.

#[derive(Clone)]
struct SessionEntry {
    psk: [u8; 32],                    // Pre-shared key
    ml_kem_ciphertext: Vec<u8>,       // 768-1,568 bytes, depending on parameter set
    x25519_public: [u8; 32],          // Classical key share
    cipher_suite: u16,                // Negotiated suite
    created_at: u64,                  // Unix timestamp
    expires_at: u64,                  // TTL
    sni: String,                      // Server name
}

// ML-KEM-768: ~1,374 bytes/entry, 13.7 GB at 10M sessions
// ML-KEM-1024: ~1,854 bytes/entry, 18.5 GB at 10M sessions
// Both: 31ns lookup latency, in-process

fn resume_session(ticket_id: &[u8; 32]) -> Option<SessionEntry> {
    // DashMap::get returns a guard; clone the entry out. CacheeLFU eviction.
    SESSION_CACHE.get(ticket_id).map(|entry| entry.clone())
}

Choosing the Right ML-KEM Parameter Set

The choice between ML-KEM-512, ML-KEM-768, and ML-KEM-1024 for your caching infrastructure depends on three factors: required security level, cache memory budget, and compatibility with browser defaults.

ML-KEM-512: Maximum Cache Efficiency

ML-KEM-512 offers NIST Level 1 security (equivalent to AES-128) with the smallest keys and ciphertexts. Public key: 800 bytes. Ciphertext: 768 bytes. At 10 million sessions with ciphertext caching, the total is 10.5 GB. This is the most cache-efficient option, but it provides the lowest security margin. No major browser currently deploys ML-KEM-512 for TLS, so using it requires custom client implementations or server-to-server communication where you control both endpoints.

Use ML-KEM-512 when: you control both sides of the connection (inter-service communication, IoT device-to-gateway), AES-128-equivalent security is sufficient for your threat model, and minimizing cache memory is a priority.

ML-KEM-768: Browser-Compatible Default

ML-KEM-768 offers NIST Level 3 security (equivalent to AES-192) and is the parameter set deployed by Chrome and Firefox. Public key: 1,184 bytes. Ciphertext: 1,088 bytes. At 10 million sessions with ciphertext caching, the total is 13.7 GB. This is the parameter set you will use for any deployment that needs to interoperate with standard web browsers.

Use ML-KEM-768 when: you need compatibility with Chrome, Firefox, or any browser-based client; you want the parameter set that the broader ecosystem has tested and optimized for; or when your compliance requirements specify NIST Level 3 or higher. This is the default choice for most deployments.

ML-KEM-1024: Maximum Security

ML-KEM-1024 offers NIST Level 5 security (equivalent to AES-256) with the largest keys and ciphertexts. Public key: 1,568 bytes. Ciphertext: 1,568 bytes. At 10 million sessions with ciphertext caching, the total is 18.5 GB. This parameter set is appropriate for high-security deployments where the additional memory cost is justified by the security requirement.

Use ML-KEM-1024 when: your security policy mandates NIST Level 5 (256-bit equivalent security against quantum attacks), you are protecting data with a long secrecy horizon (30+ years), or you want the maximum security margin against future improvements in lattice attacks. The additional memory cost (18.5 GB vs 13.7 GB at 10M sessions, a 35% increase over ML-KEM-768) is the price of the additional security margin.

Eviction Policy for Session Caches

TLS session caches have a natural TTL: session tickets expire (typically after 1-24 hours). But the access pattern within that TTL is not uniform. Some sessions are accessed frequently (long-lived connections with periodic resumption), while others are accessed once and never resumed (clients that connect, make one request, and leave).

CacheeLFU is the correct eviction policy for TLS session caches. Frequency-based eviction keeps the most-resumed sessions in cache and evicts sessions that were established but never resumed. In a typical web application, approximately 30-40% of sessions are resumed at least once, and 10-15% are resumed more than 5 times. CacheeLFU keeps the high-resumption sessions in cache even when the cache is full, which maximizes the hit rate on the entries that matter most (the entries that are actually used for session resumption).

LRU eviction would evict long-lived, frequently resumed sessions in favor of recently created but never-resumed sessions. This is pathological for session caches: the most valuable entries (active sessions) are replaced by the least valuable entries (abandoned sessions). CacheeLFU avoids this by tracking access frequency, not just recency.
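The frequency-vs-recency distinction is easy to demonstrate with a toy eviction loop. The sketch below is illustrative only and is not the CacheeLFU implementation (which would use probabilistic counters, frequency aging, and so on); it just shows why count-based eviction keeps a once-resumed session while discarding a never-resumed one:

```rust
use std::collections::HashMap;

// Minimal frequency-based eviction sketch: on overflow, evict the entry
// with the lowest access count.
struct TinyLfu<V> {
    capacity: usize,
    entries: HashMap<[u8; 32], (V, u64)>, // value + access count
}

impl<V> TinyLfu<V> {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new() }
    }

    fn get(&mut self, key: &[u8; 32]) -> Option<&V> {
        // Each hit bumps the frequency counter.
        self.entries.get_mut(key).map(|(v, count)| { *count += 1; &*v })
    }

    fn insert(&mut self, key: [u8; 32], value: V) {
        if self.entries.len() >= self.capacity && !self.entries.contains_key(&key) {
            // Evict the least-frequently-accessed entry (e.g. a never-resumed session).
            if let Some(victim) = self.entries.iter().min_by_key(|(_, (_, c))| *c).map(|(k, _)| *k) {
                self.entries.remove(&victim);
            }
        }
        self.entries.insert(key, (value, 0));
    }
}

fn main() {
    let mut cache = TinyLfu::new(2);
    cache.insert([1; 32], "resumed session");
    cache.insert([2; 32], "abandoned session");
    cache.get(&[1; 32]);                 // one resumption -> count 1
    cache.insert([3; 32], "new session"); // evicts [2; 32], the count-0 entry
    assert!(cache.get(&[1; 32]).is_some());
    assert!(cache.get(&[2; 32]).is_none());
}
```

An LRU in the same scenario would have kept the most recently inserted (abandoned) session and evicted the resumed one.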

ML-KEM Session Key Lifetime

ML-KEM shared secrets derived during TLS handshakes must be treated as ephemeral. Do not cache ML-KEM shared secrets beyond the session ticket lifetime (typically 1-24 hours). Longer caching increases the window during which a compromised cache exposes session keys. The TTL on cached session entries should match the session ticket lifetime configured in your TLS server, and entries should be securely erased (zeroed) on eviction, not just marked as deleted. In-process caching with CacheeLFU supports both TTL-based expiration and secure erasure callbacks on eviction.
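Secure erasure deserves a concrete illustration: a plain `buf.fill(0)` before drop can be optimized away as a dead store. The sketch below uses volatile writes to prevent that; production code would more likely use a vetted crate such as `zeroize` (an assumption about your dependency choices, not something the post prescribes):

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

// Zero a secret buffer with volatile writes so the compiler cannot
// eliminate the wipe as a dead store.
fn secure_zero(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        unsafe { ptr::write_volatile(byte, 0) };
    }
    compiler_fence(Ordering::SeqCst);
}

struct CachedPsk([u8; 32]);

impl Drop for CachedPsk {
    fn drop(&mut self) {
        // Wipe the PSK when the cache entry is evicted, not just deallocated.
        secure_zero(&mut self.0);
    }
}

fn main() {
    let mut psk = [0x42u8; 32];
    secure_zero(&mut psk);
    assert_eq!(psk, [0u8; 32]);
}
```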

Cross-Level Caching in Multi-Tenant Deployments

Large deployments often serve multiple security levels simultaneously. A CDN, for example, might terminate TLS connections for thousands of customers, each with different security requirements. Customer A requires ML-KEM-1024 for NIST Level 5 compliance. Customer B uses the browser default of ML-KEM-768. Customer C uses ML-KEM-512 for inter-service communication. All three share the same TLS termination infrastructure and the same session cache.

The session cache must handle entries of different sizes efficiently. A naive approach stores all entries in the same DashMap with the size variation absorbed by per-entry heap allocation. This works but leads to memory fragmentation: the allocator must handle a mix of 1,054-byte, 1,374-byte, and 1,854-byte allocations, which reduces allocation efficiency.

A better approach uses size-class segregation: three separate DashMaps, one per ML-KEM parameter set. Each map uses a fixed-size allocation pool (no fragmentation within a size class), and the CacheeLFU eviction runs independently per map (so a flood of ML-KEM-512 sessions does not evict ML-KEM-1024 sessions). The routing is simple: the session ticket includes a byte indicating the ML-KEM parameter set, and the cache lookup dispatches to the appropriate map.

struct MultiLevelSessionCache {
    kem_512: DashMap<[u8; 32], Session512>,   // 1,054 B/entry
    kem_768: DashMap<[u8; 32], Session768>,   // 1,374 B/entry
    kem_1024: DashMap<[u8; 32], Session1024>, // 1,854 B/entry
}

impl MultiLevelSessionCache {
    fn lookup(&self, ticket: &SessionTicket) -> Option<SessionState> {
        // Dispatch on the parameter-set byte in the ticket, then convert the
        // size-specific entry into the common SessionState type.
        // 31ns regardless of level.
        match ticket.kem_level() {
            KemLevel::L1 => self.kem_512.get(&ticket.id()).map(|s| s.value().clone().into()),
            KemLevel::L3 => self.kem_768.get(&ticket.id()).map(|s| s.value().clone().into()),
            KemLevel::L5 => self.kem_1024.get(&ticket.id()).map(|s| s.value().clone().into()),
        }
    }
}

This architecture keeps lookup latency at 31 nanoseconds regardless of which ML-KEM level the session uses. The per-map capacity can be configured independently: more capacity for ML-KEM-768 (the browser default, highest volume) and less for ML-KEM-512 and ML-KEM-1024 (lower volume). The total memory is the sum of the three maps, which can be monitored and tuned independently.

The Bottom Line

ML-KEM at three security levels produces public keys of 800, 1,184, and 1,568 bytes and ciphertexts of 768, 1,088, and 1,568 bytes. At 10 million TLS sessions with ciphertext caching, the cache requirements range from 10.5 GB (ML-KEM-512) to 18.5 GB (ML-KEM-1024). Redis adds 200 microseconds per session resumption at these sizes -- 10-20% of the total TLS handshake time. In-process caching at 31 nanoseconds eliminates this overhead entirely. ML-KEM-768 is the browser default and the right choice for most deployments. Use CacheeLFU eviction with TTL matching your session ticket lifetime, and consider size-class segregation for multi-tenant deployments that serve multiple security levels.

Cache ML-KEM sessions at 31 nanoseconds. Every parameter set, every security level.

brew install cachee