ML-KEM FIPS 203 TLS 1.3 CNSA 2.0

PQ Key Exchange Caching

ML-KEM ciphertexts are 24-49x larger than X25519 key shares. Every TLS session pays the cost.
Redis adds 0.4ms. In-process: 31 nanoseconds.

49x
ML-KEM-1024 vs X25519
1,568B
ML-KEM-1024 Ciphertext
31ns
Cached Lookup
10M+
Sessions at Scale
Definition

PQ key exchange caching stores the results of post-quantum key encapsulation (ML-KEM) for TLS session resumption, VPN establishment, and encrypted messaging. ML-KEM ciphertexts range from 768 to 1,568 bytes -- 24-49x larger than classical X25519 key shares (32 bytes). Caching these results in-process at 31 nanoseconds eliminates the latency and memory overhead of re-encapsulation on every connection, while keeping the session cache ready for the post-quantum transition mandated by CNSA 2.0.

The Size Explosion

X25519 Key Share vs ML-KEM Ciphertexts

Every TLS handshake sends a key share. Post-quantum key shares are dramatically larger.

X25519 Key Share (classical) 32 bytes
ML-KEM-512 Ciphertext (Level 1) 768 bytes
24x
ML-KEM-768 Ciphertext (Level 3 -- Chrome/Firefox default) 1,088 bytes
34x
ML-KEM-1024 Ciphertext (Level 5) 1,568 bytes
49x
49x
ML-KEM-1024 ciphertext vs X25519 key share. This is the new baseline for every TLS handshake.
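The multipliers above follow directly from the FIPS 203 ciphertext sizes. A quick sketch to verify the ratios against the 32-byte X25519 key share:

```python
# FIPS 203 ciphertext sizes in bytes vs. a classical X25519 key share
X25519_SHARE = 32
MLKEM_CT = {"ML-KEM-512": 768, "ML-KEM-768": 1088, "ML-KEM-1024": 1568}

for name, size in MLKEM_CT.items():
    print(f"{name}: {size} B = {size // X25519_SHARE}x X25519")
# → ML-KEM-512: 768 B = 24x X25519
# → ML-KEM-768: 1088 B = 34x X25519
# → ML-KEM-1024: 1568 B = 49x X25519
```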
FIPS 203 Parameter Sets

Three ML-KEM Levels. One Cache.

Each parameter set trades size for security. All three are cacheable. All three are dramatically larger than what your infrastructure was built for.

ML-KEM-512
NIST Level 1
Public Key: 800 B
Ciphertext: 768 B
Shared Secret: 32 B
Encapsulation: ~50 µs
Cached Lookup: 31 ns
Lightest
ML-KEM-768
NIST Level 3
Public Key: 1,184 B
Ciphertext: 1,088 B
Shared Secret: 32 B
Encapsulation: ~80 µs
Cached Lookup: 31 ns
Chrome / Firefox Default
ML-KEM-1024
NIST Level 5
Public Key: 1,568 B
Ciphertext: 1,568 B
Shared Secret: 32 B
Encapsulation: ~120 µs
Cached Lookup: 31 ns
Maximum Security

The shared secret is always 32 bytes regardless of parameter set. The ciphertext is what explodes. And the ciphertext is what gets stored in your session cache.

TLS 1.3 Integration

Where ML-KEM Lives in Your Handshake

The TLS 1.3 handshake includes a key_share extension in both the ClientHello and ServerHello. With post-quantum key exchange, the key share contains the ML-KEM ciphertext instead of (or in addition to) an X25519 point. This is where the size explosion hits the wire.

1
C→S ClientHello: supported_versions, cipher_suites, key_share (ML-KEM-768 public key: 1,184 B)
2
S→C ServerHello: key_share (ML-KEM-768 ciphertext: 1,088 B) -- this is what gets cached
3
Derive shared secret (32 B) from ciphertext + private key
4
HKDF-Expand: handshake keys, application keys, finished verification
5
S→C NewSessionTicket: contains encrypted session state + ML-KEM ciphertext for resumption

Step 5 is where caching matters most. The session ticket is stored server-side for resumption. At 10 million concurrent sessions, the ML-KEM ciphertexts alone consume gigabytes. An in-process cache with 31ns lookups makes session resumption nearly free.

Memory Math

Session Cache At Scale

Classical session tickets are small. Post-quantum session tickets are not. Here is what happens to your session cache as you scale.

Sessions X25519 (32B) ML-KEM-512 (768B) ML-KEM-768 (1,088B) ML-KEM-1024 (1,568B)
100K 3.1 MB 73.2 MB 103.8 MB 149.5 MB
500K 15.3 MB 366.2 MB 518.8 MB 747.7 MB
1M 30.5 MB 732.4 MB 1.04 GB 1.49 GB
10M 305 MB 7.32 GB 10.37 GB 14.95 GB

Note: These figures count the key-share material alone. Session metadata (ticket ID, creation time, expiry, cipher suite) adds roughly 200 bytes per session on top, but the ML-KEM ciphertext remains the dominant component. Classical X25519 sessions fit in L3 cache at 1M sessions. ML-KEM-1024 sessions require 49x more memory at every scale.
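The table rows can be reproduced from the ciphertext sizes alone. A sketch, using binary megabytes (MiB) to match the table's "MB" figures:

```python
# Session-cache footprint of the key-share material alone (no metadata).
# "MB" here is MiB-based (bytes / 2**20), matching the table above.
SIZES = {"X25519": 32, "ML-KEM-512": 768, "ML-KEM-768": 1088, "ML-KEM-1024": 1568}

def footprint_mb(sessions: int, share_bytes: int) -> float:
    return round(sessions * share_bytes / 2**20, 1)

for sessions in (100_000, 500_000, 1_000_000, 10_000_000):
    row = {name: footprint_mb(sessions, b) for name, b in SIZES.items()}
    print(f"{sessions:>10,}: {row}")
```

At 100K sessions this yields 3.1 MB for X25519 and 149.5 MB for ML-KEM-1024, matching the first table row.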

Hybrid Key Exchange

X25519 + ML-KEM-768 = 1,280 Bytes

Chrome, Firefox, and Cloudflare already deploy hybrid key exchange in production. The X25519Kyber768Draft00 (now X25519MLKEM768) key share combines classical and post-quantum key exchange in a single TLS handshake. The combined key share is 1,280 bytes -- the X25519 point (32 bytes) plus the ML-KEM-768 ciphertext (1,088 bytes) plus encoding overhead (160 bytes).

Hybrid Key Share Structure

X25519
32 B
+
ML-KEM-768 ct
1,088 B
+
Encoding
160 B
=
1,280 B total
40x classical

Both shared secrets are derived independently and combined via HKDF. Security holds if either X25519 or ML-KEM-768 remains unbroken. The cache stores the full hybrid key share for resumption.
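The combine step can be sketched as concatenate-then-extract. This is an illustrative sketch using RFC 5869 HKDF-Extract with placeholder secrets, not the exact TLS 1.3 key-schedule wiring:

```python
import hashlib
import hmac
import os

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # RFC 5869 HKDF-Extract with SHA-256: PRK = HMAC(salt, ikm)
    return hmac.new(salt, ikm, hashlib.sha256).digest()

# Placeholder secrets standing in for the real X25519 and ML-KEM-768 outputs
x25519_ss = os.urandom(32)
mlkem_ss = os.urandom(32)

# Hybrid: concatenate both shared secrets, extract one 32-byte secret.
# The result stays unpredictable if either input secret does.
combined = hkdf_extract(salt=b"\x00" * 32, ikm=x25519_ss + mlkem_ss)
print(len(combined))  # → 32
```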

This is not theoretical. Google reported that 10-20% of Chrome TLS connections already use hybrid key exchange as of late 2025. By the time CNSA 2.0 deadlines arrive, 100% of connections will require it. Your session cache needs to handle 1,280-byte values at the same latency it handles 32-byte values today.

Architecture

Two-Tier Session Cache

In-process L1 for session ticket lookups at 31ns. Optional L2 for cross-instance session sharing.

L1: In-Process Cache (31ns)
TLS session resumption request arrives
Ticket ID lookup in DashMap
Return cached ML-KEM ciphertext + shared secret (31ns)
Skip encapsulation. Resume session.
Same process, zero network, zero serialization.
L2: Cross-Instance (Cachee RESP)
L1 cache miss (new instance, cold start)
Query Cachee cluster via RESP protocol
Return cached session data (~0.4ms network)
Populate L1. Subsequent lookups at 31ns.
Cross-instance consistency without fresh encapsulation.

L1 handles 99%+ of session resumptions. L2 handles cold starts, failovers, and cross-AZ consistency. Both are faster than running ML-KEM encapsulation (50-120 microseconds) on every connection.
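The two-tier flow above can be sketched in a few lines. This is a minimal illustration (a plain dict standing in for the L1 map, a pluggable callback standing in for the L2 RESP client); names and shapes are assumptions, not the Cachee API:

```python
import time
from typing import Callable, Optional

class TwoTierSessionCache:
    """Sketch of an L1/L2 session-ticket cache: in-process dict first,
    network-backed L2 fetch only on a miss (cold start, failover)."""

    def __init__(self, l2_fetch: Callable[[str], Optional[bytes]], ttl_s: float = 7200.0):
        self._l1: dict = {}          # ticket_id -> (value, expiry)
        self._l2_fetch = l2_fetch    # network hop, only on L1 miss
        self._ttl = ttl_s

    def get(self, ticket_id: str) -> Optional[bytes]:
        hit = self._l1.get(ticket_id)
        if hit is not None:
            value, expires = hit
            if time.monotonic() < expires:
                return value          # L1 hit: no network, no serialization
            del self._l1[ticket_id]   # expired entry
        value = self._l2_fetch(ticket_id)
        if value is not None:
            # Populate L1 so subsequent lookups skip the network
            self._l1[ticket_id] = (value, time.monotonic() + self._ttl)
        return value

# Usage: an L2 stub returning a fake cached session blob
cache = TwoTierSessionCache(
    l2_fetch=lambda tid: b"ct+ss+meta" if tid == "ticket_a9f3" else None)
assert cache.get("ticket_a9f3") == b"ct+ss+meta"  # filled from L2
assert cache.get("ticket_a9f3") == b"ct+ss+meta"  # now served from L1
```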

Latency

ML-KEM Encapsulation vs Cached Lookup

Fresh encapsulation vs cached session resumption. The math is not close.

ML-KEM-768 Encapsulation (polynomial multiply + NTT + compress + encode) 80,000 ns
Re-encapsulate every connection
Cached Session Ticket (hash lookup + pointer dereference) 31 ns
2,580x
No polynomial arithmetic. No NTT. No noise sampling. Just the session ticket, from cache.
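The speedup follows from the encapsulation times quoted in the parameter-set cards above:

```python
# Fresh encapsulation time (ns) per parameter set vs. one cached lookup
ENCAPSULATE_NS = {"ML-KEM-512": 50_000, "ML-KEM-768": 80_000, "ML-KEM-1024": 120_000}
CACHED_LOOKUP_NS = 31

for name, ns in ENCAPSULATE_NS.items():
    print(f"{name}: {ns // CACHED_LOOKUP_NS:,}x faster from cache")
# → ML-KEM-768: 2,580x faster from cache
```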
CNSA 2.0 Compliance

The Deadline is Not Negotiable

NSA's CNSA 2.0 sets hard deadlines for post-quantum migration. Key exchange is the first category to be affected because harvest-now-decrypt-later attacks make every classical handshake a future liability. Your session cache is the component that needs to be ready first.

2025 -- Now
Chrome, Firefox, Cloudflare deploying hybrid X25519+ML-KEM-768. 10-20% of TLS connections already post-quantum. Your session cache is already handling 1,088-byte key shares.
2027 -- CNSA 2.0 Preference
NSS systems should prefer ML-KEM for all key establishment. Hybrid mode becomes the minimum for government contractors and regulated industries.
2029 -- CNSA 2.0 Exclusive
Classical-only key exchange deprecated for NSS. All key establishment must include a CNSA 2.0 approved algorithm (ML-KEM). Session caches must handle PQ-sized values natively.
2030 -- CNSA 2.0 Mandatory
All key establishment in NSS must use exclusively CNSA 2.0 algorithms. No classical fallback. ML-KEM session caching is not optional -- it is infrastructure.

Harvest-now-decrypt-later means every classical TLS session is a future plaintext. The session cache is where the transition starts: swap the key share, cache the result, resume at 31ns. The wire format changes. The latency does not.

cachee-mlkem-session-demo
[1] TLS ClientHello: X25519MLKEM768, key_share=1,184B public key
[2] ML-KEM-768 encapsulate: ciphertext=1,088B, shared_secret=32B
[3] Cache session ticket: SET tls:ticket_a9f3 {ct+ss+meta} TTL 7200
[4] Encapsulation time: 82us
 
[5] Client reconnects: session resumption
[6] Cache hit: GET tls:ticket_a9f3 31ns
[7] Skip encapsulation. Derive keys from cached shared secret.
 
    2,580x faster. Zero re-encapsulation.

Run it yourself: brew install cachee && cachee-mlkem-demo

The Redis Problem

Why Redis Cannot Keep Up

Redis is designed for general-purpose key-value storage. It is not designed for sub-microsecond session ticket lookups with 1,088-byte values. The bottleneck is not Redis itself -- it is the network round-trip, serialization, and deserialization that wrap every operation.

Operation                   Redis (network)   Cachee L1 (in-process)   Speedup
GET 32B (X25519)            ~350 µs           28 ns                    12,500x
GET 1,088B (ML-KEM-768)     ~400 µs           31 ns                    12,900x
GET 1,568B (ML-KEM-1024)    ~420 µs           33 ns                    12,727x
SET 1,088B + TTL            ~450 µs           45 ns                    10,000x

For TLS session resumption, where the entire point is to skip the handshake and resume instantly, adding 400 microseconds of cache latency defeats the purpose. In-process caching at 31ns keeps session resumption below the 1-millisecond threshold that users perceive as instantaneous.
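The speedup column in the table derives directly from the latency figures (the Redis numbers are the document's benchmarks, including network round-trip and serialization; they are not re-measured here):

```python
# (Redis latency in ns, in-process L1 latency in ns) per operation,
# taken from the comparison table above
OPS = {
    "GET 32B (X25519)":         (350_000, 28),
    "GET 1,088B (ML-KEM-768)":  (400_000, 31),
    "GET 1,568B (ML-KEM-1024)": (420_000, 33),
    "SET 1,088B + TTL":         (450_000, 45),
}

for op, (redis_ns, l1_ns) in OPS.items():
    print(f"{op}: {redis_ns / l1_ns:,.0f}x")
```

Note that even the slowest in-process operation (a 45 ns SET) leaves four orders of magnitude of headroom under the 1-millisecond budget.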

Install

Get Started

brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start

# Cache an ML-KEM session ticket
SET tls:session_abc {ciphertext+shared_secret+meta} TTL 7200

# Retrieve at 31ns for session resumption
GET tls:session_abc

# Bulk preload session tickets from L2
MGET tls:session_* --warm-l1

140+ Redis-compatible commands. Drop-in replacement for your existing session cache infrastructure. The TLS library integration does not change -- only the backing store.

Your session cache is the first thing that needs to go post-quantum.

ML-KEM ciphertexts are here. Chrome is sending them now. Cache them at 31ns.
