ML-KEM FIPS 203 TLS 1.3 CNSA 2.0

PQ Key Exchange Caching

ML-KEM ciphertexts are 24-49x larger than X25519 key shares. Every TLS session pays the cost.
Redis adds 0.4ms. In-process: 31 nanoseconds.

49x
ML-KEM-1024 vs X25519
1,568B
ML-KEM-1024 Ciphertext
31ns
Cached Lookup
10M+
Sessions at Scale
Definition

PQ key exchange caching stores the results of post-quantum key encapsulation (ML-KEM) for TLS session resumption, VPN establishment, and encrypted messaging. ML-KEM ciphertexts range from 768 to 1,568 bytes -- 24-49x larger than classical X25519 key shares (32 bytes). Caching these results in-process at 31 nanoseconds eliminates the latency and memory overhead of re-encapsulation on every connection, while keeping the session cache ready for the post-quantum transition mandated by CNSA 2.0.

The Size Explosion

X25519 Key Share vs ML-KEM Ciphertexts

Every TLS handshake sends a key share. Post-quantum key shares are dramatically larger.

X25519 Key Share (classical) 32 bytes
ML-KEM-512 Ciphertext (Level 1) 768 bytes
24x
ML-KEM-768 Ciphertext (Level 3 -- Chrome/Firefox default) 1,088 bytes
34x
ML-KEM-1024 Ciphertext (Level 5) 1,568 bytes
49x
49x
ML-KEM-1024 ciphertext vs X25519 key share. This is the new baseline for every TLS handshake.
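The multipliers above follow directly from the FIPS 203 ciphertext sizes. A quick sketch to verify the ratios against the 32-byte X25519 key share:

```python
# FIPS 203 ciphertext sizes in bytes vs. a classical X25519 key share
X25519_SHARE = 32
MLKEM_CT = {"ML-KEM-512": 768, "ML-KEM-768": 1088, "ML-KEM-1024": 1568}

for name, size in MLKEM_CT.items():
    print(f"{name}: {size} B = {size // X25519_SHARE}x X25519")
# → ML-KEM-512: 768 B = 24x X25519
# → ML-KEM-768: 1088 B = 34x X25519
# → ML-KEM-1024: 1568 B = 49x X25519
```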
FIPS 203 Parameter Sets

Three ML-KEM Levels. One Cache.

Each parameter set trades size for security. All three are cacheable. All three are dramatically larger than what your infrastructure was built for.

ML-KEM-512
NIST Level 1
Public Key: 800 B
Ciphertext: 768 B
Shared Secret: 32 B
Encapsulation: ~50 µs
Cached Lookup: 31 ns
Lightest
ML-KEM-768
NIST Level 3
Public Key: 1,184 B
Ciphertext: 1,088 B
Shared Secret: 32 B
Encapsulation: ~80 µs
Cached Lookup: 31 ns
Chrome / Firefox Default
ML-KEM-1024
NIST Level 5
Public Key: 1,568 B
Ciphertext: 1,568 B
Shared Secret: 32 B
Encapsulation: ~120 µs
Cached Lookup: 31 ns
Maximum Security

The shared secret is always 32 bytes regardless of parameter set. The ciphertext is what explodes. And the ciphertext is what gets stored in your session cache.

TLS 1.3 Integration

Where ML-KEM Lives in Your Handshake

The TLS 1.3 handshake includes a key_share extension in both the ClientHello and ServerHello. With post-quantum key exchange, the key share contains the ML-KEM ciphertext instead of (or in addition to) an X25519 point. This is where the size explosion hits the wire.

1
C→S ClientHello: supported_versions, cipher_suites, key_share (ML-KEM-768 public key: 1,184 B)
2
S→C ServerHello: key_share (ML-KEM-768 ciphertext: 1,088 B) -- this is what gets cached
3
Derive shared secret (32 B) from ciphertext + private key
4
HKDF-Expand: handshake keys, application keys, finished verification
5
S→C NewSessionTicket: contains encrypted session state + ML-KEM ciphertext for resumption

Step 5 is where caching matters most. The session ticket is stored server-side for resumption. At 10 million concurrent sessions, the ML-KEM ciphertexts alone consume gigabytes. An in-process cache with 31ns lookups makes session resumption nearly free.

Memory Math

Session Cache At Scale

Classical session tickets are small. Post-quantum session tickets are not. Here is what happens to your session cache as you scale.

Sessions X25519 (32B) ML-KEM-512 (768B) ML-KEM-768 (1,088B) ML-KEM-1024 (1,568B)
100K 3.1 MB 73.2 MB 103.8 MB 149.5 MB
500K 15.3 MB 366.2 MB 518.8 MB 747.7 MB
1M 30.5 MB 732.4 MB 1.04 GB 1.49 GB
10M 305 MB 7.32 GB 10.37 GB 14.95 GB

Note: These figures count the key-share material alone. Session metadata (ticket ID, creation time, expiry, cipher suite) adds roughly 200 bytes per session on top, but the ML-KEM ciphertext remains the dominant component. Classical X25519 sessions fit in L3 cache at 1M sessions. ML-KEM-1024 sessions require 49x more memory at every scale.
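The table rows can be reproduced from the ciphertext sizes alone. A sketch, using binary megabytes (MiB) to match the table's "MB" figures:

```python
# Session-cache footprint of the key-share material alone (no metadata).
# "MB" here is MiB-based (bytes / 2**20), matching the table above.
SIZES = {"X25519": 32, "ML-KEM-512": 768, "ML-KEM-768": 1088, "ML-KEM-1024": 1568}

def footprint_mb(sessions: int, share_bytes: int) -> float:
    return round(sessions * share_bytes / 2**20, 1)

for sessions in (100_000, 500_000, 1_000_000, 10_000_000):
    row = {name: footprint_mb(sessions, b) for name, b in SIZES.items()}
    print(f"{sessions:>10,}: {row}")
```

At 100K sessions this yields 3.1 MB for X25519 and 149.5 MB for ML-KEM-1024, matching the first table row.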

Hybrid Key Exchange

X25519 + ML-KEM-768 = 1,280 Bytes

Chrome, Firefox, and Cloudflare already deploy hybrid key exchange in production. The X25519Kyber768Draft00 (now X25519MLKEM768) key share combines classical and post-quantum key exchange in a single TLS handshake. The combined key share is 1,280 bytes -- the X25519 point (32 bytes) plus the ML-KEM-768 ciphertext (1,088 bytes) plus encoding overhead (160 bytes).

Hybrid Key Share Structure

X25519
32 B
+
ML-KEM-768 ct
1,088 B
+
Encoding
160 B
=
1,280 B total
40x classical

Both shared secrets are derived independently and combined via HKDF. Security holds if either X25519 or ML-KEM-768 remains unbroken. The cache stores the full hybrid key share for resumption.
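The combine step can be sketched as concatenate-then-extract. This is an illustrative sketch using RFC 5869 HKDF-Extract with placeholder secrets, not the exact TLS 1.3 key-schedule wiring:

```python
import hashlib
import hmac
import os

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # RFC 5869 HKDF-Extract with SHA-256: PRK = HMAC(salt, ikm)
    return hmac.new(salt, ikm, hashlib.sha256).digest()

# Placeholder secrets standing in for the real X25519 and ML-KEM-768 outputs
x25519_ss = os.urandom(32)
mlkem_ss = os.urandom(32)

# Hybrid: concatenate both shared secrets, extract one 32-byte secret.
# The result stays unpredictable if either input secret does.
combined = hkdf_extract(salt=b"\x00" * 32, ikm=x25519_ss + mlkem_ss)
print(len(combined))  # → 32
```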

This is not theoretical. Google reported that 10-20% of Chrome TLS connections already use hybrid key exchange as of late 2025. By the time CNSA 2.0 deadlines arrive, 100% of connections will require it. Your session cache needs to handle 1,280-byte values at the same latency it handles 32-byte values today.

Architecture

Two-Tier Session Cache

In-process L1 for session ticket lookups at 31ns. Optional L2 for cross-instance session sharing.

L1: In-Process Cache (31ns)
TLS session resumption request arrives
Ticket ID lookup in DashMap
Return cached ML-KEM ciphertext + shared secret (31ns)
Skip encapsulation. Resume session.
Same process, zero network, zero serialization.
L2: Cross-Instance (Cachee RESP)
L1 cache miss (new instance, cold start)
Query Cachee cluster via RESP protocol
Return cached session data (~0.4ms network)
Populate L1. Subsequent lookups at 31ns.
Cross-instance consistency without fresh encapsulation.

L1 handles 99%+ of session resumptions. L2 handles cold starts, failovers, and cross-AZ consistency. Both are faster than running ML-KEM encapsulation (50-120 microseconds) on every connection.
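The two-tier flow above can be sketched in a few lines. This is a minimal illustration (a plain dict standing in for the L1 map, a pluggable callback standing in for the L2 RESP client); names and shapes are assumptions, not the Cachee API:

```python
import time
from typing import Callable, Optional

class TwoTierSessionCache:
    """Sketch of an L1/L2 session-ticket cache: in-process dict first,
    network-backed L2 fetch only on a miss (cold start, failover)."""

    def __init__(self, l2_fetch: Callable[[str], Optional[bytes]], ttl_s: float = 7200.0):
        self._l1: dict = {}          # ticket_id -> (value, expiry)
        self._l2_fetch = l2_fetch    # network hop, only on L1 miss
        self._ttl = ttl_s

    def get(self, ticket_id: str) -> Optional[bytes]:
        hit = self._l1.get(ticket_id)
        if hit is not None:
            value, expires = hit
            if time.monotonic() < expires:
                return value          # L1 hit: no network, no serialization
            del self._l1[ticket_id]   # expired entry
        value = self._l2_fetch(ticket_id)
        if value is not None:
            # Populate L1 so subsequent lookups skip the network
            self._l1[ticket_id] = (value, time.monotonic() + self._ttl)
        return value

# Usage: an L2 stub returning a fake cached session blob
cache = TwoTierSessionCache(
    l2_fetch=lambda tid: b"ct+ss+meta" if tid == "ticket_a9f3" else None)
assert cache.get("ticket_a9f3") == b"ct+ss+meta"  # filled from L2
assert cache.get("ticket_a9f3") == b"ct+ss+meta"  # now served from L1
```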

Latency

ML-KEM Encapsulation vs Cached Lookup

Fresh encapsulation vs cached session resumption. The math is not close.

ML-KEM-768 Encapsulation (polynomial multiply + NTT + compress + encode) 80,000 ns
Re-encapsulate every connection
Cached Session Ticket (hash lookup + pointer dereference) 31 ns
2,580x
No polynomial arithmetic. No NTT. No noise sampling. Just the session ticket, from cache.
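The speedup follows from the encapsulation times quoted in the parameter-set cards above:

```python
# Fresh encapsulation time (ns) per parameter set vs. one cached lookup
ENCAPSULATE_NS = {"ML-KEM-512": 50_000, "ML-KEM-768": 80_000, "ML-KEM-1024": 120_000}
CACHED_LOOKUP_NS = 31

for name, ns in ENCAPSULATE_NS.items():
    print(f"{name}: {ns // CACHED_LOOKUP_NS:,}x faster from cache")
# → ML-KEM-768: 2,580x faster from cache
```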
CNSA 2.0 Compliance

The Deadline is Not Negotiable

NSA's CNSA 2.0 sets hard deadlines for post-quantum migration. Key exchange is the first category to be affected because harvest-now-decrypt-later attacks make every classical handshake a future liability. Your session cache is the component that needs to be ready first.

2025 -- Now
Chrome, Firefox, Cloudflare deploying hybrid X25519+ML-KEM-768. 10-20% of TLS connections already post-quantum. Your session cache is already handling 1,088-byte key shares.
2027 -- CNSA 2.0 Preference
NSS systems should prefer ML-KEM for all key establishment. Hybrid mode becomes the minimum for government contractors and regulated industries.
2029 -- CNSA 2.0 Exclusive
Classical-only key exchange deprecated for NSS. All key establishment must include a CNSA 2.0 approved algorithm (ML-KEM). Session caches must handle PQ-sized values natively.
2030 -- CNSA 2.0 Mandatory
All key establishment in NSS must use exclusively CNSA 2.0 algorithms. No classical fallback. ML-KEM session caching is not optional -- it is infrastructure.

Harvest-now-decrypt-later means every classical TLS session is a future plaintext. The session cache is where the transition starts: swap the key share, cache the result, resume at 31ns. The wire format changes. The latency does not.

cachee-mlkem-session-demo
[1] TLS ClientHello: X25519MLKEM768, key_share=1,184B public key
[2] ML-KEM-768 encapsulate: ciphertext=1,088B, shared_secret=32B
[3] Cache session ticket: SET tls:ticket_a9f3 {ct+ss+meta} TTL 7200
[4] Encapsulation time: 82us
 
[5] Client reconnects: session resumption
[6] Cache hit: GET tls:ticket_a9f3 31ns
[7] Skip encapsulation. Derive keys from cached shared secret.
 
    2,580x faster. Zero re-encapsulation.

Run it yourself: brew install cachee && cachee-mlkem-demo

The Redis Problem

Why Redis Cannot Keep Up

Redis is designed for general-purpose key-value storage. It is not designed for sub-microsecond session ticket lookups with 1,088-byte values. The bottleneck is not Redis itself -- it is the network round-trip, serialization, and deserialization that wrap every operation.

Operation                   Redis (network)   Cachee L1 (in-process)   Speedup
GET 32B (X25519)            ~350 µs           28 ns                    12,500x
GET 1,088B (ML-KEM-768)     ~400 µs           31 ns                    12,900x
GET 1,568B (ML-KEM-1024)    ~420 µs           33 ns                    12,727x
SET 1,088B + TTL            ~450 µs           45 ns                    10,000x

For TLS session resumption, where the entire point is to skip the handshake and resume instantly, adding 400 microseconds of cache latency defeats the purpose. In-process caching at 31ns keeps session resumption below the 1-millisecond threshold that users perceive as instantaneous.
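The speedup column in the table derives directly from the latency figures (the Redis numbers are the document's benchmarks, including network round-trip and serialization; they are not re-measured here):

```python
# (Redis latency in ns, in-process L1 latency in ns) per operation,
# taken from the comparison table above
OPS = {
    "GET 32B (X25519)":         (350_000, 28),
    "GET 1,088B (ML-KEM-768)":  (400_000, 31),
    "GET 1,568B (ML-KEM-1024)": (420_000, 33),
    "SET 1,088B + TTL":         (450_000, 45),
}

for op, (redis_ns, l1_ns) in OPS.items():
    print(f"{op}: {redis_ns / l1_ns:,.0f}x")
```

Note that even the slowest in-process operation (a 45 ns SET) leaves four orders of magnitude of headroom under the 1-millisecond budget.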

Install

Get Started

brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start

# Cache an ML-KEM session ticket
SET tls:session_abc {ciphertext+shared_secret+meta} TTL 7200

# Retrieve at 31ns for session resumption
GET tls:session_abc

# Bulk preload session tickets from L2
MGET tls:session_* --warm-l1

140+ Redis-compatible commands. Drop-in replacement for your existing session cache infrastructure. The TLS library integration does not change -- only the backing store.

Your session cache is the first thing that needs to go post-quantum.

ML-KEM ciphertexts are here. Chrome is sending them now. Cache them at 31ns.
