
Post-Quantum TLS Session Cache: The Size Problem

April 18, 2026 | 11 min read | Engineering

Chrome 131 shipped ML-KEM (formerly Kyber) for TLS 1.3 key exchange in November 2024, and Firefox 132 shipped its equivalent within weeks. As of April 2026, every major browser negotiates post-quantum key exchange by default on connections to servers that support it. Cloudflare, AWS, and Google Cloud all accept ML-KEM-768 in their TLS stacks. The post-quantum transition is not a future event. It is production traffic today.

The security implications of this migration have been exhaustively discussed. What has not been discussed is the infrastructure implication that nobody planned for: your TLS session cache just grew 28-50x, and the systems you use to manage session resumption were designed for a world where key material was 32 bytes.

This article walks through what a TLS 1.3 session ticket actually contains, how post-quantum key exchange inflates it, what happens to your session ticket infrastructure at scale, and why in-process caching is the only architecture that avoids adding latency to TLS negotiation.

What Is Inside a TLS 1.3 Session Ticket

TLS 1.3 session resumption works through session tickets. After a successful handshake, the server sends the client one or more NewSessionTicket messages. The client stores these tickets and presents them in subsequent ClientHello messages to skip the full handshake. The server decrypts the ticket, recovers the session state, and resumes the connection in a single round trip (1-RTT), or, with early data, in zero (0-RTT).

A classical TLS 1.3 session ticket typically contains the resumption secret (32 bytes), the negotiated cipher suite, a creation timestamp and lifetime, the ticket age offset, and a nonce, all encrypted under the server's session ticket key.

Total: approximately 162 bytes per session ticket when using X25519 key exchange. Some implementations add extensions or padding, bringing it to 200-300 bytes in practice. Nginx stores session tickets in shared memory at roughly 256 bytes each. HAProxy's session cache uses a similar footprint.

The PQ Inflation

When the TLS handshake uses ML-KEM-768 for key exchange (the hybrid X25519+ML-KEM-768 that Chrome and Cloudflare deploy), the session state grows substantially. ML-KEM-768 does not affect the resumption master secret itself -- that remains 32 bytes. But the session ticket must carry enough state for the server to validate the resumption context, and implementations increasingly embed the following PQ-related material:

Component                                  Classical (X25519)   PQ (ML-KEM-768)   Multiplier
Key exchange public key                    32 B                 1,184 B           37x
Key exchange ciphertext                    32 B                 1,088 B           34x
Shared secret                              32 B                 32 B              1x
Hybrid overhead (X25519 shares retained)   0 B                  64 B              --
Certificate (ML-DSA-65 leaf)               ~800 B (RSA-2048)    ~4,627 B          5.8x
Session ticket total                       ~256 B               ~7,250 B          28x

In configurations that also deploy post-quantum certificates (ML-DSA-65 for authentication, as CNSA 2.0 will require), the certificate chain embedded in or referenced by the session state adds another 4-8 KB. Servers that cache the full handshake transcript hash for binding verification add more. The practical range for a PQ-enabled TLS 1.3 session ticket is 4 KB to 12 KB, depending on implementation choices and certificate chain depth.
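The ~7,250 B ticket total above can be reproduced from the component sizes. A quick Python sketch -- the ML-KEM-768 parameter sizes are fixed by FIPS 203, but the composition (PQ key-exchange state plus certificate plus the classical baseline) is an assumption about what an implementation embeds, not a wire format:

```python
# ML-KEM-768 sizes are fixed by FIPS 203; the composition below is
# an illustrative assumption about what a PQ session ticket carries.
MLKEM768_PUBLIC_KEY = 1184   # encapsulation key, bytes
MLKEM768_CIPHERTEXT = 1088   # bytes
SHARED_SECRET = 32           # resumption secret, unchanged by PQ
X25519_HYBRID = 64           # retained X25519 shares in hybrid mode

pq_state = (MLKEM768_PUBLIC_KEY + MLKEM768_CIPHERTEXT
            + SHARED_SECRET + X25519_HYBRID)
classical_ticket = 256       # observed classical footprint
mldsa65_cert = 4627          # ML-DSA-65 leaf estimate from the table

pq_ticket = pq_state + mldsa65_cert + classical_ticket
print(pq_state)    # 2368 B of PQ key-exchange state
print(pq_ticket)   # 7251 B, the table's ~7,250 B total
print(round(pq_ticket / classical_ticket))  # 28x inflation
```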

At a glance: ML-KEM-768 ciphertext: 1,088 B · session ticket inflation: 28-50x · PQ session ticket size: 7-12 KB

The Math: 500K Sessions, Classical vs. Post-Quantum

Consider a moderately busy HTTPS termination endpoint: 500,000 active session tickets. This is not a hyperscale number. A single Nginx instance handling a busy API, an e-commerce site during a sale, or a CDN edge node for a regional market easily maintains 500K concurrent sessions.

Metric          Classical (X25519)   PQ (ML-KEM-768)   PQ + ML-DSA Cert
Ticket size     256 B                4 KB              12 KB
500K sessions   128 MB               2 GB              6 GB
1M sessions     256 MB               4 GB              12 GB
5M sessions     1.28 GB              20 GB             60 GB

At the classical ticket size, 500K sessions fit comfortably in a small Redis instance or Nginx's shared memory zone. At 128 MB, this is a trivial allocation. Many teams configure ssl_session_cache shared:SSL:128m in Nginx without a second thought.

At the PQ ticket size, 500K sessions consume 2 GB. That is still within reach for a dedicated Redis instance, but it is no longer "set and forget." At 1 million sessions, you need 4 GB of dedicated session cache memory. At 5 million sessions with full PQ certificates, you need 60 GB. These are numbers that change your infrastructure architecture.
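The table's figures follow from straight multiplication, using decimal units as the table does. A small helper makes it easy to plug in your own session counts; the ticket sizes are the table's assumed values, not measurements:

```python
def cache_footprint_gb(sessions: int, ticket_bytes: int) -> float:
    """Session cache memory in GB (decimal units, as in the table)."""
    return sessions * ticket_bytes / 1e9

# Ticket sizes from the table: classical, PQ, PQ + ML-DSA certificate.
for label, size in [("classical", 256), ("pq", 4_000), ("pq+mldsa", 12_000)]:
    for n in (500_000, 1_000_000, 5_000_000):
        print(f"{label:9s} {n:>9,} sessions -> "
              f"{cache_footprint_gb(n, size):g} GB")
```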

Nginx ssl_session_cache Limit

Nginx's ssl_session_cache shared:SSL:NNm directive allocates a fixed-size shared memory zone. The default recommendation of 10 MB holds approximately 40,000 classical sessions. With PQ session tickets at 4 KB each, the same 10 MB zone holds approximately 2,500 sessions. Most Nginx deployments will silently evict PQ sessions at a rate that makes session resumption effectively useless, falling back to full handshakes on every connection. Full PQ handshakes are 2-5x slower than classical, so the cache eviction directly translates to user-facing latency.
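The capacity collapse is simple division against the fixed zone size. A sketch, assuming ~256 B per classical entry (consistent with Nginx's documented rule of thumb of roughly 4,000 sessions per megabyte) and ignoring allocator overhead:

```python
def zone_capacity(zone_mb: int, ticket_bytes: int) -> int:
    # ssl_session_cache shared:SSL:<zone_mb>m is a fixed allocation.
    return zone_mb * 1024 * 1024 // ticket_bytes

print(zone_capacity(10, 256))    # 40960: ~40K classical sessions
print(zone_capacity(10, 4096))   # 2560: ~2.5K PQ sessions
# Zone needed to keep 500K PQ sessions resident:
print(500_000 * 4096 // (1024 * 1024), "MB")  # 1953 MB
```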

Where Session Tickets Are Cached Today

In a typical production HTTPS deployment, session tickets are cached in one or more of these locations:

1. Nginx Shared Memory

Nginx uses a shared memory zone (ssl_session_cache shared:SSL:Nm) backed by a slab allocator. Session entries are stored in a red-black tree indexed by session ID. This is in-process (shared across worker processes via mmap). Lookups are fast -- a tree traversal plus a mutex acquisition. The problem is size. The shared memory zone is a fixed allocation at startup. You cannot resize it without a reload, and large zones (multi-gigabyte) consume memory from the OS page cache that your workers also need.

For PQ sessions: you need to increase this zone by 16-50x. A site that ran comfortably with shared:SSL:64m now needs shared:SSL:2048m or more. This is viable on machines with sufficient RAM, but it is a configuration change that most teams will not anticipate until session resumption rates collapse.

2. HAProxy Session Cache

HAProxy maintains a session cache per listener, configurable via tune.ssl.cachesize (default: 20,000 entries). The cache stores session data in a hash table with LRU eviction. HAProxy's implementation assumes relatively small session entries. At 4-12 KB per entry, 20,000 sessions consume 80-240 MB instead of the classical 5 MB. At 100,000 entries, you are looking at 400 MB to 1.2 GB.

3. Redis or Memcached (External Session Store)

Some architectures externalize session ticket storage to Redis for sharing across multiple TLS terminators. This enables session resumption even when a client reconnects to a different server behind a load balancer. The approach works at classical sizes -- a Redis GET of 256 bytes takes 0.3ms, which is negligible compared to the TLS handshake itself.

At PQ sizes, every session ticket lookup is a Redis GET of 4-12 KB. The network transfer time alone is 0.2-0.5ms. Serialization and deserialization add another 0.1-0.3ms. A single session resumption lookup now takes 0.5-0.8ms from Redis. The entire point of session resumption is to avoid the 2-5ms cost of a full PQ handshake. If the session lookup itself takes 0.8ms, you have consumed 16-40% of the savings on the cache lookup alone.

Worse, at scale: 50,000 session resumptions per second at 8 KB per ticket means Redis is handling 400 MB/s of session ticket traffic. This is not a cache workload. This is a streaming workload. Your Redis instance, originally provisioned for small session tokens, is now a bandwidth-limited bottleneck.
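The budget math above, written out. The latency figures are the article's estimates, not benchmarks:

```python
# What fraction of the resumption savings does the Redis lookup eat?
lookup_ms = 0.8           # same-AZ Redis GET of a 4-12 KB ticket
full_handshake_ms = 2.0   # lower bound for a full PQ handshake
savings_consumed = lookup_ms / full_handshake_ms
print(f"{savings_consumed:.0%}")  # 40% at the lower bound

# Bandwidth at scale: 50K resumptions/sec of 8 KB (decimal) tickets.
bandwidth_mb_s = 50_000 * 8_000 / 1e6
print(bandwidth_mb_s, "MB/s")     # 400.0 MB/s of ticket traffic
```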

4. TLS Libraries (OpenSSL, BoringSSL)

OpenSSL maintains an internal session cache configurable via SSL_CTX_sess_set_cache_size(). The default is 20,480 sessions (SSL_SESSION_CACHE_MAX_SIZE_DEFAULT). BoringSSL (used by Chrome and Cloudflare) has a similar but more opinionated cache with automatic expiration, and Go's crypto/tls (its own implementation, not BoringSSL) ships an in-process LRU session cache. These caches are in-process and per-context. They work well for single-process servers but do not share state across instances. For multi-instance deployments, teams must implement custom SSL_CTX_sess_set_new_cb and SSL_CTX_sess_set_get_cb callbacks that store/retrieve sessions from an external store -- which brings us back to the Redis problem.

Why In-Process Caching Is the Only Option That Does Not Add to TLS Latency

The TLS handshake is the most latency-sensitive operation in your networking stack. Every millisecond of handshake latency is felt directly by the user as time-to-first-byte. Session resumption exists specifically to reduce this latency. If the mechanism you use to look up session tickets adds latency, you have defeated the purpose.

An in-process session cache has zero network overhead. The TLS library calls a callback function. The callback performs a hash lookup in shared memory. The lookup takes 31-100 nanoseconds. The session ticket data is returned as a pointer -- no serialization, no copy, no TCP round trip. The lookup cost is indistinguishable from noise in the context of a TLS handshake.

Compare the architectures:

Session Cache Location   Lookup Latency (4 KB ticket)   Bandwidth at 50K resumptions/sec   Adds to Handshake?
In-process (Cachee L0)   31 ns                          0 (no network)                     No
Nginx shared memory      ~500 ns                        0 (shared mmap)                    Negligible
Redis (same AZ)          500-800 µs                     400 MB/s                           Yes
Redis (cross-AZ)         1.2-2 ms                       400 MB/s                           Yes
Memcached (same AZ)      400-700 µs                     400 MB/s                           Yes

Nginx's shared memory zone is acceptable latency-wise but has the fixed-size problem described above. An in-process cache like Cachee combines the latency of shared memory with dynamic sizing and intelligent eviction. CacheeLFU tracks access frequency per session and preferentially retains sessions for clients that reconnect frequently -- the exact sessions where resumption provides the most value.

The CNSA 2.0 Timeline: This Is Not Optional

The U.S. National Security Agency published CNSA 2.0 in September 2022, establishing a timeline for post-quantum migration in national security systems. The headline milestones: software and firmware signing moves first (exclusively quantum-resistant by 2030); web browsers, servers, and cloud services should support and prefer CNSA 2.0 algorithms by 2025; traditional networking equipment such as VPNs and routers must use them exclusively by 2030; and all national security systems complete the transition by 2033.

Any organization that processes government data, holds FedRAMP authorization, or contracts with DoD/IC agencies is on this timeline. CNSA 2.0 is not a recommendation; it is a requirement. By 2030, every TLS connection to or from a national security system must use ML-KEM for key exchange and ML-DSA or SLH-DSA for authentication.

For the private sector, the timeline is softer but the direction is identical. Google already runs ML-KEM-768 on all Chrome TLS connections by default. Cloudflare offers ML-KEM on its entire edge network. AWS Certificate Manager supports PQ certificates. The infrastructure providers have moved. The session cache layer has not.

Practical Mitigation: What to Do Today

1. Measure Your Current Session Cache

Before you can plan for PQ session tickets, you need to know your current session cache footprint. Check your Nginx configuration for ssl_session_cache size. Check your HAProxy tune.ssl.cachesize. If you use Redis for session sharing, measure the memory consumption of your session keys. Multiply by 28-50x. That is your PQ session cache requirement.
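That multiplication can be scripted against your current configuration. A sketch using the article's 28-50x inflation band and the shared:SSL:64m zone mentioned earlier as the example input:

```python
def pq_projection(current_bytes: int, low: int = 28, high: int = 50):
    """Project a classical session cache footprint into the PQ range."""
    return current_bytes * low, current_bytes * high

current = 64 * 1024 * 1024   # e.g. ssl_session_cache shared:SSL:64m
lo, hi = pq_projection(current)
print(lo / 2**30, "to", hi / 2**30, "GiB")  # 1.75 to 3.125 GiB
```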

2. Move Session Caching In-Process

If your TLS terminators are single-instance or you can tolerate session affinity (most CDN edge deployments can), move session caching entirely in-process. This eliminates network latency from the resumption path and removes the bandwidth bottleneck. Configure your TLS library's session cache callbacks to use an in-process cache with CacheeLFU eviction. Sessions that are never resumed get evicted quickly. Sessions for repeat visitors stay resident.

3. Size Your Memory Budget for PQ

If you must support 500K concurrent PQ sessions with ML-KEM-768 tickets (4 KB each), budget 2 GB of session cache memory per TLS terminator. If you also use ML-DSA-65 certificates (12 KB tickets), budget 6 GB. These numbers are per-instance. Plan your instance sizes accordingly. This is cheaper than the latency penalty of falling back to full PQ handshakes on every connection.

4. Implement Tiered Session Caching

Not all sessions deserve equal cache residency. A session for a user who connects once and never returns should not consume 12 KB of hot memory for its full TTL. Tiered caching with frequency-based admission keeps the highest-value sessions in L0 (in-process, 31ns) and lets infrequent sessions fall to L1 or evict entirely. The user who reconnects every 30 seconds gets instant resumption. The user who connected once last Tuesday gets a full handshake -- which is fine, because they are already tolerating the latency of navigating to your site for the first time in a week.
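Frequency-based admission is straightforward to sketch. This is an illustration of the policy, not Cachee's implementation: a bounded hot tier admits a new session only if it has been touched more often than the coldest resident entry.

```python
from collections import Counter

class TieredSessionCache:
    """Toy hot tier with frequency-based admission (illustrative only)."""

    def __init__(self, hot_capacity: int):
        self.hot_capacity = hot_capacity
        self.hot = {}            # session_id -> ticket bytes
        self.freq = Counter()    # touch count per session_id

    def get(self, session_id):
        self.freq[session_id] += 1
        return self.hot.get(session_id)

    def put(self, session_id, ticket: bytes):
        self.freq[session_id] += 1
        if len(self.hot) < self.hot_capacity:
            self.hot[session_id] = ticket
            return
        # Admit only if hotter than the coldest resident session.
        victim = min(self.hot, key=lambda k: self.freq[k])
        if self.freq[session_id] > self.freq[victim]:
            del self.hot[victim]
            self.hot[session_id] = ticket

cache = TieredSessionCache(hot_capacity=2)
cache.put("repeat-visitor", b"ticket-a")
cache.put("one-time", b"ticket-b")
for _ in range(5):
    cache.get("repeat-visitor")       # repeat visitor builds frequency
cache.put("drive-by", b"ticket-c")    # cold newcomer: rejected
print("drive-by" in cache.hot)        # False
for _ in range(3):
    cache.get("drive-by")             # now it reconnects repeatedly
cache.put("drive-by", b"ticket-c")    # hotter than "one-time": admitted
print("one-time" in cache.hot)        # False, evicted
```

In a real deployment this lookup sits behind the TLS library's session callbacks, so the hot path never leaves the process.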

5. Test Session Resumption Rates Under PQ

Enable PQ key exchange on a canary deployment and monitor your session resumption rate. If it drops significantly (from, say, 85% to 40%), your session cache is too small for PQ tickets and is evicting sessions before clients reconnect. This is the most common failure mode: silent degradation of resumption rates that manifests as higher average TLS handshake latency across your fleet.

# Check session resumption with openssl (3.5+ supports ML-KEM groups)
openssl s_client -connect your-server:443 \
  -sess_out /tmp/session.pem -groups X25519MLKEM768 </dev/null

# Resume with the saved session
openssl s_client -connect your-server:443 \
  -sess_in /tmp/session.pem -groups X25519MLKEM768 </dev/null

# Look for "Reused, TLSv1.3" in the output
# If it says "New, TLSv1.3", your session was evicted

In-Process PQ Session Cache with Cachee

Cachee handles PQ session tickets natively. The in-process L0 tier stores session tickets at 31ns access latency regardless of ticket size. CacheeLFU admission ensures high-reconnect clients retain their sessions while one-time visitors are evicted first. A 2 GB Cachee allocation holds 500K PQ session tickets with zero network overhead and zero serialization cost. Your TLS handshake does not wait for Redis. It reads from local memory.

The Bigger Picture

The post-quantum TLS transition is often framed as a cryptographic problem: choose new algorithms, update your libraries, rotate your certificates. That framing is incomplete. PQ is also a systems engineering problem. The algorithms are larger. The key material is larger. The signatures are larger. Every system that stores, caches, transmits, or processes cryptographic material needs to account for the size increase.

Session ticket caching is the first place this hits because it sits directly in the TLS fast path. But the same inflation affects certificate caching in OCSP stapling responders, CRL distribution point caches, HPKE key caches for encrypted client hello (ECH), and ACME certificate storage. Each of these systems was sized for classical cryptography. Each needs re-evaluation.

The organizations that handle this transition smoothly will be the ones that treat PQ as an infrastructure migration, not just a cryptographic library swap. Your algorithms are ready. Your cache layer is not. Fix the cache layer before the session resumption rate tells you there is a problem.

PQ session tickets at 31ns. No network overhead. No serialization cost.
