
NIST PQ Migration: What Breaks in Your Cache Layer

April 20, 2026 | 9 min read | Engineering

Three NIST standards are final. FIPS 203 (ML-KEM), FIPS 204 (ML-DSA), and FIPS 205 (SLH-DSA) define the post-quantum algorithms that will replace RSA, ECDH, and ECDSA across every system that handles cryptographic material. A fourth, FN-DSA (FALCON), is expected to follow as FIPS 206. These are not draft proposals. They are published, numbered, mandatory-use standards for US federal systems.

CNSA 2.0 sets the deadline. By 2030, all national security systems must use post-quantum algorithms for key exchange. By 2033, all software and firmware signing must be post-quantum. By 2035, all systems must be fully transitioned.

The migration is coming. The question for infrastructure teams is not whether to prepare, but what breaks first. The answer, almost always, is the cache layer.

The Three Standards and What They Replace

| Standard | Algorithm | Replaces | Function |
|---|---|---|---|
| FIPS 203 | ML-KEM (Kyber) | ECDH, RSA key exchange | Key encapsulation (TLS, VPN, messaging) |
| FIPS 204 | ML-DSA (Dilithium) | ECDSA, RSA signatures | Digital signatures (JWT, certificates, code signing) |
| FIPS 205 | SLH-DSA (SPHINCS+) | ECDSA, RSA signatures | Stateless hash signatures (conservative, large) |
| (Pending) | FN-DSA (FALCON) | ECDSA, RSA signatures | Compact lattice signatures (constrained environments) |

Each standard defines multiple parameter sets at different security levels. The parameter sets determine key and signature sizes. And it is the sizes that break your infrastructure.

The Size Explosion

Classical cryptographic keys are small enough that no infrastructure team thinks about them. An ECDH public key is 32 bytes. An Ed25519 signature is 64 bytes. An RSA-2048 public key is 256 bytes. These sizes are negligible in any cache, database, or network protocol. They fit in a single TCP packet. They serialize in nanoseconds. They are invisible in your latency budget.

Post-quantum keys are not invisible.

FIPS 203: ML-KEM Key Sizes

| Parameter Set | Security Level | Public Key | Ciphertext | vs ECDH (32 B) |
|---|---|---|---|---|
| ML-KEM-512 | NIST Level 1 | 800 B | 768 B | 25x |
| ML-KEM-768 | NIST Level 3 | 1,184 B | 1,088 B | 37x |
| ML-KEM-1024 | NIST Level 5 | 1,568 B | 1,568 B | 49x |

Every TLS 1.3 handshake that uses ML-KEM generates a ciphertext that the server caches for session resumption. Chrome 131+ and Firefox 132+ already ship ML-KEM-768 in their TLS stacks. This is not a future event. It is happening in production today. Every session ticket your TLS terminator caches now carries 1,088 bytes of ML-KEM ciphertext where it previously carried 32 bytes of X25519 key share.
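The ticket-cache arithmetic is easy to sanity-check. A back-of-envelope sketch, assuming roughly 256 B per classical ticket and a hybrid ticket that adds the 1,088-byte ML-KEM-768 ciphertext (the 500K-session count is illustrative):

```python
MLKEM768_CIPHERTEXT = 1_088  # FIPS 203, ML-KEM-768 ciphertext size in bytes
CLASSICAL_TICKET = 256       # approximate X25519-era session ticket

# Hybrid ticket: the classical ticket plus the ML-KEM ciphertext riding along.
PQ_TICKET = CLASSICAL_TICKET + MLKEM768_CIPHERTEXT  # 1,344 B

def cache_bytes(sessions: int, ticket_size: int) -> int:
    """Total ticket-cache footprint for a given number of concurrent sessions."""
    return sessions * ticket_size

print(cache_bytes(500_000, CLASSICAL_TICKET))  # 128,000,000 B (~128 MB)
print(cache_bytes(500_000, PQ_TICKET))         # 672,000,000 B (~672 MB)
```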

FIPS 204: ML-DSA Signature Sizes

| Parameter Set | Security Level | Public Key | Signature | vs Ed25519 (64 B sig) |
|---|---|---|---|---|
| ML-DSA-44 | NIST Level 2 | 1,312 B | 2,420 B | 38x |
| ML-DSA-65 | NIST Level 3 | 1,952 B | 3,309 B | 52x |
| ML-DSA-87 | NIST Level 5 | 2,592 B | 4,627 B | 72x |

ML-DSA replaces the signature algorithm in JWTs, X.509 certificates, code signing, and API authentication tokens. Every system that caches a JWT for token validation now caches an additional 3,309 bytes of signature per token (at ML-DSA-65). Every API gateway that stores issuer public keys for JWT verification now stores 1,952 bytes per issuer instead of 32 bytes.
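For JWTs specifically there is a further wrinkle: the signature travels base64url-encoded, so the token's third segment grows by another ~4/3 on top of the raw signature size. A quick check using the raw 3,309-byte ML-DSA-65 figure from the table above:

```python
import base64

MLDSA65_SIG = 3_309  # raw ML-DSA-65 signature size in bytes (FIPS 204)

# JWT signatures are base64url-encoded; 3,309 bytes is divisible by 3,
# so the encoding is exactly 4/3 the raw size with no padding.
encoded = base64.urlsafe_b64encode(b"\x00" * MLDSA65_SIG).rstrip(b"=")
print(len(encoded))  # 4,412 characters of signature per token
```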

FIPS 205: SLH-DSA Signature Sizes

| Parameter Set | Security Level | Public Key | Signature | vs Ed25519 (64 B sig) |
|---|---|---|---|---|
| SLH-DSA-SHA2-128f | NIST Level 1 | 32 B | 17,088 B | 267x |
| SLH-DSA-SHA2-192f | NIST Level 3 | 48 B | 35,664 B | 557x |
| SLH-DSA-SHA2-256f | NIST Level 5 | 64 B | 49,856 B | 779x |

SLH-DSA is the conservative choice. Its security relies only on the properties of hash functions, not on lattice assumptions. This makes it the fallback if lattice-based schemes are ever broken. The cost is size: a single SLH-DSA-256f signature is 49,856 bytes. Nearly 50 KB for one signature. Any system that caches these signatures will feel it immediately.

The Compounding Effect

Most production systems do not use a single algorithm in isolation. A TLS handshake uses ML-KEM for key exchange AND ML-DSA for certificate signatures. A session token carries an ML-DSA signature AND an ML-KEM-derived session key. An API credential includes an ML-DSA-signed JWT AND an ML-KEM-encapsulated API secret. The sizes compound. A single session record that was 96 bytes of classical crypto material becomes 4,000-8,000 bytes of post-quantum material. That is a 40-80x increase per cached entry.
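One way to sanity-check the 40-80x claim, using the FIPS 203/204 sizes from the tables above. Which components actually land in a given session record depends on your protocol, so treat this as a sketch, not an accounting:

```python
# Classical session record: X25519 key share + Ed25519 signature.
classical = 32 + 64                  # 96 B

# PQ equivalents at the common ML-KEM-768 + ML-DSA-65 choice.
MLKEM768_CT, MLDSA65_SIG, MLDSA65_PK = 1_088, 3_309, 1_952

pq_low = MLKEM768_CT + MLDSA65_SIG   # 4,397 B: ciphertext + signature
pq_high = pq_low + MLDSA65_PK        # 6,349 B: if the public key is stored too

print(pq_low // classical, pq_high // classical)  # roughly 45x to 66x
```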

The CNSA 2.0 Timeline

2024
FIPS 203, 204, 205 published. Standards are final. Implementations begin shipping in OpenSSL, BoringSSL, AWS-LC.
2025
Chrome, Firefox, Edge ship ML-KEM in TLS 1.3 by default. FN-DSA (FALCON) expected to be standardized. PQ certificates begin appearing in production.
2027
CNSA 2.0: All new systems must prefer PQ algorithms. Hybrid classical+PQ required during transition.
2030
CNSA 2.0: All key exchange must be post-quantum. ML-KEM mandatory for TLS, VPN, messaging. Classical ECDH deprecated.
2033
CNSA 2.0: All software and firmware signing must use PQ algorithms. ML-DSA or SLH-DSA mandatory for code signing, certificates.
2035
CNSA 2.0: Full transition complete. All systems must be exclusively post-quantum. Classical algorithms prohibited for national security use.

This timeline applies directly to US federal systems and contractors. But the commercial impact is broader: any organization that handles federal data, operates in regulated industries (finance, healthcare, defense), or sells to government customers will need to comply. The PQ transition is not an academic exercise. It is a procurement requirement with a deadline.

What Breaks in Your Cache

1. Session Stores

A session store holding 1 million active sessions with classical key material uses approximately 96 MB of cache memory (96 bytes of crypto per session). After the ML-KEM-768 + ML-DSA-65 transition, the same 1 million sessions require 4.49 GB. The cache that was running comfortably on a single Redis instance now needs a cluster, or a fundamentally different architecture.

Worse: session stores are the hottest cache in most applications. Every authenticated request hits the session cache. The latency of every session lookup increases linearly with value size in network-bound caches like Redis. A session validation that took 0.3ms at 96 bytes takes 0.5-0.8ms at 4,493 bytes. Multiply by the number of requests per second and the cumulative latency becomes the dominant cost in your request pipeline.
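The memory math above, spelled out. The 4,493-byte figure decomposes here as ML-KEM-768 ciphertext + ML-DSA-65 signature + the original 96 B of classical material kept for hybrid mode; that decomposition is one plausible reading, not a specification:

```python
SESSIONS = 1_000_000
CLASSICAL_PER_SESSION = 96           # bytes of crypto material per session
PQ_PER_SESSION = 1_088 + 3_309 + 96  # 4,493 B per session after the transition

print(SESSIONS * CLASSICAL_PER_SESSION / 1e6)  # 96.0 (MB)
print(SESSIONS * PQ_PER_SESSION / 1e9)         # 4.493 (GB)
```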

2. JWT Verification Caches

API gateways cache JWT issuer public keys to avoid fetching them on every request. A gateway serving 20 API issuers with Ed25519 keys caches 640 bytes total. With ML-DSA-65 keys, the same 20 issuers require 39 KB. That is still negligible for key storage. The problem is the JWTs themselves: if you cache validated tokens for deduplication or rate limiting, each token now carries a 3,309-byte ML-DSA signature. At 100K cached tokens, the signature material alone is 331 MB.

3. TLS Session Ticket Caches

Nginx, HAProxy, and cloud load balancers cache TLS session tickets for 0-RTT resumption. A session ticket with X25519 key material is approximately 256 bytes. With ML-KEM-768, the ticket grows to 1,344+ bytes. A TLS terminator handling 500K concurrent sessions goes from 128 MB of ticket cache to 672 MB. At ML-KEM-1024 with ML-DSA certificate signatures, the same cache exceeds 2 GB.

4. Certificate Chain Caches

OCSP stapling and certificate chain caches currently hold chains of 3-5 certificates at roughly 1-2 KB per chain. With ML-DSA-65 signatures on each certificate (3,309 bytes per signature, 3 signatures per chain), a single certificate chain grows to 12-15 KB. With SLH-DSA for the root certificate (a common conservative choice), a single chain can exceed 65 KB. CDN edge nodes caching certificate chains for thousands of domains will need to re-evaluate their memory budgets.

5. API Credential Stores

Service-to-service authentication tokens, OAuth access tokens, and API keys that carry cryptographic proofs all grow proportionally. A microservices architecture with 50 services, each caching credentials for the other 49, currently stores negligible crypto material. After the PQ transition, each credential pair carries ML-KEM encapsulated secrets (1,088 bytes) and ML-DSA signatures (3,309 bytes). The mesh of cached credentials becomes a meaningful memory consumer.

Why Network Caches Fail at PQ Sizes

Redis, Memcached, and ElastiCache add latency that scales linearly with value size. The three costs that scale are serialization (encoding the value for wire protocol), TCP transfer (sending bytes across the network), and deserialization (decoding on the client side). For a 96-byte session token, these costs are negligible. For a 4,493-byte PQ session token, they are 47 times larger.
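A toy linear model reproduces these figures reasonably well. The base round-trip cost and per-byte coefficient below are fitted by eye to the numbers in this post, not measured against a real Redis deployment:

```python
def redis_latency_ms(value_bytes: int, base_ms: float = 0.3,
                     ns_per_byte: float = 68.0) -> float:
    """Toy model: a fixed round-trip cost plus a per-byte cost covering
    serialization, TCP transfer, and deserialization."""
    return base_ms + value_bytes * ns_per_byte / 1e6  # ns -> ms

print(round(redis_latency_ms(96), 2))     # 0.31 ms: classical session token
print(round(redis_latency_ms(4_493), 2))  # 0.61 ms: PQ session token
```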

| Cached Value | Classical Size | PQ Size | Redis Latency (classical) | Redis Latency (PQ) |
|---|---|---|---|---|
| Session token | 96 B | 4,493 B | 0.3 ms | 0.6 ms |
| JWT (with sig) | 256 B | 3,565 B | 0.3 ms | 0.5 ms |
| TLS ticket | 256 B | 1,344 B | 0.3 ms | 0.4 ms |
| Certificate chain | 2 KB | 15-65 KB | 0.35 ms | 1.0-1.8 ms |
| SLH-DSA signature | 64 B | 17-49 KB | 0.3 ms | 0.9-1.4 ms |

The individual increases look small. But these lookups happen on every request. A typical request flow hits the session cache, the JWT cache, and the rate limiter. Three cache lookups per request at 0.3ms each was 0.9ms total. Three cache lookups at PQ sizes is 1.5-2.8ms total. That is a 67-211% increase in cumulative cache latency per request, caused entirely by key size growth.

In-Process Caching: Size-Independent Latency

An in-process cache stores values in the application's own address space. A GET is a hash lookup and a pointer dereference. There is no serialization. No TCP transfer. No deserialization. The latency is 31 nanoseconds regardless of value size. A 96-byte classical session token and a 4,493-byte PQ session token are accessed at exactly the same speed. The post-quantum transition does not change your cache latency if your cache runs in-process.

The Migration Playbook

  1. Inventory your cached crypto material. Identify every cache that stores keys, signatures, tokens, or certificates. Session stores, JWT caches, TLS ticket caches, OCSP caches, API credential stores, certificate chain caches. For each one, document the current value size and the PQ equivalent using the tables above.
  2. Calculate your multiplier. If you are adopting ML-KEM-768 + ML-DSA-65 (the most common choice), multiply your current crypto-material cache footprint by 47x. If you are using SLH-DSA for any component, multiply that component by 267-779x. These are not estimates. They are the algorithm specifications.
  3. Separate the payload from the proof. You do not need to cache a full 17 KB SLH-DSA signature to remember that a value was verified. Cache the verification result (a boolean + content hash) instead of the full signature. Cache the issuer public key (accessed frequently, changes rarely) separately from the per-token signatures (accessed once, large).
  4. Move hot-path crypto lookups to in-process cache. Session validation, JWT verification, and rate limiting happen on every request. These must be sub-millisecond. At PQ sizes, network caches cannot guarantee sub-millisecond for values over 1 KB. In-process caching at 31ns eliminates value size from the latency equation entirely.
  5. Keep cold-path material in existing infrastructure. Certificate revocation lists, audit logs, archival signatures, and historical session records can remain in Redis or persistent storage. Their access frequency does not justify in-process caching, and their size growth is manageable with standard capacity planning.
  6. Plan for hybrid mode. During the transition (2025-2035), many systems will carry both classical and PQ key material simultaneously. A hybrid TLS session ticket includes both an X25519 key share (32 bytes) and an ML-KEM-768 ciphertext (1,088 bytes). Your cache must accommodate both until the classical material is deprecated. Budget for 1.5-2x the PQ-only footprint during the hybrid period.
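Step 3 above can be sketched as follows. `verify_sig` is a hypothetical stand-in for a real SLH-DSA or ML-DSA verify call; the cache key hashes payload and signature together, so a different signature over the same payload is still re-verified:

```python
import hashlib

# Cache of verification results: a 64-char hex key + a bool per entry,
# instead of the multi-kilobyte signature itself.
_verified: dict[str, bool] = {}

def is_verified(payload: bytes, signature: bytes, verify_sig) -> bool:
    """Run verify_sig at most once per (payload, signature) pair."""
    key = hashlib.sha256(payload + signature).hexdigest()
    if key not in _verified:
        _verified[key] = verify_sig(payload, signature)
    return _verified[key]
```

On a cache hit, a 17 KB SLH-DSA signature costs one SHA-256 pass over its bytes and a dict lookup rather than a full hash-tree verification, and the cache stores 33 bytes per entry instead of 17 KB.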

The Window Is Now

Chrome and Firefox are shipping ML-KEM today. AWS, Cloudflare, and Google Cloud offer PQ-enabled endpoints. The libraries are production-ready. Every month that passes, your infrastructure accumulates more classical key material that will eventually need to be replaced with material that is 10-100x larger.

The cache layer is where the size increase hits first and hardest. It is also the easiest to fix. Moving hot-path crypto lookups from a network cache to an in-process cache is a configuration change, not an architecture rewrite. The latency improvement is immediate. The preparation for PQ key sizes is automatic. And when the CNSA 2.0 deadlines arrive, your cache layer is already ready.

Cachee handles PQ key sizes at 31ns. Value size does not affect latency.

Install Cachee · PQ Key Size Guide