
Caching Post-Quantum Keys: A Size Guide for ML-KEM, FALCON, SLH-DSA, and ML-DSA

April 17, 2026 | 10 min read | Engineering

NIST finalized three post-quantum cryptographic standards in 2024: ML-KEM (FIPS 203), ML-DSA (FIPS 204), and SLH-DSA (FIPS 205). A fourth, FN-DSA (FIPS 206, based on FALCON), is still in draft. Together, they replace the RSA, ECDH, and ECDSA primitives that underpin every TLS handshake, JWT, API token, and session credential in production today.

The migration is not optional. CNSA 2.0 mandates post-quantum algorithms for US national security systems by 2030. NIST SP 800-227 recommends transition planning now. The question is not whether your infrastructure will handle PQ key material, but when.

And when it does, your cache layer will feel it first.

The Size Problem

Classical cryptographic keys are small. An X25519 public key is 32 bytes. An Ed25519 signature is 64 bytes. An RSA-2048 public key is 256 bytes. These fit comfortably in cache lines, session tokens, and TLS records. Your infrastructure was built around these sizes.

Post-quantum keys are not small.

| Algorithm | Type | Public Key | Private Key | Signature / Ciphertext | vs Classical |
|---|---|---|---|---|---|
| ECDH (X25519) | KEM | 32 B | 32 B | 32 B (shared secret) | baseline |
| ML-KEM-512 | KEM | 800 B | 1,632 B | 768 B | 25x pub key |
| ML-KEM-768 | KEM | 1,184 B | 2,400 B | 1,088 B | 37x pub key |
| ML-KEM-1024 | KEM | 1,568 B | 3,168 B | 1,568 B | 49x pub key |
| Ed25519 | Signature | 32 B | 32 B | 64 B | baseline |
| ML-DSA-44 | Signature | 1,312 B | 2,560 B | 2,420 B | 41x pub key |
| ML-DSA-65 | Signature | 1,952 B | 4,032 B | 3,309 B | 61x pub key |
| ML-DSA-87 | Signature | 2,592 B | 4,896 B | 4,627 B | 81x pub key |
| FALCON-512 | Signature | 897 B | 1,281 B | 690 B | 28x pub key |
| FALCON-1024 | Signature | 1,793 B | 2,305 B | 1,330 B | 56x pub key |
| SLH-DSA-128f | Signature | 32 B | 64 B | 17,088 B | 267x signature |
| SLH-DSA-192f | Signature | 48 B | 96 B | 35,664 B | 557x signature |
| SLH-DSA-256f | Signature | 64 B | 128 B | 49,856 B | 779x signature |

The numbers are unambiguous. A single ML-DSA-87 public key is 2,592 bytes -- larger than an entire classical TLS ClientHello. A single SLH-DSA-256f signature is 49,856 bytes -- nearly 50 KB for one signature. These are not edge cases. These are the NIST-standardized defaults that every library, framework, and infrastructure vendor will ship.

What This Means for Your Cache

Every system that caches cryptographic material -- session tokens, TLS session tickets, API credentials, JWTs, certificate chains, OCSP responses -- will see its working set grow by an order of magnitude. The math is straightforward.

Consider a session store holding 1 million active sessions. Today, each session includes an ECDH ephemeral key (32 bytes) and an Ed25519 signature (64 bytes) -- 96 bytes of crypto material per session. Total: 96 MB.

After the PQ transition, each session carries an ML-KEM-768 encapsulation key (1,184 bytes) and an ML-DSA-65 signature (3,309 bytes) -- 4,493 bytes of crypto material per session. Total: 4.49 GB.

That is a 47x increase in cache memory consumed by key material alone. The session data itself -- user IDs, permissions, metadata -- does not change. But the cryptographic envelope around it explodes.
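The arithmetic above can be checked in a few lines. This is a back-of-the-envelope sketch using the FIPS 203/204 sizes from the table; it assumes each session caches exactly one key and one signature and nothing else:

```python
# Working-set math for 1M cached sessions, key material only.
CLASSICAL = 32 + 64    # X25519 public key + Ed25519 signature
PQ = 1_184 + 3_309     # ML-KEM-768 encapsulation key + ML-DSA-65 signature

sessions = 1_000_000
classical_total = sessions * CLASSICAL   # 96 MB
pq_total = sessions * PQ                 # ~4.49 GB

print(f"classical: {classical_total / 1e6:.0f} MB")
print(f"post-quantum: {pq_total / 1e9:.2f} GB")
print(f"multiplier: {PQ / CLASSICAL:.0f}x")
```

Running it reproduces the 96 MB, 4.49 GB, and 47x figures quoted above.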

The Network Cache Problem

Redis, Memcached, and ElastiCache are network-bound caches. Every GET and SET crosses the network: serialization, TCP round-trip, deserialization. When your values were 96 bytes, the serialization overhead was negligible. When your values are 4,493 bytes, serialization time scales linearly. A Redis GET that took 0.5ms at 96 bytes takes 1.2ms at 4,493 bytes on the same network. Multiply that across every TLS handshake, every session validation, every API auth check. The latency budget that was invisible at classical key sizes becomes the dominant cost at PQ key sizes.
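The two data points above (0.5 ms at 96 B, 1.2 ms at 4,493 B) imply a simple linear latency model: a fixed round-trip cost plus a per-byte cost. This is an illustrative fit, not a benchmark; real deployments will differ:

```python
# Linear latency model fitted to the two data points in the text.
t1, s1 = 0.5e-3, 96      # 0.5 ms for a 96-byte value
t2, s2 = 1.2e-3, 4_493   # 1.2 ms for a 4,493-byte value

per_byte = (t2 - t1) / (s2 - s1)   # size-dependent cost per byte
base = t1 - s1 * per_byte          # fixed round-trip cost

def est_latency(size_bytes: int) -> float:
    """Estimated network-cache GET latency in seconds under this model."""
    return base + size_bytes * per_byte

# Extrapolating to a 17,088-byte SLH-DSA-128f signature:
print(f"{est_latency(17_088) * 1e3:.1f} ms")   # ~3.2 ms under this model
```

Extrapolation past the fitted range is rough, but it shows the direction: the per-byte term dominates once values reach PQ sizes.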

Key-by-Key Cache Analysis

ML-KEM (FIPS 203) -- Key Encapsulation

ML-KEM-1024 at a glance: public key 1,568 B · private key 3,168 B · ciphertext 1,568 B

Where it lives in your cache: TLS session tickets, ephemeral key exchange results, pre-shared keys. Every TLS 1.3 handshake that uses ML-KEM generates a ciphertext that the server must cache for session resumption. At ML-KEM-1024, that is 1,568 bytes per session ticket -- compared to 32 bytes for X25519 today.

Cache impact: If your TLS terminator caches 500K session tickets for resumption, the key material alone grows from 16 MB (X25519) to 784 MB (ML-KEM-1024). This does not fit in a Redis instance that was sized for classical keys. It does fit in an in-process L1 cache with no serialization overhead.
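The session-ticket sizing above is straightforward to verify. A minimal sketch, assuming the cache stores only the key-exchange material per ticket:

```python
# Session-ticket cache sizing: 500K cached tickets, key material only.
TICKETS = 500_000
X25519_KEY = 32         # bytes cached per classical ticket
MLKEM1024_CT = 1_568    # bytes cached per ML-KEM-1024 ticket

classical = TICKETS * X25519_KEY
pq = TICKETS * MLKEM1024_CT

print(f"classical: {classical / 1e6:.0f} MB")    # 16 MB
print(f"ML-KEM-1024: {pq / 1e6:.0f} MB")         # 784 MB
```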

Recommendation: Cache ML-KEM encapsulation results in-process. The 1,568-byte ciphertext is accessed once per resumed handshake and must be served at sub-microsecond latency to avoid adding to TLS negotiation time. Network round-trips are unacceptable here.

ML-DSA (FIPS 204) -- Digital Signatures (Dilithium)

ML-DSA-65 at a glance: public key 1,952 B · signature 3,309 B · private key 4,032 B

Where it lives in your cache: JWT verification keys, API token signatures, certificate chain validation, code signing verification. Every API gateway that validates JWTs must cache the issuer's ML-DSA public key. Every microservice that verifies inter-service auth tokens must cache the signing key.

Cache impact: A gateway serving 20 API issuers caches 20 ML-DSA-65 public keys: 39 KB -- trivial in isolation. But the signatures attached to every JWT are 3,309 bytes each. If you cache 100K validated tokens for deduplication, the signature material alone is 331 MB.

Recommendation: Cache ML-DSA public keys aggressively -- they change rarely and are accessed on every verification. Cache validated token results (not the full signatures) to avoid storing 3.3 KB per token. An in-process cache with CacheeLFU admission control keeps the hot issuer keys in L0 at 31ns access while evicting cold tokens automatically.
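The memory case for caching verdicts rather than signatures is easy to quantify. The 40-byte result record below is a hypothetical layout (32-byte digest plus a flag and TTL); the exact encoding is up to you:

```python
# Storing validated-token results instead of full ML-DSA-65 signatures.
TOKENS = 100_000
MLDSA65_SIG = 3_309
RESULT_RECORD = 40    # hypothetical: 32-byte token digest + flag + TTL

full = TOKENS * MLDSA65_SIG        # ~331 MB of raw signatures
compact = TOKENS * RESULT_RECORD   # 4 MB of verdict records

print(f"full signatures: {full / 1e6:.0f} MB")
print(f"verdict records: {compact / 1e6:.0f} MB")
print(f"savings: {full / compact:.0f}x")
```

Under these assumptions the dedup cache shrinks by roughly 83x.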

FALCON / FN-DSA -- Compact Lattice Signatures

FALCON-512 at a glance: public key 897 B · signature 690 B · private key 1,281 B

Where it lives in your cache: FALCON is the most cache-friendly PQ signature scheme. Its 690-byte signatures are 4.8x smaller than ML-DSA-65 (3,309 bytes) and 24.8x smaller than SLH-DSA-128f (17,088 bytes). This makes it the preferred choice for systems where signatures are cached or transmitted frequently: real-time auth tokens, WebSocket session credentials, IoT device attestations.

Cache impact: Moderate. At 690 bytes per signature, caching 1M validated FALCON signatures requires 690 MB -- significant but manageable. The 897-byte public keys are comparable to ML-KEM-512 and cache efficiently.

Recommendation: FALCON is the right choice when cache memory is constrained. Its compact signatures reduce both cache pressure and network transfer. The tradeoff: FALCON signing depends on floating-point Gaussian sampling over NTRU lattices that is difficult to implement in constant time, making it computationally heavier and trickier to implement safely than ML-DSA. Cache the generated keys and signatures, not the generation process.

SLH-DSA (FIPS 205) -- Stateless Hash-Based Signatures (SPHINCS+)

SLH-DSA-128f at a glance: public key 32 B · signature 17,088 B · private key 64 B

Where it lives in your cache: SLH-DSA is the outlier. Its public keys are tiny (32 bytes -- same as Ed25519), but its signatures are enormous. SLH-DSA-128f produces 17 KB signatures. SLH-DSA-256f produces 49 KB signatures. This is the conservative choice -- its security relies only on hash function properties, not lattice assumptions -- but the size cost is severe.

Cache impact: Catastrophic for signature caching. Caching 100K SLH-DSA-128f signatures requires 1.7 GB. At the 256f security level, that becomes nearly 5 GB. No network cache handles this gracefully. The serialization overhead alone (17 KB per GET/SET across TCP) adds 0.5-2ms per operation on ElastiCache.

Recommendation: Do not cache full SLH-DSA signatures unless absolutely necessary. Cache the verification result (a boolean) alongside the content hash, not the signature itself. When you must cache the signature (audit trails, compliance), use an in-process engine with zero serialization. A 17 KB value served from in-process memory at 31ns is nearly 50,000x faster than the same value fetched from ElastiCache at 1.5ms.
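The verdict-caching pattern can be sketched in a few lines. This is a minimal, non-thread-safe illustration; `verify_slh_dsa` is a placeholder for your real SLH-DSA verification call (e.g. via a liboqs binding), not an actual API:

```python
import hashlib

# Cache the verdict, not the 17 KB signature: key on a 32-byte digest.
_verdicts: dict[bytes, bool] = {}

def cached_verify(message: bytes, signature: bytes, verify_slh_dsa) -> bool:
    # Digest of message + signature identifies this verification uniquely.
    key = hashlib.sha256(message + signature).digest()
    if key not in _verdicts:
        # Expensive SLH-DSA verification runs only on a cache miss.
        _verdicts[key] = verify_slh_dsa(message, signature)
    return _verdicts[key]
```

Each cached entry costs 32 bytes of key plus a boolean, instead of the full signature. A production version would bound the map and expire entries when keys rotate.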

The Working Set Math

Here is the cache memory required for 1 million sessions under each PQ migration scenario, compared to the classical baseline:

| Scenario | Key Material / Session | 1M Sessions | vs Classical |
|---|---|---|---|
| Classical (ECDH + Ed25519) | 96 B | 96 MB | baseline |
| ML-KEM-768 + ML-DSA-65 | 4,493 B | 4.49 GB | 47x |
| ML-KEM-1024 + FALCON-512 | 2,258 B | 2.26 GB | 24x |
| ML-KEM-768 + SLH-DSA-128f | 18,272 B | 18.27 GB | 190x |
| Hybrid (X25519 + ML-KEM-768) | 1,280 B | 1.28 GB | 13x |

Even the most conservative hybrid approach (X25519 + ML-KEM-768) increases cache memory by 13x. The SLH-DSA path increases it by 190x. These are not theoretical projections -- they are the byte-level arithmetic of the algorithms NIST has standardized.
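The scenario figures can be regenerated from the per-algorithm sizes. One assumption to flag: the hybrid row is computed as the classical key and signature (96 B) plus an ML-KEM-768 key, which matches the 1,280 B in the table:

```python
# Reproducing the working-set table: bytes per session, scaled to 1M.
SESSIONS = 1_000_000
BASELINE = 32 + 64   # X25519 + Ed25519

scenarios = {
    "ML-KEM-768 + ML-DSA-65": 1_184 + 3_309,
    "ML-KEM-1024 + FALCON-512": 1_568 + 690,
    "ML-KEM-768 + SLH-DSA-128f": 1_184 + 17_088,
    "Hybrid (X25519 + ML-KEM-768)": 32 + 64 + 1_184,
}

for name, per_session in scenarios.items():
    total_gb = SESSIONS * per_session / 1e9
    print(f"{name}: {total_gb:.2f} GB ({per_session / BASELINE:.0f}x)")
```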

Why In-Process Caching Wins at PQ Scale

Network-bound caches have two costs that scale with value size: serialization and transfer. A 96-byte session token serializes in nanoseconds and transfers in microseconds. A 4,493-byte PQ session token makes both of those size-proportional costs roughly 47x larger, while the fixed round-trip cost stays the same. The per-byte costs that were invisible at classical sizes become the bottleneck at PQ sizes.

In-process caching eliminates both costs entirely. The value sits in the same address space as your application. A GET is a hash lookup and a pointer dereference -- 31 nanoseconds regardless of value size. Whether the cached value is 96 bytes or 49,856 bytes, the access time is the same. There is no serialization, no TCP connection, no network hop.

Cachee: Built for PQ Key Sizes

Cachee is an in-process cache engine that runs natively in Rust alongside your application. 140+ Redis-compatible commands. 32M+ ops/sec single-thread. 31ns L1 reads. CacheeLFU adaptive eviction keeps hot keys (issuer public keys, active session tokens) in L0 while automatically evicting cold entries. Zero serialization overhead means PQ key material is cached at the same latency as classical keys. The transition to post-quantum cryptography does not require a transition in your cache architecture -- it requires a cache that was built for the payload sizes PQ demands.

Practical Migration Steps

  1. Audit your cached key material. Identify every place your infrastructure caches cryptographic keys, signatures, or tokens. TLS session stores, JWT verification caches, API credential stores, certificate caches, OCSP stapling caches.
  2. Calculate the multiplier. For each cache, multiply the current key material size by the PQ equivalent from the table above. If you are using ML-KEM-768 + ML-DSA-65, multiply by 47x. If you are using FALCON-512, multiply by 24x.
  3. Decide what to cache. Not everything needs to be cached at full fidelity. Cache verification results (booleans) instead of full SLH-DSA signatures. Cache public keys (accessed frequently, change rarely) more aggressively than ciphertexts (accessed once per session).
  4. Move hot-path crypto material to in-process cache. TLS session tickets, JWT issuer keys, and auth tokens are accessed on every request. These must be served at sub-microsecond latency. Network caches add unacceptable overhead at PQ sizes.
  5. Keep cold-path material in your existing infrastructure. Certificate revocation lists, audit logs, and archival signatures can remain in Redis or persistent storage. The access frequency does not justify in-process caching.

The Timeline

CNSA 2.0 mandates PQ key exchange for national security systems by 2030. Major browsers already support ML-KEM in TLS 1.3 (Chrome 131+, Firefox 132+). AWS, Cloudflare, and Google Cloud offer PQ-enabled endpoints. The libraries are shipping. The standards are final. The only question is whether your cache layer is ready for the key sizes they produce.

Every month you wait, your infrastructure accumulates more classical key material that will eventually need to be migrated. The cache layer is the first place the size increase hits -- and the last place most teams think to prepare.

Cachee handles PQ key sizes at 31ns. No serialization. No network hop. No infrastructure change.

Install Cachee · PQ Cache Details