Post-Quantum Cache Migration: Your 2029 Compliance Deadline Starts at the Cache Layer
Google has committed to full post-quantum cryptography by 2029. Cloudflare is already shipping ML-KEM in TLS 1.3 for connections that negotiate it at its edge. NSA's CNSA 2.0 mandates post-quantum algorithms for all National Security Systems by 2030, with Phase 1 software-signing requirements hitting in 2027. Chrome, Firefox, and Safari are shipping ML-KEM key agreement in production today. The industry conversation about post-quantum migration is loud, specific, and accelerating. It is also focused entirely on the wrong layer.
Every PQ migration guide published by NIST, NSA, CISA, and the major cloud providers talks about TLS, key exchange, digital signatures, and certificates. None of them mentions cache. This is a critical oversight, because your cache layer is where the largest post-quantum payloads land first and where the performance impact is most severe. An ML-DSA-65 signature is 3,309 bytes. A FALCON-512 public key is 897 bytes. An ML-KEM-768 ciphertext is 1,088 bytes. These are not theoretical sizes -- they are the exact byte counts from FIPS 203, FIPS 204, and the FALCON specification (slated to become FN-DSA in FIPS 206). And every one of these payloads gets cached for session resumption, certificate validation, key reuse, and signature verification.
Redis latency scales linearly with payload size. Classical cryptographic payloads -- Ed25519 signatures at 64 bytes, ECDH shared secrets at 32 bytes, RSA-2048 signatures at 256 bytes -- were small enough that cache performance was never the bottleneck. Post-quantum payloads are 10x to 100x larger. Your cache becomes the bottleneck before your TLS does, because TLS migration generates the PQ payloads that need caching.
The Timeline Everyone Ignores
The post-quantum migration timeline is not a single deadline. It is a sequence of milestones that have already started. FIPS 203, 204, and 205 were finalized in 2024. Chrome shipped ML-KEM key agreement in 2025. The deadlines ahead are not far away, and each one creates new PQ payloads that your cache infrastructure must handle.
| Year | Milestone | Cache Impact |
|---|---|---|
| 2024 | FIPS 203/204/205 finalized (ML-KEM, ML-DSA, SLH-DSA) | PQ payload sizes are now standardized. Cache capacity planning starts. |
| 2025 | Chrome ships ML-KEM; HIPAA mandates encryption review | PQ TLS sessions begin hitting your infrastructure. Session caches see first PQ payloads. |
| 2026 | AWS/Azure offer PQ key exchange (preview) | Cloud-managed PQ key exchange generates PQ shared secrets that services cache. |
| 2027 | CNSA 2.0 Phase 1: software/firmware signing must be PQ | Code-signing certs become PQ. Build caches, artifact caches, and CI pipelines must handle 3,309B+ signatures. |
| 2029 | Google PQ deadline; browser deprecation of classical-only begins | Every TLS session produces PQ payloads. Session resumption caches must handle PQ at full traffic volume. |
| 2030 | CNSA 2.0 full compliance for National Security Systems | All cached cryptographic material must be PQ. No exceptions for any data store, including cache. |
| 2033 | Microsoft PQ deadline | Stragglers who delayed migration face forced PQ upgrades across all infrastructure. |
The critical insight most teams miss: your cache needs to handle PQ payloads BEFORE your TLS migrates, not after. When you enable PQ TLS on your load balancer, every handshake immediately produces ML-KEM shared secrets and ML-DSA certificates that your backend services cache for session resumption. If your cache cannot handle these payloads at production throughput, your TLS migration fails at the cache layer. The cache is the prerequisite, not the afterthought.
The Deadline Is Closer Than You Think
CNSA 2.0 Phase 1 software signing requirements take effect in 2027 -- that is less than 12 months away. If your build infrastructure caches code-signing certificates or verification results, those caches must handle PQ payloads by then. The 2029 Google deadline applies to every web-facing service. The cache migration must be complete before these deadlines, because cache is the dependency that TLS migration relies on.
Why Cache Breaks Before TLS
When you enable post-quantum TLS, the cryptographic handshake completes at the edge -- your load balancer, your CDN, or your reverse proxy. The handshake produces PQ artifacts: ML-KEM ciphertexts (1,088 bytes for ML-KEM-768, 1,568 bytes for ML-KEM-1024; the derived shared secret itself is still 32 bytes), ML-DSA certificates (3,309 bytes for the signature alone, plus the certificate body), and PQ session tickets. These artifacts do not stay at the edge. They propagate inward to every service that needs to validate the session, resume a connection, or verify a certificate chain.
Every one of these services caches these artifacts. Session resumption requires caching the shared secret or a session ticket derived from it. Certificate validation requires caching the certificate chain and its PQ signatures. Mutual TLS requires caching client certificates with their PQ public keys and signatures. The cache is not an optimization here -- it is a requirement. Without cached session data, every request requires a full PQ handshake, which at current performance levels takes 2-5 milliseconds per handshake. No production system can absorb that per-request overhead.
Here is where the math breaks Redis. A single ML-DSA-65 signature cached in Redis takes approximately 440 microseconds to read, based on published Redis latency benchmarks for payloads around 3,309 bytes. At 100,000 sessions per second -- a modest load for any service behind a major load balancer -- Redis needs to serve 100,000 reads of 3,309 bytes each per second. That is 44 CPU-seconds of blocking Redis operations per wall-clock second. You need 44 Redis CPU cores just to serve session cache reads, before any writes, evictions, or other operations. Your cache infrastructure fails before your TLS upgrade is complete.
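A quick back-of-envelope check of that figure:

```python
# Back-of-envelope: Redis CPU needed just to serve PQ session reads.
# Assumes ~440 us of blocking Redis time per 3,309-byte read (figure above).
reads_per_sec = 100_000
read_time_s = 440e-6  # ~440 microseconds per ML-DSA-65-sized read

cpu_seconds_per_sec = reads_per_sec * read_time_s
print(f"{cpu_seconds_per_sec:.0f} CPU-seconds of Redis work per wall-clock second")
# -> 44 CPU-seconds: roughly 44 cores saturated by session reads alone
```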
The failure mode is not a crash. It is latency degradation. Redis operations that took 50 microseconds at 64-byte Ed25519 payloads now take 440 microseconds at 3,309-byte ML-DSA payloads. P99 latencies climb. Timeouts increase. Session resumption failures trigger full handshakes, which generate more PQ payloads that need caching, which increases cache pressure further. It is a feedback loop that degrades gracefully until it does not.
The Payload Size Problem
The core issue is straightforward: post-quantum cryptographic payloads are dramatically larger than their classical equivalents, and cache performance is a function of payload size. The following table shows the exact size comparison for every cryptographic artifact that gets cached in a typical TLS-terminating infrastructure.
| Artifact | Classical | Post-Quantum | Size Increase | Redis Latency | Cachee Latency |
|---|---|---|---|---|---|
| Digital signature | Ed25519: 64B | ML-DSA-65: 3,309B | 51x | ~440us | 31ns |
| Key exchange material | ECDHE public share: 32B | ML-KEM-768 ciphertext: 1,088B | 34x | ~280us | 31ns |
| X.509 certificate | RSA-2048: ~1,000B | PQ cert bundle: 4,493B | 4.5x | ~520us | 31ns |
| Session token | Classical: ~200B | PQ session bundle: 4,493B | 22x | ~520us | 31ns |
| Public key | Ed25519: 32B | FALCON-512: 897B | 28x | ~230us | 31ns |
| Certificate chain (3 certs) | ~3,000B | ~13,479B | 4.5x | ~1,100us | 31ns |
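The expansion factors in the table follow directly from the standardized byte counts (the table floors them to whole multiples):

```python
# Expansion factors implied by the standardized sizes.
sizes = {
    "signature (Ed25519 -> ML-DSA-65)":       (64, 3309),
    "key exchange (X25519 -> ML-KEM-768 ct)": (32, 1088),
    "public key (Ed25519 -> FALCON-512)":     (32, 897),
}
for name, (classical, pq) in sizes.items():
    print(f"{name}: {pq // classical}x")
# -> 51x, 34x, 28x
```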
Redis latency increases with payload size because every operation involves serializing the value through the RESP protocol, copying it across a TCP socket, deserializing it on the client side, and managing memory allocation for the larger buffer. These are not Redis bugs. They are fundamental properties of a network-attached cache serving large payloads. The RESP protocol has no compression. Every byte of a 3,309-byte ML-DSA signature is transmitted as-is over the TCP connection, plus RESP framing overhead.
Cachee's in-process L1 tier operates at 31 nanoseconds regardless of payload size because the read is a pointer dereference into the application's own memory space. There is no serialization, no network hop, no protocol framing. The 3,309-byte ML-DSA signature lives in the same memory that the application reads from. Payload size affects memory consumption but not read latency. This is not an optimization. It is a different architecture.
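A plain dict makes the point concrete (as a stand-in for an in-process L1 tier, not the Cachee API): the read hands back a reference to the stored object, so there is no copy, no serialization, and no size-dependent cost.

```python
# Stand-in for an in-process L1 cache: a plain dict.
# A read returns a reference to the same bytes object -- no copy, no framing.
l1 = {}

signature = bytes(3309)  # ML-DSA-65-sized payload
secret = bytes(32)       # classical ECDH-sized payload

l1["sig"] = signature
l1["secret"] = secret

# Both reads are a hash lookup plus a pointer dereference, independent of size.
assert l1["sig"] is signature
assert l1["secret"] is secret
```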
Three Migration Strategies for Cache
There are exactly three approaches to handling post-quantum payloads in your cache infrastructure. Each has different tradeoffs in complexity, performance, and effectiveness. Most production systems will use a combination.
Strategy 1: Compress Then Cache
The intuitive approach is to compress PQ payloads before caching them. An ML-DSA-65 signature (3,309 bytes) compresses to approximately 2,800-3,100 bytes with zstd at default compression level, depending on the signature's entropy. This is a 6-15% reduction. The savings are minimal because cryptographic signatures are high-entropy data -- they are designed to look random, and random data does not compress well.
```python
# Strategy 1: Compress PQ payloads before caching
# Savings: 6-15% on signatures, adds ~50us CPU per operation
import zstd

def cache_pq_signature(redis_client, key, ml_dsa_signature):
    # ML-DSA-65 signature: 3,309 bytes
    compressed = zstd.compress(ml_dsa_signature)  # ~2,900 bytes (12% savings)
    redis_client.set(key, compressed, ex=3600)
    # Redis latency: still ~400us (down from ~440us)
    # Added CPU: ~50us for compression
    # Net result: slower than before compression

def read_pq_signature(redis_client, key):
    compressed = redis_client.get(key)  # ~400us Redis read
    if compressed:
        return zstd.decompress(compressed)  # +30us decompression
    return None

# Total: ~430us per read (worse than uncompressed Redis)
```
Compression adds CPU overhead (approximately 50 microseconds for compression, 30 microseconds for decompression) for minimal size reduction. The net result is often slower than caching the uncompressed payload, because you trade small network savings for guaranteed CPU cost on every read and write. Compression works for large, low-entropy payloads like JSON documents. It does not work for cryptographic material.
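The entropy argument is easy to verify with the standard library alone (zlib here as a stand-in for zstd; any general-purpose compressor behaves the same way on high-entropy input):

```python
import os
import zlib

# High-entropy data (a stand-in for an ML-DSA signature, which is
# designed to be indistinguishable from random bytes) barely compresses,
# and typically grows slightly from framing overhead.
random_payload = os.urandom(3309)
compressed_random = zlib.compress(random_payload, 9)

# Low-entropy data (repetitive JSON-like text) compresses dramatically.
json_payload = b'{"user": "alice", "role": "admin"}' * 100
compressed_json = zlib.compress(json_payload, 9)

print(f"random: {len(random_payload)}B -> {len(compressed_random)}B")
print(f"json:   {len(json_payload)}B -> {len(compressed_json)}B")
```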
Strategy 2: Cache the Boolean, Not the Signature
The more effective approach is to change what you cache. Most services that cache PQ signatures do not need the signature itself -- they need the verification result. An ML-DSA-65 signature verification produces a single boolean: valid or invalid. Instead of caching 3,309 bytes of signature, cache 1 byte of verification result plus a computation fingerprint binding the result to the specific input.
```python
# Strategy 2: Cache the verification result, not the signature
# Size reduction: 3,309 bytes -> 1 byte (3,309x reduction)
import hashlib

def verify_and_cache(cache, public_key, message, ml_dsa_signature):
    # Compute a fingerprint binding the result to the inputs
    fingerprint = hashlib.sha3_256(
        public_key + message + ml_dsa_signature
    ).hexdigest()
    # Check cache first
    cached_result = cache.get(f"pq_verify:{fingerprint}")
    if cached_result is not None:
        return cached_result == b'\x01'  # 1 byte, 31ns read
    # Full ML-DSA-65 verification (~1.2ms);
    # ml_dsa_verify is provided by your PQ library (FIPS 204 implementation)
    is_valid = ml_dsa_verify(public_key, message, ml_dsa_signature)
    # Cache the boolean result, not the 3,309B signature
    cache.set(f"pq_verify:{fingerprint}", b'\x01' if is_valid else b'\x00')
    return is_valid

# Cached reads: 31ns instead of 440us
# Memory: 1 byte instead of 3,309 bytes per entry
```
This is a 3,309x size reduction per cached entry. But it only works when the consumer needs the verification result, not the signature itself. For session resumption (where you need the actual shared secret) or certificate serving (where you need the actual certificate bytes), you still need to cache the full payload. Strategy 2 is powerful but not universal.
Strategy 3: Move to In-Process L1
The only strategy that handles every PQ payload at constant latency regardless of size is in-process L1 caching. When the cache lives in the application's memory space, reads are pointer dereferences. A 32-byte ECDH shared secret and a 3,309-byte ML-DSA signature take the same time to read: 31 nanoseconds. There is no serialization, no network protocol, no TCP socket. The payload size affects memory consumption, not read latency.
```python
# Strategy 3: In-process L1 cache (31ns constant, any payload size)
# This is the only strategy that scales for all PQ payload types
from cachee import CacheeL1

cache = CacheeL1(
    max_memory_mb=512,        # Budget for PQ payloads
    eviction="CacheeLFU",     # Frequency-based eviction
    attestation=True,         # PQ-sign every entry
    fingerprint_fields=[
        "input", "computation", "parameters", "version"
    ]
)

# Cache full ML-DSA-65 signature (3,309B) -- 31ns read
cache.set("sig:tx:abc123", ml_dsa_signature)
# Cache ML-KEM-768 ciphertext (1,088B) -- 31ns read
cache.set("kem:session:xyz", ml_kem_ciphertext)
# Cache PQ certificate chain (13,479B) -- 31ns read
cache.set("cert:chain:example.com", pq_cert_chain)

# All reads: 31ns, regardless of payload size
sig = cache.get("sig:tx:abc123")              # 31ns
ct = cache.get("kem:session:xyz")             # 31ns
chain = cache.get("cert:chain:example.com")   # 31ns
```
The tradeoff is memory. A Redis cluster can hold terabytes of cached data across many nodes. An in-process L1 cache is bounded by the application's available heap. For PQ migration, this tradeoff favors L1 because the hot-path PQ payloads -- active sessions, recently verified signatures, current certificate chains -- represent a working set that fits comfortably in hundreds of megabytes, even at the larger PQ payload sizes.
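A back-of-envelope working-set estimate shows why (counts are illustrative; sizes come from the table above):

```python
# Illustrative hot working set for PQ payloads in an in-process L1 tier.
active_sessions = 100_000
session_bundle_b = 4_493   # PQ session bundle
cert_chains = 1_000
chain_b = 13_479           # 3-cert PQ chain
verified_sigs = 500_000
fingerprint_b = 33         # 32B SHA3-256 fingerprint + 1B verdict (Strategy 2)

total_mb = (active_sessions * session_bundle_b
            + cert_chains * chain_b
            + verified_sigs * fingerprint_b) / 1_048_576
print(f"hot working set: ~{total_mb:.0f} MB")
# -> roughly 457 MB: well within a 512 MB L1 budget
```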
The Cachee approach combines Strategy 2 and Strategy 3. For verification results, cache the boolean with a computation fingerprint (Strategy 2). For payloads that must be served in full -- session secrets, certificates, keys -- cache them in the in-process L1 tier at 31 nanoseconds regardless of size (Strategy 3). The L2 network tier handles overflow and cross-process sharing. Hot-path PQ reads never hit the network.
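A minimal sketch of the combined approach, using a plain dict as the L1 stand-in and taking the PQ library's verify function as a parameter (both are assumptions for illustration, not the Cachee API):

```python
import hashlib

l1 = {}  # in-process L1 stand-in

def cached_verify(public_key: bytes, message: bytes, signature: bytes,
                  verify) -> bool:
    """Strategy 2: cache the one-byte verdict, keyed by an input fingerprint."""
    fp = hashlib.sha3_256(public_key + message + signature).hexdigest()
    key = f"pq_verify:{fp}"
    if key in l1:
        return l1[key]              # 31ns-class in-process hit
    result = verify(public_key, message, signature)  # full ML-DSA verify
    l1[key] = result
    return result

def cached_payload(key: str, load) -> bytes:
    """Strategy 3: payloads that must be served whole live in L1 verbatim."""
    value = l1.get(key)
    if value is None:
        value = load()              # e.g. fetch cert chain from L2 / origin
        l1[key] = value
    return value
```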
CNSA 2.0 Compliance Checklist for Cache
The NSA's Commercial National Security Algorithm Suite 2.0 (CNSA 2.0) defines the timeline for migrating National Security Systems to post-quantum cryptography. If you operate NSS or supply software/services to organizations that do, your cache infrastructure must meet these requirements at each phase. Even if you are not in the NSS ecosystem, CNSA 2.0 sets the baseline that commercial compliance frameworks (FedRAMP, SOC 2, PCI DSS) will follow.
Phase 1 (2027): Software and Firmware Signing
All software and firmware signing must use PQ algorithms. This means every code-signing certificate, every signed build artifact, and every firmware update signature becomes a PQ payload. If your CI/CD pipeline caches code-signing certificates (most do, for verification performance), those cached certificates are now 4,493+ bytes instead of ~1,000 bytes. If your artifact repository caches signature verification results, the signatures being verified are now 3,309 bytes. Build caches, artifact registries, and deployment pipelines all need PQ-capable cache infrastructure by 2027.
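The memory growth is easy to project for a concrete (illustrative) CI cache:

```python
# Illustrative projection for a CI artifact cache that stores
# code-signing certificates for verification performance.
cached_certs = 50_000
classical_cert_b = 1_000
pq_cert_b = 4_493  # PQ cert bundle size from the table above

before_mb = cached_certs * classical_cert_b / 1_048_576
after_mb = cached_certs * pq_cert_b / 1_048_576
print(f"cert cache: {before_mb:.0f} MB -> {after_mb:.0f} MB "
      f"({pq_cert_b / classical_cert_b:.1f}x)")
```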
Phase 2 (2029): Web Browsers and Servers
All web-facing TLS must use PQ key exchange and PQ authentication. This is the phase that hits every web service at full traffic volume. Session resumption caches, certificate caches, OCSP response caches, and TLS ticket caches must all handle PQ payloads at production throughput. This is the phase where Redis latency scaling becomes a production incident. A web service handling 100,000 requests per second cannot afford 440 microseconds per cached session lookup.
Phase 3 (2030): All Protocols
All cryptographic protocols must be PQ. VPN key exchange, email signatures (S/MIME), database connection encryption, API authentication tokens, inter-service mTLS -- everything. Every protocol that caches cryptographic material must handle PQ payload sizes. This is the phase where "we will migrate cache later" becomes a compliance violation.
Cachee Is Already PQ-Native
Cachee does not need a PQ migration. Every cache entry is already signed by three independent post-quantum signature algorithms: ML-DSA-65 (FIPS 204), SLH-DSA-SHA2-128f (FIPS 205), and FALCON-512 (the forthcoming FN-DSA). The attestation layer was built PQ-first. There is no classical-to-PQ migration path because there was never a classical path. CNSA 2.0 Phase 1, Phase 2, and Phase 3 compliance is satisfied by default.
The Harvest-Now-Decrypt-Later Threat to Cache
The harvest-now-decrypt-later (HNDL) threat is the reason post-quantum migration has urgency beyond compliance deadlines. Nation-state adversaries are recording encrypted network traffic today with the expectation that future quantum computers will decrypt it. This threat is well-documented for TLS traffic, VPN tunnels, and encrypted email. It is rarely discussed for cache traffic, but the exposure is identical.
Redis communicates over the RESP protocol, typically protected by TLS. If that TLS uses classical key exchange (ECDHE), an adversary recording the traffic today can decrypt every cached value when a sufficiently powerful quantum computer exists. This includes session tokens, authentication state, PII, financial data, health records -- everything your cache holds. The RESP protocol provides no additional encryption layer. The cache values are plaintext within the TLS session.
The recording is not hypothetical. The NSA's own CNSA 2.0 guidance explicitly states that data requiring confidentiality beyond 2030 should be protected with PQ algorithms today, because the HNDL collection window is already open. If your cache holds data with long-term confidentiality requirements -- and most caches do, because they hold the same data as the database -- the HNDL threat applies now.
In-process caching eliminates the HNDL attack surface entirely. When the cache lives in the application's memory space, there is no network traffic to intercept. No RESP protocol. No TLS session to record. No TCP packets to capture. The cached data never leaves the process boundary. An adversary would need to compromise the application's memory directly, which is a fundamentally different and more difficult attack than passive network recording. This is not a theoretical advantage. It is the difference between an attack surface that exists and one that does not.
Your Redis Traffic Is Being Recorded
If your Redis instances communicate over a network -- even within a VPC, even over TLS -- the traffic is recordable. Classical TLS does not protect against future quantum decryption. Every cached value that traverses the network today can be decrypted by a future quantum computer. In-process cache has zero network attack surface. There is nothing to record, nothing to harvest, nothing to decrypt later.
Migration Playbook: 6 Steps
The following playbook takes your cache infrastructure from classical to PQ-ready. Each step builds on the previous one. The entire migration can be executed incrementally, without downtime, alongside your TLS and key exchange migration.
Step 1: Inventory Cached Cryptographic Material
Before you can migrate, you need to know what cryptographic material your cache holds. Most teams underestimate this because they think of cache as "just performance." In practice, caches hold session tokens (which contain or reference key material), certificate chains, OCSP responses, signed API responses, authentication proofs, and derived keys.
```python
# Step 1: Audit your Redis for cryptographic material
# Run this against every Redis instance in your infrastructure
import redis

r = redis.Redis(host='your-redis-host', port=6379, db=0)

crypto_patterns = [
    "session:*", "cert:*", "tls:*", "token:*",
    "sig:*", "key:*", "auth:*", "ocsp:*",
    "handshake:*", "ticket:*"
]

inventory = {}
for pattern in crypto_patterns:
    keys = list(r.scan_iter(match=pattern, count=1000))
    if keys:
        sample = keys[:100]
        inventory[pattern] = {
            "count": len(keys),
            "avg_size_bytes": sum(len(r.get(k) or b'') for k in sample) // len(sample),
            "sample_ttl": r.ttl(keys[0])
        }
        print(f"{pattern}: {inventory[pattern]}")

# Output tells you exactly which keys hold crypto material,
# how large they are today, and how large they will be post-PQ
```
Step 2: Measure Current Payload Sizes and Latency
Baseline your current cache performance at current payload sizes. You need these numbers to quantify the PQ impact and justify the migration budget.
```python
# Step 2: Baseline current Redis latency by payload size
import redis
import statistics
import time

r = redis.Redis(host='your-redis-host', port=6379, db=0)

def measure_latency(key_pattern, sample_size=1000):
    keys = list(r.scan_iter(match=key_pattern, count=sample_size))[:sample_size]
    latencies = []
    sizes = []
    for key in keys:
        start = time.perf_counter_ns()
        value = r.get(key)
        elapsed_ns = time.perf_counter_ns() - start
        if value:
            latencies.append(elapsed_ns)
            sizes.append(len(value))
    if not latencies:
        return None
    return {
        "p50_latency_us": statistics.median(latencies) / 1000,
        "p99_latency_us": statistics.quantiles(latencies, n=100)[98] / 1000,
        "avg_size_bytes": statistics.mean(sizes),
        "sample_count": len(latencies)
    }

# Measure each cryptographic key pattern
for pattern in ["session:*", "cert:*", "token:*", "sig:*"]:
    result = measure_latency(pattern)
    if result:
        print(f"{pattern}: p50={result['p50_latency_us']:.0f}us, "
              f"p99={result['p99_latency_us']:.0f}us, "
              f"avg_size={result['avg_size_bytes']:.0f}B")
```
Step 3: Calculate PQ Payload Impact
Multiply your current payload sizes by the PQ expansion factor for each artifact type. Use the table from the Payload Size Problem section. This gives you the projected Redis latency under PQ payloads and the projected memory and throughput requirements.
```python
# Step 3: Project PQ impact on your cache infrastructure
# (uses measure_latency from Step 2)
PQ_EXPANSION = {
    "session:*": 22,   # 200B classical -> 4,493B PQ session bundle
    "cert:*": 4.5,     # 1,000B classical -> 4,493B PQ cert
    "token:*": 22,     # 200B classical -> 4,493B PQ token bundle
    "sig:*": 51,       # 64B Ed25519 -> 3,309B ML-DSA-65
    "key:*": 28,       # 32B Ed25519 -> 897B FALCON-512
    "ocsp:*": 8,       # ~500B classical -> ~4,000B PQ OCSP response
}

for pattern, expansion in PQ_EXPANSION.items():
    current = measure_latency(pattern)
    if current is None:
        continue
    projected_size = current["avg_size_bytes"] * expansion
    # Conservative sub-linear latency model (exponent 0.7);
    # a strictly linear model projects even worse numbers
    projected_p50 = current["p50_latency_us"] * (expansion ** 0.7)
    projected_p99 = current["p99_latency_us"] * (expansion ** 0.7)
    print(f"{pattern}:")
    print(f"  Current: {current['avg_size_bytes']:.0f}B, p50={current['p50_latency_us']:.0f}us")
    print(f"  PQ projected: {projected_size:.0f}B, p50={projected_p50:.0f}us")
    print(f"  Cachee L1: 31ns regardless of size")
```
Step 4: Add L1 Cache Layer in Front of Redis
Deploy an in-process L1 cache tier that intercepts all hot-path reads before they reach Redis. This is a non-destructive change -- Redis remains as the L2 tier for cold reads and cross-process sharing. The L1 tier handles the performance-sensitive PQ payload reads at 31 nanoseconds.
```python
# Step 4: Deploy L1 cache in front of Redis (non-destructive)
from cachee import CacheeL1, CacheeL2Redis

# L1: in-process, 31ns reads, handles PQ payload sizes
l1 = CacheeL1(
    max_memory_mb=512,
    eviction="CacheeLFU",
    attestation=True
)

# L2: existing Redis, handles cold reads and cross-process sharing
l2 = CacheeL2Redis(
    host="your-redis-host",
    port=6379,
    pq_aware=True  # enables PQ payload size monitoring
)

def get_cached(key):
    # Try L1 first (31ns)
    value = l1.get(key)
    if value is not None:
        return value
    # Fall back to L2 Redis (100-500us depending on payload size)
    value = l2.get(key)
    if value is not None:
        l1.set(key, value)  # Promote to L1 for next read
        return value
    return None  # Cache miss
```
Step 5: Migrate Hot-Path PQ Reads to L1
Identify the cache keys with the highest read frequency and the largest PQ payload sizes. These are the keys where the Redis-to-L1 migration produces the greatest latency improvement. Session tokens, active certificate chains, and frequently verified signatures are typically the highest-impact migration targets.
```python
# Step 5: Identify and migrate hot-path PQ keys to L1
# Configure L1 promotion policy based on PQ payload size
l1.configure_promotion(
    # Auto-promote to L1 any key read more than 10 times/second
    min_read_frequency_hz=10,
    # Prioritize large PQ payloads (biggest latency improvement)
    priority_by="payload_size_desc",
    # Monitor which keys are PQ cryptographic material
    tag_patterns={
        "session:*": "pq_session",
        "cert:*": "pq_certificate",
        "sig:*": "pq_signature",
        "kem:*": "pq_key_exchange"
    }
)

# After 24 hours of traffic, check L1 hit rate for PQ keys
stats = l1.stats()
print(f"L1 hit rate (all): {stats['hit_rate']:.1%}")
print(f"L1 hit rate (PQ keys): {stats['pq_hit_rate']:.1%}")
print(f"PQ reads served at 31ns: {stats['pq_hits_per_sec']}/sec")
print(f"PQ reads still hitting Redis: {stats['pq_misses_per_sec']}/sec")
```
Step 6: Monitor and Decommission Oversized Redis Clusters
As L1 absorbs the hot-path PQ reads, Redis traffic drops. Monitor the Redis cluster utilization and scale down. Many organizations find that after L1 migration, their Redis cluster is handling only cold reads and cross-process cache sharing -- workloads that require a fraction of the original Redis capacity.
```python
# Step 6: Monitor Redis reduction and right-size
# Track the migration progress
def migration_report(l1, l2):
    l1_stats = l1.stats()
    l2_stats = l2.stats()
    total_reads = l1_stats["total_reads"] + l2_stats["total_reads"]
    l1_share = l1_stats["total_reads"] / total_reads * 100
    print(f"L1 serving {l1_share:.1f}% of all reads at 31ns")
    print(f"Redis serving {100 - l1_share:.1f}% of reads (cold path only)")
    print(f"Redis CPU utilization: {l2_stats['cpu_percent']:.1f}%")
    print()
    print("Recommendation:")
    if l2_stats["cpu_percent"] < 20:
        print("  Redis cluster can be scaled down by 75%")
        print(f"  Estimated savings: ${l2_stats['monthly_cost'] * 0.75:.0f}/month")
    elif l2_stats["cpu_percent"] < 40:
        print("  Redis cluster can be scaled down by 50%")

# Run weekly during migration period
migration_report(l1, l2)
```
The six steps above produce a measurable reduction in PQ payload latency from day one. Step 4 is the critical step -- deploying L1 in front of Redis is non-destructive, reversible, and immediately effective. You do not need to remove Redis. You need to stop sending it the payloads that PQ migration makes too large for network-attached cache to handle efficiently.
The Migration Is Incremental, Not Big-Bang
You do not need to replace your entire cache infrastructure at once. Deploy L1 in front of Redis (Step 4), let it absorb hot-path PQ reads (Step 5), and scale down Redis as utilization drops (Step 6). The L1 tier handles PQ payloads at 31 nanoseconds from day one. Redis remains available for cold reads and cross-process sharing. Zero downtime. Zero data loss. Measurable improvement from the first deployment.
The Bottom Line
The post-quantum migration timeline is not a single deadline in 2029 or 2030. It is a sequence of milestones that started in 2024 and accelerates every year. CNSA 2.0 Phase 1 software signing requirements arrive in 2027. Google's PQ deadline hits in 2029. Full CNSA 2.0 compliance for NSS is required by 2030. Every milestone produces new PQ payloads -- signatures at 3,309 bytes, key exchanges at 1,088 bytes, certificates at 4,493 bytes -- that your cache infrastructure must handle at production throughput.
Your cache breaks before your TLS does because TLS migration generates the PQ payloads that caches must serve. Redis latency scales linearly with payload size. PQ payloads are 10x to 100x larger than classical. The math is simple and unfavorable. In-process L1 caching at 31 nanoseconds, constant regardless of payload size, is the only architecture that handles PQ payloads without becoming the bottleneck. Combined with caching verification results instead of full signatures (3,309x size reduction), this eliminates both the latency problem and the HNDL attack surface that network-attached caches expose.
The migration playbook is six steps. The first four can be completed in a week. The cache migration is the prerequisite that every other PQ migration depends on, and it is the one that nobody is talking about. Start now, because the deadlines that matter are closer than 2029.
Your 2029 PQ deadline starts at the cache layer. Cachee is PQ-native from day one -- 31ns reads, three PQ signature families, zero network attack surface.
Get Started · PQ Caching Guide · PQ Key Size Reference · Redis vs L1 Benchmark · Cache Bottleneck Analysis