Cachee vs Redis at Post-Quantum Key Sizes

Redis was built for small values. Post-quantum keys are not small. This is what happens when you push PQ-sized cryptographic material through a network cache designed for the classical era.

Published: April 14, 2026 | Author: Eric Beans, CEO, H33.ai, Inc.

1. The Test

Seven post-quantum value sizes tested on both Redis 7.4 (ElastiCache r7g.xlarge, same-AZ, TLS disabled for best-case) and Cachee L0 (in-process DashMap, same host). Not synthetic key-value microbenchmarks -- realistic cache operations that mirror production use:

| Operation | Cached Value | Size |
| --- | --- | --- |
| Session lookup | PQ session token (ML-KEM-768 CT + ML-DSA-65 sig + metadata) | 4,493 B |
| JWT signature verification | ML-DSA-65 signature (cached to skip re-verify) | 3,309 B |
| TLS ticket retrieval | ML-KEM-768 ciphertext | 1,088 B |
| Classical baseline | Ed25519 signature | 64 B |
| Compact PQ signature | FALCON-512 signature | 690 B |
| Hash-based signature cache | SLH-DSA-SHA2-128f signature | 17,088 B |
| Maximum PQ signature | SLH-DSA-SHA2-256f signature | 49,856 B |

Each test: 100,000 GET operations, pre-warmed cache, single-threaded client, P50 and P99 measured independently. Redis was reached over TCP within the same AZ (<0.1 ms wire RTT). Cachee L0 is a direct in-process hash lookup with no serialization.
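The measurement loop can be sketched in a few lines of Python. This is an illustrative stand-in, not code from either system: a plain dict plays the role of the in-process cache, and `bench_get` is a hypothetical helper. Swap the lookup for a real Redis client call to reproduce the network-side numbers.

```python
# Sketch of the benchmark methodology: time N GET operations against a
# pre-warmed cache and report P50/P99 independently from the same samples.
import time
import statistics

def bench_get(cache, key, n=100_000):
    """Return (p50, p99) latency in microseconds for n lookups."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        _ = cache[key]                          # operation under test
        samples.append((time.perf_counter() - t0) * 1e6)
    qs = statistics.quantiles(samples, n=100)   # qs[49] = P50, qs[98] = P99
    return qs[49], qs[98]

# Pre-warm with a 4,493-byte PQ-session-sized value, then measure.
cache = {"session:abc": b"\x00" * 4_493}
p50, p99 = bench_get(cache, "session:abc", n=10_000)
```

Measuring P50 and P99 from the same sample set (rather than separate runs) keeps the two percentiles comparable under identical load.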

2. Results

| Value | Size | Redis P50 | Redis P99 | Cachee P50 | Cachee P99 | Speedup (P50) |
| --- | --- | --- | --- | --- | --- | --- |
| Ed25519 sig | 64 B | 0.31 ms | 0.58 ms | 0.031 us | 0.089 us | 10,000x |
| FALCON-512 sig | 690 B | 0.34 ms | 0.64 ms | 0.031 us | 0.092 us | 10,968x |
| ML-KEM-768 CT | 1,088 B | 0.36 ms | 0.71 ms | 0.031 us | 0.093 us | 11,613x |
| ML-DSA-65 sig | 3,309 B | 0.44 ms | 0.92 ms | 0.031 us | 0.095 us | 14,194x |
| PQ session token | 4,493 B | 0.52 ms | 1.10 ms | 0.031 us | 0.096 us | 16,774x |
| SLH-DSA-128f sig | 17,088 B | 0.91 ms | 1.85 ms | 0.031 us | 0.098 us | 29,355x |
| SLH-DSA-256f sig | 49,856 B | 1.42 ms | 2.95 ms | 0.031 us | 0.102 us | 45,806x |

Cachee P50 is constant across all sizes. In-process hash lookup time is dominated by the hash computation and pointer dereference, not by the value size. The value is never copied, serialized, or transmitted -- it is a direct reference. Redis latency increases linearly with value size because every byte must be serialized, transmitted, and deserialized.
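The "direct reference, no copy" claim is easy to demonstrate. A toy illustration using a Python dict as a stand-in for the in-process cache (the same reference semantics, though the real L0 is a Rust DashMap):

```python
# An in-process lookup hands back a reference to the stored object; no bytes
# are copied regardless of value size.
cache = {}
big_value = b"\x00" * 49_856    # SLH-DSA-256f-sized signature
cache["sig"] = big_value

hit = cache["sig"]
assert hit is big_value         # same object in memory, zero bytes copied
```

A networked cache cannot do this: the moment a value crosses a socket, it must be copied at least twice (into the server's output buffer and into the client's receive buffer).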

3. Why Redis Degrades at PQ Sizes

Every Redis GET incurs three costs that scale with value size:

RESP Serialization (server-side)

Redis encodes the response using the RESP protocol. Bulk strings require a $<length>\r\n<data>\r\n envelope. The data portion is a byte-for-byte copy from the internal hash table into the output buffer. Cost: linear in value size.
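The framing is simple enough to write out. A minimal sketch of RESP2 bulk-string encoding (the function name is illustrative, not from any Redis client):

```python
# RESP2 bulk-string framing: $<length>\r\n<data>\r\n.
# The envelope is a handful of bytes; the copy of <data> is what scales.
def resp_bulk_string(data: bytes) -> bytes:
    return b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n"

frame = resp_bulk_string(b"\x00" * 1_088)   # ML-KEM-768 ciphertext size
assert len(frame) == 1_088 + 9              # constant ~9-byte envelope here
```

For a 64-byte value the envelope is a meaningful fraction of the frame; for a 49 KB value it is noise, and the byte-for-byte copy dominates.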

TCP Transfer

The serialized payload traverses the kernel TCP stack, potentially spanning multiple MTU frames. A 49,856-byte SLH-DSA-256f signature requires ~34 TCP segments at 1,500-byte MTU. A 64-byte Ed25519 signature fits in a single segment. Cost: linear in value size, with per-segment overhead.
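The segment counts above follow from one division. This sketch uses the article's approximation of 1,500 payload bytes per frame; in practice the TCP MSS is closer to 1,460 after IP/TCP headers, which adds roughly one more segment at the largest sizes:

```python
# Segments needed to carry a value, at ~1,500 payload bytes per MTU frame.
from math import ceil

def tcp_segments(value_size: int, mtu_payload: int = 1_500) -> int:
    return ceil(value_size / mtu_payload)

assert tcp_segments(64) == 1          # Ed25519: single segment
assert tcp_segments(49_856) == 34     # SLH-DSA-256f: ~34 segments
```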

Client Deserialization

The client library parses the RESP envelope and allocates a buffer for the value. Larger values require larger allocations and more memcpy work. Cost: linear in value size.
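The client-side work mirrors the server-side framing. A minimal sketch of parsing a bulk-string reply (illustrative, not taken from any client library):

```python
# Parse a RESP2 bulk-string reply: read the length header, then allocate a
# buffer and copy <length> bytes out of the frame. Both steps scale with size.
def parse_bulk_string(frame: bytes) -> bytes:
    assert frame[:1] == b"$"
    header_end = frame.index(b"\r\n")
    length = int(frame[1:header_end])
    start = header_end + 2
    return frame[start:start + length]   # allocation + memcpy, linear in size

assert parse_bulk_string(b"$5\r\nhello\r\n") == b"hello"
```

Real clients also handle chunked socket reads and error replies, but the linear allocate-and-copy step is unavoidable.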

Latency Breakdown by Value Size

The network round-trip on an established connection (wire latency, kernel scheduling, interrupt handling) is roughly constant at ~0.29 ms. Everything else scales:

[Chart: Redis latency breakdown at 64 B, 690 B, 4,493 B, 17 KB, and 49 KB -- the constant network RTT (95% of the total at 64 B) versus serialization, TCP transfer, and deserialization.]
At 64 bytes, serialization + transfer + deserialization account for under 10% of total Redis latency (~0.02 ms of 0.31 ms). The round-trip dominates. At 49,856 bytes, these three costs account for roughly 80% of total latency (~1.13 ms of 1.42 ms). The round-trip is now the minority. Redis is doing more work moving bytes than it spends waiting on the network.

4. Throughput at PQ Sizes

What happens at scale: 100,000 requests per second, each retrieving a 4,493-byte PQ session token (ML-KEM-768 + ML-DSA-65).

Redis at 100K req/sec with 4,493 B values

Serialization cost per request: ~2.3 us (RESP encode 4,493 bytes)
TCP transfer per request: ~2.5 us (3 TCP segments at 1500 MTU)
Deserialization per request: ~1.6 us (RESP parse + alloc + memcpy)
Total variable cost per request: ~6.4 us

At 100K req/sec:
100,000 x 6.4 us = 640 ms of cumulative serialization/transfer latency per second
+ 100,000 x 290 us RTT = 29,000 ms connection time (requires ~30 parallel connections)
Network bandwidth: 100,000 x 4,493 B = 449 MB/sec sustained
Redis single-thread serialization becomes the bottleneck at ~60K req/sec for this value size

Cachee L0 at 100K req/sec with 4,493 B values

Lookup cost per request: ~0.031 us (hash + pointer dereference)
Serialization: 0 (direct reference, no copy)
TCP transfer: 0 (in-process)
Deserialization: 0 (already in application memory)

At 100K req/sec:
100,000 x 0.031 us = 3.1 ms total compute per second
Network bandwidth: 0 bytes
No single-thread bottleneck (DashMap is sharded, scales with cores)

640 ms vs 3.1 ms. At PQ session sizes, Redis spends 206x more CPU time on its serialize-transfer-deserialize pipeline than Cachee spends on the entire operation. This is not a Redis bug. It is a fundamental consequence of putting large values through a serialization-and-network pipeline when the consumer is on the same machine.
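The arithmetic above checks out directly (per-request costs are the article's own estimates):

```python
# Reproduce the 100K req/sec comparison for 4,493 B values.
REQ_PER_SEC = 100_000

redis_us_per_req = 2.3 + 2.5 + 1.6      # serialize + TCP transfer + deserialize
cachee_us_per_req = 0.031               # hash + pointer dereference

redis_ms = REQ_PER_SEC * redis_us_per_req / 1_000     # 640.0 ms/sec
cachee_ms = REQ_PER_SEC * cachee_us_per_req / 1_000   # 3.1 ms/sec
bandwidth_mb = REQ_PER_SEC * 4_493 / 1e6              # 449.3 MB/sec

assert round(redis_ms) == 640
assert round(redis_ms / cachee_ms) == 206
```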

5. Memory Efficiency

Redis stores each key with internal metadata: a redisObject header (16 bytes), an SDS string header (variable, typically 9 bytes for keys under 256 bytes), a dict entry with two pointers and a hash (24 bytes), plus jemalloc size-class rounding. Total per-key overhead: approximately 70 bytes.

| Value Size | Example | Redis Overhead | Redis Total/Key | Cachee Overhead | Cachee Total/Key |
| --- | --- | --- | --- | --- | --- |
| 64 B | Ed25519 sig | 109% | 134 B | 63% | 104 B |
| 690 B | FALCON-512 sig | 10.1% | 760 B | 5.8% | 730 B |
| 1,088 B | ML-KEM-768 CT | 6.4% | 1,158 B | 3.7% | 1,128 B |
| 3,309 B | ML-DSA-65 sig | 2.1% | 3,379 B | 1.2% | 3,349 B |
| 4,493 B | PQ session | 1.6% | 4,563 B | 0.9% | 4,533 B |
| 17,088 B | SLH-DSA-128f sig | 0.4% | 17,158 B | 0.2% | 17,128 B |
| 49,856 B | SLH-DSA-256f sig | 0.1% | 49,926 B | 0.08% | 49,896 B |

Per-key overhead percentage actually improves for Redis at PQ sizes -- the 70-byte fixed cost becomes negligible against a 49 KB value. But the total memory tells the real story.

1 million PQ sessions (4,493 B each):

| System | Total Memory | vs Classical (64 B values) |
| --- | --- | --- |
| Redis, classical | 134 MB | baseline |
| Redis, PQ sessions | 4,563 MB | 34x |
| Cachee, classical | 104 MB | baseline |
| Cachee, PQ sessions | 4,533 MB | 44x |

PQ values dominate total memory regardless of system. The difference is that Cachee does not add network bandwidth on top of that memory cost. Redis serving 4.5 GB of cached PQ sessions at 100K req/sec generates 449 MB/sec of internal network traffic. Cachee generates zero.

6. When Redis Is Still Fine

Redis is not obsolete. It remains the right choice in specific scenarios, even in a post-quantum world:

| Scenario | Why Redis Works | PQ Caveat |
| --- | --- | --- |
| Small values (<1 KB) | Serialization cost is negligible; network RTT dominates. Redis overhead is constant and acceptable. | Only Ed25519 (64 B) and FALCON-512 (690 B) fall under 1 KB. All other PQ values exceed this. |
| Low frequency (<1K req/sec) | Even at PQ sizes, cumulative serialization cost is under 6.4 ms/sec. Not a bottleneck. | Fine until you scale. The problem appears at 10K+ req/sec. |
| Shared state across processes | Multiple application instances need a consistent view of the same data. Redis provides this; in-process caches cannot. | Consider Cachee L1 (distributed) for cross-process PQ state. L0 is single-process only. |
| Pub/sub and streams | Redis Pub/Sub and Streams have no in-process equivalent. Event-driven architectures need a message broker. | Use Redis for messaging. Use Cachee for the hot-path cache that consumes those messages. |
| Persistence and replication | Redis AOF/RDB provides durability. ElastiCache provides multi-AZ failover. | If you need durable PQ key storage, use a database. Caches are ephemeral by definition. |

The short version: Redis is the wrong tool for hot-path PQ key material on a single instance. It is still useful for cross-process coordination, pub/sub, and low-frequency access patterns. Use both: Cachee for the hot path, Redis for everything else.

Summary

| Metric | Redis 7.4 (ElastiCache) | Cachee L0 (in-process) |
| --- | --- | --- |
| P50 latency at 4,493 B | 0.52 ms | 0.031 us (16,774x faster) |
| P99 latency at 49,856 B | 2.95 ms | 0.102 us (28,922x faster) |
| Serialization cost at 100K req/sec | 640 ms/sec | 0 ms/sec |
| Network bandwidth at 100K req/sec | 449 MB/sec | 0 MB/sec |
| Single-thread bottleneck | ~60K req/sec (at 4.5 KB values) | None (sharded) |
| Per-key overhead | ~70 bytes | ~40 bytes |

Try Cachee

In-process PQ-attested caching. Sub-microsecond lookups at any value size. No serialization. No network hop.

brew install h33ai-postquantum/tap/cachee

PQ Key Size Reference  |  What is Post-Quantum Caching?  |  Install Guide