Cachee vs Redis at Post-Quantum Key Sizes

Redis was built for small values. Post-quantum keys are not small. This is what happens when you push PQ-sized cryptographic material through a network cache designed for the classical era.

Published: April 14, 2026 | Author: Eric Beans, CEO, H33.ai, Inc.

1. The Test

Seven post-quantum value sizes tested on both Redis 7.4 (ElastiCache r7g.xlarge, same-AZ, TLS disabled for best-case) and Cachee L0 (in-process DashMap, same host). Not synthetic key-value microbenchmarks -- realistic cache operations that mirror production use:

| Operation | Cached Value | Size |
| --- | --- | --- |
| Session lookup | PQ session token (ML-KEM-768 CT + ML-DSA-65 sig + metadata) | 4,493 B |
| JWT signature verification | ML-DSA-65 signature (cached to skip re-verify) | 3,309 B |
| TLS ticket retrieval | ML-KEM-768 ciphertext | 1,088 B |
| Classical baseline | Ed25519 signature | 64 B |
| Compact PQ signature | FALCON-512 signature | 690 B |
| Hash-based signature cache | SLH-DSA-SHA2-128f signature | 17,088 B |
| Maximum PQ signature | SLH-DSA-SHA2-256f signature | 49,856 B |

Each test: 100,000 GET operations, pre-warmed cache, single-threaded client, P50 and P99 measured independently. Redis was reached over TCP within the same AZ (<0.1 ms wire RTT). Cachee L0 is a direct in-process hash lookup with no serialization.
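The measurement loop can be sketched in a few lines of Python. This is an illustrative stand-in, not code from either system: a plain dict plays the role of the in-process cache, and `bench_get` is a hypothetical helper. Swap the lookup for a real Redis client call to reproduce the network-side numbers.

```python
# Sketch of the benchmark methodology: time N GET operations against a
# pre-warmed cache and report P50/P99 independently from the same samples.
import time
import statistics

def bench_get(cache, key, n=100_000):
    """Return (p50, p99) latency in microseconds for n lookups."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        _ = cache[key]                          # operation under test
        samples.append((time.perf_counter() - t0) * 1e6)
    qs = statistics.quantiles(samples, n=100)   # qs[49] = P50, qs[98] = P99
    return qs[49], qs[98]

# Pre-warm with a 4,493-byte PQ-session-sized value, then measure.
cache = {"session:abc": b"\x00" * 4_493}
p50, p99 = bench_get(cache, "session:abc", n=10_000)
```

Measuring P50 and P99 from the same sample set (rather than separate runs) keeps the two percentiles comparable under identical load.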

2. Results

| Value | Size | Redis P50 | Redis P99 | Cachee P50 | Cachee P99 | Speedup (P50) |
| --- | --- | --- | --- | --- | --- | --- |
| Ed25519 sig | 64 B | 0.31 ms | 0.58 ms | 0.031 us | 0.089 us | 10,000x |
| FALCON-512 sig | 690 B | 0.34 ms | 0.64 ms | 0.031 us | 0.092 us | 10,968x |
| ML-KEM-768 CT | 1,088 B | 0.36 ms | 0.71 ms | 0.031 us | 0.093 us | 11,613x |
| ML-DSA-65 sig | 3,309 B | 0.44 ms | 0.92 ms | 0.031 us | 0.095 us | 14,194x |
| PQ session token | 4,493 B | 0.52 ms | 1.10 ms | 0.031 us | 0.096 us | 16,774x |
| SLH-DSA-128f sig | 17,088 B | 0.91 ms | 1.85 ms | 0.031 us | 0.098 us | 29,355x |
| SLH-DSA-256f sig | 49,856 B | 1.42 ms | 2.95 ms | 0.031 us | 0.102 us | 45,806x |

Cachee P50 is constant across all sizes. In-process hash lookup time is dominated by the hash computation and pointer dereference, not by the value size. The value is never copied, serialized, or transmitted -- it is a direct reference. Redis latency increases linearly with value size because every byte must be serialized, transmitted, and deserialized.
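The "direct reference, no copy" claim is easy to demonstrate. A toy illustration using a Python dict as a stand-in for the in-process cache (the same reference semantics, though the real L0 is a Rust DashMap):

```python
# An in-process lookup hands back a reference to the stored object; no bytes
# are copied regardless of value size.
cache = {}
big_value = b"\x00" * 49_856    # SLH-DSA-256f-sized signature
cache["sig"] = big_value

hit = cache["sig"]
assert hit is big_value         # same object in memory, zero bytes copied
```

A networked cache cannot do this: the moment a value crosses a socket, it must be copied at least twice (into the server's output buffer and into the client's receive buffer).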

3. Why Redis Degrades at PQ Sizes

Every Redis GET incurs three costs that scale with value size:

RESP Serialization (server-side)

Redis encodes the response using the RESP protocol. Bulk strings require a $<length>\r\n<data>\r\n envelope. The data portion is a byte-for-byte copy from the internal hash table into the output buffer. Cost: linear in value size.
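The framing is simple enough to write out. A minimal sketch of RESP2 bulk-string encoding (the function name is illustrative, not from any Redis client):

```python
# RESP2 bulk-string framing: $<length>\r\n<data>\r\n.
# The envelope is a handful of bytes; the copy of <data> is what scales.
def resp_bulk_string(data: bytes) -> bytes:
    return b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n"

frame = resp_bulk_string(b"\x00" * 1_088)   # ML-KEM-768 ciphertext size
assert len(frame) == 1_088 + 9              # constant ~9-byte envelope here
```

For a 64-byte value the envelope is a meaningful fraction of the frame; for a 49 KB value it is noise, and the byte-for-byte copy dominates.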

TCP Transfer

The serialized payload traverses the kernel TCP stack, potentially spanning multiple MTU frames. A 49,856-byte SLH-DSA-256f signature requires ~34 TCP segments at 1,500-byte MTU. A 64-byte Ed25519 signature fits in a single segment. Cost: linear in value size, with per-segment overhead.
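The segment counts above follow from one division. This sketch uses the article's approximation of 1,500 payload bytes per frame; in practice the TCP MSS is closer to 1,460 after IP/TCP headers, which adds roughly one more segment at the largest sizes:

```python
# Segments needed to carry a value, at ~1,500 payload bytes per MTU frame.
from math import ceil

def tcp_segments(value_size: int, mtu_payload: int = 1_500) -> int:
    return ceil(value_size / mtu_payload)

assert tcp_segments(64) == 1          # Ed25519: single segment
assert tcp_segments(49_856) == 34     # SLH-DSA-256f: ~34 segments
```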

Client Deserialization

The client library parses the RESP envelope and allocates a buffer for the value. Larger values require larger allocations and more memcpy work. Cost: linear in value size.
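The client-side work mirrors the server-side framing. A minimal sketch of parsing a bulk-string reply (illustrative, not taken from any client library):

```python
# Parse a RESP2 bulk-string reply: read the length header, then allocate a
# buffer and copy <length> bytes out of the frame. Both steps scale with size.
def parse_bulk_string(frame: bytes) -> bytes:
    assert frame[:1] == b"$"
    header_end = frame.index(b"\r\n")
    length = int(frame[1:header_end])
    start = header_end + 2
    return frame[start:start + length]   # allocation + memcpy, linear in size

assert parse_bulk_string(b"$5\r\nhello\r\n") == b"hello"
```

Real clients also handle chunked socket reads and error replies, but the linear allocate-and-copy step is unavoidable.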

Latency Breakdown by Value Size

The network round-trip on an established connection (wire latency, kernel scheduling, interrupt handling) is roughly constant at ~0.29 ms. Everything else scales:

[Chart: Redis latency breakdown at 64 B, 690 B, 4,493 B, 17 KB, and 49 KB -- the constant network RTT (95% of the total at 64 B) versus serialization, TCP transfer, and deserialization.]
At 64 bytes, serialization + transfer + deserialization account for under 10% of total Redis latency (~0.02 ms of 0.31 ms). The round-trip dominates. At 49,856 bytes, these three costs account for roughly 80% of total latency (~1.13 ms of 1.42 ms). The round-trip is now the minority. Redis is doing more work moving bytes than it spends waiting on the network.

4. Throughput at PQ Sizes

What happens at scale: 100,000 requests per second, each retrieving a 4,493-byte PQ session token (ML-KEM-768 + ML-DSA-65).

Redis at 100K req/sec with 4,493 B values

Serialization cost per request: ~2.3 us (RESP encode 4,493 bytes)
TCP transfer per request: ~2.5 us (3 TCP segments at 1500 MTU)
Deserialization per request: ~1.6 us (RESP parse + alloc + memcpy)
Total variable cost per request: ~6.4 us

At 100K req/sec:
100,000 x 6.4 us = 640 ms of cumulative serialization/transfer latency per second
+ 100,000 x 290 us RTT = 29,000 ms connection time (requires ~30 parallel connections)
Network bandwidth: 100,000 x 4,493 B = 449 MB/sec sustained
Redis single-thread serialization becomes the bottleneck at ~60K req/sec for this value size

Cachee L0 at 100K req/sec with 4,493 B values

Lookup cost per request: ~0.031 us (hash + pointer dereference)
Serialization: 0 (direct reference, no copy)
TCP transfer: 0 (in-process)
Deserialization: 0 (already in application memory)

At 100K req/sec:
100,000 x 0.031 us = 3.1 ms total compute per second
Network bandwidth: 0 bytes
No single-thread bottleneck (DashMap is sharded, scales with cores)

640 ms vs 3.1 ms. At PQ session sizes, Redis spends 206x more CPU time on its serialize-transfer-deserialize pipeline than Cachee spends on the entire operation. This is not a Redis bug. It is a fundamental consequence of putting large values through a serialization-and-network pipeline when the consumer is on the same machine.
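The arithmetic above checks out directly (per-request costs are the article's own estimates):

```python
# Reproduce the 100K req/sec comparison for 4,493 B values.
REQ_PER_SEC = 100_000

redis_us_per_req = 2.3 + 2.5 + 1.6      # serialize + TCP transfer + deserialize
cachee_us_per_req = 0.031               # hash + pointer dereference

redis_ms = REQ_PER_SEC * redis_us_per_req / 1_000     # 640.0 ms/sec
cachee_ms = REQ_PER_SEC * cachee_us_per_req / 1_000   # 3.1 ms/sec
bandwidth_mb = REQ_PER_SEC * 4_493 / 1e6              # 449.3 MB/sec

assert round(redis_ms) == 640
assert round(redis_ms / cachee_ms) == 206
```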

5. Memory Efficiency

Redis stores each key with internal metadata: a redisObject header (16 bytes), an SDS string header (variable, typically 9 bytes for keys under 256 bytes), a dict entry with two pointers and a hash (24 bytes), plus jemalloc size-class rounding. Total per-key overhead: approximately 70 bytes.

| Value Size | Example | Redis Overhead | Redis Total/Key | Cachee Overhead | Cachee Total/Key |
| --- | --- | --- | --- | --- | --- |
| 64 B | Ed25519 sig | 109% | 134 B | 63% | 104 B |
| 690 B | FALCON-512 sig | 10.1% | 760 B | 5.8% | 730 B |
| 1,088 B | ML-KEM-768 CT | 6.4% | 1,158 B | 3.7% | 1,128 B |
| 3,309 B | ML-DSA-65 sig | 2.1% | 3,379 B | 1.2% | 3,349 B |
| 4,493 B | PQ session | 1.6% | 4,563 B | 0.9% | 4,533 B |
| 17,088 B | SLH-DSA-128f sig | 0.4% | 17,158 B | 0.2% | 17,128 B |
| 49,856 B | SLH-DSA-256f sig | 0.1% | 49,926 B | 0.08% | 49,896 B |

Per-key overhead percentage actually improves for Redis at PQ sizes -- the 70-byte fixed cost becomes negligible against a 49 KB value. But the total memory tells the real story.

1 million PQ sessions (4,493 B each):

| System | Total Memory | vs Classical (64 B values) |
| --- | --- | --- |
| Redis, classical | 134 MB | baseline |
| Redis, PQ sessions | 4,563 MB | 34x |
| Cachee, classical | 104 MB | baseline |
| Cachee, PQ sessions | 4,533 MB | 44x |

PQ values dominate total memory regardless of system. The difference is that Cachee does not add network bandwidth on top of that memory cost. Redis serving 4.5 GB of cached PQ sessions at 100K req/sec generates 449 MB/sec of internal network traffic. Cachee generates zero.

6. When Redis Is Still Fine

Redis is not obsolete. It remains the right choice in specific scenarios, even in a post-quantum world:

| Scenario | Why Redis Works | PQ Caveat |
| --- | --- | --- |
| Small values (<1 KB) | Serialization cost is negligible; network RTT dominates. Redis overhead is constant and acceptable. | Only Ed25519 (64 B) and FALCON-512 (690 B) fall under 1 KB. All other PQ values exceed this. |
| Low frequency (<1K req/sec) | Even at PQ sizes, cumulative serialization cost is under 6.4 ms/sec. Not a bottleneck. | Fine until you scale. The problem appears at 10K+ req/sec. |
| Shared state across processes | Multiple application instances need a consistent view of the same data. Redis provides this; in-process caches cannot. | Consider Cachee L1 (distributed) for cross-process PQ state. L0 is single-process only. |
| Pub/sub and streams | Redis Pub/Sub and Streams have no in-process equivalent. Event-driven architectures need a message broker. | Use Redis for messaging. Use Cachee for the hot-path cache that consumes those messages. |
| Persistence and replication | Redis AOF/RDB provides durability. ElastiCache provides multi-AZ failover. | If you need durable PQ key storage, use a database. Caches are ephemeral by definition. |

The short version: Redis is the wrong tool for hot-path PQ key material on a single instance. It is still useful for cross-process coordination, pub/sub, and low-frequency access patterns. Use both: Cachee for the hot path, Redis for everything else.

Summary

| Metric | Redis 7.4 (ElastiCache) | Cachee L0 (in-process) |
| --- | --- | --- |
| P50 latency at 4,493 B | 0.52 ms | 0.031 us (16,774x faster) |
| P99 latency at 49,856 B | 2.95 ms | 0.102 us (28,922x faster) |
| Serialization cost at 100K req/sec | 640 ms/sec | 0 ms/sec |
| Network bandwidth at 100K req/sec | 449 MB/sec | 0 MB/sec |
| Single-thread bottleneck | ~60K req/sec (at 4.5 KB values) | None (sharded) |
| Per-key overhead | ~70 bytes | ~40 bytes |

Try Cachee

In-process PQ-attested caching. Sub-microsecond lookups at any value size. No serialization. No network hop.

brew install h33ai-postquantum/tap/cachee

PQ Key Size Reference  |  What is Post-Quantum Caching?  |  Install Guide