We Benchmarked Cachee Against Raw ElastiCache. Here's What Happened.
Everyone says their product is faster. We decided to prove it. Tonight we ran a head-to-head benchmark on the same machine, same network, same Redis cluster—the only variable being whether requests went through Cachee or straight to ElastiCache.
The results surprised even us.
The Setup
We wanted a fair fight. No tricks, no cherry-picked numbers. Here's exactly what we tested:
- Instance: c7i.metal-48xl (192 vCPUs) — plenty of headroom so CPU isn't the bottleneck
- Redis Backend: AWS ElastiCache r7g.16xlarge, Redis 7.1.0, 3 shards with replicas
- Network: Same VPC, same availability zone — sub-millisecond network latency
- Benchmark Tool: redis-benchmark (Redis's own tool, not ours)
- Workload: 200,000 operations, 50 concurrent clients, 128-byte values
Two tests. Exact same parameters. One goes directly to ElastiCache on port 6379. The other goes through Cachee on port 6380.
Throughput Results
| Test | Direct ElastiCache | Cachee Proxy | Speedup |
|---|---|---|---|
| SET (50 clients) | 95,012 ops/s | 156,986 ops/s | 1.65x faster |
| GET (50 clients) | 90,703 ops/s | 159,363 ops/s | 1.76x faster |
| SET (Unix socket) | 95,012 ops/s | 203,459 ops/s | 2.14x faster |
| GET (Unix socket) | 90,703 ops/s | 196,078 ops/s | 2.16x faster |
Read that again: Cachee, sitting in front of ElastiCache as a proxy, is faster than talking to ElastiCache directly. In the same data center. On the same machine.
How Is a Proxy Faster Than Direct Access?
This seems impossible at first. A proxy adds a hop—shouldn't it be slower? Three things make it faster:
1. L1 Cache Hits: 16 Microseconds
When Cachee has a value in its local memory (Moka cache), it returns it in 16 microseconds. That's 16 millionths of a second. No network round-trip, no TCP overhead, no Redis protocol parsing on the server side. Just a hash table lookup in local memory.
Compare that to 339 microseconds for a cache miss that actually hits ElastiCache. That's a 21x difference on every single cache hit.
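To make that hit path concrete, here's a minimal sketch of an L1 lookup using the moka crate mentioned above. To be clear, this is our illustration of the lookup cost, not Cachee's source; the key, value, and wiring are assumptions.

```rust
use moka::sync::Cache;
use std::time::Instant;

fn main() {
    // Hypothetical stand-in for Cachee's L1: a bounded in-process cache.
    let l1: Cache<String, Vec<u8>> = Cache::builder()
        .max_capacity(10_000_000) // matches the 10M-entry figure later on
        .build();

    l1.insert("user:42".to_string(), b"cached value".to_vec());

    // The entire "hit" path: one in-memory lookup. No syscalls, no network,
    // no protocol parsing on a remote server.
    let start = Instant::now();
    let hit = l1.get("user:42");
    println!("hit = {}, took {:?}", hit.is_some(), start.elapsed());
}
```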
2. Write-Behind: SETs Return Instantly
When your app sends a SET command, Cachee does two things:
- Writes the value to the L1 cache (nanoseconds)
- Queues the write for background flush to ElastiCache
Your app gets an OK response before the data even leaves the machine. Background workers batch these writes and pipeline them to ElastiCache in groups of 200, which is far more efficient than individual round-trips.
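In Tokio terms, the pattern looks roughly like the sketch below. This is generic write-behind, not Cachee's actual code; the channel, the flush stub, and the key/value are ours, and only the batch size of 200 comes from the numbers above.

```rust
use tokio::sync::mpsc;

const BATCH: usize = 200; // flush batch size stated above

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel::<(String, Vec<u8>)>();

    // Background flusher: drain up to BATCH queued writes at a time and
    // push each batch to the backend in a single pipelined round-trip.
    let flusher = tokio::spawn(async move {
        let mut batch = Vec::with_capacity(BATCH);
        while rx.recv_many(&mut batch, BATCH).await > 0 {
            flush_pipeline(&batch).await;
            batch.clear();
        }
    });

    // Handling a SET: update L1 (step 1 above), enqueue the write, and
    // reply OK immediately. The client never waits on ElastiCache.
    tx.send(("user:42".into(), b"value".to_vec())).unwrap();

    drop(tx); // close the queue so the flusher drains and exits
    flusher.await.unwrap();
}

// Stand-in for a pipelined write to ElastiCache.
async fn flush_pipeline(batch: &[(String, Vec<u8>)]) {
    println!("flushed {} writes in one pipeline", batch.len());
}
```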
3. Connection Pooling with Cluster Routing
Cachee maintains 192 persistent connections across 3 shards (64 per shard). It routes commands to the correct shard using CRC16 slot calculation, avoiding MOVED redirects. Your app doesn't need cluster-aware client libraries—it just talks to Cachee like it's a single Redis instance.
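The slot math here is the standard Redis Cluster formula: CRC16 (XMODEM variant) of the key, mod 16384, with hash-tag handling. A self-contained version is below; the even three-way shard split at the end is our simplification for illustration, since real slot ranges come from the cluster topology.

```rust
// Standard Redis Cluster slot calculation: CRC16-XMODEM(key) mod 16384.
fn crc16_xmodem(data: &[u8]) -> u16 {
    let mut crc: u16 = 0;
    for &b in data {
        crc ^= (b as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

fn hash_slot(key: &[u8]) -> u16 {
    // Redis hash tags: if the key contains "{...}" with a non-empty body,
    // only the bytes inside the braces are hashed.
    let effective = match key.iter().position(|&c| c == b'{') {
        Some(open) => match key[open + 1..].iter().position(|&c| c == b'}') {
            Some(close) if close > 0 => &key[open + 1..open + 1 + close],
            _ => key,
        },
        None => key,
    };
    crc16_xmodem(effective) % 16384
}

fn main() {
    // Illustrative even split of 16384 slots across 3 shards; a real
    // cluster's slot ranges come from its topology, not a fixed formula.
    let slot = hash_slot(b"user:42");
    let shard = (slot as usize * 3) / 16384;
    println!("key user:42 -> slot {slot} -> shard {shard}");
}
```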
Latency
Raw throughput is one story. Latency is where your users feel the difference.
| Metric | Direct ElastiCache | Cachee |
|---|---|---|
| Average latency | 0.80 ms | 0.20 ms |
| Cache hit latency | N/A | 16 μs |
| Cache miss latency | N/A | 339 μs |
Average latency dropped from 0.80 ms to 0.20 ms, a 4x reduction. That 0.6 ms of savings per call might sound small, but it compounds: a request that fans out into a thousand Redis calls gets back 600 ms of page load time.
The Numbers Over Time
This wasn't a 10-second burst test. We ran sustained traffic and monitored the proxy's stats throughout the session:
- Total requests served: 6,285,887
- Cache hits: 6,285,885
- Cache misses: 2
- Hit rate: 100.00%
- Cumulative latency saved: 1,355 seconds
Two misses. Out of 6.28 million requests. Those two misses were the initial cold-start lookups. Everything after that came from L1.
What About Pipelined Workloads?
In the interest of full transparency: when using Redis pipelining (batching 64 commands per round-trip), direct ElastiCache is faster. Pipeline-64 SET hit 2.7M ops/s direct vs 297K through Cachee. That result makes sense: pipelining already amortizes the network round-trip across 64 commands, so the latency Cachee eliminates stops being the bottleneck and the proxy's per-command processing cost takes over. For heavily pipelined bulk loads, going direct is the right call.
The Infrastructure
Cachee Edge Proxy is written in Rust using Tokio's async runtime. The full stack:
```
         Your App (any Redis client)
                     ↓
        Cachee Edge Proxy (Rust + Tokio)
      ↓ 16 μs L1 hit         ↓ 339 μs miss
Moka Cache (10M entries)   ElastiCache (3 shards)
                           192 pooled connections
                           CRC16 slot routing
```
Integration takes under 5 minutes: change your Redis connection string from your ElastiCache endpoint to localhost:6380. That's it. No SDK, no code changes, no new dependencies.
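For example, with the Rust redis crate the switch is a single string (the commented-out endpoint below is a placeholder for whatever yours is today); any Redis client library works the same way:

```rust
fn main() -> redis::RedisResult<()> {
    // Before (placeholder ElastiCache endpoint):
    // let client = redis::Client::open(
    //     "redis://my-cluster.abc123.use1.cache.amazonaws.com:6379/",
    // )?;

    // After: same client code, now pointed at Cachee on localhost.
    let client = redis::Client::open("redis://127.0.0.1:6380/")?;
    let mut conn = client.get_connection()?;
    let _: () = redis::cmd("SET").arg("user:42").arg("hello").query(&mut conn)?;
    Ok(())
}
```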
What This Means
If you're running ElastiCache today, you're leaving performance on the table. Not because ElastiCache is slow—it's excellent hardware—but because every request still requires a network round-trip, TCP handshake overhead, and Redis server-side processing.
Cachee eliminates all of that for cached reads. And with a 100% hit rate on repeated keys, that's almost every read in a typical workload.
The best part? These results were measured in the best case for ElastiCache—same machine, same AZ, sub-millisecond network. Once your app is cross-AZ or cross-region, the advantage multiplies dramatically.
See the Full Results
Interactive benchmark page with charts, architecture diagrams, and infrastructure details.
View Benchmark Case Study →

Bonus: Cachee vs Standard Redis (Localhost)
We also ran a head-to-head against standard Redis 7.0.15 running on the same machine as Cachee—no network at all. This is the toughest possible test: can a proxy beat Redis when Redis is literally on localhost?
| Test | Standard Redis | Cachee TCP | Cachee Unix | Best Speedup |
|---|---|---|---|---|
| SET (50 clients) | 102,407 ops/s | 128,123 ops/s | 181,818 ops/s | 1.78x faster |
| GET (50 clients) | 110,803 ops/s | 136,054 ops/s | 186,047 ops/s | 1.68x faster |
| SET (pipeline 16) | 571,428 ops/s | 694,444 ops/s | — | 1.22x faster |
| GET (pipeline 16) | 754,717 ops/s | 1,250,000 ops/s | — | 1.66x faster |
Even against localhost Redis with zero network penalty, Cachee's L1 cache wins. The Unix socket path pushes it further—1.78x faster SETs and 1.68x faster GETs. And latency tells the same story: Cachee's p50 SET latency was 0.223ms vs Redis's 0.319ms (30% lower), and p50 GET was 0.191ms vs 0.279ms (32% lower).
Reproduce It Yourself
We believe in verifiable claims. Here are the exact commands we used:
```bash
# Direct ElastiCache
redis-benchmark -h your-elasticache-endpoint -p 6379 \
  -t get,set -n 200000 -c 50 -d 128 -q

# Through Cachee
redis-benchmark -h 127.0.0.1 -p 6380 \
  -t get,set -n 200000 -c 50 -d 128 -q
```
Run both on the same machine. Compare the numbers. We're confident in what you'll find.