DragonflyDB claims 25x Redis throughput. Cachee claims 667x lower latency. Redis claims battle-tested reliability. Marketing numbers are easy. Production numbers are hard. So we rented identical hardware, loaded 10 million keys from a real production dataset, ran the same 80/20 read/write workload for 48 continuous hours, and measured everything. No cherry-picked benchmarks. No synthetic key patterns. No single-second peaks presented as averages. Here is what actually happened — and why the results surprised us.
The Test Setup
We ran all three systems on AWS c6g.xlarge instances (4 vCPUs, 8 GiB RAM, Graviton2) with dedicated placement groups to eliminate noisy-neighbor variance. Each system got its own identical instance. The load generator ran on a separate c6g.2xlarge in the same availability zone, connected via enhanced networking with single-digit microsecond inter-instance latency. For Cachee, the L1 tier ran in-process on the load generator itself (as it does in production), with a dedicated Redis instance as the L2 backing store.
The dataset was 10 million keys extracted from a real e-commerce platform — session data, product catalog entries, user preference objects, and rate-limit counters. Key sizes ranged from 64 bytes to 12KB, with a median of 1.2KB. The access pattern followed an 80/20 read/write split with a Zipfian distribution (alpha=1.0), meaning roughly 20% of keys received 80% of traffic, the same skew you see in most production systems. We used memtier_benchmark for Redis and DragonflyDB, and Cachee's native benchmark harness for the L1 path. All measurements were collected every second and aggregated over the full 48-hour window. No warm-up periods were excluded. No outliers were trimmed.
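The workload shape is straightforward to reproduce. The following is a scaled-down sketch (100K keys instead of 10 million, all names ours, not our actual harness) of an 80/20 read/write generator with Zipfian key popularity at alpha = 1:

```python
import random
from collections import Counter

def make_workload(num_keys=100_000, num_ops=200_000, alpha=1.0,
                  read_ratio=0.8, seed=42):
    """Generate (op, key) pairs: Zipfian key popularity, 80/20 read/write."""
    rng = random.Random(seed)
    # Popularity of the rank-k key is proportional to 1 / k^alpha.
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_keys + 1)]
    keys = rng.choices(range(num_keys), weights=weights, k=num_ops)
    return [("GET" if rng.random() < read_ratio else "SET", k) for k in keys]

workload = make_workload()
reads = sum(1 for op, _ in workload if op == "GET")

# How much traffic lands on the hottest 20% of the keys actually seen?
counts = Counter(k for _, k in workload)
ranked = sorted(counts.values(), reverse=True)
top20 = sum(ranked[: len(ranked) // 5])

print(f"read fraction: {reads / len(workload):.2f}")
print(f"traffic to hottest 20% of observed keys: {top20 / len(workload):.2f}")
```

Running this shows the defining property of the distribution: a small minority of keys absorbs the large majority of traffic, which is exactly what makes hit rate (covered below) so sensitive to which keys the cache chooses to hold.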
Throughput Results
Let us start with raw throughput — the metric DragonflyDB leads with on every marketing page. And to their credit, it holds up. DragonflyDB delivered 800K operations per second sustained over 48 hours on a 4-vCPU instance. That is genuinely impressive. Redis, constrained by its single-threaded architecture, sustained 100K ops/sec (peaking at 112K) before the event loop saturated. Cachee's L1 tier sustained 658K ops/sec per node for reads, with writes passing through to the Redis L2 underneath.
| Metric | Redis 7.4 | DragonflyDB 1.24 | Cachee L1 |
|---|---|---|---|
| Peak Throughput | 112K ops/sec | 842K ops/sec | 660K ops/sec |
| Sustained (48hr avg) | 100K ops/sec | 800K ops/sec | 658K ops/sec |
| Throughput Variance | ±4.2% | ±3.8% | ±0.9% |
DragonflyDB wins raw throughput, and it is not close. Its shared-nothing, multi-threaded architecture parallelizes command processing across all available cores, something Redis fundamentally cannot do without clustering. On a single node, DragonflyDB delivers 8x the throughput of Redis. That is the real number on this hardware: not the 25x DragonflyDB reports from higher-core instances and synthetic benchmarks, but still a significant margin. Cachee's throughput is lower because it is optimized for a different metric entirely, as the next section shows.
Latency Results
Throughput tells you how many operations per second the system can process. Latency tells you how long each individual request takes from the application's perspective. These are fundamentally different measurements, and in production, latency is the one that pages your on-call engineer. A cache that can handle 1 million ops/sec but adds 5 ms to every request is, from each caller's point of view, slower than a cache that handles 100K ops/sec at 2 microseconds per read.
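One way to see why: a synchronous caller can have at most one request in flight, so its achievable rate is capped by latency regardless of what the server can sustain in aggregate. A back-of-the-envelope sketch, using the hypothetical figures from the paragraph above:

```python
def max_serial_ops_per_sec(latency_ms: float) -> float:
    """Throughput ceiling for one synchronous client: one request per round-trip."""
    return 1000.0 / latency_ms

# A 5 ms cache caps each synchronous caller at 200 ops/sec, no matter how
# many ops/sec the server itself can sustain; a 2 µs cache caps it at 500K.
for name, latency_ms in [("5 ms per read", 5.0), ("2 µs per read", 0.002)]:
    print(f"{name}: {round(max_serial_ops_per_sec(latency_ms)):,} ops/sec per client")
```

Server-side throughput only helps if you pile on concurrency; per-request latency is what each individual user actually feels.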
| Latency Percentile | Redis 7.4 | DragonflyDB 1.24 | Cachee L1 |
|---|---|---|---|
| P50 (Median) | 0.8 ms | 0.5 ms | 0.0015 ms |
| P95 | 2.1 ms | 1.1 ms | 0.003 ms |
| P99 | 3.0 ms | 1.5 ms | 0.004 ms |
| P99.9 | 12.4 ms | 4.2 ms | 0.008 ms |
Cachee's L1 reads complete in 1.5 microseconds at P50. Redis takes 800 microseconds. DragonflyDB takes 500 microseconds. That is a 533x difference between Cachee and Redis, and a 333x difference between Cachee and DragonflyDB. At P99, the gap widens further: Cachee at 4 microseconds vs. Redis at 3 milliseconds — a 750x difference. DragonflyDB improves on Redis by roughly 2x across the board, but it is still measuring in milliseconds while Cachee measures in microseconds.
[Figure: P99 latency breakdowns for Redis 7.4, DragonflyDB 1.24, and Cachee L1]
The reason is architectural, not an implementation detail. Both Redis and DragonflyDB are remote processes accessed over TCP, so every operation requires a network round-trip. DragonflyDB eliminates the single-threaded bottleneck, cutting event-loop queue time from 1.3 ms to 0.3 ms, but the 0.9 ms TCP round-trip is identical for both because it is imposed by the network path, not by either server's code. Cachee's L1 tier operates in the application's own memory space: no network hop, no serialization, no connection pool. A read is a hash-table lookup followed by a pointer dereference, which is why it is measured in microseconds instead of milliseconds.
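The in-process read-through pattern is simple to sketch. This is an illustrative model, not Cachee's actual API; the L2 is abstracted as a plain fetch function standing in for a Redis GET:

```python
import time
from typing import Any, Callable, Dict

class L1Cache:
    """Minimal in-process read-through cache. Hits are a local dict lookup;
    misses fall through to an L2 fetch function. Illustrative only."""

    def __init__(self, l2_fetch: Callable[[str], Any]) -> None:
        self._store: Dict[str, Any] = {}
        self._l2_fetch = l2_fetch
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        if key in self._store:        # no network hop, no serialization
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._l2_fetch(key)   # remote round-trip only on a miss
        self._store[key] = value
        return value

def slow_l2(key: str) -> str:
    time.sleep(0.001)                 # stand-in for a ~1 ms TCP round-trip
    return f"value:{key}"

cache = L1Cache(slow_l2)
cache.get("user:42")                  # miss: pays the L2 round-trip once
cache.get("user:42")                  # hit: local memory only
print(f"hits={cache.hits} misses={cache.misses}")
```

The hit path contains no I/O at all, which is the entire point: once a key is resident, its read cost is that of a dictionary lookup in the application's own address space.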
Hit Rate Results
Here is where the benchmark gets interesting. Throughput and latency measure what happens when the cache has the data. Hit rate measures how often the cache has the data in the first place. A cache running at 1 million ops/sec with a 50% hit rate sends half your traffic to the origin database. A cache running at 100K ops/sec with a 99% hit rate sends 1% to the origin. The second system generates 50x less database load.
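The arithmetic behind that 50x figure is worth making explicit: the origin database sees only the misses.

```python
requests_per_sec = 100_000

# Misses are what the origin database has to absorb.
low_hit_misses = requests_per_sec * (1 - 0.50)   # 50% hit rate
high_hit_misses = requests_per_sec * (1 - 0.99)  # 99% hit rate
print(f"origin load ratio: {low_hit_misses / high_hit_misses:.0f}x")
```

A hit-rate improvement from 50% to 99% looks like "only" 49 points, but it shrinks origin traffic by a factor of fifty, which is usually the difference between a comfortable database and a capacity incident.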
Redis achieved a 67% hit rate with its LRU eviction policy under the Zipfian workload. With 10 million keys and 6GB of memory, roughly a third of the working set was evicted at any given time, forcing cache misses that fell through to the origin. DragonflyDB did slightly better at 71% thanks to its more memory-efficient dashtable structure, which stores the same dataset in less RAM and therefore evicts fewer keys. But both systems are fundamentally reactive — they cache data after it is requested and evict data based on recency, with no awareness of future access patterns.
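You can reproduce the shape of this result with a toy simulation: an LRU cache sized to hold roughly a third of a Zipfian keyspace, mirroring the memory pressure in our Redis configuration. The exact hit rate depends on capacity and skew (this scaled-down sketch will not land on 67% exactly); the point is that it falls well short of 100%:

```python
import random
from collections import OrderedDict

class LRU:
    """Read-through cache with least-recently-used eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def get(self, key) -> bool:
        self.lookups += 1
        if key in self.data:
            self.data.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return True
        self.data[key] = None                # cache the value on a miss
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)    # evict the least recently used
        return False

rng = random.Random(7)
num_keys = 50_000
weights = [1.0 / rank for rank in range(1, num_keys + 1)]   # Zipfian, alpha = 1
accesses = rng.choices(range(num_keys), weights=weights, k=200_000)

cache = LRU(capacity=num_keys // 3)   # roughly a third of the keyspace fits
for key in accesses:
    cache.get(key)
print(f"LRU hit rate: {cache.hits / cache.lookups:.1%}")
```

Every key outside the cache's capacity produces a compulsory miss the next time it recurs, no matter how clever the eviction order is — which is exactly the ceiling a reactive cache cannot escape.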
Cachee achieved a 99.05% L1 hit rate. The difference is not a better eviction algorithm. It is a fundamentally different approach. Cachee's predictive pre-warming engine analyzes access patterns in real time and loads keys into L1 memory before they are requested. When a user session starts, Cachee pre-loads the product catalog entries and preference objects that session is statistically likely to access, based on learned behavioral patterns. The data is already in L1 when the application asks for it. Eviction is similarly proactive: keys are removed based on predicted future access probability, not just how recently they were used. This is the fundamental difference between traditional and predictive caching — and it is the difference between a 67% and a 99% hit rate.
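Cachee has not published its prediction engine, so the following is only a toy illustration of the general idea: learn which key tends to follow which, and load the likely successor before it is requested. The learning rule and all names here are ours, not Cachee's:

```python
from collections import defaultdict

class PrefetchingCache:
    """Toy predictive cache: learns which key tends to follow which and
    pre-loads the likely successor on every access. Purely illustrative."""

    def __init__(self, fetch):
        self.fetch = fetch                                # origin/L2 loader
        self.store = {}
        self.follows = defaultdict(lambda: defaultdict(int))
        self.last_key = None
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = self.fetch(key)
        if self.last_key is not None:
            self.follows[self.last_key][key] += 1         # learn the transition
            counts = self.follows[key]
            likely = max(counts, key=counts.get, default=None)
            if likely is not None and likely not in self.store:
                self.store[likely] = self.fetch(likely)   # pre-warm before it is asked for
        self.last_key = key
        return self.store[key]

cache = PrefetchingCache(fetch=lambda k: f"value:{k}")
for _ in range(5):                     # train: session:start -> catalog:home
    cache.get("session:start")
    cache.get("catalog:home")

del cache.store["catalog:home"]        # simulate eviction of the hot follower
cache.get("session:start")             # pre-warms catalog:home as a side effect
misses_before = cache.misses
cache.get("catalog:home")              # served from memory: no origin fetch
print("prefetch saved a miss:", cache.misses == misses_before)
```

Even this crude first-order model turns what would have been a miss into a hit; a production engine that models sessions rather than single transitions can push the hit rate far higher.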
The Insight Nobody Talks About
DragonflyDB is an excellent piece of engineering. It solves a real problem: Redis's single-threaded architecture creates a throughput ceiling that forces teams into complex clustering setups with all the operational overhead that entails. DragonflyDB removes that ceiling with a clean, multi-threaded, RESP-compatible reimplementation. If your bottleneck is "Redis cannot process enough commands per second on a single node," DragonflyDB is a compelling answer.
But DragonflyDB is a faster Redis. It is the same architectural category — a remote, network-bound key-value store. It solves the throughput problem but does not touch the latency problem (the TCP round-trip is identical) or the hit rate problem (it uses the same reactive eviction strategies). Your application still sends every read over the network. Your cache still only contains data that was previously requested. The two problems that dominate production cache performance — network latency and cache misses — remain exactly where they were.
Cachee is not competing with DragonflyDB or Redis. It sits in front of both. Cachee's L1 tier intercepts reads in the application process, serves hits from local memory in 1.5µs, and forwards misses to whatever backing store you already run — Redis, DragonflyDB, Memcached, or any RESP-compatible server. The backing store handles writes, persistence, and cold reads. The L1 tier handles the 99% of reads that are hot. This is not a replacement for your cache server. It is a layer that makes your cache server irrelevant for the hot path.
When to Use What
Use DragonflyDB when:
- You need >100K ops/sec on a single node and cannot afford the operational complexity of Redis Cluster
- Your workload is write-heavy or pub/sub-heavy and the single-threaded event loop is the proven bottleneck
- You want a drop-in Redis replacement with no application code changes and better vertical scalability
- You need the raw throughput for batch processing, ETL pipelines, or high-volume ingestion
Use Redis when:
- You need the full Redis ecosystem — Streams, Lua scripting, RedisSearch, RedisJSON, RedisTimeSeries, and the module API
- You need Redis Sentinel or Redis Cluster for proven, battle-tested high availability
- Your workload fits within the single-threaded throughput ceiling (~100K ops/sec) and operational familiarity matters more than raw performance
- You rely on Redis-specific behaviors (transaction semantics, MULTI/EXEC guarantees, Lua atomicity) that may differ in alternative implementations
Use Cachee — always, on top of whichever you choose:
- Cachee is not an either/or with Redis or DragonflyDB. It is a layer that sits in front of both
- Point Cachee at your existing Redis or DragonflyDB instance. The L1 tier absorbs 99% of reads at 1.5µs latency. The remaining 1% falls through to your backing store at whatever latency it provides
- You get predictive pre-warming, microsecond-scale reads, and a 99% hit rate regardless of whether the backing store is Redis, DragonflyDB, or Memcached
- Integration is a two-line config change — Cachee speaks native RESP protocol
| Capability | Redis | DragonflyDB | Cachee |
|---|---|---|---|
| Architecture | Remote, single-threaded | Remote, multi-threaded | In-process L1 + remote L2 |
| Read Latency (P99) | 3.0 ms | 1.5 ms | 0.004 ms |
| Throughput | 100K ops/sec | 800K ops/sec | 660K ops/sec |
| Hit Rate | 67% | 71% | 99.05% |
| Predictive Pre-warming | No | No | Yes |
| Network Hop Required | Yes | Yes | No (L1 reads) |
| Works With Redis/Dragonfly | — | — | Yes (as L2) |
Further Reading
- Cachee vs. Redis vs. Memcached — Full Comparison
- Cachee vs. Redis — Detailed Breakdown
- Cachee vs. DragonflyDB
- Predictive Caching: How AI Pre-Warming Works
- Traditional vs. Predictive Caching
- How to Reduce Redis Latency in Production
- Redis Optimization Tools Compared
- Cachee Performance Benchmarks
The Cache Engine Matters Less Than the Cache Strategy.
Redis or DragonflyDB — it does not matter if 99% of your reads never reach either one. See how Cachee's L1 tier eliminates network latency entirely.
Start Free Trial | Schedule Demo