Redis 8 shipped with semantic caching, overhauled I/O threads, and RDMA support. Valkey 8.1 countered with multi-core improvements and better memory efficiency. Both claim roughly 1 million operations per second on modern hardware. Marketing materials on both sides read like arms-race brochures. We took identical c7g.xlarge instances, ran the same 80/20 Zipfian workload for 48 hours, and measured everything. The results are illuminating — but probably not for the reason you expect. The performance gap between Redis 8 and Valkey 8.1 is small. The performance gap between either of them and an in-process L1 cache is enormous.
What’s New in Redis 8
Redis 8 is the most significant major release since Redis 7 introduced Functions. The headline feature is LangCache — a built-in semantic caching layer designed for LLM and RAG workloads. LangCache stores vector embeddings alongside cached values, enabling similarity-based lookups instead of exact key matches. If a user asks “What is the refund policy?” and someone previously asked “How do I get a refund?”, LangCache can return the cached response without hitting the upstream model. For teams running inference pipelines, this is a genuinely useful feature that previously required cobbling together Redis with a separate vector database.
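The similarity-lookup idea behind LangCache can be sketched in a few lines. To be clear, this is not LangCache's actual API: the `SemanticCache` class, the threshold value, and the toy three-dimensional vectors below are illustrative stand-ins for real model embeddings, shown only to make the mechanism concrete.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Toy semantic cache: stores (embedding, value) pairs and answers
    lookups by nearest-neighbor similarity instead of exact key match."""
    def __init__(self, threshold=0.85):
        self.threshold = threshold
        self.entries = []  # list of (embedding, value)

    def set(self, embedding, value):
        self.entries.append((embedding, value))

    def get(self, embedding):
        best, best_sim = None, 0.0
        for emb, value in self.entries:
            sim = cosine(embedding, emb)
            if sim > best_sim:
                best, best_sim = value, sim
        return best if best_sim >= self.threshold else None

cache = SemanticCache(threshold=0.85)
cache.set([1.0, 0.0, 1.0], "Refunds are processed within 5 days.")
# An embedding close to the stored one (a paraphrased question) still hits:
hit = cache.get([0.9, 0.1, 1.0])  # -> "Refunds are processed within 5 days."
```

In a real deployment the embeddings come from a model, the similarity search runs over a vector index rather than a linear scan, and the threshold becomes a tuning knob: too low and you serve wrong answers, too high and you lose the cache hits that make the feature worthwhile.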
Beyond LangCache, Redis 8 brings improved I/O threading. Previous versions offloaded read/write syscalls to I/O threads but still serialized command execution on the main thread. Redis 8 extends threading further into the pipeline, allowing certain read-only commands to execute without main-thread contention. RESP3 gets better client-side caching support with more granular invalidation messages. Active-Active geo-replication sees conflict resolution improvements for CRDTs. The elephant in the room remains the license: Redis 8 ships under the RSALv2 / SSPLv1 dual license, which prohibits offering Redis as a managed service without a commercial agreement. For most application developers this is irrelevant, but it is the reason Valkey exists in the first place.
What’s New in Valkey 8.1
Valkey 8.1 is the Linux Foundation’s answer to Redis 8, and it takes a different approach to the performance problem. Where Redis 8 added a semantic caching layer, Valkey 8.1 focused on making the core engine faster. The standout improvement is RDMA (Remote Direct Memory Access) support, which bypasses the kernel networking stack entirely and allows cache operations over InfiniBand or RoCE fabrics. In data center environments with RDMA-capable NICs, this cuts network latency from the typical 100–200 microseconds of TCP to under 10 microseconds per round-trip.
Valkey 8.1 also delivers multi-core GET/SET optimization. While Valkey remains fundamentally single-threaded for command execution (same as Redis), version 8.1 parallelizes more of the surrounding work: connection handling, output buffer management, and memory allocation. The result is higher throughput under heavy connection counts. Memory defragmentation has been reworked to run more aggressively in the background without stalling client requests — Valkey’s internal benchmarks show 8–12% lower RSS after 24 hours of sustained load compared to Redis 8 under identical workloads. Dual-channel replication separates the replication backlog from the main data stream, reducing replica lag during write bursts. And the license: BSD-3-Clause, no usage restrictions, backed by the Linux Foundation with contributors from AWS, Google, Oracle, Ericsson, and Snap.
The Benchmark
Setup
We wanted to eliminate as many variables as possible. Both tests ran on AWS c7g.xlarge instances (4 vCPUs, 8 GiB RAM, Graviton3) in the same availability zone, same VPC, same security group. The benchmark client ran on a separate c7g.2xlarge in the same placement group to minimize network variance. We used memtier_benchmark (latest) with identical parameters: 80/20 read/write ratio, Zipfian key distribution (alpha=0.99, simulating realistic hot-key skew), 128-byte values, 50 concurrent connections, 4 threads. Each test ran for 48 hours continuously. We collected throughput, latency percentiles, memory usage, and CPU utilization at 1-second granularity.
Both Redis 8 and Valkey 8.1 ran with default configurations, with one exception: we enabled I/O threads (io-threads 4) on both, since both now support it and it reflects how teams actually deploy in production. Persistence was disabled (save "") to isolate pure in-memory performance — this is a caching benchmark, not a durability benchmark. We pre-loaded 5 million keys before starting the measurement window.
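The two non-default settings from the setup above can be expressed as a config fragment. Both directives are accepted under the same names by Redis 8 and Valkey 8.1, since Valkey retains Redis-compatible configuration:

```
# Shared settings for both engines; everything else left at defaults
io-threads 4   # enable I/O threading, matching common production deployments
save ""        # disable RDB snapshots: caching benchmark, not a durability test
```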
Results
| Metric | Redis 8 | Valkey 8.1 | Delta (winner) |
|---|---|---|---|
| Throughput (ops/sec) | 947,382 | 962,114 | +1.6% Valkey |
| P50 latency (GET) | 0.143 ms | 0.139 ms | -2.8% Valkey |
| P99 latency (GET) | 0.287 ms | 0.298 ms | +3.8% Redis |
| P99.9 latency (GET) | 0.511 ms | 0.492 ms | -3.7% Valkey |
| P50 latency (SET) | 0.151 ms | 0.154 ms | +2.0% Redis |
| P99 latency (SET) | 0.312 ms | 0.301 ms | -3.5% Valkey |
| RSS after 48h (MB) | 1,847 | 1,698 | -8.1% Valkey |
| CPU utilization (avg) | 72.3% | 68.9% | -4.7% Valkey |
| Throughput variance | ±2.1% | ±1.8% | Comparable |
Let us be honest about these numbers: they are functionally identical. Valkey 8.1 edges out Redis 8 on throughput by 1.6%. Redis 8 wins P99 GET latency by 3.8%. Valkey takes P99.9 GET by 3.7%. They trade blows on SET latency. Run this test on a different day, different instance, different AZ, and the winner column shuffles. The only clear, consistent differentiator is memory efficiency: Valkey 8.1 used 8.1% less RSS after 48 hours, which aligns with its reworked defragmentation engine. That is real, reproducible, and meaningful if you are running at memory limits. Everything else is within the margin of noise.
The Feature That Matters More Than Performance
Here is the uncomfortable truth that neither the Redis nor the Valkey marketing page will tell you: both are still network-bound. A Redis 8 GET takes roughly 140 microseconds. A Valkey 8.1 GET takes roughly 139 microseconds. The difference is 1 microsecond. An in-process L1 cache lookup takes about 0.2 microseconds, roughly 700x faster than either. That is not a 5% improvement; it is nearly three orders of magnitude off the cost of a cache hit. The debate over which remote cache is 3% faster is like arguing about whether a bicycle or a scooter is faster when there is a jet on the runway.
Both Redis 8 and Valkey 8.1 still use reactive eviction. A key expires, requests miss, the key gets repopulated. During that gap, every request pays the full miss penalty: network round-trip to the cache, miss detection, round-trip to the origin, serialization, write-back to the cache. With predictive caching, there is no gap. The L1 layer learns access patterns and pre-warms data before it is requested. Miss rate drops below 1%. The entire category of problems that benchmarks measure — GET latency, SET latency, throughput under contention — becomes irrelevant when 99% of your reads never leave the application process. This is the fundamental architectural difference between traditional caching and what an L1 layer provides.
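A minimal sketch of the predictive idea, assuming an `origin` callable that stands in for the remote cache or database. The class name, hot-key threshold, and refresh loop are illustrative inventions, not Cachee's implementation; a production L1 layer would use smarter access-pattern models and background scheduling.

```python
import time

class PredictiveL1:
    """Toy in-process L1 cache: tracks access counts and refreshes hot keys
    from the origin before they expire, so hot reads rarely see a miss."""
    def __init__(self, origin, ttl=1.0, hot_threshold=3):
        self.origin = origin            # callable: key -> fresh value
        self.ttl = ttl                  # seconds a cached entry stays valid
        self.hot_threshold = hot_threshold
        self.store = {}                 # key -> (value, expires_at)
        self.hits = {}                  # key -> access count

    def get(self, key):
        self.hits[key] = self.hits.get(key, 0) + 1
        entry = self.store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]             # in-process hit: no network round-trip
        value = self.origin(key)        # miss: fall through to the origin
        self.store[key] = (value, time.monotonic() + self.ttl)
        return value

    def prewarm(self):
        """Run periodically: re-fetch keys whose access count marks them hot,
        so their TTL never lapses from a reader's point of view."""
        now = time.monotonic()
        for key, count in self.hits.items():
            if count >= self.hot_threshold:
                self.store[key] = (self.origin(key), now + self.ttl)
```

The point of the sketch is the shape of the solution: the miss penalty is paid by a background refresh, not by a request sitting on the hot path.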
Decision Framework
If the benchmark results are a wash, the decision between Redis 8 and Valkey 8.1 comes down to features and philosophy, not performance. Here is the practical framework:
Choose Redis 8 if: You need LangCache semantic caching for LLM/RAG workloads and want it built-in rather than assembled from separate components. You are already on Redis Enterprise or Redis Cloud and want the upgrade path. You are not building a managed service (license restriction does not apply to application use). You need Active-Active geo-replication with CRDT conflict resolution. Compare against alternatives on our Redis comparison page.
Choose Valkey 8.1 if: You want a fully open-source, BSD-3 licensed cache with no usage restrictions. You are running on AWS and want native ElastiCache/MemoryDB integration (AWS has migrated these services to Valkey). You have RDMA-capable infrastructure and want kernel-bypass networking. You are running at memory limits and the 8% RSS improvement matters. Linux Foundation governance gives you confidence in long-term neutrality. Compare against alternatives on our Valkey comparison page.
Choose Cachee on top of either: The version of your remote cache matters far less than whether you are using an L1 layer in front of it. Cachee deploys as an SDK or sidecar on top of Redis, Valkey, Memcached, DynamoDB, or any backing store. It adds sub-microsecond in-process lookups, predictive pre-warming, and automatic invalidation. The backing store becomes your system of record. The hot path never leaves your process. You can switch from Redis to Valkey (or back) without changing a single line of application code, because Cachee abstracts the backing store entirely.
What We Would Actually Spend Time On
If we had a team of engineers and a week to improve cache performance, we would not spend it migrating from Redis to Valkey or vice versa. The 1.6% throughput difference does not justify the migration risk, the testing burden, or the operational toil. Instead, we would spend that week on three things that deliver orders-of-magnitude improvement:
- Add an L1 layer. One integration, and cache hits resolve in-process at sub-microsecond latency instead of paying a ~140-microsecond network round-trip: roughly 700x faster per read. This is the highest-leverage change available. Read our guide to reducing Redis latency for implementation details.
- Profile serialization costs. Identify the top 10 cached objects by size. If any exceed 100KB, consider caching pre-shaped views or using a binary format. This often yields a 2–5x improvement on specific endpoints.
- Implement predictive pre-warming. Replace TTL-based expiration with access-pattern-driven refresh. Cache misses drop below 1%, which eliminates stampedes, miss penalties, and the latency spikes that make P99 charts ugly. Our traditional vs. predictive caching comparison explains the difference.
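The serialization-profiling step is easy to script. A sketch of the idea, with JSON standing in for whatever serializer the application actually uses (the function name and the 100KB limit are assumptions from the bullet above, not a real tool):

```python
import json

def profile_cache_values(values, limit_bytes=100 * 1024, top_n=10):
    """Rank cached objects by serialized payload size and flag oversized ones.
    `values` maps cache key -> Python object."""
    sized = []
    for key, obj in values.items():
        payload = json.dumps(obj).encode("utf-8")
        sized.append((len(payload), key))
    sized.sort(reverse=True)  # largest payloads first
    # Each row: (key, serialized bytes, exceeds-limit flag)
    return [(key, size, size > limit_bytes) for size, key in sized[:top_n]]

report = profile_cache_values({
    "user:42:feed": {"items": ["x"] * 50},
    "session:9":    {"token": "abc"},
})
```

Anything flagged here is a candidate for caching a pre-shaped view (only the fields the endpoint actually renders) or switching to a binary format.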
None of these depend on whether you are running Redis 8 or Valkey 8.1. They work on both. They work on Redis 7. They work on Valkey 7.2. The version of your remote cache is a footnote. The architecture of your caching layer is the chapter.
Further Reading
- Compare Cachee to Other Caching Solutions
- Cachee vs Redis
- Cachee vs Valkey
- Predictive Caching: How AI Pre-Warming Works
- Traditional vs Predictive Caching
- How to Reduce Redis Latency in Production
- Cachee Performance Benchmarks
Also Read
The Version of Redis Matters Less Than Whether You’re Using L1.
Add sub-microsecond in-process lookups on top of Redis or Valkey. Same backing store, orders-of-magnitude faster reads on cache hits.