
MVCC for Caches: Zero-Contention Reads at 96-Worker Scale

DashMap is the best concurrent hash map in the Rust ecosystem. Its sharded architecture delivers sub-microsecond reads under heavy concurrency, and it is the foundation of Cachee's in-process cache engine. But at 96 workers on Graviton4 with a 30% write ratio, we measured something that benchmarks with pure-read workloads never reveal: same-shard contention adding 1–3 microseconds of P99 jitter. For most workloads, this is invisible. For HFT, ML feature stores, and real-time pricing engines, it is the difference between acceptable and unacceptable. We built MVCC into the cache engine to eliminate it.

The Ceiling You Don't See in Read-Only Benchmarks

DashMap divides keys into shards, each protected by a read-write lock. Multiple readers can hold the same shard lock concurrently, but a writer requires exclusive access. This is excellent engineering: for uniformly hashed keys, it cuts contention by a factor equal to the number of shards. At 64 shards (the count used throughout this post; upstream DashMap derives its default from the available CPU count), the probability that two independent operations land on the same shard is roughly 1 in 64.

The problem is that probability compounds with worker count. At 96 workers performing tight loops of reads and writes (the FHE/NTT batch pipeline, for example), the expected number of same-shard collisions per second is not negligible. It is statistically frequent. A reader that arrives during the ~13µs window of a concurrent write to the same shard will block until the write completes. The P50 is unaffected — most reads hit uncontended shards. But the P99 tells a different story.
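To make the compounding concrete, here is a minimal back-of-the-envelope sketch. It assumes keys hash uniformly and independently across shards, and it treats the number of writes in flight at any instant as an input you would have to measure; the function name is hypothetical and not part of DashMap or Cachee.

```rust
/// Probability that a read lands on a shard that at least one in-flight
/// write currently holds, assuming uniform and independent key hashing.
/// `writers_in_flight` is an assumed input, not a measured value.
pub fn read_blocked_probability(shards: u32, writers_in_flight: u32) -> f64 {
    // Chance that one particular write misses the reader's shard.
    let miss_one = 1.0 - 1.0 / shards as f64;
    // Chance that every in-flight write misses it, then the complement.
    1.0 - miss_one.powi(writers_in_flight as i32)
}
```

With a single write in flight the result is the familiar 1 in 64. It grows quickly as more writes overlap, which is why the tail moves while the median does not.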

We measured this on a c8g.metal-48xl (192 vCPUs, Graviton4) running 96 workers with a workload mix of 70% reads and 30% writes:

A 2.5µs P99 increase sounds trivial until you consider the workloads where it matters. An HFT system with a 10µs tick-to-trade budget just lost 25% of its latency budget to lock contention in the cache. An ML inference pipeline reading features while a streaming ingestion pipeline writes them sees unpredictable jitter in a path that is supposed to be deterministic. A pricing engine reading prices for order validation while market feeds write continuous updates gets occasional stalls on the read path.

The core insight: Low contention is not zero contention. DashMap's sharded locking reduces contention by 64x. MVCC eliminates it. For workloads where microseconds are P&L, the difference between “reduced” and “eliminated” is the entire product decision.

How MVCC Eliminates Read-Path Contention

Multi-Version Concurrency Control borrows a technique from database engines (PostgreSQL, MySQL/InnoDB, Oracle) and applies it to in-process cache reads. The core idea is simple: instead of overwriting a value in place (which requires a write lock that blocks readers), each write creates a new version of the value. Readers see a consistent snapshot at their read timestamp. No lock required.

The implementation has three components:

Version Chains

Each key maintains a linked chain of versions, ordered newest to oldest. When a writer updates a key, it allocates a new version struct (value + timestamp + epoch = 24 bytes of overhead), sets the value, and atomically swaps the head pointer to the new version. The previous version remains accessible to any reader that started before the write.
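The write path described above can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not Cachee's implementation: the type and field names are hypothetical, the timestamp field is omitted, and old nodes are only freed when the whole chain is dropped, so the sketch sidesteps reclamation entirely.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// One entry in a key's version chain (hypothetical layout).
struct Version<V> {
    value: V,
    epoch: u64,
    next: *mut Version<V>, // next-older version, or null at the tail
}

/// Newest-to-oldest chain. Nodes are freed only in Drop, so a reader can
/// never follow a dangling pointer in this sketch. A production version
/// would also bound V: Send + Sync.
pub struct VersionChain<V> {
    head: AtomicPtr<Version<V>>,
}

impl<V> VersionChain<V> {
    pub fn new() -> Self {
        Self { head: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Publish a new version: link it to the current head, then swing the
    /// head pointer with a compare-and-swap. No lock is taken.
    pub fn push(&self, value: V, epoch: u64) {
        let node = Box::into_raw(Box::new(Version { value, epoch, next: ptr::null_mut() }));
        let mut cur = self.head.load(Ordering::Acquire);
        loop {
            // The node is not yet shared, so writing to it is safe.
            unsafe { (*node).next = cur; }
            match self.head.compare_exchange_weak(cur, node, Ordering::AcqRel, Ordering::Acquire) {
                Ok(_) => return,
                Err(actual) => cur = actual, // another writer won; retry
            }
        }
    }

    /// Epochs in chain order (newest first), for inspection.
    pub fn epochs(&self) -> Vec<u64> {
        let mut out = Vec::new();
        let mut cur = self.head.load(Ordering::Acquire);
        while !cur.is_null() {
            unsafe {
                out.push((*cur).epoch);
                cur = (*cur).next;
            }
        }
        out
    }

    /// The most recently published value, if any.
    pub fn newest(&self) -> Option<&V> {
        let cur = self.head.load(Ordering::Acquire);
        if cur.is_null() { None } else { unsafe { Some(&(*cur).value) } }
    }
}

impl<V> Drop for VersionChain<V> {
    fn drop(&mut self) {
        // Exclusive access here: walk the chain and free every node.
        let mut cur = *self.head.get_mut();
        while !cur.is_null() {
            let node = unsafe { Box::from_raw(cur) };
            cur = node.next;
        }
    }
}
```

The point of the design is that a writer only ever touches the head pointer. Everything behind it is immutable once published, which is what lets readers traverse without coordination.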

Epoch-Based Reads

When a reader begins, it captures the current global epoch (a single atomic load — one CPU instruction on ARM). It then traverses the version chain and returns the most recent version whose epoch is less than or equal to the reader's epoch. This guarantees a consistent snapshot: the reader sees the state of the cache as it existed at the moment it started reading, regardless of any concurrent writes.
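The visibility rule is small enough to state as code. In this sketch the chain is modeled as a plain slice of (epoch, value) pairs ordered newest to oldest, which is a simplification for illustration; `GLOBAL_EPOCH`, `begin_read`, and `read_at` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Global epoch counter (hypothetical name). Readers sample it on entry.
pub static GLOBAL_EPOCH: AtomicU64 = AtomicU64::new(0);

/// A reader's snapshot point: a single atomic load.
pub fn begin_read() -> u64 {
    GLOBAL_EPOCH.load(Ordering::Acquire)
}

/// Given a chain ordered newest to oldest, return the most recent value
/// whose epoch is less than or equal to the reader's epoch.
pub fn read_at<'a, V>(chain: &'a [(u64, V)], reader_epoch: u64) -> Option<&'a V> {
    chain
        .iter()
        .find(|(epoch, _)| *epoch <= reader_epoch)
        .map(|(_, value)| value)
}
```

For a chain with versions at epochs 7, 5, and 2, a reader that started at epoch 6 gets the epoch-5 version even if the epoch-7 write has already been published. A reader at epoch 1 sees nothing, because the key did not exist yet from its point of view.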

The read path is completely lock-free. Not “mostly uncontended” like DashMap (whose readers take a shared shard lock and stall whenever a concurrent writer holds it). Unconditionally lock-free. A reader never waits on any writer, on any shard, under any level of concurrency.

Epoch-Based Garbage Collection

Old versions cannot live forever. A background GC thread runs every 100µs (configurable) and scans version chains for versions that are no longer visible to any active reader. A version becomes reclaimable once a newer version of the same key has an epoch less than or equal to the epoch of every active reader, because no reader can select the older version after that point. This also means the newest version of a key always survives. The GC is non-blocking: it operates on a separate thread and never pauses the read or write path.
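One way to express the reclamation rule, as a sketch rather than the engine's actual GC: find the newest version that the oldest active reader can still see, keep it and everything newer, and drop the rest. The chain is again modeled as a newest-first vector, and both function names are hypothetical.

```rust
/// Oldest epoch any active reader holds. With no readers active, fall back
/// to the current global epoch.
pub fn min_active_epoch(active_readers: &[u64], global_epoch: u64) -> u64 {
    active_readers.iter().copied().min().unwrap_or(global_epoch)
}

/// Drop versions that no active reader can still observe.
/// `chain` is ordered newest to oldest. Returns how many were reclaimed.
pub fn gc_chain<V>(chain: &mut Vec<(u64, V)>, min_active: u64) -> usize {
    // The first version with epoch <= min_active is what the oldest reader
    // sees. Anything older than that is unreachable for every reader.
    let keep = match chain.iter().position(|(epoch, _)| *epoch <= min_active) {
        Some(index) => index + 1,
        None => chain.len(), // every version is newer than the oldest reader
    };
    let reclaimed = chain.len() - keep;
    chain.truncate(keep);
    reclaimed
}
```

With versions at epochs 9, 6, and 3 and the oldest reader at epoch 7, only the epoch-3 version is dropped. The epoch-6 version stays because that reader still resolves to it.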

Under sustained load, versions are GC'd within 100–500µs of becoming unreachable. Memory overhead stays bounded at approximately 24 bytes per version per key. With the default of 2 versions per key and 10 million keys, the total version overhead is ~480 MB.
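The arithmetic behind that figure is easy to rerun for your own key count. This helper is a hypothetical illustration that only multiplies the three numbers quoted above; it counts version metadata, not the values themselves.

```rust
/// Total version metadata in bytes: keys x versions per key x bytes per version.
pub fn version_overhead_bytes(keys: u64, versions_per_key: u64, bytes_per_version: u64) -> u64 {
    keys * versions_per_key * bytes_per_version
}

/// Same figure in decimal megabytes (1 MB = 1,000,000 bytes).
pub fn version_overhead_mb(keys: u64, versions_per_key: u64, bytes_per_version: u64) -> u64 {
    version_overhead_bytes(keys, versions_per_key, bytes_per_version) / 1_000_000
}
```

Plugging in 10 million keys, 2 versions per key, and 24 bytes per version gives 480,000,000 bytes, matching the ~480 MB above.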

Before and After: The Numbers

Same hardware (c8g.metal-48xl), same worker count (96), same workload mix (70/30 read/write):

The P50 is unchanged because the common case (uncontended reads) was already fast. The P99 drops by 55% because the uncommon-but-critical case (reads that collide with writes on the same shard) is eliminated entirely. The write latency increase of 0.001ms (1µs) is the cost of allocating a 24-byte version struct and performing one atomic CAS. For write-latency-sensitive workloads, this is invisible.

Who Needs This

MVCC is not for every workload. If your write ratio is below 5%, DashMap contention is negligible and MVCC adds memory overhead for no measurable benefit. Enable MVCC when P99 jitter under concurrent writes is a measured problem, not a theoretical concern.

The workloads where it matters most:

The decision rule: Measure your P99 read latency under your actual read/write ratio at your actual worker count. If it is materially higher than your P50 and the gap widens as the write ratio rises, lock contention is the likely cause. CONFIG SET mvcc.enabled true eliminates it.
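If you already collect per-read latencies, the comparison takes a few lines. The sketch below uses the nearest-rank percentile method on samples in nanoseconds; the function names are hypothetical, and what counts as "materially higher" is left to you.

```rust
/// Nearest-rank percentile over latency samples. `pct` is a whole percent
/// from 0 to 100. Returns None for empty input or an out-of-range percent.
pub fn percentile(samples: &[u64], pct: usize) -> Option<u64> {
    if samples.is_empty() || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // ceil(pct / 100 * n), computed in integers to avoid float rounding.
    let rank = (pct * sorted.len() + 99) / 100;
    Some(sorted[rank.max(1) - 1])
}

/// Ratio of P99 to P50. A large ratio under a mixed read/write load is the
/// signal the decision rule asks you to look for.
pub fn tail_ratio(samples: &[u64]) -> Option<f64> {
    let p50 = percentile(samples, 50)?;
    let p99 = percentile(samples, 99)?;
    if p50 == 0 {
        return None;
    }
    Some(p99 as f64 / p50 as f64)
}
```

Run it once on a read-only workload and once at your real write ratio. If the ratio only blows up in the second run, writes are what is stretching the tail.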

Configuration

MVCC is transparent to the client. No API changes, no code changes. Three config parameters:

CONFIG SET mvcc.enabled true          # Enable MVCC
CONFIG SET mvcc.max_versions 2        # Versions retained per key (default: 2)
CONFIG SET mvcc.gc_interval_us 100    # GC scan interval in microseconds (default: 100)

All three are changeable at runtime. Enabling MVCC does not require a restart or data migration. Disabling it collapses all version chains back to single versions during the next GC cycle.
