Metal Benchmark Updated — April 2, 2026

Cachee Engine: 31ns L0 GET, 32M ops/sec

Native Rust cache engine with an L0 hot cache, Cachee-FLU eviction, and four protocol interfaces. Latencies measured with metal_bench using pre-allocated keys; production context comes from a Graviton4 c8g.metal-48xl.

  • 31 ns: L0 GET (M4 Max), warmed L0 hot cache
  • 59 ns: DashMap GET (Graviton4 c8g.metal-48xl, no L0)
  • 32M: single-thread ops/sec (M4 Max, L0 warm)
  • 99%+: L0 hit rate, self-promoting on GET

Cachee vs Every Major Cache

Cache                        Language   GET Latency   ops/sec (1T)   vs Cachee
Cachee L0 (M4 Max)           Rust       31 ns         32M            baseline
Cachee DashMap (Graviton4)   Rust       59 ns         ~17M           no L0
Moka                         Rust       ~50 ns        ~20M           1.6x slower
Caffeine                     Java       ~65 ns        ~15M           2.1x slower
Stretto                      Go/Rust    ~80 ns        ~12M           2.6x slower
Ristretto                    Go         ~125 ns       ~10M           4x slower
Guava Cache                  Java       ~150 ns       ~7M            4.8x slower
Hazelcast Near               Java       ~300 ns       ~6M            9.7x slower
Dragonfly                    C++        ~400 ns       ~3M            12.9x slower
Redis                        C          ~500 ns       ~2M            16x slower
ElastiCache                  Managed    ~339,000 ns   ~150K          10,935x slower

Why 31ns Is Possible

L0 is a 64-shard RwLock<HashMap<u64, Bytes>> hot cache in front of DashMap. A GET decomposes as xxh3_64 hash (~2ns) + shard select (~1ns) + read lock (~3ns) + HashMap lookup (~20ns) + Bytes clone (~3ns) + guard drop (~2ns), roughly 31ns budgeted against 32.1ns measured. SET is intentionally ~17x slower (548ns) because all eviction intelligence (Cachee-FLU admission, Count-Min Sketch, SegLRU) runs on the write path, keeping reads at ~32ns.
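
As a concrete illustration of that read path, here is a minimal sketch of a 64-shard L0 in Rust. The crate choices (xxhash-rust, parking_lot, bytes) and every name in it are assumptions for the sketch, not Cachee's actual internals:

```rust
// Minimal L0 sketch: 64 shards of RwLock<HashMap<u64, Bytes>>.
// Illustrative only; the real engine differs in detail.
use bytes::Bytes;
use parking_lot::RwLock;
use std::collections::HashMap;
use xxhash_rust::xxh3::xxh3_64;

const SHARDS: usize = 64;

pub struct L0Cache {
    shards: Vec<RwLock<HashMap<u64, Bytes>>>,
}

impl L0Cache {
    pub fn new() -> Self {
        Self {
            shards: (0..SHARDS).map(|_| RwLock::new(HashMap::new())).collect(),
        }
    }

    /// Read path: hash, pick a shard, take a read lock, look up, clone.
    pub fn get(&self, key: &[u8]) -> Option<Bytes> {
        let h = xxh3_64(key);                                // ~2ns: xxh3_64 hash
        let shard = &self.shards[h as usize & (SHARDS - 1)]; // ~1ns: shard select
        let guard = shard.read();                            // ~3ns: uncontended read lock
        guard.get(&h).cloned()                               // ~20ns lookup + ~3ns refcount clone
    }                                                        // ~2ns: guard drop

    /// Write path (greatly simplified): the real engine runs admission,
    /// Count-Min Sketch updates, and eviction here, which is why SET is
    /// deliberately slower than GET.
    pub fn put(&self, key: &[u8], value: Bytes) {
        let h = xxh3_64(key);
        self.shards[h as usize & (SHARDS - 1)].write().insert(h, value);
    }
}
```

Keying the shard map by the 64-bit digest rather than the raw key keeps entries small; note that std's HashMap would re-hash that u64 with SipHash by default, so a real engine would likely plug in an identity hasher.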

Production Architecture

Two-Tier Cache Stack

  • L0/L1: Cachee-FLU Engine. L0 hot cache + DashMap + atomic Count-Min Sketch. 31ns L0 GET, 64-shard RwLock, 128-shard DashMap, cached clock, Cachee-FLU eviction.
  • L2: ElastiCache Redis 7.1 on cache.r7g.12xlarge (317GB RAM, 48 vCPU). Sub-1ms latency, circuit-breaker protected; the L1-miss read-through path is sketched below.
  • Compute: Graviton4 c8g.16xlarge (64 vCPU). Docker container, us-east-1, ARM64.
  • Dashboard: cachee.ai/admin/dashboard.html. Live metrics, click-through detail view, 10s auto-refresh.
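
To make the two-tier flow concrete, here is a hedged sketch of a read: try the native L1 first, fall back to L2 Redis on a miss, then promote the value. `L0Cache` is the sketch from the previous section; the redis-crate calls are real API, but the flow itself is an illustration and omits the production circuit breaker:

```rust
use bytes::Bytes;

/// Two-tier read: L1 (native, ~32ns warm hit) then L2 (Redis, sub-1ms).
/// Sketch only; assumes the L0Cache type from the earlier example.
fn get_two_tier(
    l1: &L0Cache,
    redis: &mut redis::Connection,
    key: &[u8],
) -> redis::RedisResult<Option<Bytes>> {
    // 1. L1 first: no network, no syscalls.
    if let Some(v) = l1.get(key) {
        return Ok(Some(v));
    }
    // 2. L1 miss: ask L2 Redis. In production this call sits behind a
    //    circuit breaker so a Redis brownout cannot stall the hot path.
    let fetched: Option<Vec<u8>> = redis::cmd("GET").arg(key).query(redis)?;
    // 3. Promote into L1 so the next GET on this key is a warm hit.
    Ok(fetched.map(|v| {
        let b = Bytes::from(v);
        l1.put(key, b.clone());
        b
    }))
}
```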

Engine Evolution: v1.0 → v2.0 → v3.0 → v4.3

Metric              v1.0 (Redis Proxy)    v2.0 (JS L1)            v3.0 (NAPI L1)      v4.3 (Native Engine)
L1 Hit Latency      N/A (no L1)           0.0085ms                0.0145ms / 4.65µs   0.0015ms / 31ns
P99 Latency         N/A                   ~30µs                   ~31ns               3.7µs
L2 Hit Latency      0.55ms                0.55ms                  0.55ms              0.55ms (same Redis)
L1 Hit Rate         0% (no L1)            85%                     100% (warm set)     100% (production)
GET Throughput      N/A                   ~100K ops/s             215K ops/s          660K+ ops/s
Horizontal Scaling  N/A                   N/A                     Single instance     Pub/sub cluster
Engine              None (pass-through)   JS L1 Cache (Node.js)   NAPI L1 Cache       Native Cachee Engine + DashMap

Capacity Planning — Redis Memory Analysis

ElastiCache r7g.12xlarge — Memory Breakdown
Total Allocated: 76.10 MB
Startup Overhead (Redis engine): 9.17 MB
Allocator Fragmentation: 66.86 MB
Client Buffers (42 connections): 0.04 MB
Actual Session Data: 0 bytes (all keys expired via TTL)
Keys in Redis: 0 (DBSIZE confirmed)
Fragmentation Ratio: 1.23 (normal jemalloc)

76MB ≠ Session Data

The reported 76.33MB is entirely Redis engine overhead plus jemalloc fragmentation; zero session keys exist. At 3.5KB per session against 253GB usable (80% of 317GB), capacity is roughly 72 million sessions, so the original projection holds. A worked check follows below.
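
A quick back-of-the-envelope check of that capacity figure, using the constants above (treating GB and KB as decimal units is our assumption):

```rust
fn main() {
    let total_bytes = 317.0_f64 * 1e9;     // cache.r7g.12xlarge: 317 GB RAM
    let usable = total_bytes * 0.80;       // 80% usable, about 253.6 GB
    let session_bytes = 3.5e3;             // 3.5 KB per session
    let sessions = usable / session_bytes; // about 72.5 million sessions
    println!("capacity = {:.1}M sessions", sessions / 1e6);
}
```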

Live Test Timeline — Feb 12, 2026

02:41 UTC — Cachee v3.0 server started via PM2; native engine initialized, L2 Redis connected (76MB)
02:52 UTC — Dashboard reporter enabled, auto-sending metrics to cachee.ai every 10s
02:53 UTC — Test burst: 5 ops (2 SET, 3 GET); L1 hit rate 100%, 0.017ms avg latency
02:59 UTC — Production traffic burst: 1,007 ops, 503 keys loaded, 99%+ L1 hit rate maintained
02:59 UTC — Reporter confirmed: 1,002 ops pushed to dashboard (213 + 787 + 2 batches)
03:16 UTC — H33 card visible on cachee.ai/admin/dashboard.html: live, connected, auto-refreshing

Key Findings

What Worked

  • Native L1 engine: 1.8x faster raw GET vs JS
  • 99%+ L1 hit rate on warm working set
  • Zero errors across all test traffic
  • NAPI-RS FFI: zero-copy, no serialization overhead (binding pattern sketched after this list)
  • Dashboard reporter: seamless metrics pipeline
  • Redis capacity math validated (3.5KB/session holds)
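
For readers unfamiliar with the NAPI-RS point above, here is a minimal sketch of the binding pattern. `cache_get` and `engine_get` are hypothetical names for illustration; this is the general napi-rs shape, not Cachee's actual binding:

```rust
use napi::bindgen_prelude::*;
use napi_derive::napi;

// Hypothetical lookup into the native engine (stubbed for the sketch).
fn engine_get(_key: &[u8]) -> Option<Vec<u8>> {
    None
}

/// Exposed to Node.js as `cacheGet(key: Buffer): Buffer | null`.
#[napi]
pub fn cache_get(key: Buffer) -> Option<Buffer> {
    // `key` derefs to &[u8]: the JS Buffer's bytes are read in place,
    // with no serialization step on the way in.
    let value = engine_get(&key);
    // Buffer::from(Vec<u8>) hands the Vec to JS as an externally backed
    // buffer, avoiding a copy on the way out.
    value.map(Buffer::from)
}
```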

v4.3 Key Optimizations

  • Fast lane middleware — bypasses compression, CORS, security headers
  • Inline auth — constant-time API key check in hot path (~1µs)
  • Pre-compression — Brotli + gzip stored at write time
  • Pub/sub cache coherence — Redis channel for cross-instance invalidation
  • ETag support — 304 Not Modified for conditional requests
  • Request deduplication — concurrent identical GETs coalesced (singleflight pattern; sketched below)
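
As a rough illustration of the deduplication item, here is a singleflight-style sketch: the first caller for a key becomes the leader and fetches once, while concurrent callers for the same key wait on the leader's result. `fetch_upstream` is a hypothetical stand-in for the real lookup, and the whole sketch is an assumption about the pattern, not Cachee's code:

```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::{broadcast, Mutex};

// Keys with a fetch already in flight; followers subscribe to the
// leader's channel instead of fetching again.
type Inflight = Arc<Mutex<HashMap<String, broadcast::Sender<String>>>>;

// Hypothetical slow upstream lookup (e.g. L2 Redis or origin).
async fn fetch_upstream(key: &str) -> String {
    format!("value-for-{key}")
}

async fn get_coalesced(inflight: Inflight, key: &str) -> String {
    let tx = {
        let mut map = inflight.lock().await;
        // Follower path: a fetch for this key is already running; wait on it.
        let existing = map.get(key).map(|tx| tx.subscribe());
        if let Some(mut rx) = existing {
            drop(map); // release the lock before awaiting
            return rx.recv().await.expect("leader sends exactly once");
        }
        // Leader path: register the in-flight fetch before doing it once.
        let (tx, _rx) = broadcast::channel(1);
        map.insert(key.to_string(), tx.clone());
        tx
    };
    let value = fetch_upstream(key).await;
    inflight.lock().await.remove(key); // late arrivals start a fresh fetch
    let _ = tx.send(value.clone());    // wake every coalesced waiter
    value
}

#[tokio::main]
async fn main() {
    let inflight: Inflight = Arc::new(Mutex::new(HashMap::new()));
    // Two concurrent identical GETs share one upstream fetch.
    let (a, b) = tokio::join!(
        get_coalesced(inflight.clone(), "user:42"),
        get_coalesced(inflight.clone(), "user:42"),
    );
    assert_eq!(a, b);
}
```

Removing the key before sending guarantees any subscriber that found the entry in the map will receive the broadcast, while callers arriving after removal simply become the next leader.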