Per-Key Overhead

Every cache entry carries structural overhead beyond the key and value bytes themselves. This overhead includes hash table metadata, pointers, TTL tracking, and eviction bookkeeping. The difference between Cachee and Redis comes down to architecture.

| Component | Redis | Cachee |
|---|---|---|
| Hash table entry (bucket + pointers) | ~56 bytes (dictEntry + SDS headers) | ~40 bytes (DashMap shard entry) |
| TTL metadata | ~16 bytes (expires dict entry) | 8 bytes (inline AtomicU64) |
| Eviction metadata (LRU/LFU) | ~24 bytes (redisObject header) | ~16 bytes (W-TinyLFU frequency counter) |
| Total per-key overhead | ~90–100 bytes | ~64–72 bytes (20% less) |

Redis uses a redisObject wrapper (16 bytes) around every value, plus SDS (Simple Dynamic Strings) headers for both key and value, plus a separate expires dictionary entry for TTL-tracked keys. Cachee stores entries directly in DashMap shards with inline TTL and eviction metadata, avoiding the double-indirection overhead.
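Summing the component figures above gives a quick sanity check on the totals. This is a back-of-the-envelope sketch using the approximate byte counts from the table, not measured values:

```python
# Approximate per-key structural overhead in bytes (figures from the table above).
REDIS = {
    "hash table entry (dictEntry + SDS headers)": 56,
    "TTL metadata (expires dict entry)": 16,
    "eviction metadata (redisObject header)": 24,
}
CACHEE = {
    "hash table entry (DashMap shard entry)": 40,
    "TTL metadata (inline AtomicU64)": 8,
    "eviction metadata (W-TinyLFU frequency counter)": 16,
}

redis_total = sum(REDIS.values())    # 96 bytes -- inside the ~90-100 range
cachee_total = sum(CACHEE.values())  # 64 bytes -- low end of the ~64-72 range
print(redis_total, cachee_total, redis_total - cachee_total)
```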

Connection Buffers: The Hidden Cost

Redis is an external process that communicates over TCP. Every client connection allocates kernel-level and application-level buffers.

| Buffer | Redis | Cachee |
|---|---|---|
| Query buffer (per connection) | ~1 KB default, up to 1 GB | 0 (in-process) |
| Output buffer (per connection) | ~16 KB default | 0 (in-process) |
| TCP socket buffers (kernel) | ~17 KB per connection (send + recv) | 0 (in-process) |
| 100 connections | ~3.3 MB | 0 bytes |
| 1,000 connections | ~33 MB | 0 bytes |
| 10,000 connections | ~330 MB | 0 bytes |

At 10,000 connections (common in microservices architectures), Redis consumes ~330 MB of buffer memory before storing a single key. Cachee operates in-process: function calls, not network sockets, so connection buffer overhead is zero.
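The per-connection arithmetic is easy to reproduce. A small sketch using the default buffer sizes from the table above (illustrative, not a measurement of a live Redis):

```python
# Per-connection buffer cost for an external cache over TCP, using the
# defaults from the table: 1 KiB query + 16 KiB output + ~17 KiB kernel
# socket buffers = ~34 KiB per idle connection.
PER_CONN_KIB = 1 + 16 + 17

def buffer_overhead_mib(connections: int) -> float:
    """Total connection-buffer overhead in MiB."""
    return connections * PER_CONN_KIB / 1024

for n in (100, 1_000, 10_000):
    print(f"{n:>6} connections: {buffer_overhead_mib(n):.1f} MiB")
# ~3.3 MiB, ~33 MiB, and ~332 MiB -- matching the table above
```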

Why This Matters

Connection buffer memory is invisible to redis-cli INFO memory. It shows up in the OS process RSS but not in Redis's own memory accounting. This is one of the most common reasons Redis uses more memory than expected in production.

Persistence & Replication Overhead

Cachee is an in-process cache, not a database. It does not persist to disk or replicate. This eliminates several categories of Redis memory overhead.

  • No AOF rewrite buffer: Redis buffers writes during AOF rewrite, consuming up to 2x memory during the rewrite window.
  • No RDB fork overhead: Redis BGSAVE forks the process. Copy-on-write means pages are duplicated as they are modified during the fork. Under heavy write load, the forked process can consume up to 2x the parent's memory.
  • No replication backlog: Redis maintains a circular buffer (default 1 MB, often increased to 256 MB+) for partial resync with replicas.
  • No serialization buffers: Redis serializes values for network transfer (RESP encoding). Cachee returns direct memory references — zero-copy reads.

Memory Fragmentation

| Metric | Redis | Cachee |
|---|---|---|
| Fragmentation ratio (typical) | 1.1–1.5x | <1.05x |
| Fragmentation cause | jemalloc size classes + frequent alloc/free of variable-size SDS strings | Fewer, larger per-shard allocations (DashMap) |
| Defragmentation | Active defrag (CPU cost, may increase latency) | Not needed (fragmentation stays low) |

Redis uses jemalloc, which allocates in fixed size classes. A 33-byte string gets a 48-byte allocation. A 49-byte string gets a 64-byte allocation. The wasted space accumulates. Redis's active defragmentation feature can reclaim some of this, but it consumes CPU and can spike latency during defrag runs. Cachee's DashMap architecture uses fewer, shard-level allocations that fragment less.
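The rounding behavior can be sketched with a simplified subset of jemalloc's small size classes (real jemalloc has more classes and coarser spacing at larger sizes; this only illustrates the round-up):

```python
# Simplified subset of jemalloc's small size classes, in bytes.
SIZE_CLASSES = [8, 16, 32, 48, 64, 80, 96, 112, 128, 160, 192, 224, 256]

def allocated(size: int) -> int:
    """Round a requested allocation up to the next size class."""
    return next(c for c in SIZE_CLASSES if c >= size)

for request in (33, 49, 130):
    alloc = allocated(request)
    print(f"request {request:>3} B -> allocate {alloc:>3} B "
          f"({alloc - request} B wasted)")
```

The waste per allocation is small, but across millions of variable-size strings it compounds into the 1.1–1.5x fragmentation ratios seen in production Redis.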

Capacity Planning

Total memory = (per-key overhead + key bytes + value bytes) × key count × fragmentation ratio. The tables below compare Redis and Cachee at common scales with 256-byte and 1 KB values.
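The formula can be checked directly. The helper below is hypothetical (not part of Cachee); it uses midpoint overhead figures (~96 bytes for Redis, ~68 for Cachee) and ignores key bytes for simplicity:

```python
def total_memory_mb(keys: int, value_bytes: int, per_key_overhead: int,
                    frag_ratio: float, key_bytes: int = 0) -> float:
    """Total memory = (overhead + key + value) * key count * fragmentation ratio."""
    return (per_key_overhead + key_bytes + value_bytes) * keys * frag_ratio / 1e6

# 100K keys with 256-byte values, matching the first table below.
redis = total_memory_mb(100_000, 256, per_key_overhead=96, frag_ratio=1.2)
cachee = total_memory_mb(100_000, 256, per_key_overhead=68, frag_ratio=1.05)
print(f"Redis:  {redis:.0f} MB")   # ~42 MB
print(f"Cachee: {cachee:.0f} MB")  # ~34 MB
```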

256-Byte Values

| Keys | Redis (1.2x frag) | Cachee (<1.05x frag) | Savings |
|---|---|---|---|
| 100K | 42 MB | 34 MB | 19% |
| 1M | 420 MB | 340 MB | 19% |
| 10M | 4.2 GB | 3.4 GB | 19% |

1 KB Values

| Keys | Redis (1.2x frag) | Cachee (<1.05x frag) | Savings |
|---|---|---|---|
| 100K | 132 MB | 113 MB | 14% |
| 1M | 1.32 GB | 1.13 GB | 14% |
| 10M | 13.2 GB | 11.3 GB | 14% |

The absolute per-key overhead savings are roughly constant (~26 bytes per key), so the percentage savings shrinks as value size grows. With small values (256 bytes), overhead is a larger fraction of total memory, and the per-key savings combined with the lower fragmentation ratio yield ~19% total savings. With larger values (1 KB), the overhead is diluted and savings drop to ~14%. The connection buffer savings (potentially hundreds of MB at high connection counts) come on top of these numbers.
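The dilution effect can be computed directly using midpoint assumptions (~96 vs ~68 bytes per-key overhead, 1.2x vs 1.05x fragmentation). An illustrative sketch, not Cachee tooling:

```python
def savings_pct(value_bytes: int, redis_oh: int = 96, cachee_oh: int = 68,
                redis_frag: float = 1.2, cachee_frag: float = 1.05) -> float:
    """Total memory savings (%) as a function of value size, per the formula above."""
    redis = (redis_oh + value_bytes) * redis_frag
    cachee = (cachee_oh + value_bytes) * cachee_frag
    return 100 * (1 - cachee / redis)

for v in (256, 1024, 4096):
    print(f"{v:>4}-byte values: ~{savings_pct(v):.0f}% savings")
```

The savings percentage falls monotonically with value size, approaching the fragmentation-ratio difference alone as the per-key overhead becomes negligible.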

W-TinyLFU Admission Overhead

Cachee uses W-TinyLFU for admission control — a frequency sketch that determines whether a new item should replace an existing one. The frequency sketch is a Count-Min Sketch that uses approximately 4 bytes per tracked item.

| L1_MAX_KEYS | Count-Min Sketch Size | Overhead per Key |
|---|---|---|
| 100K | ~400 KB | ~4 bytes |
| 1M | ~4 MB | ~4 bytes |
| 10M | ~40 MB | ~4 bytes |

The Count-Min Sketch is a fixed-size data structure allocated at startup based on L1_MAX_KEYS. It does not grow with actual key count. Redis LFU uses 24 bits per key embedded in the object header — comparable per-key cost but without the admission gating benefit.
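To make the sketch concrete, here is a minimal Count-Min Sketch in Python. It is illustrative only: Cachee's actual counter width and aging scheme are not shown, and the hash construction below is an arbitrary choice, not Cachee's.

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch: approximate frequency counts in fixed memory.
    Python ints are used for clarity; a real W-TinyLFU sketch packs 4-bit counters."""

    def __init__(self, width: int, depth: int = 4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, key: str):
        # One independent hash per row, derived by salting blake2b with the row number.
        for row in range(self.depth):
            h = hashlib.blake2b(key.encode(), salt=row.to_bytes(8, "little")).digest()
            yield row, int.from_bytes(h[:8], "little") % self.width

    def increment(self, key: str) -> None:
        for row, col in self._indexes(key):
            self.table[row][col] += 1

    def estimate(self, key: str) -> int:
        # Collisions can only inflate counters, so the minimum across rows
        # is the tightest upper bound on the true count.
        return min(self.table[row][col] for row, col in self._indexes(key))

cms = CountMinSketch(width=1024)
for _ in range(5):
    cms.increment("hot-key")
cms.increment("cold-key")
print(cms.estimate("hot-key"), cms.estimate("cold-key"))
```

Because the table is allocated up front from `width` and `depth`, memory is fixed regardless of how many distinct keys pass through it, which is exactly why the sketch size above depends only on L1_MAX_KEYS.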

```
# Set maximum number of keys in the cache
CONFIG SET L1_MAX_KEYS 1000000

# Eviction kicks in when key count approaches L1_MAX_KEYS;
# W-TinyLFU ensures only frequent items survive eviction.
```

The Tradeoff: Isolation vs Speed

The honest comparison: Redis runs in a separate process with its own memory space. If Redis crashes or leaks memory, your application is unaffected. Cachee runs in your application's process. Its memory is your application's memory.

  • Redis advantage: Memory isolation. A Redis OOM does not crash your app. You can restart Redis independently. Memory limits are enforced by the OS.
  • Cachee advantage: Zero network overhead, zero serialization, zero connection buffers, 20% less per-key overhead, <1.05x fragmentation, and 667x lower latency (0.0015ms vs ~1ms).

Cachee mitigates the isolation tradeoff with L1_MAX_KEYS: a hard cap on how many keys the cache will hold. Eviction is automatic via W-TinyLFU. Memory usage is bounded and predictable. You are not giving up control — you are trading process-level isolation for microsecond-level performance in a memory-bounded container.

Recommendation

20% less memory AND 667x lower latency. The only cost is sharing your application's memory space. For most applications, that is not a cost — it is the architecture you wanted all along. Set L1_MAX_KEYS to bound memory, and your cache is both faster and smaller than Redis.