Per-Key Overhead
Every cache entry carries structural overhead beyond the key and value bytes themselves. This overhead includes hash table metadata, pointers, TTL tracking, and eviction bookkeeping. The difference between Cachee and Redis comes down to architecture.
| Component | Redis | Cachee |
|---|---|---|
| Hash table entry (bucket + pointers) | ~56 bytes (dictEntry + SDS headers) | ~40 bytes (DashMap shard entry) |
| TTL metadata | ~16 bytes (expires dict entry) | 8 bytes (inline AtomicU64) |
| Eviction metadata (LRU/LFU) | ~24 bytes (redisObject header) | ~16 bytes (W-TinyLFU frequency counter) |
| Total per-key overhead | ~90–100 bytes | ~64–72 bytes (~28% less) |
Redis uses a redisObject wrapper (16 bytes) around every value, plus SDS (Simple Dynamic Strings) headers for both key and value, plus a separate expires dictionary entry for TTL-tracked keys. Cachee stores entries directly in DashMap shards with inline TTL and eviction metadata, avoiding the double-indirection overhead.
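A minimal sketch of what "inline TTL and eviction metadata" means in practice. This is a hypothetical entry layout, not Cachee's actual struct: the point is that the TTL and frequency hint live in the same struct as the value pointer, so there is no separate expires dictionary or wrapper object to chase.

```rust
use std::sync::atomic::AtomicU64;

// Hypothetical entry layout (NOT Cachee's actual struct) showing how TTL and
// eviction metadata can live inline, avoiding Redis-style double indirection:
struct Entry {
    value: Box<[u8]>,         // heap value: pointer + length = 16 bytes
    expires_at_ms: AtomicU64, // inline TTL: 8 bytes, no separate expires dict
    freq: u8,                 // eviction hint (padded to struct alignment)
}

fn main() {
    // One struct per entry; metadata travels with it in the same cache line(s).
    println!("size_of::<Entry>() = {}", std::mem::size_of::<Entry>());
}
```

On a typical 64-bit target this struct is 32 bytes; the DashMap shard entry adds its own bucket overhead on top, which is where the ~40-byte figure in the table comes from.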
Connection Buffers: The Hidden Cost
Redis is an external process that communicates over TCP. Every client connection allocates kernel-level and application-level buffers.
| Buffer | Redis | Cachee |
|---|---|---|
| Query buffer (per connection) | ~1 KB default, up to 1 GB | 0 (in-process) |
| Output buffer (per connection) | ~16 KB default | 0 (in-process) |
| TCP socket buffers (kernel) | ~17 KB per connection (send+recv) | 0 (in-process) |
| 100 connections | ~3.3 MB | 0 bytes |
| 1,000 connections | ~33 MB | 0 bytes |
| 10,000 connections | ~330 MB | 0 bytes |
At 10,000 connections (common in microservices architectures), Redis consumes ~330 MB before storing a single key. Cachee operates in-process — function calls, not network sockets — so connection buffer overhead is literally zero.
Connection buffer memory is invisible to redis-cli INFO memory. It shows up in the OS process RSS but not in Redis's own memory accounting. This is one of the most common reasons Redis uses more memory than expected in production.
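The table's totals are straightforward arithmetic; here is the same calculation spelled out, using the default-ish per-connection costs listed above (the 1 KB / 16 KB / 17 KB figures are the assumptions):

```rust
fn main() {
    // Approximate per-connection buffer costs from the table above:
    let query_buf: u64 = 1 * 1024;   // Redis query buffer (default)
    let output_buf: u64 = 16 * 1024; // Redis output buffer (default)
    let tcp_buf: u64 = 17 * 1024;    // kernel send+recv socket buffers
    let per_conn = query_buf + output_buf + tcp_buf; // ~34 KB per connection

    for conns in [100u64, 1_000, 10_000] {
        println!("{} connections: ~{} MB", conns, conns * per_conn / (1024 * 1024));
    }
}
```

At 10,000 connections this lands at roughly 330 MB, all of it outside Redis's own memory accounting.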
Persistence & Replication Overhead
Cachee is an in-process cache, not a database. It does not persist to disk or replicate. This eliminates several categories of Redis memory overhead.
- No AOF rewrite buffer: Redis buffers writes during AOF rewrite, consuming up to 2x memory during the rewrite window.
- No RDB fork overhead: Redis BGSAVE forks the process. Copy-on-write means pages are duplicated as they are modified during the fork. Under heavy write load, the forked child can consume up to 2x the parent's memory.
- No replication backlog: Redis maintains a circular buffer (default 1 MB, often increased to 256 MB+) for partial resync with replicas.
- No serialization buffers: Redis serializes values for network transfer (RESP encoding). Cachee returns direct memory references — zero-copy reads.
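The RDB fork overhead is worth sizing concretely. A back-of-envelope estimate, assuming some fraction of pages gets rewritten (and therefore copy-on-write duplicated) during the BGSAVE window; the 60% dirty fraction below is an illustrative assumption, not a measured number:

```rust
// Back-of-envelope copy-on-write estimate for a BGSAVE window. The dirty
// fraction is an assumption; real values depend on write rate and duration.
fn bgsave_peak_mb(rss_mb: u64, dirty_pct: u64) -> u64 {
    // parent RSS + pages duplicated by copy-on-write during the fork
    rss_mb + rss_mb * dirty_pct / 100
}

fn main() {
    // A 4 GB Redis with 60% of pages touched while the child writes the RDB:
    println!("peak ~{} MB", bgsave_peak_mb(4096, 60));
    // At 100% dirty pages this degenerates to the 2x worst case.
}
```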
Memory Fragmentation
| Metric | Redis | Cachee |
|---|---|---|
| Fragmentation ratio (typical) | 1.1–1.5x | <1.05x |
| Fragmentation cause | jemalloc size classes + frequent alloc/free of variable-size SDS strings | DashMap uses fewer, larger allocations per shard |
| Defragmentation | Active defrag (CPU cost, may increase latency) | Not needed (fragmentation stays low) |
Redis uses jemalloc, which allocates in fixed size classes. A 33-byte string gets a 48-byte allocation. A 49-byte string gets a 64-byte allocation. The wasted space accumulates. Redis's active defragmentation feature can reclaim some of this, but it consumes CPU and can spike latency during defrag runs. Cachee's DashMap architecture uses fewer, shard-level allocations that fragment less.
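The size-class rounding described above can be sketched directly. This is a simplified model of jemalloc's first tier of small size classes (real jemalloc has many more classes and spacing rules), just to make the waste visible:

```rust
// Simplified jemalloc small size classes (first tier only; illustrative):
fn size_class(n: usize) -> usize {
    const CLASSES: [usize; 8] = [8, 16, 32, 48, 64, 80, 96, 112];
    *CLASSES.iter().find(|&&c| c >= n).unwrap_or(&n)
}

fn main() {
    for n in [33, 49, 100] {
        let c = size_class(n);
        println!("{} bytes -> {}-byte class, {} wasted", n, c, c - n);
    }
}
```

A 33-byte allocation wastes 15 bytes, a 49-byte allocation wastes another 15; across millions of variable-size SDS strings that internal fragmentation is what pushes the ratio toward 1.5x.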
Capacity Planning
Total memory = (per-key overhead + key bytes + value bytes) × key count × fragmentation ratio. The tables below compare Redis and Cachee at common scales with 256-byte and 1 KB values.
256-Byte Values
| Keys | Redis (1.2x frag) | Cachee (<1.05x frag) | Savings |
|---|---|---|---|
| 100K | 42 MB | 34 MB | 19% |
| 1M | 420 MB | 340 MB | 19% |
| 10M | 4.2 GB | 3.4 GB | 19% |
1 KB Values
| Keys | Redis (1.2x frag) | Cachee (<1.05x frag) | Savings |
|---|---|---|---|
| 100K | 132 MB | 113 MB | 14% |
| 1M | 1.32 GB | 1.13 GB | 14% |
| 10M | 13.2 GB | 11.3 GB | 14% |
The per-key overhead savings are constant (~26 bytes per key), so the percentage savings shrinks as value size grows. With small values (256 bytes), overhead is a larger fraction of total memory, so the ~28% per-key savings translates to ~19% total savings. With larger values (1 KB), the overhead is diluted and total savings fall to ~14%. The connection buffer savings (potentially hundreds of MB) are in addition to these numbers.
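The capacity formula can be checked against the tables directly. This sketch assumes 16-byte keys and the midpoint per-key overheads from the earlier table (~95 B for Redis, ~68 B for Cachee):

```rust
// Total memory = (per-key overhead + key bytes + value bytes)
//                × key count × fragmentation ratio, reported in MiB.
fn total_mb(keys: u64, key_len: u64, val_len: u64, overhead: u64, frag: f64) -> f64 {
    (keys * (overhead + key_len + val_len)) as f64 * frag / (1024.0 * 1024.0)
}

fn main() {
    let keys = 1_000_000; // the 1M-key, 256-byte-value row
    let redis = total_mb(keys, 16, 256, 95, 1.2);
    let cachee = total_mb(keys, 16, 256, 68, 1.05);
    println!("Redis:  ~{:.0} MB", redis);  // ~420 MB, matching the table
    println!("Cachee: ~{:.0} MB", cachee); // ~340 MB, matching the table
    println!("Savings: ~{:.0}%", 100.0 * (1.0 - cachee / redis));
}
```

Plugging in the 1 KB value size instead reproduces the ~14% figure from the second table.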
W-TinyLFU Admission Overhead
Cachee uses W-TinyLFU for admission control — a frequency sketch that determines whether a new item should replace an existing one. The frequency sketch is a Count-Min Sketch that uses approximately 4 bytes per tracked item.
| L1_MAX_KEYS | Count-Min Sketch Size | Overhead per Key |
|---|---|---|
| 100K | ~400 KB | ~4 bytes |
| 1M | ~4 MB | ~4 bytes |
| 10M | ~40 MB | ~4 bytes |
The Count-Min Sketch is a fixed-size data structure allocated at startup based on L1_MAX_KEYS. It does not grow with actual key count. Redis LFU uses 24 bits per key embedded in the object header — comparable per-key cost but without the admission gating benefit.
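To make the "~4 bytes per tracked item" concrete, here is a toy Count-Min Sketch with 4 rows of byte counters (real W-TinyLFU implementations typically use 4-bit counters with periodic halving for aging; this simplified version skips that):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy Count-Min Sketch: 4 rows × capacity byte counters ≈ 4 bytes per item.
struct CountMin { rows: usize, width: usize, counters: Vec<u8> }

impl CountMin {
    fn new(capacity: usize) -> Self {
        let width = capacity.next_power_of_two(); // power of two for fast masking
        let rows = 4;
        CountMin { rows, width, counters: vec![0; rows * width] }
    }
    fn index(&self, key: &str, row: usize) -> usize {
        let mut h = DefaultHasher::new();
        (row as u64).hash(&mut h); // salt the hash per row
        key.hash(&mut h);
        row * self.width + (h.finish() as usize & (self.width - 1))
    }
    fn increment(&mut self, key: &str) {
        for row in 0..self.rows {
            let i = self.index(key, row);
            self.counters[i] = self.counters[i].saturating_add(1);
        }
    }
    // Estimated frequency: minimum across rows bounds over-counting.
    fn estimate(&self, key: &str) -> u8 {
        (0..self.rows).map(|r| self.counters[self.index(key, r)]).min().unwrap()
    }
}

fn main() {
    let mut cm = CountMin::new(1_000); // ~4 KB total, fixed at construction
    for _ in 0..5 { cm.increment("hot"); }
    cm.increment("cold");
    println!("hot={} cold={}", cm.estimate("hot"), cm.estimate("cold"));
}
```

On admission, the candidate's estimated frequency is compared against the eviction victim's; a new item only displaces a resident one if it is seen more often, which is the gating LRU/LFU-in-the-object-header cannot provide.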
The Tradeoff: Isolation vs Speed
The honest comparison: Redis runs in a separate process with its own memory space. If Redis crashes or leaks memory, your application is unaffected. Cachee runs in your application's process. Its memory is your application's memory.
- Redis advantage: Memory isolation. A Redis OOM does not crash your app. You can restart Redis independently. Memory limits are enforced by the OS.
- Cachee advantage: Zero network overhead, zero serialization, zero connection buffers, ~28% less per-key overhead, <1.05x fragmentation, and 667x lower latency (0.0015 ms vs ~1 ms).
Cachee mitigates the isolation tradeoff with L1_MAX_KEYS: a hard cap on how many keys the cache will hold. Eviction is automatic via W-TinyLFU. Memory usage is bounded and predictable. You are not giving up control — you are trading process-level isolation for microsecond-level performance in a memory-bounded container.
Roughly 15–20% less memory AND 667x lower latency. The only cost is sharing your application's memory space. For most applications, that is not a cost — it is the architecture you wanted all along. Set L1_MAX_KEYS to bound memory, and your cache is both faster and smaller than Redis.