Engineering

Cost-Aware Eviction: Why Your Cache Should Know What Things Cost

Your cache is full. Something has to go. The eviction algorithm picks the least recently used key and drops it. That key happened to be a derivatives pricing computation that took 50 milliseconds to produce. The key that survived? A simple database lookup that takes 0.5 milliseconds to recompute. Your cache just made the worst possible decision, and it had no way to know that, because standard eviction policies treat all keys as equally valuable.

The Blind Spot in Every Eviction Policy

LRU, LFU, Cachee-FLU — every standard eviction algorithm makes decisions based on access patterns. How recently was the key used? How frequently? These are useful signals. But they are missing the most important dimension: what does it cost to recompute this value if the cache evicts it?

Consider a production cache with four keys, all accessed at roughly the same frequency: two are simple 0.5 ms database lookups, and two are heavy computations that take 50 ms or more to rebuild.

To LRU, these are four identical keys. Whichever was accessed least recently gets evicted. There is a 50% chance the eviction algorithm drops a key that takes 100x longer to rebuild than the key it kept. Under LFU, the bias is toward frequency — but a key accessed 10 times per minute with a 200ms recomputation cost is far more valuable to keep than a key accessed 12 times per minute with a 0.3ms cost.

The cache is optimizing for the wrong objective. It is minimizing miss count when it should be minimizing miss cost.
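The distinction can be made concrete with a small sketch. The keys and per-miss recompute costs below are assumed for illustration, not measured from a real workload:

```python
# Sketch: the same miss *count* can carry very different miss *cost*.

def total_miss_cost(misses):
    """Sum the recompute cost (ms) of every miss."""
    return sum(cost for _key, cost in misses)

# Cache A evicted cheap keys; its 100 misses are all 0.5 ms lookups.
misses_a = [("user:lookup", 0.5)] * 100

# Cache B evicted expensive keys; its 100 misses are 200 ms aggregations.
misses_b = [("dashboard:agg", 200.0)] * 100

# Identical miss counts, wildly different miss cost.
print(total_miss_cost(misses_a))  # 50.0
print(total_miss_cost(misses_b))  # 20000.0
```

Both caches report the same miss count; only the second one is hurting.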

The COST Parameter

Cachee introduces a COST parameter on writes that tells the cache how expensive a value is to recompute:

SET portfolio:789:risk-score <value> COST 500
SET report:daily:revenue <value> COST 2000
SET user:123:name <value> COST 1
SET product:456:price <value> COST 5

The cost value is unitless — it represents relative recomputation expense. You can set it to the computation time in milliseconds, the number of database queries involved, the dollar cost of the API call, or any other metric that captures how painful a miss is. The only thing that matters is the relative ordering: a key with COST 500 is 500x more expensive to lose than a key with COST 1.
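One practical way to pick a cost value is to measure the recompute time and use the elapsed milliseconds directly. The sketch below does exactly that; the `client.set(key, value, cost=...)` call and the `InMemoryClient` stub are hypothetical stand-ins for whatever client library you use, not a documented Cachee interface:

```python
import time

class InMemoryClient:
    """Minimal stand-in for a cache client; records (value, cost) per key."""
    def __init__(self):
        self.store = {}

    def set(self, key, value, cost=1):
        self.store[key] = (value, cost)

def cached_with_cost(client, key, compute):
    """Run `compute`, time it, and store the result with COST set to the
    measured recompute time in milliseconds (minimum 1)."""
    start = time.perf_counter()
    value = compute()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    client.set(key, value, cost=max(1, round(elapsed_ms)))
    return value

client = InMemoryClient()
cached_with_cost(client, "user:123:name", lambda: "Ada")
print(client.store["user:123:name"][0])  # Ada
```

Because cost is unitless, elapsed milliseconds, query counts, or API dollars all work, as long as you apply the same metric consistently.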

When the cache needs to evict, the decision incorporates cost alongside frequency:

eviction_score = frequency * log2(cost + 1)

Keys with higher scores survive. Keys with lower scores are evicted first. A high-cost key needs less frequency to justify its place in the cache. A low-cost key needs high frequency to survive alongside expensive entries.
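As a sketch, here is that scoring rule applied to the four SET examples above. The access frequencies are assumed for illustration:

```python
from math import log2

def eviction_score(frequency, cost):
    """Score from the post: higher survives, lowest is evicted first."""
    return frequency * log2(cost + 1)

# (accesses per minute, COST) — frequencies are illustrative.
keys = {
    "portfolio:789:risk-score": (10, 500),
    "report:daily:revenue":     (10, 2000),
    "user:123:name":            (12, 1),
    "product:456:price":        (12, 5),
}

victim = min(keys, key=lambda k: eviction_score(*keys[k]))
print(victim)  # user:123:name
```

Even though `user:123:name` is accessed slightly more often, its trivial recomputation cost makes it the cheapest key to lose.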

The logarithmic curve is deliberate: using log2(cost + 1) instead of raw cost prevents a single extremely expensive key from becoming permanently pinned. A key with COST 1,000,000 gets a score multiplier of ~20x, not 1,000,000x. High cost provides strong protection, but not infinite protection. Diminishing returns ensure the cache remains responsive to actual access patterns.

Why Diminishing Returns Matter

A naive approach would be to weight eviction linearly by cost: score = frequency * cost. This breaks immediately. A key with COST 1,000,000 (say, a complex ML pipeline) would outscore a COST 1 key until that key had been accessed a million times. It would effectively be pinned forever, regardless of whether anyone is actually reading it.

The logarithmic curve prevents this. Here is how cost translates to score multiplier:

COST         log2(cost + 1) multiplier
1            1.0x
10           ~3.5x
100          ~6.7x
1,000        ~10.0x
1,000,000    ~19.9x

A key that costs 1000x more to recompute gets a 10x advantage, not a 1000x advantage. This means cost-aware eviction is a preference, not a guarantee. If a high-cost key stops being accessed, it will eventually be evicted — it just has a longer runway to prove its value. If a low-cost key is extremely hot, it will survive alongside expensive keys because its frequency compensates for the lower cost multiplier.
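A quick check of the two boundary cases from this paragraph, with illustrative frequencies:

```python
from math import log2

def eviction_score(frequency, cost):
    return frequency * log2(cost + 1)

# A stale ML-pipeline result: enormous cost, barely accessed anymore.
stale_expensive = eviction_score(frequency=1, cost=1_000_000)  # ~19.9

# A hot trivial lookup: tiny cost, accessed constantly.
hot_cheap = eviction_score(frequency=25, cost=1)               # 25.0

# Preference, not guarantee: the hot cheap key outscores the stale
# expensive one, so the expensive key becomes the eviction victim.
print(hot_cheap > stale_expensive)  # True
```

The expensive key's long runway ends once its frequency drops low enough; the cheap key's heat compensates for its ~1x cost multiplier.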

Real-World Use Cases

ML Feature Computations

Feature stores cache computed features for ML models. A simple feature like “user age” costs 0.1ms to compute. A complex feature like “30-day purchase velocity with seasonal adjustment” costs 35ms across three data sources. Assigning COST 350 to the complex feature ensures it survives eviction pressure over the trivial features, keeping ML inference latency predictable.

Aggregated Analytics

Dashboard queries that aggregate millions of rows are enormously expensive to recompute. Assigning COST 2000 to a cached dashboard result protects it during memory pressure, while allowing cheap per-row lookups to be evicted and recomputed trivially.

Complex JOINs

A 6-table JOIN that produces a denormalized view takes 40ms on a warmed database. The individual table lookups that contribute to it take 0.5ms each. With cost-aware eviction, the JOIN result stays cached while the individual lookups rotate freely through the cache.

Rate-Limited API Responses

If you cache responses from a third-party API that rate-limits you to 100 requests per minute, the cost of a cache miss is not just latency — it is a consumed rate-limit token. Assigning COST 1000 to rate-limited API responses protects them from eviction far more aggressively than standard access-pattern metrics would.

# Rate-limited external API: protect aggressively
SET weather:nyc:forecast <value> COST 1000

# Simple internal lookup: evict freely
SET user:123:timezone <value> COST 1

The goal is not to keep everything forever: cost-aware eviction does not prevent eviction; it makes eviction smarter. When something has to go, the cache drops the cheapest-to-rebuild key first. The expensive computations, the rate-limited API responses, the multi-table aggregations all get the protection they deserve.

Composition With Triggers and Dependency Graphs

Cost-aware eviction composes with cache triggers to provide eviction observability. Register an ON_EVICT trigger that logs the cost of every evicted key. Over time, this gives you a precise picture of whether your cache is sized correctly. If you are regularly evicting keys with COST > 500, your cache is too small. If evicted keys are consistently COST < 10, you have headroom.
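The sizing heuristic above can be sketched as a function over the evicted-key costs, however you collect them (for example, from an ON_EVICT trigger's log). The thresholds are the ones quoted in the text:

```python
def cache_sizing_verdict(evicted_costs):
    """Classify cache sizing from the COST values of evicted keys."""
    if any(cost > 500 for cost in evicted_costs):
        return "too small"      # expensive work is being pushed out
    if all(cost < 10 for cost in evicted_costs):
        return "has headroom"   # only trivial keys are rotating out
    return "roughly right"

print(cache_sizing_verdict([1, 3, 2000]))  # too small
print(cache_sizing_verdict([1, 2, 5]))     # has headroom
```

Run over a day of eviction logs, this turns eviction telemetry into a direct capacity-planning signal.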

It also composes with dependency graphs. When a derived key is built from multiple sources, its cost should reflect the aggregate computation. A dashboard key that depends on five expensive sub-queries should carry a cost that reflects the full rebuild, not just the final assembly. The cost metadata follows the key through the dependency graph, ensuring that eviction decisions account for the full cascade cost.
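The aggregate-cost idea can be sketched as a recursive walk over the dependency graph. The graph shape and per-node costs below are illustrative, not a Cachee API:

```python
# A dashboard key derived from five expensive sub-queries.
deps = {
    "dashboard:exec": ["agg:revenue", "agg:churn", "agg:traffic",
                       "agg:signups", "agg:retention"],
}
own_cost = {
    "dashboard:exec": 50,   # final assembly only
    "agg:revenue": 800,
    "agg:churn": 600,
    "agg:traffic": 400,
    "agg:signups": 300,
    "agg:retention": 350,
}

def full_rebuild_cost(key):
    """Own cost plus the recomputation cost of every dependency."""
    return own_cost[key] + sum(full_rebuild_cost(d) for d in deps.get(key, []))

print(full_rebuild_cost("dashboard:exec"))  # 2500
```

Tagging the dashboard key with COST 2500 rather than COST 50 is what makes the eviction decision reflect the full cascade.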

What Changes

Standard eviction optimizes for cache hit count. Cost-aware eviction optimizes for cache hit value. The difference is subtle in benchmarks and enormous in production. Two caches with the same hit rate can have vastly different performance profiles depending on which keys are hitting and which are missing. A 95% hit rate where the 5% misses are all 200ms aggregations feels very different from a 95% hit rate where the misses are all 0.5ms lookups.

Your cache should know what things cost. Now it does.
