Machine Learning Cache Eviction: Beyond LRU and LFU
Cache eviction determines which data gets removed when memory fills up. For decades, we've relied on simple policies like LRU (Least Recently Used) and LFU (Least Frequently Used). These algorithms are fast, predictable, and fundamentally limited. Machine learning changes everything by predicting future access patterns instead of just reacting to past behavior.
Why Traditional Eviction Policies Fall Short
LRU (Least Recently Used)
LRU evicts the item accessed longest ago. It's simple and works well for many workloads, but fails badly in common scenarios:
# Scenario: Scanning through data once
for i in range(1_000_000):
    cache.get(f"item:{i}")  # Each item accessed exactly once
# Problem: Recently-used scan data evicts
# frequently-accessed hot data
LRU weakness: One-time sequential scans pollute the cache, evicting valuable hot data. A single bulk operation can destroy your hit rate.
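To make the mechanics concrete, here's a minimal LRU sketch built on Python's collections.OrderedDict (the class name and capacity parameter are illustrative, not from any particular library):

from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the item accessed longest ago."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

Run the million-item scan from above through this cache and every hot entry cycles out, which is exactly the pollution problem described.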
LFU (Least Frequently Used)
LFU evicts the least-accessed items. Better for workloads with stable hot data, but struggles with changing patterns:
# Scenario: Yesterday's popular content
# "viral-video-123" accessed 1M times yesterday
# "viral-video-456" accessed 100K times today
# Problem: LFU keeps old viral content
# and evicts today's trending content
LFU weakness: Historical frequency dominates, making the cache slow to adapt to changing access patterns. Popular old data crowds out important new data.
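For comparison, here's an equally minimal LFU sketch (again illustrative; the O(n) scan for the minimum count is kept for clarity, not performance):

import collections

class LFUCache:
    """Minimal LFU cache: evicts the least-frequently-accessed item."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.counts = collections.Counter()  # lifetime access counts

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.counts[key] += 1
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            # Evict the key with the smallest lifetime count
            victim, _ = min(self.counts.items(), key=lambda kv: kv[1])
            del self.items[victim]
            del self.counts[victim]
        self.items[key] = value
        self.counts[key] += 1

Notice that counts only ever grow, so yesterday's viral video keeps its million hits forever: the staleness problem in action.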
The Core Problem: Reacting vs. Predicting
Traditional policies react to past access patterns. ML-powered eviction predicts future access probability. This fundamental shift enables dramatic improvements:
- 15-25% higher hit rates with same memory
- Or 30-40% less memory for same hit rate
- Automatic adaptation to traffic pattern changes
- No manual tuning or configuration
How ML-Powered Eviction Works
Feature Extraction
For each cached item, the system tracks features that correlate with future access:
{
  "key": "user:profile:12345",
  "features": {
    "access_count_1h": 45,
    "access_count_24h": 203,
    "access_count_7d": 1847,
    "time_since_last_access": 120,   // seconds
    "time_of_day": 14,               // hour
    "day_of_week": 3,                // Wednesday
    "size_bytes": 2048,
    "computation_cost_ms": 35,
    "ttl_remaining": 1800,
    "key_pattern": "user:profile:*",
    "access_variance": 0.34,
    "trend": "increasing"            // +12% hour-over-hour
  }
}
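As a sketch of how such a record could be assembled, here's a hypothetical per-item metadata store; the CacheEntry fields mirror the JSON above, and tracking raw timestamps (rather than rolling counters) is a simplification:

import time
from dataclasses import dataclass, field

@dataclass
class CacheEntry:
    """Hypothetical per-item metadata tracked alongside the cached value."""
    key: str
    size_bytes: int
    computation_cost_ms: float
    access_times: list = field(default_factory=list)  # unix timestamps

def extract_features(entry, now=None):
    """Derive the count/recency features used by the prediction model."""
    now = now or time.time()
    return {
        "access_count_1h": sum(t > now - 3600 for t in entry.access_times),
        "access_count_24h": sum(t > now - 86400 for t in entry.access_times),
        "time_since_last_access": now - max(entry.access_times, default=now),
        "size_bytes": entry.size_bytes,
        "computation_cost_ms": entry.computation_cost_ms,
    }

In production you would keep rolling counters instead of raw timestamps to bound the metadata's memory footprint.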
Access Prediction Model
A lightweight neural network or gradient-boosted tree predicts the "probability of access in the next N minutes" for each cached item:
# Simplified prediction model
# (ml_model, temporal_model, and max_access_rate are assumed
#  to be defined elsewhere)
def predict_access_probability(features):
    # Combine multiple signals
    recency_score = 1.0 / (1 + features.time_since_last_access)
    frequency_score = features.access_count_1h / max_access_rate
    trend_score = features.trend_coefficient
    time_pattern_score = temporal_model.predict(
        features.time_of_day,
        features.day_of_week,
    )
    # ML model weighs and combines the signals
    probability = ml_model.predict([
        recency_score,
        frequency_score,
        trend_score,
        time_pattern_score,
        features.computation_cost_ms,
        features.size_bytes,
    ])
    return probability
Cost-Aware Eviction
The system calculates eviction cost as:
eviction_cost = (
    access_probability
    * computation_cost
    * size_efficiency_factor
)
# Evict items with the lowest cost:
# high probability + expensive to recompute = keep in cache
# low probability  + cheap to recompute     = safe to evict
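Here's a minimal sketch of victim selection under this scoring, assuming each candidate entry already carries its predicted probability and cost fields (the attribute names are illustrative):

import heapq

def select_victims(entries, n_to_evict):
    """Return the n cache entries cheapest to evict under the cost model."""
    def eviction_cost(e):
        # Low cost = low predicted value of keeping the item cached
        return e.access_probability * e.computation_cost_ms * e.size_efficiency

    return heapq.nsmallest(n_to_evict, entries, key=eviction_cost)

heapq.nsmallest avoids a full sort when only a handful of victims are needed per eviction pass.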
Real-World Performance Improvements
E-Commerce Product Catalog
- Workload: 10M products, 5M accessed daily, heavy temporal patterns
- LRU hit rate: 82%
- ML eviction hit rate: 94%
- Improvement: +12% absolute, 67% reduction in misses
Social Media Feed
- Workload: Rapidly changing content, temporal access patterns
- LFU hit rate: 76% (stale content problem)
- ML eviction hit rate: 91%
- Improvement: +15% absolute, 63% reduction in misses
Key ML Eviction Strategies
1. Temporal Pattern Recognition
ML models detect time-based patterns humans miss:
# Detected pattern: User profiles accessed heavily
# Mon-Fri 9am-5pm, minimal weekend access
# Traditional LRU/LFU: Treats all times equally
# ML eviction: Aggressively caches profiles during
# weekday business hours, allows eviction on weekends
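One common way to expose such patterns to a model (an implementation choice assumed here, not specified above) is cyclical encoding, so that hour 23 and hour 1 land close together in feature space:

import math

def encode_time_features(hour, day_of_week):
    """Cyclical encoding: nearby times get nearby feature values."""
    return [
        math.sin(2 * math.pi * hour / 24),
        math.cos(2 * math.pi * hour / 24),
        math.sin(2 * math.pi * day_of_week / 7),
        math.cos(2 * math.pi * day_of_week / 7),
    ]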
2. Trend Detection
Identify rising and falling access trends:
# Trending up:   keep in cache even with a low historical count
# Trending down: evict even with a high historical count
def calculate_trend(access_history):
    recent_rate = access_history.last_1h            # accesses in the last hour
    historical_rate = access_history.last_24h / 24  # average hourly rate over 24h
    if historical_rate == 0:
        return 0.0  # no history yet; treat as flat
    return (recent_rate - historical_rate) / historical_rate
3. Size-Efficiency Optimization
Large low-value items get evicted before small high-value items:
# 10MB video thumbnail (rarely accessed)
# vs 2KB user session (frequently accessed)
value_per_byte = access_probability / size_bytes
# Evict low value-per-byte items first,
# even if access counts are similar
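A quick worked comparison using the two items above, with hypothetical access probabilities:

# Hypothetical numbers for the two items above
thumbnail_value = 0.05 / (10 * 1024 * 1024)  # ~4.8e-9 per byte
session_value   = 0.50 / 2048                # ~2.4e-4 per byte
# The session is roughly 50,000x more valuable per byte,
# so the thumbnail is evicted first despite similar access counts.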
4. Computation-Cost Weighting
Items expensive to regenerate stay cached longer:
# Computed recommendation: 500ms to generate
# vs simple database query: 5ms
keep_score = access_probability * computation_cost_ms
# High computation cost items stay cached
# even with lower access probability
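Plugging hypothetical access probabilities into the scenario above:

# Hypothetical access probabilities for the two items above
recommendation_score = 0.30 * 500  # = 150 (500ms to regenerate)
query_score          = 0.60 * 5    # = 3   (5ms to regenerate)
# The recommendation stays cached despite its lower access probability.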
Online Learning: Adapting to Traffic Changes
Static models become stale as traffic patterns evolve. Online learning continuously updates the eviction model:
Every 5 minutes:
1. Measure actual access patterns against predictions
2. Calculate prediction accuracy
3. Update model weights based on the errors
4. Deploy the updated model with <1ms interruption

Result: the model adapts to traffic shifts in minutes instead of weeks of manual retraining.
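As a sketch of what one such update step might look like, assuming a logistic-regression scorer trained by stochastic gradient descent (the learning rate and feature layout are illustrative):

import math

def sgd_update(weights, features, accessed, lr=0.01):
    """One online-learning step: nudge weights toward the observed outcome.

    features: the item's feature vector; accessed: 1.0 if the item was
    actually accessed in the window, else 0.0.
    """
    z = sum(w * x for w, x in zip(weights, features))
    predicted = 1.0 / (1.0 + math.exp(-z))  # predicted access probability
    error = predicted - accessed            # gradient of log-loss w.r.t. z
    return [w - lr * error * x for w, x in zip(weights, features)]

Because each step touches only a small weight vector, updates are cheap enough to run continuously in the background.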
Implementation Considerations
Computational Overhead
ML eviction must be fast enough for production use:
- Feature collection: 0.1-0.5μs per operation
- Prediction: 1-5μs per item during eviction
- Model update: Background process, zero impact
Modern implementations add <1% CPU overhead compared to LRU.
Memory Overhead
Feature storage requires additional memory:
# Per-item metadata:
# Traditional LRU: 16 bytes (timestamp)
# ML eviction: 64-128 bytes (features + prediction)
# For 1M cached items:
# LRU overhead: 16MB
# ML overhead: 64-128MB
# Trade-off: 48-112MB extra memory for 15%+ hit rate improvement
When to Use ML Eviction
ML-powered eviction provides the most value when:
- Complex access patterns: Temporal trends, seasonal traffic
- High eviction rate: Limited memory, aggressive churn
- Expensive cache misses: Database queries, API calls, computation
- Changing workloads: Daily/weekly pattern shifts
Less valuable for:
- Small caches (<1000 items)
- Uniform random access patterns
- Workloads with 95%+ hit rates already
Conclusion
LRU and LFU served us well for decades, but modern applications demand smarter eviction. Machine learning transforms cache eviction from reactive to predictive, considering recency, frequency, trends, costs, and temporal patterns simultaneously.
The result: 15-25% higher hit rates, 30-40% memory savings, and automatic adaptation to changing traffic—all with minimal overhead. As cache workloads grow more complex, ML-powered eviction is becoming essential infrastructure.
Experience ML-Powered Cache Eviction
Cachee.ai uses adaptive ML models to optimize eviction automatically. 94% hit rates, zero configuration.