
Cache Invalidation: The Only Guide You Need

April 29, 2026 | 14 min read | Engineering

Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things. The quote is overused because the problem is real. Every engineer who has added caching to a system has, at some point, stared at stale data wondering why the cache did not update. Every production outage postmortem that mentions "stale cache" is a cache invalidation failure. And yet, despite the problem being decades old, most teams approach it ad hoc: they add a cache.delete(key) call wherever they remember to, miss a few write paths, and then spend the next six months chasing inconsistency bugs.

The root cause of most cache invalidation bugs is not technical. It is architectural. Teams do not choose an invalidation pattern. They write invalidation code piecemeal, one write path at a time, and the result is a patchwork of TTL-based expiry here, explicit deletion there, and nothing at all on the three write paths that the original developer did not know about. The fix is not better code. It is choosing a single invalidation pattern and applying it systematically across every write path.

This guide walks through six invalidation patterns, ranked from simplest to most sophisticated. Each pattern has a specific set of tradeoffs. None of them is universally best. The right choice depends on your consistency requirements, your infrastructure, and how often your data changes. But any one of them, applied consistently, is better than the ad hoc approach that most teams use.


Pattern 1: TTL-Based Invalidation

How It Works

Every cache entry gets a time-to-live (TTL). When the TTL expires, the entry is removed (or marked as stale) and the next read triggers a fresh fetch from the source of truth. No explicit invalidation is needed. The cache is self-healing: even if you never explicitly invalidate anything, every entry will eventually be refreshed.

TTL-based invalidation is the simplest pattern and the most commonly used. It requires no coordination between writers and the cache. Writers write to the database. Readers read from the cache. The cache refreshes itself on a schedule determined by the TTL. The maximum staleness is equal to the TTL: if you set a 60-second TTL, the cache may serve data that is up to 60 seconds old. If you set a 5-second TTL, the maximum staleness is 5 seconds, but you pay more cache misses and more database load.

Implementation

# Simple TTL-based caching
def get_user(user_id):
    cache_key = f"user:{user_id}"

    # Check cache
    cached = cache.get(cache_key)
    if cached is not None:
        return cached

    # Cache miss: fetch from database
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Store with 60-second TTL
    cache.set(cache_key, user, ttl=60)

    return user

# No invalidation code needed on writes.
# The cache entry will expire in at most 60 seconds.
def update_user(user_id, data):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id, data)
    # No cache.delete() needed -- TTL handles it

When to Use

TTL-based invalidation is the right choice when your application can tolerate bounded staleness. If serving data that is 30 seconds old is acceptable, set a 30-second TTL and stop worrying about invalidation. This covers the vast majority of cache use cases: API response caches, feature flags, configuration values, product catalog data, user profiles (for display, not for authentication), and search results. The key question is: "What is the worst thing that happens if a user sees data that is N seconds old?" If the answer is "nothing significant," TTL-based invalidation is sufficient.

The Tradeoff

The staleness window is fixed and unconditional. If data changes one second after a cache entry is written, the stale version is served for TTL minus one seconds. There is no way to invalidate early without adding explicit invalidation (which means you are no longer using pure TTL-based invalidation). The other tradeoff is the thundering herd problem: when a popular cache entry expires, many concurrent requests may simultaneously miss the cache and hit the database. Mitigate this with request coalescing (also called "single-flight"): when multiple requests miss the same key simultaneously, only one request fetches from the database, and the others wait for its result.
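The single-flight mitigation fits in a few lines. The sketch below is an illustrative in-process implementation using a lock and per-key events (the class name and API are my own, not from any particular library); production systems often use a library or a cache-server feature instead, and error propagation to waiters is omitted for brevity.

```python
import threading

class SingleFlight:
    """Coalesce concurrent cache misses for the same key: the first
    caller (the "leader") runs the fetch; later callers wait and
    reuse its result. Error propagation is omitted for brevity."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (event, result_holder)

    def do(self, key, fetch):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
        event, result = entry
        if leader:
            try:
                result["value"] = fetch()
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()  # wake the waiters
        else:
            event.wait()
        return result["value"]
```

Wrap the database fetch at the cache-miss point in `single_flight.do(cache_key, fetch)` so that only one concurrent miss per key reaches the database.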

Pattern 2: Write-Through Invalidation

How It Works

Every write to the database is accompanied by an explicit cache invalidation (delete) or update (set). When you update a user's email in the database, you also delete or update the cached user object. The cache is always consistent with the database, modulo the latency between the database write and the cache operation (typically microseconds on the same network).

There are two variants: write-through delete (delete the cache entry on write, forcing the next read to re-populate from the database) and write-through update (write the new value to both the database and the cache simultaneously). Write-through delete is simpler and safer: you do not need to construct the new cached value on the write path, and you avoid the race condition where two concurrent writes update the cache in the wrong order. Write-through update is faster for reads (no miss after a write) but more complex to implement correctly.

Implementation

# Write-through delete: invalidate on every write
def update_user(user_id, data):
    # Write to database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id, data)

    # Invalidate cache (next read will re-populate)
    cache.delete(f"user:{user_id}")

# Write-through update: update both on every write
def update_user_v2(user_id, data):
    # Write to database
    db.execute("UPDATE users SET ... WHERE id = %s", user_id, data)

    # Fetch the updated record and write to cache
    updated_user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    cache.set(f"user:{user_id}", updated_user, ttl=300)

# IMPORTANT: ordering matters. Always write to the database
# first, then delete or update the cache. If you update the
# cache first and the DB write fails, the cache holds data
# that does not exist in the DB.

When to Use

Write-through invalidation is the right choice when you need zero staleness and you control all write paths. The critical requirement is "all write paths." If your application writes to the database through 15 different endpoints, all 15 must include the cache invalidation call. If a background job, a migration script, or another service writes to the same database table without invalidating the cache, you get stale data. Write-through works best for systems with a small number of well-defined write paths: a CRUD API with 4 endpoints, a user service with create/update/delete operations, or a configuration management system with a single admin interface.

The Tradeoff

The primary tradeoff is write amplification. Every database write now includes an additional network call to the cache. For low-write workloads, this is negligible. For high-write workloads (thousands of writes per second), the additional cache operations add latency to the write path and load to the cache server. The second tradeoff is coverage: you must ensure that every write path includes the invalidation call. If you miss one, you have an inconsistency bug that may not manifest for hours or days (until the TTL expires, if you have one as a safety net). For this reason, most teams that use write-through invalidation also set a TTL as a fallback, creating a belt-and-suspenders approach where explicit invalidation handles the common case and TTL handles the edge cases.

Pattern 3: Event-Driven / Change Data Capture (CDC)

How It Works

Instead of adding cache invalidation to every write path in your application code, you listen to the database's change stream (WAL, binlog, change feed) and invalidate the cache whenever a relevant row changes. This decouples cache invalidation from application code entirely. No matter how the data changes -- through your API, a background job, a migration script, a direct SQL query, or another service -- the cache is invalidated because the invalidation is triggered by the database, not by the application.

In PostgreSQL, this uses logical replication slots and the pgoutput plugin. In MySQL, it uses the binlog. In MongoDB, it uses change streams. A CDC consumer (Debezium, AWS DMS, or a custom consumer) reads the change events and translates them into cache invalidation commands. When a row in the users table is updated, the CDC consumer receives the change event and deletes user:{id} from the cache.

Implementation

# CDC consumer (simplified): listens to PostgreSQL WAL
# and invalidates cache entries when rows change.
# Assumes the replication slot and publication already exist.

import psycopg2
from psycopg2.extras import LogicalReplicationConnection

def cdc_consumer():
    conn = psycopg2.connect(dsn, connection_factory=LogicalReplicationConnection)
    cursor = conn.cursor()
    cursor.start_replication(slot_name='cache_invalidation',
                             options={'publication_names': 'cache_pub'})

    for msg in cursor:
        change = parse_wal_message(msg.payload)

        if change.table == 'users':
            cache.delete(f"user:{change.row['id']}")
        elif change.table == 'products':
            cache.delete(f"product:{change.row['id']}")
            cache.delete(f"product:slug:{change.row['slug']}")

        msg.cursor.send_feedback(flush_lsn=msg.data_start)

# The application code has ZERO cache invalidation logic.
# Writers just write to the database:
def update_user(user_id, data):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id, data)
    # No cache code here. CDC handles it.

When to Use

CDC-based invalidation is the right choice when you have multiple write paths to the same data (multiple services, background jobs, admin tools, migrations) and you cannot guarantee that every write path will include cache invalidation. It is also the right choice when you want to separate concerns: application developers write business logic, and the caching layer handles its own consistency. CDC is particularly powerful in microservice architectures where multiple services write to shared data stores and coordinating cache invalidation across service boundaries is impractical.

The Tradeoff

CDC adds infrastructure complexity. You need a replication slot or binlog consumer, a message broker or streaming platform (Kafka, Kinesis, or direct consumption), and a CDC consumer service that translates database changes into cache operations. This is significant operational overhead compared to adding a cache.delete() call to your write path. The latency between a database write and the cache invalidation is also non-zero: typically 50-500 milliseconds depending on your CDC pipeline. During that window, the cache may serve stale data. For most applications this latency is acceptable, but for applications that require strong read-after-write consistency, CDC alone is not sufficient -- you need write-through invalidation on the write path with CDC as a safety net for write paths you do not control.

Pattern 4: Tag-Based Invalidation

How It Works

Each cache entry is associated with one or more tags. When you need to invalidate a group of related entries, you invalidate by tag rather than by individual key. For example, a cached product page might be tagged with product:123, category:electronics, and user:456 (the user who last reviewed it). When product 123 is updated, you invalidate the product:123 tag, which removes all cache entries associated with that tag -- the product page, the category listing that includes the product, the search results that reference it, and any other cached response that contains product 123 data.

Tag-based invalidation solves the "derived data" problem. In most applications, cached responses are derived from multiple source records. A product page includes data from the products table, the reviews table, the inventory table, and the pricing table. If any of those source records changes, the cached product page is stale. With key-based invalidation, you need to know every cache key that is affected by a change to each source record, which quickly becomes intractable. With tag-based invalidation, you tag each cache entry with the source records it depends on, and invalidation is automatic.

Implementation

# Tag-based cache: each entry has one or more tags
def get_product_page(product_id):
    cache_key = f"page:product:{product_id}"
    cached = cache.get(cache_key)
    if cached:
        return cached

    # Build the page from multiple data sources
    product = db.query("SELECT * FROM products WHERE id = %s", product_id)
    reviews = db.query("SELECT * FROM reviews WHERE product_id = %s", product_id)
    inventory = db.query("SELECT * FROM inventory WHERE product_id = %s", product_id)

    page = render_product_page(product, reviews, inventory)

    # Cache with tags for all source records
    cache.set(cache_key, page, ttl=300, tags=[
        f"product:{product_id}",
        f"reviews:product:{product_id}",
        f"inventory:product:{product_id}"
    ])
    return page

# When product data changes, invalidate by tag:
def update_product(product_id, data):
    db.execute("UPDATE products SET ... WHERE id = %s", product_id, data)
    cache.invalidate_tag(f"product:{product_id}")
    # This removes ALL cache entries tagged with "product:123",
    # including the product page, category listings, search results, etc.

# When a review is added, invalidate the review tag:
def add_review(product_id, review):
    db.execute("INSERT INTO reviews ...", product_id, review)
    cache.invalidate_tag(f"reviews:product:{product_id}")
    # This removes cache entries that include review data for this product

When to Use

Tag-based invalidation is the right choice when your cached entries are composed from multiple source records and you need to invalidate all cached entries that depend on a specific source record. This is common in content management systems (a page depends on multiple content blocks), e-commerce (a product page depends on product, reviews, inventory, and pricing), and dashboards (a dashboard widget depends on multiple data sources). Tag-based invalidation is also useful for "invalidate everything for this user" scenarios: tag all cache entries for user 123 with user:123, and a single tag invalidation clears all of that user's cached data.

The Tradeoff

Tag-based invalidation requires the cache layer to maintain a mapping from tags to keys, which adds memory overhead and complexity. Redis does not natively support tag-based invalidation (though you can implement it with sets: SADD tag:product:123 page:product:123 category:electronics:page:2). The invalidation operation is O(N) in the number of keys associated with a tag, which can be slow if a single tag is associated with thousands of keys. The operational complexity is also higher: you need to ensure that tags are correctly assigned when entries are written and correctly cleaned up when entries are evicted or expired. Stale tag-to-key mappings (where the tag references a key that no longer exists) waste memory and slow down invalidation operations.
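The set-based Redis workaround mentioned above can be mirrored in a few lines. The sketch below is a pure in-memory stand-in (illustrative API, not redis-py): `set` records the tag-to-key mapping the way `SADD` would, and `invalidate_tag` plays the role of `SMEMBERS` followed by `DEL`.

```python
class TaggedCache:
    """In-memory sketch of set-based tag invalidation: each tag
    maps to the set of cache keys that carry it."""

    def __init__(self):
        self._data = {}  # cache key -> value
        self._tags = {}  # tag -> set of cache keys

    def set(self, key, value, tags=()):
        self._data[key] = value
        for tag in tags:
            self._tags.setdefault(tag, set()).add(key)  # like SADD

    def get(self, key):
        return self._data.get(key)

    def invalidate_tag(self, tag):
        # like SMEMBERS + DEL: drop every key carrying this tag,
        # and the tag's own index entry with it
        for key in self._tags.pop(tag, set()):
            self._data.pop(key, None)
```

Note that this sketch does not clean up tag-to-key mappings when entries are evicted or expire, which is exactly the stale-mapping overhead described above.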

Pattern 5: Version-Based Invalidation

How It Works

Instead of invalidating cache entries when data changes, you include a version number in the cache key. When the data changes, you increment the version number. New reads use the new version number in the cache key, which is a cache miss (forcing a fresh fetch). Old entries with the old version number are never read again and naturally expire via TTL.

The version number can be a simple counter, a timestamp, or a content hash. The key property is that it changes whenever the underlying data changes. You store the current version number in a fast-access location (a separate cache key, an in-memory variable, or a database column) and include it in every cache key for that data.

Implementation

# Version-based invalidation
def get_product(product_id):
    # Get current version for this product. Default to 0 when the
    # version key does not exist: INCR on a missing key creates it
    # at 1, so the first update correctly moves reads off v0.
    version = cache.get(f"product:{product_id}:version") or 0

    # Cache key includes the version
    cache_key = f"product:{product_id}:v{version}"

    cached = cache.get(cache_key)
    if cached:
        return cached

    product = db.query("SELECT * FROM products WHERE id = %s", product_id)
    cache.set(cache_key, product, ttl=3600)  # Long TTL is safe
    return product

def update_product(product_id, data):
    db.execute("UPDATE products SET ... WHERE id = %s", product_id, data)

    # Increment version: old cache key is now orphaned
    cache.incr(f"product:{product_id}:version")

    # Old entry "product:123:v1" still exists but will never be read.
    # It expires naturally when its TTL runs out.
    # New reads use "product:123:v2", which is a cache miss.

When to Use

Version-based invalidation is elegant for scenarios where you want instant invalidation without the complexity of tag-based systems. It is particularly useful when you cannot reliably delete cache entries (for example, when using a CDN that caches responses and you cannot purge individual URLs). By changing the version in the cache key, you effectively create a new cache namespace, and the old namespace is abandoned. Version-based invalidation is also useful for batch invalidation: if you need to invalidate all cached data for a service (after a schema migration, for example), you can increment a global version counter instead of enumerating and deleting millions of individual keys.
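The batch-invalidation idea can be sketched as a version prefix shared by an entire namespace. The class name and the dict-backed cache stub below are illustrative, not from any particular library:

```python
class DictCache:
    """Minimal dict-backed stand-in for a cache client."""
    def __init__(self):
        self._d = {}
    def get(self, key):
        return self._d.get(key)
    def set(self, key, value, ttl=None):
        self._d[key] = value  # a real cache would honor the TTL

class VersionedNamespace:
    """Namespace-wide version prefix: bumping the version orphans
    every key in the namespace at once, with no enumeration."""
    def __init__(self, cache, namespace):
        self.cache = cache
        self.namespace = namespace

    def _version(self):
        return self.cache.get(f"{self.namespace}:version") or 1

    def key(self, suffix):
        # every cache key in the namespace embeds the version
        return f"{self.namespace}:v{self._version()}:{suffix}"

    def bump(self):
        # one write invalidates the whole namespace
        self.cache.set(f"{self.namespace}:version", self._version() + 1)
```

After a schema migration you would call `bump()` once instead of deleting millions of keys; the old `v1` entries age out via TTL.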

The Tradeoff

The primary tradeoff is memory waste. When you increment the version, the old cache entry is orphaned: it is still in the cache, consuming memory, but it will never be read again. It sits there until its TTL expires. If you have a 1-hour TTL and update frequently, you may have many orphaned entries consuming memory. This is usually acceptable because the entries expire eventually, but for memory-constrained environments, it is a real cost. The second tradeoff is the extra lookup: every cache read now requires two operations (fetch the version, then fetch the data), which adds a round-trip to every read. You can mitigate this by caching the version number in an in-process L1 cache with a very short TTL (a few seconds), so the version lookup is usually a nanosecond-scale in-process hit rather than a network round-trip to Redis.
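The in-process version mitigation can be sketched as a tiny time-bounded memo. This is illustrative (class and parameter names are my own); `fetch_version` stands in for whatever remote lookup you use, such as a Redis GET:

```python
import time

class L1VersionCache:
    """Memoize the remote version number in-process for a few
    seconds, so most reads skip the extra round-trip."""

    def __init__(self, fetch_version, max_age=5.0):
        self.fetch_version = fetch_version  # e.g. lambda: redis GET
        self.max_age = max_age
        self._value = None
        self._fetched_at = 0.0

    def get(self):
        now = time.time()
        if self._value is None or now - self._fetched_at > self.max_age:
            self._value = self.fetch_version()  # refresh from remote
            self._fetched_at = now
        return self._value
```

The `max_age` caps how long reads can use a stale version, so it re-introduces a small bounded-staleness window: a version bump may take up to `max_age` seconds to be noticed by a given process.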

Pattern 6: Semantic Invalidation

How It Works

Semantic invalidation uses the meaning of the data -- not the key name or a tag -- to determine which cache entries should be invalidated. When a new data point arrives, the system computes its semantic similarity to existing cache entries and invalidates entries that are "close enough" in meaning space. This is primarily useful for AI and ML workloads where queries are natural language or vector embeddings, and "similar" queries should return the same cached result.

Consider an LLM caching layer. A user asks "What is the capital of France?" and the response is cached. Another user asks "What's France's capital city?" -- this is semantically identical but lexically different. A traditional key-based cache would treat these as two different entries. A semantic cache computes the embedding of each query, finds that they are 0.97 cosine-similar, and returns the cached response for the first query. When the underlying data changes (perhaps France moves its capital, which seems unlikely but illustrates the point), semantic invalidation removes all cache entries whose embeddings are within a similarity threshold of the invalidation vector.

Implementation

# Semantic cache with vector similarity
import time

import numpy as np

class SemanticCache:
    def __init__(self, similarity_threshold=0.92):
        self.entries = {}  # key: (embedding, value, timestamp)
        self.threshold = similarity_threshold

    def get(self, query_embedding):
        best_match = None
        best_similarity = 0

        for key, (emb, value, ts) in self.entries.items():
            sim = np.dot(query_embedding, emb)  # cosine similarity (assumes L2-normalized embeddings)
            if sim > best_similarity and sim >= self.threshold:
                best_similarity = sim
                best_match = value

        return best_match

    def set(self, query_embedding, value):
        key = hash(query_embedding.tobytes())
        self.entries[key] = (query_embedding, value, time.time())

    def invalidate_semantic(self, invalidation_embedding, radius=0.85):
        """Remove all entries semantically similar to the
        invalidation vector."""
        to_remove = []
        for key, (emb, value, ts) in self.entries.items():
            sim = np.dot(invalidation_embedding, emb)
            if sim >= radius:
                to_remove.append(key)
        for key in to_remove:
            del self.entries[key]

# When the data about France changes:
france_embedding = embed("information about France capital")
cache.invalidate_semantic(france_embedding, radius=0.85)
# Removes all cached entries semantically related to France's capital

When to Use

Semantic invalidation is a specialized pattern for AI workloads: LLM response caching, RAG (retrieval-augmented generation) pipelines, embedding search results, and recommendation engines. It is not appropriate for traditional key-value cache workloads where keys are deterministic and exact-match lookup is sufficient. If your cache keys are user IDs, session tokens, or API endpoint paths, you do not need semantic invalidation. If your cache keys are natural language queries, vector embeddings, or content fingerprints, semantic invalidation lets you invalidate by meaning rather than by exact key match, which dramatically reduces the number of stale entries in your cache.

The Tradeoff

Semantic invalidation is computationally expensive. Computing the similarity between an invalidation vector and every cache entry is O(N) in the number of entries. For a cache with 1 million entries, this requires 1 million dot products, which takes approximately 50 milliseconds on modern hardware. You can reduce this with approximate nearest neighbor (ANN) indexing (HNSW, IVF), which reduces the search to O(log N) but adds index maintenance overhead. The second tradeoff is precision: semantic similarity is fuzzy. A threshold of 0.92 might invalidate entries that should be retained, or retain entries that should be invalidated. There is no perfect threshold -- it depends on the embedding model and the data domain. You need to tune the threshold empirically based on your specific workload.
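Before reaching for ANN indexes, the O(N) scan can be made much cheaper by vectorizing it: store the embeddings as one matrix and compute all similarities with a single matrix-vector product instead of N Python-level dot products. A sketch (the function name is my own; it assumes the matrix rows and the invalidation vector are L2-normalized, so the dot product equals cosine similarity):

```python
import numpy as np

def invalidate_semantic_batch(keys, embedding_matrix,
                              invalidation_embedding, radius=0.85):
    """Vectorized invalidation scan: one matrix-vector product
    over all N entries. Returns the surviving keys and their
    embedding rows."""
    sims = embedding_matrix @ invalidation_embedding  # shape (N,)
    keep = sims < radius  # boolean mask of entries to retain
    kept_keys = [k for k, kept in zip(keys, keep) if kept]
    return kept_keys, embedding_matrix[keep]
```

This moves the per-entry work into a single BLAS call; only past a few million entries does an ANN index (HNSW, IVF) start to pay for its maintenance overhead.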

The Meta-Insight: Pick One Pattern and Apply It Everywhere

The six patterns above cover the full spectrum from simple to sophisticated. Most applications need only one or two. The most common mistake is not picking a pattern at all. When invalidation logic is ad hoc -- a TTL here, an explicit delete there, a "we'll fix it later" on the third write path -- the result is a system where nobody can reason about cache consistency because there is no consistent model to reason about.

The Most Common Invalidation Bug

The most common cache invalidation bug is not a race condition or a timing issue. It is a missing invalidation call on a write path that the original developer did not know about. A developer adds caching to the user profile endpoint and adds a cache.delete() call to the user update endpoint. But the admin panel has its own user update endpoint. The password reset flow updates the user's password hash. The billing system updates the user's plan. The onboarding flow updates the user's preferences. None of these write paths invalidate the cache. The user updates their email through the main app and sees the change immediately. Their admin updates their role through the admin panel and the user sees the old role for the next 5 minutes (the TTL). The bug is not in the cache. The bug is in the assumption that one developer can enumerate all write paths.

The fix is to choose one pattern and apply it systematically. For most teams, the right starting point is TTL-based invalidation (Pattern 1) as the baseline, combined with write-through delete (Pattern 2) on the primary write paths. This gives you zero-staleness on the write paths you control and bounded staleness (equal to the TTL) on the write paths you do not control. As your system grows, you can upgrade to CDC (Pattern 3) to handle the write paths you do not control, and add tag-based invalidation (Pattern 4) for derived data that depends on multiple source records.
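That recommended baseline, write-through delete on the paths you control plus a TTL safety net, fits in a few lines. The sketch below uses dict-backed stand-ins for the cache and database; all names are illustrative:

```python
class DictCache:
    """Minimal dict-backed stand-in for a cache client."""
    def __init__(self):
        self._d = {}
    def get(self, key):
        return self._d.get(key)
    def set(self, key, value, ttl=None):
        self._d[key] = value  # a real cache would honor the TTL
    def delete(self, key):
        self._d.pop(key, None)

cache = DictCache()
db = {}  # user_id -> user dict, standing in for the database

def get_user(user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = db[user_id]
        cache.set(key, user, ttl=300)  # TTL safety net: 5 min max staleness
    return user

def update_user(user_id, data):
    db[user_id] = {**db.get(user_id, {}), **data}  # database first
    cache.delete(f"user:{user_id}")                # then invalidate
```

Write paths that go through `update_user` see zero staleness; any write path that bypasses it is bounded by the 5-minute TTL.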

Pattern                Complexity  Max Staleness  Coverage                Best For
1. TTL-based           Low         TTL seconds    Automatic               Most workloads
2. Write-through       Low         Near-zero      Known write paths only  CRUD APIs
3. CDC / event-driven  High        50-500 ms      All write paths         Microservices
4. Tag-based           Medium      Near-zero      Tagged entries          Derived data
5. Version-based       Low         Near-zero      Versioned keys          CDN / batch
6. Semantic            High        Near-zero      Similarity radius       AI / LLM cache

The Bottom Line

Cache invalidation is hard not because the patterns are complex, but because most teams do not pick a pattern. They write invalidation code one endpoint at a time, miss write paths, and end up with a system that is sometimes consistent and sometimes not. The fix is architectural: choose one of the six patterns above, apply it to every write path, and use TTL as a safety net. If you do this, cache invalidation stops being a hard problem and becomes a solved problem with known tradeoffs.
