Overview
Traditional cache invalidation relies on key names: you either know the exact key or you match a prefix pattern. But what happens when a pricing change affects cart:*, checkout:*, recommendation:*, and email_template:promo:*? You need to know every affected prefix in advance, and a single missed prefix leaves stale data in the cache.
Semantic Invalidation solves this by associating each cache key with an embedding vector that captures its meaning. When you invalidate, you provide an intent embedding (e.g., "pricing change") and a confidence threshold. Cachee finds all keys whose embeddings are semantically similar to the intent and invalidates them — regardless of their key naming convention.
Semantic Invalidation is most valuable when your cache keys span multiple naming conventions but share conceptual relationships. It supplements (not replaces) exact-key and prefix-based invalidation. Use it for cross-cutting concerns like "all pricing-related data" or "all data affected by user GDPR deletion."
Data Structure
Embeddings are stored as normalized Vec<f32> vectors. Normalization happens on ingestion, so every stored embedding has unit L2 norm and cosine similarity reduces to a plain dot product. Each comparison then costs only one multiply-add per dimension, with no norm computation or division.
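As a minimal sketch of normalization on ingestion, the helper below scales a vector to unit L2 norm. The function name `normalize` and the choice to reject zero-length vectors are assumptions for illustration, not Cachee's actual API.

```rust
/// Hypothetical helper: scale `v` to unit L2 norm.
/// Returns None for empty, all-zero, or non-finite input, because such a
/// vector has no direction and cannot be normalized.
fn normalize(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 || !norm.is_finite() {
        return None;
    }
    Some(v.iter().map(|x| x / norm).collect())
}
```

Rejecting the zero vector up front avoids storing NaN components, which would otherwise poison every later similarity comparison.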
Registering Embeddings
Associate an embedding with a cache key at write time. The embedding captures the semantic meaning of the cached value.
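Registration could look roughly like the sketch below. `SemanticIndex`, `RegisterError`, and the `register` signature are hypothetical names chosen for illustration; the dimension and capacity checks mirror the `semantic.dimensions` and `semantic.max_embeddings` settings described under Configuration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum RegisterError {
    DimensionMismatch { expected: usize, got: usize },
    ZeroVector,
    CapacityExceeded,
}

/// Hypothetical in-memory index mapping cache keys to unit-length embeddings.
struct SemanticIndex {
    dimensions: usize,
    max_embeddings: usize,
    embeddings: HashMap<String, Vec<f32>>,
}

impl SemanticIndex {
    fn new(dimensions: usize, max_embeddings: usize) -> Self {
        Self { dimensions, max_embeddings, embeddings: HashMap::new() }
    }

    /// Validate, normalize, and store the embedding for `key`.
    /// Re-registering an existing key overwrites its embedding.
    fn register(&mut self, key: &str, embedding: &[f32]) -> Result<(), RegisterError> {
        if embedding.len() != self.dimensions {
            return Err(RegisterError::DimensionMismatch {
                expected: self.dimensions,
                got: embedding.len(),
            });
        }
        let norm = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm == 0.0 || !norm.is_finite() {
            return Err(RegisterError::ZeroVector);
        }
        let is_new = !self.embeddings.contains_key(key);
        if is_new && self.embeddings.len() >= self.max_embeddings {
            return Err(RegisterError::CapacityExceeded);
        }
        let unit: Vec<f32> = embedding.iter().map(|x| x / norm).collect();
        self.embeddings.insert(key.to_string(), unit);
        Ok(())
    }
}
```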
Cosine Similarity
Given two unit-normalized vectors A and B of dimension n, cosine similarity is their dot product:
similarity(A, B) = A · B = Σ Aᵢ·Bᵢ for i = 1..n
Because embeddings are pre-normalized at registration time, no division or square root is needed at query time. A 128-dimensional dot product completes in ~50ns on modern hardware.
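In Rust, the query-time computation can be sketched as a single pass over both slices. The function name `dot` is an assumption, and the result equals cosine similarity only when both inputs already have unit length.

```rust
/// Dot product of two equal-length slices.
/// For unit-normalized inputs this equals their cosine similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Identical unit vectors score 1.0 and orthogonal ones score 0.0, so thresholds such as 0.85 select only closely aligned embeddings.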
find_related_keys
Given an intent embedding and a threshold, scan all registered embeddings and return keys whose similarity exceeds the threshold.
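The scan described above might be sketched as follows. The signature, the `HashMap` store, and the descending sort by score are assumptions for illustration; the sketch also assumes that both the stored embeddings and the intent are already unit-normalized.

```rust
use std::collections::HashMap;

/// Linear scan: return (key, similarity) for every registered embedding whose
/// dot product with `intent` is strictly greater than `threshold`.
/// Entries whose dimension differs from the intent are skipped.
fn find_related_keys(
    embeddings: &HashMap<String, Vec<f32>>,
    intent: &[f32],
    threshold: f32,
) -> Vec<(String, f32)> {
    let mut hits: Vec<(String, f32)> = embeddings
        .iter()
        .filter(|(_, e)| e.len() == intent.len())
        .map(|(k, e)| {
            let score = e.iter().zip(intent).map(|(x, y)| x * y).sum::<f32>();
            (k.clone(), score)
        })
        .filter(|(_, score)| *score > threshold)
        .collect();
    // Most similar keys first.
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits
}
```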
INVALIDATE EMBEDDING Command
The CONFIDENCE parameter sets the similarity threshold. Higher values invalidate fewer keys (more precise); lower values invalidate more keys (broader reach). If CONFIDENCE is omitted, the engine uses semantic.default_threshold.
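The fallback to the default threshold can be illustrated with a small sketch. `resolve_threshold` is a hypothetical helper, and rejecting values outside 0.0 to 1.0 is an assumption based on the range documented for `semantic.default_threshold`.

```rust
/// Hypothetical helper: pick the threshold for one invalidation request.
/// `confidence` is the optional per-command value; `default_threshold`
/// stands in for the configured semantic.default_threshold.
fn resolve_threshold(confidence: Option<f32>, default_threshold: f32) -> Result<f32, String> {
    let threshold = confidence.unwrap_or(default_threshold);
    if (0.0..=1.0).contains(&threshold) {
        Ok(threshold)
    } else {
        // NaN also lands here, because it is not contained in any range.
        Err(format!("threshold {threshold} is outside 0.0..=1.0"))
    }
}
```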
invalidate_by_intent
A higher-level function that accepts a text string, generates an embedding via the configured embedding model, and then performs the similarity search. This is a convenience wrapper for applications that do not pre-compute embeddings.
The INVALIDATE INTENT command requires a configured embedding model endpoint (semantic.embedding_endpoint). The INVALIDATE EMBEDDING command works without a model — you provide the pre-computed vector directly.
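A rough sketch of the text-to-invalidation flow is shown below. The `Embedder` trait, the `FixedEmbedder` stand-in for the HTTP model endpoint, and the function signature are all hypothetical. Whether the engine also drops the embedding registration of an invalidated key is an assumption made here.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over the configured embedding model.
trait Embedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Stand-in for a real model client: always returns the same vector.
struct FixedEmbedder(Vec<f32>);

impl Embedder for FixedEmbedder {
    fn embed(&self, _text: &str) -> Result<Vec<f32>, String> {
        Ok(self.0.clone())
    }
}

/// Embed `intent`, normalize it, scan the registered embeddings, and remove
/// every matching key from the cache. Returns the invalidated keys, sorted.
fn invalidate_by_intent(
    cache: &mut HashMap<String, String>,
    embeddings: &mut HashMap<String, Vec<f32>>,
    embedder: &dyn Embedder,
    intent: &str,
    threshold: f32,
) -> Result<Vec<String>, String> {
    let raw = embedder.embed(intent)?;
    let norm = raw.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 || !norm.is_finite() {
        return Err("embedding model returned a zero or non-finite vector".to_string());
    }
    let query: Vec<f32> = raw.iter().map(|x| x / norm).collect();

    let mut matched: Vec<String> = embeddings
        .iter()
        .filter(|(_, e)| e.len() == query.len())
        .filter(|(_, e)| e.iter().zip(&query).map(|(x, y)| x * y).sum::<f32>() > threshold)
        .map(|(k, _)| k.clone())
        .collect();
    matched.sort();

    for key in &matched {
        cache.remove(key);
        embeddings.remove(key); // assumption: the registration goes with the value
    }
    Ok(matched)
}
```

Keeping the model behind a trait lets the similarity search be exercised without a network call, which mirrors the split between the intent-based and embedding-based commands.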
Configuration
| Parameter | Default | Description |
|---|---|---|
| semantic.enabled | false | Enable the semantic invalidation subsystem |
| semantic.dimensions | 128 | Embedding dimensionality. All embeddings must match this size. |
| semantic.default_threshold | 0.85 | Default cosine similarity threshold for invalidation (0.0–1.0) |
| semantic.max_embeddings | 500000 | Maximum number of key→embedding registrations |
| semantic.embedding_endpoint | (none) | HTTP URL for text→embedding conversion (required for INVALIDATE INTENT) |
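For applications that mirror these settings in code, the defaults could be modeled roughly as below. The `SemanticConfig` struct and its methods are hypothetical; only the parameter names and default values come from the table.

```rust
/// Hypothetical in-code mirror of the semantic.* settings.
#[derive(Debug, Clone, PartialEq)]
struct SemanticConfig {
    enabled: bool,
    dimensions: usize,
    default_threshold: f32,
    max_embeddings: usize,
    embedding_endpoint: Option<String>,
}

impl Default for SemanticConfig {
    fn default() -> Self {
        Self {
            enabled: false,
            dimensions: 128,
            default_threshold: 0.85,
            max_embeddings: 500_000,
            embedding_endpoint: None,
        }
    }
}

impl SemanticConfig {
    /// Basic sanity checks; the exact rules the engine enforces are assumed.
    fn validate(&self) -> Result<(), String> {
        if self.dimensions == 0 {
            return Err("dimensions must be greater than zero".to_string());
        }
        if !(0.0..=1.0).contains(&self.default_threshold) {
            return Err("default_threshold must be within 0.0..=1.0".to_string());
        }
        if self.max_embeddings == 0 {
            return Err("max_embeddings must be greater than zero".to_string());
        }
        Ok(())
    }

    /// Intent-based invalidation needs the subsystem enabled and an endpoint.
    fn supports_intent(&self) -> bool {
        self.enabled && self.embedding_endpoint.is_some()
    }
}
```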
Performance & Memory
| Metric | Value |
|---|---|
| Single dot product (128-dim) | ~50 ns |
| Full scan, 100K embeddings | ~5 ms |
| Full scan, 500K embeddings | ~25 ms |
| Memory per embedding (128-dim) | 512 bytes (128 × 4 bytes) + key overhead |
| 100K embeddings total memory | ~60 MB |
| 500K embeddings total memory | ~300 MB |
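The figures in the table follow from simple arithmetic, sketched below. The function names are hypothetical, and the scan estimate assumes that cost scales linearly with both the embedding count and the dimensionality, starting from the ~50 ns figure for 128 dimensions.

```rust
/// Approximate cost of one 128-dimensional dot product, from the table above.
const DOT_PRODUCT_NS_AT_128: u64 = 50;

/// Bytes used by the raw f32 vectors alone (excludes key and map overhead).
fn raw_vector_bytes(count: u64, dimensions: u64) -> u64 {
    count * dimensions * std::mem::size_of::<f32>() as u64
}

/// Estimated full-scan time in nanoseconds, assuming linear scaling.
fn estimated_scan_ns(count: u64, dimensions: u64) -> u64 {
    count * dimensions * DOT_PRODUCT_NS_AT_128 / 128
}
```

For 100K embeddings at 128 dimensions this gives 51.2 MB of raw vector data and a 5 ms scan; the gap between 51.2 MB and the ~60 MB in the table is the per-key overhead.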
Semantic invalidation uses a linear scan over all embeddings. The scan is fast at 100K embeddings, but its cost grows linearly with the number of registrations. For caches with more than 500K semantically indexed keys, consider partitioning by prefix or using lower-dimensional embeddings (64-dim halves both memory and scan time).
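One way to picture partitioning by prefix is the sketch below, which groups embeddings by the key segment before the first `:` and scans only the partitions the caller names. `PartitionedIndex` and its methods are hypothetical, and the sketch assumes unit-normalized inputs and does not deduplicate re-registered keys.

```rust
use std::collections::HashMap;

/// Hypothetical index that groups embeddings by the key's leading segment.
#[derive(Default)]
struct PartitionedIndex {
    partitions: HashMap<String, Vec<(String, Vec<f32>)>>,
}

impl PartitionedIndex {
    /// Store an already unit-normalized embedding under the key's prefix.
    fn insert(&mut self, key: &str, unit_embedding: Vec<f32>) {
        let prefix = key.split(':').next().unwrap_or(key).to_string();
        self.partitions
            .entry(prefix)
            .or_default()
            .push((key.to_string(), unit_embedding));
    }

    /// Scan only the named partitions, so cost scales with their size
    /// rather than with the total number of registrations.
    fn find_in(&self, prefixes: &[&str], intent: &[f32], threshold: f32) -> Vec<String> {
        let mut hits = Vec::new();
        for prefix in prefixes {
            if let Some(entries) = self.partitions.get(*prefix) {
                for (key, embedding) in entries {
                    let score: f32 = embedding.iter().zip(intent).map(|(x, y)| x * y).sum();
                    if score > threshold {
                        hits.push(key.clone());
                    }
                }
            }
        }
        hits
    }
}
```

The trade-off is that the caller must again choose which prefixes to scan, so this only helps when large groups of keys can be ruled out ahead of time.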
Semantic indexing is opt-in per key. Only register embeddings for keys that participate in cross-cutting invalidation patterns. Keys with predictable prefix-based relationships are better served by the Dependency Graph or standard prefix invalidation.