Redis 8 added Vector Sets over the network. We built HNSW in-process. Same algorithm. 660x faster. Hybrid search with metadata filters in one operation.
Every AI-powered application follows the same pattern: embed the input, find the nearest vectors, serve the result. RAG pipelines query embeddings before every LLM call. Recommendation engines score candidates on every page load. Semantic search runs similarity on every keystroke. These are hot-path operations measured in milliseconds or less.
The conventional answer is a dedicated vector database sitting behind a network connection. That means TCP serialization, connection pooling, and a round-trip for every query. For a RAG pipeline making 3-5 retrieval calls per user request, those milliseconds compound into visible latency. The vector database is not the bottleneck because it is slow at search. It is the bottleneck because it is on the other side of a network hop.
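To make the compounding concrete, here is a back-of-envelope sketch. The per-query figures are illustrative assumptions pulled from this article's own comparison (~1 ms for a networked query, 0.0015 ms in-process), not benchmarks of any particular stack:

```python
# Back-of-envelope: retrieval latency added to one user request.
# Both per-query figures below are illustrative assumptions.
NETWORK_QUERY_MS = 1.0        # same-datacenter round-trip + serialization
IN_PROCESS_QUERY_MS = 0.0015  # in-process function call (figure from this article)

def retrieval_overhead_ms(calls: int, per_query_ms: float) -> float:
    """Total vector-retrieval latency added to a single request."""
    return calls * per_query_ms

for calls in (3, 5):  # typical RAG pipeline: 3-5 retrieval calls per request
    net = retrieval_overhead_ms(calls, NETWORK_QUERY_MS)
    local = retrieval_overhead_ms(calls, IN_PROCESS_QUERY_MS)
    print(f"{calls} calls: {net:.3f} ms over the network vs {local:.4f} ms in-process")
```

At 5 calls per request, the network path alone contributes ~5 ms before the LLM does any work; the in-process path contributes well under a hundredth of a millisecond.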
Cachee eliminates the hop. The HNSW index lives in your application's process memory. Vector search runs as a function call, not a network request. The same cache layer that stores your keys, sessions, and API responses now handles vector similarity at 0.0015ms per query. No additional infrastructure. No connection management. No serialization overhead. One process, one memory space, one data layer for both key-value and vector operations.
Cachee already runs AI-optimized caching with predictive pre-warming and ML-driven TTLs. Vector search is the natural extension: same process, same memory, same microsecond-scale performance tier.
Store vectors with metadata using VADD. Search for K nearest neighbors using VSEARCH. Filter by metadata and similarity in a single operation. The entire pipeline runs in your application's memory space.
VADD inserts a vector into the in-process HNSW graph along with arbitrary key-value metadata. Each vector gets a unique ID, a float array of any dimensionality, and optional metadata attributes like category, timestamp, source, or any custom field your application needs.
The HNSW graph builds incrementally. Every VADD updates the navigable small-world graph structure in real time. There is no batch indexing step, no rebuild trigger, and no read-lock during writes. New vectors are immediately searchable after insertion.
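The write path can be pictured with a toy stand-in: a flat map plus a linear scan, not the real multi-layer graph. The `vadd`/`vsearch` method names mirror the commands, but this SDK surface is an assumption for illustration:

```python
class MiniVectorIndex:
    """Toy stand-in for the in-process index: inserts are O(1) map updates
    and every added vector is visible to the very next search. (The real
    structure is a multi-layer HNSW graph, updated incrementally on VADD.)"""

    def __init__(self):
        self._items = {}  # id -> (vector, metadata)

    def vadd(self, item_id, vector, **metadata):
        # No batch indexing step, no rebuild trigger: one map write.
        self._items[item_id] = (list(vector), metadata)

    def vsearch(self, query, k=10):
        # Dot product as the similarity score, for brevity.
        score = lambda v: sum(x * y for x, y in zip(query, v))
        ranked = sorted(self._items, key=lambda i: score(self._items[i][0]),
                        reverse=True)
        return ranked[:k]

idx = MiniVectorIndex()
idx.vadd("a", [1.0, 0.0], category="electronics")
idx.vadd("b", [0.0, 1.0], category="cables")
print(idx.vsearch([1.0, 0.1], k=1))  # ['a'] -- searchable immediately after insertion
```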
VSEARCH takes a query vector and returns the K most similar vectors by your chosen metric: cosine similarity, L2 (Euclidean) distance, or dot product. The HNSW algorithm provides approximate nearest neighbor search with tunable accuracy-performance tradeoffs via the ef_search parameter.
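The three metrics differ in what they treat as "similar." A minimal reference implementation in plain Python, independent of any Cachee API:

```python
import math

def cosine_similarity(a, b):
    """Higher is more similar; compares direction only, ignores magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2_distance(a, b):
    """Lower is more similar; straight-line (Euclidean) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Higher is more similar; sensitive to vector magnitude."""
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 0.0]
assert cosine_similarity(q, [2.0, 0.0]) == 1.0  # same direction, any length
assert l2_distance(q, [1.0, 0.0]) == 0.0        # identical vectors
assert dot_product([1.0, 2.0], [3.0, 4.0]) == 11.0
```

Cosine is the usual choice for text embeddings (most embedding models normalize for direction); dot product matters when magnitude encodes signal, such as popularity-weighted recommendation vectors.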
Hybrid search is the key differentiator. VSEARCH accepts metadata filter expressions that are evaluated during graph traversal, not as a post-filter. This means "find the 10 nearest vectors where category = 'electronics' AND price < 50" runs as a single operation. No separate metadata query. No client-side join. One call, one result set, one latency number.
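The difference between post-filtering and inline filtering can be sketched as follows. A linear scan stands in for graph traversal here, and the predicate/scoring callables are assumptions, not Cachee's actual filter syntax:

```python
def search_post_filter(items, query, k, predicate, score):
    """Post-filter: take the k nearest first, THEN drop non-matches.
    Can return fewer than k results even when k matches exist."""
    nearest = sorted(items, key=lambda i: score(query, items[i][0]),
                     reverse=True)[:k]
    return [i for i in nearest if predicate(items[i][1])]

def search_inline_filter(items, query, k, predicate, score):
    """Inline filter: non-matching vectors never enter the candidate set,
    so one pass yields up to k true matches -- no client-side join."""
    matches = [i for i in items if predicate(items[i][1])]
    return sorted(matches, key=lambda i: score(query, items[i][0]),
                  reverse=True)[:k]

items = {
    "tv":      ([0.99, 0.0], {"category": "electronics", "price": 500}),
    "phone":   ([0.98, 0.0], {"category": "electronics", "price": 900}),
    "cable":   ([0.50, 0.5], {"category": "electronics", "price": 9}),
    "adapter": ([0.40, 0.6], {"category": "electronics", "price": 12}),
}
dot = lambda a, b: sum(x * y for x, y in zip(a, b))
cheap = lambda m: m["category"] == "electronics" and m["price"] < 50

print(search_post_filter(items, [1.0, 0.0], 2, cheap, dot))    # [] -- top-2 were expensive
print(search_inline_filter(items, [1.0, 0.0], 2, cheap, dot))  # ['cable', 'adapter']
```

The post-filter variant silently loses results whenever the nearest neighbors fail the predicate, which is exactly why evaluating filters during traversal matters.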
The HNSW implementation uses multi-layer navigable graphs with configurable construction parameters (M, ef_construction) for tuning the recall-vs-speed tradeoff. Default parameters deliver >95% recall at microsecond-scale latency for indices up to 1M vectors. For details on how the broader AI pipeline integrates with vector search, see the architecture page.
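The shape of the ef_search tradeoff can be illustrated with a crude simulation: instead of navigating a graph, we simply bound how many candidates the search may examine. A bigger candidate pool means higher recall and more work, which is the same knob HNSW exposes; none of this is Cachee's actual traversal logic:

```python
import random

random.seed(0)
vectors = [tuple(random.random() for _ in range(8)) for _ in range(1000)]
query = tuple(random.random() for _ in range(8))
dist = lambda v: sum((a - b) ** 2 for a, b in zip(v, query))

# Exact ground truth: the true 10 nearest neighbors by L2 distance.
truth = set(sorted(vectors, key=dist)[:10])

def approx_search(ef, k=10):
    """Examine only `ef` randomly chosen candidates -- a crude stand-in for
    bounding the ANN candidate pool. Larger ef => higher recall, more work."""
    pool = random.sample(vectors, min(ef, len(vectors)))
    return set(sorted(pool, key=dist)[:k])

for ef in (50, 500, 1000):
    recall = len(approx_search(ef) & truth) / len(truth)
    print(f"ef={ef}: recall={recall:.2f}")
```

Real HNSW recall degrades far more gracefully than random sampling (the graph steers the search toward the right region), but the monotone relationship between candidate-pool size and recall is the point.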
Redis 8 introduced Vector Sets as a native data type. It is a meaningful step forward for the Redis ecosystem. But vector search over a TCP connection has a hard latency floor that in-process search eliminates entirely.
| Capability | Redis 8 Vector Sets | Cachee Vector Search |
|---|---|---|
| Query Latency | ~1ms+ (network round-trip) | 0.0015ms (in-process) |
| Index Algorithm | HNSW (navigable small-world) | HNSW (navigable small-world) |
| Hybrid Search | Separate query + filter | Single operation, inline filters |
| Dependencies | Redis server required | Zero — runs in your process |
| Similarity Metrics | Cosine, L2 | Cosine, L2, Dot Product |
| Metadata Filters | Via FT.SEARCH (separate module) | Native, inline with VSEARCH |
| Connection Overhead | TCP + RESP serialization | None (function call) |
| Key-Value + Vector | Same server, different types | Same process, unified API |
| Scaling Model | Cluster sharding | In-process per node + distributed sync |
Redis 8 Vector Sets solve a real problem for teams already running Redis who want to add vector search without deploying a separate database. Cachee solves a different problem: eliminating the network entirely for latency-sensitive vector workloads. If your vector queries are on the critical path of user-facing requests, the 660x difference is not theoretical. It is the gap between "fast enough" and "invisible." For a broader comparison of caching architectures, see our full comparison page.
Vector search at cache speed unlocks use cases that network-bound databases cannot serve in real time.
Five commands cover the full vector search lifecycle. Familiar Redis-style naming. Available through the SDK and RESP-compatible CLI.
Full API documentation with parameter references, filter syntax, and tuning guides is available in the documentation. For predictive caching workflows that combine vector search with intelligent pre-warming, see the integration guide.
The same SDK you use for key-value caching now handles vector operations. No new dependencies, no infrastructure changes.
Key-value caching and vector search in one process. Microsecond-scale latency for both. Start with the free tier and add vector search to your existing Cachee deployment in minutes.