Vector Search

Vector Similarity Search
at 0.0015ms. Not 1ms.

Redis 8 added Vector Sets, queried over the network. We built HNSW in-process, so every lookup is a function call instead of a round-trip. 660x faster, with hybrid metadata filtering in one operation.

Per Query: 0.0015ms
Index Type: HNSW
Similarity Metrics: Cosine / L2 / Dot
Metadata Filters: Hybrid
Architecture

Why Vector Search Belongs in the Cache

Every AI-powered application follows the same pattern: embed the input, find the nearest vectors, serve the result. RAG pipelines query embeddings before every LLM call. Recommendation engines score candidates on every page load. Semantic search runs similarity on every keystroke. These are hot-path operations measured in milliseconds or less.

The conventional answer is a dedicated vector database sitting behind a network connection. That means TCP serialization, connection pooling, and a round-trip for every query. For a RAG pipeline making 3-5 retrieval calls per user request, those milliseconds compound into visible latency. The vector database is not the bottleneck because it is slow at search. It is the bottleneck because it is on the other side of a network hop.

Cachee eliminates the hop. The HNSW index lives in your application's process memory. Vector search runs as a function call, not a network request. The same cache layer that stores your keys, sessions, and API responses now handles vector similarity at 0.0015ms per query. No additional infrastructure. No connection management. No serialization overhead. One process, one memory space, one data layer for both key-value and vector operations.

🧠 RAG Retrieval
Retrieve context chunks from your embedding store in 1.5 microseconds instead of 1-5ms. Faster retrieval means lower end-to-end LLM latency for every generation.
3-5 retrievals per LLM call

🔍 Semantic Search
Turn user queries into embeddings and find semantically similar content without a network round-trip. Typeahead and autocomplete pipelines run in microseconds.
Zero network latency

Recommendations
Score candidates against user embeddings on every page load. In-process means you can afford to re-rank on every request, not batch offline.
Real-time, not batch

Cachee already runs AI-optimized caching with predictive pre-warming and ML-driven TTLs. Vector search is the natural extension: same process, same memory, same microsecond-scale performance tier.

How It Works

In-Process HNSW Vector Search

Store vectors with metadata using VADD. Search for K nearest neighbors using VSEARCH. Filter by metadata and similarity in a single operation. The entire pipeline runs in your application's memory space.

Vector Search Pipeline
Step 1
VADD
Step 2
HNSW Index
Step 3
VSEARCH
Step 4
Filter + Rank
Result
0.0015ms

VADD: Store Vectors with Metadata

VADD inserts a vector into the in-process HNSW graph along with arbitrary key-value metadata. Each vector gets a unique ID, a float array of any dimensionality, and optional metadata attributes like category, timestamp, source, or any custom field your application needs.

The HNSW graph builds incrementally. Every VADD updates the navigable small-world graph structure in real time. There is no batch indexing step, no rebuild trigger, and no read-lock during writes. New vectors are immediately searchable after insertion.

VSEARCH: K-Nearest with Hybrid Filters

VSEARCH takes a query vector and returns the K most similar vectors by your chosen metric: cosine similarity, L2 (Euclidean) distance, or dot product. The HNSW algorithm provides approximate nearest neighbor search with tunable accuracy-performance tradeoffs via the ef_search parameter.
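For reference, the three metrics compute the following. This is a minimal sketch in plain TypeScript, independent of the Cachee SDK; the engine evaluates these inside the HNSW traversal:

```typescript
// Reference implementations of the three VSEARCH similarity metrics.

// Dot product: higher means more similar (for normalized embeddings).
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity: 1.0 means identical direction, 0 means orthogonal.
function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// L2 (Euclidean) distance: lower means more similar.
function l2(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}
```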

Hybrid search is the key differentiator. VSEARCH accepts metadata filter expressions that are evaluated during graph traversal, not as a post-filter. This means "find the 10 nearest vectors where category = 'electronics' AND price < 50" runs as a single operation. No separate metadata query. No client-side join. One call, one result set, one latency number.
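To make the inline-filter semantics concrete, here is a hypothetical brute-force sketch in plain TypeScript. The `Item` shape and `hybridSearch` helper are illustrative, not SDK types; the real engine runs the same predicate during HNSW graph traversal rather than over a linear scan:

```typescript
interface Item {
  id: string;
  vector: number[];
  metadata: Record<string, string | number | boolean>;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}

function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Brute-force hybrid search: the metadata predicate runs on every candidate
// during the scan, so non-matching vectors never enter the result set.
// HNSW applies the same check while traversing the graph.
function hybridSearch(
  items: Item[],
  query: number[],
  k: number,
  predicate: (m: Item["metadata"]) => boolean,
): { id: string; score: number }[] {
  return items
    .filter((it) => predicate(it.metadata)) // inline filter, not a post-filter
    .map((it) => ({ id: it.id, score: cosine(query, it.vector) }))
    .sort((a, b) => b.score - a.score) // rank by similarity
    .slice(0, k);
}
```

Under this model, "10 nearest where category = 'electronics' AND price < 50" is one call with one predicate, which is exactly the shape the VSEARCH filter expression takes.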

// Store a product embedding with metadata
await cache.vadd('products', {
  id: 'prod_8291',
  vector: embedding, // float[] from your model
  metadata: { category: 'electronics', price: 29.99, in_stock: true }
});

// Find 10 nearest with metadata filter, in a single operation
const results = await cache.vsearch('products', {
  vector: queryEmbedding,
  k: 10,
  metric: 'cosine',
  filter: { category: 'electronics', price: { $lt: 50 } }
});
// => [{id: 'prod_8291', score: 0.97, metadata: {...}}, ...]
// Total query time: 0.0015ms

The HNSW implementation uses multi-layer navigable graphs with configurable construction parameters (M, ef_construction) for tuning the recall-vs-speed tradeoff. Default parameters deliver >95% recall at microsecond-scale latency for indices up to 1M vectors. For details on how the broader AI pipeline integrates with vector search, see the architecture page.

Comparison

Cachee vs Redis 8 Vector Sets

Redis 8 introduced Vector Sets as a native data type. It is a meaningful step forward for the Redis ecosystem. But vector search over a TCP connection has a hard latency floor that in-process search eliminates entirely.

Cachee: 0.0015ms
Redis 8 Vector Sets: ~1ms+
Pinecone / Weaviate: 5-50ms
Capability          | Redis 8 Vector Sets             | Cachee Vector Search
Query Latency       | ~1ms+ (network round-trip)      | 0.0015ms (in-process)
Index Algorithm     | SVS-VAMANA (DiskANN variant)    | HNSW (navigable small-world)
Hybrid Search       | Separate query + filter         | Single operation, inline filters
Dependencies        | Redis server required           | Zero (runs in your process)
Similarity Metrics  | Cosine, L2                      | Cosine, L2, Dot Product
Metadata Filters    | Via FT.SEARCH (separate module) | Native, inline with VSEARCH
Connection Overhead | TCP + RESP serialization        | None (function call)
Key-Value + Vector  | Same server, different types    | Same process, unified API
Scaling Model       | Cluster sharding                | In-process per node + distributed sync

Redis 8 Vector Sets solve a real problem for teams already running Redis who want to add vector search without deploying a separate database. Cachee solves a different problem: eliminating the network entirely for latency-sensitive vector workloads. If your vector queries are on the critical path of user-facing requests, the 660x difference is not theoretical. It is the gap between "fast enough" and "invisible." For a broader comparison of caching architectures, see our full comparison page.

Use Cases

Built for Latency-Sensitive AI Workloads

Vector search at cache speed unlocks use cases that network-bound databases cannot serve in real time.

01
RAG / LLM Retrieval
Retrieve context chunks before every LLM call. At 0.0015ms per retrieval, a 5-chunk RAG pipeline adds 0.0075ms total vector search time instead of 5-25ms with a network-bound store. The retrieval step disappears from your latency budget.
02
Semantic Caching
Before sending a prompt to your LLM, check if a semantically similar query was already answered. VSEARCH finds the nearest cached response by embedding distance. Save $0.01-0.10 per avoided API call while serving results in microseconds instead of seconds.
03
Product Recommendations
Score product embeddings against user preference vectors on every page load. Hybrid filters let you constrain by availability, price range, or category in the same call. Real-time personalization that runs on the critical path without adding latency.
04
Image Similarity
Store CLIP or ResNet embeddings alongside image metadata. Find visually similar images with metadata constraints (same brand, same color, in-stock) in a single VSEARCH call. Powers visual search, duplicate detection, and content moderation pipelines.
05
Anomaly Detection
Embed incoming events and measure distance to known-good clusters in real time. Outliers that fall beyond a threshold trigger alerts. At microsecond-scale query time, you can afford to check every event, not sample. Ideal for fraud detection and security monitoring.
06
Conversational Memory
Store conversation turns as embeddings and retrieve contextually relevant history for chatbot and agent systems. VSEARCH with a recency filter surfaces the most relevant past exchanges, giving your AI agents long-term memory at cache speed.
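The distance-to-cluster check in use case 05 can be sketched in a few lines of plain TypeScript. The `centroid` and `isAnomalous` helpers are illustrative, not part of the Cachee SDK:

```typescript
// Illustrative anomaly check: flag an event embedding whose L2 distance
// to the centroid of known-good embeddings exceeds a threshold.

function l2(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0));
}

// Mean of the known-good embeddings, used as the cluster center.
function centroid(vectors: number[][]): number[] {
  const dim = vectors[0].length;
  const sum = new Array(dim).fill(0);
  for (const v of vectors) for (let i = 0; i < dim; i++) sum[i] += v[i];
  return sum.map((x) => x / vectors.length);
}

function isAnomalous(event: number[], center: number[], threshold: number): boolean {
  return l2(event, center) > threshold;
}
```

In production you would store the known-good embeddings in a vector index and query the nearest cluster per event; the threshold comparison is the same either way.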
API Reference

Vector Commands

Five commands cover the full vector search lifecycle. Familiar Redis-style naming. Available through the SDK and RESP-compatible CLI.

VADD
Add a vector with metadata to an index. Supports any dimensionality. The HNSW graph updates incrementally on each insert.
VSEARCH
Find K nearest neighbors by cosine, L2, or dot product. Accepts inline metadata filter expressions for hybrid search in one call.
VDEL
Remove a vector by ID. The HNSW graph repairs its connections automatically, maintaining search quality after deletions.
VCARD
Return the cardinality (total vector count) of an index. Useful for monitoring index size and triggering capacity alerts.
VINFO
Return index metadata including dimensionality, metric type, HNSW parameters (M, ef), vector count, and memory usage.
# CLI examples

# Add a 128-dim vector with metadata
VADD products prod_001 128 [0.12, 0.45, -0.33, ...] category=electronics price=29.99

# Search for 5 nearest by cosine, filtered by category
VSEARCH products 5 [0.11, 0.44, -0.31, ...] METRIC cosine FILTER category=electronics

# Check index size
VCARD products
# => (integer) 248391

# Get index info
VINFO products
# => dim:128 metric:cosine vectors:248391 M:16 ef:200 mem:186MB

# Delete a vector
VDEL products prod_001

Full API documentation with parameter references, filter syntax, and tuning guides is available in the documentation. For predictive caching workflows that combine vector search with intelligent pre-warming, see the integration guide.

Quick Start

Add Vector Search in Three Lines

The same SDK you use for key-value caching now handles vector operations. No new dependencies, no infrastructure changes.

// Initialize Cachee (same client for KV + vector)
import { Cachee } from '@cachee/sdk';
const cache = new Cachee({ apiKey: 'ck_live_your_key' });

// Embed with your model (OpenAI, Cohere, local, etc.)
const embedding = await embed(userQuery);

// Semantic cache check: was this question already answered?
const cached = await cache.vsearch('qa_cache', {
  vector: embedding,
  k: 1,
  metric: 'cosine'
});

if (cached.length && cached[0].score > 0.95) {
  // Cache hit: serve the previous answer (0.0015ms)
  return cached[0].metadata.answer;
}

// Cache miss: call LLM, then store for future queries
const answer = await callLLM(userQuery);
await cache.vadd('qa_cache', {
  id: hash(userQuery),
  vector: embedding,
  metadata: { question: userQuery, answer, ts: Date.now() }
});
Unified Data Layer
Key-value cache and vector index share the same process memory. No dual-write consistency issues. No separate infrastructure to monitor. Set a key, add a vector, search both from one client.
One SDK, two superpowers
Any Embedding Model
Cachee is embedding-agnostic. Use OpenAI ada-002, Cohere embed-v3, Sentence Transformers, CLIP, or your own fine-tuned model. VADD accepts any float array. You own the embedding pipeline.
Model-agnostic by design
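Because VADD accepts raw float arrays, one practical client-side pattern (a sketch under our own assumptions, not documented SDK behavior) is to unit-normalize embeddings before insertion, so cosine similarity reduces to a plain dot product:

```typescript
// Scale a vector to unit length. For unit vectors, cosine similarity
// and dot product are the same number, so either metric can be used.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map((x) => x / norm);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}
```

Some embedding APIs already return unit-normalized vectors; normalizing defensively makes the index metric-agnostic regardless of which model produced the floats.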

Every AI Application Is a Caching Problem.
Solve Both at Once.

Key-value caching and vector search in one process. Sub-microsecond latency for both. Start with the free tier and add vector search to your existing Cachee deployment in minutes.
