Today we are announcing the most complete in-process Redis-compatible engine ever built: 140+ commands running natively in Rust — hashes, sorted sets, lists, streams, geospatial indexes, HyperLogLog, bitmaps, vector search, Lua scripting, transactions, pub/sub, CDC auto-invalidation, and cache triggers. All at 0.0015ms per operation. All in a 172MB Docker image. Zero external dependencies.

When we announced our native Redis-compatible data engine yesterday with 50+ commands, we said we were just getting started. We meant it. In the span of a single development cycle, we went from “covers the basics” to “covers everything production workloads actually need.” This post walks through every command category, explains what makes the architecture different from anything else on the market, and highlights five features that no other Redis alternative offers at any price.
The Full Command Coverage
Let’s start with what “140+ commands” actually means. Not 140 variations of GET with different flags. 140+ distinct operations spanning every Redis data type, plus capabilities Redis itself does not offer. Here is the full breakdown by category.
Strings (16 commands) cover the foundation: GET, SET, MGET, MSET, INCR, DECR, INCRBY, APPEND, SETNX, SETEX, MSETNX, GETSET, STRLEN, DECRBY, INCRBYFLOAT, and GETRANGE. Every string command supports TTL natively. SET accepts EX, PX, NX, and XX flags — identical to Redis 7.x semantics. Atomic increments and decrements are lock-free on DashMap, which means rate-limiting counters run without contention across threads.
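The NX and XX flag semantics are easy to model. Here is a deliberately simplified Python sketch of SET’s NX/XX/EX behavior — a plain dict stands in for the lock-free DashMap, and the class and method names are ours for illustration, not Cachee’s API:

```python
import time

class ToyStore:
    """Toy model of SET's NX/XX/EX semantics (illustrative only)."""
    def __init__(self):
        self.data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ex=None, nx=False, xx=False):
        exists = key in self.data
        if nx and exists:      # NX: only set if the key does not exist
            return None
        if xx and not exists:  # XX: only set if the key already exists
            return None
        expires_at = time.time() + ex if ex is not None else None
        self.data[key] = (value, expires_at)
        return "OK"

store = ToyStore()
assert store.set("lock", "a", nx=True) == "OK"     # first acquire succeeds
assert store.set("lock", "b", nx=True) is None     # NX blocks the overwrite
assert store.set("missing", "v", xx=True) is None  # XX requires existence
```

This NX pattern is exactly how SET-based distributed locks and idempotency keys work: the first writer wins, and everyone else gets a clean rejection instead of a race.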
Hashes (11 commands) provide field-level operations on structured data: HSET, HGET, HGETALL, HMGET, HDEL, HEXISTS, HKEYS, HVALS, HLEN, HINCRBY, and HINCRBYFLOAT. Because hashes are backed by nested DashMap structures, multiple threads can read and write different fields of the same hash simultaneously — something Redis’s single-threaded model fundamentally cannot do.
Sets (12 commands) deliver membership operations and set algebra: SADD, SREM, SMEMBERS, SCARD, SISMEMBER, SMISMEMBER, SUNION, SINTER, SDIFF, SPOP, SRANDMEMBER, and SMOVE. Set operations like SUNION, SINTER, and SDIFF are computed in-process without marshaling data between address spaces, making them orders of magnitude faster than their network-bound equivalents.
Sorted Sets (12 commands) power leaderboards, priority queues, and range queries: ZADD, ZRANGE, ZRANGEBYSCORE, ZREM, ZCARD, ZRANK, ZREVRANK, ZSCORE, ZCOUNT, ZPOPMIN, ZPOPMAX, and ZREMRANGEBYSCORE. ZADD supports NX, XX, GT, and LT flags. The underlying BTreeMap with IEEE 754 score ordering delivers O(log n) inserts and range scans, matching Redis’s skip-list complexity without the memory overhead of skip-list node pointers.
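The GT and LT flags turn ZADD into a one-directional update, which is the trick behind high-water-mark leaderboards. A toy Python sketch of the flag logic (a dict stands in for the score index; the function name is ours):

```python
def zadd(zset, member, score, gt=False, lt=False):
    """Toy model of ZADD's GT/LT flags: only move a score in one direction.

    Returns 1 if the member was newly added, 0 otherwise (Redis semantics).
    """
    current = zset.get(member)
    if current is not None:
        if gt and score <= current:  # GT: only update if new score is greater
            return 0
        if lt and score >= current:  # LT: only update if new score is lower
            return 0
    zset[member] = score
    return 1 if current is None else 0

leaderboard = {}
zadd(leaderboard, "alice", 100)
zadd(leaderboard, "alice", 90, gt=True)   # ignored: 90 is not an improvement
zadd(leaderboard, "alice", 120, gt=True)  # applied: new personal best
assert leaderboard["alice"] == 120
```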
Lists (11 commands) cover queue and stack patterns: LPUSH, RPUSH, LPOP, RPOP, LRANGE, LLEN, LINDEX, LSET, LINSERT, LREM, and LTRIM. VecDeque provides O(1) operations at both ends. Job queues, task pipelines, and message buffers all run without external brokers.
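The queue pattern maps directly onto a double-ended queue. Python’s collections.deque offers the same O(1) guarantees at both ends as Rust’s VecDeque, so the LPUSH/RPOP flow can be sketched like this (the Redis command names appear in comments only — this is a model, not a client call):

```python
from collections import deque

# A job queue as a single list key: LPUSH at the head, RPOP at the tail
# gives FIFO ordering, the classic Redis work-queue pattern.
jobs = deque()
jobs.appendleft("job-1")           # LPUSH jobs job-1
jobs.appendleft("job-2")           # LPUSH jobs job-2
assert jobs.pop() == "job-1"       # RPOP: oldest job comes out first
jobs.append("job-3")               # RPUSH jobs job-3
assert jobs.popleft() == "job-2"   # LPOP: head of the list
```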
Streams (7 commands) bring event log semantics in-process: XADD, XRANGE, XREVRANGE, XLEN, XDEL, XTRIM, and XREAD. Streams support auto-generated IDs, MAXLEN trimming, and range queries by timestamp. This is the data structure behind real-time event processing — audit logs, activity feeds, and change streams — all without standing up a Kafka or Redis Streams cluster.
Geospatial (4 commands) enable location-aware applications: GEOADD, GEODIST, GEOPOS, and GEOSEARCH. Backed by Haversine distance calculations, the engine lets you store coordinates and query by radius or bounding box. Delivery platforms, store locators, and proximity-based services can run geospatial queries at cache speed without PostGIS or a separate geo-index.
HyperLogLog (3 commands) provide probabilistic cardinality estimation: PFADD, PFCOUNT, and PFMERGE. Our implementation uses 16,384 registers with FNV-1a hashing, delivering less than 1% standard error — identical to Redis’s accuracy — while consuming only 12KB per counter. Count unique visitors, unique events, or unique IPs across billions of entries without storing every element.
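The accuracy and size claims follow directly from the standard HyperLogLog math: with m registers the standard error is about 1.04/√m, and each dense register needs 6 bits. A quick sanity check:

```python
import math

m = 16_384                       # number of HLL registers
std_error = 1.04 / math.sqrt(m)  # classic HyperLogLog error bound
print(f"standard error ≈ {std_error:.4%}")  # 0.8125%, under the 1% claim

# Each register stores a leading-zero run length in 6 bits, so the dense
# representation is m * 6 / 8 bytes = 12KB per counter.
size_bytes = m * 6 // 8
assert size_bytes == 12_288
```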
Bitmaps (5 commands) handle bit-level operations: SETBIT, GETBIT, BITCOUNT, BITOP (AND/OR/XOR/NOT), and BITPOS. Feature flags, bloom filter primitives, and compact boolean state for millions of users — one bit per flag, one key per bitmap.
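The “one bit per flag” economics are easiest to see in a toy model. This Python sketch mimics SETBIT (with Redis’s MSB-first bit numbering and lazy buffer growth) and BITCOUNT over a plain bytearray — illustrative only, not Cachee’s implementation:

```python
def setbit(buf: bytearray, offset: int, value: int) -> None:
    """Toy SETBIT: grow the buffer as needed and set or clear one bit."""
    byte, bit = divmod(offset, 8)
    if byte >= len(buf):
        buf.extend(b"\x00" * (byte - len(buf) + 1))
    mask = 0x80 >> bit  # Redis numbers bits from the most significant end
    buf[byte] = (buf[byte] | mask) if value else (buf[byte] & ~mask)

def bitcount(buf: bytearray) -> int:
    """Toy BITCOUNT: population count across the whole bitmap."""
    return sum(bin(b).count("1") for b in buf)

flags = bytearray()
setbit(flags, 7, 1)      # user 7 saw the feature
setbit(flags, 1_000, 1)  # so did user 1000 — the buffer grows lazily
assert bitcount(flags) == 2
```

A million users tracked this way costs roughly 125KB per flag, which is why bitmaps remain the cheapest per-user boolean store available.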
Vectors (5 commands) add native similarity search: VADD, VSEARCH, VDEL, VCARD, and VINFO. This is not a Redis module bolted onto the side — it is a native HNSW graph index running in the same process. More on this below.
Transactions (3 commands): MULTI, EXEC, and DISCARD. Per-connection command queuing with atomic execution. Batch operations without partial failure.
Scripting (4 commands): EVAL, EVALSHA, SCRIPT LOAD, and SCRIPT EXISTS. Embedded Lua 5.4 via mlua, vendored and sandboxed. Scripts cached by SHA-256 for repeat execution.
Pub/Sub (1 command): PUBLISH delivers messages to subscribed connections via Tokio broadcast channels. Fan-out messaging at in-process speed.
Key Management (7 commands): TTL, PTTL, TYPE, EXPIRE, SCAN, RENAME, and EXISTS. SCAN supports cursor-based iteration with MATCH pattern filtering and COUNT hints — production-safe enumeration without blocking.
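The cursor loop is what makes SCAN production-safe: each call does bounded work and hands back a cursor, so no single call blocks the keyspace. A toy Python sketch of the loop shape — real SCAN guarantees are weaker (keys can be repeated or missed while the keyspace mutates), but the cursor/MATCH/COUNT contract looks like this:

```python
from fnmatch import fnmatchcase

def scan(keys, cursor=0, match="*", count=10):
    """Toy cursor-based SCAN over a snapshot of key names.

    Returns (next_cursor, matching_keys); a next_cursor of 0 means done.
    """
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count if cursor + count < len(keys) else 0
    return next_cursor, [k for k in batch if fnmatchcase(k, match)]

keys = [f"user:{i}" for i in range(25)] + ["session:1"]
cursor, found = 0, []
while True:  # the canonical SCAN loop: iterate until the cursor wraps to 0
    cursor, page = scan(keys, cursor, match="user:*", count=10)
    found.extend(page)
    if cursor == 0:
        break
assert len(found) == 25
```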
Client compatibility comes free: redis-py, ioredis, jedis, and go-redis all work without code changes over the RESP protocol.
What Makes This Different
The Redis ecosystem has no shortage of alternatives. DragonflyDB, Valkey, KeyDB, Momento, Upstash — all viable projects solving real problems. But they all share a common architecture: a separate server process that your application communicates with over a network socket. Cachee is a fundamentally different design. We are not a Redis fork, and we are not a managed service. We are an in-process engine that runs inside your application.
DragonflyDB is a Redis-compatible server rewritten in C++ with a multi-threaded architecture. Excellent engineering. It benchmarks at roughly 6.4 million operations per second over the network. Valkey (the Redis fork maintained by the Linux Foundation) delivers roughly 1–2 million ops/sec depending on configuration. Both are network-bound: every operation pays the cost of TCP serialization, kernel buffer copies, and context switches.
Cachee’s in-process engine delivers 215 million+ lookups per second. We want to be transparent: this comparison is apples to oranges. Network throughput and in-process throughput are different metrics measuring different things. DragonflyDB’s 6.4M ops/sec includes the cost of accepting connections, parsing RESP over TCP, and sending responses back over the wire — real work that Cachee never does because there is no wire. Our 215M number reflects direct function calls in shared memory with zero serialization overhead. The reason the comparison matters is not because we are “faster than Dragonfly” in any meaningful head-to-head sense. It matters because if your data fits in your application’s memory, the entire network layer is unnecessary overhead. Every microsecond spent on TCP is a microsecond that does not need to exist. Cachee eliminates that layer entirely.
The operational differences are equally significant. No connection pool to exhaust under load. No separate container to monitor, restart, or upgrade. No version compatibility matrix between your client library and the server. No failover coordination when the cache process crashes. Your cache is your application. They start together, scale together, and fail together — which, counterintuitively, is simpler than managing two processes that need to agree on availability.
Five Features Nobody Else Has
Beyond raw command coverage, Cachee Enterprise includes five capabilities that no other Redis alternative — open-source or commercial — offers today.
1. Native Vector Search. VADD, VSEARCH, VDEL, VCARD, and VINFO implement a full HNSW (Hierarchical Navigable Small World) graph index running in-process. Cosine similarity, L2 distance, and dot-product metrics. Hybrid metadata filters let you constrain similarity search by structured fields — “find the 10 most similar products where category = electronics and price < 500.” Redis requires the RediSearch module, a separate binary with its own memory management and its own bugs. Momento and Upstash do not offer vector search at all. Cachee runs it natively at cache speed.
2. CDC Auto-Invalidation. Database change → automatic cache key invalidation. Connect Cachee to your PostgreSQL, MySQL, or MongoDB change stream and cache entries are invalidated the moment the underlying data changes. No manual DEL calls scattered across your codebase. No stale data windows. No invalidation logic to maintain. The cache stays coherent with the source of truth automatically, which eliminates the single hardest problem in caching: knowing when to invalidate.
3. Cache Triggers. Register Lua scripts that fire on cache events: ON_WRITE, ON_EVICT, ON_EXPIRE. When a key is written, your trigger can update a secondary index, publish a notification, or recompute a derived value. When a key is evicted under memory pressure, your trigger can log the event, warm a replacement, or notify the origin. This is event-driven caching — the cache becomes an active participant in your data pipeline rather than a passive store.
4. Cross-Service Coherence. Running multiple application instances? Cachee automatically propagates L1 invalidations across instances. When Instance A updates a key, Instance B’s local cache entry is invalidated within milliseconds. No external coordination service required. No Redis Pub/Sub channel to maintain. The coherence protocol is built into the engine.
5. Cost-Aware Eviction. Standard eviction policies (LRU, LFU, FIFO) treat all keys equally. A cache entry that took 500ms to regenerate from a cold database query is evicted just as readily as one that took 1ms. Cachee’s cost-aware eviction weighs the origin fetch cost of each key. Expensive-to-regenerate entries survive eviction longer, keeping your cache optimized for total system latency rather than just recency or frequency. See our predictive caching documentation for how this integrates with AI-driven pre-warming.
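The cost-aware idea in point 5 reduces to weighting idle time by regeneration cost. A deliberately simplified Python sketch — the field names and the scoring formula here are ours for illustration; Cachee’s actual policy is more involved:

```python
import time

def eviction_score(entry, now):
    """Toy cost-aware score: cheap-to-rebuild, long-idle keys evict first.

    'fetch_cost_ms' (how long the origin took to produce the value) is the
    signal that plain LRU/LFU ignore. Higher score => evict sooner.
    """
    idle = now - entry["last_access"]
    return idle / entry["fetch_cost_ms"]

now = time.time()
cache = {
    "cold_report":  {"last_access": now - 60, "fetch_cost_ms": 500.0},
    "hot_template": {"last_access": now - 60, "fetch_cost_ms": 1.0},
}
victim = max(cache, key=lambda k: eviction_score(cache[k], now))
assert victim == "hot_template"  # same idle time, but 500x cheaper to rebuild
```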
The Architecture
Every data structure in Cachee is purpose-built for concurrent, in-process access. There is no generic “one-size-fits-all” storage engine underneath. Each Redis data type maps to the Rust data structure that provides the best performance for its access patterns.
Strings, hashes, and sets use DashMap — a sharded concurrent hash map that provides lock-free reads and per-shard write locks. Multiple threads reading different keys (or different fields of the same hash) never contend with each other. This is the core reason Cachee scales linearly with CPU cores while Redis is fundamentally single-threaded.
Sorted sets use a BTreeMap with IEEE 754 score ordering. B-tree nodes offer better cache locality than skip-list nodes because they are stored contiguously in memory rather than linked via pointers scattered across the heap. Range queries like ZRANGEBYSCORE walk a contiguous memory region instead of chasing pointers.
Lists use VecDeque for O(1) push and pop from both ends. Eviction operates on whole lists, not individual elements — if a list key is evicted, the entire list goes. This matches how applications actually use list data: a job queue is a single logical unit, not 10,000 independent entries.
Lua scripting embeds Lua 5.4 via the mlua crate with vendored compilation. The Lua runtime is compiled directly into the Cachee binary — no system Lua dependency, no dynamic linking. The sandbox removes os, io, debug, require, loadfile, and dofile. A configurable instruction count limit prevents runaway scripts.
Pub/Sub uses Tokio broadcast channels for message fan-out. PUBLISH delivers to all subscribed connections in the same async runtime. No TCP round-trip per subscriber.
HyperLogLog uses 16,384 registers with FNV-1a hashing, matching Redis’s precision at 12KB per counter. PFMERGE combines multiple HLL structures by taking the per-register maximum — a constant-time operation regardless of the underlying cardinality.
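The per-register maximum is why PFMERGE is so cheap. Since each register stores the longest leading-zero run ever observed for its bucket, element-wise max over the registers is exactly a set union — sketched here as a toy Python function over plain register lists:

```python
def pfmerge(*hlls):
    """Toy PFMERGE: the union HLL is the element-wise register maximum.

    Cost is O(m) in the register count, independent of how many distinct
    elements either counter has observed.
    """
    return [max(registers) for registers in zip(*hlls)]

a = [0, 3, 1, 0]  # toy 4-register counters; real HLLs have 16,384
b = [2, 1, 4, 0]
assert pfmerge(a, b) == [2, 3, 4, 0]
```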
Geospatial operations use Haversine distance calculations for radius queries and bounding-box intersection for rectangular searches. Coordinates are stored as sorted set members with geohash-encoded scores, matching Redis’s internal representation for RESP compatibility.
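The Haversine formula itself fits in a few lines. This sketch uses the Earth radius Redis’s geo code assumes, so a GEODIST-style query can be reproduced directly (coordinates below are just an example):

```python
import math

EARTH_RADIUS_M = 6_372_797.560856  # the radius Redis's GEO commands assume

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# GEODIST-style query: Paris to London is roughly 344 km.
d = haversine(48.8566, 2.3522, 51.5074, -0.1278)
assert 330_000 < d < 360_000
```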
Vector search builds an HNSW graph index that lives entirely in-process memory. The graph is constructed incrementally as vectors are added via VADD and pruned on VDEL. Search queries traverse the graph’s hierarchical layers with configurable efSearch parameters for precision/recall tradeoffs.
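HNSW is an approximate, sub-linear answer to an exact top-k problem. For intuition, here is the exact brute-force version of the hybrid query described earlier — cosine similarity plus a metadata filter — in plain Python (the items, fields, and values are invented for the example):

```python
import math

def cosine(a, b):
    """Cosine similarity — one of the metrics the HNSW index supports."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

items = [  # (name, embedding, metadata) — toy data
    ("tv",     [0.9, 0.1], {"category": "electronics", "price": 400}),
    ("laptop", [0.8, 0.3], {"category": "electronics", "price": 900}),
    ("mug",    [0.1, 0.9], {"category": "kitchen",     "price": 12}),
]
query = [1.0, 0.0]

# Apply the structured filter first, then rank survivors by similarity.
candidates = [
    (name, cosine(query, vec))
    for name, vec, meta in items
    if meta["category"] == "electronics" and meta["price"] < 500
]
candidates.sort(key=lambda t: t[1], reverse=True)
assert [name for name, _ in candidates] == ["tv"]
```

Brute force is O(n) per query; the HNSW graph trades a small recall loss (tunable via efSearch) for logarithmic-ish traversal over the same data.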
What We Don’t Build (and Why)
Cachee is opinionated about scope. There are things we deliberately do not implement, and the reasons are architectural, not aspirational.
- Cluster/sharding — single-node in-process is the point. Sharding introduces network hops, consensus protocols, and slot migration — the exact overhead we exist to eliminate. For distributed workloads, deploy Cachee per-node with a shared L2 tier.
- Persistence (RDB/AOF) — we are a cache, not a database. If you need durability, your L2 layer handles it. Mixing cache semantics with persistence guarantees creates the worst of both worlds.
- BLPOP/BRPOP — blocking list commands require a V2 async handler that can park connections without tying up a Tokio task. Scoped for a future release.
- ACL — access control at the command level adds overhead to every operation. Use RESP_AUTH_PASSWORD for connection-level authentication. Fine-grained ACLs are a V2 feature.
- 300+ server admin commands — CONFIG, DEBUG, SLOWLOG, CLIENT LIST, and the rest of Redis’s administrative surface area. Use our metrics dashboard instead. We expose Prometheus endpoints, not CLI debugging tools.
Further Reading
- Predictive Caching: How AI Pre-Warming Works
- Cachee vs. Redis, KeyDB, and Dragonfly
- Cachee Enterprise
- How to Reduce Redis Latency in Production
- Low-Latency Caching Architecture
- Cachee Performance Benchmarks
- Start Free Trial