You have a debug endpoint. Maybe it lists all active sessions. Maybe it counts users by prefix. Somewhere in your codebase, there is a line that calls KEYS user:*. In development, with 100 keys in your local Redis, it returns instantly. In staging, with 50,000 keys, it takes 20 milliseconds — barely noticeable. In production, with 10 million keys, it blocks Redis for 3 full seconds. During those 3 seconds, every other Redis command — every GET, every SET, every cache lookup from every service in your infrastructure — waits. Nothing moves. Your entire caching layer freezes because of a single debug endpoint nobody remembered was there. KEYS is the most dangerous command in Redis, and it is almost certainly in your codebase right now.
Why KEYS Is O(n) and Blocks Everything
Redis is single-threaded for command execution. One thread processes every command, one at a time, in sequence. When a command is running, nothing else runs. For most commands — GET, SET, INCR, HGET — this is not a problem because these operations complete in microseconds. They touch one key, perform one operation, and return. The single-threaded model actually helps here: no locks, no contention, no coordination overhead. It is why Redis is fast.
KEYS breaks this model completely. When you run KEYS user:*, Redis must iterate over every single key in the entire database and test each one against your pattern. It does not use an index. There is no shortcut. It is a linear scan of the full keyspace. At 1 million keys, this takes roughly 200–500 milliseconds. At 10 million keys, you are looking at 2–5 seconds. At 50 million keys, it can exceed 15 seconds. And because Redis is single-threaded, every other operation queues behind this scan. No reads. No writes. No pub/sub messages delivered. No Lua scripts executed. Complete operational freeze.
The Redis documentation itself warns against this in bold text: “Don’t use KEYS in your regular application code.” The warning exists because the Redis team has seen this cause outages at every scale. It is not a theoretical concern. If you have a production Redis instance with more than a few hundred thousand keys and any code path that calls KEYS, you have a latent production incident waiting to trigger.
The Commands That Also Block
KEYS is the most notorious offender, but it is not the only one. Several common Redis commands become dangerous when applied to large data structures. The pattern is the same: the command must iterate over an unbounded collection, and Redis blocks until it finishes.
HGETALL on a hash with 100,000 fields behaves like KEYS on a smaller scale — Redis must read every field and value, serialize the result, and send it over the wire. A hash with 500,000 fields can block for hundreds of milliseconds. The safe alternative is HSCAN, which iterates in batches with a cursor.
SMEMBERS on a large set is the same problem. Retrieving all 200,000 members of a set means reading and transmitting every element in a single blocking operation. Use SSCAN to iterate in pages. LRANGE 0 -1 on a list with millions of elements has the same characteristics — paginate with bounded offsets instead.
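The bounded-offset pagination for lists can be sketched like this, assuming a redis-py-style client (the client, key name, and page size here are illustrative assumptions, not the article's code):

```python
def iter_list(client, key, page=1000):
    """Stream a large Redis list in bounded LRANGE pages
    instead of a single blocking LRANGE 0 -1."""
    start = 0
    while True:
        # LRANGE start/end indices are inclusive
        batch = client.lrange(key, start, start + page - 1)
        if not batch:
            return
        yield from batch
        start += page
```

One caveat: if the list is being pushed to or trimmed concurrently, pages can shift between calls, so this pattern fits read-mostly data.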
SORT is often overlooked. Sorting a large set or list forces Redis to copy the elements into a temporary array, sort them (O(n log n)), and return the result. On a set with 1 million members, this can block for over a second. Move sorting to your application layer or pre-sort at write time.
How to Find Them in Your Code
The first step is finding every instance of these commands in your codebase. Most Redis client libraries wrap these commands as method calls, so a simple text search catches the majority of cases.
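As a sketch of that text search, the script below scans a Python codebase for redis-py-style method names; the method list, file glob, and directory layout are assumptions to adapt for your client library and language:

```python
import re
from pathlib import Path

# Method names used by redis-py-style clients; extend for your client library.
BLOCKING_CALLS = re.compile(r"\.(keys|hgetall|smembers|sort)\s*\(", re.IGNORECASE)

def find_blocking_calls(root, glob="*.py"):
    """Yield (path, line_number, line) for suspicious Redis client calls."""
    for path in Path(root).rglob(glob):
        for lineno, line in enumerate(
            path.read_text(errors="ignore").splitlines(), 1
        ):
            if BLOCKING_CALLS.search(line):
                yield str(path), lineno, line.strip()
```

Expect false positives to triage by hand: in Python, `.keys()` also matches ordinary `dict.keys()` calls.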
Next, check your Redis SLOWLOG. Redis records every command that exceeds a configurable threshold (default: 10ms). If KEYS or HGETALL appears in your SLOWLOG, you have already been hit.
The INFO commandstats output is particularly revealing. It shows total calls, total microseconds, and average microseconds per call for every command type. If cmdstat_keys exists at all, something in your system is calling KEYS. The usec_per_call value tells you how long each call blocks. Anything above 1,000 microseconds (1ms) on a hot Redis instance deserves investigation. Anything above 10,000 microseconds (10ms) is an active performance problem.
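Both checks can be scripted. The sketch below assumes a redis-py client, whose `slowlog_get()` returns dicts with `id`, `duration` (microseconds), and `command` fields, and whose `info("commandstats")` returns a dict keyed by `cmdstat_<command>`:

```python
def worst_offenders(client, n=25):
    """Return SLOWLOG entries sorted by blocking time, worst first."""
    return sorted(client.slowlog_get(n),
                  key=lambda e: e["duration"], reverse=True)

def keys_usage(client):
    """Return INFO commandstats for KEYS, or None if it was never called."""
    return client.info("commandstats").get("cmdstat_keys")
```

If `keys_usage()` returns anything at all, something in your system is calling KEYS, and its `usec_per_call` field tells you how long each call blocked.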
You can also rename or disable KEYS entirely in your Redis configuration to prevent accidental use:
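For example, with the `rename-command` directive in redis.conf (the alias below is an arbitrary example):

```
# redis.conf — rename KEYS to an unguessable alias for admin use only...
rename-command KEYS "ADMIN_KEYS_d9f2a"
# ...or disable it outright with an empty string:
# rename-command KEYS ""
```

On Redis 6 and later, ACL rules (for example, removing the `keys` command from an application user with `ACL SETUSER appuser -keys`) are the more granular way to achieve the same restriction.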
The Fixes
Every dangerous command listed above has a safe, non-blocking alternative. The core principle is the same across all of them: iterate in small batches using cursors instead of fetching everything at once. Redis SCAN-family commands return a cursor and a batch of results. You call them repeatedly with the returned cursor until it reaches zero, processing each batch as it arrives. Redis processes each batch quickly and yields between batches, allowing other commands to execute.
Replace KEYS with SCAN
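A minimal cursor loop, sketched against a redis-py-style client (the client and helper name are assumptions):

```python
def scan_keys(client, pattern, count=500):
    """Collect keys matching a pattern in non-blocking batches.

    Each SCAN call returns (cursor, batch); Redis yields to other
    commands between calls. COUNT is a batch-size hint, not a limit.
    """
    cursor, keys = 0, []
    while True:
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        keys.extend(batch)
        if cursor == 0:
            return keys
```

redis-py also ships `scan_iter(match=..., count=...)`, which wraps this loop in a generator. Note that SCAN guarantees every key present for the full duration of the scan is returned at least once, but it may return duplicates, so deduplicate if exact counts matter.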
Replace HGETALL with HSCAN
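The same cursor pattern applied to a hash, again sketched against a redis-py-style client:

```python
def hscan_fields(client, key, count=500):
    """Read a large hash in batches instead of one blocking HGETALL."""
    cursor, fields = 0, {}
    while True:
        # Each HSCAN call returns (cursor, {field: value, ...})
        cursor, batch = client.hscan(key, cursor=cursor, count=count)
        fields.update(batch)
        if cursor == 0:
            return fields
```

The identical loop works for SSCAN on sets and ZSCAN on sorted sets; redis-py exposes generator wrappers for all three (`hscan_iter`, `sscan_iter`, `zscan_iter`).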
Replace Loops with Pipeline + MGET
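A loop of individual GET calls pays one network round trip per key; MGET collapses them into one. A sketch, with the helper name and chunk size as illustrative assumptions:

```python
def fetch_many(client, keys, chunk=1000):
    """Fetch values for many keys with chunked MGET calls — one round
    trip per chunk instead of one GET per key. Chunking keeps each
    reply bounded so no single command blocks for long."""
    values = []
    for i in range(0, len(keys), chunk):
        values.extend(client.mget(keys[i:i + chunk]))
    return values
```

When the commands are heterogeneous (mixed HGETs, GETs, and so on), a pipeline does the same batching: queue commands on `client.pipeline()` and send them in one round trip with `execute()`.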
The L1 Approach: Don’t Touch Redis at All
Every fix above makes Redis safer. But the best optimization is the one you never have to make. If your hot data lives in an in-process L1 cache, the overwhelming majority of your reads — 95% to 99% — never reach Redis in the first place. KEYS becomes irrelevant not because you replaced it with SCAN, but because there is no reason to scan Redis when the data you need is already in the application’s own memory.
This is the architectural shift that separates teams who spend their time fighting Redis performance problems from teams who do not have Redis performance problems. When your cache hit rate at the L1 layer is 99%, Redis handles only cold reads, writes, and invalidation events. The keyspace stays smaller because you are not caching every permutation of every query in Redis — L1 handles the hot subset. Fewer keys means SCAN runs faster when you do need it. It also means fewer connections, less network traffic, and lower Redis latency on the operations that do reach the server.
Predictive pre-warming takes this further. Instead of waiting for a cache miss to trigger a Redis lookup, the L1 layer learns access patterns and pre-fetches data before requests arrive. The result is that Redis access patterns become smooth, predictable background operations instead of bursty, latency-sensitive foreground operations. No stampedes. No hot-key contention. No scan commands needed because the data is already where it needs to be.
Further Reading
- How to Reduce Redis Latency in Production
- Predictive Caching: How AI Pre-Warming Works
- How to Increase Redis Cache Hit Rate
- Cache Stampede Prevention
- Cachee Performance Benchmarks
Stop Running Dangerous Commands. Start Caching Intelligently.
See how in-process L1 caching eliminates the need for KEYS, SCAN, and most Redis round-trips entirely.
Start Free Trial · Schedule Demo