Redis Memory Optimization: Cut Usage 50%
Redis memory is expensive. On AWS ElastiCache, memory on a cache.r7g.large instance works out to approximately $0.068 per GB per hour, or about $49 per GB per month. If your Redis cluster uses 64 GB, you are spending roughly $3,136 per month purely on memory. And here is the part most teams do not realize: a significant portion of that memory is not storing your data. It is storing Redis's internal overhead for tracking your data.
Every key in Redis carries overhead. There is a dictEntry structure (three pointers, 24 bytes on 64-bit systems). There is a redisObject wrapper (16 bytes). There are two SDS (Simple Dynamic Strings) headers for the key name and the value (at least 3 bytes each, often more depending on string length). There is memory allocator overhead (jemalloc rounds allocations up to size classes). When you add it all up, a single key-value pair where both the key and value are short strings costs 70 to 90 bytes of metadata overhead, regardless of how large or small your actual data is.
At 10 million keys with 100-byte values, your data is 1 GB. But Redis uses approximately 1.7 GB. That extra 700 MB is pure metadata. You are paying $34 per month to store pointers and headers, not data. At 100 million keys, the overhead is 7 GB, which costs $343 per month. This article walks through seven techniques to cut that overhead, ranked by impact and implementation complexity.
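You can measure this overhead directly on a running instance with the MEMORY USAGE command (Redis 4.0+). A minimal sketch using redis-py; the key and value are illustrative:
# Compare what Redis charges for a key against the raw payload size
import redis

r = redis.Redis(host='localhost', port=6379)

key, value = b'user:123', b'session_data'
r.set(key, value)

total = r.memory_usage(key)       # bytes Redis attributes to this key
payload = len(key) + len(value)   # bytes of actual data
print(f'total: {total} B, payload: {payload} B, '
      f'overhead: {total - payload} B')
# On a typical 64-bit build, the overhead lands in the 70-90 byte range.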
Technique 1: Use Hash Ziplist Encoding for Small Objects
How It Works
Redis hashes have two internal encodings: hashtable and ziplist (called listpack in Redis 7+). The hashtable encoding uses the same per-field overhead as top-level keys: dictEntry, redisObject, SDS headers. The ziplist encoding stores all fields and values in a single contiguous byte array with minimal per-entry overhead (just a few bytes for length prefixes). The difference is dramatic.
Consider storing user session data. With individual top-level keys, 5 fields for user 12345 look like this: user:12345:name, user:12345:email, user:12345:role, user:12345:avatar, user:12345:last_login. Each key carries 70+ bytes of overhead. Five keys means 350+ bytes of overhead for perhaps 200 bytes of actual data. With a single hash key user:12345 containing 5 fields, Redis stores the entire hash in ziplist encoding if it meets two conditions: no more than 128 fields (controlled by hash-max-ziplist-entries) and no field value longer than 64 bytes (controlled by hash-max-ziplist-value). In ziplist encoding, the total overhead for the hash is one dictEntry plus one redisObject plus the ziplist itself, which stores all 5 field-value pairs with approximately 5 bytes of overhead per entry. Total overhead drops from 350 bytes to roughly 95 bytes. That is a 3.7x reduction for this example. With more fields, the savings are even larger.
Implementation
# Instead of 5 separate keys (350+ bytes overhead):
SET user:12345:name "Alice Chen"
SET user:12345:email "alice@example.com"
SET user:12345:role "admin"
SET user:12345:avatar "/img/alice.jpg"
SET user:12345:last_login "1745900400"
# Use 1 hash key (ziplist encoding, ~95 bytes overhead):
HSET user:12345 name "Alice Chen" email "alice@example.com" \
role "admin" avatar "/img/alice.jpg" last_login "1745900400"
# Verify encoding:
DEBUG OBJECT user:12345
# Output: encoding:ziplist (or listpack in Redis 7+)
# Tune thresholds if needed:
CONFIG SET hash-max-ziplist-entries 128
CONFIG SET hash-max-ziplist-value 64
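The same consolidation is straightforward from application code. A minimal redis-py sketch, assuming the user:12345 layout above; the field values are illustrative:
# Replace five SETs (five keys' overhead) with one HSET
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

session = {
    'name': 'Alice Chen',
    'email': 'alice@example.com',
    'role': 'admin',
    'avatar': '/img/alice.jpg',
    'last_login': '1745900400',
}
r.hset('user:12345', mapping=session)

# Confirm the compact encoding survived the write
print(r.object('encoding', 'user:12345'))  # ziplist (listpack in Redis 7+)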
Memory Saved
For workloads with many small multi-field objects, converting from individual keys to hash-based ziplist encoding typically reduces memory usage by 50-80%. In one production migration, a session store with 2 million users (10 fields each, 20 million keys) consumed 3.8 GB. After consolidating to 2 million hash keys with 10 fields each, memory dropped to 890 MB. That is a 77% reduction, saving $142 per month on a single ElastiCache node.
The Tradeoff
Ziplist encoding is slower for reads and writes as the number of fields grows, because Redis must scan the ziplist linearly. At 128 fields or fewer, this scan is negligible (sub-microsecond). Beyond 128 fields, Redis automatically promotes the hash to hashtable encoding, which is O(1) per field but consumes more memory. The sweet spot is hashes with 10-100 fields and values under 64 bytes. If your objects have more than 128 fields or large values, this technique does not apply.
Technique 2: Compress Values with LZ4 or Zstandard Before Storing
How It Works
Redis stores values as-is. It does not compress them. If you store a 4 KB JSON object, Redis allocates 4 KB (rounded up by jemalloc) plus the metadata overhead. But JSON is highly compressible. A typical JSON API response compresses 3-5x with LZ4 and 5-8x with Zstandard. A 4 KB JSON value becomes 800 bytes with LZ4 or 500 bytes with Zstandard. Over millions of keys, this saves gigabytes.
The CPU cost of compression is minimal for the memory savings. LZ4 compresses at 3-4 GB/s and decompresses at 5-6 GB/s on modern hardware. For a 4 KB value, compression takes approximately 1 microsecond and decompression takes less than 1 microsecond. Zstandard is slower (500 MB/s compress, 1.5 GB/s decompress at default level) but achieves better ratios. For a 4 KB value, Zstandard compression takes approximately 8 microseconds and decompression takes approximately 3 microseconds. Both are negligible compared to the network round-trip time to Redis (50-500 microseconds).
Implementation
# Python example with LZ4
import json

import lz4.frame

def cache_set(redis_client, key, value, ttl=300):
    serialized = json.dumps(value).encode('utf-8')
    compressed = lz4.frame.compress(serialized)
    redis_client.setex(key, ttl, compressed)

def cache_get(redis_client, key):
    compressed = redis_client.get(key)
    if compressed is None:
        return None
    decompressed = lz4.frame.decompress(compressed)
    return json.loads(decompressed)

# Before: 4,096 bytes stored in Redis per value
# After: ~900 bytes stored in Redis per value (4.5x reduction)
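If you want Zstandard's better ratios at the cost of extra CPU, the swap is mechanical. A sketch assuming the zstandard package; the function names simply mirror the LZ4 helpers above:
# Zstandard variant of the helpers above
import json

import zstandard

_cctx = zstandard.ZstdCompressor()    # default level 3
_dctx = zstandard.ZstdDecompressor()

def cache_set_zstd(redis_client, key, value, ttl=300):
    serialized = json.dumps(value).encode('utf-8')
    redis_client.setex(key, ttl, _cctx.compress(serialized))

def cache_get_zstd(redis_client, key):
    compressed = redis_client.get(key)
    if compressed is None:
        return None
    return json.loads(_dctx.decompress(compressed))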
Memory Saved
The savings depend entirely on value compressibility. JSON, XML, HTML, and repeated-structure data compress 3-8x. Binary data, already-compressed images, and random bytes do not compress at all. For a cache storing JSON API responses averaging 2 KB each, LZ4 compression typically saves 60-75% of value memory. At 5 million keys, that is 6 GB saved, or $294 per month on ElastiCache.
The Tradeoff
You add CPU overhead on every read and write. For LZ4, this is under 2 microseconds for typical values and is invisible in practice. For Zstandard, it is 3-10 microseconds, which may matter if your application is latency-sensitive at the microsecond level. The bigger tradeoff is that you cannot use Redis commands to inspect or manipulate compressed values. APPEND, GETRANGE, INCR, and other manipulation commands do not work on compressed data. You must decompress, modify, recompress, and write back. This is fine for cache workloads where values are read and written atomically, but it is not suitable for values that are incrementally modified in place.
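If those microsecond figures matter to your workload, measure them on your own hardware rather than taking the numbers above on faith. A rough timeit sketch with a synthetic, repetitive 4 KB payload (real JSON will compress differently):
# Micro-benchmark compression cost for a 4 KB value
import timeit

import lz4.frame

payload = (b'{"id": 12345, "name": "alice", "role": "admin"}' * 90)[:4096]
compressed = lz4.frame.compress(payload)

n = 100_000
c = timeit.timeit(lambda: lz4.frame.compress(payload), number=n) / n
d = timeit.timeit(lambda: lz4.frame.decompress(compressed), number=n) / n
print(f'compress:   {c * 1e6:.2f} us per 4 KB value')
print(f'decompress: {d * 1e6:.2f} us per 4 KB value')
print(f'ratio:      {len(payload) / len(compressed):.1f}x')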
Technique 3: Use Integer Encoding for Numeric Keys and Values
How It Works
Redis has a special optimization for integers. When a value is a string that represents an integer between 0 and 9999, Redis does not allocate an SDS string at all. Instead, it uses a shared object from a pre-allocated pool. For any other value that fits in a signed 64-bit integer (including negative numbers), Redis stores the integer directly in the redisObject pointer field, eliminating the SDS allocation entirely. This saves 8-24 bytes per value compared to storing the same number as a string.
This optimization is automatic but only triggers when the value is a pure integer string. Storing "12345" triggers integer encoding. Storing "12345 " (with a trailing space) does not. Storing "12345.67" does not. If your values are numeric IDs, timestamps, counters, or flags, ensure they are stored as pure integer strings to take advantage of this encoding.
Implementation
# These trigger integer encoding (no SDS allocation):
SET counter:page_views "98234"
SET user:12345:last_seen "1745900400"
SET feature:dark_mode "1"
# These do NOT trigger integer encoding (SDS allocated):
SET counter:page_views "98,234" # comma
SET user:12345:last_seen "1745900400.5" # decimal
SET feature:dark_mode "true" # not an integer
# Verify encoding:
DEBUG OBJECT counter:page_views
# encoding:int (good - no SDS overhead)
DEBUG OBJECT feature:dark_mode
# encoding:embstr (SDS allocated - using more memory)
Memory Saved
For workloads with many numeric values (counters, timestamps, flags, IDs), integer encoding saves 15-25% of value memory. The savings are modest per key (8-24 bytes) but add up at scale. At 10 million numeric keys, that is 80-240 MB saved. The best part is that this optimization is free: it requires no code changes beyond ensuring your numeric values are stored as clean integer strings.
The Tradeoff
There is effectively no tradeoff. Integer encoding is strictly better than string encoding for integer values. The only risk is assuming a value is integer-encoded when it is not. Use DEBUG OBJECT to verify during development, and audit your key patterns periodically to ensure new code paths are not accidentally storing numeric values with non-integer formatting.
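That audit is easy to script. A minimal redis-py sketch that samples random keys and flags string values that look numeric but missed integer encoding; the sample size and cleanup rules are illustrative:
# Flag numeric-looking values that did not get integer encoding
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

for _ in range(1000):
    key = r.randomkey()
    if key is None or r.type(key) != 'string':
        continue
    if r.object('encoding', key) == 'int':
        continue  # already optimal
    value = r.get(key) or ''
    # If stripping whitespace and separators yields a pure integer,
    # the value could have been stored in int encoding
    cleaned = value.strip().replace(',', '')
    if cleaned.lstrip('-').isdigit():
        print(f'{key}: stored as {value!r}, could be integer-encoded')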
Technique 4: Set maxmemory-policy to allkeys-lfu
How It Works
When Redis reaches its maxmemory limit, it must evict keys to make room for new writes. The eviction policy determines which keys are evicted. The default policy in many configurations is noeviction, which rejects new writes entirely. This is the worst possible policy for a cache because it means your application starts failing once Redis is full, rather than evicting stale data to make room.
The allkeys-lru policy evicts the least recently used keys. This is better than noeviction but still suboptimal. LRU treats all keys equally regardless of access frequency. A key accessed once three minutes ago is considered "more recently used" than a key accessed 10,000 times four minutes ago. Under LRU, the frequently-accessed key gets evicted first, which is exactly backward.
The allkeys-lfu policy, available since Redis 4.0, evicts the least frequently used keys. It maintains an approximate access frequency counter for each key using a Morris-style probabilistic counter, which requires only 8 bits of additional storage per key. Frequently accessed keys are retained even when other keys have been touched more recently. This matches cache workload behavior far better than LRU, because caches should retain hot data and evict cold data.
Implementation
# Set eviction policy to LFU
CONFIG SET maxmemory-policy allkeys-lfu
# Tune LFU parameters (optional):
# lfu-log-factor controls how fast the frequency counter saturates.
# Default is 10. Higher values mean it takes more accesses to reach
# max frequency. Lower values make the counter saturate faster.
CONFIG SET lfu-log-factor 10
# lfu-decay-time controls how often the frequency counter decays.
# Default is 1 (decay every 1 minute). Set to 0 to never decay.
# Higher values make the counter decay slower.
CONFIG SET lfu-decay-time 1
# Verify:
CONFIG GET maxmemory-policy
# "allkeys-lfu"
# Check which keys have low frequency (eviction candidates):
OBJECT FREQ mykey
# Returns the LFU frequency counter (0-255)
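To see how LFU ranks your keyspace from application code, sample the frequency counters. A minimal redis-py sketch (OBJECT FREQ only succeeds while an LFU policy is active; the sample size and threshold are illustrative):
# Sample LFU counters to find likely eviction candidates
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

cold = []
for _ in range(1000):
    key = r.randomkey()
    if key is None:
        continue
    freq = r.object('freq', key)  # 0-255 logarithmic counter
    if freq <= 1:
        cold.append((key, freq))

print(f'{len(cold)} of 1000 sampled keys are likely eviction candidates')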
Memory Saved
Switching from LRU to LFU does not directly reduce memory usage. Instead, it improves cache hit rate by 5-15% for workloads with skewed access distributions (which is most workloads). Higher hit rate means fewer cache misses, which means fewer database queries, which means you can serve the same workload with a smaller cache. In practice, teams that switch from LRU to LFU and then gradually reduce their maxmemory allocation find they can reduce Redis memory by 10-20% without degrading hit rate. The savings come from better eviction decisions, not from reduced overhead.
The Tradeoff
LFU adds 8 bits of overhead per key for the frequency counter. At 10 million keys, that is 10 MB, which is negligible. The real tradeoff is behavioral: LFU can be slow to adapt to changing access patterns. A key that was hot yesterday but is cold today retains a high frequency counter until it decays. The lfu-decay-time parameter controls this: setting it to 1 minute means counters decay every minute, which is usually fast enough. For workloads with rapidly shifting hot sets, LRU might actually outperform LFU because LRU responds immediately to recency while LFU has decay lag.
Technique 5: Use Key Expiry Aggressively
How It Works
The simplest way to reduce Redis memory usage is to store fewer keys. Many applications set long TTLs (hours or days) or no TTL at all on cache entries, under the assumption that having data in cache is always better than not having it. This assumption is wrong. A cache entry that has not been accessed in 30 minutes is unlikely to be accessed again. It is consuming memory that could be used for actually-hot data. Worse, if Redis hits maxmemory and starts evicting, those cold entries dilute the eviction pool, making it more likely that a hot key gets evicted instead.
Aggressive TTL means setting the shortest TTL that is acceptable for each data category. Session data might need a 30-minute TTL. Feature flags might need a 5-minute TTL. API response caches might only need a 60-second TTL. Configuration values might need a 10-minute TTL. The key insight is that TTL should match the acceptable staleness window, not the expected lifetime of the data. If you can tolerate 60 seconds of staleness for an API response, set a 60-second TTL, even if the underlying data changes once per hour.
Implementation
# Set TTLs appropriate to each data category:
# Session data: 30 minutes
SETEX session:abc123 1800 "{...}"
# Feature flags: 5 minutes (changes are rare, short TTL is fine)
SETEX feature:dark_mode 300 "1"
# API response cache: 60 seconds
SETEX api:/users/123 60 "{...}"
# Rate limit counters: match the rate limit window
SETEX ratelimit:user:123 60 "47"
# Audit your keyspace for keys without TTL:
# This shows how many keys have no expiry set
INFO keyspace
# db0:keys=10234567,expires=7823456,avg_ttl=234567
# 10.2M keys but only 7.8M have TTLs set.
# 2.4M keys have no expiry and will live forever.
Memory Saved
The savings depend on how many keys are currently living longer than they need to. In a typical production deployment, 20-40% of keys have expired in practical terms (nobody will ever read them again) but have not been evicted because they have no TTL and Redis has not hit maxmemory. Removing these zombie keys typically saves 20-40% of total memory. The best diagnostic is to sample 1,000 random keys with RANDOMKEY, check their TTL and last access time, and calculate what percentage have not been accessed in the last hour. That percentage is your potential savings.
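That diagnostic is easy to script. A minimal redis-py sketch (OBJECT IDLETIME reads the LRU clock, so it errors under an LFU eviction policy; the one-hour threshold is illustrative):
# Estimate zombie keys: no TTL and not accessed in over an hour
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

sampled, zombies = 0, 0
for _ in range(1000):
    key = r.randomkey()
    if key is None:
        continue
    sampled += 1
    no_ttl = r.ttl(key) == -1           # -1 means no expiry set
    idle = r.object('idletime', key)    # seconds since last access
    if no_ttl and idle > 3600:
        zombies += 1

print(f'{zombies}/{sampled} sampled keys look like zombies '
      f'(~{100 * zombies / max(sampled, 1):.0f}% potential savings)')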
The Tradeoff
Shorter TTLs mean more cache misses, which mean more database queries. The tradeoff is straightforward: memory savings versus miss rate. The key is to set TTLs based on access frequency, not on data freshness. If a key is accessed 100 times per second, a 60-second TTL means it will be refreshed every 60 seconds but served from cache for 99.98% of reads (one miss per 6,000 reads). The one miss every 60 seconds is negligible. If a key is accessed once per hour, a 60-second TTL means it is almost never in cache, and you should either extend the TTL or not cache it at all.
Technique 6: Eliminate Duplicate Data Across Keys
How It Works
Many applications store the same data under multiple keys for different access patterns. A user's profile might be stored as user:123 (by ID), user:email:alice@example.com (by email), and user:username:alice (by username). The profile data is duplicated three times. If the profile is 500 bytes, that is 1.5 KB of value storage plus three sets of key overhead (210+ bytes) for data that could be stored once.
The fix is to use indirection: store the data once under the canonical key, and store pointers (the canonical key name) under the secondary keys. This trades one extra Redis lookup on secondary-key access for a 2-3x reduction in memory usage for duplicated data.
Implementation
# Before: 3 copies of the same 500-byte profile
SET user:123 "{full profile JSON, 500 bytes}"
SET user:email:alice@example.com "{full profile JSON, 500 bytes}"
SET user:username:alice "{full profile JSON, 500 bytes}"
# Total: 1,500 bytes data + 210 bytes overhead = 1,710 bytes
# After: 1 copy + 2 pointers
SET user:123 "{full profile JSON, 500 bytes}"
SET user:email:alice@example.com "user:123"
SET user:username:alice "user:123"
# Total: 500 + 8 + 8 = 516 bytes data + 210 bytes overhead = 726 bytes
# Savings: 57%
# Lookup by email requires 2 Redis calls:
canonical_key = redis.get("user:email:alice@example.com") # "user:123"
profile = redis.get(canonical_key) # full profile
# Or use a Lua script for atomic resolution:
EVAL "local k = redis.call('GET', KEYS[1]); \
if k then return redis.call('GET', k) end; \
return nil" 1 "user:email:alice@example.com"
Memory Saved
The savings scale with the number of secondary access patterns and the size of the duplicated values. If you have 3 access patterns per entity with 500-byte values, deduplication saves approximately 57% of value memory for those entities. For 1 million entities with 3 keys each, that is 1 GB saved. At 10 access patterns per entity (common in search-heavy applications), the savings approach 90%. The overhead of the pointer keys is minimal: each pointer is just the canonical key name, typically 10-30 bytes.
The Tradeoff
Secondary key lookups now require two Redis round-trips instead of one. On a local connection (100 microseconds per round-trip), this adds 100 microseconds. On a cross-AZ connection (300 microseconds per round-trip), it adds 300 microseconds. You can mitigate this with a Lua script that performs both lookups atomically in a single round-trip, or with a pipeline that sends both GET commands in one batch. The Lua approach is better because it avoids the race condition where the canonical key is deleted between the two GET calls. The tradeoff is acceptable for most workloads because secondary key lookups (by email, by username) are typically less frequent than primary key lookups (by ID).
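In application code, you can register the resolution script once and reuse it. A minimal redis-py sketch using register_script (a standard redis-py feature); the key names are illustrative:
# Resolve a secondary key to its canonical value in one round-trip
import redis

r = redis.Redis(host='localhost', port=6379)

RESOLVE_LUA = """
local canonical = redis.call('GET', KEYS[1])
if canonical then
    return redis.call('GET', canonical)
end
return nil
"""
resolve = r.register_script(RESOLVE_LUA)

profile = resolve(keys=['user:email:alice@example.com'])  # bytes or None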
Technique 7: Move Hot Reads to an In-Process L1 Cache
How It Works
This is the most impactful technique, but it works differently from the others. Techniques 1-6 reduce the memory that each key consumes inside Redis. Technique 7 reduces the number of keys that Redis needs to hold at all, by moving the hottest data to an in-process cache that lives inside your application's memory space.
The insight is that most workloads follow a power-law distribution. A small number of keys account for the majority of reads. If 5% of your keys serve 80% of your reads, and you cache those keys in-process, Redis only needs to serve the remaining 20% of reads. The 80% of reads that hit L1 never touch Redis, never consume Redis CPU, and never occupy a Redis connection. The practical effect is that you can reduce your Redis instance size by the percentage of total memory that was being consumed by hot keys, because those keys are now served from application memory.
An in-process hash map lookup takes approximately 31 nanoseconds. A Redis GET over the network takes 100-500 microseconds. That is a 3,000-16,000x latency difference. But the memory benefit is just as important as the latency benefit. Every key you move to L1 is a key that Redis no longer needs to track, which means one fewer dictEntry, one fewer redisObject, one fewer SDS header pair. The 70+ bytes of Redis overhead per key drops to approximately 40 bytes of in-process overhead (the HashMap entry itself).
Implementation
# Python example: L1 in-process cache with TTL
from cachetools import TTLCache
import redis

# L1: in-process, 10,000 keys max, 30-second TTL
l1 = TTLCache(maxsize=10000, ttl=30)
# L2: Redis
l2 = redis.Redis(host='redis.internal', port=6379)

def get_cached(key):
    # Check L1 first (~31 nanoseconds)
    value = l1.get(key)
    if value is not None:
        return value
    # L1 miss: check L2 (~300 microseconds)
    value = l2.get(key)
    if value is not None:
        l1[key] = value  # Populate L1 for next read
        return value
    # L2 miss: fetch from the backing store
    # (db.query is a placeholder for your data-access layer)
    value = db.query(key)
    if value is not None:
        l2.setex(key, 300, value)  # Populate L2
        l1[key] = value            # Populate L1
    return value
Memory Saved
The memory savings in Redis are proportional to how many hot keys you move to L1. If your hot set is 10,000 keys (which is often enough to absorb 80%+ of reads), Redis sheds approximately 700 KB of overhead for those keys. That sounds small, but the real savings come from the fact that you can now provision a smaller Redis instance. If 80% of your reads hit L1, your Redis instance handles 5x fewer operations per second. You can often downgrade from a cache.r7g.xlarge to a cache.r7g.large and cut that node's ElastiCache cost roughly in half. The memory savings in Redis are modest; the infrastructure cost savings from needing a smaller instance are substantial.
More importantly, the L1 approach compounds with all the other techniques. You apply techniques 1-6 to reduce per-key overhead inside Redis. Then you apply technique 7 to shift the hottest reads off Redis entirely. The combined effect is multiplicative: techniques 1-6 shrink every key Redis still holds, while technique 7 lets you serve most reads without touching Redis at all, so a smaller per-key footprint pairs with a smaller, cheaper instance.
The Tradeoff
L1 cache introduces eventual consistency between application instances. If instance A writes a new value to Redis and instance B has the old value in its L1, instance B will serve stale data until its L1 TTL expires. For a 30-second L1 TTL, the maximum staleness is 30 seconds. For most cache workloads (session data, feature flags, API responses), 30 seconds of staleness is acceptable. For workloads that require strong consistency (financial transactions, inventory counts), do not cache in L1.
The second tradeoff is application memory usage. Each L1 entry consumes application heap memory. At 10,000 keys with 500-byte average values, L1 uses approximately 5 MB per application instance. At 100,000 keys, it uses 50 MB. This is typically negligible compared to the application's existing memory footprint, but it should be monitored to prevent L1 from growing unboundedly. Set a maxsize limit on your L1 cache to cap memory usage.
Where the 70+ Bytes Come From
Redis's per-key overhead breaks down as follows. A dictEntry contains three pointers (key, value, next): 24 bytes on 64-bit systems. The redisObject wrapper for the value contains type, encoding, LRU clock, and refcount fields plus a pointer to the actual data: 16 bytes. (Key names do not get a redisObject; the main dict stores them as bare SDS strings.) The key name SDS has a header (3-17 bytes depending on length class) plus the string data plus a null terminator. The value SDS has the same structure. jemalloc rounds each allocation up to the next size class (8, 16, 32, 48, 64, 80, 96, ...). A short key "user:123" (8 characters plus a null terminator: 9 bytes) with an SDS-8 header (3 bytes) is 12 bytes, rounded to 16 by jemalloc. A short value "session_data" (12 characters plus a null terminator: 13 bytes) with an SDS-8 header (3 bytes) is 16 bytes, which is already a jemalloc size class. Total: 24 (dictEntry) + 16 (key SDS) + 16 (redisObject for the value) + 16 (value SDS) = 72 bytes minimum. With hash table bucket pointers and dict bookkeeping on top, the effective overhead is 70-90 bytes per key-value pair.
Putting It All Together
The seven techniques above attack Redis memory usage from three angles: reducing per-key overhead (techniques 1, 3), reducing per-value size (techniques 2, 6), and reducing key count (techniques 4, 5, 7). The most effective approach is to apply all three angles simultaneously.
| Technique | Effort | Memory Reduction | Best For |
|---|---|---|---|
| 1. Hash ziplist encoding | Medium | 50-80% | Multi-field objects |
| 2. Value compression (LZ4) | Low | 60-75% | JSON/text values > 200 bytes |
| 3. Integer encoding | Low | 15-25% | Numeric values |
| 4. allkeys-lfu eviction | Low | 10-20% | All cache workloads |
| 5. Aggressive TTLs | Low | 20-40% | Keys without expiry |
| 6. Deduplicate values | Medium | 40-60% | Multi-key access patterns |
| 7. In-process L1 tier | Medium | 50%+ infra cost | Hot-key-heavy workloads |
Start with the low-effort changes: set maxmemory-policy allkeys-lfu (technique 4), audit keys without TTLs and add appropriate expiries (technique 5), and verify that numeric values are stored as clean integer strings (technique 3). These three require minimal code changes and typically save 20-30% in aggregate.
Then tackle the medium-effort changes: consolidate multi-field objects into hash keys (technique 1), add LZ4 compression for values over 200 bytes (technique 2), and deduplicate multi-key access patterns (technique 6). These require application code changes but offer 40-80% additional savings on applicable key patterns.
Finally, add an in-process L1 cache tier (technique 7). This is the single highest-leverage change because it reduces not just memory usage but also Redis CPU, connection count, and network bandwidth. When 80% of reads never reach Redis, you can downgrade your Redis instance to a smaller (cheaper) size class.
The Bottom Line
Redis memory overhead is a tax that scales with key count, not with data size. At 10 million keys, you pay 700 MB just for metadata. The seven techniques above attack this tax from every angle: ziplist encoding reduces per-key overhead, compression reduces per-value size, TTLs reduce key count, and L1 tiering removes hot keys from Redis entirely. Applied together, they routinely cut Redis memory usage by 50% or more. That is not just a performance improvement. At $0.068 per GB per hour, it is a direct reduction in your infrastructure bill.