
PostgreSQL Cache Hit Ratio Below 99%? Here’s Your Fix

Run this query right now:

```sql
SELECT sum(heap_blks_hit) / sum(heap_blks_hit + heap_blks_read) AS ratio
FROM pg_statio_user_tables;
```

If the number that comes back is below 0.99, your PostgreSQL instance is reading data from disk instead of serving it from its buffer cache — and every query in your application is paying for it. A cache hit ratio of 95% sounds acceptable until you realize that the remaining 5% means one out of every twenty page requests triggers a physical disk read. On a table with 10 million rows receiving 2,000 queries per second, that is 100 disk reads per second that should have been memory lookups. Each disk read costs 1–10ms depending on your storage backend. Each memory read costs 0.1–0.3 microseconds. The difference is four orders of magnitude, and it compounds on every query your application runs.

Understanding PostgreSQL Buffer Cache

PostgreSQL does not read directly from disk on every query. It maintains a region of shared memory called the buffer cache, controlled by the shared_buffers configuration parameter. When PostgreSQL needs a page of data (an 8KB block from a table or index), it first checks whether that page is already in the buffer cache. If it is, that is a heap_blks_hit — a buffer cache hit. If it is not, PostgreSQL must read the page from disk (or from the operating system’s page cache), and that is a heap_blks_read. The cache hit ratio is the proportion of page accesses that were served from the buffer cache without touching disk.
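The bookkeeping behind these counters can be sketched with a toy model. PostgreSQL's real eviction policy is clock-sweep, not LRU, and the `BufferCache` class below is purely illustrative — it only shows how hits and misses accumulate into a ratio the way heap_blks_hit and heap_blks_read do:

```python
from collections import OrderedDict

class BufferCache:
    """Simplified LRU model of a page cache. PostgreSQL actually uses a
    clock-sweep eviction algorithm; only the hit/miss accounting here is
    analogous to heap_blks_hit / heap_blks_read."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # page_id -> None, ordered by recency
        self.hits = 0               # analogous to heap_blks_hit
        self.reads = 0              # analogous to heap_blks_read (disk reads)

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)    # mark most recently used
            self.hits += 1
        else:
            self.reads += 1                    # simulated disk read
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False) # evict least recently used
            self.pages[page_id] = None

    def hit_ratio(self):
        total = self.hits + self.reads
        return self.hits / total if total else 0.0

cache = BufferCache(capacity_pages=100)
# A working set of 80 pages fits in cache: after warm-up, every access hits
for _ in range(10):
    for page in range(80):
        cache.access(page)
print(round(cache.hit_ratio(), 3))  # prints 0.9 (80 cold misses, 720 hits)
```

Shrink `capacity_pages` below the working set and the ratio collapses — the same dynamic as a shared_buffers that is too small for the tables being queried.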

The widely accepted target is 99% or higher. This is not an arbitrary number. At 99%, only 1 in 100 page accesses goes to disk. At 95%, 1 in 20 goes to disk — a 5x increase in disk I/O. At 90%, 1 in 10 — a 10x increase. The relationship between cache hit ratio and query latency is therefore far from linear in its impact: because a disk read is 100–1000x slower than a buffer hit, the rare misses dominate the average. A drop from 99% to 97% does not feel like a 2% degradation. It feels like queries that used to take 2ms now take 8ms, because the handful of queries that hit disk drag the average up disproportionately.
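That disproportion is easy to check numerically. A minimal sketch, assuming illustrative per-page costs of 0.25µs for a buffer hit and 5ms for a disk read (both inside the ranges quoted above):

```python
def avg_page_access_us(hit_ratio, hit_cost_us=0.25, miss_cost_us=5000.0):
    """Expected cost of one page access, in microseconds.
    Costs are illustrative assumptions, not measurements."""
    return hit_ratio * hit_cost_us + (1 - hit_ratio) * miss_cost_us

for ratio in (0.99, 0.97, 0.95, 0.90):
    print(f"{ratio:.0%} hit ratio -> {avg_page_access_us(ratio):8.2f} us/page")
```

With these numbers, going from 99% to 95% does not cost you 4% — it makes the average page access roughly five times more expensive, because the miss term dominates.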

You can check your per-table cache hit ratio with this more detailed query:

```sql
-- Per-table buffer cache hit ratio
SELECT
    relname,
    heap_blks_read AS disk_reads,
    heap_blks_hit  AS cache_hits,
    CASE
        WHEN heap_blks_hit + heap_blks_read = 0 THEN 0
        ELSE ROUND(heap_blks_hit::numeric / (heap_blks_hit + heap_blks_read), 4)
    END AS hit_ratio
FROM pg_statio_user_tables
ORDER BY disk_reads DESC;
```

Look for tables with a hit ratio below 0.99. Those are the tables where your database is going to disk, and they are the ones dragging down your overall performance. Pay special attention to large tables that are queried frequently — they are the biggest offenders because they have the most pages to cache and the highest probability of cache misses.

Key insight: The pg_statio_user_tables view tracks cumulative statistics since the last pg_stat_reset(). If your database has been running for months, the ratio reflects long-term behavior. To see recent performance, reset stats with SELECT pg_stat_reset();, wait 10–15 minutes under normal load, then re-query. This gives you an accurate snapshot of your current working set versus your buffer cache capacity.

Why Your Ratio Is Low

A cache hit ratio below 99% has a root cause, and it is almost always one of five things. Understanding which one applies to your situation determines the fix.

1. shared_buffers is too small

The default shared_buffers in PostgreSQL is 128MB. That is enough for a toy database. If your working set — the set of table and index pages that are actively queried — is 4GB, and your buffer cache is 128MB, only about 3% of your working set fits in cache. The standard recommendation is to set shared_buffers to 25% of total system RAM, with a ceiling of about 8–16GB. On a server with 64GB of RAM, that means 16GB. On a server with 16GB, that means 4GB. Beyond roughly 16GB, returns often diminish: the OS page cache already acts as an effective second cache layer, and very large shared memory regions add buffer-management overhead inside PostgreSQL.

2. Working set exceeds available memory

If your actively-queried data is 40GB and your server has 32GB of RAM, no amount of shared_buffers tuning will get you to 99%. The data simply does not fit. This is a scaling problem, not a configuration problem. Solutions include vertical scaling (more RAM), horizontal scaling (read replicas for read-heavy workloads), table partitioning to reduce the effective working set, or archiving old data that is rarely queried but occupies buffer cache space.

3. Sequential scans on large tables

A single SELECT * FROM orders WHERE status = 'pending' without an index on status triggers a sequential scan that reads every page of the orders table into the buffer cache, potentially evicting pages that were being used by other hot queries. One bad query can trash your entire cache. PostgreSQL does have a “bulk read” ring buffer strategy that limits the damage of sequential scans, but it is not perfect, and queries using work_mem for sorts and hash joins can still evict valuable cached pages.

4. Bloated indexes

Indexes that have not been maintained accumulate dead tuples and become bloated. A B-tree index that should be 200MB balloons to 1.2GB. That bloated index now occupies 6x more buffer cache space, crowding out other data. Worse, index scans traverse more pages to find the same data, increasing both disk reads and cache pressure. The pgstattuple extension can show you index bloat levels.

5. Too many connections

Each PostgreSQL connection consumes memory for its working area. With 500 direct connections (no pooler), the overhead from work_mem, temp_buffers, and per-connection state can consume gigabytes of RAM that would otherwise be available for the OS page cache, which acts as a second layer of caching behind shared_buffers. Connection overhead indirectly lowers your hit ratio by starving the system of usable memory.

The PostgreSQL Fixes

Each of the causes above has a direct fix at the PostgreSQL level. Here are the concrete commands.

Increase shared_buffers

```sql
-- Check current setting
SHOW shared_buffers;
```

```ini
# In postgresql.conf (requires restart)
shared_buffers = '4GB'          # 25% of RAM for a 16GB server
effective_cache_size = '12GB'   # tell the planner about total cache (RAM - shared_buffers)
```

```shell
# Restart PostgreSQL
sudo systemctl restart postgresql
```

Add missing indexes

```sql
-- Find sequential scans on large tables (these need indexes)
SELECT
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan,
    CASE WHEN seq_scan > 0 THEN seq_tup_read / seq_scan ELSE 0 END
        AS avg_rows_per_seq_scan
FROM pg_stat_user_tables
WHERE seq_scan > 100
ORDER BY seq_tup_read DESC;

-- Create the index (concurrently, to avoid locking writes)
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
```

VACUUM and reindex regularly

```sql
-- Check table bloat
SELECT
    relname,
    n_dead_tup,
    n_live_tup,
    ROUND(n_dead_tup::numeric / GREATEST(n_live_tup, 1) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

-- Reclaim space
VACUUM (VERBOSE) orders;

-- Rebuild bloated indexes (concurrently, PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_orders_status;
```

Use connection pooling

```ini
# PgBouncer configuration (pgbouncer.ini)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
# Release the server connection after each transaction
pool_mode = transaction
# Accept up to 1000 application connections...
max_client_conn = 1000
# ...but hold only 25 actual PostgreSQL connections
default_pool_size = 25

# 500 direct connections → 25 pooled connections
# frees ~8GB of RAM for the OS page cache
```
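The "~8GB freed" figure is back-of-envelope arithmetic. A sketch assuming roughly 16MB of per-connection overhead (an assumption; the real figure depends on work_mem, temp_buffers, and workload):

```python
# Back-of-envelope memory math for connection pooling.
# ~16 MB per backend is an illustrative assumption, not a measurement.
per_conn_mb = 16
direct_mb = 500 * per_conn_mb   # 500 direct connections: 8000 MB
pooled_mb = 25 * per_conn_mb    # 25 pooled connections:   400 MB
freed_gb = (direct_mb - pooled_mb) / 1024
print(f"RAM freed for OS page cache: ~{freed_gb:.1f} GB")
```

With these assumed numbers, 500 direct backends consume about 8GB of RAM, and pooling hands nearly all of it back to the OS page cache.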

These fixes work. With proper shared_buffers, targeted indexes, regular vacuuming, and connection pooling, most PostgreSQL instances can reach and sustain a 99%+ buffer cache hit ratio. But here is the problem: even at 99%, every query still costs 1–5ms.

The Application-Layer Fix

A 99% PostgreSQL buffer cache hit ratio means your database is serving data from memory instead of disk. That is good. But “from memory” does not mean “free.” Every query still pays the full cost of the PostgreSQL execution pipeline: parse the SQL statement, plan the optimal execution strategy (choosing between index scan, sequential scan, hash join, etc.), execute the plan (traversing the buffer cache, assembling tuples, applying WHERE filters), and serialize the result set to send over the network to your application. Even a simple SELECT * FROM users WHERE id = 123 with a primary key index scan and a perfect buffer cache hit takes 0.5–2ms by the time the result reaches your application. A more complex query with joins, aggregations, or subqueries takes 3–10ms even when 100% of the data is cached in memory.

An application-layer L1 cache eliminates the query entirely. Instead of asking PostgreSQL to parse, plan, execute, and return the data, the application looks up the result in its own process memory. That lookup takes 1.5 microseconds — not 1.5 milliseconds, microseconds. There is no SQL parsing, no query planning, no buffer cache traversal, no network round-trip to the database. The data is already in the application’s address space as a native object. No serialization, no deserialization. A hash table lookup and a pointer dereference.
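As a sketch of what "a hash table lookup" means in practice, here is a minimal in-process read-through cache with a TTL. The names (`L1Cache`, `load_user`) are hypothetical, and a production L1 cache adds eviction, size bounds, and coordinated invalidation:

```python
import time

class L1Cache:
    """Minimal in-process read-through cache with a TTL (illustrative only)."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                # hit: a dict lookup, no SQL, no network
        value = loader(key)                # miss: fall through to PostgreSQL
        self.store[key] = (value, now + self.ttl)
        return value

# The loader stands in for the real query, e.g. SELECT * FROM users WHERE id = %s
calls = []
def load_user(user_id):
    calls.append(user_id)                  # pretend this is a 2ms database round-trip
    return {"id": user_id, "name": f"user-{user_id}"}

cache = L1Cache(ttl_seconds=30)
cache.get(123, load_user)   # miss: queries the database
cache.get(123, load_user)   # hit: served from process memory
print(len(calls))           # prints 1: the database was queried only once
```

Every call inside the TTL after the first is a dictionary lookup in the application's own address space — no parse, no plan, no serialization.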

The math: At 99% PG buffer cache hit ratio, your average query takes ~2ms. With an application-layer L1 cache at 95% hit rate, 95% of queries resolve in 1.5µs and only 5% reach PostgreSQL at 2ms. Your effective average latency drops from 2ms to 0.1ms — a 20x improvement. And those 5% of queries that do reach PostgreSQL hit a nearly idle database with plenty of buffer cache headroom. See predictive caching for how AI pre-warming pushes L1 hit rates above 99%.
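Plugging the paragraph's figures into the weighted average confirms the claim:

```python
# Weighted average latency with a 95% L1 hit rate,
# using the figures quoted above: 1.5us L1 hit, 2ms PostgreSQL query.
l1_hit_us, pg_query_us = 1.5, 2000.0
effective_us = 0.95 * l1_hit_us + 0.05 * pg_query_us
speedup = pg_query_us / effective_us
print(f"effective latency: {effective_us / 1000:.3f} ms, speedup: {speedup:.0f}x")
# prints: effective latency: 0.101 ms, speedup: 20x
```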

The Compound Effect

The real power comes from stacking both layers. PostgreSQL buffer cache ensures that the queries which do reach the database are served from memory, not disk. The application L1 cache ensures that most queries never reach the database at all. The result is near-zero database load: your PostgreSQL instance idles at 5–10% CPU instead of 70%, your connection pool stays well below capacity, and your buffer cache hit ratio paradoxically improves because fewer queries means less cache pressure means less eviction. The two layers reinforce each other.

The key numbers: 1.5µs for an L1 cache lookup versus 2ms for a PG buffer-hit query (1,333× faster with L1), and ~5% database CPU after L1.

Your PostgreSQL buffer cache hit ratio matters. Getting it above 99% with the fixes described here is table stakes for any production database. But the ceiling of database-level caching is the database itself. Every query still parses, plans, and executes. Every result still serializes over the network. The application-layer L1 cache removes that ceiling entirely. The fastest query is the one you never send.
