Postgres pg_stat_slru: Monitoring the Caches Nobody Watches

Postgres version: 13+. Topic: Observability, SLRU caches. Level: Intermediate.

Postgres has a category of caches that almost no monitoring setup watches: SLRU (simple least-recently-used) caches. They are small, fixed-size page caches that hold transaction metadata, the stuff Postgres needs to decide whether a row is visible to your query.

The pg_stat_slru view, introduced in Postgres 13, exposes per-pool statistics for each SLRU cache. Most people have never looked at it.

What SLRU caches hold

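There are several distinct SLRU pools, each tracking different transaction metadata. Besides pg_xact, which stores the commit status of every transaction, the other pools are listed here by their on-disk directory names. The name column in pg_stat_slru uses its own labels (for example Subtrans on Postgres 13 to 16; Postgres 17 renamed them), and pg_multixact appears there as two pools, one for offsets and one for members: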

pg_subtrans — subtransaction parent tracking. Used when your transactions use savepoints or nested subtransactions.
pg_multixact — multi-transaction IDs. Used when multiple transactions hold row locks simultaneously, typically under SELECT FOR SHARE or SELECT FOR KEY SHARE.
pg_commit_ts — commit timestamps. Only populated when track_commit_timestamp = on.
pg_notify — LISTEN/NOTIFY payloads.
pg_serial — serializable transaction conflict tracking.

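Each pool is small. Through Postgres 16 the page counts are either compile-time constants or derived from shared_buffers up to a low cap, ranging from a handful of 8 kB pages to a few hundred, which keeps each pool at or under roughly 2 MB. On those versions you cannot tune the sizes without recompiling Postgres. Postgres 17 added per-pool settings (for example subtransaction_buffers and multixact_member_buffers) that take effect after a restart.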
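To see whether your server exposes per-pool size settings at all, a query along these lines can help. It is a sketch that assumes the Postgres 17 setting names; on Postgres 16 and earlier those settings do not exist, so it returns no rows.

```sql
-- Lists the SLRU buffer settings introduced in Postgres 17.
-- An empty result means the server predates them and the
-- pool sizes are fixed at build time.
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN (
    'commit_timestamp_buffers',
    'multixact_member_buffers',
    'multixact_offset_buffers',
    'notify_buffers',
    'serializable_buffers',
    'subtransaction_buffers',
    'transaction_buffers'
)
ORDER BY name;
```

The context column shows when a change takes effect; for these settings it should read postmaster, meaning a restart is needed.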

Reading pg_stat_slru

The view looks like this:

SELECT name, blks_zeroed, blks_hit, blks_read, blks_written,
       blks_exists, flushes, truncates
FROM pg_stat_slru
ORDER BY name;

Columns that matter most:

blks_hit — page found in the SLRU buffer, no I/O needed
blks_read — page had to be read from disk
blks_zeroed — fresh page allocated (new transaction ID range being used)
flushes — the entire SLRU was flushed to disk (checkpoint behavior)
truncates — old pages removed after transaction ID advancement
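A quick way to rank the pools is to derive a hit percentage from blks_hit and blks_read. The following is a minimal sketch; the hit_pct alias is just an illustrative name, and the counters are cumulative since the last statistics reset, so treat the ratio as a long-run average rather than a current reading.

```sql
-- Hit percentage per SLRU pool, busiest readers first.
-- nullif() avoids division by zero for pools that have never been touched.
SELECT name,
       blks_hit,
       blks_read,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_pct,
       stats_reset
FROM pg_stat_slru
ORDER BY blks_read DESC;
```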

Diagnostic patterns

High blks_read on pg_subtrans usually points to long-running transactions combined with heavy use of savepoints or subtransactions. Each subtransaction that is assigned a transaction ID gets an entry in pg_subtrans. Every backend caches up to 64 of its open subtransaction IDs in shared memory; once a transaction exceeds that, other sessions have to consult pg_subtrans to resolve visibility, and if the pages they need have been evicted from the SLRU buffer, Postgres reads them back from disk. This degrades performance and is a sign you should reduce savepoint frequency or break work into shorter transactions.
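When pg_subtrans reads climb, a reasonable first step is to look for the oldest open transactions. This sketch only finds long-running transactions; it does not show how many subtransactions each one has opened, so the application code behind the listed sessions still needs a manual look.

```sql
-- Ten oldest open transactions, excluding the current session.
SELECT pid,
       usename,
       state,
       now() - xact_start AS xact_age,
       backend_xid,
       left(query, 80) AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND pid <> pg_backend_pid()
ORDER BY xact_start
LIMIT 10;
```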

High blks_read on pg_multixact correlates with heavy SELECT FOR SHARE or SELECT FOR KEY SHARE usage under concurrency, including the FOR KEY SHARE locks that foreign-key checks take implicitly. When multiple transactions lock the same row, Postgres creates a MultiXact ID to track all the holders. Heavy multixact churn is both an SLRU pressure source and a sign you may be approaching the multixact wraparound horizon.
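To judge how close a cluster is to the multixact horizon, the age of each database's oldest multixact can be compared with the autovacuum freeze threshold. A minimal sketch, with illustrative column aliases:

```sql
-- Multixact age per database next to the threshold at which
-- autovacuum forces a freeze to prevent multixact wraparound.
SELECT datname,
       mxid_age(datminmxid) AS multixact_age,
       current_setting('autovacuum_multixact_freeze_max_age')::bigint AS freeze_max_age
FROM pg_database
ORDER BY multixact_age DESC;
```

A multixact_age that keeps approaching freeze_max_age suggests vacuum is not keeping up with multixact consumption.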

High blks_read on pg_notify occurs when LISTEN/NOTIFY is used heavily and the notification queue fills faster than it is consumed. This typically means idle listener connections that have stopped processing or a notification producer that is publishing at a rate the consumers cannot sustain.
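The fill level of the notification queue can be checked directly with a built-in function. It returns a fraction between 0 and 1; the alias below is illustrative.

```sql
-- Fraction of the asynchronous notification queue currently occupied
-- by notifications that are waiting to be processed.
SELECT pg_notification_queue_usage() AS queue_fraction_used;
```

A value that stays well above zero between checks suggests at least one listener is lagging.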

What pg_stat_slru does not show

No per-transaction breakdown. No correlation with lock waits. No indication of how SLRU pressure translates to query latency. For that, cross-reference with pg_stat_activity and look at wait events — specifically SLRURead wait events if they appear.
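A sketch of that cross-reference follows. It assumes the relevant wait event names contain "slru" in some capitalization, which covers names such as SLRURead, but exact spellings vary between Postgres versions and some related lock waits may use other names, so widen the filter if it returns nothing during a known incident.

```sql
-- Sessions currently waiting on an SLRU-related wait event.
-- ILIKE keeps the match case-insensitive across version-specific spellings.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS running_for,
       left(query, 80) AS current_query
FROM pg_stat_activity
WHERE wait_event ILIKE '%slru%'
ORDER BY query_start;
```

Because pg_stat_activity is a point-in-time view, sampling this query repeatedly gives a better picture than a single run.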

Operational guidance

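Before Postgres 17 the SLRU caches are not tunable, and even on versions with per-pool buffer settings, larger buffers tend to mask the pattern rather than remove it. The right response to SLRU contention is usually changing the workload pattern: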

High pg_subtrans reads → fewer savepoints, shorter transactions, lower subtransaction nesting
High pg_multixact reads → reduce concurrent row-level share locking, review concurrency design
High pg_notify reads → reduce notification volume, ensure consumers are keeping up

To monitor trends, snapshot pg_stat_slru and compare blks_read deltas over time. A slowly increasing blks_read rate in a specific pool is a workload signal worth investigating before it affects latency.
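One way to do this inside the database is a small snapshot table plus a window query. This is a sketch: the table name slru_snapshot is hypothetical, and scheduling the INSERT (cron, a monitoring agent, or similar) is left to you.

```sql
-- Hypothetical snapshot table; the name and column set are assumptions.
CREATE TABLE IF NOT EXISTS slru_snapshot (
    captured_at timestamptz NOT NULL DEFAULT now(),
    name        text        NOT NULL,
    blks_hit    bigint      NOT NULL,
    blks_read   bigint      NOT NULL,
    stats_reset timestamptz
);

-- Run this on a fixed schedule to record the cumulative counters.
INSERT INTO slru_snapshot (name, blks_hit, blks_read, stats_reset)
SELECT name, blks_hit, blks_read, stats_reset
FROM pg_stat_slru;

-- Change in blks_read between consecutive snapshots of each pool.
-- The first snapshot of a pool has no predecessor, so its delta is NULL.
SELECT name,
       captured_at,
       blks_read - lag(blks_read) OVER w AS blks_read_delta,
       captured_at - lag(captured_at) OVER w AS elapsed
FROM slru_snapshot
WINDOW w AS (PARTITION BY name ORDER BY captured_at)
ORDER BY name, captured_at;
```

A negative delta means the counters were reset between two snapshots; the stats_reset column makes those intervals easy to spot and discard.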
