Postgres pg_stat_io: The View That Finally Shows Your Database I/O Pattern
Postgres 16 added pg_stat_io, a system view that exposes I/O activity broken down by backend type and I/O context. It closes a long-standing observability gap. Most teams have not noticed the view exists yet.
Postgres 16, released October 2023, added a system view called pg_stat_io. It exposes I/O activity broken down by backend type and I/O context in a way that previous Postgres versions did not provide directly. The view closes a long-standing observability gap and is one of the more useful additions in recent Postgres history. Most teams have not noticed it exists yet, in part because the surrounding documentation is sparse and in part because reading the columns requires a small mental model.
What the view contains
Each row in pg_stat_io represents one combination of backend type, I/O object, and I/O context. The columns then count operations across categories: reads, writes, writebacks, extends, fsyncs, hits, evictions. The values are cumulative since the last statistics reset, similar to the other pg_stat_* views.
The three categorical columns are the key to interpretation. backend_type is the kind of process: client backend, autovacuum worker, background writer, checkpointer, walwriter, and so on. object is what is being read or written: usually 'relation' (a table or index) or 'temp relation' (a temporary table or hash spill). context is the reason for the I/O: 'normal' (a regular query reading data), 'vacuum' (autovacuum reading or writing), 'bulkread' (a sequential scan reading at high volume), 'bulkwrite' (COPY or CREATE TABLE AS at high volume).
The interesting queries against the view filter and aggregate across these three columns to answer specific questions. The view has dozens of rows because every meaningful combination is represented; the goal is not to read all of them but to query for the specific patterns that matter.
The questions the view answers
Several questions that were previously hard to answer become straightforward. How much I/O is autovacuum generating? Sum the rows where backend_type = 'autovacuum worker'. How much temporary-file I/O is happening, indicating queries that exceed work_mem? Sum the rows where object = 'temp relation'. How much sequential-scan I/O is happening relative to indexed-access I/O? Compare context = 'bulkread' to context = 'normal'.
The shared-buffer hit ratio per backend type becomes computable in a way it was not before. For client backends, the ratio of hits to reads across normal context tells you how often query data is found in shared buffers vs read from disk. For autovacuum workers, the same ratio tells you whether autovacuum is hitting cache or doing significant disk work. The two ratios are often very different, and the previously-available pg_stat_database aggregates blurred them together.
The write-pressure question becomes more diagnosable. The view distinguishes writes (data written to shared buffers' backing storage), writebacks (explicitly flushed-to-OS writes), extends (new pages added to a file as the table grows), and fsyncs (explicit syncs to durable storage). The relative volumes tell you whether you have many small writes, a few large writes, or sync-heavy workloads with checkpoint pressure.
The three diagnostic queries to keep handy
The single most useful query is the per-backend-type I/O summary, which gives a high-level picture of where the database's I/O budget is going:
SELECT backend_type,
sum(reads) AS reads,
sum(writes) AS writes,
sum(hits) AS hits,
round(sum(hits)::numeric / NULLIF(sum(hits)+sum(reads), 0) * 100, 1) AS hit_pct
FROM pg_stat_io
GROUP BY backend_type
ORDER BY sum(reads) + sum(writes) DESC;The output usually shows client backends and autovacuum workers as the largest consumers of I/O, with background writer and checkpointer accounting for the writes that happen during checkpoints. The hit percentage tells you which backend types are hitting cache effectively.
The second useful query is the temp-file diagnostic, which catches queries that exceed work_mem and spill to disk:
SELECT backend_type, context,
sum(reads) AS temp_reads,
sum(writes) AS temp_writes,
sum(extends) AS temp_extends
FROM pg_stat_io
WHERE object = 'temp relation'
GROUP BY backend_type, context
HAVING sum(reads) + sum(writes) > 0
ORDER BY sum(reads) + sum(writes) DESC;A substantial volume in this query indicates that work_mem is too low for the workload, or that specific queries (large sorts, hash joins on big tables) are spilling. The fix is usually to increase work_mem per-session for the offending queries, or to add appropriate indexes that reduce the sort or hash volume.
The third useful query is the bulkread-vs-normal ratio, which gives a signal of how much of the workload is sequential scanning vs indexed access:
SELECT context, sum(reads) AS reads, sum(hits) AS hits
FROM pg_stat_io
WHERE backend_type = 'client backend'
AND object = 'relation'
AND context IN ('normal', 'bulkread')
GROUP BY context;A high bulkread relative to normal usually indicates either expected behavior (analytics workloads, batch jobs, periodic full-table scans) or a missing index that is causing seq-scans. The query does not by itself distinguish the two, but the magnitude often gives a hint.
The reset story
The view is cumulative since the last reset, which by default is server start. The function pg_stat_reset_shared('io') resets just the I/O statistics, leaving other views' counters untouched. This is useful for measuring the I/O impact of a specific change: reset, run the workload, query the view.
The cumulative-counter problem is that a long-running server has counters dominated by historical activity, which can mask recent changes. Many teams have a small cron job that snapshots the view every few minutes to a regular table, allowing rate-of-change queries that show the actual current I/O pattern rather than the average since startup.
What the view does not show
The view shows shared-buffer-level I/O accounting, which is the layer Postgres directly manages. It does not show OS-level page-cache behavior, kernel-level I/O patterns, or storage-controller queueing. A read counted in pg_stat_io as a "read" might have been served from OS page cache without hitting disk at all, in which case the kernel-level metrics will show different numbers.
The view does not show I/O latency, only counts. The number of reads does not tell you how long they took. For latency information you need the OS-level metrics or the EXPLAIN ANALYZE timing for specific queries. The two perspectives complement each other: pg_stat_io tells you the volume and distribution of I/O, OS metrics tell you whether the I/O is fast or slow.
The view does not retain history. The counters are cumulative since reset, but the rate of change is not stored. To see how I/O patterns change over time, you need to sample the view periodically and store the snapshots externally.
The before-and-after comparison
The previous state of Postgres I/O observability was a collection of overlapping views with subtle differences. The pg_stat_bgwriter view showed background writer and checkpointer activity but not client-backend or autovacuum I/O. The pg_stat_database view showed cumulative reads and hits per database but blurred backend types together. The pg_statio_user_tables view showed per-table heap and index I/O but not broken down by backend type or context.
The new view unifies the breakdowns. The per-backend, per-context, per-object structure makes the dimensions explicit and queryable. The cost is that the view has more rows and requires explicit GROUP BY for most useful queries; the benefit is that the questions become straightforwardly expressible in SQL rather than requiring multi-view joins or external instrumentation.
The connection across our products
Our four products (DocuMint for PDF invoice generation, CronPing for cron job monitoring, FlagBit for feature flags, and WebhookVault for webhook debugging) are SQLite-based at our current scale. The pg_stat_io diagnostic does not apply directly because SQLite has a much simpler I/O model and a much smaller observability surface.
The eventual migration of one or more of these products to Postgres will inherit the I/O observability infrastructure that pg_stat_io provides. The discipline of querying the view at the same time each day and watching for changes in pattern is one of the small operational habits that catches problems early. The cost of building the habit is low; the value when a problem emerges is high.
Three observations
First: pg_stat_io is one of those system views that rewards investment before you need it. Running the diagnostic queries against a healthy database and getting a sense of the typical pattern is what makes the during-incident usage fast and correct. The cost of doing this is small; the value during an incident is the difference between knowing your baseline and not knowing it.
Second: the view exists because Postgres has been gradually moving toward better observability, but the development pattern is decade-scale. The roughly 30-year history of Postgres has produced a database that is easier to operate today than at any previous point, but the easier-to-operate part is a series of incremental improvements, each addressing a specific previously-hard question. The pg_stat_io view is one such increment, useful in proportion to how much the team invests in understanding it.
Third: the gap between database features being available and database features being used is consistently larger than database vendors expect. Many production teams are running Postgres versions with features they have not adopted, often for years after the features ship. The reasons are usually a combination of upgrade lag, documentation discoverability, and the simple fact that operational improvements compete with feature work for attention. The discipline of reading the release notes for the version your database is on, and identifying the features that would have helped during your last three incidents, is one of the highest-leverage improvements an experienced team can make.
The deeper observation is that operational maturity in any infrastructure tool is mostly about building a small set of named diagnostic queries that the team runs by reflex during incidents and during routine reviews. The queries do not need to be clever; they need to be available. The pg_stat_io view fits cleanly into that mental model: three or four canonical queries, run regularly, become a baseline that makes anomalies visible. The investment in setting that up is the difference between a team that knows what its database is doing and a team that guesses.