Postgres pg_stat_progress: Real-Time Diagnostics for Long-Running Operations
VACUUM and CREATE INDEX and COPY and ANALYZE and CLUSTER and base backup can all run for hours. Postgres has dedicated progress views that tell you exactly where each one is in real time. Most teams discover them during the second incident.
Long-running database operations have a particular flavour of operational anxiety. A VACUUM on a large table can take hours. A CREATE INDEX CONCURRENTLY on a billion-row table can take a day. A pg_basebackup on a terabyte cluster can run overnight. When the operation is in flight, the standard diagnostic question is the one the standard diagnostic tools cannot answer: how far along is it, and how long until it finishes?
Postgres has had dedicated progress views for these operations since version 9.6, and the coverage has expanded with each release. As of Postgres 17 the set includes pg_stat_progress_vacuum, pg_stat_progress_analyze, pg_stat_progress_create_index, pg_stat_progress_cluster, pg_stat_progress_copy, and pg_stat_progress_basebackup. Most teams discover them during their second long-running-operation incident, when the first one taught them that pg_stat_activity alone does not answer the actual question.
What the views actually expose
Each progress view exposes a small set of columns specific to the operation. The shape is roughly: a pid column that lets you join against pg_stat_activity, a phase column that names the current stage of the operation, and a set of numeric columns that name what is being counted. The phase column is the most useful for operational interpretation. A VACUUM moves through phases including initializing, scanning heap, vacuuming indexes, vacuuming heap, cleaning up indexes, truncating heap, and performing final cleanup. The numeric columns tell you how many heap blocks have been scanned out of how many total, how many dead tuples have been collected, and so on.
The progress views are real time. The data updates as the operation runs. There is no buffering, no statistics-collection lag, no need to call any reset function. You query the view and you see the current state. This is one of the few places in Postgres observability where the data is genuinely instantaneous.
The vacuum diagnostic
A common question during a long-running autovacuum is whether it is making progress or stuck. The diagnostic query joins pg_stat_progress_vacuum with pg_stat_activity for context:
SELECT
p.pid,
a.query,
p.phase,
p.heap_blks_scanned,
p.heap_blks_total,
CASE WHEN p.heap_blks_total > 0
THEN round(100.0 * p.heap_blks_scanned / p.heap_blks_total, 1)
ELSE 0 END AS percent_done,
p.num_dead_tuples,
p.max_dead_tuples,
now() - a.query_start AS elapsed
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a USING (pid)
ORDER BY a.query_start;If phase is changing between successive runs of this query and heap_blks_scanned is increasing, the vacuum is working. If both are stuck for minutes, the vacuum is blocked on something: either a lock held by another transaction (check pg_blocking_pids), or full max_dead_tuples requiring an index pass, or it is in the indexes phase where progress is harder to measure. The num_dead_tuples = max_dead_tuples case is particularly important to recognize because it explains why a vacuum on a heavily-updated table sometimes appears to stall: it has filled its memory budget and is doing the index pass, which is essentially opaque to the heap-block counter.
The create index diagnostic
The pg_stat_progress_create_index view is structurally similar but with phases specific to index building. A CREATE INDEX CONCURRENTLY moves through building index, waiting for old transactions, validating, and several others. The waiting for old transactions phase is the one operators most commonly misread. It does not mean the index build is stuck. It means the build is waiting for transactions that started before the build started to finish, because the index must observe all changes from those transactions. If you have a long-running analytical query or an idle-in-transaction connection, the index will wait until that transaction commits or aborts. The diagnostic action is to find that transaction in pg_stat_activity, not to cancel the index build.
The building index phase exposes blocks_done and blocks_total, which gives a percent-complete you can use for rough time estimates. The math is approximately: remaining = elapsed * (blocks_total - blocks_done) / blocks_done. This is a rough estimate because block-scan rate is not uniform (B-tree builds slow down on later blocks as the tree grows wider), but for operational purposes it is enough to distinguish "an hour from finishing" from "a day from finishing".
The COPY diagnostic
The pg_stat_progress_copy view added in Postgres 14 is one of the most useful additions in recent releases. Bulk imports via COPY are common, slow, and visually opaque without it. The view exposes bytes_processed, bytes_total when reading from a file (zero when reading from a stream), and tuples_processed. For a COPY FROM reading a regular file, the percent-complete is trivial. For a COPY FROM STDIN driven by a client library streaming from elsewhere, only the absolute tuple count is meaningful, but watching the rate of change over a 30-second window is enough to know whether the operation is healthy or stalled.
The cluster and basebackup diagnostics
pg_stat_progress_cluster covers both CLUSTER and VACUUM FULL with phases including seq scanning heap, index scanning heap, sorting tuples, writing new heap, swapping relation files, and the index-rebuild phases. The writing new heap phase is where most of the time is spent on large tables, and the byte counters give a usable percent-complete. pg_stat_progress_basebackup covers pg_basebackup with byte counters for the entire backup including tablespaces. This is the view that lets you answer the question of when a long-running backup will finish without consulting the client side.
What progress views do not show
Some long-running operations have no progress view at all. REINDEX without CONCURRENTLY does not have its own view. ALTER TABLE with rewriting changes does not have its own view. Logical replication slot creation does not have its own view. For these, you fall back on pg_stat_activity wait events and disk-space monitoring as indirect signals. The progress views are the right answer when they exist, but they cover specific named operations and the coverage is not universal.
The other limitation is that progress views show only the current operation, not its history. If you want to know how long the last 10 vacuums took, you read the pg_stat_user_tables.last_autovacuum column or the autovacuum log via log_autovacuum_min_duration. The progress views are useful exactly during the operation and useful for almost nothing after.
Why this matters for the studio
Our four products (DocuMint, CronPing, FlagBit, WebhookVault) run on SQLite, which has its own progress mechanisms (sqlite3_progress_handler callbacks during long queries, the rather minimal .timeout commands), and so the progress-view discipline does not yet apply directly. The eventual graduation to Postgres for whichever product first hits the scale ceiling will bring progress views into the operational picture. Knowing what they show before you need them shortens the first incident.
The deeper observation is that databases accumulate observability features over decades, and the features that earn their cost are the ones that answer specific operational questions that arose enough times that the cost of building the feature became worthwhile. Progress views are the answer to a question every database operator has asked: how far along is this thing, really. The answer turned out to be a small set of dedicated views, one per operation, exposing exactly the counters that the implementation could provide cheaply. The cost was the discipline of maintaining the views as the operations themselves evolved across releases.
If the operational pattern you keep hitting is one your database does not have a progress view for, the answer is usually to file a request or contribute one. The set of progress views in Postgres 17 is larger than the set in 9.6 because operators kept hitting the gaps and someone wrote the code. The cooperative model continues to work for the slow accretion of operational features that nobody designs as the headline feature.