Postgres pg_stat_progress_vacuum: Watching VACUUM Work in Real Time

VACUUM has been running for twenty minutes. You have no idea how far along it is. The table is 80 GB. Could be halfway. Could be almost done. Could be stuck. You have no way to know.

Postgres 9.6 added pg_stat_progress_vacuum to fix exactly this. It exposes per-backend vacuum progress with enough detail to calculate a rough ETA and understand what's happening internally.

The Basic Query

SELECT
  p.pid,
  p.relid::regclass AS table,
  p.phase,
  p.heap_blks_total,
  p.heap_blks_scanned,
  p.heap_blks_vacuumed,
  ROUND(p.heap_blks_scanned::numeric / NULLIF(p.heap_blks_total, 0) * 100, 1) AS pct_scanned,
  p.index_vacuum_count,
  p.num_dead_tuples,
  p.max_dead_tuples,
  a.query
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a USING (pid);

The view only shows active VACUUM operations. If it's empty, nothing is vacuuming right now.

The Three Phases

VACUUM works in three phases, exposed via the phase column:

scanning heap — Reading heap pages to identify dead tuples. heap_blks_scanned advances here. This is where most of the wall-clock time goes on large tables.

vacuuming indexes — Removing dead tuple references from all indexes. index_vacuum_count increments each time an index pass completes. VACUUM may make multiple passes through indexes if max_dead_tuples fills up during the heap scan (the maintenance_work_mem limit).

vacuuming heap — Marking heap pages as free and updating the visibility map. heap_blks_vacuumed advances here.

Estimating ETA

During the scanning phase, you can calculate a rough rate:

-- Run twice, 10 seconds apart; compare heap_blks_scanned
WITH snap1 AS (
  SELECT pid, heap_blks_scanned, heap_blks_total
  FROM pg_stat_progress_vacuum
  WHERE relid = 'your_table'::regclass
)
SELECT
  heap_blks_total - heap_blks_scanned AS blocks_remaining,
  -- blocks_per_second = delta_scanned / 10
  -- ETA = blocks_remaining / blocks_per_second
  'Measure scan rate over 10s then divide' AS note
FROM snap1;

Rate-based ETA is approximate. I/O pressure and autovacuum contention make it noisy. Treat it as order-of-magnitude, not a countdown.

What max_dead_tuples Tells You

When num_dead_tuples reaches max_dead_tuples, VACUUM pauses the heap scan to flush dead tuples from indexes before continuing. Multiple index_vacuum_count passes are a signal that maintenance_work_mem is undersized relative to the table's dead tuple density. Increasing maintenance_work_mem in session scope before a manual VACUUM reduces the number of passes:

SET maintenance_work_mem = '1GB';
VACUUM ANALYZE large_table;

VACUUM FULL Is Different

VACUUM FULL uses pg_stat_progress_cluster, not this view. They're different operations — VACUUM FULL rewrites the entire table, VACUUM reclaims dead space in-place. Don't look for VACUUM FULL progress here.

Matching Autovacuum PIDs

Autovacuum workers show up here too. Match by PID to get context:

SELECT
  p.pid,
  p.relid::regclass AS table,
  p.phase,
  p.heap_blks_scanned,
  p.heap_blks_total,
  a.application_name,
  a.state,
  a.backend_start
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a USING (pid)
WHERE a.application_name = 'autovacuum worker';

What the View Doesn't Tell You

pg_stat_progress_vacuum does not expose I/O wait time, so you can't tell from here whether a slow VACUUM is CPU-bound or I/O-bound. It also doesn't show transaction ID wraparound distance — for that you want pg_stat_user_tables.n_dead_tup and age(relfrozenxid) from pg_class. And it won't tell you why autovacuum skipped a table, which requires looking at pg_stat_user_tables.last_autovacuum and the autovacuum cost parameters.

What it does tell you — phase, progress, index pass count, dead tuple accumulation — is usually enough to answer the question you actually have: is this going to finish, or should I intervene?

Published by Anethoth. Find indie SaaS projects at builds.anethoth.com.