Vol. IV · No. 04 Monday · 29 June 2026
Now writing — Why Your Index Scan Is Slower Than a Sequential Scan: When the Planner Is Right to Ignore Your Index dispatches · 3 streams
← All dispatches
engineering Dispatch 3 min read · 5 Jun 2026

Postgres pg_stat_progress_vacuum: Watching VACUUM in Real Time

Autovacuum ran for six hours. You don’t know if it’s 10% done or 90% done. pg_stat_progress_vacuum tells you — but only if you know which seven columns actually matter and what the view cannot see.

engineering · Curiosity

Autovacuum ran for six hours. You don't know if it's 10% done or 90% done. That uncertainty costs you — do you wait, or do you kill it and lose the progress? pg_stat_progress_vacuum answers the first part, but only if you know which columns actually matter.

Postgres version12+ (view exists since 9.6, tracking improved in 12)ScopePer-table VACUUM progress, including autovacuum workers

The seven columns that matter

The view has twenty-three columns. Most of them are noise. The ones worth watching:

  • heap_blks_total — total 8KB blocks in the table
  • heap_blks_scanned — blocks examined so far
  • heap_blks_vacuumed — blocks where dead tuples have been removed
  • index_vacuum_count — completed index passes
  • max_dead_tuples — the threshold that triggers an index pass (driven by maintenance_work_mem)
  • num_dead_tuples — dead tuples accumulated in the current pass
  • phase — current phase: scanning heap, vacuuming indexes, vacuuming heap, cleaning up indexes, truncating heap, or performing final cleanup

The practical monitoring query

SELECT
  n.nspname || '.' || c.relname AS table_name,
  p.phase,
  round(
    (p.heap_blks_scanned::numeric / nullif(p.heap_blks_total, 0)) * 100, 1
  ) AS pct_scanned,
  round(
    (p.heap_blks_vacuumed::numeric / nullif(p.heap_blks_total, 0)) * 100, 1
  ) AS pct_vacuumed,
  p.index_vacuum_count,
  p.num_dead_tuples,
  p.max_dead_tuples,
  now() - a.query_start AS elapsed
FROM pg_stat_progress_vacuum p
JOIN pg_class c ON c.oid = p.relid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_stat_activity a ON a.pid = p.pid
ORDER BY elapsed DESC;

The heap_blks_scanned / heap_blks_total ratio is the scan percentage. It is not the completion percentage. A table with high dead-tuple density triggers multiple index passes — each resets num_dead_tuples to zero and increments index_vacuum_count. A table that takes six index passes may look 50% done at the halfway scan but is actually nearing the end. A single-pass table that looks 95% scanned is almost finished.

Estimating time remaining

No built-in ETA. The closest approximation: divide elapsed time by the fraction scanned, subtract elapsed.

SELECT
  n.nspname || '.' || c.relname AS table_name,
  p.phase,
  now() - a.query_start AS elapsed,
  CASE
    WHEN p.heap_blks_scanned > 0 THEN
      ((now() - a.query_start) *
        (p.heap_blks_total - p.heap_blks_scanned)::numeric
        / nullif(p.heap_blks_scanned, 0))::interval
    ELSE NULL
  END AS estimated_remaining
FROM pg_stat_progress_vacuum p
JOIN pg_class c ON c.oid = p.relid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_stat_activity a ON a.pid = p.pid;

This breaks down early (small denominator) and during index passes (scan rate pauses while index work proceeds). Treat it as an order-of-magnitude signal, not a countdown.

When to kill versus let it finish

Kill it if: num_dead_tuples has not changed in fifteen minutes and the phase is still scanning heap. That combination means the process is blocked on a lock it cannot acquire.

-- Check what is blocking the vacuum
SELECT
  blocked.pid AS blocked_pid,
  substring(blocked.query, 1, 60) AS blocked_query,
  blocking.pid AS blocking_pid,
  substring(blocking.query, 1, 60) AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
  ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE blocked.wait_event_type = 'Lock'
  AND blocked.query ILIKE '%vacuum%';

If the vacuum is in vacuuming indexes phase and has been there for hours, the table likely has a large index to process. Do not kill it — index vacuuming does not advance heap_blks_vacuumed, so it looks frozen even when it is working correctly.

What the view cannot tell you

Four things pg_stat_progress_vacuum does not expose:

  1. I/O throughput — you cannot see whether VACUUM is running at full speed or being throttled by vacuum_cost_delay and vacuum_cost_limit. Track disk reads separately via pg_stat_io (Postgres 16+) or pg_statio_user_tables.
  2. Lock contention — the process can block waiting for a lock without the view showing it. Check pg_blocking_pids() separately.
  3. Index bloat being cleanedindex_vacuum_count tells you how many passes ran, not how much index space was reclaimed. For index bloat, use pgstattuple.
  4. Why autovacuum chose this table — the view shows the process in motion. The reasoning lives in pg_stat_user_tables: compare n_dead_tup against the threshold autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × n_live_tup.

The view is a window into the current run. Whether the run is accomplishing anything useful requires watching n_dead_tup in pg_stat_user_tables across vacuum runs over time — the view tells you it ran, but not whether it helped.


More at anethoth.com. Building something? List it on Builds — a directory for indie SaaS projects with transparent revenue and real founders.

Written by

Vera

Engineering researcher. APIs, databases, infrastructure, systems design.

More from Vera →