Postgres VACUUM ANALYZE: When to Run It by Hand and When to Trust Autovacuum

Autovacuum handles most Postgres maintenance correctly, but a small set of cases benefits from manual VACUUM ANALYZE: large bulk loads, post-migration backfills, statistics drift after schema changes. The trick is knowing which case you are in before you run anything.

Most Postgres deployments leave autovacuum entirely alone, which is usually right. Autovacuum is one of the more sophisticated parts of the Postgres operational design: it runs continuously in the background, monitors per-table statistics, and triggers VACUUM and ANALYZE work when threshold conditions are met. For the steady-state OLTP workload most applications produce, the default thresholds are reasonable and the default cost-throttling keeps maintenance from interfering with query traffic.

But there is a small set of cases where autovacuum is structurally too slow or too late, and a manual VACUUM ANALYZE pays for itself many times over. Recognizing those cases is a fairly narrow skill that most application developers never need but every operator running a long-lived Postgres database eventually learns.

What VACUUM and ANALYZE actually do, separately

VACUUM has three jobs that are easy to conflate. First, it reclaims space from dead tuples produced by UPDATE and DELETE under MVCC: the storage is not actually freed until VACUUM marks the dead tuples and a subsequent vacuum cycle reuses or removes the pages. Second, it updates the visibility map so index-only scans can skip the heap. Third, it freezes old tuples to prevent transaction ID wraparound, which is the only failure mode that can take a Postgres cluster offline.

ANALYZE has one job: it samples the table and updates pg_statistic with column-level statistics the planner reads when costing queries. The planner combines null fractions, most-common-values lists, histograms, distinct-value estimates, and correlation coefficients to estimate how many rows each operator will produce. Statistics that lag behind data shape produce bad estimates, and bad estimates produce plan regressions ranging from mildly slow to catastrophically slow.

Autovacuum runs both jobs but on independent triggers. The vacuum trigger is roughly proportional to dead tuple count, with autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row_count as the threshold. The analyze trigger is roughly proportional to insert+update+delete count via autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * row_count. Both default to a scale factor of 0.2, meaning a table needs roughly 20 percent of its row count in qualifying changes before autovacuum fires.

Four cases where manual VACUUM ANALYZE pays back

The first case is post-bulk-load. A COPY FROM that loads tens of millions of rows produces all the inserts that should trigger ANALYZE, but autovacuum will not see the new statistics until its next scheduled wake-up, which can be minutes to hours away depending on load. A bulk-load script should end with ANALYZE on the affected tables to ensure the next query touching them gets accurate row counts. If the load was into a partitioned table, ANALYZE only the partitions that changed, not the parent.

The second case is post-migration backfill. After a schema change that backfills a new column with computed values, the column statistics start from scratch. Autovacuum will eventually catch up, but if the next query that filters or joins on the new column runs before autovacuum fires, the planner will use the default no-stats heuristic and probably pick a bad plan. Running ANALYZE table_name(new_column) immediately after backfill avoids this.

The third case is correlation drift. The pg_stats.correlation column measures how well-ordered a column's physical layout matches its logical sort order. A timestamp column on a newly-loaded table has correlation near 1.0, which is what BRIN indexes depend on and what range scans benefit from. After enough random updates the correlation drifts toward zero. Autovacuum tracks this but only updates it when the analyze trigger fires. If you have detected that a query plan regressed because correlation dropped, running ANALYZE will not bring back the correlation but it will let the planner know about the new state and pick a more honest plan.

The fourth case is extended statistics. CREATE STATISTICS lets you tell the planner about multi-column correlations and dependencies it cannot infer from per-column samples. Extended statistics are only updated by ANALYZE, which means running ANALYZE after creating extended statistics is mandatory before the planner will use them. This is one of the most common silent failures in Postgres performance tuning: the extended statistics object exists but has never been populated.

Three patterns that fail

The first failure pattern is running manual VACUUM ANALYZE on a fixed schedule from cron. This duplicates autovacuum's work, can interfere with query traffic during peak hours, and creates a false sense of operational control. If autovacuum is keeping up, the cron job does nothing useful. If autovacuum is falling behind, the cron job fires once and stops, leaving the underlying problem unaddressed. The right fix when autovacuum is falling behind is to tune autovacuum, not to bypass it.

The second failure pattern is running VACUUM FULL when ordinary VACUUM is what you want. VACUUM FULL acquires ACCESS EXCLUSIVE on the table, rewrites it from scratch, and rebuilds all its indexes. On a multi-gigabyte production table this blocks all reads and writes for the duration, which is rarely acceptable. The pg_repack extension does the same disk-space reclamation work online and is almost always the right tool when you genuinely need to compact a bloated table.

The third failure pattern is running ANALYZE inside a long-running transaction or inside a transaction that subsequently rolls back. ANALYZE commits its work autonomously by default in autovacuum, but in a user-initiated transaction, an explicit ANALYZE that gets rolled back leaves statistics untouched. Combined with a long-running transaction that blocks autovacuum from updating the same statistics, this can produce a table whose statistics are stuck at a stale point for hours.

The autovacuum-tuning escalation that often comes before manual ANALYZE

Before reaching for manual VACUUM ANALYZE as the answer, the right escalation is usually to tune autovacuum per-table. The scale factor defaults assume tables grow at human-relatable rates, which is wrong for high-churn tables. A queue table that processes a million rows a day will have its 20 percent scale factor met every few hours, which is too infrequent for a workload where statistics drift fast. Setting autovacuum_analyze_scale_factor to 0.05 (5 percent) or even 0.02 on a per-table basis via ALTER TABLE keeps statistics current without manual intervention.

The autovacuum cost-throttling parameters (autovacuum_vacuum_cost_limit, autovacuum_vacuum_cost_delay) were tuned for spinning-disk hardware and are too conservative on modern NVMe storage. Raising vacuum_cost_limit from the default 200 to 2000 or higher lets autovacuum keep up with high-write workloads without the deferred-vacuum bloat that builds up on the default settings.

Our use and migration plan

Our Builds directory runs on SQLite, which has a different maintenance story: VACUUM in SQLite is a manual operation that rewrites the database file, with no analog to autovacuum. We run it monthly via cron on a maintenance window. The Postgres migration path includes per-table autovacuum tuning for high-write tables, an explicit ANALYZE step after every migration that touches column statistics, and monitoring of pg_stat_user_tables.last_analyze plus pg_stats.correlation as part of the routine observability stack.

The deeper point

Manual VACUUM ANALYZE has a small but real role in Postgres operations, mostly around discontinuities that autovacuum's incremental triggers cannot catch: bulk loads, migrations, schema changes, new extended statistics. For everything else, the right answer is to tune autovacuum to match the workload, not to bypass it. The pattern that produces well-running databases is the pattern that runs autovacuum aggressively enough that manual intervention is rarely necessary.


Read more essays and technical writing at anethoth.com — a notebook on databases, distributed systems, biology, and the engineering that holds the world together.