Postgres pg_stats: Column Statistics and Why They Matter for Query Planning

The system view that lets you read the same statistics the planner reads. Reading pg_stats is one of the most underused diagnostics for understanding why a query plan looks the way it does.

The Postgres query planner makes decisions based on statistics about the data in each table and column. Those statistics live in pg_statistic, which is a system catalog with internal representations, and are exposed in a human-readable form via the pg_stats view. Reading pg_stats well is one of the most underused diagnostic skills for understanding why a query plan looks the way it does, and one of the highest-leverage skills for fixing plans that look wrong.

What pg_stats actually contains

pg_stats has one row per column per table (plus extra rows for expression indexes and inheritance roots). The most useful columns are null_frac (fraction of nulls), n_distinct (estimated number of distinct values, negative if expressed as a fraction of table rows), most_common_vals and most_common_freqs (the MCV list of frequently-occurring values and their frequencies), histogram_bounds (an approximate histogram for non-MCV values), and correlation (-1 to 1, how strongly the physical row order correlates with the value order).

Each of these has direct planner consequences. null_frac is used for IS NULL and IS NOT NULL selectivity. n_distinct is used for join selectivity, GROUP BY row estimates, and DISTINCT row estimates. The MCV list and histogram together estimate selectivity for equality and range predicates. correlation determines whether an index scan can do mostly-sequential or scattered I/O on the heap.

The everyday diagnostic query

The starting point for diagnosing a wrong plan is comparing what the planner thinks against pg_stats:

SELECT attname, null_frac, n_distinct, n_distinct < 0 AS distinct_is_fraction,
       array_length(most_common_vals, 1) AS mcv_count,
       most_common_freqs[1] AS top_freq,
       correlation
FROM pg_stats
WHERE schemaname = 'public' AND tablename = 'orders'
ORDER BY attname;

The patterns to watch for include n_distinct being wildly wrong (compared to a SELECT COUNT(DISTINCT col) ground-truth check), top_freq being very high (indicating a skewed distribution where the MCV list is doing most of the planner's work), correlation near 1 or -1 (where index scans get sequential I/O), and correlation near 0 (where index scans get random I/O).

The column-correlation gap

pg_stats is per-column. The planner therefore assumes columns are independent for selectivity estimates on multi-column predicates. This assumption is often wrong. A predicate like WHERE country = 'US' AND state = 'California' will be estimated as P(US) times P(California), but the actual selectivity is P(California) because California implies US.

The fix is extended statistics via CREATE STATISTICS, which lets you tell the planner about specific column correlations. The two kinds that matter most are dependencies (the planner stops multiplying selectivities for correlated columns) and ndistinct (the planner gets correct row estimates for GROUP BY across multiple columns).

Extended statistics need to be created explicitly (the planner does not infer correlations from the data) and refreshed by ANALYZE like ordinary statistics. They are one of the most underused features in Postgres tuning and one of the highest-leverage fixes for multi-column predicate plans.

When statistics go stale

Statistics drift between ANALYZE runs in ways that produce wrong plans. The four cases that drift fastest are bulk loads (a million new rows that the autovacuum threshold has not yet caught), new indexes (autovacuum does not analyze just because an index was created), correlated columns (where the actual correlation changes faster than the planner notices), and extreme skew (where a few values come to dominate but the MCV list is from before the skew developed).

The diagnostic is comparing the most recent autoanalyze timestamp (from pg_stat_user_tables.last_autoanalyze) against the rate of change in the table. If a table is loading a million rows per hour and the last autoanalyze was four hours ago with default autovacuum thresholds, the statistics are wrong by a factor of millions of rows. Manual ANALYZE after bulk loads is the explicit fix; lowering autovacuum_analyze_scale_factor for the specific table is the recurring fix.

What the planner cannot infer

pg_stats has limits that no amount of ANALYZE will fix. The MCV list has a default cap of 100 values (configurable via default_statistics_target); columns with thousands of important distinct values cannot be fully captured. The histogram is approximate; queries that depend on the boundary between bins get wrong estimates. The n_distinct estimate is a hard problem; for very large tables it can be wrong by orders of magnitude even with good ANALYZE.

The escape hatches are increased default_statistics_target (more MCV values, more histogram bins, more sampling), extended statistics (cross-column correlation), and n_distinct overrides (ALTER TABLE...ALTER COLUMN...SET (n_distinct = -0.5) when you know better than the sampler).

The discipline this enables

The reason to read pg_stats is to close the loop between EXPLAIN ANALYZE and the underlying data. When EXPLAIN ANALYZE shows estimated row counts that diverge from actual row counts by orders of magnitude, the question is always whether the statistics are wrong, whether the column-correlation assumption is wrong, or whether the data distribution is too skewed for the default sampling to capture. pg_stats lets you tell which.

For all four of our products (DocuMint, CronPing, FlagBit, WebhookVault) we are SQLite-based, and SQLite does not have an equivalent. SQLite's planner uses ANALYZE to populate sqlite_stat1 and sqlite_stat4 tables but they are not exposed as a structured view, and the planner has fewer knobs to turn. This is fine at our current scale where individual queries are well under the database's processing capacity. When we eventually migrate one or more products to Postgres for the cases that need it (analytics queries over large WebhookVault event logs, complex FlagBit rule evaluation, cross-product reporting), pg_stats becomes one of the first tools we will reach for.

The deeper observation is that database query planning is an attempt to predict the cost of a plan from incomplete information about the data. pg_stats is the structured form of that information, and reading it is the structured way to understand the planner's mistakes. Most database tuning sessions that look like fighting the planner are actually sessions where the planner has bad information, and the fix is to give it better information rather than to override its decisions. ANALYZE more often, set default_statistics_target higher on important tables, create extended statistics for correlated columns, and read pg_stats to verify the planner sees what you think it sees.


This essay is part of our ongoing series on operating production databases. Our products DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay) all run on SQLite today, with eventual Postgres migrations planned for the workloads that warrant it.

Read more