Postgres pg_stat_user_indexes: Finding the Indexes Nobody Is Using

Every index you add costs write throughput. pg_stat_user_indexes is the view that tells you which of those costs you are paying without getting any read-side benefit. Most production databases have several indexes that have not been read in months.

Every index added to a Postgres table costs something on every write to that table. The cost is small per row but it compounds with the number of indexes and the write rate. A table with twelve indexes pays twelve times the maintenance cost of a table with one index for every INSERT, UPDATE that touches an indexed column, and DELETE. Unindexed columns get a partial pass on UPDATE through the HOT optimization, but indexed columns force the planner to update every index that references them.

The benefit of an index is that some query somewhere uses it. If no query uses it, the cost is paid and no benefit accrues. The pg_stat_user_indexes view exposes the data needed to identify which indexes are in this state, and most production Postgres databases that have not had a periodic index audit have several indexes that have not been read in months.

What the view exposes

pg_stat_user_indexes has one row per user-created index with cumulative counters since the last statistics reset. The three columns that matter for usage analysis are idx_scan (how many times the index has been used to start a scan), idx_tup_read (rows returned by the index during scans), and idx_tup_fetch (table rows fetched via index lookup). The full row also includes schema name, table name, index name, and OIDs for joining to other system catalogs.

The simplest useful query lists indexes with zero scans since the last reset, joined to pg_relation_size for the disk cost:

SELECT schemaname, relname AS table, indexrelname AS index,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelid NOT IN (
    SELECT conindid FROM pg_constraint WHERE contype IN ('p','u','x')
  )
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

The pg_constraint subquery is important: primary key, unique constraint, and exclusion constraint indexes show idx_scan = 0 in normal operation because they are used for constraint enforcement during writes rather than for query lookups, and removing them would break the constraints they support.

The interpretation problem

idx_scan = 0 is a strong signal only if the counter has accumulated long enough to cover the actual query patterns. A counter reset yesterday tells you nothing about whether an index serves a weekly report. The discipline is to check the timestamp of the last pg_stat_reset and to set audit cadence at multiples of the longest expected query interval. For most B2B SaaS, monthly reports define the longest interval, so a 3-month accumulation period is typical.

The companion query is to look at indexes with very low idx_scan relative to table activity. An index scanned 100 times against a table with 10 million INSERTs is essentially unused for read purposes and is paying its full maintenance cost. The ratio of idx_scan to pg_stat_user_tables.n_tup_ins for the table is a rough usefulness metric.

idx_tup_read vs idx_tup_fetch

The two tuple counters distinguish how the index was used. idx_tup_read counts rows the index returned during scans, including rows that the table-side filter then discarded. idx_tup_fetch counts rows actually fetched from the heap via index pointer.

A large idx_tup_read with small idx_tup_fetch indicates the index is being scanned but most rows are filtered out at the heap. This often signals a low-selectivity index or one whose statistics have drifted so the planner is choosing it incorrectly. The remediation is usually either to add a more selective composite index or to update statistics with ANALYZE.

The duplicate-index problem

Beyond unused indexes, pg_stat_user_indexes helps find redundant indexes where one index's column list is a prefix of another's. The classic case is an index on (a) and another on (a, b). The first is logically redundant because the second can serve any query that uses only a, though the second is somewhat larger. The decision to keep both depends on how often the smaller index is used in scans where size matters.

A query that finds prefix-redundant pairs:

SELECT a.schemaname, a.relname AS table,
       a.indexrelname AS smaller, b.indexrelname AS larger
FROM pg_stat_user_indexes a
JOIN pg_stat_user_indexes b USING (schemaname, relname)
JOIN pg_index ia ON a.indexrelid = ia.indexrelid
JOIN pg_index ib ON b.indexrelid = ib.indexrelid
WHERE a.indexrelid != b.indexrelid
  AND array_length(ia.indkey::int[], 1) < array_length(ib.indkey::int[], 1)
  AND ia.indkey::text = (string_to_array(ib.indkey::text, ' ')[:array_length(ia.indkey::int[],1)])::int[]::text;

The query is approximate and needs review before action, but it surfaces candidates that are worth examining.

The drop-and-restore discipline

Dropping an index is reversible. Recreating it takes time and disk and produces a window where queries that depended on it run slower, but the data is not lost. The drop-and-monitor pattern: drop the candidate, wait two weeks, check that no query latency degraded measurably and no application logs show unexpected slowness. If degradation appeared, recreate the index with CREATE INDEX CONCURRENTLY.

For high-confidence cases (zero scans over three months on a non-constraint index larger than a few MB), the recreate-if-needed pattern is essentially free. For lower-confidence cases (low scans but non-zero), the audit cycle is more like to keep-or-drop based on benefit-cost analysis, where the cost is straightforward (index size, write amplification) and the benefit is some queries somewhere that run faster.

What the view does not show

pg_stat_user_indexes shows index usage but not query plans. An index scan that always reads the whole index and discards most rows is shown as a high-scan-count index even though it is being used badly. The remediation requires reading EXPLAIN output for the queries that use it.

The view also does not separate planner usage from execution usage. A query that the planner considered but chose against still does not increment idx_scan. The implication is that an index can be useful as a planner choice in one query plan while not being the chosen plan today, in which case removing it might cause the planner to choose worse plans in the future.

And the view does not show standby usage. Read-only replicas have their own pg_stat_user_indexes counters, and an index that is unused on the primary may be heavily used on a replica that serves analytics queries. The audit must include replica statistics for a complete picture.

Our pattern

Our four products (DocuMint, CronPing, FlagBit, WebhookVault) all run on SQLite, which has analogous diagnostics through EXPLAIN QUERY PLAN and the sqlite_stat tables but no equivalent persistent index-usage counter. We track index usage informally by recording slow queries and reviewing query plans during feature work rather than via automated audit. The Postgres migration plan includes adding pg_stat_user_indexes audit as a quarterly task because the cumulative cost of unused indexes grows linearly with feature count and is the kind of cost that is invisible until measured.

The deeper observation is that databases accumulate indexes faster than they retire them. Each feature adds the indexes it needs, but no feature removes the indexes that the previous version needed. Quarterly review with pg_stat_user_indexes is the discipline that prevents the accumulation from compounding into the situation where 30 percent of write throughput goes to maintaining indexes nobody reads.

Our products: DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay) keep the lights on.

Read more