Postgres BRIN Indexes: When Block-Range Beats B-Tree for Time-Series and Naturally Ordered Data

BRIN indexes are the rare Postgres feature that can replace a 10GB B-tree with a 1MB index on the same column, with almost no maintenance cost. They are also wrong for most tables. Knowing which case you have is the entire skill.

The default index in Postgres is the B-tree, and for good reason. It works on every data type with a defined ordering, supports equality and range queries, and degrades gracefully when distributions are skewed. But the default is not always the right answer. For tables where the physical row order on disk correlates with the column value being indexed, the BRIN (Block Range Index) is a genuinely different kind of index that trades a small amount of query precision for a roughly thousand-fold reduction in index size and maintenance cost.

We use BRIN indexes on the high-volume append-mostly tables across DocuMint, CronPing, FlagBit, and WebhookVault where the access pattern is range-by-time and the rows arrive in roughly time-sorted order. The cost is a slightly less precise lookup, which on the kinds of queries that benefit from BRIN almost never matters. The savings is significant enough that on a billion-row events table the BRIN index can be loaded entirely into memory while a B-tree index on the same column would not.

What BRIN actually stores

A B-tree index stores one entry per row, organized in a balanced tree that supports point lookups in O(log N) and range scans by walking leaf pages. The space cost is roughly 40 bytes per indexed row, dominated by the row's TID (page and offset) plus tree overhead. On a billion-row table, that is 40GB of index for a single indexed column.

A BRIN index stores one entry per block range, where a block range defaults to 128 blocks (1MB on a standard 8KB page). For each range, BRIN records the minimum and maximum value of the indexed column observed in that range. The space cost is roughly 60 bytes per range. On the same billion-row table, that is about 60KB of index. The ratio is roughly 700,000x smaller.

The trade-off is precision. To answer a range query, Postgres consults the BRIN index to find ranges whose min-max bracket overlaps the query range, then does a sequential scan within those candidate ranges to find the actual matching rows. If the indexed column is well-correlated with physical row order, candidate ranges are tightly packed around the query value and only a small fraction of the table needs to be scanned. If the column is poorly correlated, candidate ranges spread across the entire table and the index degenerates to a sequential scan with extra overhead.

When BRIN wins

BRIN wins when the indexed column is monotonically correlated with row insertion order. The canonical case is a timestamp column on an append-only events table: rows arrive in time order, get inserted into successive pages, and the min-max of each block range is a tight window. A range query for created_at BETWEEN '2026-05-01' AND '2026-05-02' typically touches a handful of consecutive ranges out of millions, and the BRIN index turns it into a small bounded sequential scan.

Other naturally-correlated columns include auto-increment primary keys (always strictly increasing), monotonic transaction IDs, and naturally-ordered insertion patterns like log line numbers or sequence numbers from external systems. The correlation does not have to be perfect: a column that is 90% correlated still gives BRIN most of its benefit, because the planner just scans slightly more candidate ranges.

What BRIN explicitly does not give you is fast point lookups, sorted result sets, or efficient queries with high selectivity on small result sets. For those, the B-tree's per-row precision is what you are paying for.

The correlation diagnostic

The single most useful query before adding a BRIN index is the correlation statistic from pg_stats:

SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'webhook_events'
  AND attname IN ('created_at', 'id', 'user_id');

Correlation is a value between -1 and 1, where 1 means the column is strictly increasing in physical row order, -1 means strictly decreasing, and 0 means uncorrelated. BRIN works well above roughly 0.7 absolute correlation. Below 0.5, BRIN typically degenerates to nearly-sequential-scan and the B-tree is the right answer despite its size.

The correlation is computed by ANALYZE from a sample, so it can drift if the table's insertion pattern changes. The most common cause of degraded BRIN performance is a backfill or reload that inserted rows out of original order, breaking the correlation that made BRIN cheap.

Configuration knobs that matter

The default pages_per_range of 128 is right for most cases, but worth tuning if the access pattern is unusual. Decreasing it (say to 16) makes the index larger but more precise, useful when range queries are typically very small relative to the table. Increasing it (to 512 or 1024) makes the index smaller but coarser, useful when range queries typically span millions of rows anyway and the precision gain is wasted.

The newer multi-minmax opclass (brin_minmax_multi_ops) stores multiple min-max pairs per range, which helps when each range contains multiple disjoint clusters. This is the right default for time-correlated data that has periodic gaps (overnight quiet periods, weekend gaps in business hours data) where the simple min-max would include the gap in the candidate range. Switching to multi-minmax often improves selectivity by 2-3x on real-world time-series with no other change.

The autosummarize setting (true by default since Postgres 12) controls whether new block ranges are indexed automatically as the table grows. Without it, ranges added since the last brin_summarize_new_values call are not yet covered by the index and queries fall back to scanning them sequentially. The default is right; the reason to know about it is that disabling it as a perceived optimization regresses query performance the moment the table grows past the last manual summarize.

Operational signals to watch

Watch the index size growth in pg_relation_size compared to the table size growth. Healthy BRIN indexes grow at roughly 1/10000 the rate of the underlying table. If your index is growing at 1/100, the correlation has degraded and BRIN is doing more work per query than it should be.

Watch the planner's choice in EXPLAIN: BRIN should show up as a Bitmap Index Scan on the BRIN index followed by a Bitmap Heap Scan. If the planner is choosing Sequential Scan over BRIN, either the predicate selectivity is too high (BRIN is correctly worse than scan for this query shape) or the correlation has degraded enough that BRIN's candidate set is most of the table.

Watch the rows-removed-by-index-recheck count in EXPLAIN ANALYZE. BRIN returns block ranges, not individual rows, so a fraction of the rows in returned blocks will not match the predicate and Postgres has to filter them out. A small recheck count (say 10-30% of returned rows) is normal; a recheck count higher than the matched count means the index is over-returning and the page-range size or opclass should be tuned.

The deeper observation

Most database internals are described in terms of asymptotic complexity that assumes nothing about the physical layout of data. The real performance characteristics in production are governed by the interaction between query patterns and physical layout, and indexes are the tools you have to exploit favorable layouts. The B-tree is the right default because it makes no assumption about physical layout. The BRIN is the rare tool that makes a strong assumption about physical layout in exchange for a thousand-fold reduction in maintenance cost. When the assumption holds, the savings is so large that the technique should be in every backend developer's toolbox. When the assumption does not hold, BRIN is unambiguously worse than the alternatives. Knowing which case you are in is the entire skill.

Read more