You added an index. The query is still slow. You run EXPLAIN and find the planner is doing a sequential scan on your indexed table. Your first instinct is that the planner is wrong. It usually isn't.
Understanding why the planner chooses sequential scans over indexes is one of the more useful things you can learn about Postgres. Once you understand the cost model, a lot of "why is this slow" debugging becomes faster.
The random I/O vs sequential I/O cost model
Postgres has two fundamental cost parameters for storage access:
random_page_cost— cost of fetching a random page from disk (default: 4.0)seq_page_cost— cost of fetching a page in a sequential scan (default: 1.0)
These are unitless numbers that express the relative cost of the two I/O patterns. The defaults encode an assumption that random I/O is four times more expensive than sequential I/O. This was broadly accurate for spinning hard drives, where the head had to physically seek to a random location. It is wrong for modern SSDs and NVMe drives, where random I/O is much closer in cost to sequential I/O.
On SSDs, random_page_cost = 1.1 to 1.5 is typically more accurate. On NVMe, 1.1 or even 1.0. If you're running Postgres on an SSD and haven't adjusted this parameter, the planner has been overvaluing sequential scans for your entire deployment.
-- Check current settings
SHOW random_page_cost;
SHOW seq_page_cost;
-- Adjust for SSD (do this in postgresql.conf, not just for a session)
SET random_page_cost = 1.1;
The selectivity threshold
Even with accurate cost parameters, an index scan is not always the right choice. The crossover point depends on how many rows the query returns.
A sequential scan reads the entire table once, in order. An index scan reads the index B-tree to find row pointers, then follows each pointer to the heap (the table) to fetch the actual row. For each row, this is a random I/O. If you're returning 30% of the table's rows, you're making 30% of the table's worth of random I/O — which, even on SSDs, is slower than reading the whole table sequentially once.
The rough threshold: an index scan is typically faster than a sequential scan when returning fewer than 5–15% of the table's rows. The exact cutoff depends on random_page_cost, the table's size, and how full the pages are. With the default random_page_cost = 4.0, the threshold is lower (index wins at even smaller fractions). With random_page_cost = 1.1, the threshold shifts — indexes help at higher selectivities.
pg_stats.correlation and physical ordering
The planner also knows something most developers don't think about: how well the physical order of rows in the heap matches the order of values in the index.
pg_stats.correlation is a value between -1 and 1 for each indexed column. A value near 1.0 means rows are stored in the heap in approximately the same order as the index. A value near 0 means the physical order is random relative to the index. A value near -1 means the order is reversed.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'your_table'
ORDER BY ABS(correlation) DESC;
When correlation is high, following index pointers means accessing heap pages in order — essentially sequential. When correlation is near zero, each index pointer is a random jump to a different heap page. The same index scan on the same table returns very different I/O patterns depending on correlation.
A freshly loaded table often has high correlation on its primary key — rows are inserted in order, so the index order matches the heap order. After months of updates and deletes, correlation degrades. CLUSTER (which physically reorders the table) resets correlation, but it takes an ACCESS EXCLUSIVE lock and is rarely practical in production.
Bitmap index scans: the middle path
Between a pure index scan and a sequential scan, there's a third option the planner uses: the bitmap index scan.
A bitmap index scan works in two phases. First, it scans the index and builds a bitmap of heap page locations that contain matching rows — without yet fetching any heap pages. Then it reads those heap pages in order (sorted by physical location). This converts what would be random I/O into sequential I/O, at the cost of memory to hold the bitmap.
The planner chooses bitmap index scans when the fraction of rows returned is moderate — too many for a regular index scan (which would involve too many random I/Os) but few enough that a full sequential scan is wasteful. You'll see it in EXPLAIN output as Bitmap Heap Scan with a Bitmap Index Scan below it.
Understanding this should adjust how you read EXPLAIN output. A bitmap heap scan is not a sign of a problem — it's often exactly the right choice.
Covering indexes and index-only scans
The heap-fetch cost vanishes if the index contains all the columns the query needs. Postgres calls this an index-only scan.
-- This index covers both the filter column and the projected column
CREATE INDEX idx_orders_status_amount ON orders (status) INCLUDE (amount);
-- This query can be answered without touching the heap
SELECT amount FROM orders WHERE status = 'pending';
Index-only scans depend on the visibility map. Postgres maintains a visibility map that tracks which heap pages are known to be clean (all rows visible to all transactions). If the visibility map says a page is clean, an index-only scan can skip the heap fetch entirely. If the page is not marked clean — because it hasn't been VACUUMed recently — the index-only scan falls back to a heap fetch anyway.
A table that hasn't been VACUUMed will not benefit from covering indexes for index-only scans. This is a common trap: you add a covering index expecting index-only scans and don't get them because autovacuum hasn't run recently enough to keep the visibility map current.
Reading EXPLAIN ANALYZE honestly
When EXPLAIN ANALYZE shows a sequential scan, check these before assuming the planner is wrong:
- Row estimate accuracy. Are the estimated rows close to the actual rows? If not, run ANALYZE on the table — stale statistics cause bad plans, and a sequential scan with a bad row estimate is a statistics problem, not a cost model problem.
- Selectivity. What fraction of the table is your query returning? If it's 20% or more, a sequential scan may genuinely be faster.
- random_page_cost. Is it calibrated for your storage type? On SSDs,
4.0is wrong. - Table size. For small tables (fits in shared_buffers or OS page cache), sequential scans are fast regardless of selectivity. The planner knows this.
The force-index antipattern
Postgres doesn't have a force-index hint. The closest equivalent is SET enable_seqscan = off, which makes the planner pretend sequential scans are infinitely expensive.
Don't do this in production. enable_seqscan = off is a debugging tool. Using it to force an index scan that the planner judged too expensive is not fixing a problem — it's hiding one. If the planner prefers a sequential scan with accurate statistics and correct cost parameters, that preference is usually correct. If it's wrong, the right fix is accurate statistics (run ANALYZE), correct cost parameters (random_page_cost for your hardware), or a better index (covering, partial, or on a different column).
When you use enable_seqscan = off, you're betting your query's performance on your judgment against the planner's cost model. Sometimes you're right. More often, you've hidden the actual problem long enough to forget it exists.
Trust the planner. When it ignores your index, find out why. The answer is almost always informative.
Building in public at builds.anethoth.com.