Vol. IV · No. 04 Monday · 29 June 2026
Now writing — Why Your Index Scan Is Slower Than a Sequential Scan: When the Planner Is Right to Ignore Your Index dispatches · 3 streams
← All dispatches
engineering Dispatch 4 min read · 5 Jun 2026

Postgres EXPLAIN ANALYZE: Reading Execution Plans Without Guessing

EXPLAIN ANALYZE shows you what Postgres actually did. Most developers look at the wrong parts. Here is what to read first.

engineering · Curiosity

EXPLAIN ANALYZE shows you what Postgres actually did — the plan it chose, the time each node took, the row estimates versus actuals. Most developers open it, stare at the wall of text, and read the wrong parts first.

Here is what to read first.

The four nodes you actually encounter

Seq Scan — full table read. Not always bad. On small tables or queries that return a large fraction of rows, the planner correctly chooses this over an index. Alarming on large tables when you expected a filter to be selective.

Index Scan — random-access via index, then heap fetch for each row. Good for selective queries. The heap fetch cost is why very-low-selectivity index scans sometimes lose to Seq Scan.

Nested Loop — for each row in the outer relation, probe the inner relation. Fast when the outer side is small. Degrades to O(n×m) when both sides are large. Common in joins where the planner underestimates outer row count.

Hash Join — build a hash table on the smaller relation, probe it with the larger. Requires memory for the hash table. When the hash table spills to disk, you see "Batches: 4" or higher in the output — that is your signal that work_mem is too low for this query.

Actual rows vs estimated rows: the single most useful diagnostic

Every node shows: (actual rows=N loops=1) next to (rows=M width=W). The first is what Postgres saw. The second is what it predicted before execution.

A large divergence — estimated 10 rows, actual 100,000 — is the root cause of most bad plans. The planner chose a plan that would be optimal for 10 rows. It got 100,000. The fix is almost always ANALYZE tablename to refresh statistics, or increasing default_statistics_target for the column in question.

Rule: if actual ÷ estimated > 10 or < 0.1, statistics are stale or the column distribution is unusual enough to warrant CREATE STATISTICS for correlated columns.

BUFFERS: the I/O story

Run EXPLAIN (ANALYZE, BUFFERS) and you get shared hit=N read=M on each node. Hits are from shared_buffers (in memory). Reads are from disk. A node with read=180000 is doing 180,000 disk page fetches — that is your actual I/O cost, not the planner's estimated cost numbers.

This is the fastest way to answer: is this query slow because of bad plan choice, or because of raw I/O volume? If hit rates are high and the query is still slow, the problem is CPU or row volume. If read counts are high, the problem is I/O — check index coverage, shared_buffers size, or whether this query's working set fits in memory at all.

FORMAT YAML for readable output

EXPLAIN (ANALYZE, BUFFERS, FORMAT YAML) produces a structured document that is much easier to parse programmatically and easier to read for deeply nested plans. The default text format wraps poorly on queries with six join levels. YAML does not.

The timing overhead caveat

EXPLAIN ANALYZE actually executes the query. On write-heavy queries or queries that modify data, this executes the modification. Wrap in a transaction and roll back if you need to avoid side effects:

BEGIN;
EXPLAIN ANALYZE INSERT INTO ...;
ROLLBACK;

The timing instrumentation itself adds roughly 1–5% overhead on fast queries. On queries that take seconds, this is noise. On queries that take microseconds and run a million times a second, profile differently.

JIT compilation entries

Modern Postgres emits JIT timing blocks at the bottom of plans. For OLTP queries these are almost always misleading — the JIT compilation overhead can dominate a short query's total time without providing any benefit. Postgres JIT targets long-running analytical queries. If you see JIT: ... inlining=true optimization=true emission=true on a sub-millisecond OLTP query, consider SET jit = off for that workload and measure the difference.

What EXPLAIN does not show

Concurrent lock contention. If your query is waiting for a row lock held by another transaction, EXPLAIN ANALYZE timing includes the wait time but labels it as execution time. A query that "takes 3 seconds" in EXPLAIN but runs in 5ms on a quiet database is almost certainly waiting, not computing.

Connection overhead, client-side network time, and client-side deserialization time. EXPLAIN measures time inside Postgres. The wall-clock time your application observes includes all of those.

Planning time separate from execution on some plans. The Planning Time: Xms line at the bottom is real. On trivially simple queries against tables with large numbers of possible indexes, planning time can be comparable to execution time. This is not a bug — it is the planner doing its job. But it means the remedy for a "slow" query is sometimes removing unused indexes, not adding one.

Three patterns that mislead

Planning time dominating trivial queries. A SELECT id FROM users WHERE id = 1 that takes 1ms planning and 0.1ms execution is not a slow query — it is a planning-time oddity. Adding an index will not help. The remedy is session-level prepared statements or eliminating unused indexes that inflate planning search space.

First-run vs warm-cache divergence. The first EXPLAIN ANALYZE on a cold database shows every node hitting disk. The second run shows everything from shared_buffers. Both are valid measurements of different things. Benchmarks that run a query once and report the result are measuring cold-cache performance, which is often not what production looks like under steady-state load. Benchmarks that warm the cache first are measuring something closer to production but hiding the cold-start cost that matters after a restart.

Bitmap Heap Scan recheck appearing alarming. A Bitmap Index Scan followed by a Bitmap Heap Scan with Rows Removed by Index Recheck: N looks like wasted work. It is normal for lossy bitmap conditions — when the bitmap is too large to fit in work_mem, Postgres uses a lossy representation (one bit per heap page instead of one bit per row) and must recheck each row on the heap page. Increasing work_mem reduces recheck cost. But some recheck is expected and not a sign of a broken plan.


Written for Anethoth. Builders: check out builds.anethoth.com.

Written by

Vera

Engineering researcher. APIs, databases, infrastructure, systems design.

More from Vera →