Why Your SELECT COUNT(*) Is Slow: How Postgres Counts Every Row

Postgres versionAllTopicQuery performance, MVCC, countingLevelIntermediate

If you have a large table and run SELECT COUNT(*) FROM large_table, it will be slow. Not because Postgres is bad at counting. Because Postgres has no stored row count to return.

This surprises people who come from MySQL and remember that COUNT(*) on a MyISAM table was instant. It was instant because MyISAM stored a row count per table. Postgres does not. Neither does InnoDB. The "MySQL is faster at COUNT" claim is a property of one specific legacy storage engine that is effectively deprecated.

Why no stored count

Postgres uses MVCC — Multi-Version Concurrency Control. Different transactions can see different versions of the same rows. Transaction A might see 50,000 rows as visible. Transaction B, running concurrently with a different snapshot, might see 49,997 or 50,003. There is no single "row count" that is correct for all readers simultaneously.

A stored count would have to be updated on every INSERT and DELETE, under the transaction's isolation guarantees. That creates contention on a single shared counter for any write-heavy workload. The design decision was: no stored count, count on demand.

What actually happens during COUNT(*)

Postgres scans the table (or uses an index-only scan when conditions allow) and checks each tuple's visibility against the current transaction snapshot. For each row, it evaluates: is this row visible to me? If yes, add one to the count.

The visibility check requires reading the tuple's xmin and xmax values (the transaction IDs that created and deleted it) and consulting the commit log (CLOG/pg_xact) for each transaction ID that isn't already known. This is fast per-tuple but scales linearly with the number of rows.

The visibility map optimization helps, but conditionally. If a page is marked all-visible in the visibility map (which happens after VACUUM), Postgres can skip the per-tuple visibility check for that page during an index-only scan. This makes COUNT(*) significantly faster on well-vacuumed tables using an index-only scan. Under write pressure, pages become dirty and lose their all-visible status, and the optimization degrades.

Three fast alternatives

1. pg_class.reltuples — instant, approximate.

SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'your_table';

Updated by ANALYZE (and autovacuum's ANALYZE phase). Typically within 10% of the true count on active tables. Not transactionally consistent. On a table with 1 million rows, reltuples will say somewhere between 900,000 and 1,100,000. For display purposes — "about 1 million records" — this is usually fine.

2. EXPLAIN row estimate — instant, similar accuracy to reltuples.

EXPLAIN SELECT COUNT(*) FROM your_table WHERE status = 'active';

Parse the Rows= output. This uses the same statistics as reltuples but works on filtered queries. The planner's row estimate is your count estimate. It has the same accuracy characteristics as reltuples — useful for filtered approximate counts where a full ANALYZE isn't practical.

3. Materialized count table — exact, at write overhead cost.

CREATE TABLE row_counts (
  table_name TEXT PRIMARY KEY,
  count BIGINT NOT NULL DEFAULT 0
);

-- Trigger on the target table:
CREATE OR REPLACE FUNCTION update_row_count() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE row_counts SET count = count + 1 WHERE table_name = TG_TABLE_NAME;
  ELSIF TG_OP = 'DELETE' THEN
    UPDATE row_counts SET count = count - 1 WHERE table_name = TG_TABLE_NAME;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

The count is exact and instant to read. It adds two database writes per INSERT/DELETE and introduces a potential hot-row contention point under high write throughput. Appropriate for tables where the exact count matters and the write rate is moderate.

When exact counts actually matter

Pagination totals almost never do. Slack, Discord, and Linear have all dropped exact "showing 1 of 47,283 results" counts from their UIs, for good reason: computing the exact count for a complex filtered query is expensive, the number goes stale immediately, and users do not actually use the number. Consider whether your UI genuinely needs it.

Regulatory and financial contexts often do require exact counts — record counts for audit logs, export completeness checks, financial reconciliation. For these, either accept the cost of exact COUNT(*) or maintain a trigger-based counter. The trigger approach is correct here because the count is semantically load-bearing.

The short version: if you need a number on a dashboard, use reltuples. If you need to know whether an export was complete, use COUNT(*) and accept the cost. If you need a count on a hot read path, build the trigger table. Three different problems, three different answers.

---

Building in public at builds.anethoth.com — proof that a product is really being built.