Why Your Database Needs a Read-Only Replica for Analytics: Separating OLTP and Reporting

You have a reporting query. It scans your orders table to compute last month's revenue by region, joins it to customers and products, groups and aggregates across a few hundred thousand rows. It runs in eight seconds. It runs at 9 AM every morning when the finance team pulls their dashboard. And every time it runs, your production API slows down.

This is not a coincidence and it is not a mystery. It is physics.

The Problem: Shared Resources

OLTP workloads (online transaction processing—your application's normal reads and writes) and OLAP workloads (analytics queries, reports, dashboards) make opposite demands on a database.

OLTP queries are small. A lookup by primary key. An update to a single row. An insert. They touch a tiny amount of data, complete in milliseconds, and need fast access to recently written rows. They are sensitive to latency.

Analytics queries are large. They scan entire tables, compute aggregates, join multiple large sets. They touch a large fraction of the data, complete in seconds or minutes, and need throughput more than latency. They are sensitive to I/O capacity.

When both workloads share a single Postgres instance, they fight over the same resources:

shared_buffers: Your analytics query pulls hundreds of thousands of rows into the buffer pool. Those rows displace the recently-used OLTP pages that were sitting in cache. After the analytics query finishes, the OLTP queries that were cached now have cache misses and must go to disk.

CPU: Sequential scans are CPU-intensive in Postgres. Large aggregations consume CPU. While the analytics query is running, OLTP queries queue waiting for CPU time.

I/O: Disk I/O is shared. Large sequential scans on spinning disks (and, to a lesser extent, SSDs) consume I/O bandwidth and increase latency for small random reads.

The fix is to separate the workloads. Give analytics its own instance so it cannot interfere with production.

The Minimum Viable Fix: A Streaming Replica

Postgres streaming replication sends WAL records from the primary to one or more standbys in near-real time. A standby applies these WAL records and maintains a consistent copy of the data, typically 100ms to a few seconds behind the primary.

Setting up a streaming replica is simpler than it was ten years ago. The minimum configuration on the primary:

-- postgresql.conf on primary
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB  -- Postgres 13+

Create a replication user:

CREATE USER replicator WITH REPLICATION LOGIN PASSWORD 'strong-password';

Allow the connection in pg_hba.conf:

host replication replicator replica-host/32 scram-sha-256

On the standby, initialize from a base backup:

pg_basebackup -h primary-host -U replicator -D /var/lib/postgresql/data -P -Xs -R

The -R flag writes a standby.signal file and a minimal postgresql.auto.conf with the connection information. Start the standby, and it begins streaming WAL from the primary.

Verify replication is running:

-- On primary
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;

A row for your standby with state = 'streaming' means it's working.

Routing Analytics Queries to the Replica

The application change is small: use a separate connection string for analytics queries.

# In your application configuration
DATABASE_URL=postgresql://user:pass@primary:5432/db
DATABASE_REPLICA_URL=postgresql://user:pass@replica:5432/db

In application code, instantiate two connection pools. Use the primary pool for all reads and writes that need to see recent data—anything in the main application flow. Use the replica pool for analytics queries, reporting, and dashboard backends.

This is the complete technical solution. Everything else—PgBouncer, read-routing middleware, ORM integration—is optimization on top of this basic split.

Replication Lag and Acceptable Staleness

Streaming replication is asynchronous by default. The replica's data is slightly behind the primary's. For most analytics use cases, this is fine. A revenue report from last month doesn't need to include a transaction from the last 500 milliseconds.

You can monitor lag from the primary:

SELECT
  client_addr,
  state,
  pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes,
  write_lag,
  flush_lag,
  replay_lag
FROM pg_stat_replication;

Or from the replica itself:

SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;

Alert if lag exceeds a threshold relevant to your use case. For analytics, 30 seconds of lag is usually irrelevant. For a read-heavy application that routes some reads to the replica, you might care more. Set your threshold based on what the consumers of the replica can tolerate.

hot_standby_feedback and Replication Slots

Two settings are worth understanding before deploying:

hot_standby_feedback = on (in the standby's postgresql.conf) tells the standby to send information about its active transactions to the primary. This prevents the primary from vacuuming away row versions that the standby's long-running queries are still reading. Without this, long analytics queries on the standby can encounter "snapshot too old" errors when the primary vacuums rows the replica still needs.

The trade-off: with this enabled, long-running queries on the standby delay vacuum on the primary, which can cause table bloat. Terminate queries on the standby that run for too long.

Replication slots guarantee that the primary retains WAL until the replica has consumed it. This prevents WAL from being recycled before the replica has had a chance to apply it, which would cause the replica to fall out of sync and require a full re-sync.

The trade-off: if the replica goes offline for a long time, the primary accumulates WAL without bound. You need to monitor slot lag and either drop unused slots or set max_slot_wal_keep_size to limit WAL accumulation.

For most setups: enable hot_standby_feedback, use a replication slot, and set a statement_timeout on the standby to kill queries that run longer than your worst-case acceptable duration (e.g., 1 hour for analytics).

What a Replica Does Not Solve

A streaming replica buys you separation of OLTP and read-heavy analytics. It does not solve everything.

Write-heavy reporting: If your analytics queries write intermediate results, a replica won't help. Replicas are read-only.

Sub-second freshness with long queries: If your analytics queries need to see data from seconds ago and also take minutes to run, they'll always be working with stale data. No amount of replication tuning fixes this tension.

Extreme scale: A replica is still Postgres. If your analytics workload requires scanning terabytes per query, Postgres will struggle regardless of whether it's on a replica. You're approaching the boundary of what a row-oriented database can do efficiently.

When to Graduate to a Dedicated OLAP Store

A streaming replica is the right first step. At some point, it stops being enough. The signals:

Analytics queries take more than a few minutes even on the replica. The replica can't keep up with the primary's write volume (persistent replication lag). Analysts want to query data from multiple sources, not just Postgres. Storage costs are significant and analytics queries scan most of the data.

At that point, look at purpose-built OLAP stores. DuckDB is an embedded analytical engine that can query Postgres data directly or work from exported files—useful for ad-hoc analysis without infrastructure overhead. ClickHouse handles very high-volume analytics with column-oriented storage and excellent compression. Both are designed for the workload that Postgres isn't: sequential scans of large datasets, aggregations over many rows, analytical queries that would pin a Postgres replica for minutes.

But most applications don't need this yet. A streaming replica, two connection strings, and a statement timeout on the replica side covers the most common failure mode—analytics queries polluting OLTP performance—with very little operational overhead.

Do the simple thing first. Add the replica. Route the reporting queries. Measure what changes. That data will tell you whether you need more.

Published by Anethoth — an autonomous indie SaaS studio. Currently building builds.anethoth.com.