Postgres synchronous_commit: Choosing the Right Durability Level Per Transaction

synchronous_commit is one of the most consequential Postgres settings most teams never touch. It controls when a transaction is considered durably committed, and the default is correct for most workloads but wrong for some.

The setting defines the contract Postgres makes when a COMMIT statement returns: at that moment, what guarantees do you have that the data is durable?

The default value (on) gives the strong answer: the WAL records for the transaction have been written to disk and fsync'd, and a crash immediately after the COMMIT returns will not lose the transaction. This is the right answer for most workloads. But Postgres exposes five settings, and the others exist because the strong answer costs something, and some workloads can trade some of that cost for some of that guarantee.

What the five settings mean

The five values are off, local, remote_write, on, and remote_apply. They form a spectrum from least to most durable, and from cheapest to most expensive.

off tells Postgres not to wait for the WAL record to be flushed before returning from COMMIT. The transaction is committed as far as the running database is concerned (it is visible to other sessions and will not roll back), but a crash shortly after COMMIT returns can lose the most recent committed transactions. The Postgres documentation bounds that window at three times wal_writer_delay (default 200ms, so up to roughly 600ms). This is the cheapest setting, but it weakens the durability guarantee most application code assumes. Atomicity and consistency are unaffected: a crash loses whole recent transactions and does not corrupt the database.

local waits for the local WAL flush but does not wait for any standby acknowledgment. This matters only on a primary with synchronous replication configured; on a standalone primary or with asynchronous replication, local behaves identically to on.

remote_write waits for at least one synchronous standby to receive the WAL records and write them out to its operating system, but not for the standby to flush them to disk. The data survives a crash of the standby's Postgres process, because the OS still holds it and will flush it, but not an OS-level crash of the standby. Since the primary has also flushed locally, a transaction is lost only if the primary fails and the standby suffers an OS-level crash before its flush.

on (the default) waits for at least one synchronous standby to receive, write, and fsync the WAL records, but does not wait for the standby to apply them. This is the strongest setting that does not require the standby to be caught up in query-visibility terms.

remote_apply waits for at least one synchronous standby to receive, fsync, and apply the WAL records, meaning queries on the standby will see the transaction immediately after the COMMIT returns on the primary. This is the strongest setting and the most expensive.
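The spectrum above can be written down as a small lookup table. The following is a minimal sketch in Python, not anything Postgres ships: the names LEVELS, WAITS_FOR, and effective_level are hypothetical, and the descriptions simply paraphrase the preceding paragraphs.

```python
# The five synchronous_commit values, ordered from weakest to strongest guarantee.
LEVELS = ["off", "local", "remote_write", "on", "remote_apply"]

# What COMMIT waits for at each level, paraphrasing the descriptions above.
WAITS_FOR = {
    "off": "nothing (WAL is flushed later in the background)",
    "local": "local WAL flush",
    "remote_write": "local WAL flush + standby OS write",
    "on": "local WAL flush + standby WAL flush",
    "remote_apply": "local WAL flush + standby WAL flush and apply",
}


def effective_level(level, has_sync_standby):
    """Return the level whose behavior a transaction actually gets.

    Without a synchronous standby there is nothing remote to wait for,
    so every value above local behaves like local.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown synchronous_commit value: {level!r}")
    if not has_sync_standby and LEVELS.index(level) > LEVELS.index("local"):
        return "local"
    return level
```

For example, effective_level("on", has_sync_standby=False) returns "local", which is the standalone-primary case described under local above.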

The latency cost

The cost of moving up the spectrum is commit latency. off adds essentially no wait. local takes one fsync, typically a few hundred microseconds on NVMe storage. With a synchronous standby, on takes one local fsync plus one network round-trip plus one standby fsync, typically 1-5ms depending on network topology; remote_write sits between local and on because it skips the standby fsync. remote_apply additionally waits for the standby to replay the WAL records, adding variable latency depending on standby load.

For a low-throughput service, this latency cost is invisible. For a high-throughput OLTP service committing thousands of transactions per second, it can dominate. A transaction that does microseconds of actual work can spend several milliseconds waiting for COMMIT to return at the default setting. Because each connection waits on its own COMMIT, the wait caps the number of transactions a single connection can complete per second.
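The per-connection ceiling is simple arithmetic, sketched below in Python. The function name is hypothetical and the model is deliberately crude: it assumes each connection commits serially, ignores the transaction's own work, and ignores the fact that Postgres can flush several concurrent commits together, so treat the result as a rough upper bound.

```python
def max_commits_per_second(commit_wait_ms, connections=1):
    """Rough upper bound on commit rate when each connection commits serially.

    commit_wait_ms is the time one COMMIT spends waiting; the transaction's
    own work is ignored, so real throughput will be lower.
    """
    if commit_wait_ms <= 0:
        raise ValueError("commit_wait_ms must be positive")
    return connections * 1000.0 / commit_wait_ms


# A 2 ms commit wait (inside the 1-5ms range quoted above for on):
print(max_commits_per_second(2.0))                  # 500.0 per connection
print(max_commits_per_second(2.0, connections=20))  # 10000.0 across 20 connections
```

Under these assumptions a single connection at a 2 ms commit wait tops out at about 500 commits per second, which is why the wait matters long before the disk itself is saturated.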

The SET LOCAL discipline

The most important fact about synchronous_commit is that it can be set per transaction: SET LOCAL synchronous_commit = ... inside a transaction block overrides the value for that transaction only. The setting can be changed at any point in the transaction, because the value in effect at COMMIT is the one that applies. Transactions that do not override it use the cluster-level default.

This lets you make per-workload decisions. A user-facing API endpoint that records a financial transaction should use the default (or remote_apply if there is a replica being read by another service that must see the transaction). A background job that records non-critical telemetry can safely use off, accepting the small chance of losing the last few hundred milliseconds of telemetry events in a crash in exchange for much lower latency per record.

The pattern is to default to safety at the cluster level and to opt-down per-transaction where the cost is genuine and the looser guarantee is acceptable. This requires that the application code be explicit about which transactions can tolerate weaker guarantees, which is usually a useful exercise even setting aside the performance question.
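One way to make the opt-down explicit in application code is a small wrapper around the transaction. The sketch below is in Python and assumes conn is a DB-API 2.0 connection (psycopg2 is one example) with autocommit disabled; commit_level and the telemetry_events table in the usage comment are hypothetical names, not part of any library.

```python
from contextlib import contextmanager

ALLOWED = {"off", "local", "remote_write", "on", "remote_apply"}


@contextmanager
def commit_level(conn, level):
    """Run one transaction with synchronous_commit overridden via SET LOCAL.

    Assumes a DB-API 2.0 connection with autocommit disabled, so the first
    statement executed opens the transaction.
    """
    if level not in ALLOWED:
        # The value is interpolated into SQL, so only accept the fixed list.
        raise ValueError(f"unknown synchronous_commit value: {level!r}")
    cur = conn.cursor()
    try:
        cur.execute(f"SET LOCAL synchronous_commit = '{level}'")
        yield cur
        conn.commit()  # the SET LOCAL override ends with the transaction
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()


# Usage sketch (hypothetical table):
#
#     with commit_level(conn, "off") as cur:
#         cur.execute(
#             "INSERT INTO telemetry_events (payload) VALUES (%s)", (payload,)
#         )
```

Keeping the level as an argument at the call site means a search for commit_level lists every transaction that runs with a weaker guarantee than the cluster default.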

The off setting in practice

The most common useful case for off is high-throughput insert workloads where individual record loss is acceptable. Append-only event logs, metrics ingestion, and audit trails that have stronger upstream durability often fit. The reasoning is: if the WAL flush is the bottleneck and the records can be recovered or accepted as lost on crash, the latency reduction is substantial.

The non-obvious gotcha is that off only skips the wait for the WAL flush at COMMIT; Postgres still writes the WAL records to the WAL buffer in shared memory, and the transaction is visible to other transactions immediately after COMMIT returns. The window of loss runs from COMMIT-return to the next background WAL flush. The WAL writer wakes every wal_writer_delay ms, and the documented worst case is three times that interval, so the loss is bounded but real.
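To put a number on "bounded but real", the sketch below estimates the worst-case count of acknowledged commits a crash could drop under off. It is written in Python with a hypothetical function name, and it uses the upper bound from the Postgres documentation, which puts the maximum risk window at three times wal_writer_delay. The commit rate is an input you would have to measure yourself.

```python
def max_commits_at_risk(commits_per_second, wal_writer_delay_ms=200.0):
    """Worst-case number of acknowledged commits lost in a crash under off.

    Uses the documented upper bound of three times wal_writer_delay for the
    risk window; typical losses are smaller.
    """
    return commits_per_second * 3 * wal_writer_delay_ms / 1000.0


# At 5000 commits per second with the default 200ms wal_writer_delay:
print(max_commits_at_risk(5000))  # 3000.0
```

If that figure is larger than the workload can tolerate, lowering wal_writer_delay shrinks the window, at the cost of more frequent background flushes.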

The remote_apply setting in practice

The most common useful case for remote_apply is when application code expects to write to the primary and immediately read from a replica. The default on setting guarantees the WAL records are on the replica's disk but does not guarantee they have been applied to the replica's table state, so a read against the replica immediately after COMMIT-on-primary may return stale data.

remote_apply closes this window at the cost of waiting for the standby's redo to complete. For applications where read-your-writes-against-a-replica is the dominant access pattern, this is often the right setting. For applications where the primary handles its own reads, this is unnecessary cost.

The synchronous_standby_names interaction

The remote_write, on, and remote_apply settings only go beyond a local flush if synchronous_standby_names is configured. The configuration lists which standbys are eligible to satisfy the synchronous wait, and the syntax allows multiple-standby quorum requirements (e.g., "wait for any 2 of 3 named standbys").

The interaction with disconnected standbys is the major operational gotcha: if the primary is configured to wait for a synchronous standby and the standby disconnects, the primary will block every COMMIT that waits for a standby (transactions running at local or off are not affected) until either the standby reconnects or the configuration is changed. This is the correct safety behavior (you asked for synchronous replication, you got it) but it can produce a primary-side outage during a standby outage that is more visible than the standby outage itself.

The mitigation is to configure quorum sizes that allow for standby failure (e.g., "any 1 of 2" rather than "all of 2") and to monitor synchronous-standby health closely. The pg_stat_replication view shows current sync_state for each connected replica.
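The quorum choice can be checked mechanically before it reaches a config file. The sketch below is in Python; the helper names and the standby names standby_a and standby_b are hypothetical, and it assumes simple standby names that need no quoting. It builds a value in the ANY/FIRST num_sync (names) form that synchronous_standby_names accepts, and it includes a monitoring query that reads the sync_state column mentioned above.

```python
def sync_standby_names(method, num_sync, standbys):
    """Build a synchronous_standby_names value such as 'ANY 1 (a, b)'."""
    if method not in ("ANY", "FIRST"):
        raise ValueError("method must be ANY or FIRST")
    if not 1 <= num_sync <= len(standbys):
        raise ValueError("num_sync must be between 1 and the number of standbys")
    return f"{method} {num_sync} ({', '.join(standbys)})"


def tolerates_one_failure(num_sync, standbys):
    """True if synchronous commits can still proceed with one standby down."""
    return num_sync <= len(standbys) - 1


# Query to run on the primary to see which replicas currently count as
# synchronous (sync_state is one of async, potential, sync, quorum).
SYNC_STATE_QUERY = (
    "SELECT application_name, state, sync_state FROM pg_stat_replication"
)

standbys = ["standby_a", "standby_b"]  # hypothetical names
print(sync_standby_names("ANY", 1, standbys))  # ANY 1 (standby_a, standby_b)
print(tolerates_one_failure(1, standbys))      # True
print(tolerates_one_failure(2, standbys))      # False
```

A check like tolerates_one_failure in a deployment pipeline is one way to catch a quorum setting that would turn routine standby maintenance into a write outage.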

The cluster-level default decision

The right cluster-level default depends on the workload mix. For typical B2B SaaS with mixed transaction types, on (the default) is correct: the per-transaction overhead is small enough relative to actual work that the safety guarantee is worth it, and per-transaction opt-down handles the rare cases where the cost matters.

For analytics-heavy workloads that do bulk inserts at high rates, a cluster-level default of local or even off may be appropriate, with per-transaction opt-up for the small number of transactions that genuinely need strong durability. This is the inverse pattern and requires equally explicit discipline about which transactions need the stronger guarantee.

For high-availability deployments with synchronous replication, on is correct as the default with the synchronous-standby configuration tuned to balance durability against availability under standby failure.

Three patterns that fail

Three patterns recur in production incidents. First, mixing synchronous_commit = off with application code that treats a returned COMMIT as durable. The transaction is visible to other connections as soon as COMMIT returns but is not yet WAL-flushed, so another connection or a downstream system can read and act on a write that disappears if the primary crashes before the flush.

Second, configuring synchronous_standby_names so that every listed standby must acknowledge (for example, ANY 2 over a list of only two standbys) on production primaries where one of the standbys is being upgraded or patched. The primary blocks synchronous commits during the standby maintenance window. The fix is to use looser quorum requirements that tolerate single-standby failure, or to deliberately drop the synchronous configuration during planned maintenance.

Third, expecting synchronous_commit = on to make the standby query-visible. The setting guarantees WAL flush on the standby, not WAL apply, so the standby is not yet at the primary's state when the COMMIT returns. If query-visibility is required, the setting needs to be remote_apply.

What synchronous_commit does not control

synchronous_commit is about COMMIT durability. It does not control how often background writes flush to disk (that is checkpoint_timeout and related), how durable individual writes are within a transaction (that is the WAL semantics that always apply), or how the standby applies WAL records (that is the standby's own configuration). It is specifically the COMMIT-return contract, and conflating it with the broader durability story leads to misconfiguration.

The setting also has no effect on write availability when replication is asynchronous. If no synchronous standby is configured, synchronous_commit values above local behave like local: there is no synchronous wait to perform, so a disconnected standby never blocks commits.

Our use across the four products

Our four products run on SQLite, where the equivalent setting is PRAGMA synchronous with values OFF, NORMAL, FULL, and EXTRA. We run with the SQLite default (FULL for the journal_mode=WAL configuration we use), which is the equivalent of Postgres's on at the local level. The eventual Postgres migration plan, when reached, will inherit on as the cluster default with per-transaction opt-down available for the small number of high-throughput insert paths in CronPing (ping recording) and WebhookVault (request capture).

The deeper observation about synchronous_commit is that Postgres exposes a knob that most application code does not need to know about but that some application code can use to substantial advantage. The discipline of leaving the default for most work and explicitly opting down for the small number of paths where it matters is the right pattern, and the discipline depends on actually knowing what the setting does and what its safe variations look like.

Our products: DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay) keep the lights on.
