Postgres Advisory Locks: The Underused Coordination Primitive

Most application-level coordination problems get solved with Redis or with a distributed lock library. Postgres has had advisory locks built in for over fifteen years, and they're a much better fit for many of those problems than the alternatives most teams reach for.

Most application-level coordination problems — preventing two workers from running the same job at the same time, serializing access to a shared resource, ensuring a one-time initialization runs exactly once — get solved with Redis, with a distributed lock library, or with some combination of database flags and polling. Postgres has had a coordination primitive built in for these cases since version 8.2 in 2006. Advisory locks are not widely used, and most teams reaching for distributed lock infrastructure don't know they exist or don't think to use them. They're a much better fit for many of those problems than the alternatives most teams reach for, and the reasons they fit are worth understanding.

The patterns in this post apply to any service that uses Postgres. We've used advisory locks in DocuMint for one-time PDF template registration, in CronPing for the scheduler-leader election, in FlagBit for serializing flag-config updates that touch multiple rows, and in WebhookVault for coordinating retention sweeps.

What advisory locks are

An advisory lock in Postgres is an integer-valued lock that the application acquires explicitly and releases explicitly. Postgres tracks the lock in a system table that's accessible to all connections to the database. The locks come in two flavors: session-scoped (released when the connection closes) and transaction-scoped (released when the transaction commits or rolls back). Both flavors come in shared and exclusive variants, with the standard reader-writer semantics. The lock keys are 64-bit integers, optionally split into two 32-bit integers for namespacing.

The crucial property is that advisory locks are advisory in the literal sense: Postgres doesn't enforce them on any operation. Unlike row locks or table locks, advisory locks don't block reads or writes by themselves. They only block other attempts to acquire the same lock key. The coordination is entirely between application code that agrees to check the lock before doing the work.

This sounds like a weakness, and in some sense it is — the database can't protect you from buggy code that ignores the lock. But it's also the reason advisory locks are useful for application-level coordination. They give you a fast, transactional, replicated coordination primitive that doesn't entangle with the row-level locking that the database is doing for SQL semantics.

The patterns where advisory locks fit

The first pattern is preventing duplicate job execution. A worker pool wants to ensure that only one worker processes a particular logical job at a time. The naive approach is to use SELECT FOR UPDATE on a jobs table, which works but has the failure mode that long-running jobs hold transaction locks for a long time, blocking other operations on the same row. Advisory locks decouple the lock from the row: you can acquire pg_advisory_xact_lock(job_id), do the work, and let the lock release on commit, without holding a row lock that interferes with other queries on the jobs table.

The second pattern is leader election in worker pools. When a service has multiple instances running and only one should be doing some work — running a scheduler, performing a periodic cleanup, sending heartbeats — you need a way to pick exactly one. Tools like ZooKeeper or etcd are designed for this, but they're heavyweight if Postgres is already in your stack. A pg_advisory_lock with the leader election lock key, held in a long-lived session, gives you leader election. When the leader's connection drops, the session-scoped lock releases and another instance can acquire it.

The third pattern is serializing operations that span multiple rows or tables. Some operations need to be atomic at the application level even though no single row or table-level lock can express the constraint. An example is updating a feature flag's config when the update touches multiple tables. Wrapping the update in a transaction that first acquires pg_advisory_xact_lock(flag_id) ensures only one updater proceeds at a time without blocking unrelated flag updates.

The fourth pattern is one-time initialization. A service starts up and needs to run a setup step exactly once across all instances. Acquiring a pg_advisory_lock for the initialization key, doing the setup, and releasing the lock guarantees only one instance does the work. Other instances either skip the setup or wait for it to complete.

The implementation in practice

The basic interface is four functions: pg_advisory_lock and pg_advisory_unlock for session-scoped locks, pg_advisory_xact_lock for transaction-scoped locks (which release automatically on commit/rollback), and pg_try_advisory_lock for non-blocking attempts. The non-blocking variant returns true if the lock was acquired and false if it would have blocked.

For job-deduplication patterns, you usually want pg_try_advisory_xact_lock — try to acquire the lock, and if you can't, skip the job (another worker has it). For leader election, you want pg_try_advisory_lock — try to acquire the lock and become the leader, or accept that another instance is the leader. For serializing operations, you want pg_advisory_xact_lock, which blocks until the lock is available, because you actually want the operation to wait its turn rather than fail.

The lock keys need to be chosen carefully. Postgres locks are global to the database, so two unrelated parts of your application using the same key will interfere. The standard pattern is to use a namespace number for the application area combined with a hash of the resource identifier. The two-32-bit-integer form of the API supports this directly.

The failure modes

The connection-loss failure mode is the most important to understand. Session-scoped locks are released when the connection drops. If your leader holds a session-scoped lock and the leader's connection has a transient failure, the lock releases and another instance acquires it. The original leader, if it doesn't notice the connection loss, may continue to act as if it's the leader. The mitigation is the same as for any leader-based system: the leader needs to verify it still holds the lock before performing each action, and external resources should fence on the leader's identity (a fencing token in the lock value, for example).

The lock-leak failure mode applies only to session-scoped locks. If application code calls pg_advisory_lock and then doesn't call pg_advisory_unlock — because of an exception, a code path that forgot to release, an early return — the lock stays held until the connection closes. Connection pool max_lifetime helps recover from this, but the cleaner fix is to always use the transaction-scoped variant when possible, since the lock release is tied to commit/rollback.

The wait-graph failure mode is the deadlock case. Two transactions each holding an advisory lock and trying to acquire each other's lock will deadlock. Postgres detects advisory-lock deadlocks the same way it detects row-lock deadlocks and aborts one transaction. The mitigation is consistent lock-acquisition order, the same discipline as for row locks.

The performance envelope

Advisory locks are fast. The lock acquisition is a small in-memory data structure update inside Postgres, with no disk I/O for the lock itself. We've measured tens of thousands of advisory lock operations per second on modest hardware. The locks scale to as many distinct lock keys as you want — Postgres maintains the lock table efficiently up to millions of distinct held locks.

The cost is not zero. Each advisory lock acquisition is a network round-trip to Postgres, which adds latency to operations that hold locks. The cost is comparable to a small SQL query — useful for coordination at the level of jobs and resources, not useful for fine-grained synchronization within tight loops.

When advisory locks are wrong

Advisory locks are wrong when the coordination problem extends beyond the Postgres database. If you have multiple services using different databases, or services that don't connect to Postgres at all, advisory locks can't coordinate across them. They're also wrong when the lock holding time is very short and the lock acquisition rate is very high — the round-trip to Postgres dominates the actual lock-protected work, making the coordination expensive relative to the protected operation. For sub-millisecond coordination, in-process locks or Redis-based locks are better.

Advisory locks are also wrong when the coordination requires complex semantics — multi-resource locking with timeouts, hierarchical locking, lock upgrades from shared to exclusive — that Postgres advisory locks don't support directly. For those cases, a purpose-built coordination service like ZooKeeper or etcd is the right answer.

The deeper observation

Postgres advisory locks are a case study in features that solve common problems but get reached for last. The default mental model when a team needs distributed coordination is to add a coordination service — Redis, ZooKeeper, etcd — and pay the operational cost of running another stateful piece of infrastructure. The Postgres-already-in-your-stack option is often better: lower latency than a Redis round-trip if Postgres is closer, full transactional integration with the rest of the database operations, and zero additional operational surface. The pattern of reaching for distributed coordination infrastructure before checking what your existing database can do is a small example of a much larger pattern in software architecture, where the assumption that a problem needs new infrastructure produces stacks that are more complex than they need to be. Advisory locks reward the discipline of checking what's already available before adding what's new.

Read more