Database Connection Pool Sizing: Beyond the max_connections Defaults
The default pool size in most ORMs is wrong for your workload. Little's Law gives you a starting size, pool wait time tells you whether that size is right, and getting it wrong in either direction produces its own recognizable failure mode.
Almost every application that talks to a database has a connection pool, and almost every one runs it at or near the framework default, which is rarely correct for the actual workload. The default is conservative, chosen long ago by some engineer with some unrelated workload in mind, and inherited unchanged through years of production traffic until someone notices that the application is queueing on the pool under load. By then, the pool has been wrong for a long time.
Pool sizing is one of the highest-leverage tuning knobs in a typical web application. It sits between every request and the database, it costs essentially nothing to change, and the right size produces dramatic improvements in tail latency. The wrong size in either direction produces specific and recognizable failure modes that operators learn to read once they have seen them a few times. We have tuned connection pools across DocuMint, CronPing, FlagBit, and WebhookVault, and the same patterns recur regardless of language and framework.
Why the defaults are wrong
The framework default for a connection pool is usually a small number: 5 in many Python ORMs, 10 in default Node.js drivers, 25 in some Java pools. These numbers come from the era when database connections were genuinely expensive and a database server might have a max_connections of 100. They are conservative because the framework cannot know what the application is doing. They are also wrong for almost every production workload above light traffic.
The right size depends on three factors the framework cannot see: the rate of requests that need a database connection, the average time each request holds the connection, and the upper bound on concurrent connections set by the database. Two of these are application-specific and one is operational. The framework default is a guess that ignores all three.
Little's Law gives the right answer
Little's Law states that in a stable system, the average number of items in the system equals the arrival rate times the average time each item spends in the system. Applied to a connection pool: the average number of connections actively in use equals the rate of requests that need a connection times the average connection-hold time. The pool size you need is at least this number, plus enough headroom to handle bursts without queueing.
Take a typical web application: at 200 requests per second and an average connection hold time of 50 milliseconds, Little's Law gives 200 × 0.05 = 10 active connections at steady state. The pool needs to be larger than 10 to absorb bursts. A common rule of thumb is 2 times steady-state for internal database access, on the theory that bursts can transiently double the offered load.
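The arithmetic above can be written down as a small helper. This is a minimal sketch, not a library API: the function names and the `burst_factor` parameter are made up for illustration, and the default of 2 is just the rule of thumb from the paragraph above.

```python
import math


def steady_state_connections(requests_per_second: float, hold_time_ms: float) -> float:
    """Little's Law: average connections in use = arrival rate x hold time."""
    if requests_per_second < 0 or hold_time_ms < 0:
        raise ValueError("rate and hold time must be non-negative")
    return requests_per_second * hold_time_ms / 1000.0


def suggested_pool_size(requests_per_second: float, hold_time_ms: float,
                        burst_factor: float = 2.0) -> int:
    """Steady-state demand times a burst multiplier (assumed, tune per workload)."""
    if burst_factor < 1:
        raise ValueError("burst_factor must be at least 1")
    steady = steady_state_connections(requests_per_second, hold_time_ms)
    # Round up, and never suggest a pool smaller than one connection.
    return max(1, math.ceil(steady * burst_factor))


if __name__ == "__main__":
    print(steady_state_connections(200, 50))  # 10.0
    print(suggested_pool_size(200, 50))       # 20
```

The result is a starting point to validate against measured wait time, not a final answer. Both inputs should come from production measurements rather than estimates.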
For database calls that depend on external services or third-party APIs, the hold time can be much longer than the database operation itself if the connection is held across the external call. The right answer here is to not hold database connections across external calls, but if you must, the pool size has to absorb the higher hold time.
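To make the difference concrete, here is a minimal sketch that compares a handler holding its connection across an external call with one that releases it first. Everything in it is hypothetical: `TinyPool` is a toy pool over in-memory SQLite, and `slow_external_call` stands in for a third-party API with a 50 ms response time.

```python
import queue
import sqlite3
import time
from contextlib import contextmanager


class TinyPool:
    """Toy fixed-size pool that tracks total time connections were checked out."""

    def __init__(self, size: int):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(sqlite3.connect(":memory:"))
        self.held_seconds = 0.0

    @contextmanager
    def connection(self):
        conn = self._idle.get()
        start = time.monotonic()
        try:
            yield conn
        finally:
            self.held_seconds += time.monotonic() - start
            self._idle.put(conn)


def slow_external_call() -> str:
    time.sleep(0.05)  # stand-in for a third-party API call
    return "ok"


def handler_holding(pool: TinyPool) -> str:
    # Anti-pattern: the connection is checked out but idle during the external call.
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()
        status = slow_external_call()
        conn.execute("SELECT 2").fetchone()
    return status


def handler_scoped(pool: TinyPool) -> str:
    # Preferred: hold a connection only while a query is actually running.
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()
    status = slow_external_call()
    with pool.connection() as conn:
        conn.execute("SELECT 2").fetchone()
    return status
```

In the first handler the hold time is dominated by the external call, so by Little's Law the same request rate needs a proportionally larger pool. The scoped version gives up the ability to wrap both queries in one transaction, which is the usual reason the first shape appears in real code.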
The transaction-pooling vs session-pooling distinction
Connection poolers like PgBouncer offer multiple pooling modes. Session pooling assigns a client connection to a backend connection for the entire client session, which mimics direct connections and supports all PostgreSQL features but limits concurrency to the number of backend connections. Transaction pooling assigns a backend connection only for the duration of a transaction, which multiplexes many more client connections over a smaller pool of backends.
Transaction pooling is the right default for most web applications because most requests run a single transaction. The caveats are real: features that depend on session state (session-level prepared statements, session-level advisory locks, LISTEN/NOTIFY, temporary tables, SET commands without SET LOCAL) do not work correctly under transaction pooling. The application must use SET LOCAL exclusively, must not rely on prepared statement caching unless the pooler explicitly supports it (PgBouncer 1.21 and later can track protocol-level prepared statements in transaction mode when max_prepared_statements is set), and must coordinate any LISTEN/NOTIFY usage through a separately-pooled connection.
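The trade-off is favorable for applications that use the database for short transactions and pay attention to the constraints. Because a backend is occupied only while a transaction is open, the backend pool can often be several times smaller than the equivalent session pool, and as much as ten times smaller when clients spend most of their time outside transactions, with the same effective concurrency.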
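The size difference between the two modes follows from how much of each client session is spent inside a transaction. A rough sketch of that arithmetic, where the function names, the 5 percent figure, and the headroom multiplier are all illustrative assumptions rather than measurements:

```python
import math


def backends_for_session_pooling(client_connections: int) -> int:
    """Session pooling pins one backend per connected client."""
    return client_connections


def backends_for_transaction_pooling(client_connections: int,
                                     in_transaction_percent: float,
                                     headroom: float = 2.0) -> int:
    """Backends are busy only while a transaction is open, so demand scales
    with the share of time clients spend inside transactions."""
    if not 0 <= in_transaction_percent <= 100:
        raise ValueError("in_transaction_percent must be between 0 and 100")
    busy = client_connections * in_transaction_percent / 100
    # Never more than one backend per client, never fewer than one in total.
    return min(client_connections, max(1, math.ceil(busy * headroom)))


if __name__ == "__main__":
    clients = 400
    print(backends_for_session_pooling(clients))         # 400
    print(backends_for_transaction_pooling(clients, 5))  # 40
```

With 400 clients that are inside a transaction 5 percent of the time, and a 2x burst allowance, transaction pooling needs about 40 backends where session pooling needs 400. The ratio shrinks quickly as transactions get longer.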
Per-tier sizing within an application
A single application typically has multiple workloads with different characteristics: synchronous user-facing requests, asynchronous background jobs, occasional reporting queries, scheduled maintenance tasks. Each has a different request rate, a different hold time, and a different sensitivity to queueing. Giving them all a share of the same pool produces interference where a slow background job starves the user-facing requests of connections.
The pattern that works is to size separate pools per workload tier, summing to a number that fits comfortably under the database's max_connections. A database with max_connections set to 100 might allocate 50 to user-facing reads, 20 to user-facing writes, 15 to background jobs, and 5 to reporting and operational queries, leaving 10 unallocated for administrative sessions, migrations, and monitoring. These are totals across all application instances, not per-instance pool sizes. The pools are isolated, slow jobs cannot starve fast jobs, and the database is never overcommitted.
The discipline is to plan the connection budget at the database level, not at each pool's level. Without budget planning, each subsystem will independently expand its pool when it sees queueing, and the cumulative pool size will exceed the database limit at peak load, producing connection-refused errors that look like a different problem entirely.
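One way to enforce the budget is to keep it in a single place and derive each instance's pool sizes from it. The sketch below assumes the budget is expressed as database-wide totals per tier; the tier names, the numbers, and both function names are illustrative assumptions.

```python
def validate_budget(tier_budget: dict, max_connections: int, reserved: int) -> int:
    """Return the number of spare connections, or raise if the budget overcommits.

    `reserved` is held back for administrative sessions, migrations, and monitoring.
    """
    usable = max_connections - reserved
    total = sum(tier_budget.values())
    if total > usable:
        raise ValueError(f"budget of {total} exceeds {usable} usable connections")
    return usable - total


def per_instance_sizes(tier_budget: dict, instances: int) -> dict:
    """Split each tier's database-wide total across application instances.

    Floor division guarantees instances * size never exceeds the tier total.
    """
    sizes = {}
    for tier, total in tier_budget.items():
        size = total // instances
        if size < 1:
            raise ValueError(f"tier {tier!r} has fewer connections than instances")
        sizes[tier] = size
    return sizes


# Illustrative database-wide budget (assumed numbers).
BUDGET = {"user_reads": 50, "user_writes": 20, "background": 15, "reporting": 5}

if __name__ == "__main__":
    print(validate_budget(BUDGET, max_connections=100, reserved=10))  # 0
    print(per_instance_sizes(BUDGET, instances=5))
```

Running a check like this in CI or at deploy time means that a subsystem cannot quietly grow its pool past the limit. Any increase has to come out of another tier's share or out of the reserve, and that decision is visible in review.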
The pool wait time as primary metric
The single most important pool metric is the wait time: how long a request spends queued for a connection before it gets one. At steady state, this should be near zero. Spikes in pool wait time indicate that the pool is too small relative to the offered load, that requests are holding connections too long, or that the database itself is slow and requests are accumulating behind it.
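Many pool libraries expose wait time directly. Where one does not, it can be measured by timing the acquire call. The following is a minimal sketch using a toy pool built on the standard library; `InstrumentedPool` and `simulate` are hypothetical names, and the pooled resources are placeholder objects rather than real connections.

```python
import queue
import threading
import time
from contextlib import contextmanager


class InstrumentedPool:
    """Toy fixed-size pool that records how long each acquire waited."""

    def __init__(self, resources, acquire_timeout=1.0):
        self._idle = queue.Queue()
        for resource in resources:
            self._idle.put(resource)
        self._acquire_timeout = acquire_timeout
        self._lock = threading.Lock()
        self.wait_seconds = []  # one sample per successful acquire
        self.timeouts = 0       # acquires that gave up waiting

    @contextmanager
    def acquire(self):
        start = time.monotonic()
        try:
            resource = self._idle.get(timeout=self._acquire_timeout)
        except queue.Empty:
            with self._lock:
                self.timeouts += 1
            raise TimeoutError("no connection available within timeout") from None
        with self._lock:
            self.wait_seconds.append(time.monotonic() - start)
        try:
            yield resource
        finally:
            self._idle.put(resource)


def simulate(workers: int, pool_size: int, hold_seconds: float) -> InstrumentedPool:
    """Run `workers` threads that each hold one pooled resource for hold_seconds."""
    pool = InstrumentedPool([object() for _ in range(pool_size)])

    def work():
        with pool.acquire():
            time.sleep(hold_seconds)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return pool
```

With two workers and a pool of one, the second worker's recorded wait is roughly the first worker's hold time. With a pool of two, both waits are near zero. In production the samples would feed a histogram so that high percentiles can be alerted on, rather than accumulating in a list.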
The other essential metrics are pool utilization (the percentage of the pool actively in use), the share of connections that are acquired and then released without running a query (which indicates a hot path that takes connections without using them), and pool max-reached events (which indicate that the pool has hit its hard ceiling and requests are now waiting or, in pools configured to fail fast, failing outright).
The discipline is to alert on pool wait time, not on pool size. A pool with high utilization but zero wait time is correctly sized; making it larger wastes database resources. A pool with low utilization but high wait time during bursts is undersized for the burst pattern, regardless of average utilization.
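The rule above, expressed as a decision function. This is a sketch under stated assumptions: the 5 ms wait budget and the 30 percent utilization floor are placeholder thresholds, and the "candidate to shrink" verdict is an extrapolation from the point that unused capacity wastes database resources.

```python
def sizing_verdict(utilization: float, p99_wait_ms: float,
                   wait_budget_ms: float = 5.0,
                   low_utilization: float = 0.30) -> str:
    """Classify a pool from peak-window utilization (0..1) and p99 wait time.

    Wait time decides first: utilization alone never justifies growing a pool.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if p99_wait_ms > wait_budget_ms:
        # Requests are queueing: grow the pool or shorten connection hold times.
        return "undersized"
    if utilization < low_utilization:
        # No queueing and mostly idle: capacity could be returned to the budget.
        return "candidate to shrink"
    return "correctly sized"


if __name__ == "__main__":
    print(sizing_verdict(utilization=0.90, p99_wait_ms=0.2))   # correctly sized
    print(sizing_verdict(utilization=0.25, p99_wait_ms=40.0))  # undersized
    print(sizing_verdict(utilization=0.10, p99_wait_ms=0.0))   # candidate to shrink
```

Note the second case: average utilization is low, yet the pool is undersized because bursts push wait time past the budget.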
The failure modes of getting it wrong
Undersized pools produce queueing under load. Requests wait for connections instead of doing work, latency at the application tier increases dramatically, and clients see slow responses without any obvious database-side issue. The database itself looks fine (low CPU, low query latency) because the bottleneck is in the pool, not the database. This is the failure mode most often misdiagnosed as a database problem.
Oversized pools produce a different failure mode. The database is overcommitted: too many concurrent sessions contend for CPU, locks, and I/O, so individual queries slow down; per-connection memory adds up and puts the database under memory pressure; and connection-establishment storms during deploys can exhaust file descriptors or memory. The most spectacular version is the connection storm: a restarted application opens its full pool size simultaneously, exhausting the database's connection limit and locking out other applications.
The right answer is to size the pool conservatively at first, watch the wait-time metric under realistic load, and increase the pool size only when wait time is genuinely a problem. The pool that is slightly small is recoverable. The pool that is too large is a step away from a cascade failure.
The deeper observation
Pool sizing is one of the operational decisions that compound. A correctly-sized pool is invisible: requests get connections immediately, the database is comfortable, the operational metrics are boring. An incorrectly-sized pool is one of the most common causes of mysterious slowness and one of the easiest to fix once diagnosed. The 30 seconds it takes to change a pool size in a config file can pay back as much as weeks of database tuning work. The teams that learn to read the wait-time metric and adjust the pool deliberately spend much less time chasing imaginary database problems.