Database Foreign Keys: Constraints, Cascades, and the Patterns That Survive Production

Foreign keys enforce referential integrity at the database layer, which means the rule holds regardless of which application writes to the table. The trade-offs come from the constraint check itself, the lock granularity it requires, and the cascade options that look helpful but often crea

Foreign keys are one of the oldest features of relational databases and one of the most contested in practical engineering. The textbook position is unambiguous: every column that references another table should have a foreign key constraint, because data integrity is more important than performance. The contemporary contrarian position is also unambiguous: foreign keys add cost on every write, complicate schema migrations, and the integrity they enforce can be enforced equally well in application code. Both positions have merit, and the right answer in any specific system depends on operational realities the textbook does not cover.

This post covers what foreign keys actually do, what they cost, the cascade options and when they are appropriate, and the patterns that hold up across years of production use. The patterns apply across the four products in our studio — DocuMint, CronPing, FlagBit, and WebhookVault — and are general enough to apply to any system using a relational database.

What foreign keys actually enforce

A foreign key constraint says that every value in a child column must correspond to an existing value in a parent column. If you have an orders table with a customer_id column referencing customers(id), the database refuses any INSERT or UPDATE on orders that would set customer_id to a value not present in customers. It also, by default, refuses any DELETE on customers that would orphan an existing order, and refuses any UPDATE that would change the parent id while orders still reference it.

The guarantee is unconditional. It holds regardless of which application connects to the database, which user runs the query, or whether the application code has a bug. Application-layer integrity checks have to be implemented identically in every code path that writes the data; foreign key constraints work even when the application that writes the row is a one-off script run from a developer's laptop.

The cost of foreign keys

The cost shows up in three places. The first is the constraint check itself: every INSERT or UPDATE on the child table requires a lookup in the parent table to confirm the referenced row exists. For a well-indexed parent (and the foreign key requires an index on the parent's primary key, which is always present), this is a fast index lookup. For a heavily-loaded parent table, the lookups add up — though they are still cheaper than the writes themselves in almost all real workloads.

The second is locking. PostgreSQL acquires a row-level share lock on the referenced parent row for the duration of the transaction that writes the child. This is invisible until two transactions try to update the same parent row while different transactions are inserting children referencing that row, at which point the share locks held by the inserting transactions block the parent UPDATE. The classic case is a counters table where many child rows reference a small number of hot parent rows; the locking pattern serializes operations that would otherwise run in parallel.

The third is migration complexity. Adding a foreign key to an existing table requires that every existing row in the child satisfies the constraint, which means a full scan of the table at constraint-add time. PostgreSQL has the NOT VALID option to skip the scan and the separate VALIDATE CONSTRAINT command to scan later under a weaker lock, which is the right pattern for adding constraints to large tables in production. Removing a foreign key is fast; changing the cascade behavior requires dropping and recreating, which holds an exclusive lock briefly.

The cascade options

The default behavior is NO ACTION (or its near-synonym RESTRICT): refuse the parent operation if it would orphan or invalidate any child rows. The other options change what happens to the children when the parent changes.

ON DELETE CASCADE deletes all child rows when the parent is deleted. The use case is genuine ownership: a user account being deleted should delete the user's saved preferences, and the preferences have no meaning without the user. The trap is that cascade deletes are silent, recursive, and irreversible. A delete from an upstream parent can cascade through three or four tables and remove millions of rows that the engineer running the DELETE did not see. The audit trail problem is severe: by the time you realize the wrong delete was issued, the data is gone, and only a backup restore recovers it.

ON DELETE SET NULL sets the child's foreign-key column to NULL when the parent is deleted. The use case is retaining the child for historical purposes while breaking the link: an audit log entry can survive the deletion of the user who triggered it, with the user reference set to NULL to indicate the user no longer exists.

ON UPDATE CASCADE propagates parent primary-key changes to children. The use case is rare and shrinking: primary keys should be immutable identifiers, and any system that mutates primary keys has bigger problems than what to do about foreign keys. Use of ON UPDATE CASCADE is usually a sign that the parent column should not be the primary key.

The honest default for most systems is ON DELETE NO ACTION: refuse the parent delete if children exist, and require the application to explicitly delete children first. This forces every cascade decision to be visible in application code, which is what you want when the operation is destructive and irreversible.

The deferrable option

PostgreSQL supports deferring foreign key checks until the end of the transaction. The default is NOT DEFERRABLE, which checks the constraint immediately on each statement. DEFERRABLE INITIALLY DEFERRED checks at COMMIT time, which is necessary for circular references — two tables that reference each other where the standard insert pattern would always violate the constraint at the first INSERT.

The cost of deferring is that constraint violations surface at COMMIT time, which is when the transaction has already done all its work. The error happens after, not before, the work that should have been prevented. For most workloads, deferring is the wrong default; reserve it for the specific case of mutually-referencing tables where the alternative is no constraint at all.

When application-layer integrity is acceptable

Application-layer integrity is acceptable when one application owns the table and that application has a single code path for writes. In this case the integrity rule lives in one place in code, the application can check it before writing, and the foreign key would mostly add cost without adding guarantees. The trap is that the assumption "one application, one code path" rarely survives the first migration script, the first batch import, the first backfill job, or the first emergency-fix database update run by a human. Application-layer integrity that holds in development drifts away from production over the years; database-layer integrity does not.

The pragmatic position is that foreign keys are a guarantee, and the cost of the guarantee is small enough that you should pay it unless you have a specific reason not to. The reasons not to are real but narrow: extremely high write rates where the constraint check dominates, partitioned or sharded tables where the parent and child live on different physical locations, and event-sourced or append-only designs where the relational model is not the source of truth.

The discipline that holds up

The pattern that holds up over years of production use is straightforward: foreign keys on every reference, indexes on every foreign key column, and ON DELETE NO ACTION as the default cascade behavior. Use ON DELETE CASCADE only when the child has no meaning without the parent and you have explicitly thought about the audit-trail consequences. Use ON DELETE SET NULL when the child should outlive the parent reference. Add new foreign keys to existing tables with NOT VALID followed by VALIDATE CONSTRAINT. Watch out for the locking pattern when many child rows reference a small number of hot parent rows; in that case, denormalize or partition the access pattern rather than removing the constraint.

The deeper observation is that foreign keys are mostly an argument about whether you trust your future self to write application code that maintains invariants. The empirical evidence over decades is that future-you, future-teammates, and future-emergency-scripts do not maintain invariants with the consistency required to skip database-layer enforcement. The constraint is cheap, the alternative is fragile, and the systems that age well almost always have the constraints in place.

Read more