Soft Deletes vs Hard Deletes: The Schema Decision That Compounds

One of the earliest schema decisions in any application is what to do when a user deletes something. The naive answer, which is also occasionally the correct answer, is to issue a DELETE against the row and move on. The other answer, which has accumulated a quiet orthodoxy over the last decade, is to add a deleted_at column and update it instead, leaving the row in the table forever. The choice between these is small the first time you make it and large the twentieth time, when half your queries have a WHERE deleted_at IS NULL clause and the other half do not.

It is worth being deliberate about which one you pick and why.

What soft delete actually buys you

The argument for soft delete is usually phrased as "you can recover deleted data." This is true but underweights what the feature is really doing. The deeper benefit is that the soft-deleted row preserves referential context. If a user account is hard-deleted and that user had created a hundred invoices, those invoices either become orphaned (referencing a missing user) or are themselves cascade-deleted. With soft delete, the user row is still there. Foreign keys still resolve. Audit trails still make sense. Reports still attribute revenue correctly to the right (now deleted) user.

The other thing soft delete buys you is mistakes. Every team eventually has someone delete the wrong row, and the difference between "we can restore that in five minutes from a flag flip" and "we restore from last night's backup and lose the day" is the difference between a five-minute incident and a four-hour one.

What soft delete costs you

The cost is real and accumulates. Every query against the table now needs to filter out deleted rows. Forget the filter once and you leak deleted data to a user. Most ORMs have a way to do this automatically (Rails default_scope, Django manager overrides, SQLAlchemy filtered relationships), and most of those automatic filters can be bypassed in subtle ways. Aggregations are particularly easy to get wrong: COUNT(*) over a table without the filter is now meaningless.

The second cost is index bloat. A table that has accumulated a million soft-deleted rows alongside ten thousand active rows performs worse on most queries than one that just has ten thousand rows. Indexes have to be larger; the planner has to estimate selectivity correctly; SELECT * with a limit becomes more expensive than it should be. You can mitigate this with partial indexes (CREATE INDEX ... WHERE deleted_at IS NULL) but it is one more thing to remember.

The third cost is privacy obligation. Under GDPR, CCPA, and similar regimes, "we soft-deleted the row but kept the data" is not actually deletion. Real deletion has to happen at some point, on some schedule, with some confirmation that it happened. Soft delete buys you flexibility for recovery; it does not buy you a way out of the privacy regulation.

The middle path: tombstones with a TTL

The pattern I have ended up using in most production systems is a hybrid. Soft-delete the row immediately on user action (set deleted_at). Run a background job nightly that hard-deletes rows whose deleted_at is older than some retention window (30 days for most things, longer for billing records, shorter for sensitive personal data). The intermediate state buys you a 30-day undo window for accidents and a clean privacy story: deletion eventually happens, in a documented way, on a documented schedule.

The pattern handles cascade well too. When you hard-delete the parent at the end of the retention window, the foreign keys cascade naturally; the child rows have already been soft-deleted at the same time, so the cleanup pass tidies everything together.

What goes in the deleted_at column

A subtle detail: do not name the column is_deleted with a boolean. Use a nullable timestamp. The reasons are practical. With a timestamp you can answer "when was this deleted" without needing a separate audit column. You can sort by deletion order. You can run retention queries by date range. The only argument for boolean is "it is one byte smaller per row," which is irrelevant.

Similarly, name it deleted_at not removed_at or archived_at. The latter two suggest user-visible state changes that you might want to model separately. Archived typically means "still present, hidden from default view, recoverable via a UI action." Deleted typically means "the user pressed delete." If your domain has both, model both: an archived_at column for the user-controlled archive feature, a separate deleted_at column for actual deletion. Conflating them produces queries with strange where clauses.

The cases where hard delete is correct

There are domains where soft delete is the wrong default. Logging tables, event streams, and analytics tables typically should not have it; the row is the event, and an event being deleted does not have a coherent meaning. High-write tables where retention is short (caches, tokens, session data) should hard-delete because soft-deleted rows just clog up the working set. Tables that are functionally caches of data owned by another system should hard-delete because the source of truth is elsewhere.

The general rule is: if a row represents a meaningful record from the user's perspective, soft-delete it. If it represents an internal state that has no user analog, hard-delete it.

The audit trail trap

Soft delete is sometimes pitched as a substitute for an audit log. It is not. An audit log records changes; soft delete only records the existence-or-not of the row. If a row was modified five times before being deleted, soft delete preserves only the final pre-deletion state. The intermediate history is gone. If you need a real audit trail, you need a separate audit table that records every change, not a flag on the main table.

The combination of soft delete plus audit log is the gold standard for systems where regulation requires reproducing past states (financial, healthcare, anything subject to discovery). It is more work but the patterns compose: the soft-deleted row remains as the last seen state, and the audit table preserves the change history that led to it.

The migration that always lurks

If you launch with hard delete and decide later you need soft delete, the migration is straightforward but irreversible: add the column, default deleted rows to a fixed timestamp, update queries. If you launch with soft delete and decide later you do not need it, the migration is harder, because by then there is data in those rows that someone, somewhere, has been quietly relying on. "We will hard-delete the soft-deleted rows" turns into "we will hard-delete except for these business reasons" turns into a meeting.

The general advice, since the failure mode of "we wish we had soft delete" is much more painful than "we wish we had hard delete," is to default to soft delete with a TTL on tables that store user-meaningful data. Then write the cleanup job on day one, not after you discover six months in that the regulator wants to see retention proof.

We make four developer APIs at DocuMint (PDF generation), CronPing (cron monitoring), FlagBit (feature flags), and WebhookVault (webhook debugging). All four use the soft-delete-with-TTL pattern. The retention windows differ (longer for billing, shorter for captured webhook bodies), but the schema shape is the same: nullable deleted_at, nightly cleanup, audit log alongside.