Designing API Pagination That Survives Deletion: Cursor Stability Patterns
Cursor pagination is the right default for API list endpoints. But cursors break in subtle ways when the underlying data changes between page fetches. The patterns that handle deletions and updates correctly are worth knowing before you ship.
Cursor pagination has won the API design argument against offset pagination for almost every B2B SaaS use case. The shape is simple, the performance scales, and the semantics are easier to explain to customers than the offset-shifts-under-you confusion. What gets less attention is what happens when the underlying data changes between page fetches: rows deleted, rows updated to move them in or out of the result set, rows inserted that the cursor implicitly skips over.
We hit these cases when designing list endpoints for DocuMint, CronPing, FlagBit, and WebhookVault, and the decisions made early determine what customer code paths break later. Below are the patterns that handle the awkward cases.
What a cursor encodes
A cursor is a serialized representation of a position in a result set. The position is defined by an ordering tuple, usually a sort column plus a tiebreaker primary key. The cursor for the last row of one page becomes the starting point for the next page: WHERE (sort_column, id) > (cursor_sort_value, cursor_id) ORDER BY sort_column, id LIMIT N.
The stability property of this design is that the cursor remains valid for the same query as long as the (sort_column, id) tuple is still a meaningful position in the underlying data. Rows deleted before the cursor position do not affect subsequent pages because the next-page query skips past the cursor regardless. Rows inserted after the cursor position will appear in subsequent pages naturally. The awkward cases are rows that move across the cursor boundary or rows referenced by the cursor itself getting deleted.
The deleted-cursor-row case
What happens when the row that the cursor encodes gets deleted between page 1 and page 2? In the naive implementation, the cursor is still a valid (sort_value, id) tuple, the next-page query still works correctly, and the row's absence is just one fewer result. Most cursor implementations get this right by accident because they encode only the position values, not a reference to the row itself.
The cases where this breaks are when the cursor encodes additional information that becomes stale, such as a created_at value plus a row reference that no longer exists. The pattern that avoids the problem is to keep the cursor minimal: only the ordering tuple values, no row references, no embedded counts, no other state. The minimal cursor is robust to most underlying data changes.
The moved-row case
The harder case is when a row's sort_column value changes between pages, causing it to move within the ordering. If the row moves backward across the cursor, it gets re-included in a subsequent page (the customer sees it twice). If the row moves forward across the cursor, it gets skipped (the customer never sees it).
The double-include case is usually tolerable because the customer's idempotent processing handles it correctly, and the duplicate is detectable on the customer side. The skip case is harder: the customer has no way to know that a row was missed because nothing in the response indicates it was excluded. The mitigation is to choose sort columns that do not change frequently for the underlying rows: id is stable, created_at is stable, updated_at is not stable and produces moved-row problems regularly.
If the sort column has to be updated_at (for example, a customer-facing "recent activity" feed), the pattern that scales is to use an event log rather than a mutable list. The event log appends rows when something changes, the cursor walks the immutable log, and the customer sees a consistent ordered stream even when the underlying entities are being updated.
The snapshot pagination pattern
For cases where consistency across pages is required (an export operation that needs a coherent snapshot of the data), the right pattern is snapshot pagination. The first request initiates a snapshot, the snapshot ID is returned and used in subsequent requests, and the server pins the data to a point-in-time for the duration of the snapshot.
The implementation can be at several levels. Postgres has REPEATABLE READ isolation that gives transaction-level snapshots, but the transaction has to stay open across requests, which is operationally fragile. A more common pattern is to capture the IDs in scope at snapshot creation time and walk them in subsequent pages: the snapshot is the list of IDs, the pages return rows for chunks of the ID list. The cost is the upfront work to capture the ID list, and the benefit is that subsequent inserts and deletes do not affect the snapshot.
Snapshot pagination is appropriate for export endpoints, reporting endpoints, and other cases where the customer expects a coherent view. It is overkill for list endpoints where the customer is browsing and the standard cursor semantics are acceptable.
The cursor TTL question
Cursors should have some notion of expiry, even if loose. The reasons are partly operational (long-lived cursors can become invalid if the underlying schema changes or the sort column gets reindexed) and partly performance (the database optimizer can plan more aggressively when it knows cursors will not be older than some bound).
The pattern that works is a sliding TTL: cursors are valid for some hours of active use, refreshed on each use, and rejected if the gap between uses exceeds the TTL. The error response is a 410 Gone with a clear message, and the customer-side handling is to restart pagination from the beginning. This is acceptable for the vast majority of use cases because long pagination sessions are rare and the restart is cheap.
The opaque-vs-structured question
Whether to make cursors human-readable or opaque is a design choice with long-term consequences. Opaque cursors (base64-encoded, signed) prevent customers from constructing them by hand and let the server change the encoding without breaking client code. Structured cursors (JSON or query parameters) let customers debug pagination by inspection but couple the server to the chosen encoding forever.
The pattern that scales is opaque cursors with a stable shape. The cursor contains the ordering values plus a small version header, the server signs the cursor to detect tampering, and the encoding is documented as opaque even though it is structurally regular. Customers should treat the cursor as a black box they pass back to the API verbatim.
What to test
The cursor pagination patterns that handle real-world data changes are exactly the ones that need explicit test coverage, because the bugs only manifest under concurrent modification. The minimum test suite includes pagination across a static dataset (the happy path), pagination with concurrent inserts at the head and tail of the result set, pagination with concurrent deletes including the cursor row itself, pagination with concurrent updates that move rows across the cursor boundary, and pagination with cursor TTL expiry.
The tests that catch the most bugs are the concurrent ones, run as integration tests against a real database with a writer thread modifying the data while a reader thread paginates. These tests are easy to write once and catch a class of subtle bugs that unit tests miss.
The deeper observation
Cursor pagination is the right default because it handles the common case well and the awkward cases predictably. The patterns that handle the awkward cases are well known but rarely written down, which means each team rediscovers them under production pressure. The discipline that pays off is to think through the data-change cases at design time and to write the integration tests that catch the bugs before customers do.
The wider observation is that pagination is one of the API design surfaces where the right choice is usually obvious in retrospect but easy to get wrong if you do not think carefully. The cost of getting it wrong is paid by customers in confusing duplicate or missing rows, and the cost of getting it right is one or two days of careful design plus a small integration test suite. The ratio of cost to value strongly favors getting it right.