engineering

Pagination Without Pain: Cursor vs Offset and the Patterns That Scale

Offset pagination is fine until your table is large or your data shifts. Cursor pagination is harder to introduce after the fact. Here is the small set of decisions that determines whether your pagination scales.

Anethoth

28 Apr 2026 — 3 min read

Pagination is one of those problems that looks trivial at the read-the-docs stage and gets steadily uglier under production load. The instinct to reach for LIMIT 50 OFFSET 1000 is how most APIs start, and it works for the first thousand rows. The trouble is that the cost of fixing it later is much higher than the cost of choosing well at the start, because pagination is part of the public contract of your API. Once a client iterates with ?page=42, you cannot quietly switch the parameter to a cursor without coordinating with every consumer.

What offset actually costs you

The dirty secret of offset pagination is that the database has to walk the rows you skip. OFFSET 100000 means the engine fetches one hundred thousand rows from the index and discards them before returning the next fifty. On a small table with an index that fits in RAM, this is invisible. On a tens-of-millions-of-rows table the same query runs in seconds, then tens of seconds, then it is being killed by the connection pool's statement timeout. There is no clever indexing that fixes this; it is fundamental to what offset means.

The second problem is correctness under writes. If a row is inserted at the top of the result set between page two and page three, you see the same row twice. If a row is deleted, you skip a row entirely. Users perceive this as the system "losing data," but it is actually the API contract being incoherent under concurrent writes.

What cursor pagination actually is

A cursor is just a stable, sortable pointer into your result set. The simplest version is the primary key of the last row you returned. The client passes ?after=12345&limit=50, and the server runs WHERE id > 12345 ORDER BY id LIMIT 50. The query plan is constant time regardless of how deep into the result set you are, because the index lookup goes straight to the right spot.

The trickier version handles ordering by something other than the primary key. If you order by created_at DESC, the cursor needs to encode both the timestamp and the primary key, because timestamps are not unique. The query becomes WHERE (created_at, id) < (?, ?) ORDER BY created_at DESC, id DESC LIMIT 50. This is called a tuple comparison, and PostgreSQL, MySQL 8+, and SQLite all support it natively.

The opaque cursor rule

Do not expose the contents of your cursor to clients. The cursor in the API response should be an opaque base64 string. Internally it is JSON like {"id": 12345, "ts": "2026-04-28T11:00:00Z"}, but base64-encoding it makes two things true at once. First, clients cannot rely on the structure, so you can change your sort key later without breaking them. Second, clients cannot construct cursors from outside your API, which avoids subtle off-by-one mismatches when they try to be clever.

Limits, max-limits, and total counts

Always cap the page size on the server. ?limit=10000 is either a misconfigured client or someone trying to scrape you, and either way the right response is to silently clamp it to your maximum (typically 100 or 200). Document the cap in your error messages so well-behaved clients learn quickly.

Total counts are deceptively expensive. SELECT COUNT(*) over a filtered query has the same problem as offset: it scans rows. For most APIs you do not actually need a total; you need has_next. Return the cursor for the next page if and only if you got limit + 1 rows back, and serve limit of them. If a customer truly needs a total, charge for it: expose it on a separate endpoint with caching and rate limits.

Stable ordering matters more than you think

Whatever sort you choose, make it deterministic. Always include the primary key as the final tiebreaker, even when sorting by a unique column, because "unique" today may not be unique tomorrow if you add tenants or backfill data. Without a tiebreaker, equal sort values are returned in arbitrary order, which means your cursor pagination can skip rows or repeat them on the boundary.

When offset is actually fine

Offset pagination is the right choice for small, bounded result sets that users browse interactively. An admin search returning at most a few hundred matches is happier with ?page=3 than with cursors, because users want to skip to page seven. Cursor pagination forces sequential access, which is wrong for human browsing. The rule is simple: cursors for machine clients iterating large result sets, offset for humans navigating small ones, and never the wrong one for the wrong job.

Patterns we use in practice

Across DocuMint, CronPing, FlagBit, and WebhookVault, we use cursor pagination for every list endpoint that can return more than a few hundred rows. The cursor is a base64-encoded JSON tuple of (sort_value, id). We never expose total counts, and we cap at 100 per page. The list response includes next_cursor if and only if there are more rows. Clients walk pages by passing ?cursor=...&limit=100 until next_cursor is null.

The cost of doing this from day one is roughly four extra hours of design work. The cost of retrofitting it onto a large customer base is months of versioning, deprecation notices, and customer support. Pick well at the start, and pagination becomes invisible infrastructure instead of a recurring source of incidents.

How Treehoppers Communicate Through Plant Stems: The Strange Substrate-Borne Vibrational Network

The Forgotten History of the Microwave Oven: How Radar Engineering Reshaped the Kitchen

Postgres pg_settings: Reading and Reasoning About Configuration at Runtime

Designing API Webhook Payloads: Snapshots vs References and the Right Default for B2B SaaS