API Pagination Limits: Why Maximum Page Size Decisions Compound

The page size limits in your API are usually set once in an early ticket and then become permanent. The decisions matter more than they look because customers build their integrations around the constraints you ship with.

Most APIs grow a pagination scheme early and pick page-size limits in passing. The usual result is a default page size of 25, a maximum of 100, a silent fallback to the maximum if the customer requests more, and no special handling. The numbers feel arbitrary because they are arbitrary. They also turn out to be load-bearing in ways the original design did not anticipate, and changing them later is hard.

We have made this decision four times, across DocuMint, CronPing, FlagBit, and WebhookVault, and we revisited a few of those choices after they hit production. This post is the version we wish we had read before the first decision.

The three numbers

Every paginated endpoint has three numbers worth thinking about: the default page size when the customer does not specify one, the maximum page size the customer is allowed to request, and the backstop applied when the customer asks for more than the maximum. Most APIs publish the first two and silently apply the third (clamp to the maximum, return a 400, or some other behavior).

The default page size affects the median customer experience. A default of 25 is conservative and friendly to dashboards; a default of 100 is friendlier to bulk operations and worse for interactive UIs. The Stripe pattern of 10 by default and 100 maximum reflects a UX-first orientation. The GitHub pattern of 30 by default and 100 maximum is similar. The Algolia pattern of 20 by default and 1000 maximum reflects a search-first orientation where customers are expected to read entire result sets.

The maximum page size affects the bulk-export experience. A maximum of 100 means a 10000-row export needs 100 requests, which at typical API rate limits takes minutes. A maximum of 1000 reduces that to 10 requests, but each request does roughly ten times the work. On the database side the larger page usually wins: a single query for 1000 rows is much cheaper than 10 queries for 100 rows each. The cost shows up after the query, because 1000 rows of a large object are expensive to serialize and transmit.
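The request arithmetic is easy to check. The sketch below assumes a rate limit of 60 requests per minute, which is an illustrative figure and not one of our published limits; export_cost is a hypothetical helper.

```python
import math

def export_cost(total_rows: int, page_size: int, requests_per_minute: int) -> tuple[int, float]:
    """Return (request count, minimum minutes) needed to page through total_rows."""
    requests = math.ceil(total_rows / page_size)
    # Lower bound only: assumes the client is limited purely by the rate limit.
    minutes = requests / requests_per_minute
    return requests, minutes

for page_size in (100, 1000):
    requests, minutes = export_cost(10_000, page_size, requests_per_minute=60)
    print(f"page_size={page_size}: {requests} requests, at least {minutes:.2f} minutes")
```

The time estimate ignores per-request latency, so real exports take longer than the printed lower bound.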

The asymmetric cost

Page size limits are one of the API surfaces where the cost of being too restrictive is borne by customers and the cost of being too permissive is borne by you. A customer who needs to paginate through 100000 items at 100 per page makes 1000 requests against your API: that is their inconvenience, plus the fixed per-request overhead paid a thousand times on your side. A customer who pulls 1000 items at a time makes 100 requests, which is faster for them and probably cheaper for you despite the larger per-request size.

But a customer who pulls 10000 items at a time, if you allowed it, would tie up a database connection for the duration of that query, return a response that takes seconds to serialize, and produce timeout problems on the network path. The maximum page size needs to be set somewhere that balances these costs, and the right answer depends on the size of your typical row and the throughput characteristics of your database.
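One way to make the row-size dependence concrete is to start from a response-size budget and work backwards. This is a rough sketch and not a rule we follow mechanically: the 1 MB budget, the floor of 10, and the ceiling of 1000 are all assumptions chosen for illustration, and max_page_size_for is a hypothetical helper.

```python
def max_page_size_for(
    avg_row_bytes: int,
    response_budget_bytes: int = 1_000_000,  # assumed budget per response
    floor: int = 10,                         # assumed lower bound
    ceiling: int = 1000,                     # assumed upper bound
) -> int:
    """Suggest a maximum page size that keeps a full page under the byte budget."""
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    rows_that_fit = response_budget_bytes // avg_row_bytes
    # Clamp so tiny rows do not produce huge pages and huge rows still paginate.
    return max(floor, min(ceiling, rows_that_fit))

print(max_page_size_for(200))      # small rows hit the ceiling
print(max_page_size_for(10_000))   # mid-sized rows
print(max_page_size_for(40_000))   # large rows
```

Database throughput is not modeled here at all; the number this produces is a starting point to test against real query timings.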

The implementation gotchas

Three implementation choices around page size limits surprise teams in production.

First, the silent-clamp pattern (if the customer requests page_size=10000, return 100 rows instead of an error) looks friendly but is confusing for a customer who expected 10000 rows and got 100. The stricter but more explicit pattern is to return a 400 with an error message explaining the maximum, plus a hint about how to paginate through larger result sets. The middle ground is to return the clamped result with a header (X-Page-Size-Clamped: 100) indicating that the clamp happened.
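The three behaviors above can be sketched as one function with a policy switch. This is framework-agnostic and illustrative: PageSizeDecision, resolve_page_size, and the policy names are hypothetical, and the limits of 25 and 100 are the example values used in this post.

```python
from dataclasses import dataclass, field
from typing import Optional

DEFAULT_PAGE_SIZE = 25
MAX_PAGE_SIZE = 100

@dataclass
class PageSizeDecision:
    status: int                                   # HTTP status to respond with
    page_size: Optional[int] = None               # rows to fetch, if the request proceeds
    headers: dict = field(default_factory=dict)   # extra response headers
    error: Optional[str] = None                   # error message for 400 responses

def resolve_page_size(requested: Optional[int], policy: str = "reject") -> PageSizeDecision:
    """Apply the default, the maximum, and the chosen over-limit policy."""
    if requested is None:
        return PageSizeDecision(200, DEFAULT_PAGE_SIZE)
    if requested < 1:
        return PageSizeDecision(400, error="page_size must be at least 1")
    if requested <= MAX_PAGE_SIZE:
        return PageSizeDecision(200, requested)
    if policy == "reject":
        return PageSizeDecision(
            400,
            error=f"page_size must be at most {MAX_PAGE_SIZE}; "
                  "request further pages to read larger result sets",
        )
    if policy == "clamp_with_header":
        return PageSizeDecision(
            200, MAX_PAGE_SIZE, headers={"X-Page-Size-Clamped": str(MAX_PAGE_SIZE)}
        )
    if policy == "silent_clamp":
        return PageSizeDecision(200, MAX_PAGE_SIZE)
    raise ValueError(f"unknown policy: {policy}")
```

Whichever policy you pick, keeping it in one place makes it harder for individual endpoints to drift apart.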

Second, the offset-pagination performance cliff. Most APIs use offset-based pagination at first and discover late that offset 10000 with page_size 100 has to scan 10100 rows to return 100 of them. The cost is linear in the offset, and at large offsets it becomes the dominant cost of the query. The fix is cursor-based pagination, which seeks directly to the next row through an index on the sort key, so the cost stays roughly constant regardless of position. Cursor pagination has its own complications (cursor stability, cursor encoding), but page size becomes less critical once the offset cost is gone.
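The difference between the two query shapes is small in code. The sketch below uses an in-memory SQLite table as a stand-in for a real database; the items table, its columns, and the use of the raw id as the cursor are assumptions for illustration (a production cursor would normally be opaque and encoded).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 501)],
)

def page_by_offset(offset: int, page_size: int) -> list:
    # The database walks past `offset` rows before returning anything.
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

def page_by_cursor(after_id: int, page_size: int) -> tuple:
    # The primary-key index lets the database seek straight to id > after_id.
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    # A full page may have more rows after it; a short page is the last one.
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor

# Walk the whole table with cursors, starting before the first id.
cursor, seen = 0, 0
while cursor is not None:
    rows, cursor = page_by_cursor(cursor, 100)
    seen += len(rows)
print(seen)
```

When the row count is an exact multiple of the page size, this version issues one extra request that returns an empty page; fetching page_size + 1 rows is a common way to avoid that.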

Third, the count problem. Many APIs return a total count alongside the data array, partly because customers ask for it and partly because early offset-based implementations got it almost for free on small tables. The count query is often much more expensive than the data query (it scans the entire result set rather than the first page), and at large data volumes the count can make the response 5-10x slower. The discipline that scales is to make the count optional and discouraged: a separate count endpoint or an opt-in query parameter, with a warning that the count is expensive.
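An opt-in count can be as simple as a flag that gates the second query. In this sketch the events table, the include_total parameter, and the total_count field are hypothetical names, and SQLite stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO events (kind) VALUES (?)",
    [("click" if i % 2 else "view",) for i in range(1000)],
)

def list_events(kind: str, page_size: int = 25, include_total: bool = False) -> dict:
    """Return one page of events; run the COUNT query only when asked."""
    rows = conn.execute(
        "SELECT id, kind FROM events WHERE kind = ? ORDER BY id LIMIT ?",
        (kind, page_size),
    ).fetchall()
    body = {"data": [{"id": r[0], "kind": r[1]} for r in rows]}
    if include_total:
        # This scans every matching row, not just the first page.
        body["total_count"] = conn.execute(
            "SELECT COUNT(*) FROM events WHERE kind = ?", (kind,)
        ).fetchone()[0]
    return body
```

Leaving the field out of the default response, instead of returning null, makes it obvious to integrators that the count has to be requested.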

Per-endpoint variation

The page-size limits should not be uniform across all endpoints. Endpoints that return small, simple objects (a list of tags, a list of region IDs) can have larger page sizes than endpoints that return large complex objects (full invoice records with line items, webhook delivery records with full payloads). The differentiation reflects the actual cost of serving the response.

The trade-off is API surface complexity. If every endpoint has different limits, customers have to discover and remember the per-endpoint values. The pattern that works is to have two or three tiers (small-object endpoints with maximum 500, normal endpoints with maximum 100, large-object endpoints with maximum 25) and to document each endpoint's tier rather than each endpoint's specific limits.
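A tier table keeps the per-endpoint variation small enough to document. The tier maximums below are the ones from this post; the endpoint paths are made-up examples, and defaulting unlisted endpoints to the normal tier is an assumption.

```python
from enum import Enum

class Tier(Enum):
    """Page-size tiers; each value is that tier's maximum page size."""
    SMALL_OBJECT = 500
    NORMAL = 100
    LARGE_OBJECT = 25

# Hypothetical endpoint paths, mapped to tiers rather than to raw numbers.
ENDPOINT_TIERS = {
    "/tags": Tier.SMALL_OBJECT,
    "/regions": Tier.SMALL_OBJECT,
    "/invoices": Tier.LARGE_OBJECT,
    "/webhook-deliveries": Tier.LARGE_OBJECT,
}

def max_page_size(path: str) -> int:
    # Endpoints without an explicit entry fall back to the normal tier.
    return ENDPOINT_TIERS.get(path, Tier.NORMAL).value

print(max_page_size("/tags"), max_page_size("/customers"), max_page_size("/invoices"))
```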

Bulk export endpoints

For customers with genuine bulk-export use cases, the answer is usually not to raise the pagination maximum but to provide a separate export endpoint with different semantics: async, with the customer providing a callback URL or polling for completion, returning a downloadable file in CSV or JSONL format rather than paginated JSON.

The export endpoint is a different product feature with different cost characteristics. A 10-million-row export should not happen through the paginated API at all; it should be a background job that produces a file, with the customer downloading the file when ready. The decoupling from the paginated API surface lets each surface optimize for its actual use case.
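The shape of such an export is roughly: create a job, let a background worker stream rows into a file, and let the customer poll for status. The sketch below keeps jobs in a dictionary and writes JSONL to a temporary directory; the function names, the job fields, and the in-memory store are all assumptions, and a real system would use a queue and durable storage.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Iterable

EXPORT_DIR = Path(tempfile.mkdtemp(prefix="exports-"))
JOBS: dict = {}  # job_id -> job record; in-memory for illustration only

def create_export() -> str:
    """Register a job and return its id immediately (the API response)."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending", "file": None, "rows": 0}
    return job_id

def run_export(job_id: str, rows: Iterable[dict]) -> None:
    """Background worker: stream rows to a JSONL file, one object per line."""
    path = EXPORT_DIR / f"{job_id}.jsonl"
    count = 0
    with path.open("w", encoding="utf-8") as out:
        for row in rows:
            out.write(json.dumps(row) + "\n")
            count += 1
    JOBS[job_id].update(status="complete", file=str(path), rows=count)

def get_export(job_id: str) -> dict:
    """Polling endpoint: report status and, once complete, the file location."""
    return JOBS[job_id]

job = create_export()
run_export(job, ({"id": i} for i in range(3)))  # normally run by a worker
print(get_export(job)["status"])
```

Because the worker consumes an iterable, rows can be streamed from the database without holding the whole export in memory.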

Rate limiting interaction

Page size and rate limits interact in subtle ways. If the rate limit is requests-per-minute, a higher maximum page size lets customers cover a larger result set with fewer requests, effectively increasing the data throughput per minute. If the rate limit is items-per-minute, page size does not affect data throughput but affects the number of HTTP requests, which has its own costs.

The pattern that aligns customer incentives with operational cost is to rate limit on items-per-minute for endpoints with high per-row cost (so customers cannot abuse large page sizes to extract more data than the rate limit intends), and on requests-per-minute for endpoints with low per-row cost (so the rate limit reflects the actual request overhead).
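An items-per-minute limit is a token bucket where each request is charged its page size instead of a flat one. This is a minimal single-process sketch: ItemBudget is a hypothetical class, charging by requested page size (not rows actually returned) is a simplifying assumption, and the clock is injectable only to make the behavior easy to test.

```python
import time

class ItemBudget:
    """Token bucket measured in items rather than requests."""

    def __init__(self, items_per_minute: int, clock=time.monotonic):
        self.capacity = float(items_per_minute)
        self.refill_per_second = items_per_minute / 60.0
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def allow(self, page_size: int) -> bool:
        """Charge page_size items; return False if the budget cannot cover it."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if page_size > self.tokens:
            return False
        self.tokens -= page_size
        return True

budget = ItemBudget(items_per_minute=1000)
print(budget.allow(500), budget.allow(500), budget.allow(100))
```

With this scheme ten requests of 100 items and one request of 1000 items cost the same, which is the property that stops large page sizes from bypassing the limit.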

The 25-100-or-error decision

The decision tree that captures most of the reasoning is: default page size 25 for endpoints used in dashboards and similar interactive contexts; maximum page size 100 for endpoints with normal-sized objects; maximum 25 for endpoints with large objects; return a 400 with an explicit error for requests beyond the maximum. Variants of this pattern show up in most B2B SaaS APIs because they balance the cost asymmetries reasonably well.

The decision tree fails in a few common cases. APIs intended for analytics or data export should use higher maximums and probably async export endpoints. APIs intended for embedded UIs (a Slack-app component, a Notion block) should use lower defaults to avoid hitting message size limits. APIs with deeply nested response objects (full transaction history, complete user profile) should use page sizes much smaller than 25 to avoid serialization cost dominating the response.

The deeper observation

Page-size limits look like a small implementation detail and turn out to be one of the more permanent decisions in an API. Once customers have integrated against your limits, raising the maximum is generally safe but lowering it is a breaking change. The asymmetry means you should err on the side of conservative initial limits and raise them when the use case is clear, rather than setting permissive limits and discovering the cost too late. The decision is one of the small set of API choices where customer-facing constraints and operational constraints reinforce each other rather than trade off against each other, and that combination is worth designing carefully early.
