Designing API List Endpoints Without Total Counts: Why Saying How Many Is Harder Than It Looks
Most list endpoints include a total count by default. The cost of computing that count grows with the result set, the value to customers is usually small, and the visible-correctness expectation that it creates is hard to satisfy. The right default is no count.
Customer integrations that page through list endpoints almost always want to know how many items they are working with. The natural API design move is to include a total count in every response — total: 12847 alongside the page of results — so the client can show a page-N-of-M widget, plan how many requests are left, or just report progress. The cost of providing that count looks like a single SELECT COUNT(*), which feels free.
The count is not free. Its cost grows with the size of the result set, its value to customers is smaller than it looks, and the expectation of visible correctness that it creates is hard to satisfy when the underlying data is changing. The right default for B2B SaaS list endpoints is no total count.
Why COUNT(*) grows with the result set
The naive implementation runs a COUNT query alongside the page query. For an endpoint like GET /v1/orders?status=pending, the page query is SELECT * FROM orders WHERE status = 'pending' ORDER BY id LIMIT 50 and the count query is SELECT COUNT(*) FROM orders WHERE status = 'pending'. With an index on status, the page query scans a tiny range — the first fifty matching rows. The count query scans every matching row to count them.
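The two queries can be put side by side in a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory table; the schema, the index name, and the row counts are assumptions made for illustration, not details of any particular production system.

```python
import sqlite3

# Hypothetical schema: an in-memory orders table with an index on status.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(i, "pending" if i % 2 == 0 else "shipped") for i in range(1, 10001)],
)

# Page query: LIMIT lets the database stop after fifty matching rows.
page = conn.execute(
    "SELECT id, status FROM orders WHERE status = ? ORDER BY id LIMIT 50",
    ("pending",),
).fetchall()

# Count query: no LIMIT, so every matching index entry has to be visited.
(total,) = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = ?", ("pending",)
).fetchone()
```

The page query returns after fifty rows no matter how many rows match, while the work done by the count query scales with the number of matches.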
For most workloads, this is fine. A status filter that matches a thousand rows typically returns its count in single-digit milliseconds. The problem appears at the extremes: a filter that matches ten million rows can take seconds to count even with an index, because the database still has to visit every matching index entry. The page query stays fast because LIMIT bounds its work; the count query has no LIMIT.
The irony is that the queries where customers most want to know the count are exactly the queries where the count is most expensive. Small result sets are cheap to count, but customers already know they are small without being told. Huge result sets are expensive to count, and those are the ones whose magnitude customers most want to know.
Why the value to customers is smaller than it looks
The dominant use of the total count in customer integrations is showing a page-N-of-M widget in some UI. The widget is useful when M is small enough that the human user is going to page through it. For M in the dozens or low hundreds, the widget is informative. For M in the thousands, the human user is not going to page through the result set sequentially — they are going to filter further, search, or skip to a specific resource by ID.
The second use is progress reporting in batch processing. The integration is processing the entire result set, wants to show a progress bar, and needs to know the total. This use case is real but narrow: most integrations that process the entire result set could equally well report progress as "N items processed" rather than "N of M items processed".
The third use is capacity planning — the customer wants to know how big the result set is before they start processing it, so they can decide whether to process it incrementally over multiple days or in a single batch. This use case is also narrow and is better served by a dedicated count endpoint that the customer hits once before starting, rather than by every page response carrying a count.
What replaces the total count
The pattern that handles the actual customer needs has three parts. First, every list response carries a continuation signal: a next_cursor field that is present when there are more results and absent when there are not. A has_more boolean is the simplest version of this signal; an opaque cursor string is the production version, which survives mutations and allows iteration to resume midway.
Second, an opt-in include_count parameter that, when set, runs the COUNT query and includes the total. The customer who needs the total can ask for it; the default response does not pay the cost. The parameter also creates an explicit decision point where the customer is choosing to accept the latency cost of the count.
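A minimal sketch of the opt-in behavior, assuming a hypothetical list_orders handler backed by sqlite3 and a response with "data" and "total" keys (the handler name and response shape are illustrative, not taken from a real API):

```python
import sqlite3

def list_orders(conn, status, limit=50, include_count=False):
    """Return one page; run COUNT(*) only when the caller opts in."""
    rows = conn.execute(
        "SELECT id, status FROM orders WHERE status = ? ORDER BY id LIMIT ?",
        (status, limit),
    ).fetchall()
    response = {"data": [{"id": r[0], "status": r[1]} for r in rows]}
    if include_count:
        # The caller has explicitly accepted the extra latency of the count.
        (total,) = conn.execute(
            "SELECT COUNT(*) FROM orders WHERE status = ?", (status,)
        ).fetchone()
        response["total"] = total
    return response

# Hypothetical data set used only to exercise the handler.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",)] * 120)

default_response = list_orders(conn, "pending")
counted_response = list_orders(conn, "pending", include_count=True)
```

The default response never touches the count query, so only callers who pass include_count pay for it.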
Third, a separate GET /v1/orders/count endpoint that takes the same filter parameters as the list endpoint and returns just the count. This serves the capacity-planning use case and the count-without-page use case better than embedding the count in list responses.
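One way to keep the list and count endpoints in agreement is to build both queries from the same filter logic. The sketch below assumes hypothetical filter columns (status and customer_id) and hypothetical function names; it is an illustration of the idea rather than a definitive implementation.

```python
import sqlite3

ALLOWED_FILTERS = ("status", "customer_id")  # hypothetical filterable columns

def build_where(filters):
    """Turn allow-listed filter params into a WHERE clause and bind values."""
    clauses, args = [], []
    for column in ALLOWED_FILTERS:
        if column in filters:
            clauses.append(f"{column} = ?")
            args.append(filters[column])
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return where, args

def list_orders(conn, filters, limit=50):
    """Stand-in for GET /v1/orders: one page of ids, no count."""
    where, args = build_where(filters)
    sql = f"SELECT id FROM orders{where} ORDER BY id LIMIT ?"
    return [row[0] for row in conn.execute(sql, args + [limit])]

def count_orders(conn, filters):
    """Stand-in for GET /v1/orders/count: same filters, count only."""
    where, args = build_where(filters)
    (total,) = conn.execute(f"SELECT COUNT(*) FROM orders{where}", args).fetchone()
    return {"count": total}

# Hypothetical data: 200 rows, every fourth one pending.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (status, customer_id) VALUES (?, ?)",
    [("pending" if i % 4 == 0 else "shipped", i % 5) for i in range(200)],
)
```

Because both functions share build_where, a filter that is valid on the list endpoint is valid on the count endpoint and selects the same rows.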
The has_more trick
The simplest way to determine has_more without running a COUNT query is the limit+1 trick. If the customer requested fifty results, the page query is LIMIT 51. If the result contains fifty-one rows, the response returns the first fifty plus has_more: true. If it contains fewer than fifty-one rows, has_more: false. The cost is one extra row of data per request, which is negligible.
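The limit+1 trick can be sketched in a few lines. The fetch_page function and the table layout are assumptions for illustration; the point is only that the query asks for one more row than the page size and discards it before responding.

```python
import sqlite3

def fetch_page(conn, status, limit=50):
    """Fetch limit + 1 rows; the extra row only signals that more exist."""
    rows = conn.execute(
        "SELECT id FROM orders WHERE status = ? ORDER BY id LIMIT ?",
        (status, limit + 1),
    ).fetchall()
    has_more = len(rows) > limit
    # Drop the probe row so the caller sees at most `limit` items.
    return {"data": [r[0] for r in rows[:limit]], "has_more": has_more}

# Hypothetical data: exactly 51 pending orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",)] * 51)

first = fetch_page(conn, "pending", limit=50)
everything = fetch_page(conn, "pending", limit=51)
```

With 51 matching rows, a page size of fifty reports has_more as true, and a page size of fifty-one reports it as false.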
The trick generalizes to cursor-based pagination. The cursor encodes the position after the last returned row, and the next page query is the same shape with the cursor predicate added. The limit+1 trick continues to determine has_more cheaply.
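A sketch of the cursor variant, under several assumptions: the cursor is a base64-encoded JSON object holding the last returned id, ids are positive integers, and the function names are hypothetical. A production cursor would typically also be signed or versioned, which is omitted here.

```python
import base64
import json
import sqlite3

def encode_cursor(last_id):
    """Pack the last returned id into an opaque string."""
    return base64.urlsafe_b64encode(json.dumps({"after_id": last_id}).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["after_id"]

def fetch_page(conn, status, limit=50, cursor=None):
    # Assumes ids are positive, so 0 means "start from the beginning".
    after_id = decode_cursor(cursor) if cursor else 0
    rows = conn.execute(
        "SELECT id FROM orders WHERE status = ? AND id > ? ORDER BY id LIMIT ?",
        (status, after_id, limit + 1),  # limit + 1 still detects has_more
    ).fetchall()
    ids = [r[0] for r in rows[:limit]]
    response = {"data": ids, "has_more": len(rows) > limit}
    if response["has_more"]:
        response["next_cursor"] = encode_cursor(ids[-1])
    return response

# Hypothetical data: 120 pending orders with ids 1..120.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",)] * 120)
```

Each page query has the same shape as the first, with only the id predicate changing, so the cost per page does not grow as the client moves deeper into the result set.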
The trick gives a correct answer at the moment of each query, but it makes no promise about data that changes between requests. A row that satisfied the filter when the first page was fetched and was deleted before the second page is fetched is gone from the second page. The customer sees slightly fewer items overall than they would have expected. For most use cases this is acceptable; for cases where it is not, snapshot pagination is the heavyweight alternative.
The customer-facing documentation
The documentation contract has three components. First, the default response shape: no total count, a has_more boolean, and next_cursor when applicable. Second, the include_count opt-in parameter, with an explicit latency warning. Third, the dedicated count endpoint for capacity planning.
The customer-side example code should demonstrate the pattern. The dominant example is a while loop that paginates through all results using has_more and next_cursor without ever requesting a count. The variant example uses include_count for a single-page result-summary use case. The capacity-planning example uses the count endpoint before starting a long-running operation.
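The dominant example might look like the following sketch. The iter_all helper is hypothetical, and fake_fetch_page is an in-memory stand-in for the HTTP call a real client would make; its integer-string cursor is a simplification of an opaque cursor.

```python
def iter_all(fetch_page):
    """Yield every item, following next_cursor until has_more is false."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        if not page.get("has_more"):
            return
        cursor = page["next_cursor"]

# Stand-in for an HTTP call such as GET /v1/orders?status=pending&cursor=...
ITEMS = list(range(1, 121))

def fake_fetch_page(cursor, limit=50):
    start = int(cursor) if cursor else 0
    chunk = ITEMS[start:start + limit]
    has_more = start + limit < len(ITEMS)
    page = {"data": chunk, "has_more": has_more}
    if has_more:
        page["next_cursor"] = str(start + limit)
    return page

processed = 0
for order_id in iter_all(fake_fetch_page):
    processed += 1  # progress is reported as items processed; no total needed
```

The loop never asks for a count: it stops when the server reports has_more as false.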
The pattern that hurts is inconsistent application: a total count on some list endpoints, none on others, and no explicit policy. Customers expect uniformity. Once some endpoints carry total counts, customers come to expect them on every endpoint and react badly when one omits them. The decision is more sustainable as a policy across all list endpoints than as a case-by-case choice.
What changes if you started with total counts
Most APIs that have been shipping for a while have total counts because the natural default at design time was to include them. Removing them later is a customer-facing breaking change: the field disappears from responses, and integrations that depended on it stop working.
The migration path has the same shape as any breaking change. Mark the field deprecated in the documentation. Announce a sunset window. Continue to emit the field for existing integrations. When the sunset window ends, change the default and require include_count for the field to appear.
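The switch at the end of the sunset window can be sketched as a single decision function. The sunset date, the function names, and the response shape below are all assumptions for illustration; a real rollout would more likely key the decision on API version or per-customer flags than on the calendar alone.

```python
from datetime import date

SUNSET = date(2030, 1, 1)  # hypothetical sunset date

def should_emit_total(include_count, today, sunset=SUNSET):
    """Before sunset the total stays on by default; afterwards it is opt-in."""
    if include_count:
        return True
    return today < sunset

def build_response(data, count_rows, include_count=False, today=None):
    """count_rows is a callable so the COUNT query only runs when needed."""
    today = today or date.today()
    response = {"data": data}
    if should_emit_total(include_count, today):
        response["total"] = count_rows()
    return response

before = build_response([1, 2], lambda: 2, today=date(2029, 12, 31))
after = build_response([1, 2], lambda: 2, today=date(2030, 1, 1))
opted_in = build_response([1, 2], lambda: 2, include_count=True, today=date(2030, 1, 1))
```

Passing the count as a callable means the expensive query is skipped entirely once the default flips, rather than being computed and then discarded.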
The deeper observation is that defaults are harder to change than they look. The default response shape of a list endpoint is part of the contract whether the documentation says so or not. The decision to include or exclude a total count from the default is a decision that compounds across the API's lifetime, and getting it right at design time is much cheaper than fixing it later.