Designing API Search Endpoints: Filtering, Sorting, and the Patterns That Survive Real Customer Queries
Search endpoints sit between database query and product feature. The decisions that make them survive contact with real customers are not the ones most API guides emphasize.
Most APIs grow a search endpoint somewhere around the time the first customer asks for a filter the canonical list endpoint does not support. The temptation is to add a query parameter, then another, then a few more, until the endpoint has accumulated fifteen optional parameters with undocumented interactions and no consistent ordering semantics. This post is about the design decisions that prevent that drift, drawn from the search and listing endpoints we operate across DocuMint, CronPing, FlagBit, and WebhookVault.
The list-vs-search distinction
The first decision is whether search is a separate endpoint or an enhancement to the list endpoint. The two patterns have different operational properties. List endpoints are typically cursor-paginated, have stable ordering, and are read-heavy with cacheable responses. Search endpoints typically have less stable ordering (relevance scores change), are more compute-intensive, and benefit from different rate limits and caching strategies.
Our default is to keep list and search separate when the search includes full-text or relevance scoring, and to enhance the list endpoint when the search is just filtering and sorting on indexed columns. The split is operational, not aesthetic: GET /invoices with filters behaves like a database query and benefits from list-endpoint optimizations; GET /invoices/search with a query string behaves like a search engine query and benefits from different infrastructure.
Filter syntax: the small grammar that scales
The most common mistake in search endpoint design is inventing a query DSL when a small grammar would do. The right default for B2B SaaS is field-equals-value filters as separate query parameters, with operators as named suffixes for the small set of cases that need them: status=active, created_after, created_before, amount_gte, amount_lte. This pattern is readable in URLs, easy to document, easy to validate, and trivially translatable to database WHERE clauses.
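As a rough illustration of how small this grammar can stay, the sketch below maps the parameters named above onto a parameterized WHERE clause. The FILTERS table, the FilterError class, and the %s placeholder style (as used by psycopg-style drivers) are assumptions made for the example, not a description of any particular implementation.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical allowlist: query parameter -> (column, SQL operator, value parser).
FILTERS = {
    "status": ("status", "=", str),
    "created_after": ("created_at", ">", datetime.fromisoformat),
    "created_before": ("created_at", "<", datetime.fromisoformat),
    "amount_gte": ("amount", ">=", Decimal),
    "amount_lte": ("amount", "<=", Decimal),
}


class FilterError(ValueError):
    """Raised for an unknown or malformed filter; records the offending field."""

    def __init__(self, field, message):
        super().__init__(f"{field}: {message}")
        self.field = field


def build_where(params):
    """Translate query parameters into (where_clause, values).

    Column names come only from the allowlist; user input is only ever
    passed as a bound value, never interpolated into the SQL text.
    """
    clauses, values = [], []
    for name, raw in params.items():
        if name not in FILTERS:
            raise FilterError(name, "unknown filter")
        column, operator, parse = FILTERS[name]
        try:
            values.append(parse(raw))
        except (ValueError, ArithmeticError) as exc:
            raise FilterError(name, "invalid value") from exc
        clauses.append(f"{column} {operator} %s")
    return " AND ".join(clauses) or "TRUE", values
```

Because every accepted parameter is a row in one table, documenting and validating the endpoint amounts to reading that table, and the field name on FilterError is what a 400 response would point at.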
The trap is when customers want OR-of-fields or grouped predicates: status IN (active, pending) AND (amount > 100 OR has_attachment=true). At this point the URL-parameter approach starts to fail. Three options exist. Option one: support a small set of named multi-value parameters (status[]=active&status[]=pending) for the IN case and stop there. Option two: accept a JSON body on a POST search endpoint with a structured filter object. Option three: invent a query DSL (Salesforce SOQL, GitHub search syntax). The third is justified only at a scale that warrants the documentation burden.
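To make option two concrete, here is a minimal sketch of a structured filter object compiled into SQL. The object shape ("and"/"or" lists, leaves with "field", "op", "value"), the allowlists, and the depth limit are all hypothetical choices for the example; the point is only that a nested predicate stays easy to validate when it arrives as JSON.

```python
ALLOWED_FIELDS = {"status", "amount", "has_attachment"}  # hypothetical allowlist
OPERATORS = {"eq": "=", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}
MAX_DEPTH = 3  # assumed cap so a request cannot nest predicates without bound


def compile_filter(node, depth=0):
    """Compile a nested filter object into (sql, params)."""
    if depth > MAX_DEPTH:
        raise ValueError("filter nested too deeply")

    # Group node: {"and": [...]} or {"or": [...]}
    if "and" in node or "or" in node:
        keyword = "and" if "and" in node else "or"
        children = node[keyword]
        if not children:
            raise ValueError(f"empty {keyword} group")
        parts, params = [], []
        for child in children:
            sql, child_params = compile_filter(child, depth + 1)
            parts.append(sql)
            params.extend(child_params)
        return "(" + f" {keyword.upper()} ".join(parts) + ")", params

    # Leaf node: {"field": ..., "op": ..., "value": ...}
    field, op, value = node["field"], node["op"], node["value"]
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    if op == "in":
        if not isinstance(value, list) or not value:
            raise ValueError("'in' requires a non-empty list")
        placeholders = ", ".join(["%s"] * len(value))
        return f"{field} IN ({placeholders})", list(value)
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {op}")
    return f"{field} {OPERATORS[op]} %s", [value]


# The grouped predicate from the text, expressed as a request body.
body = {
    "and": [
        {"field": "status", "op": "in", "value": ["active", "pending"]},
        {"or": [
            {"field": "amount", "op": "gt", "value": 100},
            {"field": "has_attachment", "op": "eq", "value": True},
        ]},
    ]
}
sql, params = compile_filter(body)
```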
Sorting: stable ordering is non-negotiable
The most common search-endpoint bug is non-deterministic ordering causing the same query to return different results on different requests. The fix is to always include a tiebreaker. If the API documents sort_by=created_at, the actual SQL ordering must be ORDER BY created_at, id. Without the tiebreaker, ties produce arbitrary database-internal orderings that change as the table is updated.
The same discipline matters for relevance-based search: the primary sort is relevance score, but the tiebreaker should be a stable column like ID. Otherwise customers see results jumping around between paginated requests, especially for queries with many low-relevance matches.
The sort parameter should be a small documented allowlist, not a generic ORDER BY pass-through. Allowing arbitrary column sorts opens a whole class of bugs where the customer-requested sort triggers a full table scan because there is no index on the column they chose. Our default is to expose three or four sort options per endpoint, each backed by a covering index.
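A minimal sketch of the allowlist plus tiebreaker, assuming a hypothetical invoices table with an id primary key. The option names are illustrative; in practice each entry would correspond to an index that starts with the sort column and ends with id.

```python
# Hypothetical allowlist: public sort name -> indexed column.
SORT_OPTIONS = {
    "created_at": "created_at",
    "amount": "amount",
    "due_date": "due_date",
}


def build_order_by(sort_by="created_at", direction="asc"):
    """Return an ORDER BY clause that always ends with the id tiebreaker."""
    if sort_by not in SORT_OPTIONS:
        raise ValueError(f"unsupported sort_by: {sort_by}")
    if direction not in ("asc", "desc"):
        raise ValueError(f"unsupported direction: {direction}")
    column = SORT_OPTIONS[sort_by]
    keyword = direction.upper()
    # The tiebreaker follows the same direction so a single composite
    # index on (column, id) can serve the ordering in either direction.
    return f"ORDER BY {column} {keyword}, id {keyword}"
```

The customer only ever supplies a key into the allowlist, so no request can reach a column that lacks an index.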
Pagination on search results
Cursor-based pagination works well for filtered lists but has subtleties for search. The cursor needs to encode enough state to resume the query: the last result's sort key, the tiebreaker that keeps ordering stable, and the filter parameters (or a hash of them) so the cursor is valid only within the filter scope that produced it. Opaque base64-encoded cursors discourage customers from hand-crafting cursors and decouple the API surface from the internal cursor encoding, which lets you change the encoding later without breaking customer code. Base64 is an encoding, not a protection: if a forged cursor could do harm, sign it or validate its contents on decode.
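One way such a cursor might be put together is sketched below. The payload keys, the 16-character fingerprint, and the function names are assumptions for illustration; the sketch does not sign the cursor, so it shows the scoping and opacity described above rather than tamper resistance.

```python
import base64
import hashlib
import json


def filter_fingerprint(filters, sort_by):
    """Short stable hash of the query the cursor belongs to."""
    canonical = json.dumps({"filters": filters, "sort": sort_by}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def encode_cursor(last_sort_value, last_id, filters, sort_by):
    """Pack the resume point and the query fingerprint into an opaque token."""
    payload = {
        "v": last_sort_value,  # sort key of the last row returned
        "id": last_id,         # tiebreaker of the last row returned
        "f": filter_fingerprint(filters, sort_by),
    }
    raw = json.dumps(payload, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor, filters, sort_by):
    """Return (last_sort_value, last_id), rejecting cursors from other queries."""
    try:
        payload = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    except ValueError as exc:  # covers bad base64, bad JSON, bad UTF-8
        raise ValueError("malformed cursor") from exc
    if not isinstance(payload, dict) or any(
        key not in payload for key in ("v", "id", "f")
    ):
        raise ValueError("malformed cursor")
    if payload["f"] != filter_fingerprint(filters, sort_by):
        raise ValueError("cursor does not match this query")
    return payload["v"], payload["id"]
```

The decoded pair would feed a keyset predicate such as (created_at, id) > (%s, %s), which is what lets the next page resume exactly after the last row.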
For relevance-ranked search, cursor pagination is harder because the relevance scores can shift slightly between requests as the index updates. Two patterns work. Pattern one: snapshot the search results at the time of the first request and paginate through that snapshot for some bounded TTL. Pattern two: use offset pagination for search results and accept the minor inconsistency, on the grounds that paginating search results beyond page 10 is rare in practice, so the cost of deep offsets stays bounded.
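Pattern one can be sketched in a few lines. The SnapshotStore class, its in-process dictionary, and the 300-second default are assumptions for the example; a multi-instance deployment would need a shared store instead, and the clock is injectable only so the expiry logic can be tested.

```python
import time
import uuid


class SnapshotStore:
    """Holds the ranked result IDs of a search for a bounded time."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._snapshots = {}  # snapshot_id -> (expires_at, ranked_ids)

    def create(self, ranked_ids):
        """Freeze the ranking from the first request and return a handle."""
        snapshot_id = uuid.uuid4().hex
        expires_at = self._clock() + self._ttl
        self._snapshots[snapshot_id] = (expires_at, list(ranked_ids))
        return snapshot_id

    def page(self, snapshot_id, offset, limit):
        """Return one page of IDs from the frozen ranking."""
        entry = self._snapshots.get(snapshot_id)
        if entry is None or entry[0] <= self._clock():
            self._snapshots.pop(snapshot_id, None)
            raise KeyError("snapshot expired or unknown")
        return entry[1][offset:offset + limit]
```

Storing only IDs keeps the snapshot small; each page then loads the current rows by ID, so the order is frozen while the row contents stay fresh.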
The full-text question
Most APIs start with substring search (LIKE '%query%' or ILIKE) and accumulate problems. Substring search does not handle stemming, has no relevance ranking, scales poorly past a few hundred thousand rows, and confuses customers when their query for "running" does not match documents containing "runs."
The intermediate step is Postgres tsvector with GIN indexes, which handles stemming, weights, and phrase search up to about 10 million documents. The Postgres FTS approach has the operational advantage that it lives in the same database as the canonical data, so the search index is always consistent with the data it indexes. The trade-off is that you give up some of the more elaborate features of dedicated search engines: custom analyzers, fuzzy matching with edit distance, faceted search with pre-computed aggregations.
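For readers who have not used Postgres full-text search, the query shape looks roughly like the sketch below. The documents table, its title column, and a precomputed search_vector column with a GIN index are assumptions for the example; websearch_to_tsquery, ts_rank, and the @@ match operator are standard Postgres (websearch_to_tsquery requires version 11 or later).

```python
MAX_LIMIT = 100  # assumed cap on page size


def build_fts_query(query_text, limit=20):
    """Return (sql, params) for a ranked full-text search.

    Assumes a hypothetical table documents(id, title, search_vector tsvector)
    with a GIN index on search_vector.
    """
    sql = (
        "SELECT id, title, ts_rank(search_vector, q) AS rank "
        "FROM documents, websearch_to_tsquery('english', %s) AS q "
        "WHERE search_vector @@ q "   # served by the GIN index
        "ORDER BY rank DESC, id "     # id keeps equal-rank rows in a stable order
        "LIMIT %s"
    )
    return sql, [query_text, min(limit, MAX_LIMIT)]
```

The 'english' configuration is what applies stemming, so "running" and "runs" reduce to the same lexeme and match each other.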
The next step is Elasticsearch or Meilisearch, but the cost is real: an extra system to operate, a synchronization story between the canonical database and the search index, and a different failure mode where the search index can be stale or inconsistent. The right time to make the jump is when full-text search becomes the primary product feature rather than a customer-support tool.
Facets and aggregations
Faceted search (counts by category alongside the result list) is one of the highest-value features for B2B SaaS customers who use the API for product UIs. The naive implementation runs a separate aggregation query per facet, which scales poorly. The better approach is a single query that returns both the result list and the facet counts using window functions or LATERAL joins, or (at scale) a search-engine aggregation feature.
The API surface for facets is typically a facets[] parameter listing which facets to compute, with the response including a facets object alongside the data array. The facets are typically counts; expanding to percentiles or other aggregations is rarely worth the API-surface complexity for the use cases.
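The request and response shape might look like the following sketch. The FACETABLE allowlist and the search_response function are hypothetical, and the counts are computed in memory over the full matching set purely to keep the example self-contained; a real endpoint would push the aggregation into the database or search engine.

```python
from collections import Counter

FACETABLE = {"status", "currency"}  # hypothetical allowlist of facet fields


def search_response(matching_rows, requested_facets, limit=2):
    """Build a response with a page of data plus counts per requested facet.

    Counts cover every matching row, not just the page being returned,
    so the facet totals stay the same as the customer pages through.
    """
    unknown = set(requested_facets) - FACETABLE
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    facets = {
        name: dict(Counter(row[name] for row in matching_rows))
        for name in requested_facets
    }
    return {"data": matching_rows[:limit], "facets": facets}


rows = [
    {"id": 1, "status": "active", "currency": "USD"},
    {"id": 2, "status": "paid", "currency": "USD"},
    {"id": 3, "status": "active", "currency": "EUR"},
]
# Corresponds to a request carrying facets[]=status
response = search_response(rows, ["status"])
```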
Caching search responses
Search responses are harder to cache than list responses because the query space is larger. Three patterns work. Pattern one: cache at the application level keyed on (filter parameters, sort, page, tenant), with short TTLs of 30-60 seconds and explicit purge on writes to the indexed data. Pattern two: cache only the expensive sub-queries (facet aggregations) and recombine with fresh result lists. Pattern three: stale-while-revalidate at the HTTP layer for queries with stable result sets.
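For pattern one, most of the work is in building a key that treats equivalent queries as identical. A minimal sketch, with the key format and function name as assumptions:

```python
import hashlib
import json


def cache_key(tenant_id, filters, sort_by, page_token):
    """Build a cache key from (filter parameters, sort, page, tenant).

    The parameters are serialized with sorted keys so the same query maps
    to the same key regardless of the order its parameters arrived in.
    """
    canonical = json.dumps(
        {"filters": filters, "sort": sort_by, "page": page_token},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode()).hexdigest()
    # The tenant stays in the clear as a prefix so entries can be purged
    # per tenant when that tenant's indexed data is written.
    return f"search:{tenant_id}:{digest}"
```

Keeping the tenant outside the hash also makes it harder to serve one tenant's cached results to another by accident, since the prefix can be checked independently of the digest.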
The deeper question is whether to cache at all. Search endpoints with high-cardinality query spaces and rapidly changing data often have cache hit rates below 5%, at which point the caching infrastructure costs more than it saves.
The five tests
Five test cases catch most search-endpoint bugs. One: pagination through the full result set returns each row exactly once with no gaps and no duplicates. Two: the same query at two different times (within a no-write window) returns the same results in the same order. Three: an empty result set returns a well-formed response envelope with an empty array, not a null or a missing field. Four: an invalid filter parameter returns a 400 with a specific error pointing at the field. Five: a valid filter that matches no rows returns 200 with empty data, not 404.
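The first two tests are the ones most worth automating, and they can be written against any function that fetches a page. The sketch below assumes a hypothetical fetch_page(cursor) callable returning (rows, next_cursor), with None marking the last page; make_fake_endpoint is an in-memory stand-in so the checks can run without a server.

```python
def collect_all(fetch_page):
    """Follow cursors to the end and return every row ID in the order seen."""
    seen, cursor = [], None
    while True:
        rows, cursor = fetch_page(cursor)
        seen.extend(row["id"] for row in rows)
        if cursor is None:
            return seen


def check_exactly_once(fetch_page, expected_ids):
    """Test one: no gaps and no duplicates across the full result set."""
    seen = collect_all(fetch_page)
    assert len(seen) == len(set(seen)), "duplicate rows across pages"
    assert set(seen) == set(expected_ids), "rows missing or unexpected"


def check_stable_order(fetch_page):
    """Test two: two full passes with no writes in between agree exactly."""
    assert collect_all(fetch_page) == collect_all(fetch_page), "order changed"


def make_fake_endpoint(rows, page_size):
    """In-memory stand-in for an endpoint, ordered by (created_at, id)."""
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))

    def fetch_page(cursor):
        start = 0 if cursor is None else cursor
        page = ordered[start:start + page_size]
        end = start + len(page)
        return page, (end if end < len(ordered) else None)

    return fetch_page


# Rows with tied created_at values, the case the tiebreaker exists for.
rows = [{"id": i, "created_at": "2024-01-01"} for i in (3, 1, 2, 5, 4)]
endpoint = make_fake_endpoint(rows, page_size=2)
check_exactly_once(endpoint, expected_ids=[1, 2, 3, 4, 5])
check_stable_order(endpoint)
```

Seeding the fixture with tied sort values matters: without ties, an endpoint that omits the tiebreaker passes both checks.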
The deeper observation
Search endpoints are the API surface where the operational characteristics of the underlying data store leak through most prominently. The customer sees a unified search experience; the API must translate that experience into the small set of operations the database can actually do efficiently. The design decisions are mostly about which capabilities to expose and which to defer, and the products that handle search well tend to defer aggressively at first and add capabilities only as customer use cases compound. The temptation to expose every database capability through query parameters produces APIs that are powerful for the half a percent of customers who use them and confusing for everyone else.