Designing API Resource Filtering: How AND, OR, and Comparison Operators Compose Into a Useful Grammar

Customer filtering needs are usually narrower than they look. A small grammar of field-equals-value plus a handful of comparison operators handles 80-90 percent of B2B SaaS list endpoint use, while still keeping the implementation indexable and the query language documentable in a single page.

List endpoints with no filtering work fine until the dataset grows past a few hundred items, at which point customers start writing client-side filter loops over paginated responses. The right server-side response to this pressure is to let customers express the filter as part of the request. The wrong response is to design a full query language with nested boolean trees and arbitrary expression evaluation, which is what GraphQL filter inputs and some REST conventions end up becoming.

The grammar most B2B SaaS APIs converge on is a small one: per-field equality plus a handful of comparison and set-membership operators. It is documented in roughly a single page, indexable by the database, and survives schema evolution without breaking customer integrations.

The grammar that handles 90 percent of cases

The minimum viable filter grammar is field-equals-value, expressed as query string parameters: GET /projects?status=verified&category=devtools. Multiple parameters compose via AND by default. This is the shape that REST APIs such as Stripe's and GitHub's use for the bulk of their list endpoint filtering.
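As a rough illustration of AND-by-default composition, the sketch below turns a query string into a parameterized WHERE clause. The FILTERABLE set, the function name, and the "?" placeholder style are assumptions made for the example, not part of any particular framework.

```python
from urllib.parse import parse_qsl

# Hypothetical set of fields this endpoint agrees to filter on.
FILTERABLE = {"status", "category"}


def equality_where(query_string: str) -> tuple[str, list[str]]:
    """Build 'a = ? AND b = ?' plus the bound values from a query string."""
    clauses, params = [], []
    for field, value in parse_qsl(query_string, keep_blank_values=True):
        # Field names are interpolated into SQL, so they must be known names.
        if field not in FILTERABLE:
            raise ValueError(f"unknown filter field: {field}")
        clauses.append(f"{field} = ?")
        params.append(value)  # values are always bound, never interpolated
    return " AND ".join(clauses), params


where, params = equality_where("status=verified&category=devtools")
```

Here where is "status = ? AND category = ?" and params carries the two values in request order.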

Comparison operators are the most common extension, written as field_op=value with a small set of operators: created_at_gte, created_at_lte, amount_gt, amount_lt. The suffix-based naming keeps the URL parseable and makes the query plan obvious. The alternative of expressing comparisons through a query language inside a single parameter (filter=created_at gte 2026-01-01) looks cleaner in documentation but is harder to implement, harder to debug, and harder to validate.
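One reason the suffix convention is easy to implement is that the parameter name alone determines the comparison. A minimal sketch, assuming the four suffixes above and a hypothetical split_param helper:

```python
# Hypothetical operator table: parameter suffix -> SQL comparison symbol.
SQL_OPERATORS = {"gte": ">=", "lte": "<=", "gt": ">", "lt": "<"}


def split_param(name: str) -> tuple[str, str]:
    """Split a parameter name into (field, SQL operator).

    A recognized suffix selects a comparison; any other name is equality.
    """
    field, _, suffix = name.rpartition("_")
    if field and suffix in SQL_OPERATORS:
        return field, SQL_OPERATORS[suffix]
    return name, "="


parsed = [split_param(n) for n in ("created_at_gte", "amount_lt", "status")]
```

Note that a field whose own name ends in one of the suffixes would be ambiguous under this scheme, which is one more reason to check the result against a list of known fields.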

Set membership extends naturally: status_in=verified,pending for OR-within-field semantics, tags_all=devtools,api for AND-within-field semantics on array columns. The split between any-of and all-of is worth making explicit in the parameter name because customers often assume the wrong default.
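The difference between the two semantics is easiest to see on a single row. The helpers below are an in-memory sketch with hypothetical names; a real endpoint would push the same logic into the database query.

```python
def parse_csv(raw: str) -> list[str]:
    """Split a comma-separated parameter value, dropping empty items."""
    return [item for item in raw.split(",") if item]


def matches_in(value: str, raw: str) -> bool:
    """any-of: the row's scalar value equals at least one listed value."""
    return value in parse_csv(raw)


def matches_all(values: list[str], raw: str) -> bool:
    """all-of: the row's array contains every listed value."""
    return set(parse_csv(raw)) <= set(values)


any_of = matches_in("pending", "verified,pending")    # status_in
all_of = matches_all(["devtools"], "devtools,api")    # tags_all
```

A row tagged only devtools satisfies an any-of filter on devtools,api but not the all-of filter, which is exactly the case customers tend to get wrong.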

Text search via case-insensitive prefix or substring match earns its own operator: name_starts_with=str or name_contains=stripe. These map cleanly to indexed queries (prefix match to a B-tree index on the lowercased column, substring match to a trigram GIN index), unlike free-form regex, which produces unpredictable performance.
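Both operators can be sketched as LIKE patterns, provided the wildcard characters in user input are escaped first. The example uses an in-memory SQLite table purely so that it runs anywhere; the table, the helper names, and the sample rows are assumptions, and the indexing behavior described above depends on the production database rather than on this code.

```python
import sqlite3


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def starts_with_pattern(term: str) -> str:
    return escape_like(term.lower()) + "%"


def contains_pattern(term: str) -> str:
    return "%" + escape_like(term.lower()) + "%"


QUERY = "SELECT name FROM projects WHERE lower(name) LIKE ? ESCAPE '\\' ORDER BY name"


def demo_connection() -> sqlite3.Connection:
    """Create a throwaway table with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (name TEXT)")
    conn.executemany(
        "INSERT INTO projects VALUES (?)",
        [("Stripe",), ("Strapi",), ("Linear",)],
    )
    return conn


def search(conn: sqlite3.Connection, pattern: str) -> list[str]:
    return [row[0] for row in conn.execute(QUERY, (pattern,))]
```

With the sample rows, search(conn, starts_with_pattern("STR")) matches Strapi and Stripe regardless of the case the customer typed.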

What this grammar deliberately cannot express

The grammar cannot express OR across different fields. status=verified OR category=devtools is not part of the vocabulary, and customers who need it have to make two API calls and union the results client-side. This is a deliberate restriction: cross-field OR queries are much harder to plan and index, and the use cases that genuinely need them are usually better served by full-text search or by a separate analytical query endpoint.
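The client-side workaround is small. A sketch, assuming each result carries a unique id field and that the two lists came from two separate filtered calls:

```python
def union_by_id(*result_sets: list[dict]) -> list[dict]:
    """Merge result lists, keeping the first copy of each id in arrival order."""
    merged: dict = {}
    for results in result_sets:
        for item in results:
            merged.setdefault(item["id"], item)
    return list(merged.values())


# Stand-ins for the responses to ?status=verified and ?category=devtools.
verified = [{"id": 1}, {"id": 2}]
devtools = [{"id": 2}, {"id": 3}]
combined = union_by_id(verified, devtools)
```

The merged list loses any server-side sort order across the two calls, so the client has to re-sort if ordering matters.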

The grammar cannot express nested boolean expressions. (status=verified AND amount_gt=100) OR (status=pending AND created_at_gte=...) is not part of the vocabulary. Customers who need this either compose multiple API calls or have outgrown the list endpoint and should be using a reporting endpoint with a more powerful query language.

The grammar cannot express joins or computed fields. founder.country=US works only if the API explicitly exposes founder_country as a flat filterable field. The alternative of generically supporting dotted-path navigation produces APIs where the filter surface depends on the response surface, which makes versioning much harder.

How to bound the implementation cost

Every filterable field has to be implemented somewhere in the route handler, and the implementation has to be indexed if the field is high-cardinality. The right discipline is a per-resource allowlist of filterable fields and operators, defined in code and enforced at the route boundary. Anything not on the list returns a structured error: {"error": "unknown_filter_field", "field": "founder_country"}.
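A per-resource allowlist might look like the sketch below. The resource name, the fields, and the unsupported_filter_operator error code are hypothetical; only the unknown_filter_field shape comes from the text above.

```python
# Hypothetical allowlist: resource -> field -> permitted operators.
ALLOWLIST = {
    "projects": {
        "status": {"eq", "in"},
        "category": {"eq"},
        "created_at": {"gte", "lte"},
    }
}
KNOWN_OPERATORS = {"gte", "lte", "gt", "lt", "in"}


def validate_filters(resource: str, params: dict[str, str]):
    """Return (filters, None) on success or (None, error) on the first bad parameter."""
    allowed = ALLOWLIST[resource]
    filters = []
    for name, value in params.items():
        field, op = name, "eq"
        if name not in allowed:
            head, _, tail = name.rpartition("_")
            if head and tail in KNOWN_OPERATORS:
                field, op = head, tail
        if field not in allowed:
            return None, {"error": "unknown_filter_field", "field": field}
        if op not in allowed[field]:
            return None, {
                "error": "unsupported_filter_operator",  # assumed error code
                "field": field,
                "operator": op,
            }
        filters.append((field, op, value))
    return filters, None
```

Calling validate_filters("projects", {"founder_country": "US"}) yields the structured error rather than a silently unfiltered list.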

The allowlist becomes the source of truth for filter documentation, which is the second-most-common documentation request after authentication and the easiest to keep accurate when the list is derived from code rather than handwritten.
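Deriving the documentation can be as simple as rendering the allowlist. The sketch below assumes a field-to-operators mapping and an invented plain-text output format:

```python
def document_filters(allowed: dict[str, set[str]]) -> str:
    """Render one line per field listing the parameter names it accepts."""
    lines = []
    for field in sorted(allowed):
        names = [
            field if op == "eq" else f"{field}_{op}"
            for op in sorted(allowed[field])
        ]
        lines.append(f"{field}: {', '.join(names)}")
    return "\n".join(lines)


# Hypothetical allowlist for one resource.
docs = document_filters({"status": {"eq", "in"}, "created_at": {"gte", "lte"}})
```

Because the rendering is sorted, the output is stable across runs and can be compared against the published reference in a build step.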

Pagination has to remain stable under filter changes. A cursor that encodes only (id, sort_value) survives most filter operations transparently because a filter narrows the result set without changing the underlying ordering. A cursor that also encodes filter state has to be re-checked against the request on each page and breaks if the customer changes a filter mid-pagination, which is the wrong tradeoff for the small win of tying a cursor to the filters that produced it.
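A cursor of this kind can be sketched as an opaque encoding of the last row's sort key. The example pages over an in-memory list; the created_at sort field, the base64-JSON encoding, and the function names are assumptions for illustration.

```python
import base64
import json


def encode_cursor(sort_value: str, row_id: int) -> str:
    raw = json.dumps([sort_value, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    sort_value, row_id = json.loads(base64.urlsafe_b64decode(cursor))
    return sort_value, row_id


def page(rows, predicate, cursor=None, limit=2):
    """Return (items, next_cursor); the cursor holds no filter state."""
    after = decode_cursor(cursor) if cursor else None
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    matching = [
        r for r in ordered
        if predicate(r)
        and (after is None or (r["created_at"], r["id"]) > after)
    ]
    items = matching[:limit]
    more = len(matching) > limit
    next_cursor = encode_cursor(items[-1]["created_at"], items[-1]["id"]) if more else None
    return items, next_cursor
```

Since the cursor only records a position in the ordering, a request that reuses it with a narrower filter simply continues from that position over the narrower set.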

Three patterns that fail

The first failure pattern is unbounded filter grammar. APIs that accept arbitrary filter expressions ("filter=field1 eq foo and (field2 gt 10 or field3 like %bar%)") look elegant in documentation, but their implementation cost grows combinatorially with the number of fields and operators, both in route handler complexity and in database query planning. They also produce surprising performance cliffs when customers hit query patterns the index design did not anticipate. Microsoft Graph API and some OData implementations have this shape, and it is a recurring source of customer complaints about unpredictable response times.

The second failure pattern is silent filter ignoring. APIs that accept any query string parameter and silently ignore unknown ones produce integration bugs where customers think they are filtering and are not. The response should explicitly reject unknown filter parameters with a structured error, even when the temptation is to accept-and-ignore for forward compatibility.

The third failure pattern is filter semantics that vary across endpoints. If created_at_gte is inclusive on the projects endpoint but exclusive on the events endpoint, customers will write working code against one endpoint and broken code against the other. Filter operator semantics need to be consistent across the API surface, documented in one place, and tested in CI.
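One way to keep semantics consistent is to route every endpoint through a single comparator table and assert the boundary behavior in a test. The sketch below assumes ISO date strings, which compare correctly as text, and a hypothetical apply_filter function standing in for each endpoint's filter path.

```python
import operator

# Single shared definition of what each suffix means.
COMPARATORS = {
    "gte": operator.ge,  # inclusive
    "lte": operator.le,  # inclusive
    "gt": operator.gt,   # exclusive
    "lt": operator.lt,   # exclusive
}


def apply_filter(rows, field, op, value):
    return [r for r in rows if COMPARATORS[op](r[field], value)]


def boundary_counts(endpoint_filter):
    """Count matches per operator when the filter value equals a row's value."""
    rows = [{"created_at": "2026-01-01"}, {"created_at": "2026-01-02"}]
    boundary = "2026-01-01"
    return {
        op: len(endpoint_filter(rows, "created_at", op, boundary))
        for op in COMPARATORS
    }
```

Running boundary_counts against every endpoint's filter function and requiring identical results catches an endpoint that quietly treats gte as exclusive.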

Our use and the design hindsight

Our Builds directory exposes a small filter surface on the browse endpoint: category, status, sort_by, plus pagination via cursor. It is not yet stress-tested by customer volume but the design has held through several internal iterations because the grammar is small and the implementation is direct: an allowlist in code, indexed columns on the database side, structured errors on unknown parameters. The pattern scales without growing complexity in proportion to feature count, which is the property that matters most as products grow.

The deeper point

Most customer filtering needs are narrower than the temptation to support arbitrary expressions suggests. A small grammar with clearly-bounded semantics handles the common cases, keeps the implementation bounded, and forces the harder questions about analytical workloads to find their right answer (a separate reporting endpoint, a data export, a customer-facing query language) instead of being shoehorned into the list endpoint where they do not belong.

