GraphQL vs REST: An Honest Comparison After Shipping Both

GraphQL gets sold as the modern replacement for REST. The honest comparison is messier — GraphQL solves a real problem at large scale, creates several new problems at any scale, and is mostly the wrong default for the kind of API a small team actually needs to ship.

GraphQL has spent the last decade getting sold as the modern replacement for REST APIs. The pitch is appealing: clients ask for exactly the data they need, the server returns it in one round-trip, schema introspection makes documentation automatic, and the type system catches integration bugs at the boundary. The pitch is also incomplete in ways that matter once you actually run a GraphQL endpoint in production for more than a quarter.

This is the longer version of "should we use GraphQL or REST" written from the perspective of having shipped both. The conclusion is not that one is universally better. The conclusion is that GraphQL solves a specific class of problem that mostly does not apply to small SaaS teams, while creating a different class of problem that small SaaS teams are particularly badly equipped to handle.

The problem GraphQL was designed to solve

Facebook built GraphQL because their mobile clients were paying high latency costs for chained REST calls — the news feed needed user data, post data, comment data, reaction data, and friend-relationship data, and assembling the screen required five sequential round-trips on a flaky cellular network. Letting the client describe its complete data requirement in a single query, then having the server fan out to its internal services and return one composed response, was a real latency win at Facebook's scale and client diversity.

The problem is real. It is also a problem that occurs primarily when (1) you have many client surfaces with divergent data needs, (2) those clients are bandwidth-constrained or latency-constrained, and (3) you have many backend services whose composition the API layer needs to abstract. Small SaaS teams typically have one or two client surfaces, no bandwidth constraints, and one backend.

The N+1 query trap

The most common GraphQL footgun is also the one most teams discover only after shipping. A query like { users { id, posts { id, comments { id } } } } looks elegant. The naive resolver implementation issues one database query for the list of users, then one query per user to fetch that user's posts (N queries for N users), then one query per post to fetch its comments (roughly N×M queries if each user has M posts). This is the classic N+1 problem, multiplied at every level of nesting.
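To make the fan-out concrete, here is a minimal sketch that counts the queries a naive resolver would issue for that nested selection. The in-memory tables, the CountingDB class, and resolve_naively are all hypothetical stand-ins for a real database and resolver layer.

```python
# Hypothetical in-memory "tables"; a real server would hit a database.
USERS = [{"id": 1}, {"id": 2}, {"id": 3}]
POSTS = [
    {"id": 10, "user_id": 1},
    {"id": 11, "user_id": 1},
    {"id": 12, "user_id": 2},
]
COMMENTS = [{"id": 100, "post_id": 10}, {"id": 101, "post_id": 12}]


class CountingDB:
    """Counts every simulated query so the fan-out is visible."""

    def __init__(self):
        self.queries = 0

    def all_users(self):
        self.queries += 1
        return list(USERS)

    def posts_for_user(self, user_id):
        self.queries += 1
        return [p for p in POSTS if p["user_id"] == user_id]

    def comments_for_post(self, post_id):
        self.queries += 1
        return [c for c in COMMENTS if c["post_id"] == post_id]


def resolve_naively(db):
    """Resolve users -> posts -> comments one parent object at a time."""
    result = []
    for user in db.all_users():                      # 1 query
        posts = []
        for post in db.posts_for_user(user["id"]):   # 1 query per user
            comments = db.comments_for_post(post["id"])  # 1 query per post
            posts.append(
                {"id": post["id"], "comments": [{"id": c["id"]} for c in comments]}
            )
        result.append({"id": user["id"], "posts": posts})
    return result


db = CountingDB()
data = resolve_naively(db)
print(db.queries)  # 1 (users) + 3 (one per user) + 3 (one per post) = 7
```

Three users and three posts already cost seven queries. The count grows with the data, not with the query text, which is why the problem tends to surface only under production load.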

The standard fix is DataLoader, a batching library that defers individual lookups, accumulates them within a single tick of the event loop, and issues batched queries. DataLoader, or an equivalent batching layer, is effectively required infrastructure for any non-trivial GraphQL server. It also leaks abstraction into resolvers, fights ORM conventions, complicates pagination semantics, and creates subtle request-scoping bugs when cached objects outlive the request that loaded them. The clean GraphQL pitch hides a meaningful additional layer of resolver-aware data-loading machinery that REST APIs do not need.
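The batching idea can be sketched in a few lines of asyncio. This is not the DataLoader library's API: TinyLoader, batch_posts_by_user, and resolve_posts are hypothetical names, and the sketch omits the per-key caching and error handling a real loader needs.

```python
import asyncio

# Hypothetical data; a real batch function would run one "WHERE user_id IN (...)" query.
POSTS = [
    {"id": 10, "user_id": 1},
    {"id": 11, "user_id": 1},
    {"id": 12, "user_id": 2},
]
QUERY_LOG = []  # one entry per simulated batched query


class TinyLoader:
    """Collects keys requested by sibling resolvers, then loads them in one batch."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async: list of keys -> list of values, same order
        self._pending = []         # (key, future) pairs awaiting dispatch
        self._task = None          # reference to the scheduled dispatch task

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((key, future))
        if self._task is None:
            # The dispatch task starts only after resolvers that are already
            # queued on the event loop have had a chance to add their keys.
            self._task = loop.create_task(self._dispatch())
        return future

    async def _dispatch(self):
        batch, self._pending = self._pending, []
        self._task = None
        values = await self._batch_fn([key for key, _ in batch])
        for (_, future), value in zip(batch, values):
            future.set_result(value)


async def batch_posts_by_user(user_ids):
    QUERY_LOG.append(list(user_ids))
    return [[p for p in POSTS if p["user_id"] == uid] for uid in user_ids]


async def resolve_posts(loader, user_id):
    # Looks like a single lookup from the resolver's point of view.
    return await loader.load(user_id)


async def main():
    loader = TinyLoader(batch_posts_by_user)  # one loader per request
    return await asyncio.gather(*(resolve_posts(loader, uid) for uid in [1, 2, 3]))


posts_per_user = asyncio.run(main())
print(QUERY_LOG)  # [[1, 2, 3]]: three resolver calls, one batched query
```

Creating the loader inside main, once per request, is the request-scoping discipline in miniature: a loader shared across requests would also share whatever it has cached.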

Schema federation and the cost of the gateway

The GraphQL story at scale involves federation — a central gateway that stitches together schemas from multiple backend services and routes resolvers to the right service. Apollo Federation, GraphQL Mesh, Hasura, and StepZen all sell variations of this architecture. Federation lets multiple teams own separate parts of the schema while clients see a unified surface.

The cost is a new piece of infrastructure that has to be maintained, monitored, scaled, and secured. The gateway becomes the slowest path in your system, the highest-traffic service, and the place where tracing has to be the cleanest because every request flows through it. Most small teams never reach the scale where federation pays back the operational tax of running it.

Persisted queries and the open-query problem

An open GraphQL endpoint lets clients send arbitrary queries, including queries that walk the schema graph to an arbitrary depth. This is a denial-of-service primitive. The standard mitigations are query depth limits, complexity scoring, persisted queries (where clients send a hash of a pre-registered query rather than the query text), and rate limiting that accounts for query cost rather than request count.
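The pre-registered-query idea reduces to an allowlist keyed by hash. The sketch below assumes SHA-256 over the raw query text; PersistedQueryRegistry and query_hash are hypothetical names, not any particular server's API.

```python
import hashlib


def query_hash(query_text):
    """Hex SHA-256 of the query text (the hashing scheme here is an assumption)."""
    return hashlib.sha256(query_text.encode("utf-8")).hexdigest()


class PersistedQueryRegistry:
    """Allowlist of queries registered ahead of time, for example at deploy."""

    def __init__(self):
        self._by_hash = {}

    def register(self, query_text):
        digest = query_hash(query_text)
        self._by_hash[digest] = query_text
        return digest

    def lookup(self, digest):
        # Unknown hashes return None, so the caller can reject the request
        # instead of executing free-form query text.
        return self._by_hash.get(digest)


registry = PersistedQueryRegistry()
known = registry.register("{ users { id } }")

print(registry.lookup(known))  # { users { id } }
print(registry.lookup(query_hash("{ users { posts { comments { id } } } }")))  # None
```

The registry itself is trivial. The cost lies in populating it: something has to extract every query the clients can send and register it before those clients ship.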

Each mitigation is real engineering. Persisted queries require build-pipeline integration to extract and register queries at deploy time. Complexity scoring requires assigning a cost to every field and rejecting queries that exceed a budget. Depth limits create false positives when legitimate deep queries get rejected. Taken together, this work rebuilds only part of what a REST gateway gives you for free with rate limiting keyed on path and method.
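Depth limits and cost budgets can be sketched against a simplified query representation. The example assumes the query has already been parsed into nested dicts (field name mapped to its sub-selection); the FIELD_COST table, the additive scoring rule, and the limits are all hypothetical choices.

```python
# Hypothetical per-field costs; anything not listed costs DEFAULT_COST.
FIELD_COST = {"users": 10, "posts": 5, "comments": 5}
DEFAULT_COST = 1


def depth(selection):
    """Nesting depth of a selection; a leaf field (empty dict) adds one level."""
    if not selection:
        return 0
    return 1 + max(depth(child) for child in selection.values())


def cost(selection):
    """Additive cost: each field's own cost plus the cost of its sub-selection."""
    return sum(
        FIELD_COST.get(name, DEFAULT_COST) + cost(child)
        for name, child in selection.items()
    )


def check(selection, max_depth=3, max_cost=20):
    """Return the list of limits the query violates (empty means allowed)."""
    problems = []
    if depth(selection) > max_depth:
        problems.append("too deep")
    if cost(selection) > max_cost:
        problems.append("too expensive")
    return problems


# { users { id posts { id comments { id } } } }
nested = {"users": {"id": {}, "posts": {"id": {}, "comments": {"id": {}}}}}
# { users { id } }
shallow = {"users": {"id": {}}}

print(depth(nested), cost(nested), check(nested))     # 4 23 ['too deep', 'too expensive']
print(depth(shallow), cost(shallow), check(shallow))  # 2 11 []
```

Real scoring schemes usually multiply child costs by an expected list size rather than adding them. Either way, every field needs a number, and someone has to keep those numbers honest as the schema grows.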

Caching, versioning, and the things HTTP gave you

REST APIs inherit HTTP's caching infrastructure: ETag, Cache-Control, conditional requests, CDN caching, browser caching. Each layer is decades-mature, well-instrumented, and configurable per route. GraphQL gets almost none of this for free, because requests are typically POSTs to a single endpoint that differ only in their body, and HTTP caches neither store POST responses by default nor key on the body. You can rebuild it with Apollo's client-side cache, Hasura's caching, or CDN caching keyed on persisted queries, but you are reimplementing what HTTP already did.
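For comparison, the server side of a conditional GET is small. The sketch below is framework-free: respond and make_etag are hypothetical helpers, the ETag is simply a truncated SHA-256 of the body, and If-None-Match is treated as a single value rather than the comma-separated list or wildcard that HTTP allows.

```python
import hashlib


def make_etag(body):
    """Strong ETag derived from the response bytes (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still current: send headers only.
        return 304, headers, b""
    return 200, headers, body


payload = b'{"id": 1, "name": "example"}'

status, headers, body = respond(payload)
print(status)  # 200

# The client replays the ETag it received on its next request.
status_again, _, body_again = respond(payload, if_none_match=headers["ETag"])
print(status_again, body_again)  # 304 b''
```

Browsers, CDNs, and reverse proxies already understand this exchange per URL, which is the piece a single POST endpoint gives up.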

Versioning is the same story. REST APIs version through path or header, with documented deprecation cycles. GraphQL's official position is "schemas evolve through field additions and deprecations, not versions" — which sounds elegant until you need to remove a field that customers depend on. The deprecation tooling exists, but the cultural and operational discipline of a versioned API is harder to achieve in a schema that is officially un-versioned.

When GraphQL is the right answer

The honest case for GraphQL is: many divergent client surfaces, bandwidth-constrained clients, multiple backend services that need composition at the API layer, and a team large enough to absorb the operational cost of DataLoader plus federation plus query-cost limiting plus the bespoke caching layer. Shopify, GitHub, and Facebook all have these conditions and all use GraphQL successfully.

The honest case against GraphQL for most small teams is: one or two client surfaces, no meaningful bandwidth constraint, one backend, and a small team. The benefits do not materialize and the costs do.

Our choice across four products

We ship REST APIs across all four products. DocuMint, CronPing, FlagBit, and WebhookVault all expose JSON-over-HTTP endpoints with explicit versioning, conditional caching where appropriate, and per-route rate limiting. The SDKs we have considered building are thin wrappers over fetch — there is no schema-driven code generation that earns its weight at our scale.

If we ever build a unified API surface that exposes data from all four products to a single client (a customer dashboard, for example), GraphQL becomes a more interesting candidate because the composition problem is real. Until then, REST is the boring correct answer for the same reason SQLite is — it does what we need without bringing infrastructure that solves problems we do not have.

The deeper lesson, applicable beyond this specific choice, is that "modern" is not the same as "appropriate." GraphQL is a genuinely good tool for the problem it was built to solve. It also represents a meaningful step up the complexity curve, and complexity that is not paid back by the problem at hand is just cost.
