Designing API Error Responses: Status Codes, Body Structure, and the Patterns That Earn Trust
Error responses are what customers see when things go wrong, so they carry disproportionate weight in whether customers trust the system. This article covers the patterns that survive contact with real integrations and the ones that produce angry support tickets.
An API's error responses are the part of the contract that gets tested most adversarially. Customers see error responses when they are already frustrated, often under time pressure, and they need to know quickly: did I do something wrong, or did the service do something wrong, or is this a transient condition I should retry? An API that answers these three questions clearly earns trust. An API that requires reading the source code to answer them loses customers.
We've shipped error response designs across DocuMint, CronPing, FlagBit, and WebhookVault, and the patterns that earned their place are different from what the standard advice suggests.
Status codes carry meaning, but not enough
HTTP status codes are the first line of error communication. The five-class distinction (1xx informational, 2xx success, 3xx redirection, 4xx client error, 5xx server error) is universally understood and supported by every HTTP library. Status codes do not need explanation; they communicate at a glance whether the failure was on the caller's side or the service's side.
The status codes that matter most for an API's error responses are a small set: 400 for malformed requests, 401 for missing or invalid authentication, 403 for valid authentication but insufficient authorization, 404 for missing resources, 409 for conflicts, 422 for valid format but invalid content, 429 for rate limiting, 500 for unexpected server errors, 503 for temporary unavailability. Almost every error in a well-designed API maps to one of these.
The 200-with-error-in-body anti-pattern is the most common mistake. Returning 200 for an error response forces every client to ignore the status code and parse the body to check for errors. This is a contract that loses on every dimension: harder to debug, harder to monitor, harder to integrate with retry logic. The fix is to use the right status code even when the body contains additional structured error information.
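When status codes are used correctly, a client can answer the three questions from the introduction before it parses the body at all. The sketch below is one way to do that. The function name `classify` and the choice of which statuses count as retryable are assumptions for illustration, not a rule from any specification.

```python
# Statuses a client can reasonably retry without changing the request.
# This set is an illustrative assumption; tune it for your own API.
RETRYABLE_STATUSES = {429, 503}


def classify(status: int) -> str:
    """Classify a response from its status code alone.

    Returns one of: "ok", "retry", "caller_error", "service_error", "other".
    """
    if 200 <= status < 300:
        return "ok"
    if status in RETRYABLE_STATUSES:
        return "retry"
    if 400 <= status < 500:
        return "caller_error"
    if 500 <= status < 600:
        return "service_error"
    return "other"


if __name__ == "__main__":
    for code in (200, 404, 429, 500, 503):
        print(code, classify(code))
```

An API that returns 200 for errors defeats this kind of check: every response classifies as "ok" and the client has to inspect the body of every call.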
The body structure that scales
The status code answers "what category of failure" but not "what specifically failed and what should I do about it." The body provides the detail. The structure that has worked across multiple products:
{
  "error": {
    "code": "invoice_validation_failed",
    "message": "Total amount must be positive",
    "field": "total",
    "request_id": "req_abc123",
    "doc_url": "https://documint.anethoth.com/docs#errors"
  }
}
The code field is machine-readable. Clients can switch on it for programmatic handling. Codes should be stable strings: never change the code for an existing error, only add new codes for new errors. This is the most-frequently-broken contract in API evolution and the one that hurts customers most.
The message field is human-readable. It should be specific and actionable, not generic ("Bad request") or framework-leaking ("ValidationError on field"). The message is what appears in support tickets, in customer logs, in screenshots sent to support. It earns its place by being clear.
The field field, when present, points to the specific input that failed. This is essential for validation errors and useless for everything else. Don't include it for errors that don't have a single responsible field.
The request_id field is the most underrated part of error responses. When a customer reports a bug, the request_id lets you find the exact log entry on the server side. Without it, debugging customer reports requires searching by timestamp and IP and guesswork. With it, debugging is a database lookup.
The doc_url field is optional and underused. When the error has a non-obvious resolution, linking to the relevant docs section turns a frustration into a self-serve resolution.
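A small server-side helper can keep this envelope consistent across endpoints. The following is a minimal sketch, not the implementation used in any of the products mentioned. The name `build_error` and the way a fallback request ID is generated are assumptions; in a real service the request ID would normally come from the request context so that it matches the server logs.

```python
import uuid
from typing import Optional


def build_error(
    code: str,
    message: str,
    *,
    field: Optional[str] = None,
    doc_url: Optional[str] = None,
    request_id: Optional[str] = None,
) -> dict:
    """Build the error envelope described above.

    `field` and `doc_url` are omitted from the body when not supplied,
    so non-validation errors do not carry an empty `field`.
    """
    error = {
        "code": code,
        "message": message,
        # Assumption: fall back to a random ID if the caller has none.
        "request_id": request_id or f"req_{uuid.uuid4().hex[:12]}",
    }
    if field is not None:
        error["field"] = field
    if doc_url is not None:
        error["doc_url"] = doc_url
    return {"error": error}


if __name__ == "__main__":
    print(build_error(
        "invoice_validation_failed",
        "Total amount must be positive",
        field="total",
        request_id="req_abc123",
    ))
```

Routing every error through one helper like this makes it harder for an individual endpoint to drift from the documented structure.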
Patterns that fail in production
Errors with stack traces in the body. Production should never leak stack traces to customers. They reveal implementation details, expose vulnerabilities, and confuse the customer about what they should do. Log the stack trace server-side, return a generic 500 with a request_id, let support look it up.
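One way to keep the trace on the server and out of the response is sketched below. The function name `handle_unexpected`, the logger name, and the `internal_error` code are hypothetical, and how this is wired into a web framework is left out on purpose.

```python
import logging
from typing import Tuple

logger = logging.getLogger("api.errors")


def handle_unexpected(exc: Exception, request_id: str) -> Tuple[int, dict]:
    """Log the full traceback server-side; return a generic 500 body."""
    # exc_info=exc attaches the traceback to the log record only.
    logger.error("unhandled error request_id=%s", request_id, exc_info=exc)
    body = {
        "error": {
            "code": "internal_error",
            "message": (
                "An unexpected error occurred. "
                "Contact support and include the request ID."
            ),
            "request_id": request_id,
        }
    }
    return 500, body


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    try:
        1 / 0
    except ZeroDivisionError as err:
        print(handle_unexpected(err, "req_abc123"))
```

The response carries nothing about the exception itself; the request ID is the only link between what the customer sees and the log entry support will look up.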
Generic messages with specific codes. "An error occurred" with code invoice_validation_failed is the worst of both worlds: a person reading the body learns nothing about the problem, and the code is the only signal. A specific code does not excuse a vague message. The right pattern is both: a stable code for programs and a specific message for people.
HTML error pages from web servers. When a reverse proxy or load balancer returns an HTML error page (502, 504), API clients parsing JSON crash with a JSON parse error. Configure the proxy to return JSON for API routes. Caddy makes this easy with handle_errors directives.
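Fixing the proxy is the real remedy, but clients can also be written so that an unexpected HTML page does not crash them. The sketch below shows that client-side complement under stated assumptions: the function name `parse_error_body` and the synthetic `non_json_error` code are hypothetical and exist only on the client.

```python
import json


def parse_error_body(status: int, content_type: str, body: str) -> dict:
    """Return the `error` object from a response, tolerating non-JSON bodies.

    Falls back to a synthetic error when the body is HTML, empty,
    or JSON that does not follow the expected envelope.
    """
    if "json" in content_type.lower():
        try:
            parsed = json.loads(body)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            return parsed["error"]
    # Assumption: a client-side placeholder so callers always get a dict.
    return {
        "code": "non_json_error",
        "message": f"HTTP {status} with a body that is not a JSON error",
        "request_id": None,
    }


if __name__ == "__main__":
    print(parse_error_body(502, "text/html", "<html>Bad Gateway</html>"))
```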
Mismatched status codes. Returning 400 for "service unavailable" or 500 for "missing authentication" confuses every consumer. The status code should match the error category and the body code should be a specific instance of that category.
Validation errors and the batched-error pattern
Validation errors are the only error class that often needs to report multiple problems at once. A form submission with three invalid fields should report all three, not just the first. The pattern:
{
  "error": {
    "code": "validation_failed",
    "message": "The request failed validation",
    "request_id": "req_abc123",
    "details": [
      {"field": "email", "code": "invalid_format", "message": "Not a valid email"},
      {"field": "amount", "code": "must_be_positive", "message": "Must be greater than 0"}
    ]
  }
}
Top-level fields stay consistent; the details array carries per-field information. This pattern only applies to validation errors. Other error classes have a single cause and a flat structure.
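The key implementation detail is that the validator collects every failure instead of returning at the first one. Here is a minimal sketch of that idea. The function name `validate_payload` and the two checks are hypothetical, and the email check is deliberately crude; a real service would likely use a schema library.

```python
from typing import Optional


def validate_payload(payload: dict, request_id: str) -> Optional[dict]:
    """Check every field and return one batched error body, or None if valid."""
    details = []

    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        details.append({
            "field": "email",
            "code": "invalid_format",
            "message": "Not a valid email",
        })

    amount = payload.get("amount")
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    if not is_number or amount <= 0:
        details.append({
            "field": "amount",
            "code": "must_be_positive",
            "message": "Must be greater than 0",
        })

    if not details:
        return None
    return {
        "error": {
            "code": "validation_failed",
            "message": "The request failed validation",
            "request_id": request_id,
            "details": details,
        }
    }


if __name__ == "__main__":
    print(validate_payload({"email": "nope", "amount": -5}, "req_abc123"))
```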
Rate limiting and the headers that complete the contract
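429 responses need more than a body. Customers need to know how long to wait. Retry-After is a standard HTTP header; the X-RateLimit-* headers are a widely used convention rather than a formal standard. Together: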
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715515920
Retry-After in seconds is the most actionable. X-RateLimit-* headers provide the broader context for clients that want to back off before hitting the limit. Returning all four is a small cost and a large customer benefit.
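On the client side, these headers translate into a wait time. The sketch below is one possible reading of them, with assumptions labeled: the function name `seconds_to_wait` is hypothetical, X-RateLimit-Reset is treated as a Unix timestamp in seconds (as in the example above, though some APIs use other conventions), and the headers are assumed to arrive in a mapping keyed with this exact capitalization. Retry-After may also be an HTTP date, so both forms are handled.

```python
import time
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def seconds_to_wait(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    default: float = 1.0,
) -> float:
    """Work out how long to pause after a 429, preferring Retry-After."""
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            # Delay given directly in seconds.
            return float(value)
        try:
            # Otherwise try the HTTP-date form.
            when = parsedate_to_datetime(value)
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(0.0, when.timestamp() - now)
        except (TypeError, ValueError):
            pass  # Unparseable value; fall through to the next header.

    # Assumption: reset is a Unix timestamp in seconds.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        return max(0.0, float(reset.strip()) - now)

    return default


if __name__ == "__main__":
    print(seconds_to_wait({"Retry-After": "60", "X-RateLimit-Reset": "1715515920"}))
```

In practice a client would usually add a little random jitter to the returned value so that many clients released by the same limit do not all retry at the same instant.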
The deeper observation
Error responses are the part of an API that absorbs the most customer frustration, which means they are the part where investment compounds most. An API with thoughtful error responses spends less on support, has happier customers, and has a smaller blast radius when bugs do appear. The patterns above are not exotic: they are stable, well-supported by libraries, and require no specialized infrastructure. The discipline is to apply them consistently across every endpoint, document them as part of the API contract, and treat changes to them as breaking changes.