Designing API Error Codes: Stable Strings, Numeric Codes, and the Hierarchies Customers Build Reports From

Most APIs eventually accumulate an error-code taxonomy. The questions are what shape the codes take, how stable they are across versions, and how customers build reports from the resulting data. The choices look minor when you make them and feel locked in once you realize customers are checking error.code === "rate_limit_exceeded" in production code paths that you never hear about through support.

What the error code is for

The error code answers a different question than the HTTP status code. The status code answers "what category of problem" with the small standardized vocabulary of 4xx and 5xx codes. The error code answers "what specific problem" with an application-specific vocabulary that distinguishes between the dozens of validation errors, authentication failures, and resource conflicts that would otherwise share a single 400 or 401 status.

Customers use the error code for three things. The first is conditional logic: retry on some errors, escalate on others, surface a third set to the user. The second is reporting: count occurrences per error type to identify integration problems. The third is documentation lookup: paste the code into the documentation search field and find the specific page describing the error.
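The first two uses can be sketched on the client side as follows. This is a minimal illustration, not any SDK's API: the action names, the code-to-action table, and the fallback to escalation are all assumptions.

```typescript
// Hypothetical client-side handling. The actions and the table below are
// illustrative assumptions, not part of any real SDK.
type Action = "retry" | "escalate" | "show_user";

const ACTION_BY_CODE = new Map<string, Action>([
  ["rate_limit_exceeded", "retry"],
  ["invalid_api_key", "escalate"],
  ["customer_not_found", "show_user"],
]);

// Conditional logic: unknown codes fall back to escalation so that a code
// the client has never seen is still handled.
function actionFor(code: string): Action {
  return ACTION_BY_CODE.get(code) ?? "escalate";
}

// Reporting: count occurrences per error code.
function countByCode(codes: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const code of codes) {
    counts.set(code, (counts.get(code) ?? 0) + 1);
  }
  return counts;
}
```

Both functions key on the exact string, which is why a renamed code silently changes both the branch taken and the report produced.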

The error message field answers the human question. The two fields exist for different audiences with different needs. Code is for machines and dashboards. Message is for humans reading log output during incident response.

The three common shapes

The first shape is stable strings: rate_limit_exceeded, invalid_api_key, customer_not_found. The format is lowercase snake_case, the values are human-readable, and the strings are stable across versions and across language SDKs. Stripe, GitHub, Linear, and most modern B2B SaaS APIs converge on this shape.

The second shape is numeric codes: 1001, 2042, 3007. The format is numeric, often with a category prefix encoded in the leading digits. The shape is common in older APIs and in protocols designed for embedded clients where string parsing is expensive. The shape has poor ergonomics for modern web developers who must look up every number.
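A small sketch of what decoding a category prefix might look like. The layout (four-digit codes whose thousands digit is the category) and the category names are assumptions made for illustration; the article does not define what 1001, 2042, or 3007 mean.

```typescript
// Assumed layout: four-digit codes whose thousands digit is the category.
// The category names are hypothetical.
const CATEGORY_BY_PREFIX = new Map<number, string>([
  [1, "authentication"],
  [2, "validation"],
  [3, "billing"],
]);

// Returns the category for a numeric code, or "unknown" for prefixes
// that the table does not cover.
function categoryOf(code: number): string {
  const prefix = Math.floor(code / 1000);
  return CATEGORY_BY_PREFIX.get(prefix) ?? "unknown";
}
```

The lookup table is the ergonomic cost in miniature: nothing in 2042 tells a reader what it means without it.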

The third shape is hierarchical strings: billing.subscription.payment_failed, validation.field.required, auth.token.expired. The dotted hierarchy supports both exact-match and prefix-match filtering, which is useful for dashboards that want to roll up "all billing errors" or "all validation errors". The shape is less common but underrated for B2B SaaS where customers do build category-level dashboards.
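Prefix matching and category rollups over dotted codes could look roughly like this. The function names are illustrative, and the sketch assumes every code uses "." as its only separator.

```typescript
// True when `code` equals `prefix` or sits underneath it in the dotted
// hierarchy. Appending "." prevents "billing" from matching "billingx.foo".
function matchesPrefix(code: string, prefix: string): boolean {
  return code === prefix || code.startsWith(prefix + ".");
}

// Roll up counts by top-level segment, such as "billing" or "validation".
function rollUpByCategory(codes: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const code of codes) {
    const category = code.split(".")[0] ?? code;
    counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}
```

A dashboard filter for "all billing errors" then reduces to matchesPrefix(code, "billing"), while an exact-match check on billing.subscription.payment_failed keeps working unchanged.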

The stability contract

Error codes are part of the API contract. Changing the code for an existing error is a breaking change for customers whose code checks the value. The implication is that error codes must be planned with the same care as URL paths, field names, and event types.

The naming discipline that supports stability is descriptive-not-implementation. The code should describe what the customer experienced, not what the server failed to do. rate_limit_exceeded is better than redis_check_returned_zero_tokens because the latter exposes implementation that may change while the former describes the customer-facing experience.

Adding a new error code for a case that was previously raised under a more general code is non-breaking only if existing integrations keep receiving the general code. A customer who exact-matches the general code would otherwise stop matching the moment the case gets its own code. The safe pattern is to keep returning the general code by default and to emit the more specific code only to customers who explicitly subscribe to it, for example by moving to a newer API version.
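One way to read the "explicitly subscribe" condition is as an opt-in flag checked when the response is built. The sketch below assumes that reading. The mapping table, the function name, and the choice of card_declined as the general code are hypothetical.

```typescript
// Hypothetical mapping from newly introduced specific codes to the general
// code that the same cases used to return.
const GENERAL_CODE_FOR = new Map<string, string>([
  ["card_expired", "card_declined"],
  ["insufficient_funds", "card_declined"],
]);

// Clients that have not opted in keep receiving the general code, so their
// existing exact-match checks continue to work.
function codeForClient(
  specificCode: string,
  optedIntoSpecificCodes: boolean,
): string {
  if (optedIntoSpecificCodes) {
    return specificCode;
  }
  return GENERAL_CODE_FOR.get(specificCode) ?? specificCode;
}
```

How a client signals the opt-in (an API version, a header, an account setting) is left open here.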

The taxonomy decision

The grain of the taxonomy is the load-bearing question. Too coarse and customers cannot distinguish cases that matter. Too fine and the taxonomy itself becomes the integration surface and the maintenance burden grows.

The principle that has worked across our four products is: distinguish cases that produce different customer responses. rate_limit_exceeded versus quota_exceeded is worth distinguishing because rate limits are operational and quotas are billing. card_declined versus card_expired versus insufficient_funds is worth distinguishing because the customer remediation differs. Distinguishing card_declined_visa from card_declined_mastercard is not worth it because the remediation is identical.

The taxonomy should grow rather than start large. Beginning with 10-15 high-level codes, and adding specificity as customer feedback identifies cases that genuinely warrant distinguishing, produces a taxonomy customers can learn. Beginning with 100 fine-grained codes produces a taxonomy that customers cannot learn and that you cannot rename without breaking their integrations.

The response shape

The response body should carry the code, a human-readable message, a request ID, a field name when the error is a validation error, and ideally a documentation URL pointing at the specific error page. The shape we use across DocuMint, CronPing, FlagBit, and WebhookVault is:

{
  "error": {
    "code": "validation_error",
    "message": "The email field must be a valid email address",
    "field": "email",
    "doc_url": "https://docs.example.com/errors/validation_error",
    "request_id": "req_abc123"
  }
}

The request_id is required because production support workflows depend on it. The doc_url is optional but saves substantial customer support time when the documentation page is well written. The field name is required for validation errors and absent for everything else.
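For clients written in TypeScript, the shape above could be modeled with a type and a runtime guard along these lines. The interface and function names are illustrative assumptions; only the field names come from the example response.

```typescript
interface ApiError {
  code: string;
  message: string;
  field?: string;     // present only for validation errors
  doc_url?: string;   // optional
  request_id: string; // always present
}

interface ApiErrorBody {
  error: ApiError;
}

// Runtime check that an unknown parsed JSON value has the error shape.
function isApiErrorBody(value: unknown): value is ApiErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const error = (value as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  const e = error as Record<string, unknown>;
  return (
    typeof e["code"] === "string" &&
    typeof e["message"] === "string" &&
    typeof e["request_id"] === "string" &&
    (e["field"] === undefined || typeof e["field"] === "string") &&
    (e["doc_url"] === undefined || typeof e["doc_url"] === "string")
  );
}
```

A guard like this lets client code narrow a parsed response body before branching on error.code, rather than assuming every non-2xx response has the documented shape.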

The HTTP status mapping

Each error code maps to one HTTP status. The mapping should be stable across versions and consistent within categories. Validation errors return 400 or 422 with the same status across all validation codes. Authentication failures return 401 consistently. Authorization failures return 403 consistently. Rate limits return 429 consistently. Server errors return 5xx.

The pattern that fails is using the same code with different status across endpoints. rate_limit_exceeded returning 429 on most endpoints but 503 on one is the kind of inconsistency that customer code does not handle gracefully. The discipline of one-status-per-code prevents the inconsistency.
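One way to enforce one-status-per-code is to keep the mapping in a single table that every endpoint consults. The sketch below assumes that approach. The permission_denied and internal_error codes are hypothetical additions, and 400 for validation is just one of the two options mentioned above.

```typescript
// Single source of truth for the code-to-status mapping. Endpoints look the
// status up here instead of choosing one themselves.
const STATUS_BY_CODE = {
  validation_error: 400,
  invalid_api_key: 401,
  permission_denied: 403,   // hypothetical code
  rate_limit_exceeded: 429,
  internal_error: 500,      // hypothetical code
} as const;

type ErrorCode = keyof typeof STATUS_BY_CODE;

function statusFor(code: ErrorCode): number {
  return STATUS_BY_CODE[code];
}

// Builds the status and body together so they cannot drift apart.
function errorResponse(code: ErrorCode, message: string, requestId: string) {
  return {
    status: statusFor(code),
    body: { error: { code, message, request_id: requestId } },
  };
}
```

Because ErrorCode is derived from the table, a handler that tries to emit a code without a mapped status fails at compile time rather than in a customer's retry loop.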

The migration path for legacy taxonomies

Teams that launched with a taxonomy that has aged badly have two options. The first is to publish the new taxonomy alongside the old one and emit both in the response, with a versioning header that controls which is canonical; this is the Stripe approach. The migration window is typically two years, and most customers transition voluntarily within the first year.

The second option is to publish the new taxonomy as a separate field while keeping the legacy values unchanged: error.code carries the new taxonomy and error.legacy_code carries the unchanged legacy value. The pattern lets customers opt into the new taxonomy without forcing migration.
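A minimal sketch of the two-field pattern, using the field names given above. The numeric legacy values, the mapping table, and the unknown_error fallback are hypothetical; real legacy codes would come from the existing taxonomy.

```typescript
// Hypothetical mapping from legacy codes to the new taxonomy.
const NEW_CODE_BY_LEGACY = new Map<string, string>([
  ["1001", "invalid_api_key"],
  ["2042", "validation_error"],
]);

interface DualCodeError {
  code: string;        // new taxonomy
  legacy_code: string; // unchanged legacy value
  message: string;
}

function toDualCodeError(legacyCode: string, message: string): DualCodeError {
  return {
    // "unknown_error" is an assumed fallback for legacy codes that have
    // no mapping yet.
    code: NEW_CODE_BY_LEGACY.get(legacyCode) ?? "unknown_error",
    legacy_code: legacyCode,
    message,
  };
}
```

Keeping the mapping in one place makes it easier to audit that every legacy code has a counterpart in the new taxonomy before customers start depending on the new field.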

The pattern that does not work is renaming the field with a hard cutover. Customer support tickets from broken integrations dominate the engineering calendar for months after the cutover and the relationship damage from breaking customers in production is hard to undo.

Three patterns that fail

The first failure is unstable codes. A team that renames card_declined to payment_card_declined in a minor version update will discover that the change broke production integrations whose authors received no notification. The discipline is treating the code as part of the API contract and applying the same deprecation process to it as to any other contract change.

The second failure is leaking implementation. A code like redis_lookup_failed exposes internal architecture that customers should not depend on. When you switch from Redis to Postgres for the implementation, the customer-facing code becomes a lie. The discipline is naming codes by customer-facing meaning, not by implementation.

The third failure is over-categorization. A taxonomy with 200 codes that distinguish edge cases customers never act on becomes a maintenance burden that customers ignore. The discipline is asking for each candidate code whether at least one customer integration will branch on it specifically. If not, the case should fall through to a more general code.

What this looks like across our four products

We use stable string codes across DocuMint, CronPing, FlagBit, and WebhookVault. The codes are snake_case lowercase descriptive strings, the mapping to HTTP status is consistent within each product, and the response shape is uniform. Documenting each code with a dedicated page in the API reference is on our infrastructure roadmap and is the single highest-impact developer-experience investment we have not yet made.

The taxonomy is small. DocuMint has roughly 18 codes spanning authentication, validation, billing, and PDF generation failures. CronPing has roughly 14 codes spanning authentication, monitor management, and notification delivery. FlagBit has roughly 16 codes spanning authentication, flag rules, and evaluation. WebhookVault has roughly 20 codes spanning authentication, endpoint configuration, and webhook capture. The total across the four products is approximately 68 distinct codes, which is small enough that the documentation surface stays manageable.

Deeper observation

Error codes are one of the parts of an API that customers interact with most under stress and that providers pay least attention to under normal operations. Designing the taxonomy carefully, naming codes for customer-facing meaning rather than implementation, and treating stability as a contract pays back in customer integrations that handle failure gracefully and support workloads that stay manageable. Throwing error codes together ad hoc as they come up, and accepting whatever shape results, produces support workloads that grow superlinearly with the API surface. The choice between the two is one of the differences between B2B SaaS that focuses on developer experience and B2B SaaS that neglects it.

Our products put these patterns into production: DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay).