ETags and Conditional Requests: HTTP Caching Beyond Cache-Control
Most API teams configure Cache-Control once when they ship the API and never touch HTTP caching again. The result is one of two failure modes: caches that lie because the freshness window was set without thinking, or caches that do nothing because everything is marked private, no-cache and no validator is sent, so every request refetches the full body. Both modes leave the most useful HTTP caching primitive, conditional requests with ETags, on the table.
This post covers what ETags actually do, how conditional requests save bandwidth and CPU on both sides, the strong-vs-weak distinction and when each is correct, the optimistic-concurrency-control use that most APIs miss, and the operational signals that suggest the feature is configured correctly.
What an ETag is
An ETag is an opaque string the server attaches to a response that identifies a specific version of a specific resource. The client stores the ETag alongside the cached response. On the next request, the client sends the stored ETag in an If-None-Match header. If the server's current version still has the same ETag, the server responds with 304 Not Modified and no body. If the version has changed, the server responds with 200 and the new body and a new ETag.
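The exchange above can be sketched as a small server-side function. This is a minimal illustration, not a framework API: the name conditional_get and its tuple return shape are assumptions made for the example, and the comma split is a simplification of the real header grammar.

```python
from typing import Dict, Optional, Tuple


def conditional_get(
    current_etag: str, body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match.

    Hypothetical helper: compares tags by exact string equality. A fuller
    version would ignore the W/ prefix, as the standard's weak comparison does.
    """
    headers = {"ETag": current_etag}
    if if_none_match is not None:
        # The header may carry several tags, or "*" for "any current version".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            # Same version: send the validator back, but no body.
            return 304, headers, b""
    # First fetch, or the version changed: send the full body and the new tag.
    return 200, headers, body
```

A client that stored "v1" and sends it back gets a 304 with an empty body; once the server moves to "v2", the same request gets a 200 with the new body and the new ETag.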
The wire savings are the body bytes; the headers still travel both ways. For a 50KB JSON response that changes daily and is fetched hourly by a polling client, 23 of the 24 daily fetches return a bodiless 304, which cuts body traffic roughly 24-fold without any change in client behavior beyond honoring the standard. The CPU savings on the server side can be larger than the bandwidth savings, provided the server can check the ETag without rendering the response, because the body then never has to be serialized.
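The arithmetic for that polling example is short enough to write out. The sketch below counts body bytes only and ignores headers; the function name and parameters are made up for illustration. With one change per day and 24 hourly fetches, 23 fetches become bodiless 304s and one carries the body.

```python
def daily_body_bytes(
    body_bytes: int, fetches_per_day: int, changes_per_day: int, use_etags: bool
) -> int:
    """Body bytes sent per day for one polling client (headers not counted)."""
    if use_etags:
        # Only fetches that observe a new version carry a body.
        full_responses = min(changes_per_day, fetches_per_day)
    else:
        # Every fetch carries the full body.
        full_responses = fetches_per_day
    return full_responses * body_bytes


baseline = daily_body_bytes(50 * 1024, 24, 1, use_etags=False)
conditional = daily_body_bytes(50 * 1024, 24, 1, use_etags=True)
reduction = baseline / conditional  # factor by which body traffic shrinks
```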
Strong vs weak ETags
The standard distinguishes weak ETags from strong ones by a leading W/ prefix. A strong ETag must change whenever the bytes of the representation change, including formatting differences in the serialized body and different content codings such as gzip versus identity or a different gzip level. Header ordering does not enter into it. A weak ETag changes only when the semantic content changes, with explicit permission for byte-level differences that do not affect meaning.
For most read-heavy APIs the right default is weak ETags computed from a content hash of the canonical JSON representation. Strong ETags require either byte-level reproducibility of the response or expensive caching of the actual rendered bytes, and the operational difficulty rarely earns the marginal correctness improvement. The exceptions are range requests on large binary payloads, where strong ETags are required for the range-resume semantics to be correct, and conditional writes with If-Match, which the standard also evaluates with strong comparison, so a weak ETag never satisfies them.
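The two comparison rules are small enough to state in code. This sketch follows the comparison functions as defined in RFC 9110; the helper names are chosen for the example. If-None-Match uses the weak rule, while If-Match and If-Range use the strong one.

```python
def is_weak(etag: str) -> bool:
    """True when the tag carries the W/ prefix."""
    return etag.startswith("W/")


def opaque_tag(etag: str) -> str:
    """The quoted part of the tag, with any W/ prefix removed."""
    return etag[2:] if is_weak(etag) else etag


def weak_match(a: str, b: str) -> bool:
    # Weak comparison: the quoted parts are equal; the W/ prefix is ignored.
    return opaque_tag(a) == opaque_tag(b)


def strong_match(a: str, b: str) -> bool:
    # Strong comparison: neither tag is weak and they are identical.
    return not is_weak(a) and not is_weak(b) and a == b
```

Under these rules W/"1" and "1" match weakly but not strongly, which is why a weak ETag can revalidate a cached read but cannot guard a range resume.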
Conditional writes with If-Match
The use that most API teams miss is the inverse of If-None-Match. The If-Match header on a write request asks the server to perform the write only if the current ETag matches the one the client thinks it is updating. This is optimistic concurrency control via HTTP, and it solves the lost-update problem without holding any server-side lock while the client edits; the server only has to make the compare-and-write step itself atomic.
The pattern is: client GETs the resource and receives an ETag; client modifies the representation locally; client PUTs the modified resource with If-Match set to the original ETag; server compares the current ETag and either applies the write or returns 412 Precondition Failed. The 412 response tells the client that someone else updated the resource in the interval, and the client must re-fetch and reapply the change.
For FlagBit feature flag updates, this pattern would prevent the lost-update mode where two operators edit the same flag concurrently and the second write silently overwrites the first. For DocuMint invoice template edits, it would catch the case where a teammate updates the template between when you opened the editor and when you saved.
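A minimal sketch of the server side of that pattern, assuming an in-memory dict as the store and a version-number ETag. The Resource class, conditional_put, and the choice to answer 428 Precondition Required when If-Match is missing are all assumptions of this example, not a description of any product's API. A real implementation would run the compare and the write atomically, for instance in one database transaction.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Resource:
    version: int
    body: dict


def etag_for(resource: Resource) -> str:
    # Strong ETag: the version number as a quoted string.
    return f'"{resource.version}"'


def conditional_put(
    store: Dict[str, Resource], key: str, new_body: dict, if_match: Optional[str]
) -> Tuple[int, str]:
    """Apply the write only if If-Match names the current version.

    Returns (status, current ETag after the call).
    """
    current = store[key]
    if if_match is None:
        # Design choice for this sketch: refuse unconditional writes.
        return 428, etag_for(current)
    candidates = [tag.strip() for tag in if_match.split(",")]
    if "*" not in candidates and etag_for(current) not in candidates:
        # Someone else wrote in the interval: the client must re-fetch.
        return 412, etag_for(current)
    store[key] = Resource(version=current.version + 1, body=new_body)
    return 204, etag_for(store[key])
```

Two operators who both fetched version "1" illustrate the outcome: the first PUT succeeds and moves the resource to "2", and the second PUT, still carrying "1", gets a 412 instead of silently overwriting the first.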
How to compute the ETag
The two reasonable approaches are version numbers and content hashes. A monotonically increasing version number stored alongside the resource is the cheapest option for resources that already have a version column. The ETag is just the version number formatted as a quoted string. The downside is that the version increments on every write even if the content is unchanged, which prevents the cache from skipping no-op updates.
A content hash — typically SHA-256 of the canonical JSON representation, truncated and base64-encoded — is more accurate and supports no-op deduplication on the cache, but requires either computing the hash on every request or caching it alongside the resource. For most APIs the right answer is to cache the hash on the resource record and update it on write, paying the small storage cost for the read-side savings.
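Both approaches fit in a few lines. The sketch below is one way to do it, under assumptions that are not from the text above: canonical JSON is taken to mean sorted keys with compact separators, the digest is truncated to 16 bytes, and the function names are illustrative.

```python
import base64
import hashlib
import json


def version_etag(version: int) -> str:
    """Strong ETag from a stored version counter."""
    return f'"{version}"'


def content_etag(payload: object, weak: bool = True, length: int = 16) -> str:
    """ETag from a truncated SHA-256 of a canonical JSON encoding."""
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()[:length]
    # URL-safe base64 without padding keeps the tag short and header-safe.
    token = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f'W/"{token}"' if weak else f'"{token}"'
```

Because the keys are sorted before hashing, two payloads that differ only in key order get the same tag, which is the no-op deduplication the version counter cannot provide. Storing the result on the resource record at write time avoids recomputing it on every read.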
Where ETags do not help
Three classes of response do not benefit from ETags. The first is responses that are different on every request: search results with timestamp components, paginated lists where the page is computed on the fly, anything with a dynamic now() in the response. The ETag never matches, so the conditional request just adds header bytes and comparison work before the server sends the full body anyway.
The second is responses so small that the validator overhead rivals the body size. As a rough rule of thumb the breakpoint is somewhere around 1KB; below that, the request and response headers dominate the exchange and a 304 saves almost nothing over just sending the body.
The third is responses that are already cacheable for a long time via Cache-Control. If the client has a fresh response with five minutes left on its max-age, the client should not be sending a conditional request at all, and the server's ETag is irrelevant for that interval.
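The client-side decision behind that third case can be sketched as follows. This is a deliberately simplified model: it looks only at max-age and a stored ETag, ignores directives such as no-cache and must-revalidate, and the function name and return values are invented for the example.

```python
from typing import Dict, Optional, Tuple


def cache_decision(
    age_seconds: float, max_age_seconds: float, stored_etag: Optional[str]
) -> Tuple[str, Dict[str, str]]:
    """Decide what a client cache does with a stored response.

    Returns (action, extra request headers).
    """
    if age_seconds < max_age_seconds:
        # Still fresh: no request is sent, so the ETag plays no part.
        return "serve_from_cache", {}
    if stored_etag is not None:
        # Stale but validatable: ask the server whether anything changed.
        return "revalidate", {"If-None-Match": stored_etag}
    # Stale with no validator: a full fetch is the only option.
    return "fetch", {}
```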
The Vary header subtlety
ETags interact with the Vary header in ways that are easy to get wrong. If a response varies by Accept, Accept-Encoding, or any custom header, the ETag must differ for each distinct representation those headers select. Otherwise a client with one Accept-Encoding receives a 304 for a body it does not have. The right pattern is to include the Vary inputs in the ETag computation, either by hashing them in or by using separate cache keys per Vary input.
For WebhookVault request inspection responses, the Vary surface is small — Accept and Accept-Encoding are usually enough — but the discipline of including them in the ETag computation prevents subtle correctness bugs that show up only when one client uses gzip and another does not.
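One way to hash the Vary inputs in is sketched below. It assumes the server already holds a content hash for the resource and keys the tag on the negotiated outcome (the Content-Type and Content-Encoding actually selected) rather than on the raw request header strings, so that two Accept-Encoding values that select the same variant share a tag. The function name and that design choice are assumptions of this example.

```python
import hashlib


def variant_etag(content_hash: str, content_type: str, content_encoding: str) -> str:
    """Strong ETag that differs per negotiated representation."""
    # Media types and content codings are case-insensitive, so normalize first.
    material = "\n".join(
        [content_hash, content_type.strip().lower(), content_encoding.strip().lower()]
    )
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'
```

With this in place, the gzip and identity variants of the same resource carry different tags, so a client that cached the identity body never gets a 304 validated against the gzip one.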
Operational signals
The four signals that tell you whether ETags are working are: the ratio of 304 responses to 200 responses on resources you expect to be cacheable; the bandwidth saved on representative client polling patterns; the rate of 412 Precondition Failed responses on writes (low is normal, high suggests a race-condition pattern in the application); and the fraction of requests that include If-None-Match (low suggests clients are not honoring the standard).
For CronPing monitor list endpoints that polling clients fetch every minute, the 304 ratio after a day of operation is the right diagnostic. If most poll requests return 304, the cache is working. If most return 200 with unchanged content, the ETag computation is wrong and the cache is doing nothing.
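Three of these signals can be computed straight from access-log records. The sketch assumes each record has already been parsed into a (method, status, had_if_none_match) tuple; that shape, the function name, and the metric names are illustrative. The 304 figure is reported as a share of all 200 and 304 reads rather than as a 304-to-200 ratio, so it stays between 0 and 1.

```python
from typing import Dict, Iterable, Tuple

WRITE_METHODS = ("PUT", "PATCH", "DELETE")


def etag_signals(records: Iterable[Tuple[str, int, bool]]) -> Dict[str, float]:
    """Summarize ETag health from (method, status, had_if_none_match) records."""
    records = list(records)
    # Reads that ended in either a full body or a revalidation hit.
    reads = [(s, c) for m, s, c in records if m == "GET" and s in (200, 304)]
    writes = [s for m, s, _ in records if m in WRITE_METHODS]
    hits = sum(1 for s, _ in reads if s == 304)
    conditional = sum(1 for _, c in reads if c)
    return {
        "share_304": hits / len(reads) if reads else 0.0,
        "conditional_fraction": conditional / len(reads) if reads else 0.0,
        "rate_412": writes.count(412) / len(writes) if writes else 0.0,
    }
```

A high conditional_fraction combined with a low share_304 on content that rarely changes is the pattern that points at a broken ETag computation rather than at misbehaving clients.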
The deeper observation
HTTP caching is a layered system: Cache-Control sets the freshness contract, ETags provide the revalidation primitive, and conditional headers tie them together. Configuring just one of the three layers leaves most of the value on the table. The teams that get HTTP caching right configure all three deliberately, monitor the resulting traffic patterns, and treat the caching surface as a first-class product feature rather than an afterthought set in middleware. The bandwidth and latency savings compound across millions of requests in ways that application-level optimization rarely matches, because the protocol was designed for exactly this case and most servers and clients already implement it correctly.