HTTP/2 and HTTP/3 in Practice: What Changed, What Didn't, and What to Configure

HTTP/2 and HTTP/3 are deployed everywhere, but most teams treat them as transparent infrastructure. They are not. This post covers the configuration choices that matter and the assumptions from the HTTP/1.1 era that break.

Anethoth

12 May 2026 — 4 min read

HTTP/2 shipped in 2015 and HTTP/3 in 2022. By 2026 both are deployed almost universally: Cloudflare, AWS ALB, Fastly, and direct nginx/Caddy/Envoy installations all support HTTP/2, and most managed edges also offer HTTP/3. Most teams treat them as transparent infrastructure that the proxy layer handles, which is usually a safe assumption but not always. The places where it fails matter, and the assumptions from the HTTP/1.1 era that no longer hold are worth knowing.

Our four products (DocuMint, CronPing, FlagBit, and WebhookVault) all serve HTTP/2 traffic at the edge through Caddy. The defaults for connection management, multiplexing, and prioritization are mostly fine, but they are not free of consequence.

What HTTP/2 actually changed

HTTP/2 made three transformative changes and several smaller ones. The transformative changes were binary framing replacing text-based parsing, header compression via HPACK, and stream multiplexing over a single TCP connection. The smaller changes were server push (since deprecated), prioritization, and explicit flow control.

The single most consequential change in practice is multiplexing. Under HTTP/1.1, a browser opened up to 6 connections per origin, each handling one request at a time, because pipelining never worked reliably. Under HTTP/2, a single connection carries dozens or hundreds of concurrent streams. The latency impact is large on pages that load many small resources from the same origin, such as image galleries, unbundled CSS and JS files, and individual icons. The latency impact is near zero on API endpoints that issue one request and wait for one response.
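
A rough way to see why the gain depends on the number of requests is to count round trips. The sketch below is a deliberately simplified model, not a benchmark: it assumes every resource costs exactly one round trip, ignores bandwidth, handshakes, and server time, and uses made-up function names and a stream limit of 100 purely for illustration.

```python
import math

def http1_round_trips(resources: int, connections: int = 6) -> int:
    """Round trips when each connection serves one request at a time."""
    return math.ceil(resources / connections)

def http2_round_trips(resources: int, max_streams: int = 100) -> int:
    """Round trips when one connection carries up to max_streams requests at once."""
    return math.ceil(resources / max_streams)

# Compare a single API call with a page that loads 60 small assets.
for n in (1, 60):
    print(n, http1_round_trips(n), http2_round_trips(n))
```

Under these assumptions a single request takes one round trip either way, while 60 small assets take ten round trips over six HTTP/1.1 connections and one over a multiplexed connection.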

HPACK header compression matters more than it looks. For an API client sending the same Authorization, Content-Type, and User-Agent headers on every request, HPACK eliminates almost all of that header overhead after the first request. On chatty APIs with small response bodies, this is a meaningful bandwidth saving.
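
To make the saving concrete, here is a back-of-the-envelope byte count. It is a simplified model and not an HPACK encoder: it ignores Huffman coding and the static table, assumes every name and value is shorter than 127 bytes, and assumes all headers are already in the dynamic table on the repeat request. The function names and header values are invented for illustration.

```python
def http1_header_bytes(headers: dict[str, str]) -> int:
    """Bytes for 'name: value\\r\\n' lines, which HTTP/1.1 resends on every request."""
    return sum(len(name) + 2 + len(value) + 2 for name, value in headers.items())

def hpack_first_request_bytes(headers: dict[str, str]) -> int:
    """Literal fields with indexing: type byte, name length, name, value length, value."""
    return sum(1 + 1 + len(name) + 1 + len(value) for name, value in headers.items())

def hpack_repeat_request_bytes(headers: dict[str, str]) -> int:
    """Once a field is in the dynamic table, it is sent as a one-byte index."""
    return len(headers)

# Hypothetical headers for an API client.
headers = {
    "authorization": "Bearer abc123",
    "content-type": "application/json",
    "user-agent": "example-client/1.0",
}

print(http1_header_bytes(headers))
print(hpack_first_request_bytes(headers))
print(hpack_repeat_request_bytes(headers))
```

In this model the three headers cost 94 bytes on every HTTP/1.1 request, 91 bytes on the first HTTP/2 request, and 3 bytes on each request after that.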

What HTTP/3 actually changed

HTTP/3 moved off TCP onto QUIC, which is a UDP-based transport with TLS 1.3 baked in. The changes that matter:

Connection establishment is faster. A standard HTTP/2 connection requires a TCP handshake (1 RTT) plus a TLS handshake (1-2 RTTs) before the first byte of HTTP data can flow. HTTP/3 combines them: the QUIC handshake is 1 RTT for a new connection and 0 RTT for resumption. On long-haul mobile connections, the savings are measurable.
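
The round-trip counts above translate directly into wall-clock time. The following sketch does that arithmetic under stated assumptions: one extra round trip for the request and response, no packet loss, and no server processing time. The table keys and function name are illustrative only.

```python
# Round trips that must complete before the HTTP request can be sent.
SETUP_RTTS = {
    "TCP + TLS 1.2": 1 + 2,
    "TCP + TLS 1.3": 1 + 1,
    "QUIC (new connection)": 1,
    "QUIC (0-RTT resumption)": 0,
}

def time_to_first_response_ms(setup_rtts: int, rtt_ms: float) -> float:
    """Setup round trips plus one round trip for the request and its response."""
    return (setup_rtts + 1) * rtt_ms

# A long-haul mobile path with an assumed 150 ms round-trip time.
for name, setup in SETUP_RTTS.items():
    print(name, time_to_first_response_ms(setup, 150))
```

At 150 ms per round trip, the first response arrives after 600 ms over TCP with TLS 1.2, 450 ms with TLS 1.3, 300 ms over a new QUIC connection, and 150 ms with 0-RTT resumption.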

Head-of-line blocking is gone at the transport layer. Under HTTP/2 over TCP, a single dropped packet stalls all multiplexed streams until the retransmit arrives, because TCP reassembles bytes in order. QUIC streams are independent at the transport layer; one stream's lost packet does not stall the others. On lossy networks, this is the second-biggest practical win after the faster handshake.
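
A small probability model shows how quickly the TCP penalty grows with the number of streams. It is a sketch under strong assumptions: packets are lost independently, every stream has the same number of packets in flight, and any loss on a TCP connection is treated as stalling every stream (which overstates the effect somewhat, since only data behind the lost packet waits). The function names are hypothetical.

```python
def p_stalled_over_tcp(loss: float, streams: int, packets_per_stream: int) -> float:
    """A stream stalls if any packet on the shared connection is lost."""
    return 1 - (1 - loss) ** (streams * packets_per_stream)

def p_stalled_over_quic(loss: float, packets_per_stream: int) -> float:
    """A stream stalls only if one of its own packets is lost."""
    return 1 - (1 - loss) ** packets_per_stream

# 1% loss, 20 concurrent streams, 5 packets in flight per stream.
print(round(p_stalled_over_tcp(0.01, 20, 5), 3))
print(round(p_stalled_over_quic(0.01, 5), 3))
```

With these inputs a given stream is stalled roughly 63% of the time in the TCP model and roughly 5% of the time in the QUIC model. With a single stream the two are identical, which matches the earlier point that one-request-at-a-time API traffic sees little difference.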

Connection migration works. A mobile client moving from Wi-Fi to cellular keeps its QUIC connection alive across the network change, because QUIC connections are identified by a connection ID rather than by the IP-and-port tuple. The browser does not need to redo the handshake. The practical benefit shows up in mobile-heavy traffic patterns.
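
The difference comes down to what the server uses as its lookup key. The toy below illustrates only that idea; it is not how any real stack is written, it omits the path validation that QUIC performs when a client's address changes, and the addresses, connection ID, and function names are invented.

```python
# TCP-style table: the session is keyed by the client's address.
tcp_sessions = {("203.0.113.7", 51000): "session-A"}

# QUIC-style table: the session is keyed by a connection ID carried in each packet.
quic_sessions = {"c0ffee01": "session-A"}

def find_tcp_session(src_ip: str, src_port: int):
    """Lookup fails as soon as the client's IP or port changes."""
    return tcp_sessions.get((src_ip, src_port))

def find_quic_session(connection_id: str, src_ip: str, src_port: int):
    """The source address is not part of the key, so a new network still matches."""
    return quic_sessions.get(connection_id)

# The client moves from Wi-Fi to cellular and shows up from a new address.
print(find_tcp_session("198.51.100.9", 40000))
print(find_quic_session("c0ffee01", "198.51.100.9", 40000))
```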

What HTTP/3 did not change is the HTTP semantics layer. Status codes, headers, methods, and URLs are all identical. Application code does not change. The configuration changes happen at the proxy or load balancer.

The configuration choices that matter

Three configuration decisions are worth making explicitly. First, whether to terminate HTTP/2 at the edge proxy or pass it through to the application. The common pattern is to terminate at the edge — Caddy or nginx speaks HTTP/2 to the client and HTTP/1.1 to the upstream. This works for most apps and has the operational benefit that backend applications don't need HTTP/2 support. The cost is that the backend doesn't see request prioritization or multiplexing benefits, but for API workloads with low per-connection concurrency, that almost never matters.

Second, the max concurrent streams per connection. The protocol itself sets no default limit, but most servers advertise around 100 (nginx defaults to 128), which is reasonable for browser traffic. For API clients that hold open a single connection and issue many parallel requests, a higher limit (250-500) can improve throughput. The risk is memory pressure under heavy concurrency, since every open stream holds buffers and state on the server.
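
Whether a higher limit helps can be estimated before touching any configuration. The sketch below applies Little's law to a single connection, assuming every request takes the same time and the stream limit is the only bottleneck. The function name and the 50 ms latency are assumptions for illustration.

```python
def max_requests_per_second(max_streams: int, latency_ms: float) -> float:
    """Throughput ceiling of one connection: streams in flight divided by time per request."""
    return max_streams * 1000 / latency_ms

# An API client on one connection, with an assumed 50 ms per request.
for limit in (100, 250, 500):
    print(limit, max_requests_per_second(limit, 50))
```

With 50 ms requests, a limit of 100 caps one connection at 2,000 requests per second and a limit of 500 at 10,000. If the client never gets near the lower figure, raising the limit only adds memory exposure.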

Third, whether to enable HTTP/3 at all. As of 2026, HTTP/3 is enabled by default on most managed CDN edges, but for self-hosted deployments it often still means opting in, and in every case it requires allowing UDP through firewalls and slightly different configuration. The benefit shows up in mobile and lossy-network traffic. The cost is operational complexity: UDP-based protocols are harder to debug than TCP, packet capture is more painful, and middleboxes occasionally drop UDP traffic that they would have passed for TCP.

The assumptions from HTTP/1.1 that break

A few common patterns from the HTTP/1.1 era are now subtly wrong. Domain sharding (splitting assets across multiple subdomains to get more parallel connections) is counterproductive under HTTP/2 because each subdomain requires its own connection and you lose the multiplexing benefit. Asset concatenation (bundling many small files into one big one) is less important because the parallel-request penalty is mostly gone. Sprite sheets for icons make less sense than serving individual SVGs.

The HTTP/1.1 connection pool sizing rule of "6 connections per origin" no longer applies. Under HTTP/2, you generally want 1 connection per origin per process, and that connection holds all the parallelism. Client libraries that open multiple HTTP/2 connections per origin without being pushed past the server's concurrent-stream limit are wasting resources.

The webhook implication

For services that send webhooks (like our WebhookVault replay endpoints), HTTP/2 connection reuse is the largest performance win available. Opening a new HTTP/1.1 connection for each webhook costs 100-300ms in TCP and TLS handshakes. Reusing an HTTP/2 connection costs only the round-trip latency. For a sender pushing thousands of webhooks per second to a few receivers, connection reuse is the difference between a single CPU core handling the load and needing a small fleet.

The implementation requires using an HTTP/2-aware client library (most modern HTTP libraries support it) and explicitly keeping connections alive across requests rather than treating each webhook as an independent connection. Receiver-side, supporting HTTP/2 keep-alive is essentially free with any modern reverse proxy.
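
In code, "keeping connections alive" mostly means creating the HTTP client once and passing it around, instead of constructing one per webhook. The sketch below shows that shape and is not WebhookVault's implementation: `deliver_all`, the `Client` protocol, and the header choice are hypothetical, and the client is passed in so that any library exposing a compatible `post` method can be used.

```python
from typing import Iterable, Protocol

class Response(Protocol):
    status_code: int

class Client(Protocol):
    def post(self, url: str, *, content: bytes, headers: dict) -> Response: ...

def deliver_all(client: Client, deliveries: Iterable[tuple[str, bytes]]) -> list[int]:
    """Send every webhook through one shared client so its connections are reused."""
    statuses = []
    for url, body in deliveries:
        # The client, not this function, owns the connection pool.
        response = client.post(
            url,
            content=body,
            headers={"content-type": "application/json"},
        )
        statuses.append(response.status_code)
    return statuses
```

With the third-party httpx library and its optional HTTP/2 support installed, the shared client could be created once as `httpx.Client(http2=True)` and handed to `deliver_all`. This version sends sequentially, so it benefits from reuse but not from multiplexing; an async client would be needed to keep many streams in flight on the same connection.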

What to measure

The right signals for HTTP/2 and HTTP/3 health are time-to-first-byte at the edge, connection-reuse ratio (requests served on existing connections vs new connections), max concurrent streams reached per connection, and HTTP/2 protocol errors (GOAWAY and RST_STREAM frames carrying an error code other than NO_ERROR). The protocol errors are the most useful diagnostic: they indicate either client bugs or server-side resource exhaustion.
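
Two of these signals are easy to derive from counters most proxies already expose. The sketch below is one possible way to compute them, not a standard: the reuse ratio is defined here as the share of requests that did not need a new connection, and the helper names are made up. The error-code table itself is taken from the HTTP/2 specification.

```python
# HTTP/2 error codes as carried in GOAWAY and RST_STREAM frames.
H2_ERROR_CODES = {
    0x0: "NO_ERROR",
    0x1: "PROTOCOL_ERROR",
    0x2: "INTERNAL_ERROR",
    0x3: "FLOW_CONTROL_ERROR",
    0x4: "SETTINGS_TIMEOUT",
    0x5: "STREAM_CLOSED",
    0x6: "FRAME_SIZE_ERROR",
    0x7: "REFUSED_STREAM",
    0x8: "CANCEL",
    0x9: "COMPRESSION_ERROR",
    0xA: "CONNECT_ERROR",
    0xB: "ENHANCE_YOUR_CALM",
    0xC: "INADEQUATE_SECURITY",
    0xD: "HTTP_1_1_REQUIRED",
}

def connection_reuse_ratio(requests_total: int, connections_opened: int) -> float:
    """Share of requests served on a connection that was already open."""
    if requests_total == 0:
        return 0.0
    return max(0.0, 1 - connections_opened / requests_total)

def describe_error(code: int) -> str:
    """Readable name for an error code seen in a GOAWAY or RST_STREAM frame."""
    return H2_ERROR_CODES.get(code, f"UNKNOWN(0x{code:x})")

print(connection_reuse_ratio(1000, 10))
print(describe_error(0x7))
```

A ratio near 1.0 means connections are being reused; a ratio near 0 means nearly every request pays for a handshake. On the error side, a rising count of REFUSED_STREAM or ENHANCE_YOUR_CALM tends to point at server-side limits, while CANCEL is usually the peer abandoning a request.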

The deeper observation

HTTP/2 and HTTP/3 are unusual among internet protocols in that they shipped to nearly-universal deployment without requiring application changes. The cost was that most applications did not realize there was anything to think about. The right level of attention is somewhere between "fully transparent" and "I rewrote my application for it" — a small number of configuration choices, a few assumptions to update, and a clear-eyed view of where the wins actually materialize. For our four-product studio, the wins are at the edge: TLS termination and connection pooling. For teams running mobile-heavy or low-bandwidth traffic, the wins are larger and HTTP/3 specifically earns its operational cost.
