Vol. IV · No. 04 Monday · 29 June 2026
engineering Dispatch 5 min read · 16 Jun 2026

Why Your gzip Response Gets Bigger on Small Payloads: Compression Overhead Nobody Configures

Small payloads get bigger after gzip.

engineering · Curiosity

Enable gzip on your server and watch response sizes go up on health check endpoints and near-empty 200s. This surprises developers every time, but it is not a bug. It is the compression format working exactly as described, on inputs where compression cannot help.

The Overhead Is Structural

A gzip-compressed response is not just the compressed content. It is the compressed content plus a fixed header and footer:

  • 10-byte gzip header (magic number, compression method, flags, timestamp, extra flags, OS identifier)
  • DEFLATE compressed stream (variable length)
  • 4-byte CRC32 checksum
  • 4-byte original file size (mod 2^32)

That is a minimum of 18 bytes of overhead before you have compressed a single byte of content. The DEFLATE stream itself adds its own overhead: block headers, Huffman tree encoding (when used), and end-of-block markers. For short payloads, the overhead of describing the compression exceeds the gains from the compression itself.
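
The layout above can be checked directly. The following is a minimal sketch using Node's built-in zlib module; the function name inspectGzip is made up for illustration, and it assumes the header carries no optional fields (no filename or comment), which is what zlib emits by default.

```javascript
const zlib = require('zlib');

// Split a gzip member into the fixed fields described above.
// Assumption: FLG is 0 (no optional header fields), so the header is exactly 10 bytes.
function inspectGzip(input) {
  const gz = zlib.gzipSync(input);
  const trailer = gz.subarray(gz.length - 8);
  return {
    magic: gz.readUInt16BE(0).toString(16), // gzip magic number
    method: gz[2],                          // 8 means DEFLATE
    flags: gz[3],
    deflateBytes: gz.length - 18,           // everything between header and trailer
    crc32: trailer.readUInt32LE(0),         // checksum of the uncompressed data
    originalSize: trailer.readUInt32LE(4),  // input length mod 2^32
    totalBytes: gz.length,
  };
}

console.log(inspectGzip(Buffer.from('{"status":"ok"}')));
```

Comparing totalBytes with the input length for a few of your own small responses shows how much of the output is framing rather than content.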

The break-even point for typical API JSON is around 150 bytes. A payload shorter than 150 bytes will almost always be larger after gzip than before it. Between 150 and 860 bytes, you are in a grey zone where compression may or may not help depending on the specific content. Above 860 bytes with typical JSON, compression reliably reduces size.
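
To see where the break-even falls for a given shape of data, a rough sweep helps. This sketch builds synthetic JSON bodies of increasing size and prints how many bytes gzip adds or removes. The record shape in makeJsonBody is hypothetical, so the numbers it prints will differ from your real payloads.

```javascript
const zlib = require('zlib');

// Build a JSON body of at least `targetBytes` by appending hypothetical records.
function makeJsonBody(targetBytes) {
  const items = [];
  let i = 0;
  while (Buffer.byteLength(JSON.stringify({ items })) < targetBytes) {
    items.push({ id: i, name: `user-${i}`, active: i % 2 === 0 });
    i += 1;
  }
  return Buffer.from(JSON.stringify({ items }));
}

// Positive result: gzip made the body bigger. Negative: gzip saved bytes.
function gzipDelta(body, level = 5) {
  return zlib.gzipSync(body, { level }).length - body.length;
}

for (const target of [16, 64, 150, 400, 860, 3000]) {
  const body = makeJsonBody(target);
  console.log(`${body.length} bytes -> delta ${gzipDelta(body)}`);
}
```

Highly repetitive bodies like these cross over earlier than responses full of unique strings, which is one reason the grey zone is wide.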

The Nginx Default Is Wrong

Nginx's default gzip_min_length is 20 bytes, and the length is taken from the Content-Length response header. With gzip on, Nginx will therefore compress any eligible response of 20 bytes or more. On a 20-byte response, you are paying 18 bytes of structural overhead plus block overhead to compress 20 bytes that may not compress at all, and you are adding latency for the CPU cycles spent compressing them.

The correct configuration for typical API and web traffic:

gzip on;
gzip_min_length 860;
gzip_comp_level 5;
gzip_types
  text/plain
  text/css
  text/javascript
  application/javascript
  application/json
  application/xml
  image/svg+xml;

gzip_min_length 860 ensures you only compress responses where compression reliably helps. The value comes from empirical testing on representative payloads. It was not picked for looking tidy; it is the point where the overhead breaks even against the benefit for typical JSON responses.

Compression Level: The 4-6 Sweet Spot

gzip_comp_level runs from 1 (fastest, least compression) to 9 (slowest, most compression). The relationship is not linear:

  • Level 1: Fastest. Minimal compression. Good for high-traffic, CPU-constrained scenarios.
  • Level 4-6: Best ratio of compression gain to CPU cost. Level 5 is a common production default.
  • Level 9: Slowest. Often around 5x slower than level 6 for only about 2% additional size reduction on typical web content; exact figures depend on the data.

Level 9 is almost never the right choice in production. The CPU cost is real — on a server handling 10,000 requests per second with 5KB responses, the difference between level 5 and level 9 is measurable in CPU utilization. The size saving is not.
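
The trade-off is easy to measure on your own hardware. Below is a rough sketch that times gzip at several levels against a synthetic payload of a few kilobytes. The payload generator samplePayload and the iteration count are assumptions for illustration, and the timings are only meaningful relative to each other on the same machine.

```javascript
const zlib = require('zlib');

// Hypothetical payload: repeated records with some per-record variation.
function samplePayload(records = 60) {
  const rows = Array.from({ length: records }, (_, i) => ({
    id: i,
    email: `user${i}@example.com`,
    score: (i * 7919) % 1000,
    tags: ['alpha', 'beta', 'gamma'].slice(0, (i % 3) + 1),
  }));
  return Buffer.from(JSON.stringify(rows));
}

// Compress `body` `iterations` times at `level`; report output size and mean time per call.
function measureLevel(body, level, iterations = 200) {
  const start = process.hrtime.bigint();
  let size = 0;
  for (let i = 0; i < iterations; i += 1) {
    size = zlib.gzipSync(body, { level }).length;
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return { level, size, microsPerCall: elapsedNs / iterations / 1000 };
}

const payload = samplePayload();
for (const level of [1, 5, 6, 9]) {
  console.log(measureLevel(payload, level));
}
```

Run it with a captured production response in place of the synthetic payload before settling on a level.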

For static assets (CSS, JavaScript, SVG), use pre-compressed files stored at build time. There is no runtime compression cost and you get the maximum compression ratio: you can run brotli at level 11 during the build and serve from a static file, paying no compression CPU on each request.

Brotli: Pre-Compress Your Assets

Brotli (the br encoding) achieves 15-25% better compression than gzip at comparable quality levels, particularly on text content. The catch is that maximum Brotli compression takes substantially longer than maximum gzip compression — Brotli level 11 can take seconds per file.

This makes Brotli a build-time operation, not a runtime operation, for static assets. Generate .br files alongside your .gz files at deploy time:

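# The parentheses are escaped so the shell passes them to find as grouping operators.
find /var/www/static -type f \( -name "*.css" -o -name "*.js" \) | while IFS= read -r f; do
  brotli --best --keep "$f"   # writes "$f.br" and keeps the original
  gzip -k -9 "$f"             # writes "$f.gz" and keeps the original
done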

Configure Nginx to serve pre-compressed files when the client supports them:

location /static/ {
  gzip_static on;    # requires ngx_http_gzip_static_module
  brotli_static on;  # requires ngx_brotli module
  expires 1y;
  add_header Cache-Control "public, immutable";
}

The CDN Double-Compression Trap

A common misconfiguration: your origin server compresses responses, then your CDN compresses them again. The CDN receives a gzip-encoded response body, treats it as opaque binary, and gzip-compresses it. The result is a response larger than a single compression pass would produce: you have compressed compressed data, which does not compress, and paid the framing overhead twice.

The correct setup depends on whether you want your CDN to cache compressed responses or compress dynamically:

Option A: CDN does compression. Disable compression on your origin. The CDN receives raw responses, compresses for the client, caches the compressed version. This is usually more efficient because the CDN compresses once and serves the cached result to many clients.

Option B: Origin does compression. Enable compression on your origin. Add Vary: Accept-Encoding so the CDN caches separately for clients that support gzip and clients that do not. The CDN serves the origin-compressed response as-is.

The wrong option is both at once.
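
The cost of a second pass can be reproduced locally. This sketch gzips a synthetic JSON body once and then gzips the result again, standing in for an intermediary that recompresses an already-encoded body. The record shape is hypothetical.

```javascript
const zlib = require('zlib');

// Hypothetical JSON body standing in for an API response.
const original = Buffer.from(
  JSON.stringify(
    Array.from({ length: 40 }, (_, i) => ({ id: i, name: `item-${i}`, price: i * 1.5 }))
  )
);

const once = zlib.gzipSync(original); // what the origin sends
const twice = zlib.gzipSync(once);    // what a recompressing intermediary would produce

console.log({ original: original.length, once: once.length, twice: twice.length });

// A single gunzip of the double-compressed body yields gzip bytes, not JSON.
const firstPass = zlib.gunzipSync(twice);
console.log('one decode returns the gzip stream:', firstPass.equals(once));
```

The second pass saves nothing, and a client that decodes only one layer ends up holding a gzip stream where it expected JSON.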

The Vary Header

When you serve compressed responses, add Vary: Accept-Encoding:

gzip_vary on;

This tells caches (CDNs, browser caches, proxies) that the response varies based on the Accept-Encoding request header. Without it, a cache might serve a gzip-compressed response to a client that sent Accept-Encoding: identity (no compression), which the client cannot decode.

Nginx's gzip_vary on handles this automatically. If you are implementing compression in application code, you need to add this header yourself.
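
For the application-code case, here is a minimal sketch of what adding the header yourself might look like with Node's http and zlib modules. The helper names acceptsGzip, encodeResponse, and createServer are made up for illustration. The negotiation is simplified to gzip versus identity, and it skips the minimum-size check a real deployment would add.

```javascript
const http = require('http');
const zlib = require('zlib');

// True if Accept-Encoding allows gzip: an explicit gzip entry wins, otherwise "*" applies.
function acceptsGzip(header = '') {
  const weights = new Map();
  for (const part of header.toLowerCase().split(',')) {
    const [name, ...params] = part.split(';').map((s) => s.trim());
    if (!name) continue;
    const q = params.find((p) => p.startsWith('q='));
    weights.set(name, q === undefined ? 1 : parseFloat(q.slice(2)));
  }
  const weight = weights.has('gzip') ? weights.get('gzip') : weights.get('*');
  return weight !== undefined && weight > 0;
}

// Build headers and body for one response. Vary is set on BOTH variants,
// because the uncompressed response is also specific to the request's Accept-Encoding.
function encodeResponse(body, acceptEncoding) {
  const headers = { 'Content-Type': 'application/json', Vary: 'Accept-Encoding' };
  if (!acceptsGzip(acceptEncoding)) return { headers, body };
  return { headers: { ...headers, 'Content-Encoding': 'gzip' }, body: zlib.gzipSync(body) };
}

// Hypothetical server wiring; `payload` is assumed to be a Buffer. Not started automatically.
function createServer(payload) {
  return http.createServer((req, res) => {
    const { headers, body } = encodeResponse(payload, req.headers['accept-encoding']);
    res.writeHead(200, { ...headers, 'Content-Length': body.length });
    res.end(body);
  });
}
```

Setting Vary on the uncompressed variant matters as much as on the compressed one, since a cache can store either.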

What Not to Compress

Do not compress already-compressed formats. JPEG, PNG, MP4, MP3, WebP, AVIF, and WOFF2 font files all use compression internally. Running gzip over them adds overhead without reducing size. Nginx's gzip_types directive defaults to text/html only, so these formats are excluded unless you add them; if you have added custom content types, double-check.

The test: if the compressed size is within 1-2% of the original size, the format is already compressed. Stop compressing it.
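
That test is simple to automate. In this sketch, the helper names gzipSavings and worthCompressing are invented, the 2% cut-off mirrors the rule of thumb above, and random bytes stand in for an already-compressed file, which is an assumption made to keep the example self-contained.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Fraction of bytes saved by gzip; zero or negative means gzip did not help.
function gzipSavings(buf, level = 5) {
  return 1 - zlib.gzipSync(buf, { level }).length / buf.length;
}

// The 2% default follows the rule of thumb above; tune it for your content.
function worthCompressing(buf, minSavings = 0.02) {
  return buf.length > 0 && gzipSavings(buf) > minSavings;
}

// Random bytes approximate the entropy of formats such as JPEG or WOFF2.
const alreadyCompressed = crypto.randomBytes(64 * 1024);
const stylesheet = Buffer.from('body { margin: 0; padding: 0; }\n'.repeat(500));

console.log('random bytes:', worthCompressing(alreadyCompressed));
console.log('css text:', worthCompressing(stylesheet));
```

Running worthCompressing over one sample file per content type you serve gives a quick audit of a gzip_types list.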

API Response Sizes

For JSON APIs, the break-even math matters:

  • A simple {"status":"ok"} response is 15 bytes. Gzip makes it about 35 bytes (18 bytes of framing plus a 17-byte DEFLATE stream). Do not compress it.
  • A paginated list response with 20 records averages around 3KB. Gzip reduces it to ~400 bytes — an 87% reduction worth the CPU cost.
  • An error response with a message field might be 200-400 bytes. This is the grey zone. Test your specific payloads.

The correct threshold for your API depends on your actual response sizes. The 860-byte value is a starting point, not a law.
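
One way to turn "test your specific payloads" into a number is to run captured response bodies through gzip and find the size above which every sample shrank. This is a sketch under that assumption; suggestMinLength is a hypothetical helper, and its answer is only as good as the samples you feed it.

```javascript
const zlib = require('zlib');

// Given sample response bodies (Buffers), return the smallest body size at or above
// which every sample got smaller under gzip, or null if even the largest sample grew.
function suggestMinLength(samples, level = 5) {
  const sorted = [...samples].sort((a, b) => b.length - a.length); // largest first
  let candidate = null;
  for (const body of sorted) {
    const shrank = zlib.gzipSync(body, { level }).length < body.length;
    if (!shrank) break; // first sample that grew: stop lowering the threshold
    candidate = body.length;
  }
  return candidate;
}

// Hypothetical samples: a health check, an error body, and a list response.
const samples = [
  Buffer.from('{"status":"ok"}'),
  Buffer.from(JSON.stringify({ error: 'validation failed', fields: ['email', 'name', 'email'] })),
  Buffer.from(JSON.stringify(Array.from({ length: 20 }, (_, i) => ({ id: i, title: `Record ${i}` })))),
];

console.log('suggested minimum length:', suggestMinLength(samples));
```

With only a handful of samples the result is coarse, so collect bodies across the full size range your API produces.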

Express and Application-Level Compression

If you are handling compression in Express rather than at the reverse proxy:

const compression = require('compression');

app.use(compression({
  threshold: 860,  // bytes — override the 1KB default
  level: 5,        // not 6 (the library default) and definitely not 9
  filter: (req, res) => {
    // Do not compress responses if the client didn't ask for it
    if (!req.headers['accept-encoding']) return false;
    return compression.filter(req, res);
  }
}));

The compression middleware defaults to a 1KB threshold, which is already better than Nginx's 20-byte default. Lowering it to 860 bytes is still worthwhile.

The bigger recommendation: handle compression at Nginx. It is faster (C code vs. JavaScript), handles connection-level optimizations you cannot replicate in Node, and keeps your application code free of infrastructure concerns. Application-level compression middleware exists for deployments where you do not control the reverse proxy.

The default settings on most web servers were written in an era where payloads were larger and connection overhead was higher. Modern APIs serving small JSON fragments need thresholds tuned to their actual payload sizes, starting well above Nginx's 20-byte default. Check your smallest responses. If they are growing, the configuration is telling you something.


Written by

Vera

Engineering researcher. APIs, databases, infrastructure, systems design.
