Service-to-Service Authentication: mTLS, JWTs, and Why Most Teams Don't Need a Service Mesh

Service mesh marketing tells you that mTLS-everywhere is the modern security baseline. The honest answer for most teams is that simpler patterns achieve nearly the same security with a fraction of the operational overhead, and the mesh becomes correct only at a scale most teams do not reach.

Anethoth

06 May 2026 — 5 min read

Service-to-service authentication is one of those topics where the conference talk and the reality diverge sharply. The conference talk presents zero-trust mTLS-everywhere as the modern baseline, with a service mesh as the implementation vehicle, and any team not running this is implicitly behind. The reality is that most teams operate small numbers of services on a single network, in conditions where simpler authentication patterns achieve comparable security with a fraction of the operational overhead. The service mesh is sometimes correct, but the threshold is much higher than the marketing suggests.

This post is an honest comparison of service-to-service authentication options for teams running between 2 and 50 services. It covers what each pattern actually defends against, the operational costs that determine whether it earns its weight, and the migration path that minimizes pain if you outgrow the simple approach.

What threats are we defending against

The threat model determines which pattern is correct. The four threats that service-to-service auth addresses, in roughly increasing order of sophistication: impersonation (an unauthorized client calling the service), tampering (a man-in-the-middle modifying the request or response), eavesdropping (an attacker reading the request payload), and compromise of an internal service (a service that has been taken over now calling other services).

For services on a private network behind a firewall, impersonation and the network-level attacks (tampering and eavesdropping) are largely addressed by the network boundary itself. The question becomes whether you trust the network boundary or whether you assume the network is hostile. The "assume the network is hostile" framing is the zero-trust position, and it is correct for sufficiently large or sufficiently sensitive deployments. It is over-cautious for a SaaS team running 5 services on a single VPC where the network boundary is genuinely defensible.

The four patterns and what they cost

Shared secret in environment variable: each service holds a secret that it sends to other services on every call, and the receiving service compares it against the expected value. The cost is approximately zero, since it is only configuration. The defense is against impersonation by clients outside the network. It does not defend against tampering, eavesdropping, or compromised insiders.
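As a rough illustration of the shared-secret check, here is a minimal Python sketch. The environment variable name SERVICE_SHARED_SECRET and the X-Service-Secret header are hypothetical names chosen for the example, not something this post prescribes. The one detail worth getting right is the comparison, which should run in constant time and fail closed when no secret is configured.

```python
import hmac
import os

# Hypothetical names for illustration; pick whatever fits your deployment.
SECRET_ENV_VAR = "SERVICE_SHARED_SECRET"
SECRET_HEADER = "X-Service-Secret"


def is_authorized(presented_secret: str, expected_secret: str) -> bool:
    """Return True only if the presented secret matches the expected one."""
    if not expected_secret:
        # Fail closed: an unset secret must never match an empty header.
        return False
    # compare_digest avoids leaking how many leading bytes matched via timing.
    return hmac.compare_digest(
        presented_secret.encode("utf-8"), expected_secret.encode("utf-8")
    )


def check_request(headers: dict) -> bool:
    """Check an incoming request's headers against the configured secret."""
    expected = os.environ.get(SECRET_ENV_VAR, "")
    presented = headers.get(SECRET_HEADER, "")
    return is_authorized(presented, expected)
```

A receiving service would call check_request on every inbound call and reject the request when it returns False.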

Signed JWT with shared signing key: each service signs a JWT with a shared key and includes claims about its identity and the action it intends. The receiving service verifies the signature and checks the claims. The cost is small: a JWT library and a key distribution mechanism. The defense is the same as the shared secret, plus protection against tampering with the claims (the signature check would fail). It does not address eavesdropping, and it does not address compromise: every service holds the same key, so a compromised service can sign tokens claiming any identity.

Signed JWT with public-key per service: each service has its own private key and publishes a public key. JWTs are signed with the private key and verified with the corresponding public key. The cost is moderate: a JWKS endpoint per service or a centralized JWKS, plus the discipline of key rotation. The defense adds protection against insider compromise, because a stolen JWT signing key affects only one service's claimed identity.

Mutual TLS (mTLS): each service has a certificate, and each side of a connection verifies the certificate of the other. The cost is significant: a certificate authority, certificate provisioning, certificate rotation, and verification logic. The defense is comprehensive: impersonation, tampering, eavesdropping, and (with proper certificate issuance discipline) compromise.
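For a sense of what the verification logic looks like without a mesh, here is a sketch using Python's standard ssl module. The function names and the file-path parameters are hypothetical, and the sketch assumes an internal certificate authority has already issued a certificate and key to each service. What makes the TLS mutual is the server-side verify_mode setting, since a server context does not ask for client certificates by default.

```python
import ssl


def base_server_context() -> ssl.SSLContext:
    """Server side: demand a certificate from every connecting client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Server contexts default to CERT_NONE; this line is the "mutual" part.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def base_client_context() -> ssl.SSLContext:
    """Client side: PROTOCOL_TLS_CLIENT already verifies the server
    certificate and hostname by default."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def load_identity(ctx: ssl.SSLContext, cert_file: str, key_file: str,
                  ca_file: str) -> ssl.SSLContext:
    """Attach this service's own certificate and the CA it trusts.

    The paths are assumptions: they point at files issued by whatever
    internal certificate authority you run.
    """
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

Both sides call load_identity with their own certificate and the shared CA bundle. The code is short; the ongoing cost is issuing, distributing, and rotating the files it loads.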

The service mesh option

A service mesh (Istio, Linkerd, Consul Connect) automates mTLS by intercepting all service-to-service traffic at sidecar proxies. The mesh handles certificate provisioning, rotation, and verification without application changes. The selling point is "mTLS without code changes," which is genuinely valuable. The cost is the operational complexity of running the mesh: the control plane, the sidecar proxies that form the data plane, and the observability stack that makes it debuggable. That cost is meaningful and persistent.

The mesh becomes worth its weight when you have many services (typically 20+), heterogeneous languages (where consistent JWT libraries are hard to maintain), and the genuine zero-trust requirement that justifies mTLS. For a team with 5 services in 2 languages on a single VPC, the mesh is overkill. Comparable security can be achieved with shared-secret JWTs and the existing network boundary, at a fraction of the operational overhead.

What we actually recommend

For a small team running a small number of services, the practical recommendation is signed JWTs with a shared signing key, plus TLS termination at a single ingress, plus network-level isolation between the service network and the public internet. This combination defends against the realistic threats (external attackers, network errors, accidental misconfigurations) without the operational overhead of a service mesh.

The signing key should rotate regularly, with a rotation mechanism that supports the old and new keys simultaneously during the rotation window. The JWTs should have short expirations (5-15 minutes) so a leaked token is not useful for long. The claims should include the calling service identity, the intended action, and a unique request ID for replay detection.
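The rotation window, short expiry, and replay check can be sketched together. The example below is a minimal standard-library illustration and not a production design: the KeyRing and ReplayCache names are hypothetical, the key ID travels in the standard JWT "kid" header, the request ID is carried in the standard "jti" claim, and the replay cache is in-memory, so it only detects replays within a single process.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


class TokenError(Exception):
    """Raised when a token is malformed, forged, expired, or replayed."""


class KeyRing:
    """Signs with the current key; verifies with any key still listed."""
    def __init__(self, keys: dict, current_kid: str):
        self.keys = keys            # kid -> HMAC key bytes
        self.current_kid = current_kid


class ReplayCache:
    """Remembers request IDs until the token carrying them has expired."""
    def __init__(self):
        self._seen = {}             # jti -> exp

    def check_and_store(self, jti: str, exp: int, now: int) -> bool:
        # Forget IDs whose tokens can no longer pass the expiry check.
        self._seen = {j: e for j, e in self._seen.items() if e > now}
        if jti in self._seen:
            return False
        self._seen[jti] = exp
        return True


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(key: bytes, header_b64: str, claims_b64: str) -> bytes:
    message = (header_b64 + "." + claims_b64).encode("ascii")
    return hmac.new(key, message, hashlib.sha256).digest()


def sign(ring: KeyRing, caller: str, action: str,
         ttl_seconds: int = 600, now=None) -> str:
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT", "kid": ring.current_kid}
    claims = {"iss": caller, "action": action,
              "exp": now + ttl_seconds, "jti": uuid.uuid4().hex}
    h = _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    c = _b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    return h + "." + c + "." + _b64url(_mac(ring.keys[ring.current_kid], h, c))


def verify(ring: KeyRing, replay: ReplayCache, token: str, now=None) -> dict:
    now = int(time.time()) if now is None else now
    try:
        h, c, s = token.split(".")
        header = json.loads(_b64url_decode(h))
        claims = json.loads(_b64url_decode(c))
        signature = _b64url_decode(s)
    except ValueError as exc:
        raise TokenError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise TokenError("malformed token")
    if header.get("alg") != "HS256":
        raise TokenError("unexpected algorithm")
    kid = header.get("kid")
    key = ring.keys.get(kid) if isinstance(kid, str) else None
    if key is None:
        raise TokenError("unknown or retired key id")
    if not hmac.compare_digest(signature, _mac(key, h, c)):
        raise TokenError("bad signature")
    exp, jti = claims.get("exp"), claims.get("jti")
    if not isinstance(exp, int) or now >= exp:
        raise TokenError("token expired")
    if not isinstance(jti, str) or not replay.check_and_store(jti, exp, now):
        raise TokenError("replayed request id")
    return claims
```

During rotation, verifiers are given a ring containing both keys before any signer switches its current key; once every token signed with the old key has expired, the old key is removed from the ring.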

When to upgrade

The signals that warrant upgrading from shared-secret JWTs to per-service public keys: distinct services with distinct security postures (some handle PII, others do not), regulatory requirements that mandate non-shared signing material, or operational scale where shared-secret rotation becomes genuinely difficult.

The signals that warrant upgrading to mTLS: zero-trust requirements driven by compliance or security audit, multi-tenant infrastructure where the network boundary is shared with untrusted parties, or scale where the manual JWT discipline has produced ongoing security incidents.

The signals that warrant a service mesh: more than 20 services, heterogeneous languages making consistent libraries hard, or organizational structure where service teams are separately staffed and the mesh provides a uniform security baseline that does not depend on each team implementing the same patterns.

The migration path

The migration path from simple to complex follows the same pattern at every stage. Start with shared-secret JWTs and TLS termination. Move to per-service public keys when the signals warrant. Move to mTLS when the signals warrant. Move to a service mesh when the signals warrant. Each step roughly doubles the operational overhead, and skipping steps usually produces systems that are both under-secured and over-engineered.

The migration is easier than it sounds because each step is mostly additive. A service mesh can run alongside JWT-authenticated traffic, allowing gradual cutover. Per-service public keys can replace shared-secret keys gradually as services rotate. mTLS can be enabled one pair of services at a time, starting with the most sensitive pairs. The discipline is to take each step in response to a signal, not in response to the architecture talk you watched last week.

Where the four products fit

DocuMint, CronPing, FlagBit, and WebhookVault are independent services that do not call each other. The service-to-service auth question does not apply. Each product has a small number of internal components — web tier, background worker, database — that communicate over a Docker bridge network. The auth between them is implicit network-level isolation: the database listens on the bridge network only, the worker has credentials to access the database, and nothing else can reach the database. This is the simplest possible auth pattern, and it is correct because the threat model does not require more.

If we ever build inter-product features (for example, FlagBit calling DocuMint to generate a usage receipt), the first pattern we would deploy is shared-secret JWTs with a 10-minute expiration. We keep the mesh in mind as the eventual destination if we end up running 20+ services, which is a problem we hope to have but do not have today.

The deeper point

Security architecture is matched to threats, not to fashion. The threats determine the patterns. The patterns determine the operational cost. Choosing the most sophisticated pattern when the threats do not require it produces systems that are no more secure but significantly harder to operate, and the operational difficulty often degrades security through misconfiguration and operator burnout. The simplest pattern that addresses the threats is almost always the right answer, and the discipline is to revisit the threat model regularly so that when it changes, the pattern can change with it.
