Secrets Management for Small SaaS: Patterns That Don't Require a Vault Cluster

Most secrets-management advice is written for large enterprises and assumes a HashiCorp Vault cluster you do not have. The patterns that actually work for small SaaS are simpler — environment files with strict permissions, KMS-backed encryption for the few secrets that need it, rotation disciplin...

Walk into any small SaaS startup and you will find secrets in three places: an .env file on the production host, a few environment variables set by the deployment pipeline, and a Slack thread from six months ago where someone shared a credential without thinking. The dominant secrets-management advice on the internet — set up Vault, integrate with cloud KMS, build a secret-injection sidecar — is written for organizations with a security team that can own that infrastructure. For a four-person team running four products on one VPS, that advice is correct in principle and unworkable in practice.

What actually works at small scale is a much simpler set of patterns: strict file permissions on .env files, KMS-backed encryption for the few secrets that genuinely need it, a rotation discipline that does not require dedicated tooling, and the meta-discipline of keeping the surface of secrets as small as possible. This piece walks through the patterns that are honest about the trade-offs at this scale.

The .env file as the unit of secret management

For most small SaaS teams, the .env file on the production host is the secrets store. This is fine if it is treated with the discipline a secrets store deserves: the file has mode 600 and is owned by the application user, it is excluded from version control via .gitignore, and it is provisioned over a secure channel (SSH paste, configuration management, manual SCP) rather than being committed anywhere.

The patterns that go wrong: .env files committed accidentally because someone added a new secret and ran git add . instead of being explicit; .env files readable by other users on the host because the default umask was 022; .env files copied to staging or test environments that have weaker access controls; .env files with secrets for services that are no longer in use.

The minimum hygiene is a short audit script that checks .env file permissions, plus a periodic review that diffs the .env keys against the application's actual configuration usage. A few lines of bash around find /opt -name '.env*' -exec stat -c '%a %n' {} \; that flag any mode other than 600 are among the lowest-effort, highest-impact security controls you can deploy.
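
For teams that prefer Python to bash, the permission check could be sketched roughly as follows. The function name audit_env_files is hypothetical, and the rule it applies (flag any .env* file with group or other permission bits set) is one reasonable reading of "strict permissions", not the only one.

```python
import stat
from pathlib import Path


def audit_env_files(root: str) -> list[tuple[str, str]]:
    """Return (path, octal mode) for each .env* file under root that
    grants any permission to group or others."""
    findings = []
    for path in Path(root).rglob(".env*"):
        if not path.is_file():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        # 0o077 masks the group and other bits; a file with mode 600
        # has none of them set.
        if mode & 0o077:
            findings.append((str(path), oct(mode)))
    return findings
```

Run from cron against the directory that holds your deployments (for example audit_env_files("/opt")), any non-empty result is something to fix the same day.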

Secrets that genuinely need encryption at rest

Some secrets need stronger protection than file permissions provide. The criterion is whether the secret is more sensitive than the host itself: if compromise of the secret is significantly worse than compromise of the host, it deserves encryption with keys held outside the host.

The canonical examples are payment processor keys (Stripe production keys, PayPal credentials), customer data encryption keys, and the master keys for any vault-like data your application stores. Compromising these means the attacker can move money or read customer data even after losing access to the host.

The right pattern at small scale is KMS-backed encryption: the secret is encrypted with a key held in AWS KMS, GCP Cloud KMS, or a similar service, and decrypted by the application at startup using its IAM-granted access to the KMS. The secret on disk is the ciphertext; the plaintext exists only in process memory. If the host is compromised, the attacker gets the ciphertext and the IAM credentials, and can decrypt — but the audit trail in CloudTrail / Cloud Audit Logs records every decryption, which is the actual security benefit.
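
A minimal sketch of the decrypt-at-startup step, assuming AWS KMS with boto3 and a ciphertext stored on disk as base64. The file layout and the names kms_decrypt and load_secret are illustrative assumptions; the decrypt function is injectable so the loading logic can be exercised without AWS access.

```python
import base64
from typing import Callable


def kms_decrypt(ciphertext: bytes) -> bytes:
    """Decrypt with AWS KMS using whatever IAM identity the host has."""
    # Imported lazily so the rest of the module has no AWS dependency.
    import boto3

    response = boto3.client("kms").decrypt(CiphertextBlob=ciphertext)
    return response["Plaintext"]


def load_secret(path: str, decrypt: Callable[[bytes], bytes] = kms_decrypt) -> str:
    """Read a base64-encoded ciphertext file and return the plaintext.

    The plaintext is handed to the caller and never written back to disk.
    """
    with open(path, "rb") as f:
        ciphertext = base64.b64decode(f.read())
    return decrypt(ciphertext).decode("utf-8")
```

At startup the application would call something like load_secret("/etc/myapp/stripe.key.enc") once (the path is a made-up example) and keep the result in memory. Each such call appears as a Decrypt event in CloudTrail.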

This is more infrastructure than .env files but less infrastructure than running a Vault cluster, and it is the right intermediate point for most small teams.

The secret surface principle

The most effective secrets-management practice is to have fewer secrets. Every secret is a thing that can leak. Reducing the number of secrets reduces the leak surface in a way no rotation policy can match.

Concrete patterns: use IAM roles instead of API keys for cloud service access (no credential, no leak). Use service accounts with limited scopes instead of admin credentials (smaller blast radius if leaked). Use mutual TLS for service-to-service auth instead of API tokens (no shared secret to steal). Use webhook signatures instead of webhook auth tokens (the signing secret is never sent with the request, so it cannot be captured in transit or in a request log).
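
To make the webhook-signature pattern concrete, here is a small sketch using HMAC-SHA256 from the Python standard library. The function names are hypothetical, and real providers each define their own header format; many also sign a timestamp to limit replay, which is omitted here.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, payload: bytes, signature: str) -> bool:
    """Check a received signature against the one we compute ourselves."""
    expected = sign_payload(secret, payload)
    # compare_digest runs in constant time, so response timing does not
    # reveal how much of the signature was correct.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Only the signature travels with the request. An attacker who captures a request or reads a log line sees a value that is valid for that one payload and nothing else.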

The compounding benefit is that secret-free or signature-only architectures eliminate entire categories of failure: the engineer who pastes a credential into a Slack thread, the log file that captures a secret in a request body, the third-party tool that claims to need credentials and turns out to need only signed requests. Each of these is a real-world incident type that cannot happen to a credential that does not exist.

Rotation: the policy that actually gets followed

The standard advice on rotation is to rotate every secret every 90 days. The standard reality is that secrets get rotated when an engineer leaves, when a credential is suspected of being leaked, or when a compliance audit demands it. The 90-day cadence is followed in policy documents and ignored in practice.

The rotation policy that actually gets followed is event-driven, not calendar-driven: rotate on engineer departure, rotate on suspected compromise, rotate when third parties request it (Stripe key rotation, OAuth refresh credentials, etc.). The calendar cadence applies only to the secrets that genuinely matter most: the payment processor keys, the customer data encryption keys, the database root credentials.

For event-driven rotation to work at all, the application must support graceful credential rotation: the ability to introduce a new credential, run both credentials valid simultaneously for a brief window, then retire the old one. This requires the credential to be configurable (not hardcoded), to be loaded at startup (not embedded), and ideally to be reloadable on a signal so rotation does not require a deployment.
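
The "reloadable on a signal" requirement can be sketched as below, assuming a Unix host and a simple KEY=VALUE .env file. The class and function names are hypothetical, and the parsing is deliberately naive (no quoting or multi-line values).

```python
import signal
from typing import Optional


def read_env_value(env_path: str, key: str) -> Optional[str]:
    """Return the value for key from a KEY=VALUE file, or None if absent."""
    with open(env_path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            name, sep, value = line.partition("=")
            if sep and name.strip() == key:
                return value.strip()
    return None


class ReloadableCredential:
    def __init__(self, env_path: str, key: str):
        self._env_path = env_path
        self._key = key
        self.value = read_env_value(env_path, key)
        if self.value is None:
            raise KeyError(f"{key} not found in {env_path}")

    def reload(self, signum=None, frame=None) -> None:
        """Re-read the file; keep the old value if the key cannot be read."""
        try:
            new_value = read_env_value(self._env_path, self._key)
        except OSError:
            return
        if new_value is not None:
            self.value = new_value


def install_reload_handler(credential: ReloadableCredential) -> None:
    # SIGHUP is the conventional "re-read your configuration" signal on Unix.
    signal.signal(signal.SIGHUP, credential.reload)
```

With the handler installed, updating the .env file and sending SIGHUP to the process (kill -HUP with its PID) picks up the new credential without a deployment.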

The pattern we use across DocuMint, CronPing, FlagBit, and WebhookVault is environment-loaded credentials with the rotation operation being: deploy with both old and new key valid, wait for old key to be unused, deploy with only new key. This works for any credential that has a server-side allowlist (API keys, webhook signing keys), and it does not require any rotation infrastructure beyond the deployment pipeline.
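
On the server side, the overlap window amounts to accepting either of two keys. A rough sketch, where the environment variable names API_KEY_CURRENT and API_KEY_PREVIOUS are invented for illustration and are not necessarily what any of the products above use:

```python
import hmac
import os
from typing import Mapping


def accepted_keys(environ: Mapping[str, str] = os.environ) -> list[bytes]:
    """Collect the keys that are valid right now.

    During rotation both variables are set; once the old key is unused,
    the next deploy simply omits API_KEY_PREVIOUS.
    """
    names = ("API_KEY_CURRENT", "API_KEY_PREVIOUS")
    return [environ[n].encode() for n in names if environ.get(n)]


def is_valid_key(presented: str, keys: list[bytes]) -> bool:
    candidate = presented.encode()
    # Compare against every accepted key in constant time rather than
    # stopping at the first match.
    results = [hmac.compare_digest(candidate, k) for k in keys]
    return any(results)
```

Logging which of the two keys matched (never the key itself) is one way to tell when the old key has gone unused and can be dropped.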

The credentials that should never be in the repo

The .gitignore file is the first line of defense and the easiest to get wrong. Every team has its own list, but at minimum every backend project's .gitignore should include: .env*, *.key, *.pem, *.pfx, credentials.json, service-account*.json. A pre-commit hook that runs git diff --cached against a regex of common secret formats (AWS access key IDs start with AKIA, GitHub personal access tokens start with ghp_, Stripe secret keys start with sk_) catches most of the rest before commit.
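
A pre-commit check along these lines might look like the sketch below. The prefixes come from the paragraph above; the character classes and lengths after each prefix are approximations chosen for illustration and should be tuned against real tokens before being relied on.

```python
import re
import subprocess

# Prefixes are as described in the text; the lengths are assumptions.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub token": re.compile(r"\bghp_[A-Za-z0-9]{20,}\b"),
    "Stripe secret key": re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{16,}\b"),
}


def find_secrets(diff_text: str) -> list[tuple[str, str]]:
    """Return (label, truncated match) for each suspicious added line."""
    hits = []
    for line in diff_text.splitlines():
        # Only lines added by the commit matter; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in SECRET_PATTERNS.items():
            match = pattern.search(line)
            if match:
                # Show only the first few characters so the hook's own
                # output does not repeat the secret in full.
                hits.append((label, match.group()[:8] + "..."))
    return hits


def staged_diff() -> str:
    """Return the diff of what is currently staged for commit."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A hook script would call find_secrets(staged_diff()) and exit non-zero when the list is non-empty, which blocks the commit.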

For the secrets that have already leaked into history, the only correct response is to rotate the secret and accept that the old one is compromised even if the leak was internal-only. git filter-branch and BFG Repo-Cleaner can rewrite history to remove the secret from the repo, but they cannot remove it from the clones, the CI caches, the engineer laptops, or the Slack thread where someone shared the leaked commit URL. The secret is compromised the moment it touches a repo. Treat it that way.

The secrets-in-logs problem

The most common way secrets actually leak is through logs. A request body contains a credential. The request is logged. The log is shipped to an external aggregator. The aggregator is breached six months later. The credential leaks.

The defense is structured logging with explicit field-level redaction: every log event has a known schema, fields known to contain secrets are redacted at the logger level, and request bodies are never logged in full. This is more discipline than infrastructure: the logger library should make redaction the default, and engineers should treat any deviation from that default as a code review concern.
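
With Python's standard logging module, field-level redaction can be approximated with a filter attached to the handler. This is a sketch, not a complete solution: the field list is an assumption to adapt to your own schema, and it only covers structured fields passed via extra, not secrets interpolated into the message string.

```python
import logging

# Hypothetical list of field names that must never reach a log sink.
REDACTED_FIELDS = {"password", "api_key", "authorization", "token", "secret"}


class RedactingFilter(logging.Filter):
    """Replace the value of any known-sensitive field on a log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Fields passed with extra={...} become attributes of the record.
        for name in list(vars(record)):
            if name.lower() in REDACTED_FIELDS:
                setattr(record, name, "[REDACTED]")
        return True  # never drop the record, only scrub it
```

Attaching the filter to the handler (handler.addFilter(RedactingFilter())) rather than to individual loggers means every record passing through that handler is scrubbed, which is what makes redaction the default rather than something each call site has to remember.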

The audit pattern that works: a periodic log query for known secret prefixes (sk_, ghp_, AKIA, etc.) across the log aggregator. Any hit is an incident. The query takes ten seconds; the absence of hits is the only evidence you actually have that secrets-in-logs is not happening.

The deployment pipeline as a secret store

For teams that have outgrown the .env file but have not adopted a vault, the deployment pipeline (GitHub Actions, GitLab CI, etc.) is often the de facto secrets store. The CI provider's secret store holds the credentials, and the deployment job copies them to the host or injects them as environment variables.

This is fine as far as it goes. The risks: CI provider compromise (rare but possible), insider risk among engineers with CI admin access (more common), accidental secret exposure in CI logs (very common — every CI provider has a redaction feature, and every CI provider has cases where the redaction misses).

The pattern that helps is to scope CI secrets per environment and per workflow, so a leaked credential is bounded to a single deployment context. The pattern that helps more is to keep payment-processor and customer-data keys out of CI entirely, fetching them from KMS at application startup rather than injecting them through CI. The CI pipeline only needs to deploy code; it does not need to handle the most sensitive secrets.

The summary that actually fits a small team

For most small SaaS teams, the secrets-management stack should be: .env files with strict permissions and a daily audit script for the bulk of operational secrets, KMS-backed encryption for payment and customer-data secrets, IAM roles wherever the cloud provider supports them, signature-based auth wherever a shared secret can be replaced with a signing key, structured logging with field-level redaction, .gitignore plus pre-commit hooks for the leak-prevention basics, and event-driven rotation with calendar rotation reserved for the secrets that genuinely matter most.

This is not Vault. It is not the architecture you would build at a thousand engineers. It is the architecture that is correct, operable, and maintainable at four engineers, which is the actual scale most SaaS teams are at. The dominant failure mode at small scale is not insufficient secrets infrastructure; it is the secrets infrastructure people do not actually operate. The smaller the architecture, the more likely it is to be correctly operated.
