Webhook Subscription Management: Patterns for the Single-URL-To-Multi-Tenant Migration
Most webhook integrations start with one URL hardcoded in a settings page. Then customers ask for per-event routing, per-environment filtering, and retry visibility, and the schema that started as a single text field becomes a subscription system. Here is the migration path.
Webhook integrations almost always start the same way. There is a settings page in the dashboard with a single text field labeled "Webhook URL." Customers paste a URL. Your service POSTs every event to that URL. It works. It ships. It pays the bills.
Then a customer asks if they can have a different URL for staging events versus production events. Then another customer asks if they can subscribe only to invoice.paid events without receiving every other event type. Then someone asks for a per-team URL because their finance team and engineering team want different events going to different Slack channels. The single text field has quietly become a subscription system, and the question is whether you build it deliberately or let the requirements drag you through three painful migrations.
This is the migration path from one hardcoded URL to a multi-tenant subscription system, written from the perspective of having done it twice. The conclusions are: do the migration when the second customer asks rather than the seventh, expose the model deliberately rather than accreting fields onto a single record, and make the system observable from day one because customers will not wait for you to add observability after they discover events are missing.
The single-URL stage and what it lets you defer
The simplest webhook configuration is a single URL stored against the customer record. One field, one URL, every event goes there. This stage is correct for almost every webhook product at launch because it lets you defer the entire subscription model until you have customers whose actual usage patterns tell you what shape it should take.
What you are deferring: per-event filtering, per-environment routing, multiple endpoints per customer, signing key rotation per endpoint, retry visibility per endpoint, and the dashboard surface for managing all of this. These are non-trivial pieces of product, and building them on speculation produces the wrong shape.
What you cannot defer at this stage: signature verification against a per-customer signing secret, a stable event ID on every event that consumers can use as an idempotency key for deduplication, and structured logging of every delivery attempt. These are required even with one URL because the later subscription model builds directly on them: per-endpoint secrets extend the signing scheme, and the delivery log extends the structured logs.
The two-URL trap
The most common wrong move is adding a second field. Customer asks for separate staging and production webhooks, you add a staging_webhook_url column next to webhook_url, you ship in a day, problem solved. Then a customer asks for three URLs. Then for per-event-type routing. Each accretion fights the schema, and three migrations later you have a wide row with webhook_url, staging_webhook_url, finance_webhook_url, analytics_webhook_url, and a parallel set of signing-secret columns.
The right move when the second URL is requested is to introduce the subscription table. The migration is bigger than adding a column, but it is the migration you would have done eventually, done now, when there is one customer to migrate.
The subscription model schema
The minimum viable subscription schema has these tables. A webhook_endpoints table with columns for endpoint URL, signing secret, customer ID, descriptive label, enabled flag, and creation timestamp. A webhook_subscriptions table linking endpoints to event types, with one row per (endpoint, event-type) pair. A webhook_deliveries table logging every delivery attempt with endpoint ID, event ID, HTTP status, response time, attempt number, and timestamp.
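The three tables can be sketched as follows, using SQLite through Python's `sqlite3` module so the example runs anywhere. Column names, types, the index, and the `endpoints_for_event` helper are assumptions chosen for illustration; a production schema would use your own database's types and conventions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE webhook_endpoints (
    id             INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL,
    url            TEXT    NOT NULL,
    signing_secret TEXT    NOT NULL,
    label          TEXT    NOT NULL DEFAULT '',
    enabled        INTEGER NOT NULL DEFAULT 1,
    created_at     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- One row per (endpoint, event-type) pair.
CREATE TABLE webhook_subscriptions (
    endpoint_id INTEGER NOT NULL REFERENCES webhook_endpoints(id) ON DELETE CASCADE,
    event_type  TEXT    NOT NULL,
    PRIMARY KEY (endpoint_id, event_type)
);

-- One row per delivery attempt; http_status is NULL when no response arrived.
CREATE TABLE webhook_deliveries (
    id               INTEGER PRIMARY KEY,
    endpoint_id      INTEGER NOT NULL REFERENCES webhook_endpoints(id),
    event_id         TEXT    NOT NULL,
    http_status      INTEGER,
    response_time_ms INTEGER,
    attempt_number   INTEGER NOT NULL,
    attempted_at     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_deliveries_endpoint_time
    ON webhook_deliveries (endpoint_id, attempted_at);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the three tables on a fresh connection."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


def endpoints_for_event(conn: sqlite3.Connection, customer_id: int, event_type: str):
    """Return (id, url) for every enabled endpoint subscribed to this event type."""
    return conn.execute(
        """
        SELECT e.id, e.url
        FROM webhook_endpoints e
        JOIN webhook_subscriptions s ON s.endpoint_id = e.id
        WHERE e.customer_id = ? AND e.enabled = 1 AND s.event_type = ?
        ORDER BY e.id
        """,
        (customer_id, event_type),
    ).fetchall()
```

The composite primary key on `webhook_subscriptions` enforces the one-row-per-pair rule, and the delivery path becomes a single join instead of a lookup on the customer record.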
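The migration from single-URL is: create the new tables, copy each existing customer's URL into a single endpoint row labeled "Default," create subscription rows for every event type the customer was previously receiving, start dual-writing configuration changes to both the legacy columns and the new tables, switch the delivery code path to read from the new tables, keep dual-writing for one retention cycle, and then drop the legacy columns. The dual-write phase is what makes the migration safe: until the legacy columns are dropped, rolling back means pointing delivery at the old columns again.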
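The backfill step of that migration might look like the sketch below. It assumes a legacy `customers` table with `webhook_url` and `signing_secret` columns and a hypothetical event vocabulary in `ALL_EVENT_TYPES`; the tables are trimmed to the columns the backfill touches, and `make_demo_db` exists only so the example runs in memory.

```python
import sqlite3

ALL_EVENT_TYPES = ["invoice.paid", "invoice.created", "user.created"]  # hypothetical vocabulary


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a legacy customers table and the new tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, webhook_url TEXT, signing_secret TEXT);
        CREATE TABLE webhook_endpoints (
            id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, url TEXT NOT NULL,
            signing_secret TEXT NOT NULL, label TEXT NOT NULL, enabled INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE webhook_subscriptions (
            endpoint_id INTEGER NOT NULL, event_type TEXT NOT NULL,
            PRIMARY KEY (endpoint_id, event_type)
        );
        INSERT INTO customers VALUES
            (1, 'https://acme.example/hooks', 'secret-acme'),
            (2, 'https://globex.example/hooks', 'secret-globex'),
            (3, NULL, 'secret-unused');
    """)
    return conn


def backfill_default_endpoints(conn: sqlite3.Connection, event_types: list[str]) -> int:
    """Copy each legacy webhook_url into one 'Default' endpoint. Safe to re-run."""
    migrated = 0
    legacy = conn.execute(
        "SELECT id, webhook_url, signing_secret FROM customers "
        "WHERE webhook_url IS NOT NULL AND webhook_url != ''"
    ).fetchall()
    for customer_id, url, secret in legacy:
        already = conn.execute(
            "SELECT 1 FROM webhook_endpoints WHERE customer_id = ? AND label = 'Default'",
            (customer_id,),
        ).fetchone()
        if already:
            continue  # a previous run already migrated this customer
        cur = conn.execute(
            "INSERT INTO webhook_endpoints (customer_id, url, signing_secret, label) "
            "VALUES (?, ?, ?, 'Default')",
            (customer_id, url, secret),
        )
        # Subscribe the endpoint to everything the single URL used to receive.
        conn.executemany(
            "INSERT INTO webhook_subscriptions (endpoint_id, event_type) VALUES (?, ?)",
            [(cur.lastrowid, event_type) for event_type in event_types],
        )
        migrated += 1
    conn.commit()
    return migrated


if __name__ == "__main__":
    demo = make_demo_db()
    print(backfill_default_endpoints(demo, ALL_EVENT_TYPES))
```

Two choices in this sketch are worth noting. The existing signing secret is carried over unchanged, so consumers keep verifying signatures without any action on their side. The script also skips customers that already have a "Default" endpoint, which lets it be re-run after a partial failure.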
Per-endpoint signing secrets, not per-customer
The single-URL stage usually has one signing secret per customer. This is fine until you have multiple endpoints per customer, at which point sharing a secret across endpoints becomes a problem: rotating the secret rotates it for all endpoints simultaneously, and a leak at one endpoint compromises all of them.
The right model is a signing secret per endpoint, generated when the endpoint is created. The endpoint API returns the full secret once, at creation time, and afterwards exposes only the last four characters. Customers rotate by adding a new endpoint, switching their consumer to verify the new signature, and then deleting the old endpoint. Stripe's webhook endpoints follow a similar model, with a signing secret scoped to each endpoint, and it is worth copying.
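The reveal-once contract can be sketched in a few lines. The `Endpoint` class, the `whsec_` prefix, and the response field names are hypothetical and chosen for illustration; only `secrets.token_urlsafe` is a real standard-library call.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Endpoint:
    id: int
    url: str
    signing_secret: str


def generate_signing_secret() -> str:
    """Generate a fresh random secret for a new endpoint (prefix is illustrative)."""
    return "whsec_" + secrets.token_urlsafe(32)


def creation_response(endpoint: Endpoint) -> dict:
    """Response to the create call: the only place the full secret appears."""
    return {"id": endpoint.id, "url": endpoint.url, "secret": endpoint.signing_secret}


def read_response(endpoint: Endpoint) -> dict:
    """Response to every later read: only the last four characters are exposed."""
    return {
        "id": endpoint.id,
        "url": endpoint.url,
        "secret_last4": endpoint.signing_secret[-4:],
    }


if __name__ == "__main__":
    ep = Endpoint(id=1, url="https://example.com/hook", signing_secret=generate_signing_secret())
    print(sorted(creation_response(ep)))
    print(sorted(read_response(ep)))
```

Keeping the two serializers separate makes it harder to leak the full secret through a list or detail view by accident.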
Filtering and the wildcard problem
Per-event-type filtering looks easy: subscribe an endpoint to a list of event types, only deliver events whose type matches. The trap is wildcards. Customers will ask for invoice.* patterns to subscribe to all invoice-related events. Then they will ask for *.created to subscribe to all creation events. Then they will ask for negation: everything except user.deleted.
The honest engineering choice is to start with literal event-type matching and explicit lists. If a customer wants every invoice event, the dashboard lets them check every checkbox in a single click. Wildcards arrive in version two if they arrive at all. The reason is that wildcard semantics interact in unobvious ways with new event types — when you ship a new invoice.refund_initiated event, customers who subscribed to invoice.* get it automatically without consenting, which is sometimes correct and sometimes a breaking change for them.
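A small sketch of literal matching with a dashboard-side "select all" helper. The function names and the in-memory `subscriptions` mapping are assumptions for illustration. The point is that the prefix is expanded into an explicit list at click time, so nothing is evaluated as a pattern at delivery time.

```python
def matching_endpoints(subscriptions: dict[int, set[str]], event_type: str) -> list[int]:
    """Return endpoint IDs whose explicit subscription list contains this exact type."""
    return sorted(eid for eid, types in subscriptions.items() if event_type in types)


def select_all_with_prefix(known_event_types: list[str], prefix: str) -> set[str]:
    """Dashboard helper: expand 'all invoice events' into the types that exist right now."""
    return {t for t in known_event_types if t.startswith(prefix + ".")}


if __name__ == "__main__":
    known = ["invoice.paid", "invoice.created", "user.created", "user.deleted"]
    subscriptions = {
        1: select_all_with_prefix(known, "invoice"),  # customer clicked "all invoice events"
        2: {"user.deleted"},
    }
    print(matching_endpoints(subscriptions, "invoice.paid"))
    # An event type shipped later is not in any stored list, so nobody receives it
    # until they opt in.
    print(matching_endpoints(subscriptions, "invoice.refund_initiated"))
```

Because the stored subscription is a snapshot of the event types that existed when the customer clicked, a newly shipped type reaches no one until they opt in.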
Delivery visibility as the highest-leverage feature
The customer-facing dashboard table that lists every recent webhook delivery, with status code, response body, latency, and a manual replay button, is the single highest-leverage feature in a webhook product. Most of the customer-facing problems with webhooks are not the webhooks themselves — they are the consumer endpoints failing in ways the customer cannot see. The delivery log is what makes those failures visible.
The retention pattern is to keep delivery records for 30 days at the row level, with longer retention for delivery counts as time-series metrics. Beyond 30 days the row-level data ages out of being useful for debugging individual events, but the summary statistics are still useful for trend analysis. WebhookVault is built around this exact principle — the capture-and-replay surface is the product.
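One way to implement that retention split is a periodic job that rolls old rows up into daily counts and then deletes them. The sketch below assumes a hypothetical `webhook_delivery_daily` rollup table and timestamps stored as `YYYY-MM-DD HH:MM:SS` text. It prunes whole days only, so each day is rolled up exactly once.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = 30

RETENTION_DEMO_SCHEMA = """
CREATE TABLE webhook_deliveries (
    id INTEGER PRIMARY KEY, endpoint_id INTEGER NOT NULL, event_id TEXT NOT NULL,
    http_status INTEGER, attempted_at TEXT NOT NULL
);
CREATE TABLE webhook_delivery_daily (
    endpoint_id INTEGER NOT NULL, day TEXT NOT NULL,
    succeeded INTEGER NOT NULL, failed INTEGER NOT NULL,
    PRIMARY KEY (endpoint_id, day)
);
"""


def roll_up_and_prune(conn: sqlite3.Connection, now: datetime) -> int:
    """Summarize delivery rows older than the retention window, then delete them."""
    cutoff_day = (now - timedelta(days=RETENTION_DAYS)).date().isoformat()
    # A NULL status (no response) falls through to the ELSE branch and counts as failed.
    conn.execute(
        """
        INSERT INTO webhook_delivery_daily (endpoint_id, day, succeeded, failed)
        SELECT endpoint_id,
               date(attempted_at),
               SUM(CASE WHEN http_status BETWEEN 200 AND 299 THEN 1 ELSE 0 END),
               SUM(CASE WHEN http_status BETWEEN 200 AND 299 THEN 0 ELSE 1 END)
        FROM webhook_deliveries
        WHERE date(attempted_at) < ?
        GROUP BY endpoint_id, date(attempted_at)
        """,
        (cutoff_day,),
    )
    deleted = conn.execute(
        "DELETE FROM webhook_deliveries WHERE date(attempted_at) < ?", (cutoff_day,)
    ).rowcount
    conn.commit()
    return deleted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(RETENTION_DEMO_SCHEMA)
    print(roll_up_and_prune(conn, datetime.now()))
```

Running the rollup and the delete against the same cutoff keeps the two tables consistent: a row is either still present in the log or already counted in the daily summary, never both.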
Retry policy as configuration, not code
The retry policy — backoff schedule, maximum attempt count, eventual-failure handling — should be exposed as endpoint-level configuration when customers reach the scale where defaults stop working. The default of exponential backoff capped at 24 hours with twelve attempts works for most customers. Some customers (real-time payment processors) want shorter retries because stale events are worthless to them. Some customers (overnight batch importers) want longer retries because their consumer is intentionally not running between batches.
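As a sketch of what endpoint-level retry configuration could look like, the dataclass below holds one policy per endpoint and computes the wait before the next attempt. The field names, the 60-second base delay, and the reading of "capped at 24 hours" as a cap on each individual delay are assumptions, not details given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    base_delay_seconds: int = 60            # assumed starting delay
    multiplier: float = 2.0                 # exponential growth factor
    max_delay_seconds: int = 24 * 60 * 60   # cap on any single delay (24 hours)
    max_attempts: int = 12                  # total attempts, including the first

    def delay_before_retry(self, failed_attempts: int) -> Optional[int]:
        """Seconds to wait after `failed_attempts` failures, or None to give up."""
        if failed_attempts < 1:
            raise ValueError("failed_attempts must be at least 1")
        if failed_attempts >= self.max_attempts:
            return None  # attempts exhausted: mark the delivery as permanently failed
        delay = self.base_delay_seconds * self.multiplier ** (failed_attempts - 1)
        return min(int(delay), self.max_delay_seconds)


if __name__ == "__main__":
    default_policy = RetryPolicy()
    # A payment processor that considers stale events worthless might configure:
    fast_policy = RetryPolicy(base_delay_seconds=5, max_delay_seconds=60, max_attempts=5)
    print([default_policy.delay_before_retry(n) for n in range(1, 4)])
    print([fast_policy.delay_before_retry(n) for n in range(1, 6)])
```

Storing these four values as columns on the endpoint row lets the retry worker load the policy alongside the endpoint and keeps the backoff arithmetic in one place.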
Exposing this as endpoint configuration costs a few columns and a more complicated retry worker. Not exposing it leads to a stream of customer-success conversations about why their consumer is being hammered too aggressively or not aggressively enough.
Where the four products fit
WebhookVault is the obvious fit — it is purpose-built for inspecting and replaying webhook deliveries, and the subscription model described here is the model the product implements. FlagBit uses the same delivery-log pattern for flag-change notifications. CronPing uses it for monitor-state-change webhooks. DocuMint uses it for document-completion webhooks. The subscription system is the same; what varies is the event vocabulary.
The migration is cheaper than the accretion
The reason to do the subscription migration when the second customer asks rather than the seventh is that the migration cost grows superlinearly with the number of customers and the number of fields you have accreted. With one customer and one column to migrate, the operation is a script that runs in a minute. With seven customers, four columns, and three signing secrets each, the operation is a project. The schema you would have shipped in week one is the schema you ship in month four anyway, and the difference between the two paths is several months of dual-write maintenance and customer-facing migration communication that did not need to happen.
The subscription model is also the schema customers eventually expect. Every developer-facing product they have integrated with — Stripe, GitHub, Linear, Vercel — has the same shape: endpoints, subscriptions, signing secrets per endpoint, delivery log with replay. Building toward that shape from day two makes the rest of your product feel familiar to the developers integrating with it.