Designing API Webhook Filtering: Event Types, Resource Filters, and the Patterns That Stay Maintainable

Webhook subscriptions almost always need filtering. The patterns that work are a small flat namespace of event types, optional resource filters per subscription, and explicit handling of new event types as they get added.

The simplest webhook subscription model is one URL per customer, all events delivered. It works for the first dozen customers and breaks the moment someone says they want invoice events but not customer events, or events for one project but not all projects, or any other reasonable variation of "I do not want everything." The right answer is webhook filtering, and the design choices made early determine whether the filtering API stays maintainable as the product grows.

We thought through these choices when designing the webhook surface for WebhookVault and revisited them as we added webhooks to FlagBit, CronPing, and DocuMint. The patterns below are the ones that survived contact with real integrations.

The event type namespace

The first design decision is the structure of the event type names. The dot-separated noun.verb pattern (invoice.created, invoice.paid, customer.updated) is the convention Stripe popularized and most webhook APIs have followed since. The dot separator suggests hierarchy, but both sender and receiver generally treat the full name as an opaque string. The convention works because it is concise, scannable, and groups related events visually in dashboards and code.

The wrong defaults are wildcard subscriptions like invoice.* or anything resembling regex matching. Wildcards seem like a small convenience but compound into substantial maintenance cost: every new event type added in your namespace silently changes the meaning of every wildcard subscription, and customers cannot tell from the dashboard whether a wildcard subscription includes the new event or not. The Stripe approach (a literal list of event types per subscription) is more verbose but avoids the surprise.
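A literal-list policy is easy to enforce at subscription time. The following is a minimal sketch, assuming a hypothetical in-memory registry named KNOWN_EVENT_TYPES and a hypothetical validate_event_types helper; neither name comes from any of the products mentioned here. It rejects wildcards and unknown names instead of expanding them.

```python
import re

# Hypothetical registry of published event types; a real service would load
# this from its own catalogue.
KNOWN_EVENT_TYPES = frozenset({
    "invoice.created",
    "invoice.paid",
    "customer.updated",
})

# noun.verb: two lowercase segments separated by a single dot.
EVENT_TYPE_PATTERN = re.compile(r"[a-z_]+\.[a-z_]+")


def validate_event_types(requested):
    """Return the requested event types de-duplicated and sorted.

    Raises ValueError for wildcards, malformed names, and unknown names, so a
    stored subscription always means exactly the list the customer wrote.
    """
    requested = list(requested)
    if not requested:
        raise ValueError("a subscription needs at least one event type")
    for name in requested:
        if EVENT_TYPE_PATTERN.fullmatch(name) is None:
            raise ValueError(f"not a literal noun.verb event type: {name!r}")
        if name not in KNOWN_EVENT_TYPES:
            raise ValueError(f"unknown event type: {name!r}")
    return sorted(set(requested))
```

Because "invoice.*" fails the pattern check, the wildcard never reaches storage, and the dashboard can always show the exact list.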

The decision to use a flat namespace versus a deeper hierarchy is worth making explicitly. A flat namespace (invoice.paid, invoice.refunded, invoice.disputed) keeps the event type list scannable and avoids overly-clever categorization. A deeper hierarchy (billing.invoice.paid, billing.invoice.refunded) gains nothing for two-level events but starts to pay off if you have hundreds of event types across many product areas. The threshold for graduating to a deeper hierarchy is higher than most teams expect; we have stayed flat across all four products.

Resource filters as second-class

The second filtering dimension is the resource filter: subscribe only to events for a specific project, customer, environment, or other identifying attribute. The choice that bites is whether resource filters are a property of the subscription or a property of the URL.

The clean pattern is one webhook subscription per (URL, event-type, resource-filter) tuple, stored in a join table. A customer that wants invoice.paid events for the production environment subscribes once. A customer that wants the same events for the staging environment creates a second subscription. The benefit is that the dashboard shows exactly what each subscription covers and the customer can disable one filter without affecting others.

The unclean pattern is per-URL filters in subscription metadata: one URL with a JSON blob of filters that the API checks before delivering. It looks simpler at first because there is one row per URL, but it makes the dashboard harder to read (the displayed filters get long and confusing) and forces the customer to make all-or-nothing changes when adjusting filters.

The schema

The minimum viable schema is three tables. webhook_endpoints stores the URL plus signing key plus active flag plus account ownership. webhook_subscriptions stores (endpoint_id, event_type, resource_type, resource_id) tuples with resource_type and resource_id nullable for global subscriptions. webhook_deliveries stores per-event delivery attempts with status and timestamps for replay and audit.
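As a concrete reading of the three tables, here is a minimal sketch using Python's built-in sqlite3 module. The table names and the subscription columns come from the description above; the remaining column names, the types, the index name, and the CHECK constraint are assumptions made for illustration, and a production database would likely use different types.

```python
import sqlite3

SCHEMA = """
CREATE TABLE webhook_endpoints (
    id          INTEGER PRIMARY KEY,
    account_id  INTEGER NOT NULL,
    url         TEXT    NOT NULL,
    signing_key TEXT    NOT NULL,
    active      INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE webhook_subscriptions (
    id            INTEGER PRIMARY KEY,
    endpoint_id   INTEGER NOT NULL REFERENCES webhook_endpoints(id),
    event_type    TEXT    NOT NULL,
    resource_type TEXT,   -- NULL for a global subscription
    resource_id   TEXT,   -- NULL for a global subscription
    -- Assumption: a resource filter is either fully specified or absent.
    CHECK ((resource_type IS NULL) = (resource_id IS NULL))
);

CREATE INDEX idx_subscriptions_lookup
    ON webhook_subscriptions (event_type, resource_id);

CREATE TABLE webhook_deliveries (
    id              INTEGER PRIMARY KEY,
    endpoint_id     INTEGER NOT NULL REFERENCES webhook_endpoints(id),
    event_id        TEXT    NOT NULL,
    status          TEXT    NOT NULL DEFAULT 'pending',
    attempts        INTEGER NOT NULL DEFAULT 0,
    last_attempt_at TEXT,
    response_code   INTEGER,
    created_at      TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def create_schema(conn):
    """Create the three webhook tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)
```

The CHECK constraint rules out half-specified filters such as a resource_type with no resource_id, which would otherwise be ambiguous in the dashboard.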

The query that runs for every event is a lookup against the subscriptions table to find which endpoints should receive it: SELECT endpoint_id FROM webhook_subscriptions WHERE event_type = $1 AND (resource_id IS NULL OR (resource_type = $2 AND resource_id = $3)). Matching on resource_type as well as resource_id keeps a project and a customer that happen to share an ID from matching each other's subscriptions. With an index on (event_type, resource_id), the lookup stays cheap for the common case of a small number of subscriptions per event type.
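The lookup can be exercised end to end with sqlite3. This is a sketch under stated assumptions: it compares resource_type alongside resource_id so that IDs shared across resource types do not collide, and it joins webhook_endpoints to skip inactive endpoints, which is an addition made here for illustration. The matching_endpoints and build_demo_db names are hypothetical.

```python
import sqlite3

MATCH_SQL = """
SELECT DISTINCT s.endpoint_id
FROM webhook_subscriptions AS s
JOIN webhook_endpoints AS e ON e.id = s.endpoint_id
WHERE e.active = 1
  AND s.event_type = :event_type
  AND (
        s.resource_id IS NULL
        OR (s.resource_type = :resource_type AND s.resource_id = :resource_id)
      )
ORDER BY s.endpoint_id
"""


def matching_endpoints(conn, event_type, resource_type, resource_id):
    """Return the IDs of active endpoints subscribed to this event."""
    params = {
        "event_type": event_type,
        "resource_type": resource_type,
        "resource_id": resource_id,
    }
    return [row[0] for row in conn.execute(MATCH_SQL, params)]


def build_demo_db():
    """Build an in-memory database with three endpoints, one of them inactive."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE webhook_endpoints (
            id INTEGER PRIMARY KEY,
            active INTEGER NOT NULL
        );
        CREATE TABLE webhook_subscriptions (
            endpoint_id INTEGER NOT NULL,
            event_type TEXT NOT NULL,
            resource_type TEXT,
            resource_id TEXT
        );
        CREATE INDEX idx_sub_lookup
            ON webhook_subscriptions (event_type, resource_id);
        INSERT INTO webhook_endpoints VALUES (1, 1), (2, 1), (3, 0);
        INSERT INTO webhook_subscriptions VALUES
            (1, 'invoice.paid', NULL, NULL),
            (2, 'invoice.paid', 'project', 'p1'),
            (3, 'invoice.paid', NULL, NULL);
    """)
    return conn
```

In the demo data, endpoint 1 has a global subscription, endpoint 2 is filtered to project p1, and endpoint 3 is inactive, so an invoice.paid event for project p1 should reach endpoints 1 and 2 only.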

The schema accommodates the patterns that customers actually want: subscribe to all invoice events for any resource (one row per invoice event type, each with resource_id NULL), subscribe to all events for a specific project (one row per event type, all with the same resource_id), subscribe to invoice.paid events for one customer (one row, fully specified). The patterns the schema does not accommodate are also the patterns that probably should not be supported: regex matching, time-window filters, and content-based filters on the event payload.
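Since every row carries exactly one event type, a request such as "all events for this project" has to be expanded into rows when the subscription is created. A minimal sketch of that expansion, assuming a hypothetical SubscriptionRow tuple and expand_subscription helper:

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class SubscriptionRow(NamedTuple):
    """One row of webhook_subscriptions (hypothetical in-memory form)."""
    endpoint_id: int
    event_type: str
    resource_type: Optional[str]
    resource_id: Optional[str]


def expand_subscription(
    endpoint_id: int,
    event_types: Iterable[str],
    resource: Optional[Tuple[str, str]] = None,
) -> List[SubscriptionRow]:
    """Produce one row per event type, sorted by event type.

    `resource` is a (resource_type, resource_id) pair, or None for a global
    subscription, in which case both resource columns are NULL.
    """
    resource_type, resource_id = resource if resource else (None, None)
    return [
        SubscriptionRow(endpoint_id, event_type, resource_type, resource_id)
        for event_type in sorted(set(event_types))
    ]
```

Expanding at write time keeps the stored rows literal: the set of rows is fixed when the customer saves the subscription and does not change when the event catalogue grows.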

New event types and the discovery problem

The decision that compounds is what happens when you add a new event type. The two reasonable answers are silent-add (existing subscriptions are unaffected, customers who want the new event explicitly subscribe to it) and migrate-existing (existing subscriptions to related events automatically include the new event). Silent-add is the default that scales because it does not change the meaning of existing subscriptions; migrate-existing is the temptation that bites because it produces customer-visible surprises.

The communication discipline around new event types matters more than the technical mechanism. It has three parts: a changelog that announces new event types, an email to webhook customers when significant new events are added, and a public list of all event types in the API documentation. Customers who want the new event subscribe explicitly, and customers who do not want it are not surprised by deliveries they did not expect.
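One way to keep the changelog and the public event list in step is to record when each event type was published and derive announcements from that record. The sketch below is hypothetical throughout: the EVENT_CATALOGUE name, its contents, and the dates are placeholders, not real release data.

```python
from datetime import date

# Hypothetical catalogue: event type -> date it was first published.
# The dates are placeholders for illustration only.
EVENT_CATALOGUE = {
    "invoice.created": date(2023, 1, 10),
    "invoice.paid": date(2023, 1, 10),
    "invoice.disputed": date(2024, 6, 3),
}


def event_types_added_since(cutoff, catalogue=EVENT_CATALOGUE):
    """List event types first published after `cutoff`.

    Results are ordered by publication date, then by name. Nothing here
    touches existing subscriptions, which matches the silent-add policy:
    the list feeds the changelog and email, not the delivery path.
    """
    recent = [(added, name) for name, added in catalogue.items() if added > cutoff]
    return [name for _added, name in sorted(recent)]
```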

The wrong-but-tempting alternative is wildcard subscriptions that automatically include new events. The convenience is real but the silent-meaning-change is the failure mode that produces support tickets six months later when a customer's integration breaks because of an event they did not know they had subscribed to.

Per-event delivery records

The webhook_deliveries table is where the operational work happens. Each event-to-endpoint pairing gets a row with status, attempts, last attempt timestamp, response code, and a reference to the event payload. The dashboard reads from this table to show customers their delivery history; the replay feature inserts new attempts into the table; the alerting infrastructure reads from this table to notice failure patterns.
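A minimal sketch of how a delivery attempt might be recorded, using sqlite3. The column names follow the fields listed above; the status values ('pending', 'delivered', 'failed'), the rule that any 2xx response counts as delivered, and the record_attempt name are assumptions for illustration.

```python
import sqlite3

# Assumed shape of the deliveries table; only the columns this sketch needs.
DELIVERIES_DDL = """
CREATE TABLE webhook_deliveries (
    id              INTEGER PRIMARY KEY,
    endpoint_id     INTEGER NOT NULL,
    event_id        TEXT    NOT NULL,
    status          TEXT    NOT NULL DEFAULT 'pending',
    attempts        INTEGER NOT NULL DEFAULT 0,
    last_attempt_at TEXT,
    response_code   INTEGER
);
"""


def record_attempt(conn, delivery_id, response_code, attempted_at):
    """Record one delivery attempt and return the resulting status.

    `response_code` is the HTTP status from the customer's endpoint, or None
    if the request never completed (timeout, connection refused).
    """
    succeeded = response_code is not None and 200 <= response_code < 300
    status = "delivered" if succeeded else "failed"
    conn.execute(
        """
        UPDATE webhook_deliveries
        SET attempts = attempts + 1,
            status = ?,
            response_code = ?,
            last_attempt_at = ?
        WHERE id = ?
        """,
        (status, response_code, attempted_at, delivery_id),
    )
    return status
```

Keeping the attempt counter and the last response on the row gives the dashboard and the alerting queries what they need without a separate attempts table, at the cost of losing the history of intermediate responses.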

The retention question for the deliveries table is the operational pressure point. Keeping deliveries forever produces unbounded growth that eventually requires partitioning. Keeping them for 30 days is a common answer that lets customers diagnose recent issues while keeping storage bounded. The retention choice should match the customer's debugging window, not operational convenience.
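A 30-day window can be enforced with a periodic pruning job. The sketch below uses sqlite3 and deletes in batches so that a large backlog does not become one long transaction. It assumes a created_at column stored as UTC text in 'YYYY-MM-DD HH:MM:SS' form; the prune_deliveries name and the batch size are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # sorts correctly as plain text

# Assumed minimal shape of the deliveries table for this sketch.
DELIVERIES_DDL = """
CREATE TABLE webhook_deliveries (
    id         INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL
)
"""


def prune_deliveries(conn, now, batch_size=1000):
    """Delete deliveries older than RETENTION and return how many were removed.

    `now` must be a UTC datetime. Rows are removed in batches of at most
    `batch_size`, committing after each batch.
    """
    cutoff = (now - RETENTION).strftime(TIMESTAMP_FORMAT)
    total = 0
    while True:
        cursor = conn.execute(
            """
            DELETE FROM webhook_deliveries
            WHERE id IN (
                SELECT id FROM webhook_deliveries
                WHERE created_at < ?
                LIMIT ?
            )
            """,
            (cutoff, batch_size),
        )
        conn.commit()
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total
```

On a database that supports table partitioning, dropping whole partitions by date is usually cheaper than row deletes; the batch loop is the fallback that works anywhere.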

The dashboard surface

The dashboard is the customer-facing side of the filtering design. The features that matter are: a list of subscriptions per endpoint with each subscription's event type and resource filter visible, a way to add new subscriptions with a dropdown of available event types, the ability to disable a subscription temporarily without deleting it, and per-subscription delivery statistics so customers can spot which subscriptions are sending the most or failing the most.

The operational feature that pays off is the test-event button that sends a sample of each event type to a chosen endpoint. The test events are clearly marked as test (in a header, in the event type, or both) so the customer can tell them from real events. The benefit is that customers can verify their endpoint handles each event type before depending on production traffic.
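A minimal sketch of how a test event might be marked. The header name X-Webhook-Test, the evt_test_ ID prefix, the "test" body field, and the build_test_event helper are all assumptions made for illustration; the event type itself is left unchanged so the customer's routing code sees the same value it will see in production.

```python
import json
import uuid

TEST_HEADER = "X-Webhook-Test"  # hypothetical header name


def build_test_event(event_type, sample_data):
    """Return (headers, body) for a sample event that is marked as a test.

    The marker appears twice: once in a header, so receivers can branch
    before parsing, and once in the body, so the flag survives if the
    payload is logged or queued without its headers.
    """
    body = {
        "id": f"evt_test_{uuid.uuid4().hex}",
        "type": event_type,
        "test": True,
        "data": sample_data,
    }
    headers = {
        "Content-Type": "application/json",
        TEST_HEADER: "true",
    }
    return headers, json.dumps(body, sort_keys=True)
```

A real sender would also sign the body with the endpoint's signing key before delivery; that step is omitted here.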

What we did not build

These are the features we considered and decided not to build, with reasons.

Wildcard subscriptions: silent meaning changes when new events are added, and the dashboard becomes ambiguous about coverage.

Content-based filters on the event payload: they dramatically increase the per-event delivery cost and the implementation complexity, and customers can get the same result by filtering on their side after receiving the event.

Regex matching on event types: same problems as wildcards, plus the risk that comes with evaluating customer-supplied patterns, and the use case is rare enough that listing the specific event types is not a meaningful burden.

Time-window subscriptions: a subscription that only delivers events during business hours sounds useful but is mostly an anti-pattern, because customers generally want all events delivered and prefer to handle time-of-day logic in their downstream system.

The deeper observation

The webhook filtering API is one of the surfaces where small initial design choices compound into ongoing maintenance cost. The pattern of literal event-type lists with optional resource filters is conservative and somewhat verbose, and it is the pattern that scales without requiring breaking changes as the event vocabulary grows. The patterns that look more elegant initially (wildcards, regex, deep hierarchies, content filters) tend to produce ongoing cost in customer surprise and operational complexity. The discipline of choosing the boring pattern early tends to pay off because the boring pattern is the one that does not need to be explained to every new customer integration.
