Designing Webhooks for Mobile Clients: Patterns That Survive Unreliable Networks
The standard webhook playbook assumes a desktop or server consumer. The receiver has a stable public URL, persistent network connectivity, and a server-side application that can be relied on to acknowledge events within a few hundred milliseconds. Mobile clients break every one of these assumptions. The phone is behind NAT and changes IP addresses. The app is suspended when the screen turns off. The network can drop whenever the user walks into a building. The battery limits how much work the app can do in the background, and the operating system enforces those limits whether the app wants them or not.
A webhook system that wants to support mobile clients has to accommodate these realities. The patterns that work in production are different from the standard playbook, and they are worth getting right because the alternative is dropped events that customers blame on the API rather than the network. Across DocuMint, CronPing, FlagBit, and WebhookVault, we have learned that mobile webhook delivery is its own discipline, and the right pattern depends on what the mobile app actually needs to do with the events.
The wrong default: push directly to the mobile app
The naive design routes webhook events directly to the mobile client. The mobile app registers a webhook URL, the server delivers events to that URL, and the client app receives and processes them. This pattern works on desktop and fails on mobile for two reasons: the mobile client does not have a stable public URL (it is behind NAT and might be on cellular, WiFi, or both depending on the moment), and the mobile client is not always running when events arrive.
The workarounds are uglier than they look. Long-polling from the mobile client works but burns battery. Push notification services like APNs and FCM can deliver events, but they have payload size limits, their delivery guarantees are not strong enough for billing-critical operations, and they require the app to be in a state where it can respond to the notification. WebSockets from the mobile client to the server keep a connection open but are killed when the app suspends, and the reconnect logic has to handle the events that arrived during the suspension.
The right default: server-to-server with mobile client polling
The pattern that works in production is to deliver webhooks to a server that the mobile client can poll, rather than to the mobile client directly. The customer's backend (or a managed service playing the same role) receives webhook events, stores them, and the mobile client polls for new events when it has network connectivity and is in a state to handle them.
This pattern decouples the webhook delivery, which needs to be reliable and timely, from the mobile client interaction, which needs to handle the realities of mobile networks. The webhook delivery follows the standard playbook: server-to-server, signed payloads, exponential backoff retries, replay capability. The mobile client interaction follows a different pattern: long-polling or paginated REST with a since-cursor, background fetch tasks, and explicit handling of the suspended-app case.
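As a rough illustration of the server-to-server half, the sketch below shows signature verification and a retry schedule. It assumes HMAC-SHA256 over the raw request body with a shared secret, and a doubling backoff with a cap; the function names and default values are hypothetical, not taken from any particular provider.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


def backoff_delays(
    base_seconds: float = 1.0,
    factor: float = 2.0,
    max_attempts: int = 5,
    cap_seconds: float = 3600.0,
) -> list[float]:
    """Delay before each retry: base * factor**attempt, capped (illustrative defaults)."""
    return [min(cap_seconds, base_seconds * factor**attempt) for attempt in range(max_attempts)]
```

A production sender would usually add jitter to these delays so that many failing endpoints do not retry in lockstep.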
The infrastructure cost is one server that the customer operates (or that the API provider offers as a managed service). The gain is that webhook delivery does not have to deal with mobile reality, and the mobile client does not have to deal with the inversion of control that webhook delivery implies. Every part of the system handles the problems that match its assumptions.
The patterns for push notifications
For events that genuinely need to wake up the mobile app, push notifications are the right primitive. APNs for iOS and FCM for Android can deliver a small payload to the app even when it is suspended, and the operating system will either deliver it immediately or hold it until it can be delivered.
The right pattern is to use push notifications as a wakeup signal, not as the event payload. The notification says "you have new events" and includes minimal context to help the user decide whether to open the app. When the user opens the app (or when the app gets a background fetch window), it pulls the actual events from the API. This decoupling matters for three reasons: push notification payloads are size-limited (APNs and FCM both cap standard payloads at about 4 KB), push notifications can be dropped or delayed without the sender being told, and the event payload might contain sensitive information that should not appear in a lock-screen preview.
The push notification system also has its own delivery guarantees, and they are weaker than those of webhook delivery. APNs delivers on a best-effort basis and can drop notifications if too many are sent in too short a time, if the device is offline for too long, or if the OS decides the app does not have priority. Treating push notifications as a wakeup signal with the actual events fetched from the API means that dropped push notifications are recoverable: the next successful notification or the next app open will pull all the missed events.
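A minimal sketch of such a wakeup payload follows. The field names (`type`, `pending_count`, `latest_cursor`) and the function name are hypothetical, and this is only the application-level body, not the APNs or FCM wire format. The point is that the payload carries a hint rather than the events themselves, and that the size is checked before sending.

```python
import json

# Roughly the 4 KB ceiling discussed above; treat the exact value as an assumption.
MAX_PUSH_PAYLOAD_BYTES = 4096


def build_wakeup_payload(pending_count: int, latest_cursor: str) -> bytes:
    """Build a small 'you have new events' hint; no event data is included."""
    payload = {
        "type": "events_available",
        "pending_count": pending_count,
        "latest_cursor": latest_cursor,
    }
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(encoded) > MAX_PUSH_PAYLOAD_BYTES:
        raise ValueError(f"wakeup payload is {len(encoded)} bytes, over the limit")
    return encoded
```

Because the payload contains no event content, nothing sensitive can leak into a lock-screen preview, and a lost notification costs only latency.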
The patterns for offline reconciliation
Mobile apps must assume they will lose network connectivity at unpredictable times. When the app comes back online, it has to reconcile its local state with the server's state, which means catching up on events that were missed during the offline period.
The right pattern is to expose an events endpoint that supports a since-cursor and pagination. The mobile client tracks the cursor of the latest event it has successfully processed. On reconnection, it requests events since that cursor. The server returns events in cursor-ordered batches, and the client processes them and advances the cursor as it goes. If the connection drops mid-fetch, the client resumes from the last successfully-processed cursor on the next attempt.
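The catch-up loop described above might look something like the following sketch. The `fetch_page`, `apply_event`, `load_cursor`, and `save_cursor` callables are hypothetical stand-ins for the app's HTTP client, event handler, and local storage, and each event is assumed to carry its own `cursor` field.

```python
from typing import Callable, Optional

# fetch_page(since_cursor, limit) -> (events, has_more); hypothetical client call.
FetchPage = Callable[[Optional[str], int], tuple[list[dict], bool]]


def catch_up(
    fetch_page: FetchPage,
    apply_event: Callable[[dict], None],
    load_cursor: Callable[[], Optional[str]],
    save_cursor: Callable[[str], None],
    page_size: int = 100,
) -> int:
    """Pull events since the stored cursor, advancing it after each processed event.

    If fetch_page raises (for example on a dropped connection), the exception
    propagates, and the stored cursor still points at the last event that was
    fully processed, so the next call resumes from there.
    """
    cursor = load_cursor()
    processed = 0
    while True:
        events, has_more = fetch_page(cursor, page_size)
        for event in events:
            apply_event(event)
            cursor = event["cursor"]
            save_cursor(cursor)  # persist only after the event has been applied
            processed += 1
        if not has_more or not events:
            return processed
```

Saving the cursor per event rather than per page keeps the replay window after a crash to at most one event, at the cost of more writes to local storage.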
The cursor format should be opaque to the client and stable over time. Building it from a monotonically increasing event ID or a strictly monotonic timestamp works; building it from row offsets or anything that depends on insertion order across multiple servers does not. The cursor must survive event deletions, table reorganizations, and replication lag without producing incorrect results.
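One way to keep a cursor opaque while still backing it with a monotonic event sequence number is to wrap the number in a versioned, encoded token. The sketch below assumes base64url-encoded JSON with hypothetical `v` and `seq` fields; a real system might also sign or encrypt the token so clients cannot forge one.

```python
import base64
import json


def encode_cursor(last_event_seq: int) -> str:
    """Wrap a sequence number in a versioned, URL-safe token."""
    raw = json.dumps({"v": 1, "seq": last_event_seq}, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_cursor(cursor: str) -> int:
    """Return the sequence number, or raise ValueError for a malformed token."""
    padded = cursor + "=" * (-len(cursor) % 4)  # restore stripped padding
    try:
        data = json.loads(base64.urlsafe_b64decode(padded))
        seq = data["seq"]
    except (ValueError, KeyError, TypeError) as exc:
        raise ValueError("invalid cursor") from exc
    if data.get("v") != 1 or not isinstance(seq, int) or seq < 0:
        raise ValueError("invalid cursor")
    return seq
```

The version field leaves room to change the underlying representation later without breaking cursors that clients have already stored.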
The retention period for events available via the since-cursor endpoint should be longer than the longest reasonable offline period. For consumer apps, 30 days is a safe minimum because users go on vacation, travel, and leave apps unopened for weeks at a time. For enterprise apps where the consumer is a backend, shorter retention can be acceptable, but the trade-off should be explicit.
The idempotency-on-the-client problem
Mobile apps frequently receive the same event twice: once when the push notification wakes them up, and once when the user opens the app and the events endpoint is polled. The standard webhook idempotency pattern of keying on event ID works as long as the client is disciplined about checking the event ID before applying side effects.
The right pattern is to keep a processed-events table in the local SQLite database, keyed on event ID. Before applying any side effect (notifying the user, updating local state, sending an analytics event), the client checks whether the event ID has been processed. The check is fast (indexed lookup on a primary key), the storage cost is minimal (a small fraction of the events table), and the bug-prevention value is high.
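A minimal sketch of that check, using Python's built-in `sqlite3` module as a stand-in for the app's local database. The table and function names are hypothetical. The insert and the side effects share one transaction, so a failed side effect rolls the marker back and the event can be retried.

```python
import sqlite3
import time
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local database and make sure the processed-events table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        "event_id TEXT PRIMARY KEY, processed_at REAL NOT NULL)"
    )
    conn.commit()
    return conn


def process_once(
    conn: sqlite3.Connection, event_id: str, apply_side_effects: Callable[[], None]
) -> bool:
    """Run the side effects only if this event ID has not been seen before.

    Returns True if the event was processed now, False if it was a duplicate.
    """
    with conn:  # commits on success, rolls back if apply_side_effects raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id, processed_at) VALUES (?, ?)",
            (event_id, time.time()),
        )
        if cur.rowcount == 0:
            return False  # primary key already present: duplicate delivery
        apply_side_effects()
    return True
```

The rollback only undoes database writes; side effects that leave the device, such as an analytics call, still need to tolerate a retry.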
The client-side processed-events table should be bounded (typically by the same retention period as the server-side events) and cleaned up periodically. The server-side equivalent uses the same pattern (we covered this in detail in our piece on the idempotency token pattern), and both halves of the system together prevent duplicate processing whether the duplication comes from network retries, push notification duplication, or post-reconnection catch-up.
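The periodic cleanup can be a single delete by age. The sketch below assumes a `processed_events` table with a `processed_at` column holding a Unix timestamp, and a 30-day window chosen to match the server-side retention; both are illustrative assumptions.

```python
import sqlite3
import time
from typing import Optional

# Assumed to match the server-side retention window.
RETENTION_SECONDS = 30 * 24 * 60 * 60


def prune_processed_events(
    conn: sqlite3.Connection,
    now: Optional[float] = None,
    retention_seconds: float = RETENTION_SECONDS,
) -> int:
    """Delete markers older than the retention window; return how many were removed."""
    current = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_events WHERE processed_at < ?",
            (current - retention_seconds,),
        )
    return cur.rowcount
```

Pruning more aggressively than the server retains events would reopen the door to duplicates, since an old event could be replayed after its marker has been deleted.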
What the server side has to do differently
The server side has to support a few features beyond the standard playbook to make this all work. First, the events endpoint must support cursor-based pagination, not just offset pagination. We covered this in detail in our piece on cursor-based pagination; the requirement is the same.
Second, the event-emission system must record every event in a queryable form, not just emit it to webhook endpoints. The events table is a load-bearing piece of infrastructure that supports both the webhook delivery side and the mobile-client polling side. The minimum schema is event ID, event type, account ID, payload, created timestamp, and any context needed to filter by subscription.
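To make the minimum schema and the cursor query concrete, here is a sketch that uses SQLite as a stand-in for whatever database the server actually runs. The column names follow the list above; `subscriber_id` is a hypothetical example of the filtering context, and the auto-incrementing `id` doubles as the cursor, which assumes a single writer assigning IDs.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type    TEXT NOT NULL,
    account_id    TEXT NOT NULL,
    subscriber_id TEXT,
    payload       TEXT NOT NULL,
    created_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
CREATE INDEX IF NOT EXISTS events_account_id ON events (account_id, id);
"""


def init_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def record_event(conn, event_type, account_id, payload, subscriber_id=None) -> int:
    """Store the event before (or alongside) any webhook delivery attempt."""
    with conn:
        cur = conn.execute(
            "INSERT INTO events (event_type, account_id, subscriber_id, payload) "
            "VALUES (?, ?, ?, ?)",
            (event_type, account_id, subscriber_id, json.dumps(payload)),
        )
    return cur.lastrowid


def events_since(conn, account_id, since_id=0, limit=100):
    """Return (events, has_more) for events with id > since_id, in id order."""
    rows = conn.execute(
        "SELECT id, event_type, payload, created_at FROM events "
        "WHERE account_id = ? AND id > ? ORDER BY id LIMIT ?",
        (account_id, since_id, limit + 1),  # fetch one extra row to detect more pages
    ).fetchall()
    page = rows[:limit]
    events = [
        {"id": r[0], "type": r[1], "payload": json.loads(r[2]), "created_at": r[3]}
        for r in page
    ]
    return events, len(rows) > limit
```

Filtering on `id > since_id` rather than using an offset means that deleting old events does not shift the results a client sees on its next request.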
Third, the delivery side has to support both push (server-to-server webhook URL) and pull (mobile-client cursor-based polling). The push side delivers to customer backends that handle the mobile distribution. The pull side delivers to mobile clients directly when there is no customer backend. The same events power both delivery modes.
Fourth, the system must support subscriptions that filter by event type and by other criteria. A mobile app for a customer's end-users does not need to see every event in the customer's account, just the events relevant to the specific user. The filtering can happen at the subscription level, at the event payload level, or at the API key level; the right answer depends on the customer's data model.
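As an illustration of subscription-level filtering, the sketch below matches events against a set of event types and an optional user. The `Subscription` shape and the `type` and `user_id` event fields are hypothetical; a real data model may filter on different attributes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subscription:
    # An empty set means "all event types" in this sketch.
    event_types: frozenset = frozenset()
    # None means "events for any user in the account".
    user_id: Optional[str] = None


def matches(sub: Subscription, event: dict) -> bool:
    """Return True if the event should be visible to this subscription."""
    if sub.event_types and event["type"] not in sub.event_types:
        return False
    if sub.user_id is not None and event.get("user_id") != sub.user_id:
        return False
    return True


def filter_events(sub: Subscription, events: list) -> list:
    """Keep only the events this subscription is allowed to see."""
    return [event for event in events if matches(sub, event)]
```

Applying the filter on the server, before events reach the polling endpoint, keeps other users' data off the device entirely rather than relying on the client to discard it.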
What this is and is not
The honest summary: mobile clients are different from desktop or server clients in ways that matter for webhook design, and the right pattern is to deliver webhooks to a server and let mobile clients pull from that server, rather than trying to push directly to mobile clients. The pattern adds a hop to the architecture but eliminates the failure modes that come from mobile network reality.
The reflexive design of pushing webhooks directly to mobile clients works in demos and fails in production. The right design layers the system so that each part handles the problems that match its assumptions. The customer's backend handles webhook reception, which assumes server-side reliability. The mobile client handles event consumption, which assumes mobile reality. Push notifications handle the wakeup signal, which assumes OS cooperation. Each part of the system is doing the job it was designed for, and the system as a whole is more reliable than any single-pattern design could be.
This pattern is invisible in the API documentation because it lives at the boundary between the API and the customer's architecture. But it is the pattern that makes mobile apps actually work in production, and it is the pattern that customers thank us for after they have spent months trying to make a naive design work and finally given up.