Materialized Views: When to Trade Freshness for Speed

Materialized views cache the result of a query on disk and let you query the cache instead of recomputing. They are one of the most underused database features in small-team SaaS. The patterns that work and the maintenance cost that determines whether they earn their place.

A materialized view is a query result stored as if it were a table. The first read computes it; subsequent reads scan the stored bytes. The only complication is that the underlying data continues to change while the cached result does not. Reconciling those two facts is what determines whether a materialized view is the right answer.

We use materialized views in modest doses across DocuMint, CronPing, FlagBit, and WebhookVault. The pattern that recurs: materialized views earn their cost in dashboards, reporting endpoints, and any place where the query is expensive and the answer can be a few seconds or a few minutes stale. They lose money everywhere else.

What a materialized view actually is

In PostgreSQL, CREATE MATERIALIZED VIEW stores the result of a SELECT statement as if it were a regular table. You can index it, query it, join against it. What you cannot do is have it update itself when the underlying data changes — that is your job. REFRESH MATERIALIZED VIEW recomputes it from scratch; REFRESH MATERIALIZED VIEW CONCURRENTLY recomputes it without locking out readers (at the cost of requiring a unique index on the view).

SQLite has no native materialized views, but the same pattern works with a regular table populated by a periodic INSERT…SELECT. The conceptual gap between the two is small; the implementation tooling is what differs.

The four cases where they fit

Reporting and dashboards are the canonical case. The query is a multi-table aggregate that takes 2-30 seconds; it is run by every dashboard view; the data underneath updates faster than the dashboard can be re-rendered usefully. A materialized view refreshed every few minutes converts every dashboard load from a 5-second wait to a 50-millisecond cache hit. The customer perception is instantaneous; the data is at most a few minutes behind.

Search precomputation is the second case. Building a denormalized search row from a half-dozen joins is expensive at query time. A materialized view that contains the denormalized form, refreshed when source data changes (or on a schedule), turns search into a single-table scan with the right index.

API responses with expensive computation are the third case. If a customer hits a paginated endpoint and the underlying data requires aggregation across many tables, computing it per-request is wasteful. Computing it once per minute and serving from cache is cheap. The trade-off is whether the customer can tolerate a minute of staleness, which for most analytics-shaped APIs they can.

Cross-system aggregations are the fourth case. When the query needs data from multiple databases or sources, doing the join at every read is a performance and reliability problem. A materialized view that holds the denormalized result, refreshed on a schedule, isolates the read path from the source system's availability and performance.

The cases where they don't fit

Anything that requires real-time accuracy is wrong. Materialized views are caches by definition; they will return values that no longer reflect the current state of the database. Billing, financial transactions, anything where the customer reads a value and immediately acts on it should not be served from a materialized view.

Anything that updates faster than the refresh interval is wrong. If the underlying data changes every few seconds and the materialized view refreshes every few seconds, you have built a complicated way of running the query at every refresh and serving stale results in between.

Anything where the query is already fast is wrong. A 200-millisecond query does not benefit from being cached; the maintenance overhead exceeds the read savings.

The refresh problem

Refreshing is where the apparent simplicity ends. Three patterns work in production.

The first is scheduled refresh. A cron job runs REFRESH MATERIALIZED VIEW CONCURRENTLY every N minutes. The interval is chosen to balance freshness against the load of running the underlying query. This is the default and it works for most cases. The failure mode is when the refresh itself takes longer than the interval, which means the next refresh starts before the previous finishes. Catching this requires monitoring refresh duration and the time-since-last-successful-refresh, and reacting before they grow without bound.

The second is event-driven refresh. When source data changes, a trigger or application code enqueues a refresh job. This gives lower-latency freshness at the cost of complexity around debouncing (so a thousand updates do not trigger a thousand refreshes) and ordering (so refreshes do not stack up). The pattern is right when the source data changes infrequently but the materialized view needs to be near-real-time when it does change.

The third is incremental maintenance. Some systems (PostgreSQL with extensions, Materialize, ClickHouse with refreshable views) can update the materialized view incrementally as source data changes, instead of recomputing from scratch. This is the most powerful pattern and the one that requires the most careful design. It is rarely the right answer at small scale because the operational complexity is significant.

Concurrent refresh and the unique index requirement

Plain REFRESH MATERIALIZED VIEW takes an ACCESS EXCLUSIVE lock on the view for the duration of the refresh. For a 30-second refresh, that means 30 seconds of blocked reads. REFRESH MATERIALIZED VIEW CONCURRENTLY rebuilds in a temporary table and swaps atomically, allowing reads to continue during the refresh. The catch is that concurrent refresh requires a UNIQUE index on the view, and the index must cover at least one column that uniquely identifies each row.

This is not optional in production. Always add the unique index when creating the materialized view; always use the CONCURRENTLY keyword in the refresh job. The non-concurrent refresh is a foot-gun that surfaces only under load.

The five operational signals

Materialized views need active monitoring or they degrade silently. The signals worth tracking: refresh duration (a sudden increase often precedes a refresh that doesn't fit in the schedule); time since last successful refresh (alerts when the schedule slips); refresh failure count (silent failures during a stuck-on-old-data incident are particularly painful to debug after the fact); query time on the view (regressions here usually mean the view's indexes are insufficient for the query patterns now hitting it); and source-table freshness vs view freshness gap (the meaningful number for the customer-experience side).

The deeper observation

Materialized views are an honest acknowledgement that not every query needs to be live. Trading a small amount of freshness for a large amount of speed is a deal that pays off whenever the customer cannot perceive the staleness and would notice the slowness. The places where they fit are also the places where most teams over-engineer with caching layers — Redis in front of a query, application-level memoization, denormalization in the application code. A materialized view does the same thing with less infrastructure and more reliable invalidation. It is one of the database features that mostly compounds in value the longer the system is alive.

Read more