Postgres TOAST: Where Large Values Actually Live

Postgres pages are 8KB, but rows can be much larger. The Oversized-Attribute Storage Technique handles this transparently, with performance and operational implications most teams never inspect.

Most developers treat Postgres rows as if they were arbitrary-sized blobs that the database happens to handle. They are not. The on-disk page size is 8KB, fixed at compile time, and a row that does not fit in a page has to be stored somewhere else. The Oversized-Attribute Storage Technique, or TOAST, is the mechanism that handles this, and it operates so transparently that most teams never look at it until something goes wrong.

We hit TOAST behavior across all four products: DocuMint stores invoice HTML templates, CronPing stores ping payloads, FlagBit stores targeting-rule JSON, and WebhookVault captures arbitrary webhook bodies. Each has run into a TOAST surprise at some point.

The mechanism

When Postgres receives a row that does not fit in a page, it does four things in sequence. First, it tries to compress any variable-length columns marked TOASTable using its default LZ-based compressor (or LZ4 if configured in Postgres 14+). Second, if the row still does not fit, it moves the largest compressed values to a separate TOAST table associated with the parent table, replacing the values in the main row with pointers. Third, if a single value is larger than the chunk size (about 2KB by default), it splits the value into chunks and stores them as separate rows in the TOAST table. Fourth, only then does it commit the write.

The TOAST table itself is invisible by default: it lives in the pg_toast schema with a name derived from the parent table OID. You can find it with a query against pg_class joined on reltoastrelid, but most operational tools do not surface it. The TOAST table has its own indexes, its own statistics, and its own VACUUM schedule.

The four storage strategies

Each TOASTable column has a strategy attribute that controls TOAST behavior:

  • PLAIN: never compress, never out-of-line. Used for fixed-width types.
  • EXTENDED: compress first, then move out-of-line if still too large. The default for most variable-length types.
  • EXTERNAL: never compress, but move out-of-line if needed. Useful for columns that are already compressed (PNG, JPEG, zstd-compressed payloads) where Postgres compression wastes CPU.
  • MAIN: try to compress and keep inline; only move out-of-line as a last resort.

You can change a column's strategy with ALTER TABLE ... ALTER COLUMN ... SET STORAGE. This affects future writes but not existing rows. The classic mistake is using EXTENDED on a column storing already-compressed data: Postgres spends CPU on compression that will not shrink the data, and then has to detect that the compressed result is larger than the input and silently keep the original. Switching to EXTERNAL on these columns is a frequent silent performance win.

The query performance implications

A query that reads only non-TOASTed columns from a wide table is fast, because the database scans only the main heap. A query that reads a TOASTed column has to do additional reads to the TOAST table, following the pointer indirection. This means SELECT * is more expensive on tables with large TOASTable columns than a column-list projection that omits the large fields.

The implication is operational: when investigating slow queries on wide tables, EXPLAIN ANALYZE shows the planner's view but not the TOAST detoasting cost. The actual elapsed time can be dominated by TOAST reads that do not appear in the plan. The fix is usually to project only the columns the application actually needs, which forces the planner to skip detoasting for unused columns.

The VACUUM implications

UPDATE on a TOASTed column generates new TOAST rows even if the main row's TOAST pointer would otherwise be reusable. Postgres has no general value-deduplication: every UPDATE creates new TOAST entries, and the old ones become dead tuples to be vacuumed. On tables with frequent UPDATEs to large columns, the TOAST table can bloat faster than the main heap, and the TOAST table's autovacuum schedule needs separate tuning.

The diagnostic is to monitor pg_stat_user_tables for the parent table and pg_stat_all_tables for the TOAST table separately. A common surprise is finding that the parent table is healthy but its TOAST relation has millions of dead tuples and has not been vacuumed in days because the autovacuum thresholds use the same scale factor as the parent, which is calibrated to a wider row count.

The size limits

A single TOASTed value can be up to 1GB in raw form, though performance degrades badly past about 10MB. The schoolroom advice of "Postgres can store anything" is not wrong, but the practical limit for fast access is much lower than the architectural limit. Above 10MB, the cost of detoasting plus the cost of moving the data over the network makes the database a poor choice compared to object storage with the metadata in Postgres.

Our rule across the four products is: store anything under 1MB inline and trust TOAST to handle it. Store anything over 10MB in S3-compatible object storage with a URL in Postgres. The 1-10MB middle range requires a case-by-case decision based on access patterns.

The configuration knobs worth knowing

The toast_tuple_target table-level parameter (Postgres 11+) sets the row size threshold above which TOAST kicks in, defaulting to 2032 bytes. Setting it lower forces more aggressive out-of-lining, which can help on tables where wide-row scans are common. Setting it higher keeps more data inline, which can help on small-table workloads with frequent full-row reads.

The default_toast_compression setting (Postgres 14+) switches between pglz and LZ4. LZ4 is faster to compress and decompress at a slight cost in compression ratio. For modern hardware on TOAST-heavy workloads, LZ4 is the right default and is what we use across our four products.

The deeper observation

The TOAST mechanism is one of those parts of Postgres that works well enough that most teams never look at it, but the failure modes when it goes wrong are subtle and operational. The pattern of "transparent magic that occasionally surprises you" is common across databases: it is also true of MVCC, query planning, autovacuum, and replication. The price of the convenience is a periodic obligation to understand what is actually happening under the covers, usually triggered by an incident.

The right discipline is to read the operational documentation for the parts of the database you depend on, even when they are transparent. The TOAST documentation is short (a few thousand words in the Postgres manual), and reading it once buys multi-year operational protection against surprises that cost much longer to debug from scratch.

Read more