Postgres COMMENT ON: The Underused Schema Documentation That Lives Where Schema Lives
Postgres has had COMMENT ON since version 7 in 2002. Almost no team uses it. The cost is one extra line per object and the value is that documentation lives where the schema lives and survives every refactor.
Postgres has had a COMMENT ON statement since version 7, released in 2002. It attaches a free-form text string to any database object: table, column, function, index, sequence, schema, type, role, tablespace. The comment is stored in the system catalogs, visible in psql via \d+, exposed through information_schema, and preserved across pg_dump and restore. It costs one extra line per object to write. Almost no team uses it.
The reasons are familiar. Documentation lives in README files, in wiki pages, in design docs, in the comments on the application-layer ORM model. Those locations are where engineers are already writing prose; reaching for a SQL statement to add documentation feels like extra work. The wiki page goes stale, the README diverges from the schema, the ORM comment is invisible to the DBA debugging a query at 2 AM. The schema-level comment is the only place where documentation lives next to the thing it documents and travels with it through every migration.
What COMMENT ON actually does
The syntax is straightforward. COMMENT ON TABLE invoices IS 'Customer-facing invoice records, one row per invoice. Soft-deleted via deleted_at.'; attaches a string to the table. COMMENT ON COLUMN invoices.subtotal_cents IS 'Pre-tax invoice total in cents. Always non-negative.'; attaches a string to a specific column. The same pattern works for functions, indexes, triggers, sequences, schemas, types, and roles.
The string can be any length and contains any text. Newlines and Markdown are allowed; the string is opaque to Postgres. Tools that read the comments may interpret the content (pgAdmin renders Markdown, some documentation generators look for structured tags), but the database itself just stores the bytes.
Reading the comments is also straightforward. The \d+ tablename command in psql shows the comment for the table and each column. The obj_description() and col_description() functions return comments programmatically. The pg_description system catalog stores the raw comments and can be joined to pg_class, pg_attribute, and other catalogs to build documentation views.
What to comment on
Not everything needs a comment. The discipline is to comment on the things where the name is insufficient to convey the meaning, the constraints, or the history. Three categories cover most of the high-value cases.
First: tables whose purpose is non-obvious. A table named invoices probably does not need a comment; a table named ledger_entries does, because the relationship to invoices and payments is not implied by the name. A junction table named users_organizations probably does not need a comment; a junction table named access_grants does, because the grant model has rules (expiry, scope, delegation) that the name does not convey.
Second: columns whose meaning is non-obvious or whose constraints are non-obvious. A column named status needs a comment listing the valid values and their meanings. A column named amount_cents needs a comment specifying the currency, the sign convention, and whether tax is included. A column named metadata needs a comment describing what shape the JSON blob takes. A column named created_at probably does not need a comment.
Third: anything with a non-obvious history. A column that exists for backwards compatibility but should not be referenced in new code deserves a comment saying so. An index that exists to support one specific query that runs once a day deserves a comment naming the query. A trigger that exists because of an incident in 2023 deserves a comment referencing the incident. The future engineer who finds these objects without context will guess wrong, and the cost of guessing wrong is usually higher than the cost of writing the comment was.
What the comment should say
The comment should answer the question a future engineer would ask when first encountering the object. For tables: what does this represent, what is the granularity of one row, what is the relationship to other tables. For columns: what does this contain, what are the valid values, what is the sign or unit convention. For indexes: what query is this for, why is the column order what it is. For functions: what is the contract, what side effects exist, what is the cost.
The comment should not restate the name. A comment on the customer_id column that says "the customer ID" is worse than no comment at all because it implies that the obvious meaning is the actual meaning, which might not be true. A useful comment on customer_id would say "FK to customers.id. NULL for system-generated invoices (rare, see migration 0042).".
The comment should record the why, not just the what. A column with a name and type is self-documenting at the what level. The thing the database cannot tell you is why the column exists, what it was designed for, what constraints apply beyond the explicit constraints. The comment is the place to record the why.
The migration discipline
Comments belong in migrations. A migration that creates a table should include the COMMENT ON statements for the table and each of its non-obvious columns. A migration that adds a column should include the COMMENT ON for the column. A migration that changes the meaning of a column should include an updated COMMENT ON.
This is the rule that makes the comments useful long-term. If comments are added ad-hoc by individual engineers in the database console, they will drift, they will disappear during schema refactors, and the team will stop trusting them. If they are added through the same migration pipeline that adds the schema, they will survive every change and remain trustworthy.
The cost is small: a few extra lines per migration. The benefit compounds: every future engineer who looks at the schema in any tool sees the documentation, every documentation generator that reads the database sees the comments, every code review that involves schema changes can include review of the comments.
The tooling surface
Several tools read database comments and incorporate them into documentation. SchemaSpy generates HTML documentation from a database including all comments. pg_dump preserves comments by default, so they survive backup-and-restore. The Postgres ORM ecosystem is uneven: some ORMs read comments and incorporate them into generated code documentation, others ignore them.
The information_schema views expose comments under the obj_description family. Custom documentation generators can join pg_description against the other catalogs to produce per-table, per-column reports. The \d+ psql command remains the most-used surface for working engineers and the lowest-friction way to encounter the comments during normal work.
The tooling that does not exist, and that would be valuable, is a comment-linter that flags schema changes without corresponding comment updates. The pattern of "you renamed a column but didn't update its comment" is a common case where comments drift, and a CI check that catches it would be easy to write and would close the most common drift path. As far as we know no widely-adopted tool does this yet.
What this looks like in practice
Our four products (DocuMint for PDF invoice generation, CronPing for cron job monitoring, FlagBit for feature flags, and WebhookVault for webhook debugging) are SQLite-based at current scale. SQLite does not have COMMENT ON in the same form, though it has a workaround via comments inside CREATE TABLE statements that are preserved in the schema text and visible via .schema. The discipline of writing comments next to non-obvious columns transfers across the two databases even if the syntax differs.
The eventual migration to Postgres will inherit the comments from the SQLite schema; the small upfront investment in writing them in SQLite saves the larger cost of trying to reconstruct the design intent later. The pattern is the same: comments belong in the migration, they describe the why not the what, and they survive every refactor because they live where the schema lives.
Three observations
First: COMMENT ON is one of the cheapest possible code-review additions a team can make. Adding "every migration that creates or modifies a column should include a COMMENT ON for non-obvious columns" to the migration review checklist costs essentially nothing per migration and accumulates into a much more navigable schema over years.
Second: the resistance to using COMMENT ON is mostly cultural rather than technical. Engineers who would readily write a doc-comment on a Python class or a Go struct will not write a database comment on the corresponding table, even though the database comment is more durable (it survives ORM rewrites) and more accessible (it shows up in every database tool). The mismatch is worth examining.
Third: the schema is the most stable artifact in most systems. Application code gets rewritten every few years. Frontend frameworks change every year. The database schema, once established, often persists for the entire lifetime of the product, sometimes outliving the company that created it. Documentation that lives in the schema is documentation that has the longest possible shelf life. The discipline of putting documentation in the longest-lived layer of the system is one of the small habits that distinguishes mature teams from teams that are still learning.
The deeper observation is that documentation tends to drift from the thing it documents in proportion to the distance between the two artifacts. README files drift faster than docstrings, docstrings drift faster than schema comments. The argument for COMMENT ON is structural: it minimizes the drift distance by putting the documentation in the same place as the thing being documented. The cost is small and the benefit is durable, and the main reason teams don't use it is that they haven't built the habit. Building the habit takes a code-review checklist and a few months. The payoff persists for the lifetime of the schema.