Postgres EXPLAIN (FORMAT JSON): Programmatic Query Plan Analysis
The default EXPLAIN output is built for humans reading one plan at a time. The JSON format is built for tools, scripts, and dashboards that need to read many plans and compare them programmatically.
The default EXPLAIN output is one of the most useful Postgres features for understanding query performance. The output is a tree of operators with cost estimates, row estimates, and (with ANALYZE) actual execution metrics for each node, formatted as indented text designed for a human reading one plan at a time. The format is correct for that use case and has remained essentially stable since Postgres 8.0. The format is wrong for any use case that involves processing many plans programmatically: comparing plans across time, detecting regressions, building dashboards that surface slow-query patterns, or feeding plan data into automated analysis pipelines. For those use cases, EXPLAIN (FORMAT JSON) produces the same plan tree as structured JSON that any programming language can parse without writing a custom indent-aware text parser.
What the JSON format actually contains
The JSON output is a single-element array containing a top-level object with the plan tree as a nested Plan field. Each node has fields for the operator type (Node Type), the relation involved (Relation Name for table-scanning operators), startup and total cost estimates (Startup Cost and Total Cost), row estimates (Plan Rows), row width estimates (Plan Width), and a Plans array for child operators. With ANALYZE the nodes also include actual execution metrics: actual startup and total times (Actual Startup Time and Actual Total Time), actual row counts (Actual Rows), and loop counts (Actual Loops).
With BUFFERS the nodes include shared-block hit, read, dirtied, and written counters (Shared Hit Blocks, Shared Read Blocks, etc.) plus temp-block counters for sorts and hashes that spill to disk. With WAL (Postgres 13+) the nodes include WAL generation metrics. With VERBOSE the nodes include the specific columns output by each operator and the join conditions, which is useful for understanding what each node is actually doing rather than just its operator class.
The top-level object also includes summary fields: Planning Time and Execution Time at the top level (with ANALYZE), Settings (Postgres 12+) showing non-default configuration values that affected the plan choice, and Triggers for any triggers that fired during execution. The Settings field is one of the more underused features, because it shows exactly which non-default configuration values the planner used and helps diagnose cases where a query plans differently in production than in development because of a configuration drift.
The everyday diagnostic queries
The first useful programmatic query is the estimate-vs-actual ratio for every node in a plan. The pattern is to walk the plan tree, compute Actual Rows / Plan Rows at each node, and flag any nodes where the ratio is greater than 10 or less than 0.1. The ratio is the single most useful diagnostic in query plans because it points directly at the statistics that the planner has wrong, and the statistics-wrong cases are the most common cause of slow queries that the planner could have made fast with better information.
The second useful programmatic query is the operator-type histogram across many plans. The pattern is to capture plans from auto_explain logs or from pg_stat_statements wrapping, parse them all, count occurrences of each operator type, and surface the distribution. A workload that is dominated by Seq Scan operators may have missing indexes; a workload dominated by Hash Join operators with disk spills may need a larger work_mem; a workload dominated by Nested Loop operators with large outer relations may have statistics problems forcing the wrong join algorithm.
The third useful programmatic query is the buffer-hit ratio per query. With BUFFERS enabled, summing Shared Hit Blocks over Shared Hit Blocks + Shared Read Blocks across all nodes gives the cache hit ratio for the specific query. A query with a 99 percent cache hit ratio on a warm cache that drops to 30 percent on a cold cache is doing substantially more I/O than the warm-case timing suggests, which is the kind of difference that produces unpleasant surprises when production load patterns change.
The auto_explain integration
The auto_explain extension can be configured to emit JSON-formatted plans for slow queries, which produces the input data stream for programmatic analysis. The configuration auto_explain.log_format = 'json' in postgresql.conf produces JSON in the Postgres log; the configuration auto_explain.log_min_duration = 1000 captures plans for queries slower than one second; the configuration auto_explain.log_analyze = on includes actual execution metrics. The overhead of log_analyze is small but non-zero (per-node timing calls cost a few microseconds), so the right pattern is to enable it broadly with a high duration threshold rather than narrowly with a low duration threshold.
The combination of auto_explain with JSON format and a log-collection pipeline (Loki, Elasticsearch, BigQuery via export) produces a continuous stream of slow-query plans that can be analyzed offline. The pattern is to capture the plans, parse them, group by normalized query text, and surface the queries whose plans have changed since the last analysis window. The plan-changed-since-last-window signal is the leading indicator for query regression, and catching it early gives operators time to react before the regression becomes a customer-visible problem.
The pg_stat_statements complement
The pg_stat_statements extension aggregates query timing and I/O metrics across all executions of each query but does not capture plans. The combination with auto_explain JSON output fills the gap: pg_stat_statements tells you which queries are expensive in aggregate, and auto_explain JSON output tells you why specific slow executions were slow. The two together produce a complete picture that either alone leaves incomplete.
The technique of correlating pg_stat_statements with auto_explain plans by query fingerprint is the standard production observability pattern for Postgres workloads. The implementation is straightforward but specific to each log-collection pipeline; the underlying primitives (normalized query text in pg_stat_statements, query text in auto_explain output) make the correlation possible without custom Postgres extensions.
What the JSON format does not solve
The JSON format does not solve the problem that EXPLAIN without ANALYZE shows the planner's estimates rather than actual execution behavior. The estimates can be wrong by orders of magnitude when statistics are stale or when the query has correlated predicates that the planner cannot reason about. The right pattern for serious diagnosis is EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) which produces actual execution metrics alongside estimates, with the cost of actually running the query (including any side effects from data-modifying statements wrapped in BEGIN; EXPLAIN ANALYZE ...; ROLLBACK; if needed).
The JSON format does not solve the problem that EXPLAIN shows the plan for a single execution rather than the typical plan across many executions. The planner can choose different plans for the same query based on parameter values, query plan cache state, or autovacuum-driven statistics changes. The pattern of analyzing a single plan and concluding that "this query has this plan" is incorrect for queries with parameter sensitivity; the right pattern is to capture plans across many executions and look at the distribution.
The JSON format does not solve the problem that some plan information is hidden from EXPLAIN entirely. The plan-cache state, the runtime parameter values that produced the plan, the autovacuum state of the involved tables, and the row-level locking activity during execution are all invisible to EXPLAIN. The pattern of "I ran EXPLAIN and the plan looks fine" should always include a check of pg_stat_user_tables for table-level statistics health, pg_locks for any lock waits during execution, and pg_stat_activity for the planner's view of the query.
The diagnostic patterns that compound
The patterns that compound across many cycles of query analysis are: always run EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) rather than the default text format when investigating performance, always capture the output to a file rather than reading it interactively (the JSON is verbose enough that scrolling through it loses context), always compare against a recently-captured baseline when investigating regressions, always check the Settings block for configuration drift, and always look at the estimate-vs-actual ratios before reading the timing numbers.
The deeper observation is that programmatic plan analysis is one of the underused Postgres features. The text-format output is so familiar that the JSON format feels exotic, but the JSON format is what enables the kind of automated analysis that catches regressions early, surfaces patterns across many queries, and produces the operational visibility that production database workloads benefit from. Our SQLite-baseline products across DocuMint, CronPing, FlagBit, and WebhookVault have a similar but more limited capability via EXPLAIN QUERY PLAN output that can be parsed; our Postgres migration plan includes wrapping auto_explain JSON output capture as part of the launch observability investment.
Our products: DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay) put these patterns into production.