Postgres SQL/JSON Path Expressions: Querying jsonb with Real Pattern Matching
Postgres has supported jsonb since 9.4, but the SQL standard's jsonpath language added in Postgres 12 changed what you can ask of JSON columns. The operators are unfamiliar and the language is its own small grammar, but the capability is real.
Most jsonb queries in practice use a small handful of operators: the -> and ->> field accessors, the @> containment operator, and the ? key-existence operator. These cover the cases where you know exactly which key you want and what value you expect there. They do not cover the case where you want to ask whether any element of a nested array matches a predicate, or whether a value exists somewhere in a deep structure regardless of path, or whether multiple conditions hold across different parts of the same document. For those queries, Postgres 12 added support for the SQL standard's jsonpath language and the operators @@ and @? that evaluate jsonpath expressions against jsonb values.
What jsonpath is
A jsonpath expression is a small grammar for navigating and filtering JSON documents. The grammar borrows from XPath but is simpler. The root of the document is $. Field access is .field or ."field with spaces". Array element access is [n] for a specific index or [*] for any element. Recursive descent is .* for any field at the current level or .** for any descendant at any depth. Filter predicates use ?(@.field == value) syntax with @ referring to the current element.
The two operators that evaluate jsonpath against jsonb are @? (does the expression match anything in the document, returning boolean) and @@ (does the predicate hold for the document, returning boolean). The distinction matters: @? is true if the path returns at least one item, while @@ evaluates a predicate that must be explicitly true at the path's destination. In practice you reach for @? for existence questions and @@ for value comparison questions.
The basic patterns look like this. Find documents where any element of the tags array equals 'urgent': tags @? '$[*] ? (@ == "urgent")'. Find documents where the user object has age greater than 21: data @@ '$.user.age > 21'. Find documents where any nested object at any depth has a deleted field that is true: data @? '$.** ? (@.deleted == true)'.
Why jsonpath earns its complexity
The operator set for jsonb without jsonpath is awkward for nested-structure queries. Consider a document representing a customer with multiple addresses, each address having a country field. Without jsonpath, asking whether any address has country US requires either application-side iteration after fetching the whole document or a lateral join that unnests the array: SELECT * FROM customers c, jsonb_array_elements(c.addresses) addr WHERE addr->>'country' = 'US'. With jsonpath the query becomes SELECT * FROM customers WHERE addresses @? '$[*] ? (@.country == "US")', which reads closer to what the question actually asks and avoids the row multiplication that the unnesting approach produces if you only want existence and not all matching elements.
The deep-search case is where jsonpath becomes genuinely difficult to replicate without it. The recursive descent operator .** finds matching elements at any depth, which is impossible to express in pure SQL without recursive CTEs that walk the document. For documents with variable structure (audit logs that capture arbitrary event payloads, webhook events from heterogeneous sources, configuration documents with optional nested structure), the ability to ask "does this field exist anywhere in the document" is the difference between feasible and infeasible queries.
The predicate filtering in jsonpath also includes comparison operators (==, !=, <, >, <=, >=), logical operators (&&, ||, !), and string functions (starts with, like_regex). The like_regex function is particularly useful for pattern matching within JSON string values, which would otherwise require extracting the string with ->> and matching with SQL's regex operators in the WHERE clause.
Indexing for jsonpath queries
The GIN indexes that work for the @> containment operator do not help for jsonpath queries by default. The jsonb_path_ops opclass that gives smaller and faster indexes for containment does not support jsonpath at all. The standard jsonb_ops opclass supports the @? and @@ operators, but the index is used less aggressively than for containment queries and the planner's cost estimates can be wrong for complex jsonpath expressions.
The Postgres 12+ approach to indexing jsonpath queries is to create expression indexes on the specific paths you query. If you frequently ask whether data.user.country is some specific value, an index on (data -> 'user' ->> 'country') works for that specific query but does not generalize to arbitrary jsonpath expressions. The trade-off is between flexibility (jsonpath queries with GIN index, slower but works for any path) and performance (expression indexes on specific paths, faster but narrow).
The Postgres 13+ jsonb_path_ops on a flat representation gives some support for jsonpath existence queries, but the indexing story for jsonpath is genuinely weaker than for the simpler operators. For high-volume read paths against jsonb columns with jsonpath queries, the typical solution is to denormalize the specific fields you query into proper columns (often as generated columns) and index those, treating jsonpath as the fallback for less-common queries.
The jsonpath gotchas
The first gotcha is that jsonpath is strict mode by default, which means accessing missing fields raises an error. The lax mode (specified with strict or lax prefix in the expression) silently treats missing fields as null and is usually what you want for queries against heterogeneous documents. The default is lax for the @? and @@ operators, but explicit specification is clearer.
The second gotcha is that arithmetic and comparison operators in jsonpath have different semantics from SQL. The == operator is exact equality including type. A jsonpath query comparing a string "21" to a number 21 returns false, whereas the SQL comparison after extracting with ->> would do automatic type coercion. The implication is that jsonpath comparisons can silently miss matches when the document has inconsistent typing.
The third gotcha is that the predicate filtering in jsonpath applies to elements at the path destination, not to entire documents. The expression $[*] ? (@.country == "US") returns the matching array elements; the @? operator checks whether that returned set is non-empty. If you want the original document filtered by whether any element matches, you use @? in the WHERE clause; if you want the matching elements themselves, you use jsonb_path_query as a set-returning function in the FROM or SELECT clause. The two use cases require different syntactic patterns.
When jsonpath is the right tool
The cases where jsonpath earns its complexity are deep-search queries against documents with variable structure, predicate filtering across nested arrays, and existence checks for fields whose path depth is not fixed. These are the cases where the simpler operators produce awkward queries or require application-side post-processing.
The cases where simpler operators are sufficient and jsonpath is over-engineered are flat structures, fixed schemas, and exact-match queries on known paths. For most application workloads against jsonb columns, the -> and ->> accessors plus @> containment cover 80-90 percent of queries; jsonpath handles the remainder where it would be infeasible without it.
The cases where neither approach is right and you should reach for proper columns are queries that drive read-path performance, queries that need indexing for high-volume access, and queries whose pattern is stable enough that the document shape could be normalized into relational columns. The flexibility of jsonb has operational costs, and the question of when to denormalize jsonb into columns is one of the recurring schema-evolution decisions in Postgres-based products.
Our use across the four products
Our four products use jsonb columns for audit log payloads, webhook event data, and customer-supplied configuration. The query patterns are mostly the simple operators against known paths, with jsonpath used in a small number of analytics queries that ask about deep structure.
WebhookVault stores captured webhook payloads as jsonb and uses jsonpath for the customer-facing search feature ("find requests where any header matches X" or "find requests where the body has a deleted field anywhere"). The jsonpath usage is bounded to a small set of pre-defined expressions exposed through the dashboard search UI, not arbitrary customer-supplied expressions (because customer-supplied jsonpath would be a denial-of-service vector with complex expressions against large documents). The query performance is acceptable because the captured request retention is bounded and the indexed columns handle the high-volume filtering.
FlagBit uses jsonb for the targeting rules associated with each feature flag, with the rules being a structured tree of conditions. The evaluation path uses the simpler operators and direct field access; the analytics path (which we use internally to understand which rules are being used) uses jsonpath to walk the rule tree looking for specific patterns.
CronPing uses jsonb for monitor configuration and uses only the simple operators; the structure is regular enough that jsonpath would be over-engineered.
DocuMint uses jsonb for invoice template configuration and PDF generation parameters; the structure is variable enough that jsonpath might help future queries, but we have not needed it yet.
What jsonpath does not solve
The first thing jsonpath does not solve is mutation. The jsonpath operators are read-only; to modify jsonb based on jsonpath matching, you use jsonb_set with paths specified as text arrays, not jsonpath expressions. This is an awkward asymmetry in the jsonb API, and Postgres 16 added jsonb_path_query_array and related functions that help bridge the gap, but the canonical mutation operators do not accept jsonpath.
The second thing jsonpath does not solve is the impedance mismatch between document and relational thinking. If your queries against jsonb consistently look like "find documents where this nested field has this value" with the nested field being the same across all queries, you are paying the jsonb flexibility tax without using it. The proper solution is to extract those fields into columns, not to write more elaborate jsonpath expressions to access them.
The third thing jsonpath does not solve is query performance at high volume. The execution of a jsonpath expression against a jsonb value is interpreted and involves overhead per row. For analytical queries scanning millions of documents, the cost can be substantial, and the typical solution is materialized views with the relevant fields extracted as columns. The jsonpath approach works at the small-to-medium scale where the query writing time matters more than the per-query execution time.
The deeper observation is that jsonpath is a tool for cases where the document shape is genuinely variable and the queries need to navigate that variability. Postgres has been adding capabilities for working with semi-structured data for over a decade, and jsonpath is the most recent significant addition. The operator set is larger than most teams realize, the indexing story is weaker than the simple-operator story, and the right pattern is to reach for the simpler operators first and graduate to jsonpath only when the document structure genuinely requires it.
Our products: DocuMint (PDF invoice generation API), CronPing (cron job monitoring with status pages), FlagBit (feature flags API for modern teams), and WebhookVault (webhook capture and replay) put these patterns into production.