
This draft contains only sections that have differences from the version that it modified.

W3C

XML Path Language (XPath) 4.0 WG Review Draft

W3C Editor's Draft 23 February 2026

This version:
https://qt4cg.org/specifications/xpath-40/
Most recent version of XPath:
https://qt4cg.org/specifications/xpath-40/
Most recent Recommendation of XPath:
https://www.w3.org/TR/2017/REC-xpath-31-20170321/
Editor:
Michael Kay, Saxonica <mike@saxonica.com>

Please check the errata for any errors or issues reported since publication.

See also translations.

This document is also available in these non-normative formats: XML.


Abstract

XPath 4.0 is an expression language that allows the processing of values conforming to the data model defined in [XQuery and XPath Data Model (XDM) 4.0]. The name of the language derives from its most distinctive feature, the path expression, which provides a means of hierarchic addressing of the nodes in an XML tree. As well as modeling the tree structure of XML, the data model also includes atomic items, function items, maps, arrays, and sequences. This version of XPath supports JSON as well as XML, and adds many new functions in [XQuery and XPath Functions and Operators 4.0].

XPath 4.0 is a superset of XPath 3.1. A detailed list of changes made since XPath 3.1 can be found in Appendix [I Change Log].

Status of this Document

This is a draft prepared by the QT4CG (officially registered in W3C as the XSLT Extensions Community Group). Comments are invited.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).

Michael was central to the development of XML and many related technologies. He brought a polymathic breadth of knowledge and experience to everything he did. This, combined with his indefatigable curiosity and appetite for learning, made him an invaluable contributor to our project, along with many others. We have lost a brilliant thinker, a patient teacher, and a loyal friend.


4 Expressions

This section discusses each of the basic kinds of expression. Each kind of expression has a name such as PathExpr, which is introduced on the left side of the grammar production that defines the expression. Since XPath 4.0 is a composable language, each kind of expression is defined in terms of other expressions whose operators have a higher precedence. In this way, the precedence of operators is represented explicitly in the grammar.

The order in which expressions are discussed in this document does not reflect the order of operator precedence. In general, this document introduces the simplest kinds of expressions first, followed by more complex expressions. For the complete grammar, see Appendix [A XPath 4.0 Grammar].

The highest-level symbol in the XPath grammar is XPath.

XPath::=Expr
Expr::=(ExprSingle ++ ",")
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr
ForExpr::=ForClause ForLetReturn
LetExpr::=LetClause ForLetReturn
QuantifiedExpr::=("some" | "every") (QuantifierBinding ++ ",") "satisfies" ExprSingle
IfExpr::="if" "(" Expr ")" (UnbracedActions | BracedAction)
OrExpr::=AndExpr ("or" AndExpr)*

The XPath 4.0 operator that has lowest precedence is the comma operator, which is used to combine two operands to form a sequence. As shown in the grammar, a general expression (Expr) can consist of multiple ExprSingle operands, separated by commas.

The name ExprSingle denotes an expression that does not contain a top-level comma operator. (Despite its name, an ExprSingle may evaluate to a sequence containing more than one item.)

The symbol ExprSingle is used in various places in the grammar where an expression is not allowed to contain a top-level comma. For example, each of the arguments of a function call must be an ExprSingle, because commas are used to separate the arguments of a function call.

After the comma, the expressions that have next lowest precedence are ForExpr, LetExpr, QuantifiedExpr, IfExpr, and OrExpr. Each of these expressions is described in a separate section of this document.

4.12 For and Let Expressions

XPath provides two closely-related expressions, called For and Let expressions, that can be used to bind variables to values. These are described in the following sections.

4.12.1 For Expressions

Changes in 4.0  

  1. A for member clause is added to FLWOR expressions to allow iteration over an array.  [Issue 49 PR 344 10 February 2023]

  2. Multiple for and let clauses can be combined in an expression without an intervening return keyword.   [Issue 22 PR 28 18 December 2020]

  3. A for key/value clause is added to FLWOR expressions to allow iteration over maps.  [Issue 31 PR 1249 1 June 2024]

  4. A positional variable can be defined in a for expression.   [Issue 231 PR 1131 1 April 2024]

  5. The type of a variable used in a for expression can be declared.   [Issue 796 PR 1131 1 April 2024]

XPath provides an iteration facility called a for expression. It can be used to iterate over the items of a sequence, the members of an array, or the entries in a map.

ForExpr::=ForClause ForLetReturn
ForClause::="for" (ForBinding ++ ",")
ForBinding::=ForItemBinding | ForMemberBinding | ForEntryBinding
ForItemBinding::=VarNameAndType PositionalVar? "in" ExprSingle
VarNameAndType::="$" EQName TypeDeclaration?
EQName::=QName | URIQualifiedName
TypeDeclaration::="as" SequenceType
PositionalVar::="at" VarName
VarName::="$" EQName
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr
ForMemberBinding::="member" VarNameAndType PositionalVar? "in" ExprSingle
ForEntryBinding::=((ForEntryKeyBinding ForEntryValueBinding?) | ForEntryValueBinding) PositionalVar? "in" ExprSingle
ForEntryKeyBinding::="key" VarNameAndType
ForEntryValueBinding::="value" VarNameAndType
ForLetReturn::=ForExpr | LetExpr | ("return" ExprSingle)
LetExpr::=LetClause ForLetReturn

A for expression is evaluated as follows:

  1. If the ForClause includes multiple ForBindings with a comma separator, the for expression is first expanded to a set of nested for expressions, each of which contains a single ForBinding. More specifically, every separating comma is replaced by for.

    Example:

    The expression:

    for $x in X, $y in Y return $x + $y

    is expanded to:

    for $x in X for $y in Y return $x + $y
  2. Having performed this expansion, variables bound in the ForClause are called the range variables, the variable named in the PositionalVar (if present) is called the position variable, the expression that follows the in keyword is called the binding expression, and the expression in the ForLetReturn part (that is, the following LetExpr or ForExpr, or the ExprSingle that follows the return keyword) is called the return expression.

    [Definition: The result of evaluating the binding expression in a for expression is called the binding collection ].

  3. If a position variable is declared, its type is implicitly xs:integer. Its name (as an expanded QName) must be different from the name of a range variable declared in the same ForBinding [err:XQST0089].

  4. When a ForItemBinding is used (that is, when none of the keywords member, key, or value is used), the expression iterates over the items in a sequence:

    1. If a TypeDeclaration is present then each item in the binding collection is converted to the specified type by applying the coercion rules.

    2. The return expression is evaluated once for each item in the binding collection, with a dynamic context in which the range variable is bound to that item, and the position variable (if present) is bound to the one-based position of that item in the binding collection, as an instance of type xs:integer.

    3. The result of the for expression is the sequence concatenation of the results of the successive evaluations of the return expression.

  5. When the member keyword is present:

    1. The value of the binding collection must be a single array. Otherwise, a type error is raised: [err:XPTY0141]. However, the coercion rules also allow a JNode whose ·content· is an array to be supplied.

    2. If a TypeDeclaration is present then each member of the binding collection array is converted to the specified type by applying the coercion rules. (Recall that this can be any sequence, not necessarily a single item).

    3. The result of the single-variable for member expression is obtained by evaluating the return expression once for each member of that array, with the range variable bound to that member.

    4. The return expression is evaluated once for each member of the binding collection array, with a dynamic context in which the range variable is bound to that member, and the position variable (if present) is bound to the one-based position of that member in the binding collection.

    5. The result of the for expression is the sequence concatenation of the results of the successive evaluations of the return expression.

      Note that the result is a sequence, not an array.

  6. When the key and/or value keywords are present:

    1. The value of the binding collection must be a single map. Otherwise, a type error is raised: [err:XPTY0141]. However, the coercion rules also allow a JNode whose ·content· is a map to be supplied. The map is treated as a sequence of key/value pairs, in implementation-dependent order.

    2. If the key keyword is present, then the corresponding variable is bound to the key part of the key/value pair.

    3. If the value keyword is present, then the corresponding variable is bound to the value part of the key/value pair.

    4. If both the key and value keywords are present, then the corresponding variables must have distinct names [err:XQST0089].

    5. If a TypeDeclaration is present for the key, then each key is converted to the specified type by applying the coercion rules.

    6. If a TypeDeclaration is present for the value, then each value is converted to the specified type by applying the coercion rules.

    7. The result of the single-variable for key/value expression is obtained by evaluating the return expression once for each entry in the map, with the range variables bound to that entry as described.

    8. The return expression is evaluated once for each entry of the binding collection map, with a dynamic context in which the key range variable (if present) is bound to the key part of that entry, the value range variable (if present) is bound to the value part of that entry, and the position variable (if present) is bound to the one-based position of that entry in the implementation-dependent ordering of the binding collection.

    9. The result of the for expression is the sequence concatenation of the results of the successive evaluations of the return expression.

      Note that the result is a sequence, not a map.
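
The evaluation rules above can be modeled informally in another language. The following sketch uses Python (not XPath) and is illustrative only: a sequence is modeled as a Python list, an array as a list of members (each member itself a list), and a map as a dict. The function names for_items, for_members, and for_entries are hypothetical, and type checking and coercion are omitted.

```python
def for_items(sequence, body):
    """Model of: for $x at $p in SEQUENCE return BODY."""
    result = []
    for position, item in enumerate(sequence, start=1):  # one-based position
        result.extend(body(item, position))              # sequence concatenation
    return result


def for_members(array, body):
    """Model of: for member $m at $p in ARRAY return BODY."""
    result = []
    for position, member in enumerate(array, start=1):
        result.extend(body(member, position))  # result is a flat sequence, not an array
    return result


def for_entries(mapping, body):
    """Model of: for key $k value $v at $p in MAP return BODY."""
    result = []
    for position, (key, value) in enumerate(mapping.items(), start=1):
        result.extend(body(key, value, position))  # result is a sequence, not a map
    return result


items = for_items([10, 20, 30], lambda x, p: [x * p])
members = for_members([[1, 2], [], [3]], lambda m, p: [p, len(m)])
entries = for_entries({"x": 1, "y": 2}, lambda k, v, p: [f"{k}={v}"])
print(items)    # [10, 40, 90]
print(members)  # [1, 2, 2, 0, 3, 1]
print(entries)  # ['x=1', 'y=2']
```

A Python dict iterates in insertion order, which is only one of the orderings that the rules above permit for a map.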

The following example illustrates the use of a for expression in restructuring an input document. The example is based on the following input:

<bib>
  <book>
    <title>TCP/IP Illustrated</title>
    <author>Stevens</author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book>
    <title>Advanced Programming in the Unix Environment</title>
    <author>Stevens</author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book>
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
</bib>

The following example transforms the input document into a list in which each author’s name appears only once, followed by a list of titles of books written by that author. This example assumes that the context value is the bib element in the input document.

for $a in distinct-values(book/author)
return ((book/author[. = $a])[1], book[author = $a]/title)

The result of the above expression consists of the following sequence of elements. The titles of books written by a given author are listed after the name of the author. The ordering of author elements in the result is implementation-dependent due to the semantics of the fn:distinct-values function.

<author>Stevens</author>
<title>TCP/IP Illustrated</title>
<title>Advanced Programming in the Unix Environment</title>
<author>Abiteboul</author>
<title>Data on the Web</title>
<author>Buneman</author>
<title>Data on the Web</title>
<author>Suciu</author>
<title>Data on the Web</title>
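
For readers who want to check the result mechanically, the same restructuring can be sketched in Python using the standard xml.etree.ElementTree module. This is an informal analogue, not XPath: the result is modeled as a list of (element name, text) pairs, and distinct author names are taken in order of first occurrence, which is only one of the orderings the expression may produce.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book><title>TCP/IP Illustrated</title><author>Stevens</author></book>
  <book><title>Advanced Programming in the Unix Environment</title><author>Stevens</author></book>
  <book><title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author></book>
</bib>
"""

bib = ET.fromstring(BIB)  # plays the role of the context value (the bib element)

# Analogue of distinct-values(book/author), keeping first-occurrence order.
names = []
for author in bib.findall("book/author"):
    if author.text not in names:
        names.append(author.text)

# Analogue of the return clause: the author, then the titles of that author's books.
result = []
for name in names:
    result.append(("author", name))
    for book in bib.findall("book"):
        if any(a.text == name for a in book.findall("author")):
            result.append(("title", book.findtext("title")))

for pair in result:
    print(pair)
```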

The following example illustrates a for expression containing more than one variable:

for $i in (10, 20),
    $j in (1, 2)
return ($i + $j)

The result of the above expression, expressed as a sequence of numbers, is as follows: 11, 12, 21, 22

The scope of a variable bound in a for expression is the return expression. The scope does not include the expression to which the variable is bound. The following example illustrates how a variable binding may reference another variable bound earlier in the same for expression:

for $x in $z, $y in f($x)
return g($x, $y)
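
The two multi-variable examples above behave like nested loops. The following Python sketch (an analogue, not XPath) shows that the comma form and the explicitly nested form give the same result, and that a later binding can depend on an earlier variable. The functions f and g are hypothetical stand-ins for the functions named in the example.

```python
def f(x):
    # Hypothetical: returns the sequence 1 to x.
    return range(1, x + 1)


def g(x, y):
    # Hypothetical: combines the two bound values into one item.
    return x * 10 + y


# for $i in (10, 20), $j in (1, 2) return ($i + $j)
comma_form = [i + j for i in (10, 20) for j in (1, 2)]

# for $i in (10, 20) for $j in (1, 2) return ($i + $j)
nested_form = []
for i in (10, 20):
    for j in (1, 2):
        nested_form.append(i + j)

# for $x in $z, $y in f($x) return g($x, $y)
z = [1, 2]
dependent = [g(x, y) for x in z for y in f(x)]

print(comma_form)   # [11, 12, 21, 22]
print(nested_form)  # [11, 12, 21, 22]
print(dependent)    # [11, 21, 22]
```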

The following example illustrates processing of an array.

for member $map in parse-json('[{ "x": 1, "y": 2 }, { "x": 10, "y": 20 }]') 
return $map ! (?x + ?y)

The result is the sequence (3, 30).

The following example illustrates processing of a map.

for key $key value $value in { "x": 1, "y": 2, "z": 3 }
return `{$key}={$value}`

The result is the sequence ("x=1", "y=2", "z=3") (but not necessarily in that order).

Note:

The focus for evaluation of the return clause of a for expression is the same as the focus for evaluation of the for expression itself. The following example, which attempts to find the total value of a set of order-items, is therefore incorrect:

sum(for $i in order-item return @price * @qty)

Instead, the expression must be written to use the variable bound in the for clause:

sum(for $i in order-item return $i!(@price * @qty))

Note:

XPath 4.0 allows the format:

for $order in //orders
for $line in $order/order-line
return $line/value

primarily because it is familiar to XQuery users, some of whom may regard it as more readable than the XPath 3.1 alternative which uses a comma in place of the second for.

4.13 Maps and Arrays

Most modern programming languages have support for collections of key/value pairs, which may be called maps, dictionaries, associative arrays, hash tables, keyed lists, or objects (these are not the same thing as objects in object-oriented systems). In XPath 4.0, we call these maps. Most modern programming languages also support ordered lists of values, which may be called arrays, vectors, or sequences. In XPath 4.0, we have both sequences and arrays. Unlike sequences, an array is an item, and can appear as an item in a sequence.

Note:

The XPath 4.0 specification focuses on syntax provided for maps and arrays, especially constructors and lookup.

Some of the functionality typically needed for maps and arrays is provided by functions defined in Section 18 Processing maps [FO] and Section 19 Processing arrays [FO], including functions used to read JSON to create maps and arrays, serialize maps and arrays to JSON, combine maps to create a new map, remove map entries to create a new map, iterate over the keys of a map, convert an array to create a sequence, combine arrays to form a new array, and iterate over arrays in various ways.

4.13.1 Maps

Changes in 4.0  

  1. Ordered maps are introduced.  [Issue 1651 PR 1703 14 January 2025]

[Definition: A map is a function that associates a set of keys with values, resulting in a collection of key / value pairs.] [Definition: Each key / value pair in a map is called an entry.] [Definition: The value associated with a given key is called the associated value of the key.]

Maps and their properties are defined in the data model: see Section 8.2 Map Items [DM]. For an overview of the functions available for processing maps, see Section 18 Processing maps [FO].

Note:

Maps in XPath 4.0 are ordered. The effect of this property is explained in Section 8.2 Map Items [DM]. In an ordered map, the order of entries is predictable and depends on the order in which they were added to the map.

4.13.1.1 Map Constructors

Changes in 4.0  

  1. In map constructors, the keyword map is now optional, so map { 0: false(), 1: true() } can now be written { 0: false(), 1: true() }, provided it is used in a context where this creates no ambiguity.   [Issue 1070 PR 1071 26 March 2024]

  2. The order of key-value pairs in the map constructor is now retained in the constructed map.  [Issue 1651 PR 1703 14 January 2025]

  3. A general expression is allowed within a map constructor; this facilitates the creation of maps in which the presence or absence of particular keys is decided dynamically.  [Issue 2003 PR 2094 13 July 2025]

A map can be created using a MapConstructor.

Examples are:

{ "a": 1, "b": 2 }

which constructs a map with two entries, and

{ "a": 1, if ($condition) { map{ "b": 2 } } }

which constructs a map having either one or two entries depending on the value of $condition.

Both the keys and the values in a map constructor can be supplied as expressions rather than as constants.

MapConstructor::="map"? "{" (MapConstructorEntry ** ",") "}"
MapConstructorEntry::=ExprSingle (":" ExprSingle)?
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr

Note:

The keyword map was required in earlier versions of the language; in XPath 4.0 it becomes optional. There may be cases where using the keyword improves readability.

In order to allow the map keyword to be omitted, an incompatible change has been made to XQuery computed element and attribute constructors: if the name of the constructed element or attribute is a language keyword, it must now be written using the QNameLiteral syntax, for example element #div {}.

Although the grammar allows a MapConstructor to appear within an EnclosedExpr (that is, between curly brackets), this may be confusing to readers, and using the map keyword in such cases may improve clarity. The keyword map is used in the second example above to avoid any confusion between the braces required for the then part of the conditional expression, and the braces required for the inner map constructor.

If the EnclosedExpr appears in a context such as a StringTemplate, the two adjacent opening braces must be separated by at least one whitespace character.

When a MapConstructorEntry is written as two instances of ExprSingle separated by a colon, the first expression is evaluated and atomized to form a key, and the second expression is evaluated to form the corresponding value. The result is a single-entry map [DM] which will be merged into the constructed map, as described below. A type error [err:XPTY0004] occurs if the result of the first expression (after atomization) is not a single atomic item. The result of the second expression is used as is.

When the MapConstructorEntry is written as a single instance of ExprSingle with no colon, it must evaluate to a sequence of zero or more map items ([err:XPTY0004]). However, the coercion rules also allow a JNode whose ·content· is a map (or a sequence of maps) to be supplied. These map items will be merged into the constructed map, as described below.

Each contained MapConstructorEntry thus delivers zero or more maps, and the result of the map constructor is a new map obtained by merging these component maps, in order, as if by the map:merge function.

[Definition: Two atomic items K1 and K2 have the same key value if fn:atomic-equal(K1, K2) returns true, as specified in Section 14.2.1 fn:atomic-equal [FO].] If two or more entries have the same key value then a dynamic error is raised [err:XQDY0137]. The error may be raised statically if two or more entries can be determined statically to have the same key value.

The entry order [DM] of the entries in the constructed map retains the order of the MapConstructorEntry entries in the input.
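
The merging rules can be illustrated with a small Python model (not XPath, and not the definition of map:merge). In this sketch, a key/value entry is modeled as a tuple, an entry written without a colon as a list of zero or more dicts, and the function name construct_map is hypothetical. Python key equality is used as a simplifying stand-in for fn:atomic-equal and does not match it in every case.

```python
def construct_map(entries):
    """Merge the maps delivered by each entry, in order, rejecting duplicate keys."""
    result = {}
    for entry in entries:
        if isinstance(entry, tuple):
            key, value = entry
            maps = [{key: value}]      # key : value yields a single-entry map
        else:
            maps = list(entry)         # colon-less entry yields zero or more maps
        for component in maps:
            for key, value in component.items():
                if key in result:
                    raise ValueError(f"XQDY0137: duplicate key {key!r}")
                result[key] = value    # dicts keep insertion order
    return result


def example(condition):
    # Model of: { "a": 1, if ($condition) { map{ "b": 2 } } }
    return construct_map([("a", 1), [{"b": 2}] if condition else []])


print(example(True))   # {'a': 1, 'b': 2}
print(example(False))  # {'a': 1}
```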

Example: Constructing a fixed map

The following expression constructs a map with seven entries:

{
  "Su" : "Sunday",
  "Mo" : "Monday",
  "Tu" : "Tuesday",
  "We" : "Wednesday",
  "Th" : "Thursday",
  "Fr" : "Friday",
  "Sa" : "Saturday"
}

 

Example: Constructing a map with conditional entries

The following expression constructs a map with either five or seven entries, depending on a supplied condition:

{
  "Mo" : "Monday",
  "Tu" : "Tuesday",
  "We" : "Wednesday",
  "Th" : "Thursday",
  "Fr" : "Friday",
  if ($include-weekends) {
    { "Sa" : "Saturday",
      "Su" : "Sunday"
    }
  }
}

 

Example: Constructing a map to index nodes

The following expression (which uses two nested map constructors) constructs a map that indexes employees by the value of their @id attribute:

{ //employee ! { @id: . } }

 

Example: Constructing nested maps

Maps can nest, and can contain any XDM value. Here is an example of a nested map with values that can be string values, numeric values, or arrays:

{
  "book": {
    "title": "Data on the Web",
    "year": 2000,
    "author": [
      {
        "last": "Abiteboul",
        "first": "Serge"
      },
      {
        "last": "Buneman",
        "first": "Peter"
      },
      {
        "last": "Suciu",
        "first": "Dan"
      }
    ],
    "publisher": "Morgan Kaufmann Publishers",
    "price": 39.95
  }
}

Note:

The syntax deliberately mimics JSON, but there are a few differences. JSON constructs that are not accepted in XPath 4.0 map constructors include the keywords true, false, and null, and backslash-escaped characters such as "\n" in string literals. In an XPath 4.0 map constructor, of course, any literal value can be replaced with an expression.

Note:

In some circumstances, it is necessary to include whitespace before or after the colon of a MapConstructorEntry to ensure that it is parsed as intended.

For instance, consider the expression {a:b}. Although it matches the EBNF for MapConstructor (with the name a matching MapKeyExpr and the name b matching MapValueExpr), the "longest possible match" rule requires that a:b be parsed as a QName, which results in a syntax error. Changing the expression to {a :b} or {a: b} will prevent this, resulting in the intended parse.

Similarly, consider these three expressions:

{a:b:c}
{a:*:c}
{*:b:c}

In each case, the expression matches the EBNF in two different ways, but the “longest possible match” rule forces the parse in which the MapKeyExpr is a:b, a:*, or *:b (respectively) and the MapValueExpr is c. To achieve the alternative parse (in which the MapKeyExpr is merely a or *), insert whitespace before and/or after the first colon.

See A.3 Lexical structure.
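
The effect of the longest-match rule on these examples can be reproduced with a deliberately simplified tokenizer. The Python sketch below is an assumption-laden illustration, not the XPath lexical grammar: names are restricted to ASCII, only the tokens needed for these examples are recognized, and the token class names are hypothetical.

```python
import re

NCNAME = r"[A-Za-z_][A-Za-z0-9_.-]*"

# Alternatives are ordered so that the longest candidate token is tried first.
TOKEN = re.compile(
    rf"(?P<QName>{NCNAME}:{NCNAME})"
    rf"|(?P<Wildcard>{NCNAME}:\*|\*:{NCNAME})"
    rf"|(?P<NCName>{NCNAME})"
    r"|(?P<Punct>[{}:*])"
    r"|(?P<WS>\s+)"
)


def tokenize(text):
    """Split text into (class, lexeme) pairs, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("{a:b}"))    # a:b is consumed as a single QName
print(tokenize("{a: b}"))   # whitespace yields the tokens a, :, b
print(tokenize("{a:*:c}"))  # a:* is consumed as a single wildcard
```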

Note:

There are also several functions that can be used to construct maps with a variable number of entries:

  • map:build takes any sequence as input, and for each item in the sequence, it computes a key and a value, by calling user-supplied functions.

  • map:merge takes a sequence of maps (often but not necessarily single-entry maps [DM]) and merges them into a single map.

  • map:of-pairs takes a sequence of key-value pair maps [FO] and merges them into a single map.

Any of these functions can be used to build an index of employee elements using the value of the @id attribute as a key:

  • map:build(//employee, fn { @id })

  • map:merge(//employee ! { @id: . })

  • map:of-pairs(//employee ! { 'key': @id, 'value': . })

All three functions also provide control over:

  • The way in which duplicate keys are handled, and

  • The ordering of entries in the resulting map.

4.13.4 Filter Expressions for Maps and Arrays

Changes in 4.0  

  1. Filter expressions for maps and arrays are introduced.   [Issue 1159 PR 1163 20 April 2024]

  2. Predicates in filter expressions for maps and arrays can now be numeric.   [Issue 1207 PR 1217 15 May 2024]

FilterExprAM::=PostfixExpr "?[" Expr "]"
PostfixExpr::=PrimaryExpr | FilterExpr | DynamicFunctionCall | LookupExpr | FilterExprAM
Expr::=(ExprSingle ++ ",")
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr

Maps and arrays can be filtered using the construct INPUT?[FILTER]. For example, $array?[count(.)=1] filters an array to retain only those members that are single items.

Note:

The character-pair ?[ forms a single token; no intervening whitespace or comment is allowed.

The required type of the left-hand operand INPUT is (map(*)|array(*))?: that is, it must be either an empty sequence, a single map, or a single array [err:XPTY0004]. However, the coercion rules also allow a JNode whose ·content· is a map or array to be supplied. If the value is an empty sequence, the result of the expression is an empty sequence.

If the value of INPUT is an array, then the FILTER expression is evaluated for each member of the array, with that member as the context value, with its position in the array as the context position, and with the size of the array as the context size. The result of the expression is an array containing those members of the input array for which the predicate truth value of the FILTER expression is true. The order of retained members is preserved.

For example, the following expression:

let $array := [ (), 1, (2, 3), (4, 5, 6) ]
return $array?[count(.) ge 2]

returns:

[ (2, 3), (4, 5, 6) ]

Note:

Numeric predicates are handled in the same way as with filter expressions for sequences. However, the result is always an array, even if only one member is selected. For example, given the $array shown above, the result of $array?[3] is the single-member array [DM] [ (2, 3) ]. Contrast this with $array?3, which delivers the sequence 2, 3.
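
The array case can be modeled informally in Python (not XPath). In this sketch an array is a list of members, each member is a list, and the predicate receives the member, the context position, and the context size. The names filter_array and predicate_truth_value are hypothetical, and the treatment of a sequence of numbers as a set of selected positions is an assumption about how numeric predicates behave.

```python
def predicate_truth_value(outcome, position):
    """Boolean outcomes are used directly; numeric outcomes select by position."""
    if isinstance(outcome, bool):
        return outcome
    if isinstance(outcome, (int, float)):
        return outcome == position
    return position in outcome  # assumed: a sequence of numbers selects those positions


def filter_array(array, predicate):
    """Model of ARRAY?[FILTER]: always returns an array (a list of members)."""
    size = len(array)
    return [
        member
        for position, member in enumerate(array, start=1)
        if predicate_truth_value(predicate(member, position, size), position)
    ]


array = [[], [1], [2, 3], [4, 5, 6]]  # model of [ (), 1, (2, 3), (4, 5, 6) ]

print(filter_array(array, lambda m, p, s: len(m) >= 2))  # [[2, 3], [4, 5, 6]]
print(filter_array(array, lambda m, p, s: 3))            # [[2, 3]], a single-member array
print(array[3 - 1])                                      # [2, 3], the lookup $array?3
```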

If the value of INPUT is a map, then the FILTER expression is evaluated for each entry in the map, with the context value set to an item of type record(key as xs:anyAtomicType, value as item()*), in which the key and value fields represent the key and value of the map entry. The context position is the position of the entry in the map (in entry order [DM]), and the context size is the number of entries in the map. The result of the expression is a map containing those entries of the input map for which the predicate truth value of the FILTER expression is true. The relative order of entries in the result retains the relative order of entries in the input.

For example, the following expression:

let $map := { 1: "alpha", 2: "beta", 3: "gamma" }
return $map?[?key ge 2]

returns:

{ 2: "beta", 3: "gamma" }

Note:

A filter expression such as $map?[last()-1, last()] might be used to return the last two entries of a map in entry order [DM].
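
The map case can be modeled in the same informal way in Python (not XPath). Here a map is a dict, the context value for each entry is a dict with the fields key and value, and the predicate also receives the context position and context size. The names filter_map and predicate_truth_value are hypothetical, and the handling of a sequence of numbers as a set of selected positions is an assumption.

```python
def predicate_truth_value(outcome, position):
    """Boolean outcomes are used directly; numeric outcomes select by position."""
    if isinstance(outcome, bool):
        return outcome
    if isinstance(outcome, (int, float)):
        return outcome == position
    return position in outcome  # assumed: a sequence of numbers selects those positions


def filter_map(mapping, predicate):
    """Model of MAP?[FILTER]: returns a map whose entries keep their relative order."""
    size = len(mapping)
    result = {}
    for position, (key, value) in enumerate(mapping.items(), start=1):
        entry = {"key": key, "value": value}  # the record seen by the predicate
        if predicate_truth_value(predicate(entry, position, size), position):
            result[key] = value
    return result


m = {1: "alpha", 2: "beta", 3: "gamma"}

print(filter_map(m, lambda e, p, s: e["key"] >= 2))  # {2: 'beta', 3: 'gamma'}
print(filter_map(m, lambda e, p, s: [s - 1, s]))     # {2: 'beta', 3: 'gamma'}
```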