XML Path Language (XPath) 4.0 WG Review Draft

Except as noted in this document, if any operand of an expression raises a dynamic error, the expression also raises a dynamic error. If an expression can validly return a value or raise a dynamic error, the implementation may choose to return the value or raise the dynamic error (see 2.4.4 Errors and Optimization). For example, the logical expression expr1 and expr2 may return the value false if either operand returns false, or may raise a dynamic error if either operand raises a dynamic error.

If more than one operand of an expression raises an error, the implementation may choose which error is raised by the expression. For example, in this expression:

($x div $y) + xs:decimal($z)

both the sub-expressions ($x div $y) and xs:decimal($z) may raise an error. The implementation may choose which error is raised by the + expression. Once one operand raises an error, the implementation is not required, but is permitted, to evaluate any other operands.
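As a small illustration (the literal operands are chosen for this sketch and do not come from the specification text), both operands of the following expression raise dynamic errors, and a processor may report either one:

```xpath
(: Left operand: integer division by zero raises err:FOAR0001.        :)
(: Right operand: "abc" is not a valid decimal, raising err:FORG0001. :)
(1 div 0) + xs:decimal("abc")
```

Which of the two error codes is reported is not something a portable application should rely on.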

[Definition: In addition to its identifying QName, a dynamic error may also carry a descriptive string and one or more additional values called error values.] An implementation may provide a mechanism whereby an application-defined error handler can process error values and produce diagnostic messages. The host language may also provide error handling mechanisms.

A dynamic error may be raised by a system function or operator. For example, the div operator raises an error if its operands are xs:decimal values and its second operand is equal to zero. Errors raised by system functions and operators are defined in [XQuery and XPath Functions and Operators 4.0] or the host language.

A dynamic error can also be raised explicitly by calling the fn:error function, which always raises a dynamic error and never returns a value. This function is defined in Section 3.1.1 fn:error^FO. For example, the following function call raises a dynamic error, providing a QName that identifies the error, a descriptive string, and a diagnostic value (assuming that the prefix app is bound to a namespace containing application-defined error codes):

error(#app:err057, "Unexpected value", string($v))

4 Expressions

This section discusses each of the basic kinds of expression. Each kind of expression has a name such as PathExpr, which is introduced on the left side of the grammar production that defines the expression. Since XPath 4.0 is a composable language, each kind of expression is defined in terms of other expressions whose operators have a higher precedence. In this way, the precedence of operators is represented explicitly in the grammar.

The order in which expressions are discussed in this document does not reflect the order of operator precedence. In general, this document introduces the simplest kinds of expressions first, followed by more complex expressions. For the complete grammar, see Appendix [A XPath 4.0 Grammar].

The highest-level symbol in the XPath grammar is XPath.

`XPath`	::=	`Expr`
`Expr`	::=	`(ExprSingle ++ ",")`
`ExprSingle`	::=	`ForExpr \| LetExpr \| QuantifiedExpr \| IfExpr \| OrExpr`
`ForExpr`	::=	`ForClause ForLetReturn`
`LetExpr`	::=	`LetClause ForLetReturn`
`QuantifiedExpr`	::=	`("some" \| "every") (QuantifierBinding ++ ",") "satisfies" ExprSingle`
`IfExpr`	::=	`"if" "(" Expr ")" (UnbracedActions \| BracedAction)`
`OrExpr`	::=	`AndExpr ("or" AndExpr)*`

The XPath 4.0 operator that has lowest precedence is the comma operator, which is used to combine two operands to form a sequence. As shown in the grammar, a general expression (Expr) can consist of multiple ExprSingle operands, separated by commas.

The name ExprSingle denotes an expression that does not contain a top-level comma operator. Despite its name, an ExprSingle may evaluate to a sequence containing more than one item.

The symbol ExprSingle is used in various places in the grammar where an expression is not allowed to contain a top-level comma. For example, each of the arguments of a function call must be an ExprSingle, because commas are used to separate the arguments of a function call.
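The following sketch shows the practical consequence for function calls; the choice of fn:count and of the operand values is illustrative only:

```xpath
(: A top-level comma builds a sequence of three items. :)
1, 2, 3

(: Inside an argument list, commas separate arguments, so a sequence
   passed as a single argument must be parenthesized. This returns 3. :)
count((1, 2, 3))

(: By contrast, count(1, 2, 3) would supply three arguments to a
   function that accepts only one, which is an error. :)
```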

After the comma, the expressions that have next lowest precedence are ForExpr, LetExpr, QuantifiedExpr, IfExpr, and OrExpr. Each of these expressions is described in a separate section of this document.

4.2 Primary Expressions

[Definition: Primary expressions are the basic primitives of the language. They include literals, variable references, context value references, and function calls. A primary expression may also be created by enclosing any expression in parentheses, which is sometimes helpful in controlling the precedence of operators.] Map and Array Constructors are described in 4.13.1 Maps and 4.13.2 Arrays.

`PrimaryExpr`	::=	`Literal \| VarRef \| ParenthesizedExpr \| ContextValueRef \| FunctionCall \| FunctionItemExpr \| MapConstructor \| ArrayConstructor \| StringTemplate \| UnaryLookup`
`Literal`	::=	`NumericLiteral \| StringLiteral \| QNameLiteral`
`VarRef`	::=	`"$" EQName`
`ParenthesizedExpr`	::=	`"(" Expr? ")"`
`ContextValueRef`	::=	`"."`
`FunctionCall`	::=	`EQName ArgumentList`
		/* xgc: reserved-function-names */
		/* gn: parens */
`FunctionItemExpr`	::=	`NamedFunctionRef \| InlineFunctionExpr`
`NamedFunctionRef`	::=	`EQName "#" IntegerLiteral`
		/* xgc: reserved-function-names */
`InlineFunctionExpr`	::=	`MethodAnnotation* ("function" \| "fn") FunctionSignature? FunctionBody`
`MapConstructor`	::=	`"map"? "{" (MapConstructorEntry ** ",") "}"`
`ArrayConstructor`	::=	`SquareArrayConstructor \| CurlyArrayConstructor`
`StringTemplate`	::=	"`" (StringTemplateFixedPart \| StringTemplateVariablePart)* "`"
		/* ws: explicit */
`UnaryLookup`	::=	`Lookup`

4.2.1 Literals

[Definition: A literal is a direct syntactic representation of an atomic item.] XPath 4.0 supports three kinds of literals: numeric literals, string literals, and QName literals.

4.2.1.3 QName Literals

Changes in 4.0 ⬇ ⬆

QName literals are new in 4.0. [Issue 1661]

A QName literal represents a value of type xs:QName.

`QNameLiteral`	::=	`"#" EQName`
`EQName`	::=	`QName \| URIQualifiedName`

For example, the expression node-name($node) = #xml:space returns true if the name of the node $node is the QName with local part space and namespace URI http://www.w3.org/XML/1998/namespace.

If the EQName is an unprefixed NCName, then it is taken as being in no namespace. If it is a prefixed QName, then the prefix is resolved to a namespace URI using the statically known namespaces. If there is no binding for the prefix in the statically known namespaces then a static error is raised [err:XPST0008].
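A brief sketch of the three forms follows. It assumes only the predeclared xml prefix; the local name space is reused from the example above, and the comparisons are illustrative:

```xpath
(: Prefixed: "xml" is resolved using the statically known namespaces. :)
local-name-from-QName(#xml:space)        (: "space" :)

(: URIQualifiedName form: no prefix lookup is needed. QName equality
   compares namespace URI and local part only, so this is true. :)
#Q{http://www.w3.org/XML/1998/namespace}space eq #xml:space

(: Unprefixed: taken as being in no namespace, so this is true. :)
#space eq QName("", "space")
```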

Note:

QNames are widely used in XPath 4.0 to represent the names of constructs such as functions, variables, and elements. A QName appearing on its own as an expression, for example my:invoice, is an abbreviation for the axis step child::my:invoice, which selects a child element of the context node having this particular element name. A different construct is therefore needed to represent an atomic value of type xs:QName. For example, the function fn:error expects an xs:QName value as its first argument, so (provided that the prefix err is defined in the static context) it is possible to use a call such as error(#err:XPTY0004) to raise an error with this error code.

4.2.1.4 Constants of Other Types

The xs:boolean values true and false can be constructed by calls to the system functions fn:true() and fn:false(), respectively.

Values of other simple types can be constructed by calling the constructor function for the given type. The constructor functions for XML Schema built-in types are defined in Section 21.1 Constructor functions for XML Schema built-in atomic types^FO. In general, the name of a constructor function for a given type is the same as the name of the type (including its namespace). For example:

xs:integer("12") returns the integer value twelve.
xs:date("2001-08-25") returns an item whose type is xs:date and whose value represents the date 25th August 2001.
xs:dayTimeDuration("PT5H") returns an item whose type is xs:dayTimeDuration and whose value represents a duration of five hours.

Constructor functions can also be used to create special values that have no literal representation, as in the following examples:

xs:float("NaN") returns the special floating-point value, "Not a Number."
xs:double("INF") returns the special double-precision value, "positive infinity."

Constructor functions are available for all simple types, including union types. For example, if my:dt is a user-defined union type whose member types are xs:date, xs:time, and xs:dateTime, then the expression my:dt("2011-01-10") creates an atomic item of type xs:date. The rules follow XML Schema validation rules for union types: the effect is to choose the first member type that accepts the given string in its lexical space.

It is also possible to construct values of various types by using a cast expression. For example:

9 cast as hatsize returns the atomic item 9 whose type is hatsize.
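As a further sketch using only built-in types (the operand values are illustrative), a constructor function call and the corresponding cast expression yield equal values, and a castable expression can test beforehand whether a cast would succeed:

```xpath
(: Constructor function and cast expression produce the same date. :)
xs:date("2001-08-25") eq ("2001-08-25" cast as xs:date)   (: true :)

(: "twelve" is not in the lexical space of xs:integer. :)
"twelve" castable as xs:integer                           (: false :)
```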

4.13 Maps and Arrays

Most modern programming languages have support for collections of key/value pairs, which may be called maps, dictionaries, associative arrays, hash tables, keyed lists, or objects (these are not the same thing as objects in object-oriented systems). In XPath 4.0, we call these maps. Most modern programming languages also support ordered lists of values, which may be called arrays, vectors, or sequences. In XPath 4.0, we have both sequences and arrays. Unlike sequences, an array is an item, and can appear as an item in a sequence.

Note:

The XPath 4.0 specification focuses on syntax provided for maps and arrays, especially constructors and lookup.

Some of the functionality typically needed for maps and arrays is provided by functions defined in Section 18 Processing maps^FO and Section 19 Processing arrays^FO, including functions used to read JSON to create maps and arrays, serialize maps and arrays to JSON, combine maps to create a new map, remove map entries to create a new map, iterate over the keys of a map, convert an array to create a sequence, combine arrays to form a new array, and iterate over arrays in various ways.

4.13.3 Lookup Expressions

Changes in 4.0 ⬇ ⬆

The lookup operator ? can now be followed by a string literal, for cases where map keys are strings other than NCNames. It can also be followed by a variable reference.
A deep lookup operator ?? is provided for searching trees of maps and arrays. [Issue 297 PR 837 23 November 2023]
Lookup expressions can now take a modifier (such as keys, values, or pairs) enabling them to return structured results rather than a flattened sequence. [Issues 960 1094 PR 1125 23 April 2024]
An inline function may be annotated as a %method, giving it access to its containing map. [Issues 1800 1845 PRs 1817 1853 4 March 2025]
The key specifier can reference an item type or sequence type, to select values of that type only. This is especially useful when processing trees of maps and arrays, as encountered when processing JSON input. [Issues 1456 1866 PRs 1864 1877]

XPath 4.0 provides two lookup operators ? and ?? for maps and arrays. These provide a terse syntax for accessing the entries in a map or the members of an array.

The operator "?", known as the shallow lookup operator, returns values found immediately in the operand map or array. The operator "??", known as the deep lookup operator, also searches nested maps and arrays. The effect of the deep lookup operator "??" is explained in 4.13.3.3 Deep Lookup.
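The difference can be seen in a minimal sketch; the nested array $a is a hypothetical value chosen so that the result order is fully determined:

```xpath
let $a := [ [ 1, 2 ], [ 3, 4 ] ]
return (
  $a ? 1,     (: shallow: the first member only, [ 1, 2 ] :)
  $a ?? 1     (: deep: [ 1, 2 ], 1, 3, i.e. position 1 at every level :)
)
```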

4.13.3.3 Deep Lookup

The deep lookup operator ?? has both unary and postfix forms. The unary form ??modifier::KS (where KS is any KeySpecifier) has the same effect as the postfix form .??modifier::KS.

The semantics are defined as follows.

First we define the recursive content of an item as follows:

declare function immediate-content($item as item()) as record(key, value)* {
  if ($item instance of map(*)) {
    map:pairs($item)
  } else if ($item instance of array(*)) {
    for member $m at $p in $item
    return { "key": $p, "value": $m }
  }
};    

declare function recursive-content($item as item()) as record(key, value)* {
  immediate-content($item) ! (., ?items::value ! recursive-content(.))
};

Note:

Explanation: the immediate content of a map is obtained by splitting it into a sequence of key-value pairs, each representing one entry. The immediate content of an array is obtained by constructing a sequence of key-value pairs, one for each array member, where the key is the array index and the value is the corresponding member. Each key-value pair is of type record(key as xs:anyAtomicType, value as item()*). The recursive content of an item contains the key-value pairs in its immediate content, each followed by the recursive content obtained by expanding any maps or arrays in the immediate content.

It is then useful to represent the recursive content as a sequence of single-entry-maps^DM: so each pair { "key": $K, "value": $V } is converted to the form { $K: $V }. This can be achieved using the expression recursive-content($V) ! { ?key: ?value }.

In addition we define the function array-or-map as follows:

declare function array-or-map($item as item()) {
  typeswitch ($item) {
    case array(*) | map(*) return $item
    default return error(#err:XPTY0004)
  }
};

The result of the expression E??pairs::KS, where E is any expression and KS is any KeySpecifier, is then:

(E ! array-or-map(.) -> recursive-content(.) ! { ?key: ?value })
?pairs::KS

Note:

This is best explained by considering examples.

Consider the expression let $V := [ { "first": "John", "last": "Smith" }, { "first": "Mary", "last": "Evans" } ].

The recursive content of this array is the sequence of six maps:

{ "key": 1, "value": { "first": "John", "last": "Smith" } }
{ "key": 2, "value": { "first": "Mary", "last": "Evans" } }
{ "key": "first", "value": "John" }
{ "key": "last", "value": "Smith" }
{ "key": "first", "value": "Mary" }
{ "key": "last", "value": "Evans" }

The expression $V??pairs::* returns this sequence.

With some other KeySpecifier KS, $V??pairs::KS returns selected items from this sequence that match KS. Formally this is achieved by converting the key-value pairs to single-entry maps^DM, applying the KeySpecifier to the sequence of single-entry maps, and then converting the result back into a sequence of key-value pairs.

For example, given the expression $V??pairs::first, the selection from the converted sequence will include the two single-entry maps^DM { "first" : "John" } and { "first" : "Mary" }, which will be delivered in key-value pair form as { "key": "first", "value": "John" }, { "key": "first", "value": "Mary" }.

The effect of using modifiers other than pairs is the same as with shallow lookup expressions:

If the modifier is items (explicitly or by default), the result of $V??items::KS is the same as the result of $V??pairs::KS ! map:get(., "value"); that is, it is the sequence concatenation of the value parts.
If the modifier is values, the result of $V??values::KS is the same as the result of $V??pairs::KS ! array { map:get(., "value") }.
If the modifier is keys, the result of $V??keys::KS is the same as the result of $V??pairs::KS ! map:get(., "key").

Note:

This means that with the example given earlier:

The expression $V ?? first returns the sequence "John", "Mary".
The expression $V ?? last returns the sequence "Smith", "Evans".
The expression $V ?? 1 returns the sequence { "first": "John", "last": "Smith" }.
The expression $V ?? ~[record(first, last)] ! `{ ?first } { ?last }` returns the sequence "John Smith", "Mary Evans". This expression selects all values of type record(first, last) at any level in the tree.
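Applying the modifier rules to the same array gives the following sketch; the results in the comments are derived from the equivalences stated above rather than from running a particular processor:

```xpath
let $V := [ { "first": "John", "last": "Smith" },
            { "first": "Mary", "last": "Evans" } ]
return (
  $V ?? keys::first,      (: "first", "first" :)
  $V ?? values::first,    (: [ "John" ], [ "Mary" ] :)
  $V ?? items::first      (: "John", "Mary"; same as $V ?? first :)
)
```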

Note:

The effect of evaluating all shallow lookups on maps rather than arrays is that no error arises if an array subscript is out of bounds. In the above example, $V??3 would return an empty sequence rather than raising an error.

This also affects the way an xs:untypedAtomic key value is handled. Given the shallow lookup expression $A?$x, if $A is an array and $x (after atomization) is xs:untypedAtomic then the value of $x is converted to an integer (by virtue of the coercion rules applying to a call on array:get). With a deep lookup expression $A??$x, by contrast, the semantics are defined in terms of a map lookup, in which xs:untypedAtomic items are always treated as strings.

Note:

The definition of the recursive-content function is such that items in the top-level value that are not maps or arrays are ignored, whereas items that are not themselves maps or arrays, but which appear in the content of a map or array at the top level, are included. This means that E??X mirrors the behavior of E//X, in that it includes all items that are one-or-more levels deep in the tree.

Note:

The result of the deep lookup operator retains order when processing sequences and arrays, but not when processing maps.

Note:

An expression involving multiple deep lookup operators may return duplicates. For example, the result of the expression [ [ [ "a" ], [ "b" ] ], [ [ "c" ], [ "d" ] ] ] ?? 1 ?? 1 is ([ "a" ], "a", "b", "a", "c"). This is because the first ?? operator selects members in position 1 at all three levels; that is, it selects the arrays [ [ "a" ], [ "b" ] ], [ "a" ], and [ "c" ] as well as each of the four string values. The second ?? operator selects members in position 1 within each of these values, which results in the string "a" being selected twice.

Note:

A type error is raised if the value of the left-hand expression includes an item that is neither a map nor an array.

Example: Examples of Deep Lookup Expressions

Consider the tree $tree of maps and arrays that results from applying the fn:parse-json function to the following input:

{
  "desc"    : "Distances between several cities, in kilometers.",
  "updated" : "2014-02-04T18:50:45",
  "uptodate": true,
  "author"  : null,
  "cities"  : {
    "Brussels": [
      { "to": "London",    "distance": 322 },
      { "to": "Paris",     "distance": 265 },
      { "to": "Amsterdam", "distance": 173 }
    ],
    "London": [
      { "to": "Brussels",  "distance": 322 },
      { "to": "Paris",     "distance": 344 },
      { "to": "Amsterdam", "distance": 358 }
    ],
    "Paris": [
      { "to": "Brussels",  "distance": 265 },
      { "to": "London",    "distance": 344 },
      { "to": "Amsterdam", "distance": 431 }
     ],
    "Amsterdam": [
      { "to": "Brussels",  "distance": 173 },
      { "to": "London",    "distance": 358 },
      { "to": "Paris",     "distance": 431 }
    ]
  }
}

Given two variables $from and $to containing the names of two cities that are present in this table, the distance between the two cities can be obtained with the expression:

$tree ?? $from ?? ~[record(to, distance)][?to = $to] ? distance

The names of all pairs of cities whose distance is represented in the data can be obtained with the expression:

$tree ?? cities
=> map:for-each(fn($key, $val) { $val ?? to ! ($key || "-" || .) })

Example: Comparison with JSONPath

This example provides XPath equivalents to some examples given in the JSONPath specification. [TODO: add a reference].

The examples query the result of parsing the following JSON value, representing a store whose stock consists of four books and a bicycle:

{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 399
    }
  }
}

The following table illustrates some queries on this data, expressed both in JSONPath and in XPath 4.0.

JSONPath vs XPath 4.0 Comparison
Query	JSONPath	XPath 4.0
The authors of all books in the store	`$.store.book[*].author`	`$m?store?book??author`
All authors	`$..author`	`$m??author`
All things in store (four books and a red bicycle)	`$.store.*`	`$m?store?*`
The prices of everything in the store	`$.store..price`	`$m?store??price`
The third book	`$..book[2]`	`$m??book?3`
The third book's author	`$..book[2].author`	`$m??book?3?author`
The third book's publisher (empty result)	`$..book[2].publisher`	`$m??book?3?publisher`
The last book (in order)	`$..book[-1]`	`$m??book => array:foot()`
The first two books	`$..book[0,1]`	`$m??book?(1, 2)`
All books with an ISBN	`$..book[?@.isbn]`	`$m??book[?isbn]`
All books cheaper than 10	`$..book[?@.price<10]`	`$m??book[?price lt 10]`
All member values and array elements contained in the input value	`$..*`	`$m??*`

A XPath 4.0 Grammar

A.1 EBNF

Changes in 4.0 ⬇ ⬆

The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML. [Issue 1366 PR 1498]

The grammar of XPath 4.0 uses the same simple Extended Backus-Naur Form (EBNF) notation as [XML 1.0] with the following differences.

The notation XYZ ** "," indicates a sequence of zero or more occurrences of XYZ, with a single comma between adjacent occurrences.
The notation XYZ ++ "," indicates a sequence of one or more occurrences of XYZ, with a single comma between adjacent occurrences.
All named symbols have a name that begins with an uppercase letter.
It adds a notation for referring to productions in external specifications.
Comments or extra-grammatical constraints on grammar productions are between '/*' and '*/' symbols.
- A 'xgc:' prefix is an extra-grammatical constraint, the details of which are explained in A.1.2 Extra-grammatical Constraints
- A 'ws:' prefix explains the whitespace rules for the production, the details of which are explained in A.3.5 Whitespace Rules
- A 'gn:' prefix means a 'Grammar Note', and is meant as a clarification for parsing rules, and is explained in A.1.3 Grammar Notes. These notes are not normative.
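As a reading aid (not additional grammar), the two separator operators can be understood as abbreviations for conventional EBNF, using the same placeholder XYZ:

```ebnf
XYZ ++ ","    is equivalent to    XYZ ("," XYZ)*
XYZ ** ","    is equivalent to    (XYZ ("," XYZ)*)?
```

Under this reading, the production Expr ::= (ExprSingle ++ ",") denotes one ExprSingle followed by zero or more comma-separated ExprSingle operands.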

The terminal symbols for this grammar include the quoted strings used in the production rules below, and the terminal symbols defined in section A.3.1 Terminal Symbols. The grammar is a little unusual in that parsing and tokenization are somewhat intertwined: for more details see A.3 Lexical structure.

The EBNF notation is described in more detail in A.1.1 Notation.

`AbbrevForwardStep`	::=	`("@" NodeTest) \| SimpleNodeTest`
`AbbrevReverseStep`	::=	`".."`
`AdditiveExpr`	::=	`MultiplicativeExpr (("+" \| "-") MultiplicativeExpr)*`
`AndExpr`	::=	`ComparisonExpr ("and" ComparisonExpr)*`
`AnyArrayType`	::=	`"array" "(" "*" ")"`
`AnyFunctionType`	::=	`("function" \| "fn") "(" "*" ")"`
`AnyItemTest`	::=	`"item" "(" ")"`
`AnyKindTest`	::=	`"node" "(" ")"`
`AnyMapType`	::=	`"map" "(" "*" ")"`
`AnyRecordType`	::=	`"record" "(" "*" ")"`
`Argument`	::=	`ExprSingle \| ArgumentPlaceholder`
`ArgumentList`	::=	`"(" ((PositionalArguments ("," KeywordArguments)?) \| KeywordArguments)? ")"`
`ArgumentPlaceholder`	::=	`"?"`
`ArrayConstructor`	::=	`SquareArrayConstructor \| CurlyArrayConstructor`
`ArrayType`	::=	`AnyArrayType \| TypedArrayType`
`ArrowExpr`	::=	`UnaryExpr (SequenceArrowTarget \| MappingArrowTarget)*`
`ArrowTarget`	::=	`FunctionCall \| RestrictedDynamicCall`
`AttributeName`	::=	`EQName`
`AttributeTest`	::=	`"attribute" "(" (NameTestUnion ("," TypeName)?)? ")"`
`AxisStep`	::=	`(ReverseStep \| ForwardStep) Predicate*`
`BracedAction`	::=	`EnclosedExpr`
`CastableExpr`	::=	`CastExpr ("castable" "as" CastTarget "?"?)?`
`CastExpr`	::=	`PipelineExpr ("cast" "as" CastTarget "?"?)?`
`CastTarget`	::=	`TypeName \| ChoiceItemType \| EnumerationType`
`ChoiceItemType`	::=	`"(" (ItemType ++ "\|") ")"`
`CommentTest`	::=	`"comment" "(" ")"`
`ComparisonExpr`	::=	`OtherwiseExpr ((ValueComp \| GeneralComp \| NodeComp) OtherwiseExpr)?`
`ContextValueRef`	::=	`"."`
`CurlyArrayConstructor`	::=	`"array" EnclosedExpr`
`DocumentTest`	::=	`"document-node" "(" (ElementTest \| SchemaElementTest \| NameTestUnion)? ")"`
`DynamicFunctionCall`	::=	`PostfixExpr PositionalArgumentList`
`ElementName`	::=	`EQName`
`ElementTest`	::=	`"element" "(" (NameTestUnion ("," TypeName "?"?)?)? ")"`
`EnclosedExpr`	::=	`"{" Expr? "}"`
`EnumerationType`	::=	`"enum" "(" (StringLiteral ++ ",") ")"`
`EQName`	::=	`QName \| URIQualifiedName`
`Expr`	::=	`(ExprSingle ++ ",")`
`ExprSingle`	::=	`ForExpr \| LetExpr \| QuantifiedExpr \| IfExpr \| OrExpr`
`ExtensibleFlag`	::=	`"," "*"`
`FieldDeclaration`	::=	`FieldName "?"? ("as" SequenceType)?`
`FieldName`	::=	`NCName \| StringLiteral`
`FilterExpr`	::=	`PostfixExpr Predicate`
`FilterExprAM`	::=	`PostfixExpr "?[" Expr "]"`
`ForBinding`	::=	`ForItemBinding \| ForMemberBinding \| ForEntryBinding`
`ForClause`	::=	`"for" (ForBinding ++ ",")`
`ForEntryBinding`	::=	`((ForEntryKeyBinding ForEntryValueBinding?) \| ForEntryValueBinding) PositionalVar? "in" ExprSingle`
`ForEntryKeyBinding`	::=	`"key" VarNameAndType`
`ForEntryValueBinding`	::=	`"value" VarNameAndType`
`ForExpr`	::=	`ForClause ForLetReturn`
`ForItemBinding`	::=	`VarNameAndType PositionalVar? "in" ExprSingle`
`ForLetReturn`	::=	`ForExpr \| LetExpr \| ("return" ExprSingle)`
`ForMemberBinding`	::=	`"member" VarNameAndType PositionalVar? "in" ExprSingle`
`ForwardAxis`	::=	`("attribute" \| "child" \| "descendant" \| "descendant-or-self" \| "following" \| "following-or-self" \| "following-sibling" \| "following-sibling-or-self" \| "namespace" \| "self") "::"`
`ForwardStep`	::=	`(ForwardAxis NodeTest) \| AbbrevForwardStep`
`FunctionBody`	::=	`EnclosedExpr`
`FunctionCall`	::=	`EQName ArgumentList`
		/* xgc: reserved-function-names */
		/* gn: parens */
`FunctionItemExpr`	::=	`NamedFunctionRef \| InlineFunctionExpr`
`FunctionSignature`	::=	`"(" ParamList ")" TypeDeclaration?`
`FunctionType`	::=	`AnyFunctionType \| TypedFunctionType`
`GeneralComp`	::=	`"=" \| "!=" \| "<" \| "<=" \| ">" \| ">="`
`IfExpr`	::=	`"if" "(" Expr ")" (UnbracedActions \| BracedAction)`
`InlineFunctionExpr`	::=	`MethodAnnotation* ("function" \| "fn") FunctionSignature? FunctionBody`
`InstanceofExpr`	::=	`TreatExpr ("instance" "of" SequenceType)?`
`IntersectExceptExpr`	::=	`InstanceofExpr (("intersect" \| "except") InstanceofExpr)*`
`ItemType`	::=	`AnyItemTest \| TypeName \| KindTest \| FunctionType \| MapType \| ArrayType \| RecordType \| EnumerationType \| ChoiceItemType`
`KeySpecifier`	::=	`NCName \| IntegerLiteral \| StringLiteral \| VarRef \| ParenthesizedExpr \| LookupWildcard \| TypeSpecifier`
`KeywordArgument`	::=	`EQName ":=" Argument`
`KeywordArguments`	::=	`(KeywordArgument ++ ",")`
`KindTest`	::=	`DocumentTest \| ElementTest \| AttributeTest \| SchemaElementTest \| SchemaAttributeTest \| PITest \| CommentTest \| TextTest \| NamespaceNodeTest \| AnyKindTest`
`LetBinding`	::=	`VarNameAndType ":=" ExprSingle`
`LetClause`	::=	`"let" (LetBinding ++ ",")`
`LetExpr`	::=	`LetClause ForLetReturn`
`Literal`	::=	`NumericLiteral \| StringLiteral \| QNameLiteral`
`Lookup`	::=	`("?" \| "??") (Modifier "::")? KeySpecifier`
`LookupExpr`	::=	`PostfixExpr Lookup`
`LookupWildcard`	::=	`"*"`
`MapConstructor`	::=	`"map"? "{" (MapConstructorEntry ** ",") "}"`
`MapConstructorEntry`	::=	`MapKeyExpr ":" MapValueExpr`
`MapKeyExpr`	::=	`ExprSingle`
`MappingArrowTarget`	::=	`"=!>" ArrowTarget`
`MapType`	::=	`AnyMapType \| TypedMapType`
`MapValueExpr`	::=	`ExprSingle`
`MethodAnnotation`	::=	`"%method"`
`Modifier`	::=	`"pairs" \| "keys" \| "values" \| "items"`
`MultiplicativeExpr`	::=	`UnionExpr (("*" \| "×" \| "div" \| "÷" \| "idiv" \| "mod") UnionExpr)*`
`NamedFunctionRef`	::=	`EQName "#" IntegerLiteral`
		/* xgc: reserved-function-names */
`NamespaceNodeTest`	::=	`"namespace-node" "(" ")"`
`NameTest`	::=	`EQName \| Wildcard`
`NameTestUnion`	::=	`(NameTest ++ "\|")`
`NodeComp`	::=	`"is" \| "<<" \| ">>"`
`NodeTest`	::=	`UnionNodeTest \| SimpleNodeTest`
`NumericLiteral`	::=	`IntegerLiteral \| HexIntegerLiteral \| BinaryIntegerLiteral \| DecimalLiteral \| DoubleLiteral`
`OccurrenceIndicator`	::=	`"?" \| "*" \| "+"`
		/* xgc: occurrence-indicators */
`OrExpr`	::=	`AndExpr ("or" AndExpr)*`
`OtherwiseExpr`	::=	`StringConcatExpr ("otherwise" StringConcatExpr)*`
`ParamList`	::=	`(VarNameAndType ** ",")`
`ParenthesizedExpr`	::=	`"(" Expr? ")"`
`PathExpr`	::=	`("/" RelativePathExpr?) \| ("//" RelativePathExpr) \| RelativePathExpr`
		/* xgc: leading-lone-slash */
`PipelineExpr`	::=	`ArrowExpr ("->" ArrowExpr)*`
`PITest`	::=	`"processing-instruction" "(" (NCName \| StringLiteral)? ")"`
`PositionalArgumentList`	::=	`"(" PositionalArguments? ")"`
`PositionalArguments`	::=	`(Argument ++ ",")`
`PositionalVar`	::=	`"at" VarName`
`PostfixExpr`	::=	`PrimaryExpr \| FilterExpr \| DynamicFunctionCall \| LookupExpr \| FilterExprAM`
`Predicate`	::=	`"[" Expr "]"`
`PrimaryExpr`	::=	`Literal \| VarRef \| ParenthesizedExpr \| ContextValueRef \| FunctionCall \| FunctionItemExpr \| MapConstructor \| ArrayConstructor \| StringTemplate \| UnaryLookup`
`QNameLiteral`	::=	`"#" EQName`
`QuantifiedExpr`	::=	`("some" \| "every") (QuantifierBinding ++ ",") "satisfies" ExprSingle`
`QuantifierBinding`	::=	`VarNameAndType "in" ExprSingle`
`RangeExpr`	::=	`AdditiveExpr ("to" AdditiveExpr)?`
`RecordType`	::=	`AnyRecordType \| TypedRecordType`
`RelativePathExpr`	::=	`StepExpr (("/" \| "//") StepExpr)*`
`RestrictedDynamicCall`	::=	`(VarRef \| ParenthesizedExpr \| FunctionItemExpr \| MapConstructor \| ArrayConstructor) PositionalArgumentList`
`ReverseAxis`	::=	`("ancestor" \| "ancestor-or-self" \| "parent" \| "preceding" \| "preceding-or-self" \| "preceding-sibling" \| "preceding-sibling-or-self") "::"`
`ReverseStep`	::=	`(ReverseAxis NodeTest) \| AbbrevReverseStep`
`SchemaAttributeTest`	::=	`"schema-attribute" "(" AttributeName ")"`
`SchemaElementTest`	::=	`"schema-element" "(" ElementName ")"`
`SequenceArrowTarget`	::=	`"=>" ArrowTarget`
`SequenceType`	::=	`("empty-sequence" "(" ")") \| (ItemType OccurrenceIndicator?)`
`SimpleMapExpr`	::=	`PathExpr ("!" PathExpr)*`
`SimpleNodeTest`	::=	`KindTest \| NameTest`
`SquareArrayConstructor`	::=	`"[" (ExprSingle ** ",") "]"`
`StepExpr`	::=	`PostfixExpr \| AxisStep`
`StringConcatExpr`	::=	`RangeExpr ("\|\|" RangeExpr)*`
`StringTemplate`	::=	"`" (StringTemplateFixedPart \| StringTemplateVariablePart)* "`"
		/* ws: explicit */
`StringTemplateFixedPart`	::=	((Char - ('{' \| '}' \| '`')) \| "{{" \| "}}" \| "``")*
		/* ws: explicit */
`StringTemplateVariablePart`	::=	`EnclosedExpr`
		/* ws: explicit */
`TextTest`	::=	`"text" "(" ")"`
`TreatExpr`	::=	`CastableExpr ("treat" "as" SequenceType)?`
`TypedArrayType`	::=	`"array" "(" SequenceType ")"`
`TypeDeclaration`	::=	`"as" SequenceType`
`TypedFunctionParam`	::=	`("$" EQName "as")? SequenceType`
`TypedFunctionType`	::=	`("function" \| "fn") "(" (TypedFunctionParam ** ",") ")" "as" SequenceType`
`TypedMapType`	::=	`"map" "(" ItemType "," SequenceType ")"`
`TypedRecordType`	::=	`"record" "(" (FieldDeclaration ** ",") ExtensibleFlag? ")"`
`TypeName`	::=	`EQName`
`TypeSpecifier`	::=	`"~[" SequenceType "]"`
`UnaryExpr`	::=	`("-" \| "+")* ValueExpr`
`UnaryLookup`	::=	`Lookup`
`UnbracedActions`	::=	`"then" ExprSingle "else" ExprSingle`
`UnionExpr`	::=	`IntersectExceptExpr (("union" \| "\|") IntersectExceptExpr)*`
`UnionNodeTest`	::=	`"(" SimpleNodeTest ("\|" SimpleNodeTest)* ")"`
`ValueComp`	::=	`"eq" \| "ne" \| "lt" \| "le" \| "gt" \| "ge"`
`ValueExpr`	::=	`SimpleMapExpr`
`VarName`	::=	`"$" EQName`
`VarNameAndType`	::=	`"$" EQName TypeDeclaration?`
`VarRef`	::=	`"$" EQName`
`Wildcard`	::=	`"*" \| (NCName ":*") \| ("*:" NCName) \| (BracedURILiteral "*")`
		/* ws: explicit */
`XPath`	::=	`Expr`

I Change Log (Non-Normative)

Use the arrows to browse significant changes since the 3.1 version of this specification.
See 1 Introduction
Sections with significant changes are marked Δ in the table of contents.
See 1 Introduction
Setting the default namespace for elements and types to the special value ##any causes an unprefixed element name to act as a wildcard, matching by local name regardless of namespace.
See 3.2.7.2 Element Types
The terms FunctionType, ArrayType, MapType, and RecordType replace FunctionTest, ArrayTest, MapTest, and RecordTest, with no change in meaning.
See 3.2.8.1 Function Types
Record types are added as a new kind of ItemType, constraining the value space of maps.
See 3.2.8.3 Record Types
Function coercion now allows a function with arity N to be supplied where a function of arity greater than N is expected. For example, this allows the function true#0 to be supplied where a predicate function is required.
See 3.4.4 Function Coercion
PR 1817 1853
An inline function may be annotated as a %method, giving it access to its containing map.
See 4.5.6 Inline Function Expressions
See 4.5.6.1 Methods
See 4.13.3 Lookup Expressions
The symbols × and ÷ can be used for multiplication and division.
See 4.8 Arithmetic Expressions
The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double; instead, the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.
See 4.10.1 Value Comparisons
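The effect of this change can be modelled outside XPath. The following Python sketch is an illustration only, not the algorithm defined by the specification: `float` stands in for xs:double, `decimal.Decimal` stands in for xs:decimal, and the three values and both helper functions are assumptions made for this example. It also assumes unlimited decimal precision, which a processor is not required to provide.

```python
from decimal import Decimal

# Illustrative values (assumptions, not taken from the specification).
a = Decimal("0.1")   # decimal one tenth
b = 0.1              # nearest binary double to one tenth
c = Decimal(b)       # exact decimal expansion of that double

def eq_via_double(x, y):
    """Model of the 3.1 behaviour: mixed operands are compared as doubles."""
    if isinstance(x, Decimal) and isinstance(y, Decimal):
        return x == y
    return float(x) == float(y)

def eq_via_decimal(x, y):
    """Model of the 4.0 behaviour: a double is converted exactly to decimal."""
    return Decimal(x) == Decimal(y)

# Old model: a equals b and b equals c, yet a does not equal c.
old = (eq_via_double(a, b), eq_via_double(b, c), eq_via_double(a, c))

# New model: only b equals c, so equality is transitive for these values.
new = (eq_via_decimal(a, b), eq_via_decimal(b, c), eq_via_decimal(a, c))
```

In this model the pair `a`, `b` is the kind of edge case mentioned above: the two values compare equal under the old rule and unequal under the new one.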
Operators such as < and > can use the full-width forms ＜ and ＞ to avoid the need for XML escaping.
See 4.10.2 General Comparisons
The lookup operator ? can now be followed by a string literal, for cases where map keys are strings other than NCNames. It can also be followed by a variable reference.
See 4.13.3 Lookup Expressions
PR 1864 1877
The key specifier can reference an item type or sequence type, to select values of that type only. This is especially useful when processing trees of maps and arrays, as encountered when processing JSON input.
See 4.13.3 Lookup Expressions
PR 1763 1830
The syntax on the right-hand side of an arrow operator has been relaxed: a dynamic function call no longer needs to start with a variable reference or a parenthesized expression; it can also be (for example) an inline function expression or a map or array constructor.
See 4.20 Arrow Expressions
The arrow operator => is now complemented by a “mapping arrow” operator =!> which applies the supplied function to each item in the input sequence independently.
See 4.20.2 Mapping Arrow Expressions
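The difference between the two operators can be sketched in Python, with lists standing in for sequences. This is a rough model under stated assumptions, not the normative definition: the helper names `arrow`, `mapping_arrow`, and `count` are hypothetical, and the sketch assumes that the per-item results of the mapping arrow are concatenated in order, as is usual for flat sequences.

```python
def arrow(seq, fn):
    """Model of `seq => fn()`: the whole sequence is passed as one argument."""
    return fn(seq)

def mapping_arrow(seq, fn):
    """Model of `seq =!> fn()`: fn is applied to each item independently
    and the results are concatenated in order (an assumption of this model)."""
    out = []
    for item in seq:
        out.extend(fn([item]))
    return out

def count(seq):
    """Hypothetical stand-in for a function that counts items."""
    return [len(seq)]

whole = arrow([10, 20, 30], count)             # one call on three items
per_item = mapping_arrow([10, 20, 30], count)  # three calls on one item each
```

Here `whole` holds a single count for the entire input, while `per_item` holds one count per input item.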
PR 1023 1128
It has been clarified that function coercion applies even when the supplied function item matches the required function type. This is to ensure that arguments supplied when calling the function are checked against the signature of the required function type, which might be stricter than the signature of the supplied function item.
See 3.4.4 Function Coercion
The static typing feature has been dropped.
See 5 Conformance
The syntax document-node(N), where N is a NameTestUnion, is introduced as an abbreviation for document-node(element(N)). For example, document-node(*) matches any well-formed XML document (as distinct from a document fragment).
See 3.2.7 Node Types
QName literals are new in 4.0.
See 4.2.1.3 QName Literals
PR 28
Multiple for and let clauses can be combined in an expression without an intervening return keyword.
See 4.12.1 For Expressions
See 4.12.2 Let Expressions
PR 159
Keyword arguments are allowed on static function calls, as well as positional arguments.
See 4.5.1.1 Static Function Call Syntax
PR 202
The presentation of the rules for the subtype relationship between sequence types and item types has been substantially rewritten to improve clarity; no change to the semantics is intended.
See 3.3 Subtype Relationships
PR 230
The rules for “errors and optimization” have been tightened up to disallow many cases of optimizations that alter error behavior. In particular there are restrictions on reordering the operands of and and or, and of predicates in filter expressions, in a way that might allow the processor to raise dynamic errors that the author intended to prevent.
See 2.4.5 Guarded Expressions
PR 254
The term "function conversion rules" used in 3.1 has been replaced by the term "coercion rules".
See 3.4 Coercion Rules
The coercion rules allow “relabeling” of a supplied atomic item where the required type is a derived atomic type: for example, it is now permitted to supply the value 3 when calling a function that expects an instance of xs:positiveInteger.
See 3.4 Coercion Rules
PR 284
Alternative syntax for conditional expressions is available: if (condition) { X }.
See 4.14 Conditional Expressions
PR 286
Element and attribute tests can include alternative names: element(chapter|section), attribute(role|class).
See 3.2.7 Node Types
The NodeTest in an AxisStep now allows alternatives: ancestor::(section|appendix)
See 3.2.7 Node Types
Element and attribute tests of the form element(N) and attribute(N) now allow N to be any NameTest, including a wildcard.
See 3.2.7.2 Element Types
See 3.2.7.3 Attribute Types
PR 324
String templates provide a new way of constructing strings: for example `{$greeting}, {$planet}!` is equivalent to $greeting || ', ' || $planet || '!'
See 4.9.2 String Templates
PR 326
Support for higher-order functions is now a mandatory feature (in 3.1 it was optional).
See 5 Conformance
PR 344
A for member clause is added to FLWOR expressions to allow iteration over an array.
See 4.12.1 For Expressions
PR 368
The concept of the context item has been generalized, so it is now a context value. That is, it is no longer constrained to be a single item.
See 2.2.2 Dynamic Context
PR 433
Numeric literals can now be written in hexadecimal or binary notation; and underscores can be included for readability.
See 4.2.1.1 Numeric Literals
PR 519
The rules for tokenization have been largely rewritten. In some cases the revised specification may affect edge cases that were handled in different ways by different 3.1 processors, which could lead to incompatible behavior.
See A.3 Lexical structure
PR 521
New abbreviated syntax is introduced (focus function) for simple inline functions taking a single argument. An example is fn { ../@code }
See 4.5.6 Inline Function Expressions
PR 603
The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.
See 2.4.6 Implausible Expressions
See 4.6.4.3 Implausible Axis Steps
PR 606
Element and attribute tests of the form element(A|B) and attribute(A|B) are now allowed.
See 3.2.7.2 Element Types
See 3.2.7.3 Attribute Types
PR 691
Enumeration types are added as a new kind of ItemType, constraining the value space of strings.
See 3.2.6 Enumeration Types
PR 728
The syntax record(*) is allowed; it matches any map.
See 3.2.8.3 Record Types
PR 815
The coercion rules now allow conversion in either direction between xs:hexBinary and xs:base64Binary.
See 3.4 Coercion Rules
PR 837
A deep lookup operator ?? is provided for searching trees of maps and arrays.
See 4.13.3 Lookup Expressions
PR 911
The coercion rules now allow any numeric type to be implicitly converted to any other, for example an xs:double is accepted where the required type is xs:decimal.
See 3.4 Coercion Rules
PR 996
The value of a predicate in a filter expression can now be a sequence of integers.
See 4.4 Filter Expressions
PR 1031
An otherwise operator is introduced: A otherwise B returns the value of A, unless it is an empty sequence, in which case it returns the value of B.
See 4.15 Otherwise Expressions
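A minimal Python model of this behaviour, with lists standing in for sequences, is shown below. The function name `otherwise` and the sample values are assumptions for illustration; the point of the sketch is that the test is for emptiness, not for truthiness of the items.

```python
def otherwise(a, b):
    """Model of `A otherwise B`: return a unless it is the empty sequence."""
    return a if len(a) > 0 else b

fallback = otherwise([], ["n/a"])   # a is empty, so b is returned
kept = otherwise([0], ["n/a"])      # a is non-empty, even though 0 is falsy
```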
PR 1071
In map constructors, the keyword map is now optional, so map { 0: false(), 1: true() } can now be written { 0: false(), 1: true() }, provided it is used in a context where this creates no ambiguity.
See 4.13.1.1 Map Constructors
PR 1125
Lookup expressions can now take a modifier (such as keys, values, or pairs) enabling them to return structured results rather than a flattened sequence.
See 4.13.3 Lookup Expressions
PR 1131
A positional variable can be defined in a for expression.
See 4.12.1 For Expressions
The type of a variable used in a for expression can be declared.
See 4.12.1 For Expressions
The type of a variable used in a let expression can be declared.
See 4.12.2 Let Expressions
PR 1132
Choice item types (an item type allowing a set of alternative item types) are introduced.
See 3.2.5 Choice Item Types
PR 1163
Filter expressions for maps and arrays are introduced.
See 4.13.4 Filter Expressions for Maps and Arrays
PR 1181
The default namespace for elements and types can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace.
See 2.2.1 Static Context
If the default namespace for elements and types has the special value ##any, then an unprefixed name in a NameTest acts as a wildcard, matching names in any namespace or none.
See 4.6.4.2 Node Tests
PR 1197
The keyword fn is allowed as a synonym for function in function types, to align with changes to inline function declarations.
See 3.2.8.1 Function Types
In inline function expressions, the keyword function may be abbreviated as fn.
See 4.5.6 Inline Function Expressions
PR 1212
XPath 3.0 included empty-sequence and item as reserved function names, and XPath 3.1 added map and array. This was unnecessary since these names never appear followed by a left parenthesis at the start of an expression. They have therefore been removed from the list. New keywords introducing item types, such as record and enum, have not been included in the list.
See A.4 Reserved Function Names
PR 1217
Predicates in filter expressions for maps and arrays can now be numeric.
See 4.13.4 Filter Expressions for Maps and Arrays
PR 1249
A for key/value clause is added to FLWOR expressions to allow iteration over maps.
See 4.12.1 For Expressions
PR 1250
Several decimal format properties, including minus sign, exponent separator, percent, and per-mille, can now be rendered as arbitrary strings rather than being confined to a single character.
See 2.2.1.2 Decimal Formats
PR 1265
The rules regarding the document-uri property of nodes returned by the fn:collection function have been relaxed.
See 2.2.2 Dynamic Context
PR 1344
Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped.
See 2.2.1 Static Context
The static typing option has been dropped.
See 2.3 Processing Model
PR 1361
The term atomic value has been replaced by atomic item.
See 2.1.2 Values
PR 1384
If a type declaration is present, the supplied values in the input sequence are now coerced to the required type. Type declarations are now permitted in XPath as well as XQuery.
See 4.16 Quantified Expressions
PR 1496
The context value static type, which was there purely to assist in static typing, has been dropped.
See 2.2.1 Static Context
PR 1498
The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.
See 2.1 Terminology
The EBNF notation has been extended to allow the constructs (A ++ ",") (one or more occurrences of A, comma-separated), and (A ** ",") (zero or more occurrences of A, comma-separated).
See 2.1.1 Grammar Notation
The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.
See A.1 EBNF
See A.1.1 Notation
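One way to read the two operators is as shorthand for familiar repetition patterns: (A ++ ",") behaves like A ("," A)*, and (A ** ",") behaves like (A ("," A)*)?. The Python sketch below expresses that reading as regular expressions. It is an informal illustration: the helper names and the `ITEM` pattern are hypothetical, and whitespace handling between tokens is not modelled.

```python
import re

def sep_plus(item, sep):
    """Regex for (A ++ sep): one or more A, separated by sep."""
    return f"(?:{item}(?:{sep}{item})*)"

def sep_star(item, sep):
    """Regex for (A ** sep): zero or more A, separated by sep."""
    return f"(?:{sep_plus(item, sep)})?"

ITEM = r"[a-z]+"  # hypothetical stand-in for the nonterminal A

one_or_more = re.compile(sep_plus(ITEM, ","))
zero_or_more = re.compile(sep_star(ITEM, ","))
```

For example, `one_or_more` matches the whole string "a,b,c" but not the empty string, while `zero_or_more` matches both; neither accepts a trailing separator such as "a,".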
PR 1501
The coercion rules now apply recursively to the members of an array and the entries in a map.
See 3.4 Coercion Rules
PR 1532
Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.
See 4.6.4.1 Axes
PR 1577
The syntax record() is allowed; the only thing it matches is an empty map.
See 3.2.8.3 Record Types
PR 1686
With the pipeline operator ->, the result of an expression can be bound to the context value before evaluating another expression.
See 4.18 Pipeline operator
PR 1696
Parameter names may be included in a function signature; they are purely documentary.
See 3.2.8.1 Function Types
PR 1703
Ordered maps are introduced.
See 4.13.1 Maps
The order of key-value pairs in the map constructor is now retained in the constructed map.
See 4.13.1.1 Map Constructors
PR 1874
The coercion rules now reorder the entries in a map when the required type is a record type.
See 3.4 Coercion Rules
PR 1898
The rules for subtyping of document node types have been refined.
See 3.3.2.4.2 Subtyping Nodes: Document Nodes

W3C Editor's Draft 218 February 2026
