
This draft contains only sections that have differences from the version that it modified.

W3C

XML Path Language (XPath) 4.0 WG Review Draft

W3C Editor's Draft 23 February 2026

This version:
https://qt4cg.org/specifications/xpath-40/
Most recent version of XPath:
https://qt4cg.org/specifications/xpath-40/
Most recent Recommendation of XPath:
https://www.w3.org/TR/2017/REC-xpath-31-20170321/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

This document is also available in these non-normative formats: XML.


Abstract

XPath 4.0 is an expression language that allows the processing of values conforming to the data model defined in [XDM 4.0]. The name of the language derives from its most distinctive feature, the path expression, which provides a means of hierarchic addressing of the nodes in an XML tree. As well as modeling the tree structure of XML, the data model also includes atomic items, function items, maps, arrays, and sequences. This version of XPath supports JSON as well as XML, and adds many new functions in [Functions and Operators 4.0].

XPath 4.0 is a superset of XPath 3.1. A detailed list of changes made since XPath 3.1 can be found in I Change Log.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.

The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.

The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).

Michael was central to the development of XML and many related technologies. He brought a polymathic breadth of knowledge and experience to everything he did. This, combined with his indefatigable curiosity and appetite for learning, made him an invaluable contributor to our project, along with many others. We have lost a brilliant thinker, a patient teacher, and a loyal friend.


4 Expressions

This section discusses each of the basic kinds of expression. Each kind of expression has a name such as PathExpr, which is introduced on the left side of the grammar production that defines the expression. Since XPath 4.0 is a composable language, each kind of expression is defined in terms of other expressions whose operators have a higher precedence. In this way, the precedence of operators is represented explicitly in the grammar.

The order in which expressions are discussed in this document does not reflect the order of operator precedence. In general, this document introduces the simplest kinds of expressions first, followed by more complex expressions. For the complete grammar, see Appendix [A XPath 4.0 Grammar].

The highest-level symbol in the XPath grammar is XPath.

XPath::=(DefaultElementNamespaceDecl ";")? (NamespaceDecl ";")* Expr
DefaultElementNamespaceDecl::="declare" "default" "element" "namespace" URILiteral
NamespaceDecl::="declare" "namespace" NCName "=" URILiteral
Expr::=(ExprSingle ++ ",")

The effect of a DefaultElementNamespaceDecl or NamespaceDecl is described in 4.1 Namespace Declarations.

ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr
ForExpr::=ForClause ForLetReturn
LetExpr::=LetClause ForLetReturn
QuantifiedExpr::=("some" | "every") (QuantifierBinding ++ ",") "satisfies" ExprSingle
IfExpr::="if" "(" Expr ")" (UnbracedActions | BracedAction)
OrExpr::=AndExpr ("or" AndExpr)*

The XPath 4.0 operator that has lowest precedence is the comma operator, which is used to combine two operands to form a sequence. As shown in the grammar, a general expression (Expr) can consist of multiple ExprSingle operands, separated by commas.

The name ExprSingle denotes an expression that does not contain a top-level comma operator (despite its name, an ExprSingle may evaluate to a sequence containing more than one item).

The symbol ExprSingle is used in various places in the grammar where an expression is not allowed to contain a top-level comma. For example, each of the arguments of a function call must be an ExprSingle, because commas are used to separate the arguments of a function call.

After the comma, the expressions that have next lowest precedence are ForExpr, LetExpr, QuantifiedExpr, IfExpr, and OrExpr. Each of these expressions is described in a separate section of this document.

4.5 Filter Expressions

Changes in 4.0

  1. The value of a predicate in a filter expression can now be a sequence of integers.   [Issue 816 PR 996 6 February 2024]

FilterExpr::=PostfixExpr Predicate
PostfixExpr::=PrimaryExpr | FilterExpr | DynamicFunctionCall | LookupExpr | MethodCall | FilterExprAM
Predicate::="[" Expr "]"

A filter expression consists of a base expression followed by a predicate, which is an expression written in square brackets. The result of the filter expression consists of the items returned by the base expression, filtered by applying the predicate to each item in turn. The ordering of the items returned by a filter expression is the same as their order in the result of the base expression.

Note:

Where the expression before the square brackets is an AbbreviatedStep or FullStep, the expression is technically not a filter expression but an AxisStep. There are minor differences in the semantics: see 4.7.6 Predicates within Steps.

Here are some examples of filter expressions:

  • Given a sequence of products in a variable, return only those products whose price is greater than 100.

    $products[price gt 100]
  • List all the integers from 1 to 100 that are divisible by 5. (See 4.8.1 Sequence Concatenation for an explanation of the to operator.)

    (1 to 100)[. mod 5 eq 0]
  • The result of the following expression is the integer 25:

    (21 to 29)[5]
  • The following example returns the fifth through ninth items in the sequence bound to variable $orders.

    $orders[5 to 9]
  • The following example illustrates the use of a filter expression as a step in a path expression. It returns the last chapter or appendix within the book bound to variable $book:

    $book/(chapter | appendix)[last()]

For each item in the input sequence, the predicate expression is evaluated using an inner focus, defined as follows: The context value is the item currently being tested against the predicate. The context size is the number of items in the input sequence. The context position is the position of the context value within the input sequence.

For each item in the input sequence, the result of the predicate expression is coerced to an xs:boolean value, called the predicate truth value, as described below. Those items for which the predicate truth value is true are retained, and those for which the predicate truth value is false are discarded.

[Definition: The predicate truth value of a value $V is the result of the expression if ($V instance of xs:numeric+) then ($V = position()) else fn:boolean($V).]

Expanding this definition, the predicate truth value can be obtained by applying the following rules, in order:

  1. If the value V of the predicate expression is a sequence whose first item is an instance of the type xs:numeric, then:

    1. V must be an instance of the type xs:numeric+ (that is, every item in V must be numeric). A type error [err:FORG0006] (defined in [Functions and Operators 4.0]) is raised if this is not the case.

    2. The predicate truth value is true if V is equal (by the = operator) to the context position, and is false otherwise.

    In effect this means that an item in the input sequence is selected if its position in the sequence is equal to one or more of the numeric values in the predicate. For example, the predicate [3 to 5] is true for the third, fourth, and fifth items in the input sequence.

    Note:

    It is possible, though not generally useful, for the value of a numeric predicate to depend on the focus, and thus to differ for different items in the input sequence. For example, the predicate [xs:integer(@seq)] selects those items in the input sequence whose @seq attribute is numerically equal to their position in the input sequence.

    It is also possible, and again not generally useful, for the value of the predicate to be numeric for some items in the input sequence, and boolean for others. For example, the predicate [@special otherwise last()] is true for an item that either has an @special attribute, or is the last item in the input sequence.

    Note:

    The truth value of a numeric predicate does not depend on the order of the numbers in V. The predicates [ 1, 2, 3 ] and [ 3, 2, 1 ] have exactly the same effect. The items in the result of a filter expression always retain the ordering of the input sequence.

    Note:

    The truth value of a numeric predicate whose value is non-integral or non-positive is always false.

    Note:

    Beware that using boolean operators (and, or, not()) with numeric values may not have the intended effect. For example the predicate [1 or last()] selects every item in the sequence, because or operates on the effective boolean value of its operands. The required effect can be achieved with the predicate [1, last()].

  2. Otherwise, the predicate truth value is the effective boolean value of the predicate expression.
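The rules above can be modelled outside XPath. The following Python sketch is illustrative only and is not part of the specification: sequences are modelled as Python lists, the names filter_sequence and predicate_truth_value are hypothetical, and effective_boolean_value is a deliberately simplified stand-in for fn:boolean that does not reproduce its error cases.

```python
def is_numeric(item):
    # bool is a subclass of int in Python, so it is excluded explicitly
    return isinstance(item, (int, float)) and not isinstance(item, bool)


def effective_boolean_value(value):
    # Simplified stand-in for fn:boolean (an assumption of this sketch):
    # handles the empty sequence, a single boolean, and a single string;
    # any other value is treated as a sequence that starts with a node.
    if not value:
        return False
    if len(value) == 1 and isinstance(value[0], bool):
        return value[0]
    if len(value) == 1 and isinstance(value[0], str):
        return len(value[0]) > 0
    return True


def predicate_truth_value(value, position):
    # Rule 1: the first item is numeric, so every item must be numeric,
    # and the result is true if any item equals the context position.
    if value and is_numeric(value[0]):
        if not all(is_numeric(v) for v in value):
            raise TypeError("FORG0006: predicate value mixes numeric and non-numeric items")
        return any(v == position for v in value)
    # Rule 2: otherwise use the effective boolean value.
    return effective_boolean_value(value)


def filter_sequence(items, predicate):
    # predicate(item, position, size) returns the predicate value as a list,
    # evaluated with the inner focus described above.
    size = len(items)
    return [
        item
        for position, item in enumerate(items, start=1)
        if predicate_truth_value(predicate(item, position, size), position)
    ]
```

Under these assumptions, filter_sequence(list(range(21, 30)), lambda item, pos, size: [5]) models (21 to 29)[5] and yields a list containing only 25, while a predicate function returning [1, size] models [1, last()].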

4.7 Path Expressions

Changes in 4.0

  1. Path expressions are extended to handle JNodes (found in trees of maps and arrays) as well as XNodes (found in trees representing parsed XML).   [Issue 2054 ]

PathExpr::=AbsolutePathExpr
| RelativePathExpr
/* xgc: leading-lone-slash */
AbsolutePathExpr::=("/" RelativePathExpr?) | ("//" RelativePathExpr)
RelativePathExpr::=StepExpr (("/" | "//") StepExpr)*

[Definition: A path expression is either an absolute path expression or a relative path expression.]

[Definition: An absolute path expression is an instance of the production AbsolutePathExpr: it consists of either (a) the operator / followed by zero or more operands separated by / or // operators, or (b) the operator // followed by one or more operands separated by / or // operators.]

[Definition: A relative path expression is a non-trivial instance of the production RelativePathExpr: it consists of two or more operand expressions separated by / or // operators.]

[Definition: The operands of a path expression are conventionally referred to as steps.]

Note:

The term step must not be confused with axis step. A step can be any kind of expression, often but not necessarily an axis step, while an axis step can be used in any expression context, not necessarily as a step in a path expression.

A path expression is typically used to locate GNodes within GTrees.

Note:

Note the terminology:

The following definitions are copied from the data model specification, for convenience:

  • [Definition: A tree that is rooted at a parentless JNode is referred to as a JTree.]

  • [Definition: A tree that is rooted at a parentless XNode is referred to as an XTree.]

  • [Definition: The term generic node or GNode is a collective term for XNodes (more commonly called simply nodes) representing the parts of an XML document, and JNodes, often used to represent the parts of a JSON document.]

  • [Definition: A JNode is a kind of item used to represent a value within the context of a tree of maps and arrays. A root JNode represents a map or array; a non-root JNode represents a member of an array or an entry in a map.]

[Definition: The term GTree means JTree or XTree.]

Absolute path expressions (those starting with an initial / or //) start their selection from the root GNode of a GTree; relative path expressions (those without a leading / or //) start from the context value.

4.7.2 Relative Path Expressions

RelativePathExpr::=StepExpr (("/" | "//") StepExpr)*
StepExpr::=PostfixExpr | AxisStep
PostfixExpr::=PrimaryExpr | FilterExpr | DynamicFunctionCall | LookupExpr | MethodCall | FilterExprAM
AxisStep::=(AbbreviatedStep | FullStep) Predicate*

A relative path expression is a path expression that selects GNodes within a GTree by following a series of steps starting at the GNodes in the context value (which may be any kind of GNode, not necessarily the root of the tree).

Each non-initial occurrence of // in a path expression is expanded as described in 4.7.4 Recursive Path Operator (//), leaving a sequence of steps separated by /. This sequence of steps is then evaluated from left to right. So a path such as E1/E2/E3/E4 is evaluated as ((E1/E2)/E3)/E4. The semantics of a path expression are thus defined by the semantics of the binary / operator, which is defined in 4.7.3 Path Operator (/).

Note:

Although the semantics describe the evaluation of a path with more than two steps as proceeding from left to right, the / operator is in most cases associative, so evaluation from right to left usually delivers the same result. The cases where / is not associative arise when the functions fn:position() and fn:last() are used: A/B/position() delivers a sequence of integers from 1 to the size of (A/B), whereas A/(B/position()) restarts the counting at each B element.
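The difference between the two groupings can be illustrated with a small model. The following Python sketch is illustrative only: each inner list stands for the B children of one A element (hypothetical data), and it assumes that the children are already distinct and in document order.

```python
# Each inner list models the B children of one A element (hypothetical data).
a_elements = [["b1", "b2"], ["b3", "b4", "b5"]]


def left_to_right(a_elems):
    # (A/B)/position(): A/B is evaluated first, giving one sequence of all
    # B elements; position() then counts across that whole sequence.
    all_b = [b for children in a_elems for b in children]
    return [position for position, _ in enumerate(all_b, start=1)]


def right_to_left(a_elems):
    # A/(B/position()): B/position() is evaluated once per A element,
    # so the counting restarts for each A.
    return [
        position
        for children in a_elems
        for position, _ in enumerate(children, start=1)
    ]
```

With the data shown, left_to_right(a_elements) gives [1, 2, 3, 4, 5] whereas right_to_left(a_elements) gives [1, 2, 1, 2, 3].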

The following example illustrates the use of a relative path expression to select within an XTree. It is assumed that the context value is a single XNode, referred to as the context node.

  • child::div1/child::para

    Selects the para element children of the div1 element children of the context node; that is, the para element grandchildren of the context node that have div1 parents.

Note:

Since each step in a path provides context GNodes for the following step, in effect, only the last step in a path is allowed to return a sequence of non-GNodes.

4.7.3 Path Operator (/)

The path operator / is primarily used for locating GNodes within GTrees. The value of the left-hand operand may include maps and arrays; such items are implicitly converted to JNodes as if by a call on the fn:jtree function. After this conversion, the left-hand operand must return a sequence of GNodes. The result of the operator is either a sequence of GNodes (in document order, with no duplicates), or a sequence of non-GNodes.

The operation E1/E2 is evaluated as follows: Expression E1 is evaluated. Any maps or arrays in the result are converted to JNodes by applying the fn:jtree function. If the result is not a (possibly empty) sequence S of GNodes, a type error is raised [err:XPTY0019]. Each GNode in S then serves in turn to provide an inner focus (the GNode as the context value, its position in S as the context position, the length of S as the context size) for an evaluation of E2, as described in 2.2.2 Dynamic Context. The sequences resulting from all the evaluations of E2 are combined as follows:

  1. If every evaluation of E2 returns a (possibly empty) sequence of GNodes, these sequences are combined, and duplicate GNodes are eliminated based on GNode identity. The resulting GNode sequence is returned in document order.

  2. If every evaluation of E2 returns a (possibly empty) sequence of non-GNodes, these sequences are concatenated, in order, and returned. The returned sequence preserves the orderings within and among the subsequences generated by the evaluations of E2.

    Note:

    The use of path expressions to select values other than GNodes is for backwards compatibility. Generally it is preferable to use the simple mapping operator ! for this purpose. For example, write $nodes!node-name() in preference to $nodes/node-name().

  3. If the multiple evaluations of E2 return at least one GNode and at least one non-GNode, a type error is raised [err:XPTY0018].

Note:

The semantics of the path operator can also be defined using the simple map operator (!) as follows (the function fn:distinct-ordered-nodes($R) has the effect of eliminating duplicates and sorting nodes into document order):

let $R := E1 ! E2
return if (every $r in $R satisfies $r instance of gnode())
       then (fn:distinct-ordered-nodes($R))
       else if (every $r in $R satisfies not($r instance of gnode()))
       then $R
       else error()
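The three combination rules can also be sketched in a general-purpose language. The following Python model is illustrative only: the GNode class, its order field (standing in for document order), and the path function are hypothetical, and the implicit conversion of maps and arrays by fn:jtree is not modelled.

```python
class GNode:
    # Minimal stand-in for a GNode: a name, a document-order index,
    # and an optional parent.
    def __init__(self, name, order, parent=None):
        self.name = name
        self.order = order
        self.parent = parent


def path(s, e2):
    # s models the result of E1; e2(node, position, size) models one
    # evaluation of E2 with the inner focus and returns a list.
    if not all(isinstance(n, GNode) for n in s):
        raise TypeError("XPTY0019: left-hand operand is not a sequence of GNodes")
    results = []
    size = len(s)
    for position, node in enumerate(s, start=1):
        results.extend(e2(node, position, size))
    is_node = [isinstance(r, GNode) for r in results]
    if all(is_node):
        # Rule 1: eliminate duplicates by identity, return in document order.
        distinct = {id(r): r for r in results}.values()
        return sorted(distinct, key=lambda n: n.order)
    if not any(is_node):
        # Rule 2: concatenate non-GNode results, preserving their order.
        return results
    # Rule 3: a mixture of GNodes and non-GNodes is a type error.
    raise TypeError("XPTY0018: result mixes GNodes and non-GNodes")
```

For example, if two sibling nodes share a parent, path([b, a], lambda n, p, s: [n.parent]) returns that parent once, whereas path([b, a], lambda n, p, s: [n.name]) returns the names in the order of the input sequence.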

For a table comparing the step operator to the map operator, see 4.20 Simple map operator (!).

4.7.4 Recursive Path Operator (//)

When // is used as an infix operator, it can be treated as an abbreviation for /descendant-or-self::gnode()/.

In simple cases, an expression such as $x//y is equivalent to $x/descendant::y. But in some cases the semantics are more complex, for example:

  • $x//@a expands to $x/descendant-or-self::gnode()/attribute::a, which selects all attributes having $x as an ancestor.

  • $x//y[1] expands to $x/descendant-or-self::gnode()/child::y[1], which selects every descendant element of $x named y that is the first child of its parent. This is not the same as ($x//y)[1], which selects the first descendant of $x that is named y.
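The distinction between $x//y[1] and ($x//y)[1] can be checked against a small model. The following Python sketch is illustrative only: it uses the standard xml.etree.ElementTree module purely as a tree container (its own limited path syntax is not used), the sample document and the function names are hypothetical, and only element nodes are modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two s elements containing y elements.
x = ET.fromstring("<x><s><y id='1'/><y id='2'/></s><s><y id='3'/></s></x>")


def first_y_of_each_parent(origin):
    # Models $x/descendant-or-self::gnode()/child::y[1]: for every GNode at
    # or below the origin, take the first child named y.
    order = {node: index for index, node in enumerate(origin.iter())}
    selected = []
    for node in origin.iter():  # the origin and its descendants
        y_children = [child for child in node if child.tag == "y"]
        selected.extend(y_children[:1])
    # The path operator returns GNodes in document order.
    return sorted(selected, key=lambda node: order[node])


def first_y_overall(origin):
    # Models ($x//y)[1]: collect every descendant named y in document
    # order, then keep only the first.
    all_y = [n for n in origin.iter() if n is not origin and n.tag == "y"]
    return all_y[:1]
```

For the sample document, first_y_of_each_parent(x) selects the y elements whose id attributes are 1 and 3, while first_y_overall(x) selects only the one whose id is 1.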

The // operator can be used both with XNodes and with JNodes: the same expansion applies in both cases. For example, if $x is the array:

[ {"a":10, "b":11}, [ {"a":20, "b":21} ] ]

then $x//b returns two JNodes whose contents are 11 and 21 respectively.
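The same expansion can be traced over plain data. The following Python sketch is illustrative only: Python dicts and lists stand in for maps and arrays, the functions return the contents of the selected JNodes rather than JNodes themselves, and keying array members by their 1-based position is an assumption of this model.

```python
def children(value):
    # Yield (key, content) pairs for the children of a map or array.
    # Other values have no children.
    if isinstance(value, dict):
        yield from value.items()
    elif isinstance(value, list):
        yield from enumerate(value, start=1)  # assumed 1-based keys


def descendant_or_self(value):
    # The value itself, followed by its descendants in document order.
    yield value
    for _, child in children(value):
        yield from descendant_or_self(child)


def double_slash(value, key):
    # Models value//key, that is,
    # value/descendant-or-self::gnode()/child::key
    return [
        content
        for node in descendant_or_self(value)
        for child_key, content in children(node)
        if child_key == key
    ]


x = [{"a": 10, "b": 11}, [{"a": 20, "b": 21}]]
```

With x defined as above, double_slash(x, "b") returns [11, 21], matching the contents of the two JNodes selected by $x//b.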

It is valid, but rarely useful, to use the // operator in conjunction with an axis other than child or attribute. For example, $x//following-sibling::y expands to $x/descendant-or-self::gnode()/following-sibling::y, which selects every y descendant of $x that is not the first child of its parent, as well as the following siblings of $x itself.

It is also valid to follow // with an expression other than an axis step. For example, distinct-values($x//node-name()) has the same effect as $x/descendant-or-self::gnode() =!> node-name() => distinct-values()

The effect of // at the start of an expression is explained in 4.7.1 Absolute Path Expressions.

4.7.5 Axis Steps

AxisStep::=(AbbreviatedStep | FullStep) Predicate*
AbbreviatedStep::=".." | ("@" NodeTest) | SimpleNodeTest
FullStep::=Axis NodeTest
Axis::=("ancestor" | "ancestor-or-self" | "attribute" | "child" | "descendant" | "descendant-or-self" | "following" | "following-or-self" | "following-sibling" | "following-sibling-or-self" | "namespace" | "parent" | "preceding" | "preceding-or-self" | "preceding-sibling" | "preceding-sibling-or-self" | "self") "::"
NodeTest::=UnionNodeTest | SimpleNodeTest
Predicate::="[" Expr "]"

[Definition: An axis step is an instance of the production AxisStep: it is an expression that returns a sequence of GNodes that are reachable from a starting GNode via a specified axis. An axis step has three parts: an axis, which defines the direction of movement for the step, a node test, which selects GNodes based on their properties, and zero or more predicates which are used to filter the results.]

Note:

An axis step is an expression in its own right. While axis steps are often used as the operands of path expressions, they can also appear in other contexts (without a / or // operator); equally, the operands of a path expression can be any expression, not restricted to an axis step.

If the context value for an axis step includes a map or array, this is implicitly converted to a JNode as if by applying the fn:jtree function. If, after this conversion, the sequence contains a value that is not a GNode, a type error is raised [err:XPTY0020]. The result of evaluating the axis step is a sequence of zero or more GNodes.

The axis step S is equivalent to ./S. Thus, if the context value is a sequence containing multiple GNodes, the semantics of an axis step are equivalent to a path expression in which the step is always applied to a single GNode. The following description therefore explains the semantics for the case where the context value is a single GNode, called the origin.

Note:

The equivalence of an axis step S to the path expression ./S means that the resulting GNode sequence is returned in document order.

In the abbreviated syntax for a step, the axis can be omitted and other shorthand notations can be used as described in 4.7.8 Abbreviated Syntax.

The unabbreviated syntax for an axis step consists of the axis name and node test separated by a double colon. The result of the step consists of the GNodes reachable from the origin via the specified axis that match the node test. For example, the step child::para selects the para element children of the origin XNode: child is the name of the axis, and para is the name of the element nodes to be selected on this axis. The available axes are described in 4.7.5.1 Axes. The available node tests are described in 4.7.5.2 Node Tests. Examples of steps are provided in 4.7.7 Unabbreviated Syntax and 4.7.8 Abbreviated Syntax.

4.7.5.1 Axes

Changes in 4.0

  1. Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.   [Issue 1519 PR 1532 29 October 2024]

Axis::=("ancestor" | "ancestor-or-self" | "attribute" | "child" | "descendant" | "descendant-or-self" | "following" | "following-or-self" | "following-sibling" | "following-sibling-or-self" | "namespace" | "parent" | "preceding" | "preceding-or-self" | "preceding-sibling" | "preceding-sibling-or-self" | "self") "::"

An axis is essentially a function that takes a GNode (the origin) as input, and delivers a sequence of GNodes (always from within the same GTree as the origin) as its result.

XPath defines a set of axes for traversing documents, but a host language may define a subset of these axes. The following axes are defined:

  • The child axis contains the children of the origin.

    If the origin is an XNode, these are the XNodes returned by the children accessor ([XDM 4.0] section 7.6.3 children Accessor).

    Note:

    In an XTree, only document nodes and element nodes have children. If the origin is any other kind of XNode, or if the origin is an empty document or element node, then the child axis returns an empty sequence. The children of a document node or element node may be element, processing instruction, comment, or text nodes. Attribute, namespace, and document nodes can never appear as children.

    If the origin is a JNode, these are the JNodes returned by the j-children accessor defined in [XDM 4.0].

  • The descendant axis is defined as the transitive closure of the child axis; it contains the descendants of the origin (the children, the children of the children, and so on).

    More formally, $node/descendant::gnode() delivers the result of fn:transitive-closure($node, fn { child::gnode() }).

  • The descendant-or-self axis contains the origin and the descendants of the origin.

    More formally, $node/descendant-or-self::gnode() delivers the result of $node/(. | descendant::gnode()).

  • The parent axis returns the parent of the origin.

    If the origin is an XNode, this is the result of the parent accessor ([XDM 4.0] section 7.6.11 parent Accessor).

    If the origin is a JNode, this is the value of the ·parent· property of the origin.

    If the GNode has no parent, the axis returns an empty sequence.

    Note:

    An attribute node may have an element node as its parent, even though the attribute node is not a child of the element node.

  • The ancestor axis is defined as the transitive closure of the parent axis; it contains the ancestors of the origin (the parent, the parent of the parent, and so on).

    More formally, $node/ancestor::gnode() delivers the result of fn:transitive-closure($node, fn { parent::gnode() }).

    Note:

    The ancestor axis includes the root GNode of the GTree in which the origin is found, unless the origin is itself the root GNode.

  • The ancestor-or-self axis contains the origin and the ancestors of the origin; thus, the ancestor-or-self axis will always include the root.

    More formally, $node/ancestor-or-self::gnode() delivers the result of $node/(. | ancestor::gnode()).

  • The following-sibling axis returns the origin’s following siblings, that is, those children of the origin’s parent that occur after the origin in document order. If the origin is an attribute or namespace node, the following-sibling axis is empty.

    More formally, $node/following-sibling::gnode() delivers the result of fn:siblings($node)[. >> $node].

  • The following-sibling-or-self axis contains the origin, together with the contents of the following-sibling axis.

    More formally, $node/following-sibling-or-self::gnode() delivers the result of fn:siblings($node)[not(. << $node)].

  • The preceding-sibling axis returns the origin’s preceding siblings, that is, those children of the origin’s parent that occur before the origin in document order. If the origin is an attribute or namespace node, the preceding-sibling axis is empty.

    More formally, $node/preceding-sibling::gnode() delivers the result of fn:siblings($node)[. << $node].

  • The preceding-sibling-or-self axis contains the origin, together with the contents of the preceding-sibling axis.

    More formally, $node/preceding-sibling-or-self::gnode() delivers the result of fn:siblings($node)[not(. >> $node)].

  • The following axis contains all GNodes that are descendants of the root of the GTree in which the origin is found, are not descendants of the origin, and occur after the origin in document order.

    More formally, $node/following::gnode() delivers the result of $node/ancestor-or-self::gnode()/following-sibling::gnode()/descendant-or-self::gnode().

  • The following-or-self axis contains the origin, together with the contents of the following axis.

    More formally, $node/following-or-self::gnode() delivers the result of $node/(. | following::gnode()).

  • The preceding axis returns all GNodes that are descendants of the root of the GTree in which the origin is found, are not ancestors of the origin, and occur before the origin in document order.

    More formally, $node/preceding::gnode() delivers the result of $node/ancestor-or-self::gnode()/preceding-sibling::gnode()/descendant-or-self::gnode().

  • The preceding-or-self axis returns the origin, together with the contents of the preceding axis.

    More formally, $node/preceding-or-self::gnode() delivers the result of $node/(. | preceding::gnode()).

  • The attribute axis is defined only for XNodes. It returns the attributes of the origin, which are the nodes returned by the attributes accessor ([XDM 4.0] section 7.6.1 attributes Accessor); the axis will be empty unless the origin is an element.

    If the attribute axis is applied to a JNode, a type error [err:XPTY0004] is raised.

  • The self axis contains just the origin itself.

    The self axis is primarily useful when testing whether the origin satisfies particular conditions, for example if ($x[self::chapter]).

    More formally, $node/self::gnode() delivers the result of $node.

  • The namespace axis is defined only for XNodes. It returns the namespace nodes of the origin, which are the nodes returned by the [XDM 4.0] section 7.6.7 namespace-nodes Accessor; this axis is empty unless the origin is an element node. The namespace axis is deprecated as of XPath 2.0. If XPath 1.0 compatibility mode is true, the namespace axis must be supported. If XPath 1.0 compatibility mode is false, then support for the namespace axis is implementation-defined. An implementation that does not support the namespace axis when XPath 1.0 compatibility mode is false must raise a static error [err:XPST0010] if it is used. Applications needing information about the in-scope namespaces of an element should use the function fn:in-scope-namespaces.

    If the namespace axis is applied to a JNode, a type error [err:XPTY0004] is raised.

Axes can be categorized as forward axes and reverse axes. An axis that only ever contains the origin or GNodes that are after the origin in document order is a forward axis. An axis that only ever contains the origin or GNodes that are before the origin in document order is a reverse axis.

The parent, ancestor, ancestor-or-self, preceding, preceding-or-self, preceding-sibling, and preceding-sibling-or-self axes are reverse axes; all other axes are forward axes.

The ancestor, descendant, following, preceding and self axes partition a GTree (ignoring attribute and namespace nodes): they do not overlap and together they contain all the GNodes in the GTree.
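The partition property can be checked mechanically on a small tree. The following Python sketch is illustrative only: it uses the standard xml.etree.ElementTree module as a tree container, models element nodes only (consistent with ignoring attribute and namespace nodes), and computes the following and preceding axes from the formal definitions given above. The sample document and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree containing nine elements.
root = ET.fromstring("<r><a><b/><c/></a><d><e/><f><g/></f></d><h/></r>")
parent = {child: p for p in root.iter() for child in p}
all_nodes = list(root.iter())


def ancestors(n):
    out = []
    while n in parent:
        n = parent[n]
        out.append(n)
    return out


def descendants(n):
    return [d for d in n.iter() if d is not n]


def siblings_after(n):
    if n not in parent:
        return []
    sibs = list(parent[n])
    return sibs[sibs.index(n) + 1:]


def siblings_before(n):
    if n not in parent:
        return []
    sibs = list(parent[n])
    return sibs[:sibs.index(n)]


def following(n):
    # ancestor-or-self::gnode()/following-sibling::gnode()/descendant-or-self::gnode()
    return [d for a in [n] + ancestors(n) for s in siblings_after(a) for d in s.iter()]


def preceding(n):
    # ancestor-or-self::gnode()/preceding-sibling::gnode()/descendant-or-self::gnode()
    return [d for a in [n] + ancestors(n) for s in siblings_before(a) for d in s.iter()]


def is_partition(n):
    parts = [ancestors(n), descendants(n), following(n), preceding(n), [n]]
    flat = [node for part in parts for node in part]
    # No node appears twice, and together the five axes cover the tree.
    return len(flat) == len(set(flat)) and set(flat) == set(all_nodes)
```

Under these assumptions, all(is_partition(n) for n in all_nodes) holds for the sample tree, whichever element is chosen as the origin.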

[Definition: Every axis has a principal node kind. If an axis can contain elements, then the principal node kind is element; otherwise, it is the kind of nodes that the axis can contain.] Thus:

  • For the attribute axis, the principal node kind is attribute.

  • For the namespace axis, the principal node kind is namespace.

  • For all other axes, the principal node kind is element.

4.7.5.2 Node Tests

Changes in 4.0

  1. If the default namespace for elements and types has the special value ##any, then an unprefixed name in a NameTest acts as a wildcard, matching names in any namespace or none.   [Issue 296 PR 1181 30 April 2024]

[Definition: A node test is a condition on the properties of a GNode. A node test determines which GNodes returned by an axis are selected by a step.]

NodeTest::=UnionNodeTest | SimpleNodeTest
UnionNodeTest::="(" (SimpleNodeTest ++ "|") ")"
SimpleNodeTest::=TypeTest | Selector
TypeTest::=RegularItemType | ("type" "(" SequenceType ")")
Selector::=EQName | Wildcard | ("get" "(" ExprSingle ")")
EQName::=QName | URIQualifiedName
Wildcard::="*"
| (NCName ":*")
| ("*:" NCName)
| (BracedURILiteral "*")
/* ws: explicit */
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr

Node tests fall into three categories:

  • Type tests, which test the type of the GNode;

  • Selectors, which act as keys used to identify the GNode among its siblings (in the case of XNodes, this is the node name);

  • Union node tests, which provide multiple conditions: a GNode satisfies the union node test if it satisfies any of its operand node tests.

A UnionNodeTest matches a node N if at least one of the constituent SimpleNodeTests matches N.

For example, (div1|div2|div3) matches a node named div1, div2, or div3.

The semantics of selectors varies between XNodes and JNodes, so the two cases are described separately.

4.7.5.3 Selectors for XNodes

This section describes the semantics of a name test in the case where the origin is an XNode.

[Definition: A node test that consists only of an EQName or a Wildcard is called a name test.]

A node test written as an NCName is expanded as follows:

If the expanded node test is an expanded QName Q, then it matches a node N if and only if the kind of node N is the principal node kind for the step axis and the expanded QName of the node is equal (as defined by the eq operator) to Q. For example, child::para selects the para element children of the context node; if the context node has no para children, it selects an empty set of nodes. attribute::abc:href selects the attribute of the context node with the QName abc:href; if the context node has no such attribute, it selects an empty set of nodes.

Note:

A name test is not satisfied by an element node whose name does not match the expanded QName of the name test, even if it is in a substitution group whose head is the named element.

Wildcard node tests are interpreted as follows:

  • The node test * is true for any node of the principal node kind of the step axis. For example, child::* will select all element children of the context node, and attribute::* will select all attributes of the context node.

  • A node test can have the form NCName:*. In this case, the prefix is expanded in the same way as with a lexical QName, using the statically known namespaces in the static context. If the prefix is not found in the statically known namespaces, a static error is raised [err:XPST0081]. The node test is true for any node of the principal node kind of the step axis whose expanded QName has the namespace URI to which the prefix is bound, regardless of the local part of the name.

  • A node test can contain a BracedURILiteral, for example Q{http://example.com/msg}*. Such a node test is true for any node of the principal node kind of the step axis whose expanded QName has the namespace URI specified in the BracedURILiteral, regardless of the local part of the name.

  • A node test can also have the form *:NCName. In this case, the node test is true for any node of the principal node kind of the step axis whose local name matches the given NCName, regardless of its namespace or lack of a namespace.
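The four wildcard forms above can be modelled in a few lines of Python. This is a minimal sketch, not part of the language definition: the names QName and matches_wildcard are hypothetical, the namespaces dictionary stands in for the statically known namespaces, and the check that the node is of the principal node kind of the axis is assumed to have been made by the caller.

```python
from typing import NamedTuple, Optional


class QName(NamedTuple):
    """An expanded QName: namespace URI (None for no namespace) and local name."""
    uri: Optional[str]
    local: str


def matches_wildcard(test: str, name: QName, namespaces: dict) -> bool:
    """Return True if `name` satisfies the wildcard node test `test`.

    `namespaces` maps prefixes to namespace URIs (a stand-in for the
    statically known namespaces). Only the wildcard forms are handled.
    """
    if test == "*":
        # "*" accepts any name.
        return True
    if test.startswith("*:"):
        # "*:NCName": the local name must match; the namespace is ignored.
        return name.local == test[2:]
    if test.startswith("Q{") and test.endswith("}*"):
        # "Q{uri}*": the namespace URI must match; the local name is ignored.
        return (name.uri or "") == test[2:-2]
    if test.endswith(":*"):
        # "prefix:*": expand the prefix, then compare namespace URIs.
        prefix = test[:-2]
        if prefix not in namespaces:
            raise LookupError("err:XPST0081: prefix not declared: " + prefix)
        return name.uri == namespaces[prefix]
    raise ValueError("not a wildcard node test: " + test)
```

For instance, with the prefix abc bound to a hypothetical URI http://example.com/abc, the test abc:* accepts the name (http://example.com/abc, href) and rejects a no-namespace href.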

A selector can also take the form get(E) where E is an ExprSingle. The contained expression E is evaluated with an absent focus. An XNode satisfies the selector if its node kind is the principal node kind of the axis and its node name is equal to one of the values (necessarily an xs:QName) present in the atomized value of the selector expression.

That is, if the context item is an XNode, then Axis::get(E) returns the result of the expression:

let $selector := fn(){ data(E) }()
return Axis::*[some($selector, atomic-equal(?, node-name()))]

Note:

The purpose of evaluating E within the body of an anonymous inline function is to ensure that it is evaluated with an absent focus.

It is not an error if the atomized value of E includes atomic items that are not xs:QName values: such values are effectively ignored.

Note:

Unnamed nodes (such as text nodes) will never be selected.

Note:

The result is in document order. The order of items in the result of the selector expression is immaterial.

For example, child::get((#body, #x:body)) selects all child elements whose name is one of the QName values body or x:body. Note that the evaluation of QName literals is not sensitive to the default namespace for elements and types.

A further example: descendant::(get(#para) | text()) returns all descendants of the context node that are either elements named para (in no namespace) or text nodes; the results are in document order.
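The behaviour of get(E) on XNodes can be sketched in Python as a filter over the nodes delivered by the axis. The classes QName and XNode and the function select_get are hypothetical modelling devices, and the namespace URI http://example.com/x is an assumed binding for the prefix x. The sketch illustrates three points made above: values that are not QNames are ignored, unnamed nodes are never selected, and the result follows the order of the axis rather than the order of the selector values.

```python
from typing import NamedTuple, Optional


class QName(NamedTuple):
    """An expanded QName: namespace URI (None for no namespace) and local name."""
    uri: Optional[str]
    local: str


class XNode(NamedTuple):
    """A very small stand-in for an XNode: its kind and (optional) name."""
    kind: str
    name: Optional[QName]


def select_get(axis_nodes, selector_values, principal_kind="element"):
    """Model Axis::get(E): keep nodes whose name is one of the QName values.

    `axis_nodes` is the list of nodes on the axis, in document order.
    `selector_values` is the atomized value of E.
    """
    # Atomic items that are not QNames can never equal a node name.
    wanted = {v for v in selector_values if isinstance(v, QName)}
    return [n for n in axis_nodes
            if n.kind == principal_kind and n.name in wanted]


# Children of a hypothetical context node, in document order.
children = [
    XNode("element", QName(None, "head")),
    XNode("text", None),
    XNode("element", QName("http://example.com/x", "body")),
    XNode("element", QName(None, "body")),
]

# Models child::get((#body, #x:body)), with two stray non-QName values added.
selected = select_get(
    children,
    [QName(None, "body"), QName("http://example.com/x", "body"), "body", 42],
)
```

Here selected contains the x:body element followed by the no-namespace body element, which is their order among the children, not the order in which the QNames were listed.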

4.7.5.4 Selectors for JNodes

When the origin is a JNode, the selector filters the JNodes returned by the axis according to the JNode's ·selector· property. In the case of a JNode that wraps an entry in a map, this can be any atomic item; for a JNode that wraps a member of an array, it will be a non-negative integer.

If the selector takes the form *, then it matches every JNode (including one whose ·selector· property is absent).

If the selector takes the form of an NCName, then it matches every JNode whose ·selector· property is an atomic item (necessarily an xs:string, xs:anyURI or xs:untypedAtomic value) that is equal to this NCName under the rules of the fn:atomic-equal function.

If the selector takes the form of any other EQName or wildcard, then it matches every JNode whose ·selector· property is a matching xs:QName, using the same rules as for wildcards in XNode steps (see 4.7.5.3 Selectors for XNodes).

If the selector takes the form get(E), where E is an ExprSingle, the contained expression E is evaluated with an absent focus. A JNode satisfies the selector if the value of its ·selector· property is equal to one of the values present in the atomized value of the selector expression E, under the rules of the atomic-equal function.

That is, if the context item is a JNode, then Axis::get(E) returns the result of the expression:

let $selector := fn(){ data(E) }()
return Axis::*[some($selector, atomic-equal(?, jnode-selector()))]

Note:

The purpose of evaluating E within the body of an anonymous inline function is to ensure that it is evaluated with an absent focus.

It is not an error if the atomized value of E includes atomic items that select nothing: such values are effectively ignored. This is true both for maps and arrays.

Note:

The result is in document order. The order of items in the result of the selector expression is immaterial.

Note:

A performant implementation might reasonably be expected, at least in the case where the value of the selector is a single atomic item, to select an entry in a map or a member in an array in constant time.

For example:

  • child::code selects an entry in a map whose key is the string "code"

  • child::get("date of birth") selects an entry in a map whose key is the string "date of birth"

  • child::get(3) selects the third member of an array

    Note:

    The same result can be achieved using the expression child::*[3].

  • child::get(1 to 3) selects the first three members of an array, in document order.

    Note:

    child::get((3, 2, 1)) also returns the first three members in document order.

  • child::get(current-date()) selects an entry in a map whose key is an xs:date value equal to the current date.

All the above expressions return a sequence of JNodes. If the containing expression expects atomic items, then the JNodes are automatically atomized.
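The selection rules for JNodes can be approximated in Python, treating a map as a dict and an array as a list. This is a rough sketch under stated assumptions: jnode_children, atomic_equal and child_get are hypothetical names, each child is represented as a (selector, content) pair rather than a real JNode, dict insertion order is taken as the order of map entries, and atomic_equal only approximates fn:atomic-equal for strings, numbers and booleans.

```python
def jnode_children(value):
    """Return (selector, content) pairs for the children of a map or array."""
    if isinstance(value, dict):
        # Map entries: the selector is the key.
        return list(value.items())
    if isinstance(value, list):
        # Array members: the selector is the 1-based position.
        return list(enumerate(value, start=1))
    return []


def atomic_equal(a, b):
    """Rough stand-in for fn:atomic-equal on strings, numbers and booleans."""
    if isinstance(a, bool) or isinstance(b, bool):
        # Python treats True == 1; booleans should only equal booleans here.
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def child_get(value, selector_values):
    """Model child::get(E): keep children whose selector equals any value of E."""
    return [(selector, content)
            for selector, content in jnode_children(value)
            if any(atomic_equal(selector, v) for v in selector_values)]


person = {"code": "X1", "date of birth": "2003-04-19"}
members = ["a", "b", "c", "d"]

by_key = child_get(person, ["date of birth"])   # child::get("date of birth")
third = child_get(members, [3])                 # child::get(3)
first_three = child_get(members, [3, 2, 1])     # child::get((3, 2, 1))
nothing = child_get(members, [99, "x"])         # values that select nothing
```

Because the filter walks the children in order, child_get(members, [3, 2, 1]) and child_get(members, range(1, 4)) give the same result, and selector values that match nothing are silently ignored. A real implementation would normally look up a single key or index directly rather than scanning.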

4.7.5.5 Type Tests

[Definition: An alternative form of a node test called a type test can select XNodes based on their type, or in the case of JNodes, the type of their contained ·content·.]

TypeTest::=RegularItemType | ("type" "(" SequenceType ")")
RegularItemType::=AnyItemTest | NodeKindTest | GNodeType | JNodeType | MapType | ArrayType | RecordType | EnumerationType
AnyItemTest::="item" "(" ")"
NodeKindTest::=DocumentTest
| ElementTest
| AttributeTest
| SchemaElementTest
| SchemaAttributeTest
| PITest
| CommentTest
| TextTest
| NamespaceNodeTest
| AnyNodeKindTest
DocumentTest::="document-node" "(" (ElementTest | SchemaElementTest | NameTestUnion)? ")"
ElementTest::="element" "(" (NameTestUnion ("," TypeName "?"?)?)? ")"
SchemaElementTest::="schema-element" "(" ElementName ")"
NameTestUnion::=(NameTest ++ "|")
NameTest::=EQName | Wildcard
AttributeTest::="attribute" "(" (NameTestUnion ("," TypeName)?)? ")"
SchemaAttributeTest::="schema-attribute" "(" AttributeName ")"
PITest::="processing-instruction" "(" (NCName | StringLiteral)? ")"
StringLiteral::=AposStringLiteral | QuotStringLiteral
/* ws: explicit */
CommentTest::="comment" "(" ")"
TextTest::="text" "(" ")"
NamespaceNodeTest::="namespace-node" "(" ")"
AnyNodeKindTest::="node" "(" ")"
GNodeType::="gnode" "(" ")"
JNodeType::="jnode" "(" SequenceType? ")"
MapType::=AnyMapType | TypedMapType
ArrayType::=AnyArrayType | TypedArrayType
RecordType::=AnyRecordType | TypedRecordType
EnumerationType::="enum" "(" (StringLiteral ++ ",") ")"
SequenceType::=("empty-sequence" "(" ")")
| (ItemType OccurrenceIndicator?)

The most general form of type test uses the syntax type(SequenceType). This selects:

  • XNodes that are instances of the supplied SequenceType;

  • JNodes whose ·content· property is an instance of the supplied SequenceType.

For the most commonly encountered types, this syntax can be abbreviated: for example node(), text(), array(*), and record(x, y) can be written directly without the enclosing type(...).

If the origin is an XNode the type used will normally be a NodeKindTest such as node() or comment(). Specifying a type that cannot select nodes, such as map(*), is allowed but pointless.

Note:

If T is a NodeKindTest, there is a subtle difference between the expressions $N/T and $N/type(T): if no explicit axis is specified, and if T is in the form attribute(N) or schema-attribute(N), this changes the default axis to be the attribute axis; and similarly for tests implicitly using the namespace axis. This rule does not apply when the step is written as type(attribute(N)) or type(schema-attribute(N)). Such constructs make sense, for example, when selecting members of an array that are XNodes: the expressions $array/type(element(*)) and $array/type(attribute(*)) can be used to select the members of an array that are element nodes or attribute nodes respectively.

Such expressions return a sequence of JNodes, whose ·content· property is an XNode. Note that it is not directly possible to start with a JNode, select a contained XNode, and then navigate from the XNode: a path such as $array/type(element(p))/@id will not work. This is because the first step, $array/type(element(p)), does not select an element node, it selects a JNode whose ·content· property is that element node, and use of the attribute axis starting from a JNode has no effect.

Instead, the required effect can be achieved by adding a step that explicitly extracts the content of the JNode: $array/type(element(p))/jnode-content()/@id.

The syntax and semantics of a kind test are described in 3.1 Sequence Types and 3.1.2 Sequence Type Matching.

Shown below are several examples of type tests that might be used in path expressions selecting within an XTree:

  • node() matches any XNode.

  • text() matches any text node.

  • comment() matches any comment node.

  • namespace-node() matches any namespace node.

  • element() matches any element node.

  • schema-element(person) matches any element node whose name is person (or is in the substitution group headed by person), and whose type annotation is the same as (or is derived from) the declared type of the person element in the in-scope element declarations.

  • element(person) matches any element node whose name is person, regardless of its type annotation.

  • element(doctor|nurse) matches any element node whose name is doctor or nurse, regardless of its type annotation.

  • element(person, surgeon) matches any non-nilled element node whose name is person, and whose type annotation is surgeon or is derived from surgeon.

  • element(doctor|nurse, medical-staff) matches any non-nilled element node whose name is doctor or nurse, and whose type annotation is medical-staff or is derived from medical-staff.

  • element(*, surgeon) matches any non-nilled element node whose type annotation is surgeon (or is derived from surgeon), regardless of its name.

  • attribute() matches any attribute node.

  • attribute(price) matches any attribute whose name is price, regardless of its type annotation.

  • attribute(*, xs:decimal) matches any attribute whose type annotation is xs:decimal (or is derived from xs:decimal), regardless of its name.

  • document-node() matches any document node.

  • document-node(element(book)) matches any document node whose children consist of a single element node that satisfies the kind test element(book), interleaved with zero or more comments and processing instructions, and no text nodes.

  • document-node(book) is an abbreviation for document-node(element(book)).

The following examples show type tests that might be used in path expressions selecting within a JTree:

  • array(*) matches any JNode whose ·content· is an array.

  • record(longitude, latitude, *) matches any JNode whose ·content· is a map having entries with keys "longitude" and "latitude".

  • type(empty-sequence()) matches any JNode whose ·content· is an empty sequence.

  • type(xs:date) matches any JNode whose ·content· is an instance of xs:date.

4.7.5.6 Implausible Axis Steps

Changes in 4.0

  1. The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.   [Issue 602 PR 603 25 July 2023]

Certain axis steps, given an inferred type for the context value, are classified as implausible. During the static analysis phase, a processor may (subject to the rules in 2.5.6 Implausible Expressions) report a static error when such axis steps are encountered: [err:XPTY0144].

More specifically, an axis step is classified as implausible if any of the following conditions applies:

  1. The inferred item type of the context value is a node kind for which the specified axis is always empty: for example, the inferred item type of the context value is attribute() and the axis is child.

  2. The node test exclusively selects node kinds that cannot appear on the specified axis: for example, the axis is child and the node test is document-node().

  3. In a schema-aware environment, when using the child, descendant, descendant-or-self, or attribute axes, the inferred item type of the context value has a content type that does not allow any node matching the node test to be present on the relevant axis. For example, if the inferred item type of the context value is schema-element(list) and the relevant element declaration (taking into account substitution group membership and wildcards) only allows item children, the axis step child::li will never select anything and is therefore classified as implausible.

Examples of implausible axis steps include the following:

  • @code/text(): attributes cannot have text node children.

  • /@code: document nodes cannot have attributes.

  • ancestor::text(): the ancestor axis never returns text nodes.

  • element(*)/child::map: the child axis starting at an element node will never select a map.

Note:

Processors may choose not to classify the expression /.. as implausible, since XSLT 1.0 users were sometimes advised to use this construct as an explicit way of denoting the empty sequence.
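The first two conditions lend themselves to a table-driven check. The following Python sketch is purely illustrative: the tables cover only a small subset of node kinds and axes, the names ALWAYS_EMPTY_AXES, NEVER_ON_AXIS and is_implausible are hypothetical, and the third (schema-aware) condition is not modelled. It classifies a step; whether a processor then reports [err:XPTY0144] remains at its discretion.

```python
# Condition 1: axes that are always empty for a given kind of context node.
# Illustrative subset only.
ALWAYS_EMPTY_AXES = {
    "attribute": {"child", "descendant", "attribute"},
    "text": {"child", "descendant", "attribute"},
    "comment": {"child", "descendant", "attribute"},
    "document-node": {"attribute", "parent", "ancestor",
                      "following-sibling", "preceding-sibling"},
}

# Condition 2: node kinds that can never appear on a given axis.
# Illustrative subset only.
NEVER_ON_AXIS = {
    "child": {"document-node", "attribute", "namespace-node"},
    "ancestor": {"text", "comment", "processing-instruction",
                 "attribute", "namespace-node"},
}


def is_implausible(context_kind, axis, tested_kind=None):
    """Classify an axis step using the two tables above.

    context_kind: inferred node kind of the context value, e.g. "attribute".
    axis:         axis name, e.g. "child".
    tested_kind:  node kind selected by the node test, if it selects one kind.
    """
    if axis in ALWAYS_EMPTY_AXES.get(context_kind, set()):
        return True
    if tested_kind is not None and tested_kind in NEVER_ON_AXIS.get(axis, set()):
        return True
    return False


attr_text = is_implausible("attribute", "child", "text")              # @code/text()
doc_attr = is_implausible("document-node", "attribute", "attribute")  # /@code
anc_text = is_implausible("element", "ancestor", "text")              # ancestor::text()
plain_child = is_implausible("element", "child", "element")           # child::para
```

The first three calls model the examples @code/text(), /@code and ancestor::text() and return True; the last models an ordinary child step and returns False.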

4.7.6 Predicates within Steps

Predicate::="[" Expr "]"
Expr::=(ExprSingle ++ ",")

A predicate within an AxisStep has similar syntax and semantics to a predicate within a filter expression. The only difference is in the way the context position is set for evaluation of the predicate.

Note:

The operator [] binds more tightly than /. This means that the expression a/b[1] is interpreted as child::a/(child::b[1]): it selects the first b child of every a element, in contrast to (a/b)[1] which selects the first b element that is a child of some a element.

A common mistake is to write //a[1] where (//a)[1] is intended. The first expression, //a[1], selects every descendant a element that is the first child of its parent (it expands to /descendant-or-self::node()/child::a[1]), whereas (//a)[1] selects the first a element in the document.
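The difference can be observed with any processor that supports positional predicates. The sketch below uses Python's xml.etree.ElementTree, which implements only a limited subset of XPath 1.0 and has no parenthesized filter expressions, so (//a)[1] is emulated by taking the first item of the list returned for .//a. The sample document is invented for the illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<doc>"
    "<sec><a id='1'/><a id='2'/></sec>"
    "<sec><a id='3'/><a id='4'/></sec>"
    "</doc>"
)

# .//a[1]: the predicate is evaluated within the step, once per parent,
# so each <sec> contributes its own first <a> child.
first_per_parent = [a.get("id") for a in doc.findall(".//a[1]")]

# Emulation of (.//a)[1]: collect all <a> descendants first,
# then take the first item of the combined sequence.
first_in_document = [a.get("id") for a in doc.findall(".//a")[:1]]

print(first_per_parent)   # ['1', '3']
print(first_in_document)  # ['1']
```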

For the purpose of evaluating the context position within a predicate, the input sequence is considered to be sorted as follows: into document order if the predicate is in a forward-axis step, into reverse document order if the predicate is in a reverse-axis step, or in its original order if the predicate is not in a step.

More formally:

  • For a step using a forward axis, such as child::test[P], the result is the same as for the equivalent filter expression (child::test)[P] (note the parentheses). The same applies if there are multiple predicates, for example child::test[P1][P2][P3] is equivalent to (child::test)[P1][P2][P3].

  • For a step using a reverse axis, such as ancestor::test[P], the result is the same as the expression reverse(ancestor::test)[P] => reverse(). The same applies if there are multiple predicates, for example ancestor::test[P1][P2][P3] is equivalent to reverse(ancestor::test)[P1][P2][P3] => reverse().

Note:

The result of the expression preceding-sibling::* is in document order, but preceding-sibling::*[1] selects the last preceding sibling element, that is, the one that immediately precedes the context node.

Similarly, the expression preceding-sibling::x[1, 2, 3] selects the last three preceding siblings, returning them in document order. For example, given the input:

<doc><a/><b/><c/><d/><e/><f/></doc>

The result of //e ! preceding-sibling::*[1, 2, 3] is <b/>, <c/>, <d/>. The expression //e ! preceding-sibling::*[3, 2, 1] delivers exactly the same result.
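The rule for reverse axes (reverse, filter, reverse again) can be traced in Python. This is a minimal sketch limited to numeric predicates: reverse_axis_step is a hypothetical helper, and nodes are represented simply by their element names.

```python
def reverse_axis_step(axis_nodes_doc_order, positions):
    """Apply a numeric predicate within a reverse-axis step.

    axis_nodes_doc_order: the nodes found on the axis, in document order.
    positions: the context positions selected by the predicate (1-based).
    """
    # Context positions are assigned in reverse document order.
    reversed_nodes = list(reversed(axis_nodes_doc_order))
    wanted = set(positions)
    kept = [node for position, node in enumerate(reversed_nodes, start=1)
            if position in wanted]
    # The final result of the step is returned in document order.
    return list(reversed(kept))


siblings = ["a", "b", "c", "d", "e", "f"]
# The preceding siblings of <e/>, in document order.
preceding = siblings[:siblings.index("e")]

nearest = reverse_axis_step(preceding, [1])             # ['d']
last_three = reverse_axis_step(preceding, [1, 2, 3])    # ['b', 'c', 'd']
same_again = reverse_axis_step(preceding, [3, 2, 1])    # ['b', 'c', 'd']
```

Position 1 is the sibling closest to the context node, and the order in which the positions are listed has no effect on the order of the result.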

Here are some examples of axis steps that contain predicates to select XNodes:

  • This example selects the second chapter element that is a child of the context node:

    child::chapter[2]
  • This example selects all the descendants of the context node that are elements named "toy" and whose color attribute has the value "red":

    descendant::toy[attribute::color = "red"]
  • This example selects all the employee children of the context node that have both a secretary child element and an assistant child element:

    child::employee[secretary][assistant]
  • This example selects the innermost div ancestor of the context node:

    ancestor::div[1]
  • This example selects the outermost div ancestor of the context node:

    ancestor::div[last()]
  • This example selects all the ancestor elements of the context node that have an @id attribute, outermost element first:

    ancestor::*[@id]

Note:

The expression ancestor::div[1] parses as an AxisStep with a reverse axis, and the position 1 therefore refers to the first ancestor div in reverse document order, that is, the innermost div. By contrast, (ancestor::div)[1] parses as a FilterExpr, and therefore returns the first qualifying div element in the order of the ancestor::div expression, which is in document order.

The fact that a reverse-axis step assigns context positions in reverse document order for the purpose of evaluating predicates does not alter the fact that the final result of the step is always in document order.

The expression ancestor::(div1|div2)[1] does not have the same meaning as (ancestor::div1|ancestor::div2)[1]. In the first expression, the predicate [1] is within a step that uses a reverse axis, so nodes are counted in reverse document order. In the second expression, the predicate applies to the result of a union expression, so nodes are counted in document order.

When the context value for evaluation of a step includes multiple GNodes, the step is evaluated separately for each of those GNodes, and the results are combined, eliminating duplicates and sorting into document order.

Note:

To avoid reordering and elimination of duplicates, replace the step S by .!S.

4.7.7 Unabbreviated Syntax

This section provides a number of examples of path expressions in which the axis is explicitly specified in each step. The syntax used in these examples is called the unabbreviated syntax. In many common cases, it is possible to write path expressions more concisely using an abbreviated syntax, as explained in 4.7.8 Abbreviated Syntax.

These examples assume that the context value is a single node, referred to as the context node.

  • child::para selects the para element children of the context node.

  • child::(para|bullet) selects the para and bullet element children of the context node.

  • child::* selects all element children of the context node.

  • child::text() selects all text node children of the context node.

  • child::(text()|comment()) selects all text node and comment node children of the context node.

  • child::node() selects all the children of the context node. Note that no attribute nodes are returned, because attributes are not children.

  • attribute::name selects the name attribute of the context node.

  • attribute::* selects all the attributes of the context node.

  • parent::node() selects the parent of the context node. If the context node is an attribute node, this expression returns the element node (if any) to which the attribute node is attached.

  • descendant::para selects the para element descendants of the context node.

  • ancestor::div selects all div ancestors of the context node.

  • ancestor-or-self::div selects the div ancestors of the context node and, if the context node is a div element, the context node as well.

  • descendant-or-self::para selects the para element descendants of the context node and, if the context node is a para element, the context node as well.

  • self::para selects the context node if it is a para element, and otherwise returns an empty sequence.

  • self::(chapter|appendix) selects the context node if it is a chapter or appendix element, and otherwise returns an empty sequence.

  • child::chapter/descendant::para selects the para element descendants of the chapter element children of the context node.

  • child::*/child::para selects all para grandchildren of the context node.

  • / selects the root of the tree that contains the context node, but raises a dynamic error if this root is not a document node.

  • /descendant::para selects all the para elements in the same document as the context node.

  • /descendant::list/child::member selects all the member elements that have a list parent and that are in the same document as the context node.

  • child::para[position() = 1] selects the first para child of the context node.

  • child::para[position() = last()] selects the last para child of the context node.

  • child::para[position() = last()-1] selects the last but one para child of the context node.

  • child::para[position() > 1] selects all the para children of the context node other than the first para child of the context node.

  • following-sibling::chapter[position() = 1] selects the next chapter sibling of the context node.

  • following-sibling::(chapter|appendix)[position() = 1] selects the next sibling of the context node that is either a chapter or an appendix.

  • preceding-sibling::chapter[position() = 1] selects the previous chapter sibling of the context node.

  • /descendant::figure[position() = 42] selects the forty-second figure element in the document containing the context node.

  • /child::book/child::chapter[position() = 5]/child::section[position() = 2] selects the second section of the fifth chapter of the book whose parent is the document node that contains the context node.

  • child::para[attribute::type eq "warning"] selects all para children of the context node that have a type attribute with value warning.

  • child::para[attribute::type eq 'warning'][position() = 5] selects the fifth para child of the context node that has a type attribute with value warning.

  • child::para[position() = 5][attribute::type eq "warning"] selects the fifth para child of the context node if that child has a type attribute with value warning.

  • child::chapter[child::title = 'Introduction'] selects the chapter children of the context node that have one or more title children whose typed value is equal to the string Introduction.

  • child::chapter[child::title] selects the chapter children of the context node that have one or more title children.

  • child::*[self::chapter or self::appendix] selects the chapter and appendix children of the context node.

  • child::*[self::(chapter|appendix)][position() = last()] selects the last chapter or appendix child of the context node.

4.7.8 Abbreviated Syntax

AbbreviatedStep::=".." | ("@" NodeTest) | SimpleNodeTest
NodeTest::=UnionNodeTest | SimpleNodeTest
SimpleNodeTest::=TypeTest | Selector
TypeTest::=RegularItemType | ("type" "(" SequenceType ")")
Selector::=EQName | Wildcard | ("get" "(" ExprSingle ")")
EQName::=QName | URIQualifiedName
Wildcard::="*"
| (NCName ":*")
| ("*:" NCName)
| (BracedURILiteral "*")
/* ws: explicit */
ExprSingle::=ForExpr
| LetExpr
| QuantifiedExpr
| IfExpr
| OrExpr

The abbreviated syntax for a step permits the following abbreviations:

  1. The attribute axis attribute:: can be abbreviated by @. For example, the expression para[@type = "warning"] is short for child::para[attribute::type = "warning"] and so selects para children with a type attribute with value equal to warning.

  2. If the axis name is omitted from an axis step, the default axis is child, with two exceptions: (1) if the NodeTest in an axis step contains an AttributeTest or SchemaAttributeTest then the default axis is attribute; (2) if the NodeTest in an axis step is a NamespaceNodeTest then the default axis is namespace, but in an implementation that does not support the namespace axis, an error is raised [err:XQST0134].

    Note:

    The namespace axis is deprecated as of XPath 2.0, but is required in some languages that use XPath, including XSLT.

    For example, the path expression section/para is an abbreviation for child::section/child::para, and the path expression section/@id is an abbreviation for child::section/attribute::id. Similarly, section/attribute(id) is an abbreviation for child::section/attribute::attribute(id). Note that the latter expression contains both an axis specification and a node test.

    Similarly, within a JTree rooted at an array, the expression get(1)/parts/get(2)/part-no gets the first member of the top-level array (presumably a map), then the "parts" entry within this map (presumably an array), then the second member of this array (presumably a map), and finally the part-no entry within this map.

    Note:

    The same selection could be made using the lookup expression ?1?parts?2?part-no. The main difference is that path expressions offer more flexibility in being able to navigate around the containing JTree. Also, the lookup expression $a?1 fails if the array index is out of bounds; the path expression $a/get(1) (or $a/*[1]) instead returns an empty sequence.

    Note:

    An abbreviated axis step that omits the axis name must use a SimpleNodeTest rather than a UnionNodeTest. This means that a construct such as (ul|ol) is treated as an abbreviation for (child::ul|child::ol) rather than child::(ul|ol). Since the two constructs have exactly the same semantics, this is not actually a restriction.

  3. A step consisting of .. is short for parent::gnode(). For example (assuming the context item is an XNode), ../title is short for parent::gnode()/child::title and so will select the title children of the parent of the context node.

    Similarly, if $dateOfBirth is a JNode resulting from the expression $map/get("date of birth"), then $dateOfBirth/../gender will select the entry having key "gender" within $map.

    Note:

    The expression ., known as a context value reference, is a primary expression, and is described in 4.3.3 Context Value References.

Here are some examples of path expressions that use the abbreviated syntax. These examples assume that the context value is a single XNode, referred to as the context node:

  • para selects the para element children of the context node.

  • * selects all element children of the context node.

  • text() selects all text node children of the context node.

  • @name selects the name attribute of the context node.

  • @(id|name) selects the id and name attributes of the context node.

  • @* selects all the attributes of the context node.

  • para[1] selects the first para child of the context node.

  • para[last()] selects the last para child of the context node.

  • */para selects all para grandchildren of the context node.

  • /book/chapter[5]/section[2] selects the second section of the fifth chapter of the book whose parent is the document node that contains the context node.

  • chapter//para selects the para element descendants of the chapter element children of the context node.

  • //para selects all the para descendants of the root document node and thus selects all para elements in the same document as the context node.

  • //@version selects all the version attribute nodes that are in the same document as the context node.

  • //list/member selects all the member elements in the same document as the context node that have a list parent.

  • .//para selects the para element descendants of the context node.

  • .. selects the parent of the context node.

  • ../@lang selects the lang attribute of the parent of the context node.

  • para[@type = "warning"] selects all para children of the context node that have a type attribute with value warning.

  • para[@type = "warning"][5] selects the fifth para child of the context node that has a type attribute with value warning.

  • para[5][@type = "warning"] selects the fifth para child of the context node if that child has a type attribute with value warning.

  • chapter[title = "Introduction"] selects the chapter children of the context node that have one or more title children whose typed value is equal to the string Introduction.

  • chapter[title] selects the chapter children of the context node that have one or more title children.

  • employee[@secretary and @assistant] selects all the employee children of the context node that have both a secretary attribute and an assistant attribute.

  • book/(chapter|appendix)/section selects every section element that has a parent that is either a chapter or an appendix element, that in turn is a child of a book element that is a child of the context node.

  • If E is any expression that returns a sequence of nodes, then the expression E/. returns the same nodes in document order, with duplicates eliminated based on node identity.

The following examples use abbreviated paths to access data within the JTree obtained by parsing the JSON text:

[
  { "first": "John", 
    "last": "Baker", 
    "date of birth": "2003-04-19", 
    "occupation": "cook"}, 
  { "first": "Mary", 
    "last": "Smith", 
    "date of birth": "2006-08-12", 
    "occupation": "teacher"}
]

  • get(1)/first returns a JNode whose ·content· is the string "John".

  • //first[. = "Mary"]/../last returns a JNode whose ·content· is the string "Smith".

  • //first[. = "Mary"]/../get("date of birth") returns a JNode whose ·content· is the string "2006-08-12".

  • //*[occupation = "cook"]!`{first} {last}` returns the string "John Baker".

  • //*[occupation = "cook"]/following-sibling::*[1]!`{first} {last}` returns the string "Mary Smith".

  • //*[last = "Smith"]/../get(1)/last returns the string "Baker".

  • //record(first, last, *) ! string(last) returns the sequence of two strings "Baker", "Smith".
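To make the role of the parent step concrete, the following Python sketch builds a simple stand-in for a JTree over the same data and evaluates two of the paths above by hand. The JNode class here is a hypothetical model, not the data model's definition: it records only content, selector and parent, and it creates fresh child objects on each call, which is sufficient for this illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass(eq=False)
class JNode:
    """Minimal stand-in for a JNode: content, selector and parent."""
    content: Any
    selector: Any = None
    parent: Optional[JNode] = None

    def children(self) -> list:
        """Children of a map (keyed by entry key) or array (keyed by position)."""
        if isinstance(self.content, dict):
            pairs = self.content.items()
        elif isinstance(self.content, list):
            pairs = enumerate(self.content, start=1)
        else:
            pairs = ()
        return [JNode(value, key, self) for key, value in pairs]

    def descendants(self) -> Iterator[JNode]:
        """All descendants, in document order."""
        for child in self.children():
            yield child
            yield from child.descendants()


people = json.loads("""
[
  { "first": "John", "last": "Baker",
    "date of birth": "2003-04-19", "occupation": "cook" },
  { "first": "Mary", "last": "Smith",
    "date of birth": "2006-08-12", "occupation": "teacher" }
]
""")
root = JNode(people)

# get(1)/first
first_name = [entry.content
              for member in root.children() if member.selector == 1
              for entry in member.children() if entry.selector == "first"]

# //first[. = "Mary"]/../last
marys = [n for n in root.descendants()
         if n.selector == "first" and n.content == "Mary"]
last_names = [entry.content
              for mary in marys
              for entry in mary.parent.children() if entry.selector == "last"]
```

The step .. is possible only because each JNode remembers where it was found; the underlying map and array values carry no parent information of their own.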

4.7.9 Comparison with JSONPath

Path expressions applied to a JTree offer similar capability to JSONPath, which is an XPath-like language designed for querying JSON.

Example: Comparison with JSONPath

This example provides XPath equivalents to some examples given in the JSONPath specification. [TODO: add a reference].

The examples query the result of parsing the following JSON value, representing a store whose stock consists of four books and a bicycle:

{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 399
    }
  }
}

The following table illustrates some queries on this data, expressed both in JSONPath and in XPath 4.0.

JSONPath vs XPath 4.0 Comparison
Query | JSONPath | XPath 4.0
The authors of all books in the store | $.store.book[*].author | /store/book//author
All authors | $..author | //author
All things in store (four books and a red bicycle) | $.store.* | /store/*
The prices of everything in the store | $.store..price | /store//price
The third book | $..book[2] | //book/*[3]
The third book's author | $..book[2].author | //book/*[3]/author
The third book's publisher (empty result) | $..book[2].publisher | //book/*[3]/publisher
The last book (in order) | $..book[-1] | //book/*[last()]
The first two books | $..book[0,1] | //book/*[1,2]
All books with an ISBN | $..book[?@.isbn] | //book[isbn]
All books cheaper than 10 | $..book[?@.price<10] | //book[price lt 10]
All member values and array elements contained in the input value | $..* | //*
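
As a rough cross-check of the table, the following sketch evaluates a few of the queries in plain Python over the parsed JSON. It is neither a JSONPath nor an XPath implementation: the helper names descendants and values_for_key are hypothetical, and the traversal only approximates the descendant semantics of `..` and `//`.

```python
import json

STORE_JSON = """
{ "store": {
    "book": [
      {"category": "reference", "author": "Nigel Rees",
       "title": "Sayings of the Century", "price": 8.95},
      {"category": "fiction", "author": "Evelyn Waugh",
       "title": "Sword of Honour", "price": 12.99},
      {"category": "fiction", "author": "Herman Melville",
       "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
      {"category": "fiction", "author": "J. R. R. Tolkien",
       "title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99}
    ],
    "bicycle": {"color": "red", "price": 399}
} }
"""

def descendants(value):
    """Yield value itself, then every nested member value or array element, depth-first."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from descendants(child)
    elif isinstance(value, list):
        for child in value:
            yield from descendants(child)

def values_for_key(root, key):
    """Collect the value of every entry named `key` found anywhere under root."""
    return [v[key] for v in descendants(root) if isinstance(v, dict) and key in v]

data = json.loads(STORE_JSON)
books = data["store"]["book"]

all_authors = values_for_key(data, "author")           # $..author        //author
store_prices = values_for_key(data["store"], "price")  # $.store..price   /store//price
third_book = books[2]                                  # $..book[2]       //book/*[3]
last_book = books[-1]                                  # $..book[-1]      //book/*[last()]
with_isbn = [b for b in books if "isbn" in b]          # $..book[?@.isbn]
cheap = [b for b in books if b["price"] < 10]          # $..book[?@.price<10]
```

Python list indexes, like JSONPath array indexes, are zero-based, whereas XPath positions are one-based; this is why the third book is `[2]` in the JSONPath column and `[3]` in the XPath column.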

4.20 Simple map operator (!)

SimpleMapExpr ::= PathExpr ("!" PathExpr)*
PathExpr      ::= AbsolutePathExpr
                | RelativePathExpr
                  /* xgc: leading-lone-slash */

A mapping expression S!E evaluates the expression E once for every item in the sequence obtained by evaluating S. The simple mapping operator ! can be applied to any sequence, regardless of the types of its items, and it can deliver a mixed sequence of nodes, atomic items, and functions. Unlike the similar / operator, it does not sort nodes into document order or eliminate duplicates.

Each operation E1!E2 is evaluated as follows: Expression E1 is evaluated to a sequence S. Each item in S then serves in turn to provide an inner focus (the item as the context value, its position in S as the context position, the length of S as the context size) for an evaluation of E2 in the dynamic context. The sequences resulting from all the evaluations of E2 are combined as follows: Every evaluation of E2 returns a (possibly empty) sequence of items. The final result is the sequence concatenation of these sequences. The returned sequence preserves the orderings within and among the subsequences generated by the evaluations of E2.
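
The evaluation rule above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of the specification: the function name simple_map is hypothetical, and E2 is represented as a callable that receives the inner focus (item, position, size) and returns a sequence.

```python
from typing import Any, Callable, Iterable, List

def simple_map(s: Iterable[Any],
               e2: Callable[[Any, int, int], Iterable[Any]]) -> List[Any]:
    """Model E1 ! E2: evaluate e2 once per item of s and concatenate the results."""
    items = list(s)              # the sequence S obtained by evaluating E1
    size = len(items)            # context size for every evaluation of E2
    result: List[Any] = []
    for position, item in enumerate(items, start=1):  # context positions start at 1
        result.extend(e2(item, position, size))       # plain sequence concatenation
    return result

# string-join((1 to $n) ! "*") with $n = 5
stars = "".join(simple_map(range(1, 6), lambda item, pos, size: ["*"]))

# $values ! (.*.) => sum() with $values = (1, 2, 3)
sum_of_squares = sum(simple_map([1, 2, 3], lambda item, pos, size: [item * item]))

# The position and size of the inner focus are visible to E2
labels = simple_map(["a", "b", "c"], lambda item, pos, size: [f"{item}:{pos}/{size}"])

# An evaluation of E2 that returns an empty sequence contributes nothing
evens = simple_map(range(1, 7), lambda item, pos, size: [item] if item % 2 == 0 else [])
```

No sorting or de-duplication takes place: the result order follows the order of S and, within each item, the order produced by E2.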

Simple map operators have functionality similar to 4.7.3 Path Operator (/). The following table summarizes the differences between these two operators.

Operator | Path operator (E1 / E2) | Simple map operator (E1 ! E2)
E1 | Any sequence of nodes | Any sequence of items
E2 | Either a sequence of nodes or a sequence of non-node items | A sequence of items
Additional processing | Duplicate elimination and document ordering | Simple sequence concatenation
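
The difference in additional processing can be illustrated with a small Python model. It is a sketch under stated assumptions, not a definition of either operator: the Node class and its order field are hypothetical stand-ins for node identity and document order, and the names simple_map and path_step are invented for the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    order: int  # hypothetical stand-in for the node's position in document order

def simple_map(e1, e2):
    """Model E1 ! E2: concatenate the results of e2 in the order they are produced."""
    out = []
    for item in e1:
        out.extend(e2(item))
    return out

def path_step(e1, e2):
    """Model E1 / E2: as simple_map, plus the extra rules that apply to nodes."""
    items = list(e1)
    if not all(isinstance(x, Node) for x in items):
        raise TypeError("E1 must be a sequence of nodes")
    combined = simple_map(items, e2)
    is_node = [isinstance(x, Node) for x in combined]
    if all(is_node):
        unique = {n.order: n for n in combined}      # eliminate duplicates
        return [unique[k] for k in sorted(unique)]   # sort into document order
    if not any(is_node):
        return combined                              # non-node items: concatenation only
    raise TypeError("E2 must not mix nodes and non-node items")

a, b, c = Node("a", 1), Node("b", 2), Node("c", 3)
step = lambda n: [n, b]  # every evaluation returns the item itself followed by node b

mapped = [n.name for n in simple_map([c, a], step)]  # order and duplicates retained
pathed = [n.name for n in path_step([c, a], step)]   # de-duplicated and reordered
```

Here mapped retains the duplicate b and the order in which items were produced, while pathed contains each node once, in document order.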

The following examples illustrate the use of simple map operators combined with path expressions.

  • child::div1 / child::para / string() ! concat("id-", .)

    Selects the para element children of the div1 element children of the context node; that is, the para element grandchildren of the context node that have div1 parents. It then outputs the strings obtained by prepending "id-" to each of the string values of these grandchildren.

  • $emp ! (@first, @middle, @last)

    Returns the values of the attributes first, middle, and last for each element in $emp, in the order given. (The / operator, if used here, would return the attributes in an unpredictable order.)

  • $docs ! ( //employee)

    Returns all the employee elements within all the documents identified by the variable $docs, in document order within each document, but retaining the order of documents.

  • avg( //employee / salary ! translate(., '$', '') ! number(.))

    Returns the average salary of the employees, having converted each salary to a number by removing any $ sign. (The second occurrence of ! could not be written as / because the left-hand operand of / cannot be an atomic item.)

  • string-join((1 to $n) ! "*")

    Returns a string containing $n asterisks.

  • $values ! (.*.) => sum()

    Returns the sum of the squares of a sequence of numbers.

  • string-join(ancestor::* ! name(), '/')

    Returns the names of ancestor elements, joined by / characters, i.e., the path to the parent of the context.

I Change Log (Non-Normative)

  1. If a section of this specification has been updated since version 3.1, an overview of the changes is provided, along with links to navigate to the next or previous change.

    See 1 Introduction

  2. Sections with significant changes are marked with a ✭ symbol in the table of contents.

    See 1 Introduction

  3. PR 691 2154 

    Enumeration types are added as a new kind of ItemType, constraining the value space of strings.

    See 3.2.6 Enumeration Types

  4. Setting the default namespace for elements and types to the special value ##any causes an unprefixed element name to act as a wildcard, matching by local name regardless of namespace.

    See 3.2.7.2 Element Types

  5. The terms FunctionType, ArrayType, MapType, and RecordType replace FunctionTest, ArrayTest, MapTest, and RecordTest, with no change in meaning.

    See 3.2.8.1 Function Types

  6. Record types are added as a new kind of ItemType, constraining the value space of maps.

    See 3.2.8.3 Record Types

  7. Function coercion now allows a function with arity N to be supplied where a function of arity greater than N is expected. For example, this allows the function true#0 to be supplied where a predicate function is required.

    See 3.4.4 Function Coercion

  8. The symbols × and ÷ can be used for multiplication and division.

    See 4.9 Arithmetic Expressions

  9. PR 1763 1830 

    The syntax on the right-hand side of an arrow operator has been relaxed; a dynamic function call no longer needs to start with a variable reference or a parenthesized expression, it can also be (for example) an inline function expression or a map or array constructor.

    See 4.21 Arrow Expressions

  10. The arrow operator => is now complemented by a “mapping arrow” operator =!> which applies the supplied function to each item in the input sequence independently.

    See 4.21.2 Mapping Arrow Expressions

  11. PR 1023 1128 

    It has been clarified that function coercion applies even when the supplied function item matches the required function type. This is to ensure that arguments supplied when calling the function are checked against the signature of the required function type, which might be stricter than the signature of the supplied function item.

    See 3.4.4 Function Coercion

  12. A dynamic function call can now be applied to a sequence of functions, and in particular to an empty sequence. This makes it easier to chain a sequence of calls.

    See 4.6.3.1 Evaluating Dynamic Function Calls

  13. The syntax document-node(N), where N is a NameTestUnion, is introduced as an abbreviation for document-node(element(N)). For example, document-node(*) matches any well-formed XML document (as distinct from a document fragment).

    See 3.2.7 Node Types

  14. QName literals are new in 4.0.

    See 4.3.1.3 QName Literals

  15. Path expressions are extended to handle JNodes (found in trees of maps and arrays) as well as XNodes (found in trees representing parsed XML).

    See 4.7 Path Expressions

  16. A method call invokes a function held as the value of an entry in a map, supplying the map implicitly as the value of the first argument.

    See 4.14.4 Method Calls

  17. The treat as expression now raises a type error rather than a dynamic error when it fails.

    See 4.18.5 Treat

  18. XPath 4.0 allows an XPath expression to be preceded by namespace declarations, allowing namespace prefixes to be bound within the XPath expression, rather than relying entirely on the host language to declare namespace prefixes.

    See 4.1 Namespace Declarations

  19. PR 28 

    Multiple for and let clauses can be combined in an expression without an intervening return keyword.

    See 4.13.1 For Expressions

    See 4.13.2 Let Expressions

  20. PR 159 

    Keyword arguments are allowed on static function calls, as well as positional arguments.

    See 4.6.1.1 Static Function Call Syntax

  21. PR 202 

    The presentation of the rules for the subtype relationship between sequence types and item types has been substantially rewritten to improve clarity; no change to the semantics is intended.

    See 3.3 Subtype Relationships

  22. PR 230 

    The rules for “errors and optimization” have been tightened up to disallow many cases of optimizations that alter error behavior. In particular there are restrictions on reordering the operands of and and or, and of predicates in filter expressions, in a way that might allow the processor to raise dynamic errors that the author intended to prevent.

    See 2.5.5 Guarded Expressions

    See 4.12 Logical Expressions

  23. PR 254 

    The term "function conversion rules" used in 3.1 has been replaced by the term "coercion rules".

    See 3.4 Coercion Rules

    The coercion rules allow “relabeling” of a supplied atomic item where the required type is a derived atomic type: for example, it is now permitted to supply the value 3 when calling a function that expects an instance of xs:positiveInteger.

    See 3.4 Coercion Rules

  24. PR 284 

    Alternative syntax for conditional expressions is available: if (condition) { X }.

    See 4.15 Conditional Expressions

  25. PR 286 

    Element and attribute tests can include alternative names: element(chapter|section), attribute(role|class).

    See 3.2.7 Node Types

    The NodeTest in an AxisStep now allows alternatives: ancestor::(section|appendix)

    See 3.2.7 Node Types

    Element and attribute tests of the form element(N) and attribute(N) now allow N to be any NameTest, including a wildcard.

    See 3.2.7.2 Element Types

    See 3.2.7.3 Attribute Types

  26. PR 324 

    String templates provide a new way of constructing strings: for example `{$greeting}, {$planet}!` is equivalent to $greeting || ', ' || $planet || '!'

    See 4.10.2 String Templates

  27. PR 326 

    Support for higher-order functions is now a mandatory feature (in 3.1 it was optional).

    See 5 Conformance

  28. PR 344 

    A for member clause is added to FLWOR expressions to allow iteration over an array.

    See 4.13.1 For Expressions

  29. PR 368 

    The concept of the context item has been generalized, so it is now a context value. That is, it is no longer constrained to be a single item.

    See 2.2.2 Dynamic Context

  30. PR 433 

    Numeric literals can now be written in hexadecimal or binary notation; and underscores can be included for readability.

    See 4.3.1.1 Numeric Literals

  31. PR 519 

    The rules for tokenization have been largely rewritten. In some cases the revised specification may affect edge cases that were handled in different ways by different 3.1 processors, which could lead to incompatible behavior.

    See A.3 Lexical structure

  32. PR 521 

    New abbreviated syntax is introduced (focus function) for simple inline functions taking a single argument. An example is fn { ../@code }

    See 4.6.6 Inline Function Expressions

  33. PR 603 

    The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.

    See 2.5.6 Implausible Expressions

    See 4.7.5.6 Implausible Axis Steps

  34. PR 606 

    Element and attribute tests of the form element(A|B) and attribute(A|B) are now allowed.

    See 3.2.7.2 Element Types

    See 3.2.7.3 Attribute Types

  35. PR 728 

    The syntax record(*) is allowed; it matches any map.

    See 3.2.8.3 Record Types

  36. PR 815 

    The coercion rules now allow conversion in either direction between xs:hexBinary and xs:base64Binary.

    See 3.4 Coercion Rules

  37. PR 911 

    The coercion rules now allow any numeric type to be implicitly converted to any other; for example, an xs:double is accepted where the required type is xs:decimal.

    See 3.4 Coercion Rules

  38. PR 996 

    The value of a predicate in a filter expression can now be a sequence of integers.

    See 4.5 Filter Expressions

  39. PR 1031 

    An otherwise operator is introduced: A otherwise B returns the value of A, unless it is an empty sequence, in which case it returns the value of B.

    See 4.16 Otherwise Expressions

  40. PR 1071 

    In map constructors, the keyword map is now optional, so map { 0: false(), 1: true() } can now be written { 0: false(), 1: true() }, provided it is used in a context where this creates no ambiguity.

    See 4.14.1.1 Map Constructors

  41. PR 1131 

    A positional variable can be defined in a for expression.

    See 4.13.1 For Expressions

    The type of a variable used in a for expression can be declared.

    See 4.13.1 For Expressions

    The type of a variable used in a let expression can be declared.

    See 4.13.2 Let Expressions

  42. PR 1132 

    Choice item types (an item type allowing a set of alternative item types) are introduced.

    See 3.2.5 Choice Item Types

  43. PR 1163 

    Filter expressions for maps and arrays are introduced.

    See 4.14.5 Filter Expressions for Maps and Arrays

  44. PR 1181 

    The default namespace for elements and types can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace.

    See 2.2.1 Static Context

    If the default namespace for elements and types has the special value ##any, then an unprefixed name in a NameTest acts as a wildcard, matching names in any namespace or none.

    See 4.7.5.2 Node Tests

  45. PR 1197 

    The keyword fn is allowed as a synonym for function in function types, to align with changes to inline function declarations.

    See 3.2.8.1 Function Types

    In inline function expressions, the keyword function may be abbreviated as fn.

    See 4.6.6 Inline Function Expressions

  46. PR 1212 

    New keywords introducing item types, such as record, item, and enum, have been added to the list of reserved function names.

    See A.4 Reserved Function Names

  47. PR 1217 

    Predicates in filter expressions for maps and arrays can now be numeric.

    See 4.14.5 Filter Expressions for Maps and Arrays

  48. PR 1249 

    A for key/value clause is added to FLWOR expressions to allow iteration over maps.

    See 4.13.1 For Expressions

  49. PR 1250 

    Several decimal format properties, including minus sign, exponent separator, percent, and per-mille, can now be rendered as arbitrary strings rather than being confined to a single character.

    See 2.2.1.2 Decimal Formats

  50. PR 1265 

    The rules regarding the document-uri property of nodes returned by the fn:collection function have been relaxed.

    See 2.2.2 Dynamic Context

  51. PR 1344 

    Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped.

    See 2.2.1 Static Context

    The static typing option has been dropped.

    See 2.4 Processing Model

    The static typing feature has been dropped.

    See 5 Conformance

  52. PR 1361 

    The term atomic value has been replaced by atomic item.

    See 2.1.3 Values

  53. PR 1384 

    If a type declaration is present, the supplied values in the input sequence are now coerced to the required type. Type declarations are now permitted in XPath as well as XQuery.

    See 4.17 Quantified Expressions

  54. PR 1496 

    The context value static type, which was there purely to assist in static typing, has been dropped.

    See 2.2.1 Static Context

  55. PR 1498 

    The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.

    See 2.1 Terminology

    The EBNF notation has been extended to allow the constructs (A ++ ",") (one or more occurrences of A, comma-separated) and (A ** ",") (zero or more occurrences of A, comma-separated).

    See 2.1.1 Grammar Notation

    The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.

    See A.1 EBNF

    See A.1.1 Notation

  56. PR 1501 

    The coercion rules now apply recursively to the members of an array and the entries in a map.

    See 3.4 Coercion Rules

  57. PR 1532 

    Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.

    See 4.7.5.1 Axes

  58. PR 1577 

    The syntax record() is allowed; the only thing it matches is an empty map.

    See 3.2.8.3 Record Types

  59. PR 1686 

    With the pipeline operator ->, the result of an expression can be bound to the context value before evaluating another expression.

    See 4.19 Pipeline operator

  60. PR 1696 

    Parameter names may be included in a function signature; they are purely documentary.

    See 3.2.8.1 Function Types

  61. PR 1703 

    Ordered maps are introduced.

    See 4.14.1 Maps

    The order of key-value pairs in the map constructor is now retained in the constructed map.

    See 4.14.1.1 Map Constructors

  62. PR 1874 

    The coercion rules now reorder the entries in a map when the required type is a record type.

    See 3.4 Coercion Rules

  63. PR 1898 

    The rules for subtyping of document node types have been refined.

    See 3.3.2.5.2 Subtyping Nodes: Document Nodes

  64. PR 1991 

    Named record types used in the signatures of built-in functions are now available as standard in the static context.

    See 2.2.1 Static Context

  65. PR 2031 

    The terms XNode and JNode are introduced; the existing term node remains in use as a synonym for XNode where the context does not specify otherwise.

    See 2.1.3 Values

    JNodes are introduced.

    See 3.2.9 Generalized Node Types

  66. PR 2055 

    Sequences, arrays, and maps can be destructured in a let expression to extract their components into multiple variables.

    See 4.13.2 Let Expressions

  67. PR 2094 

    A general expression is allowed within a map constructor; this facilitates the creation of maps in which the presence or absence of particular keys is decided dynamically.

    See 4.14.1.1 Map Constructors

  68. PR 2115 

    This section describes and formalizes a convention that was already in use, but not explicitly stated, in earlier versions of the specification.

    See 2.1.2 Expression Names

  69. PR 2130 

    Operator is-not is introduced, as a complement to the operator is.

    See 4.11.3 GNode Comparisons

    Operators precedes and follows are introduced as synonyms for operators << and >>.

    See 4.11.3 GNode Comparisons

  70. PR 2134 

    The lookup operator ? can now be followed by an arbitrary literal, for cases where keys are items other than integers or NCNames. It can also be followed by a variable reference or a context value reference.

    See 4.14.3 Lookup Expressions

  71. PR 2176 

    Operators precedes-or-is and follows-or-is are introduced as synonyms for the union of operators << and is and for the union of operators >> and is, respectively.

    See 4.11.3 GNode Comparisons

  72. PR 2202 

    The type schema-element(N) is now defined to be a subtype of element() and of various other element tests.

    See 3.3.2.5.3 Subtyping Nodes: Elements

    The type schema-attribute(N) is now defined to be a subtype of attribute() and of various other attribute tests.

    See 3.3.2.5.4 Subtyping Nodes: Attributes

  73. PR 2213 

    This section (“External Resources and Security”) is new.

    See 2.3 External Resources and Security

  74. PR 2218 

    The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double; instead, the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.

    See 4.11.1 Value Comparisons

    The rules for comparing untyped atomic items with numeric values have changed. Rather than converting an untyped atomic item unconditionally to xs:double, it is now converted to the type of the numeric operand. This is designed to ensure that comparisons such as <a>1.1</a> = 1.1 succeed, given that the values will now be compared as decimals rather than as doubles.

    See 4.11.2 General Comparisons

  75. PR 2227 

    A URIQualifiedName may now supply a prefix as well as a URI and local name.

    See 2.1.4 Namespaces and QNames