XML Path Language (XPath) 4.0 WG Review Draft

2 Basics

2.2 Expression Context

[Definition: The expression context for a given expression consists of all the information that can affect the result of the expression.]

This information is organized into two categories called the static context and the dynamic context.

2.2.1 Static Context

Changes in 4.0

The default namespace for elements and types can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace. [Issue 296 PR 1181 30 April 2024]
Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped. [Issue 1343 PR 1344 23 September 2024]
The context value static type, which was there purely to assist in static typing, has been dropped. [Issue 1495 PR 1496 29 October 2024]

[Definition: The static context of an expression is the information that is available during static analysis of the expression, prior to its evaluation.] This information can be used to decide whether the expression contains a static error.

The individual components of the static context are described below.

In XPath 4.0, the static context for an expression is largely defined by the host language, that is, by the calling environment that causes an XPath expression to be evaluated. Most of the static context components are constant throughout an expression; the only exception is in-scope variables. (There are constructs in the language, such as the ForExpr and LetExpr, that add additional variables to the static context of their subexpressions.)

Some components of the static context, but not all, also affect the dynamic semantics of expressions. For example, casting of a string such as "xbrl:xbrl" to an xs:QName might expand the prefix xbrl to the namespace URI http://www.xbrl.org/2003/instance using the statically known namespaces from the static context; since the input string "xbrl:xbrl" is in general not known until execution time (it might be read from a source document), this means that the values of the statically known namespaces must be available at execution time.
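The dependency described above can be illustrated with a minimal Python sketch. The function name `cast_to_qname` and the dictionary used to represent the statically known namespaces are assumptions made for illustration only; they are not defined by this specification, and error handling is reduced to a Python `ValueError`.

```python
# Illustrative model only: statically known namespaces as a prefix -> URI mapping.
STATICALLY_KNOWN_NAMESPACES = {
    "xbrl": "http://www.xbrl.org/2003/instance",
    "xs": "http://www.w3.org/2001/XMLSchema",
}


def cast_to_qname(lexical, namespaces):
    """Expand a prefixed lexical QName into a (namespace URI, local name) pair.

    Only the prefixed case is modelled; an undeclared prefix is reported
    with a Python ValueError rather than a real XPath error code.
    """
    prefix, sep, local = lexical.partition(":")
    if not sep:
        raise ValueError("unprefixed names are outside the scope of this sketch")
    if prefix not in namespaces:
        raise ValueError(f"no namespace binding for prefix {prefix!r}")
    return (namespaces[prefix], local)


# The string is only known at execution time (for example, read from a document),
# so the prefix bindings must still be available when the cast is evaluated.
runtime_value = "xbrl:xbrl"
expanded = cast_to_qname(runtime_value, STATICALLY_KNOWN_NAMESPACES)
```

The point of the sketch is that `namespaces` is an input to the cast at execution time, even though its content was fixed during static analysis.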

[Definition: XPath 1.0 compatibility mode. This value is true if rules for backward compatibility with XPath Version 1.0 are in effect; otherwise it is false.]
[Definition: Statically known namespaces. This is a mapping from prefix to namespace URI that defines all the namespaces that are known during static processing of a given expression.]
The URI value is whitespace-normalized according to the rules for the xs:anyURI type in Section 3.2.17 anyURI ^XS1-2 or Section 3.3.17 anyURI ^XS11-2.
The statically known namespaces may include a binding for the zero-length prefix; however, this is used only in limited circumstances because the rules for resolving unprefixed QNames depend on how such a name is used.
Note the difference between in-scope namespaces, which is a dynamic property of an element node, and statically known namespaces, which is a static property of an expression.
[Definition: Default namespace for elements and types. This is either a namespace URI, or the special value "##any", or absent^DM. This indicates how unprefixed QNames are interpreted when they appear in a position where an element name or type name is expected.]
- If the value is set to a namespace URI, this namespace is used for any such unprefixed QName. The URI value is whitespace-normalized according to the rules for the xs:anyURI type in Section 3.2.17 anyURI ^XS1-2 or Section 3.3.17 anyURI ^XS11-2.
- The special value "##any" indicates that:
  - When an unprefixed QName is used as a name test for selecting named elements in an axis step, the name test will match an element having the specified local name, in any namespace or none.
  - When an unprefixed QName is used in a context where a type name is expected (but not as a function name), the default namespace is the xs namespace, http://www.w3.org/2001/XMLSchema.
  - In any other context, an unprefixed QName represents a name in no namespace.
- If the value is absent^DM, an unprefixed QName representing an element or type name is interpreted as being in no namespace.
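The rules above for interpreting an unprefixed element or type name might be sketched as follows. This is an informal Python model, not normative text: the function name `interpret_unprefixed`, the three usage labels, and the use of `"*"` to stand for "any namespace or none" are all hypothetical.

```python
XS_NAMESPACE = "http://www.w3.org/2001/XMLSchema"
ANY = "##any"


def interpret_unprefixed(local_name, default_ns, usage):
    """Return (namespace, local_name) for an unprefixed element or type name.

    default_ns is a URI string, the string "##any", or None (absent).
    usage is one of the hypothetical labels "axis-step-name-test",
    "type-name", or "other"; all three denote positions where an element
    name or type name is expected. The namespace in the result is a URI,
    None for "no namespace", or "*" for "any namespace or none".
    """
    if default_ns is None:
        return (None, local_name)
    if default_ns != ANY:
        return (default_ns, local_name)
    if usage == "axis-step-name-test":
        return ("*", local_name)
    if usage == "type-name":
        return (XS_NAMESPACE, local_name)
    return (None, local_name)
```

For instance, with the default set to "##any", `interpret_unprefixed("integer", "##any", "type-name")` yields the xs namespace, while the same name used as a name test in an axis step matches in any namespace.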
[Definition: Default function namespace. This is either a namespace URI, or absent^DM. The namespace URI, if present, is used for any unprefixed QName appearing in a position where a function name is expected.] The URI value is whitespace-normalized according to the rules for the xs:anyURI type in Section 3.2.17 anyURI ^XS1-2 or Section 3.3.17 anyURI ^XS11-2.
In its simplest form its value is simply a whitespace-normalized xs:anyURI value (most commonly, the URI http://www.w3.org/2005/xpath-functions) to be used as the default namespace for unprefixed function names. However, the use of a more complex algorithm is not precluded, for example an algorithm which searches multiple namespaces for a matching name.
[Definition: In-scope schema definitions is a generic term for all the element declarations, attribute declarations, and schema type definitions that are in scope during static analysis of an expression.] It includes the following three parts:
- [Definition: In-scope schema types. Each schema type definition is identified either by an expanded QName (for a named type) or by an implementation-dependent type identifier (for an anonymous type). The in-scope schema types include the predefined schema types described in 3.5 Schema Types. ]
- [Definition: In-scope element declarations. Each element declaration is identified either by an expanded QName (for a top-level element declaration) or by an implementation-dependent element identifier (for a local element declaration). ] An element declaration includes information about the element’s substitution group affiliation.
  [Definition: Substitution groups are defined in Section 2.2.2.2 Element Substitution Group ^XS1-1 and Section 2.2.2.2 Element Substitution Group ^XS11-1. Informally, the substitution group headed by a given element (called the head element) consists of the set of elements that can be substituted for the head element without affecting the outcome of schema validation.]
- [Definition: In-scope attribute declarations. Each attribute declaration is identified either by an expanded QName (for a top-level attribute declaration) or by an implementation-dependent attribute identifier (for a local attribute declaration). ]
[Definition: In-scope variables. This is a mapping from expanded QName to type. It defines the set of variables that are available for reference within an expression. The expanded QName is the name of the variable, and the type is the static type of the variable.]
An expression that binds a variable extends the in-scope variables, within the scope of the variable, with the variable and its type. Within the body of an inline function expression, the in-scope variables are extended by the names and types of the function parameters.
[Definition: In-scope named item types. This is a mapping from expanded QName to named item types.]
[Definition: A named item type is an ItemType identified by an expanded QName.]
Named item types serve two purposes:
- They allow frequently used item types, especially complex item types such as record types, to be given simple names, to avoid repeating the definition every time it is used.
- They allow the definition of recursive types, which are useful for describing recursive data structures such as lists and trees. For details see 3.2.8.3.1 Recursive Record Types.
Note:
Named item types can be defined in a host language such as XQuery 4.0 and in XSLT 4.0, but not in XPath 4.0 itself. They are available in XPath only if the host language provides the ability to define them.
[Definition: Statically known function definitions. This is a set of function definitions.]
Function definitions are described in 2.2.1.1 Function Definitions.
[Definition: Statically known collations. This is an implementation-defined mapping from URI to collation. It defines the names of the collations that are available for use in processing expressions.] [Definition: A collation is a specification of the manner in which strings and URIs are compared and, by extension, ordered. For a more complete definition of collation, see Section 5.3 Comparison of strings^FO.]
[Definition: Static Base URI. This is an absolute URI, used to resolve relative URIs during static analysis. ] For example, it is used to resolve module location URIs in XQuery, and the URIs in xsl:import and xsl:include in XSLT. If E is a subexpression of F then the Static Base URI of E is the same as the Static Base URI of F. There are no constructs in XPath that require resolution of relative URI references during static analysis.
Relative URI references are resolved as described in 2.5.6 Resolving a Relative URI Reference.
At execution time, relative URIs supplied to functions such as fn:doc are resolved against the Executable Base URI, which may or may not be the same as the Static Base URI.
[Definition: Statically known decimal formats. This is a mapping from QNames to decimal formats, with one default format that has no visible name, referred to as the unnamed decimal format. Each format is available for use when formatting numbers using the fn:format-number function.]
Decimal formats are described in 2.2.1.2 Decimal Formats.

3 Types

As noted in 2.1.2 Values, every value in XPath 4.0 is regarded as a sequence of zero, one, or more items. The type system of XPath 4.0, described in this section, classifies the kinds of value that the language can handle, and the operations permitted on different kinds of value.

The type system of XPath 4.0 is related to the type system of [XML Schema 1.0] or [XML Schema 1.1] in two ways:

- Atomic items in XPath 4.0 (which are one kind of item) have atomic types such as xs:string, xs:boolean, and xs:integer. These types are taken directly from their definitions in [XML Schema 1.0] or [XML Schema 1.1].
- Nodes (which are another kind of item) have a property called a type annotation which determines the type of their content. The type annotation is a schema type. The type annotation of a node must not be confused with the item type of the node. For example, an element <age>23</age> might have been validated against a schema that defines this element as having xs:integer content. If this is the case, the type annotation of the node will be xs:integer, and in the XPath 4.0 type system, the node will match the item type element(age, xs:integer).

This chapter of the specification starts by defining sequence types and item types, which describe the range of values that can be bound to variables, used in expressions, or passed to functions. It then describes how these relate to schema types, that is, the simple and complex types defined in an XSD schema.

Note:

In many situations the terms item type and sequence type are used interchangeably to refer either to the type itself, or to the syntactic construct that designates the type: so in the expression $x instance of xs:string*, the construct xs:string* uses the SequenceType syntax to designate a sequence type whose instances are sequences of strings. When more precision is required, the specification is careful to use the terms item type and sequence type to refer to the actual types, while using the production names ItemType and SequenceType to refer to the syntactic designators of these types.

3.2 Item Types

[Definition: An item type is a type that can be expressed using the ItemType syntax, which forms part of the SequenceType syntax. Item types match individual items.]

Note:

While this definition is adequate for the purpose of defining the syntax of XPath 4.0, it ignores the fact that there are also item types that cannot be expressed using XPath 4.0 syntax: specifically, item types that reference an anonymous simple type or complex type defined in a schema. Such types can appear as type annotations on nodes following schema validation.

In most cases, the set of items matched by an item type consists either exclusively of atomic items, exclusively of nodes, or exclusively of function items^DM. Exceptions include the generic types item(), which matches all items, xs:error, which matches no items, and choice item types, which can match any combination of types.

[Definition: An item type designator is a syntactic construct conforming to the grammar rule ItemType. An item type designator is said to designate an item type.]

Note:

Two item type designators may designate the same item type. For example, element() and element(*) are equivalent, as are attribute(A) and attribute(A, xs:anySimpleType).

Lexical QNames appearing in an item type designator have their prefixes expanded to namespace URIs by means of the statically known namespaces and (where applicable) the default namespace for elements and types. Equality of QNames is defined by the eq operator.

`ItemType`	::=	`AnyItemTest \| TypeName \| KindTest \| FunctionType \| MapType \| ArrayType \| RecordType \| EnumerationType \| ChoiceItemType`
`AnyItemTest`	::=	`"item" "(" ")"`
`TypeName`	::=	`EQName`
`EQName`	::=	`QName \| URIQualifiedName`
`KindTest`	::=	`DocumentTest \| ElementTest \| AttributeTest \| SchemaElementTest \| SchemaAttributeTest \| PITest \| CommentTest \| TextTest \| NamespaceNodeTest \| AnyKindTest`
`DocumentTest`	::=	`"document-node" "(" (ElementTest \| SchemaElementTest \| NameTestUnion)? ")"`
`ElementTest`	::=	`"element" "(" (NameTestUnion ("," TypeName "?"?)?)? ")"`
`SchemaElementTest`	::=	`"schema-element" "(" ElementName ")"`
`NameTestUnion`	::=	`(NameTest ++ "\|")`
`NameTest`	::=	`EQName \| Wildcard`
`Wildcard`	::=	`"*" \| (NCName ":*") \| ("*:" NCName) \| (BracedURILiteral "*")`
		/* ws: explicit */
`AttributeTest`	::=	`"attribute" "(" (NameTestUnion ("," TypeName)?)? ")"`
`SchemaAttributeTest`	::=	`"schema-attribute" "(" AttributeName ")"`
`PITest`	::=	`"processing-instruction" "(" (NCName \| StringLiteral)? ")"`
`StringLiteral`	::=	`AposStringLiteral \| QuotStringLiteral`
		/* ws: explicit */
`CommentTest`	::=	`"comment" "(" ")"`
`TextTest`	::=	`"text" "(" ")"`
`NamespaceNodeTest`	::=	`"namespace-node" "(" ")"`
`AnyKindTest`	::=	`"node" "(" ")"`
`FunctionType`	::=	`AnyFunctionType \| TypedFunctionType`
`MapType`	::=	`AnyMapType \| TypedMapType`
`ArrayType`	::=	`AnyArrayType \| TypedArrayType`
`RecordType`	::=	`AnyRecordType \| TypedRecordType`
`EnumerationType`	::=	`"enum" "(" (StringLiteral ++ ",") ")"`
`ChoiceItemType`	::=	`"(" (ItemType ++ "\|") ")"`

This section defines the syntax and semantics of different ItemTypes in terms of the values that they match.

Note:

For an explanation of the EBNF grammar notation (and in particular, the operators ++ and **), see A.1 EBNF.

An item type designator written simply as an EQName (that is, a TypeName) is interpreted as follows:

If the name is written as a lexical QName, then it is expanded using the statically known namespaces in the static context. If the name is an unprefixed NCName, then it is expanded according to the default namespace for elements and types.
If the name matches a named item type in the static context, then it is taken as a reference to the corresponding item type. The rules that apply are the rules for the expanded item type definition.
Otherwise, it must match the name of a type in the in-scope schema types in the static context: specifically, an atomic type or a pure union type. See 3.5 Schema Types for details.
Note:
A name in the xs namespace will always fall into this category, since the namespace is reserved. See 2.1.3 Namespaces and QNames.
If the name cannot be resolved to a type, a static error is raised [err:XPST0051].
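The resolution order described above (named item types first, then in-scope schema types, otherwise a static error) can be sketched in Python. The sketch assumes the name has already been expanded to a (namespace URI, local name) pair; the function `resolve_type_name`, the `StaticError` class, and the dictionary models of the static context are hypothetical, and the check that a schema type is atomic or a pure union type is not modelled.

```python
class StaticError(Exception):
    """Stand-in for a static error; carries the error code as its message."""


def resolve_type_name(expanded_name, named_item_types, schema_types):
    """Resolve an already-expanded type name, preferring named item types.

    named_item_types and schema_types are dicts keyed by
    (namespace URI, local name) pairs; both are hypothetical models of the
    corresponding static context components.
    """
    if expanded_name in named_item_types:
        return ("named-item-type", named_item_types[expanded_name])
    if expanded_name in schema_types:
        return ("schema-type", schema_types[expanded_name])
    raise StaticError("XPST0051")


XS = "http://www.w3.org/2001/XMLSchema"
named = {("http://example.com/ns", "STR"): "xs:string"}
schema = {(XS, "integer"): "atomic"}
```

With these tables, the name `STR` in the example namespace resolves to a named item type, `xs:integer` resolves to a schema type, and any other name raises the stand-in for err:XPST0051.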

3.2.8 Function, Map, and Array Types

The following sections describe the syntax for item types for functions, including arrays and maps.

The subtype relation among these types is described in the various subsections of 3.3.2 Subtypes of Item Types.

3.2.8.1 Function Types

Changes in 4.0

The keyword fn is allowed as a synonym for function in function types, to align with changes to inline function declarations. [Issue 1192 PR 1197 21 May 2024]
The terms FunctionType, ArrayType, MapType, and RecordType replace FunctionTest, ArrayTest, MapTest, and RecordTest, with no change in meaning.
Parameter names may be included in a function signature; they are purely documentary. [Issue 1136 PR 1696 12 January 2025]

A FunctionType matches selected function items, potentially checking their signature^DM (which includes the types of the arguments and results).

`FunctionType`	::=	`AnyFunctionType \| TypedFunctionType`
`AnyFunctionType`	::=	`("function" \| "fn") "(" "*" ")"`
`TypedFunctionType`	::=	`("function" \| "fn") "(" (TypedFunctionParam ** ",") ")" "as" SequenceType`
`TypedFunctionParam`	::=	`("$" EQName "as")? SequenceType`
`EQName`	::=	`QName \| URIQualifiedName`
`SequenceType`	::=	`("empty-sequence" "(" ")") \| (ItemType OccurrenceIndicator?)`

The keywords function and fn are synonyms.

An AnyFunctionType matches any function item, including a map or an array. For example, the following expressions all return true:

fn:name#1 instance of function(*)
fn { @id } instance of function(*)
fn:random-number-generator() instance of function(*)
[ 1, 2, 3 ] instance of fn(*)
{} instance of fn(*)

A TypedFunctionType matches a function item if the function’s type signature (as defined in Section 7.1 Function Items^DM) is a subtype of the TypedFunctionType.

If parameter names are included in a TypedFunctionType, they are purely documentary and have no semantic effect. In particular, they play no part in deciding whether a particular function item matches the function type, and they never appear as keywords in function calls. For example the construct function($x as node()) as xs:string designates exactly the same type as function(node()) as xs:string.

Any parameter names that are supplied must be distinct [err:XQST0039].

A TypedFunctionType may also match certain maps and arrays, as described in 3.2.8.2 Map Types and 3.2.8.4 Array Types.

Here are some examples of expressions that use a TypedFunctionType:

fn:count#1 instance of function(item()*) as xs:integer returns true, because the signature of the function item fn:count#1 is function(item()*) as xs:integer.
fn:count#1 instance of function(xs:string*) as item() returns true, because the signature of the function item fn:count#1 is a subtype of function(xs:string*) as item().
Note:
The type function(xs:int, xs:int) as xs:int might also be written fn($x as xs:int, $y as xs:int) as xs:int.
function(xs:anyAtomicType) as item()* matches any map, or any other function item with the required signature.
function(xs:integer) as item()* matches any array, or any other function item with the required signature.
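The two fn:count#1 examples above can be checked with a small Python model. It assumes the usual rule for function subtyping (parameter types are compared contravariantly, result types covariantly), which is defined in the subtyping sections rather than here. The table `SUBTYPE_PAIRS` is a hand-written fragment covering only the types used in the examples, and all names are hypothetical.

```python
# Hand-written fragment of the sequence-type subtype relation.
# Assumption: only the pairs needed for the examples are listed.
SUBTYPE_PAIRS = {
    ("xs:string*", "item()*"),
    ("xs:integer", "item()"),
}


def seq_subtype(a, b):
    """True if sequence type a is a subtype of b in this toy relation."""
    return a == b or (a, b) in SUBTYPE_PAIRS


def function_subtype(a, b):
    """Decide whether function signature a is a subtype of signature b.

    Each signature is a pair (tuple of parameter types, result type).
    """
    a_params, a_result = a
    b_params, b_result = b
    return (
        len(a_params) == len(b_params)
        # Contravariant: each parameter type of b must be a subtype of a's.
        and all(seq_subtype(bp, ap) for ap, bp in zip(a_params, b_params))
        # Covariant: the result type of a must be a subtype of b's.
        and seq_subtype(a_result, b_result)
    )


count_signature = (("item()*",), "xs:integer")
```

Under these assumptions, `count_signature` is a subtype of both `(("item()*",), "xs:integer")` and `(("xs:string*",), "item()")`, but the reverse relationship does not hold for the second.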

3.2.8.3 Record Types

Changes in 4.0

Record types are added as a new kind of ItemType, constraining the value space of maps.
The syntax record(*) is allowed; it matches any map. [Issue 52 PR 728 10 October 2023]
The syntax record() is allowed; the only thing it matches is an empty map. [Issue 1491 PR 1577 17 October 2024]

A RecordType matches maps that meet specific criteria.

For example, the RecordType record(r as xs:double, i as xs:double) matches a map if the map has exactly two entries: an entry with key "r" whose value is a singleton xs:double value, and an entry with key "i" whose value is also a singleton xs:double value.

Record types describe a subset of the value space of maps. They do not define any new kinds of values, or any additional operations. They are useful in many cases to describe more accurately the type of a variable, function parameter, or function result, giving benefits both in the readability of the code, and in the ability of the processor to detect and diagnose type errors and to optimize execution.

`RecordType`	::=	`AnyRecordType \| TypedRecordType`
`AnyRecordType`	::=	`"record" "(" "*" ")"`
`TypedRecordType`	::=	`"record" "(" (FieldDeclaration ** ",") ExtensibleFlag? ")"`
`FieldDeclaration`	::=	`FieldName "?"? ("as" SequenceType)?`
`FieldName`	::=	`NCName \| StringLiteral`
`StringLiteral`	::=	`AposStringLiteral \| QuotStringLiteral`
		/* ws: explicit */
`SequenceType`	::=	`("empty-sequence" "(" ")") \| (ItemType OccurrenceIndicator?)`
`ExtensibleFlag`	::=	`"," "*"`

If the list of fields ends with ",*" then the record type is said to be extensible. For example, the RecordType record(e as element(Employee), *) matches a map if it has an entry with key "e" whose value matches element(Employee), regardless of what other entries the map might contain.

For generality:

The syntax record() defines a record type that has no explicit fields and that is not extensible. The only thing it matches is an empty map^DM.
The syntax record(*) defines an extensible record type that has no explicit field declarations. It is equivalent to the item type map(*): that is, it matches any map.

A record type can constrain only those entries whose keys are strings, but when the record type is marked as extensible, then other entries may be present in the map with either string or non-string keys. Entries whose key is a string can be expressed using an (unquoted) NCName if the key conforms to NCName syntax, or using a (quoted) string literal otherwise.

Although constructors for named record types produce a map in which the entry order^DM reflects the order of field definitions in the record type definition, the entry order^DM of a map has no effect on whether the map matches a particular record type: the entries in a map do not have to be in any particular order.

Note:

Lookup expressions have been extended in 4.0 so that non-NCName keys can be used without parentheses: employee?"middle name"

If the type declaration for a field is omitted, then item()* is assumed: that is, the map entry may have any type.

If the field name is followed by a question mark, then the value must have the specified type if it is present, but it may also be absent. For example, the RecordType record(first as xs:string, middle? as xs:string, last as xs:string, *) requires the map to have string-valued entries with keys "first" and "last"; it also declares that if the map has an entry with key "middle", the value of that entry must be a single xs:string. Declaring the type as record(first as xs:string, middle? as xs:string?, last as xs:string, *) also allows the entry with key "middle" to be present but empty.
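The matching rules for record types (required and optional fields, declared field types, and the extensible flag) might be modelled as in the following Python sketch. A Python dict stands in for an XPath map, a predicate stands in for each declared SequenceType, and `None` stands in for the empty sequence; the function `matches_record` and the helper names are hypothetical.

```python
def matches_record(value, fields, extensible=False):
    """Test whether a dict (standing in for an XPath map) matches a record type.

    fields maps each field name to a pair (optional, predicate); the
    predicate stands in for the field's declared SequenceType.
    """
    for name, (optional, predicate) in fields.items():
        if name not in value:
            if not optional:
                return False
        elif not predicate(value[name]):
            return False
    if not extensible:
        # A non-extensible record type admits no entries beyond its fields.
        return all(key in fields for key in value)
    return True


def single_string(v):
    # Models xs:string (exactly one string).
    return isinstance(v, str)


def optional_string(v):
    # Models xs:string?, with None standing in for the empty sequence.
    return v is None or isinstance(v, str)


# record(first as xs:string, middle? as xs:string, last as xs:string, *)
person = {
    "first": (False, single_string),
    "middle": (True, single_string),
    "last": (False, single_string),
}
# The same record type with middle? as xs:string?
person_opt = dict(person, middle=(True, optional_string))
```

In this model, `matches_record({}, {})` corresponds to record() matching only the empty map, and `matches_record(m, {}, extensible=True)` corresponds to record(*) matching any map `m`.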

Note:

Within an extensible record type, a FieldDeclaration that is marked optional and has no declared type does not constrain the map in any way, so it serves no practical purpose, but it is permitted because it may have documentary value.

The names of the fields in a record type must be distinct [err:XPST0021].

If a variable $rec is known to conform to a particular record type, then when a lookup expression $rec?field is used, (a) the processor can report a type error if $rec cannot contain an entry with name field (see 4.13.3.4 Implausible Lookup Expressions), and (b) the processor can make static type inferences about the type of value returned by $rec?field.

Note:

(TODO: change function signatures as suggested here!) A number of functions in the standard function library use maps as function arguments; this is a useful technique where the information to be supplied across the interface is highly variable. However, the type signature for such functions typically declares the argument type as map(*), which gives very little information (and places very few constraints) on the values that are actually passed across. Using record types offers the possibility of improving this: for example, the options argument of fn:parse-json, previously given as map(*), can now be expressed as record(liberal? as xs:boolean, duplicates? as xs:string, escape? as xs:boolean, fallback as fn(xs:string) as xs:string, *). In principle the xs:string type used to describe the duplicates option could also be replaced by a schema-defined subtype of xs:string that enumerates the permitted values ("reject", "use-first", "use-last").

The use of a record type in the signature of such a function causes the coercion rules to be invoked. So, for example, if the function expects an entry in the map to be an xs:double value, it becomes possible to supply a map in which the corresponding entry has type xs:integer.

Greater precision in defining the types of such arguments also enables better type checking, better diagnostics, better optimization, better documentation, and better syntax-directed editing tools.

Note:

One of the motivations for introducing record types is to enable better pattern matching in XSLT when processing JSON input. With XML input, patterns are often based around XML element names. JSON has no direct equivalent of XML’s element names; matching a JSON object such as {longitude: 130.2, latitude: 53.4} relies instead on recognizing the property names appearing in the object. XSLT 4.0, by integrating record types into pattern matching syntax, allows such an object to be matched with a pattern of the form match="record(longitude, latitude)"

Rules defining whether one record type is a subtype of another are given in 3.3.2.8 Subtyping Records.

3.3 Subtype Relationships

Changes in 4.0

The presentation of the rules for the subtype relationship between sequence types and item types has been substantially rewritten to improve clarity; no change to the semantics is intended. [Issue 196 PR 202 25 October 2022]

[Definition: Given two sequence types or item types, the rules in this section determine if one is a subtype of the other. If a type A is a subtype of type B, it follows that every value matched by A is also matched by B.]

Note:

The relationship subtype(A, A) is always true: every type is a subtype of itself.

Note:

The converse is not necessarily true: we cannot infer that if every value matched by A is also matched by B, then A is a subtype of type B. For example, A might be defined as the set of strings matching the regular expression [A-Z]*, while B is the set of strings matching the regular expression [A-Za-z]*; no subtype relationship holds between these types.

The rules for deciding whether one sequence type is a subtype of another are given in 3.3.1 Subtypes of Sequence Types. The rules for deciding whether one item type is a subtype of another are given in 3.3.2 Subtypes of Item Types.

Note:

The subtype relationship is not acyclic. There are cases where subtype(A, B) and subtype(B, A) are both true. This implies that A and B have the same value space, but they can still be different types. For example this applies when A is a union type with member types xs:string and xs:integer, while B is a union type with member types xs:integer and xs:string. These are different types ("23" cast as A produces a string, while "23" cast as B produces an integer, because casting is attempted to each member type in order) but both types have the same value space.
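The effect of member-type order on casting can be shown with a short Python sketch. Python's `str` and `int` are used here as rough stand-ins for xs:string and xs:integer (their lexical rules differ in detail from the XSD types), and the function name `cast_to_union` is hypothetical.

```python
def cast_to_union(lexical, member_casts):
    """Try each member type's cast in declaration order; first success wins."""
    for cast in member_casts:
        try:
            return cast(lexical)
        except ValueError:
            continue
    raise ValueError(f"cannot cast {lexical!r} to any member type")


# Python's str and int stand in for xs:string and xs:integer.
A = [str, int]   # union with member types xs:string, xs:integer
B = [int, str]   # union with member types xs:integer, xs:string

as_a = cast_to_union("23", A)
as_b = cast_to_union("23", B)
```

Both unions accept exactly the same values, yet `as_a` is a string and `as_b` is an integer, which is why the two types are distinct even though each is a subtype of the other.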

3.3.2 Subtypes of Item Types

We use the notation A ⊆ B, or itemtype-subtype(A, B), to indicate that an item type A is a subtype of an item type B. This section defines the rules for deciding whether any two item types have this relationship.

The rules in this section apply to item types, not to item type designators. For example, if the name STR has been defined in the static context as a named item type referring to the type xs:string, then anything said here about the type xs:string applies equally whether it is designated as xs:string or as STR, or indeed as the parenthesized forms (xs:string) or (STR).

References to named item types are handled as described in 3.3.2.9 Subtyping of Named Item Types.

The relationship A ⊆ B is true if and only if at least one of the conditions listed in the following subsections applies:

3.3.2.4 Subtyping of Node Types

The following subsections describe the subtype relationships among node types.

3.3.2.4.2 Subtyping Nodes: Document Nodes

Changes in 4.0

The rules for subtyping of document node types have been refined. [Issue 1624 PR 1898 7 April 2025]

Given item types A and B, A ⊆ B is true if any of the following rules apply.

These rules apply after expanding document-node(N), where N is a NameTestUnion, to the equivalent document-node(element(N)).

- A is document-node(E) for any E, and B is document-node().
  Examples:
  - document-node(element(chap)) ⊆ document-node()
  - document-node(*) ⊆ document-node()
- All the following are true:
  1. A is document-node(A_e)
  2. B is document-node(B_e)
  3. A_e ⊆ B_e
  Examples:
  - document-node(element(title)) ⊆ document-node(element(*))
  - document-node(title) ⊆ document-node(*)
- A is document-node(element(A_1|A_2|..., T)) (where T may be absent), and for each A_n, document-node(element(A_n, T)) ⊆ B.
  Examples:
  - document-node(a|b) ⊆ (document-node(a) | document-node(b))
  - document-node(a|b) ⊆ document-node(a|b|c)
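The three rules above can be exercised with a deliberately simplified Python model. It makes several assumptions that go beyond this section: element tests are reduced to sets of names (with "*" as the wildcard), type annotations and schema-element tests are ignored, and a type is taken to be a subtype of a choice item type if it is a subtype of at least one alternative. All function names are hypothetical.

```python
ANY_DOCUMENT = None  # models document-node()


def doc(*names):
    """Model document-node(n1|n2|...) as a frozenset of element names."""
    return frozenset(names)


def element_subtype(a_name, b_names):
    """Assumed simplification of element-test subtyping: names only."""
    return "*" in b_names or a_name in b_names


def document_subtype(a, b):
    """Decide A ⊆ B for document-node types modelled as name sets.

    A type is ANY_DOCUMENT or a frozenset of element names ("*" = wildcard).
    B may also be a list of such types, modelling a choice item type.
    """
    if isinstance(b, list):
        if a is ANY_DOCUMENT:
            return any(document_subtype(a, alt) for alt in b)
        # Third rule: split A's name union and test each name separately.
        return all(
            any(document_subtype(frozenset([n]), alt) for alt in b) for n in a
        )
    if b is ANY_DOCUMENT:
        return True   # first rule
    if a is ANY_DOCUMENT:
        return False  # no rule makes document-node() a subtype of a narrower type
    # Second and third rules, with element subtyping reduced to name matching.
    return all(element_subtype(n, b) for n in a)
```

For example, `document_subtype(doc("a", "b"), [doc("a"), doc("b")])` and `document_subtype(doc("a", "b"), doc("a", "b", "c"))` both hold in this model, mirroring the last two examples in the list.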

3.4 Coercion Rules

Changes in 4.0

The term "function conversion rules" used in 3.1 has been replaced by the term "coercion rules". [ PR 254 29 November 2022]
The coercion rules allow “relabeling” of a supplied atomic item where the required type is a derived atomic type: for example, it is now permitted to supply the value 3 when calling a function that expects an instance of xs:positiveInteger. [Issue 117 PR 254 29 November 2022]
The coercion rules now allow any numeric type to be implicitly converted to any other, for example an xs:double is accepted where the required type is xs:float. [Issue 980 PR 911 30 January 2024]
The coercion rules now allow conversion in either direction between xs:hexBinary and xs:base64Binary. [Issues 130 480 PR 815 7 November 2023]
The coercion rules now apply recursively to the members of an array and the entries in a map. [Issue 1318 PR 1501 29 October 2024]
The coercion rules now reorder the entries in a map when the required type is a record type. [Issue 1862 PR 1874 25 March 2025]

[Definition: The coercion rules are rules used to convert a supplied value to a required type, for example when converting an argument of a function call to the declared type of the function parameter. ] The required type is expressed as a sequence type. The effect of the coercion rules may be to accept the value as supplied, to convert it to a value that matches the required type, or to reject it with a type error.

This section defines how the coercion rules operate; the situations in which the rules apply are defined elsewhere, by reference to this section.

Note:

In previous versions of this specification, the coercion rules were referred to as the function conversion rules. The terminology has changed because the rules are not exclusively associated with functions or function calling.

If the required type is empty-sequence(), no coercion takes place (the supplied value must be an empty sequence, or a type error occurs).

In all other cases, the required sequence type T comprises a required item type R and an optional occurrence indicator. The coercion rules are then applied to a supplied value V and the required type T as follows:

1. If XPath 1.0 compatibility mode is true and V is not an instance of the required type T, then the conversions defined in 3.4.1 XPath 1.0 Compatibility Rules are applied to V.
2. Each item in V is processed against the required item type R using the item coercion rules defined in 3.4.2 Item Coercion Rules, and the results are sequence-concatenated into a single sequence V′.
3. A type error is raised if the cardinality of V′ does not match the required cardinality of T [err:XPTY0004].
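The overall shape of the procedure (coerce each item, concatenate the results, then check the cardinality) can be sketched in Python. This is a simplified model: XPath 1.0 compatibility mode and the empty-sequence() case are not modelled, a Python list stands in for a sequence, and the names `coerce`, `to_double`, and `XPathTypeError` are hypothetical. The toy item coercion `to_double` is far narrower than the real item coercion rules.

```python
class XPathTypeError(Exception):
    """Stand-in for a type error such as err:XPTY0004."""


# Allowed (min, max) item counts for each occurrence indicator;
# "" means exactly one item and None means unbounded.
CARDINALITY = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}


def coerce(value, coerce_item, occurrence=""):
    """Apply item coercion to each item, then check the cardinality.

    coerce_item is a hypothetical callable that returns a list (the coerced
    result for one item) or raises XPathTypeError.
    """
    result = []
    for item in value:
        result.extend(coerce_item(item))  # sequence concatenation
    low, high = CARDINALITY[occurrence]
    if len(result) < low or (high is not None and len(result) > high):
        raise XPathTypeError("XPTY0004")
    return result


def to_double(item):
    """Toy item coercion for a required item type of xs:double."""
    if isinstance(item, bool) or not isinstance(item, (int, float)):
        raise XPathTypeError("XPTY0004")
    return [float(item)]
```

For example, `coerce([1, 2.5], to_double, "*")` converts the integer to a double and accepts the sequence, whereas `coerce([1, 2], to_double)` fails the cardinality check because the required type admits exactly one item.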

4 Expressions

This section discusses each of the basic kinds of expression. Each kind of expression has a name such as PathExpr, which is introduced on the left side of the grammar production that defines the expression. Since XPath 4.0 is a composable language, each kind of expression is defined in terms of other expressions whose operators have a higher precedence. In this way, the precedence of operators is represented explicitly in the grammar.

The order in which expressions are discussed in this document does not reflect the order of operator precedence. In general, this document introduces the simplest kinds of expressions first, followed by more complex expressions. For the complete grammar, see Appendix [A XPath 4.0 Grammar].

The highest-level symbol in the XPath grammar is XPath.

`XPath`	::=	`Expr`
`Expr`	::=	`(ExprSingle ++ ",")`
`ExprSingle`	::=	`ForExpr \| LetExpr \| QuantifiedExpr \| IfExpr \| OrExpr`

`ForExpr`	::=	`ForClauseForLetReturn`
`LetExpr`	::=	`LetClauseForLetReturn`
`QuantifiedExpr`	::=	`("some" \| "every") (QuantifierBinding ++ ",") "satisfies" ExprSingle`
`IfExpr`	::=	`"if" "(" Expr ")" (UnbracedActions \| BracedAction)`
`OrExpr`	::=	`AndExpr ("or" AndExpr)*`

The XPath 4.0 operator that has lowest precedence is the comma operator, which is used to combine two operands to form a sequence. As shown in the grammar, a general expression (Expr) can consist of multiple ExprSingle operands, separated by commas.

The name ExprSingle denotes an expression that does not contain a top-level comma operator. (Despite its name, an ExprSingle may evaluate to a sequence containing more than one item.)

The symbol ExprSingle is used in various places in the grammar where an expression is not allowed to contain a top-level comma. For example, each of the arguments of a function call must be an ExprSingle, because commas are used to separate the arguments of a function call.

After the comma, the expressions that have next lowest precedence are ForExpr, LetExpr, QuantifiedExpr, IfExpr, and OrExpr. Each of these expressions is described in a separate section of this document.

4.6 Path Expressions

`PathExpr`	::=	`("/" RelativePathExpr?) \| ("//" RelativePathExpr) \| RelativePathExpr`
		/* xgc: leading-lone-slash */
`RelativePathExpr`	::=	`StepExpr (("/" \| "//") StepExpr)*`

[Definition: A path expression consists of a series of one or more steps, separated by / or //, and optionally beginning with / or //. A path expression is typically used to locate nodes within trees. ]

Absolute path expressions (those starting with an initial / or //) start their selection from the root node of a tree; relative path expressions (those without a leading / or //) start from the context value.

A path expression consisting of a single step is evaluated as described in 4.6.4 Steps.

4.6.4 Steps

`StepExpr`	::=	`PostfixExpr \| AxisStep`
`PostfixExpr`	::=	`PrimaryExpr \| FilterExpr \| DynamicFunctionCall \| LookupExpr \| FilterExprAM`
`AxisStep`	::=	`(ReverseStep \| ForwardStep) Predicate*`

[Definition: A step is a part of a path expression that generates a sequence of items and then filters the sequence by zero or more predicates. The value of the step consists of those items that satisfy the predicates, working from left to right. A step may be either an axis step or a postfix expression.] Postfix expressions are described in 4.3 Postfix Expressions.

[Definition: An axis step returns a sequence of nodes that are reachable from a starting node via a specified axis. Such a step has two parts: an axis, which defines the "direction of movement" for the step, and a node test, which selects nodes based on their kind, name, and/or type annotation.]

If the context value is a sequence of zero or more nodes, an axis step returns a sequence of zero or more nodes; otherwise, a type error is raised [err:XPTY0020].

The step expression S is equivalent to ./S. Thus, if the context value is a sequence containing multiple nodes, the semantics of a step expression are equivalent to a path expression in which the step is always applied to a single node. The following description therefore explains the semantics for the case where the context value is a single node, called the context node.

Note:

The equivalence of a step S to the path expression ./S means that the resulting node sequence is returned in document order.

An axis step may be either a forward step or a reverse step, followed by zero or more predicates.

In the abbreviated syntax for a step, the axis can be omitted and other shorthand notations can be used as described in 4.6.7 Abbreviated Syntax.

The unabbreviated syntax for an axis step consists of the axis name and node test separated by a double colon. The result of the step consists of the nodes reachable from the starting node via the specified axis that have the node kind, name, and/or type annotation specified by the node test. For example, the step child::para selects the para element children of the context node: child is the name of the axis, and para is the name of the element nodes to be selected on this axis. The available axes are described in 4.6.4.1 Axes. The available node tests are described in 4.6.4.2 Node Tests. Examples of steps are provided in 4.6.6 Unabbreviated Syntax and 4.6.7 Abbreviated Syntax.

4.6.4.1 Axes

Changes in 4.0 ⬇ ⬆

Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self. [Issue 1519 PR 1532 29 October 2024]

`ForwardAxis`	::=	`("attribute" \| "child" \| "descendant" \| "descendant-or-self" \| "following" \| "following-or-self" \| "following-sibling" \| "following-sibling-or-self" \| "namespace" \| "self") "::"`
`ReverseAxis`	::=	`("ancestor" \| "ancestor-or-self" \| "parent" \| "preceding" \| "preceding-or-self" \| "preceding-sibling" \| "preceding-sibling-or-self") "::"`

XPath defines a set of axes for traversing documents, but a host language may define a subset of these axes. The following axes are defined:

The child axis contains the children of the context node, which are the nodes returned by the Section 6.7.3 children Accessor^DM.
Note:
Only document nodes and element nodes have children. If the context node is any other kind of node, or if the context node is an empty document or element node, then the child axis is an empty sequence. The children of a document node or element node may be element, processing instruction, comment, or text nodes. Attribute, namespace, and document nodes can never appear as children.
The descendant axis is defined as the transitive closure of the child axis; it contains the descendants of the context node (the children, the children of the children, and so on).
More formally, $node/descendant::node() delivers the result of fn:transitive-closure($node, fn { child::node() }).
The descendant-or-self axis contains the context node and the descendants of the context node.
More formally, $node/descendant-or-self::node() delivers the result of $node/(. | descendant::node()).
The parent axis contains the sequence returned by the Section 6.7.11 parent Accessor^DM, which returns the parent of the context node, or an empty sequence if the context node has no parent.
Note:
An attribute node may have an element node as its parent, even though the attribute node is not a child of the element node.
The ancestor axis is defined as the transitive closure of the parent axis; it contains the ancestors of the context node (the parent, the parent of the parent, and so on).
More formally, $node/ancestor::node() delivers the result of fn:transitive-closure($node, fn { parent::node() }).
Note:
The ancestor axis includes the root node of the tree in which the context node is found, unless the context node is the root node.
The ancestor-or-self axis contains the context node and the ancestors of the context node; thus, the ancestor-or-self axis will always include the root node.
More formally, $node/ancestor-or-self::node() delivers the result of $node/(. | ancestor::node()).
The following-sibling axis contains the context node’s following siblings, that is, those children of the context node’s parent that occur after the context node in document order. If the context node is an attribute or namespace node, the following-sibling axis is empty.
More formally, $node/following-sibling::node() delivers the result of fn:siblings($node)[. >> $node].
The following-sibling-or-self axis contains the context node, together with the contents of the following-sibling axis.
More formally, $node/following-sibling-or-self::node() delivers the result of fn:siblings($node)[not(. << $node)].
The preceding-sibling axis contains the context node’s preceding siblings, that is, those children of the context node’s parent that occur before the context node in document order. If the context node is an attribute or namespace node, the preceding-sibling axis is empty.
More formally, $node/preceding-sibling::node() delivers the result of fn:siblings($node)[. << $node].
The preceding-sibling-or-self axis contains the context node, together with the contents of the preceding-sibling axis.
More formally, $node/preceding-sibling-or-self::node() delivers the result of fn:siblings($node)[not(. >> $node)].
The following axis contains all nodes that are descendants of the root of the tree in which the context node is found, are not descendants of the context node, and occur after the context node in document order.
More formally, $node/following::node() delivers the result of $node/ancestor-or-self::node()/following-sibling::node()/descendant-or-self::node().
The following-or-self axis contains the context node, together with the contents of the following axis.
More formally, $node/following-or-self::node() delivers the result of $node/(. | following::node()).
The preceding axis contains all nodes that are descendants of the root of the tree in which the context node is found, are not ancestors of the context node, and occur before the context node in document order.
More formally, $node/preceding::node() delivers the result of $node/ancestor-or-self::node()/preceding-sibling::node()/descendant-or-self::node().
The preceding-or-self axis contains the context node, together with the contents of the preceding axis.
More formally, $node/preceding-or-self::node() delivers the result of $node/(. | preceding::node()).
The attribute axis contains the attributes of the context node, which are the nodes returned by the Section 6.7.1 attributes Accessor^DM; the axis will be empty unless the context node is an element.
The self axis contains just the context node itself.
The self axis is primarily useful when testing whether the context node satisfies particular conditions, for example if ($x[self::chapter]).
More formally, $node/self::node() delivers the result of $node.
The namespace axis contains the namespace nodes of the context node, which are the nodes returned by the Section 6.7.7 namespace-nodes Accessor^DM; this axis is empty unless the context node is an element node. The namespace axis is deprecated as of XPath 2.0. If XPath 1.0 compatibility mode is true, the namespace axis must be supported. If XPath 1.0 compatibility mode is false, then support for the namespace axis is implementation-defined. An implementation that does not support the namespace axis when XPath 1.0 compatibility mode is false must raise a static error [err:XPST0010] if it is used. Applications needing information about the in-scope namespaces of an element should use the functions Section 11.2.7 fn:in-scope-prefixes^FO, and Section 11.2.8 fn:namespace-uri-for-prefix^FO.
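
The formal definitions above can be explored with a small executable model. The following Python sketch is illustrative only: the `Node` class and all function names are hypothetical, the tree contains only element-like nodes (attribute and namespace nodes are not modeled), and results are returned in document order.

```python
from typing import Callable, List, Optional

class Node:
    """Minimal tree node: a name, a parent link, and ordered children."""
    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = children or []
        for c in self.children:
            c.parent = self

def child(n: Node) -> List[Node]:
    return list(n.children)

def parent(n: Node) -> List[Node]:
    return [n.parent] if n.parent is not None else []

def transitive_closure(n: Node, step: Callable[[Node], List[Node]]) -> List[Node]:
    """Nodes reachable from n by applying `step` one or more times."""
    found: List[Node] = []
    frontier = step(n)
    while frontier:
        found.extend(frontier)
        frontier = [m for f in frontier for m in step(f)]
    return found

def in_document_order(n: Node, nodes: List[Node]) -> List[Node]:
    """Order `nodes` by a pre-order walk of the tree that contains n."""
    top = n
    while top.parent is not None:
        top = top.parent
    wanted = {id(x) for x in nodes}
    ordered: List[Node] = []
    stack = [top]
    while stack:
        cur = stack.pop()
        if id(cur) in wanted:
            ordered.append(cur)
        stack.extend(reversed(cur.children))  # first child is visited first
    return ordered

def descendant(n: Node) -> List[Node]:
    return in_document_order(n, transitive_closure(n, child))

def ancestor(n: Node) -> List[Node]:
    return in_document_order(n, transitive_closure(n, parent))

def siblings(n: Node) -> List[Node]:
    """The node together with its siblings; just the node if it has no parent."""
    return list(n.parent.children) if n.parent is not None else [n]

def following_sibling(n: Node) -> List[Node]:
    sibs = siblings(n)
    return sibs[sibs.index(n) + 1:]

def preceding_sibling(n: Node) -> List[Node]:
    sibs = siblings(n)
    return sibs[:sibs.index(n)]

def following(n: Node) -> List[Node]:
    # ancestor-or-self / following-sibling / descendant-or-self
    hits = [d for a in [n] + ancestor(n)
            for s in following_sibling(a) for d in [s] + descendant(s)]
    return in_document_order(n, hits)

def preceding(n: Node) -> List[Node]:
    # ancestor-or-self / preceding-sibling / descendant-or-self
    hits = [d for a in [n] + ancestor(n)
            for s in preceding_sibling(a) for d in [s] + descendant(s)]
    return in_document_order(n, hits)

def names(nodes: List[Node]) -> List[str]:
    return [x.name for x in nodes]

# Sample tree: a( b, c( d, e ), f( g ) )
b, d, e, g = Node("b"), Node("d"), Node("e"), Node("g")
c = Node("c", [d, e])
f = Node("f", [g])
a = Node("a", [b, c, f])
```

In this sample tree, `names(following(d))` yields `["e", "f", "g"]` and `names(preceding(e))` yields `["b", "d"]`: the ancestors `c` and `a` are excluded from the preceding axis even though they occur earlier in document order.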

Axes can be categorized as forward axes and reverse axes. An axis that only ever contains the context node or nodes that are after the context node in document order is a forward axis. An axis that only ever contains the context node or nodes that are before the context node in document order is a reverse axis.

The parent, ancestor, ancestor-or-self, preceding, preceding-or-self, preceding-sibling, and preceding-sibling-or-self axes are reverse axes; all other axes are forward axes.

The ancestor, descendant, following, preceding and self axes partition a document (ignoring attribute and namespace nodes): they do not overlap and together they contain all the nodes in the document.
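
One way to check this partition property is to number each node by its pre-order and post-order rank and compare the ranks with those of the context node. The Python sketch below is an informal illustration, not a technique prescribed by this specification: the tuple-based tree encoding and the names `pre_post`, `axis_of` and `partition` are hypothetical, and node names are assumed to be unique.

```python
from typing import Dict, List, Tuple

def pre_post(tree) -> Dict[str, Tuple[int, int]]:
    """Return {name: (pre, post)} ranks for a tree of (name, [children]) tuples."""
    ranks: Dict[str, Tuple[int, int]] = {}
    pre = 0
    post = 0

    def walk(node) -> None:
        nonlocal pre, post
        name, children = node
        my_pre = pre          # rank when the node is first entered
        pre += 1
        for c in children:
            walk(c)
        ranks[name] = (my_pre, post)  # post rank is assigned when the node is left
        post += 1

    walk(tree)
    return ranks

def axis_of(ranks: Dict[str, Tuple[int, int]], context: str, other: str) -> str:
    """Classify `other` into exactly one of the five partitioning axes."""
    cpre, cpost = ranks[context]
    opre, opost = ranks[other]
    if other == context:
        return "self"
    if opre < cpre and opost > cpost:
        return "ancestor"
    if opre > cpre and opost < cpost:
        return "descendant"
    if opre > cpre and opost > cpost:
        return "following"
    return "preceding"  # remaining case: opre < cpre and opost < cpost

def partition(tree, context: str) -> Dict[str, List[str]]:
    """Group every node of the tree by its axis relative to the context node."""
    ranks = pre_post(tree)
    groups: Dict[str, List[str]] = {
        "ancestor": [], "descendant": [], "following": [], "preceding": [], "self": []
    }
    for name in sorted(ranks, key=lambda k: ranks[k][0]):  # document order
        groups[axis_of(ranks, context, name)].append(name)
    return groups

# Sample tree: a( b, c( d, e ), f( g ) )
TREE = ("a", [("b", []), ("c", [("d", []), ("e", [])]), ("f", [("g", [])])])
```

Because every node falls into exactly one branch of `axis_of`, the five groups returned by `partition(TREE, "c")` are disjoint and together cover all seven nodes of the sample tree.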

[Definition: Every axis has a principal node kind. If an axis can contain elements, then the principal node kind is element; otherwise, it is the kind of nodes that the axis can contain.] Thus:

For the attribute axis, the principal node kind is attribute.
For the namespace axis, the principal node kind is namespace.
For all other axes, the principal node kind is element.

4.13 Maps and Arrays

Most modern programming languages have support for collections of key/value pairs, which may be called maps, dictionaries, associative arrays, hash tables, keyed lists, or objects (these are not the same thing as objects in object-oriented systems). In XPath 4.0, we call these maps. Most modern programming languages also support ordered lists of values, which may be called arrays, vectors, or sequences. In XPath 4.0, we have both sequences and arrays. Unlike sequences, an array is an item, and can appear as an item in a sequence.

Note:

The XPath 4.0 specification focuses on syntax provided for maps and arrays, especially constructors and lookup.

Some of the functionality typically needed for maps and arrays is provided by functions defined in Section 18 Processing maps^FO and Section 19 Processing arrays^FO, including functions used to read JSON to create maps and arrays, serialize maps and arrays to JSON, combine maps to create a new map, remove map entries to create a new map, iterate over the keys of a map, convert an array to create a sequence, combine arrays to form a new array, and iterate over arrays in various ways.

4.13.4 Filter Expressions for Maps and Arrays

Changes in 4.0 ⬇ ⬆

Filter expressions for maps and arrays are introduced. [Issue 1159 PR 1163 20 April 2024]
Predicates in filter expressions for maps and arrays can now be numeric. [Issue 1207 PR tba 1217 15 May 2024]

`FilterExprAM`	::=	`PostfixExpr "?[" Expr "]"`
`PostfixExpr`	::=	`PrimaryExpr \| FilterExpr \| DynamicFunctionCall \| LookupExpr \| FilterExprAM`
`Expr`	::=	`(ExprSingle ++ ",")`
`ExprSingle`	::=	`ForExpr \| LetExpr \| QuantifiedExpr \| IfExpr \| OrExpr`

Maps and arrays can be filtered using the construct INPUT?[FILTER]. For example, $array?[count(.)=1] filters an array to retain only those members that are single items.

Note:

The character-pair ?[ forms a single token; no intervening whitespace or comment is allowed.

The required type of the left-hand operand INPUT is (map(*)|array(*))?: that is, it must be either an empty sequence, a single map, or a single array [err:XPTY0004]. If it is an empty sequence, the result of the expression is an empty sequence.

If the value of INPUT is an array, then the FILTER expression is evaluated for each member of the array, with that member as the context value, with its position in the array as the context position, and with the size of the array as the context size. The result of the expression is an array containing those members of the input array for which the predicate truth value of the FILTER expression is true. The order of retained members is preserved.

For example, the following expression:

let $array := [ (), 1, (2, 3), (4, 5, 6) ]
return $array?[count(.) ge 2]

returns:

[ (2, 3), (4, 5, 6) ]

Note:

Numeric predicates are handled in the same way as with filter expressions for sequences. However, the result is always an array, even if only one member is selected. For example, given the $array shown above, the result of $array?[3] is the single-member array^DM [ (2, 3) ]. Contrast this with $array?3 which delivers the sequence 2, 3.
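
The behavior described above can be modeled informally in Python. In this sketch an array is a list whose members are themselves lists (sequences); the function names are hypothetical, and `predicate_truth_value` is a deliberately simplified approximation that handles only booleans, numbers, sequences of numbers, and general truthiness.

```python
from typing import Any, Callable, List

Sequence = List[Any]
Array = List[Sequence]

def is_number(v: Any) -> bool:
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def predicate_truth_value(value: Any, position: int) -> bool:
    """Simplified: numbers select by position, anything else by truthiness."""
    if is_number(value):
        return value == position
    if isinstance(value, list) and value and all(is_number(v) for v in value):
        return position in value
    return bool(value)

def filter_array(array: Array,
                 pred: Callable[[Sequence, int, int], Any]) -> Array:
    """Model of $array?[FILTER]: pred receives (member, position, size)."""
    size = len(array)
    return [member
            for position, member in enumerate(array, start=1)
            if predicate_truth_value(pred(member, position, size), position)]

# Models [ (), 1, (2, 3), (4, 5, 6) ]
array: Array = [[], [1], [2, 3], [4, 5, 6]]

at_least_two = filter_array(array, lambda m, pos, size: len(m) >= 2)  # ?[count(.) ge 2]
third_only = filter_array(array, lambda m, pos, size: 3)              # ?[3]
third_member = array[3 - 1]                                           # ?3
```

Here `third_only` is still an array (a list containing one member), while `third_member` is the member itself, mirroring the contrast between $array?[3] and $array?3.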

If the value of INPUT is a map, then the FILTER expression is evaluated for each entry in the map, with the context value set to an item of type record(key as xs:anyAtomicType, value as item()*), in which the key and value fields represent the key and value of the map entry. The context position is the position of the entry in the map (in entry order^DM), and the context size is the number of entries in the map. The result of the expression is a map containing those entries of the input map for which the predicate truth value of the FILTER expression is true. The relative order of entries in the result retains the relative order of entries in the input.

For example, the following expression:

let $map := { 1: "alpha", 2: "beta", 3: "gamma" }
return $map?[?key ge 2]

returns:

{ 2: "beta", 3: "gamma" }

Note:

A filter expression such as $map?[last()-1, last()] might be used to return the last two entries of a map in entry order^DM.
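
The map case can be modeled in the same informal way. The Python sketch below relies on the insertion order of `dict` as a stand-in for entry order; the names `filter_map` and `predicate_truth_value` are hypothetical, each entry is presented to the predicate as a small dictionary with `key` and `value` fields, and the truth-value rule is a simplified approximation.

```python
from typing import Any, Callable, Dict

def is_number(v: Any) -> bool:
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def predicate_truth_value(value: Any, position: int) -> bool:
    """Simplified: numbers select by position, anything else by truthiness."""
    if is_number(value):
        return value == position
    if isinstance(value, list) and value and all(is_number(v) for v in value):
        return position in value
    return bool(value)

def filter_map(m: Dict[Any, Any],
               pred: Callable[[Dict[str, Any], int, int], Any]) -> Dict[Any, Any]:
    """Model of $map?[FILTER]: pred receives (entry record, position, size)."""
    size = len(m)
    result: Dict[Any, Any] = {}
    for position, (key, value) in enumerate(m.items(), start=1):
        entry = {"key": key, "value": value}  # models record(key, value)
        if predicate_truth_value(pred(entry, position, size), position):
            result[key] = value  # relative order of retained entries is kept
    return result

m = {1: "alpha", 2: "beta", 3: "gamma"}

by_key = filter_map(m, lambda e, pos, size: e["key"] >= 2)        # ?[?key ge 2]
last_two = filter_map(m, lambda e, pos, size: [size - 1, size])   # ?[last()-1, last()]
```

Both calls select the entries with keys 2 and 3: the first by testing the key, the second by position in entry order.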

4.20 Arrow Expressions

Changes in 4.0 ⬇ ⬆

The syntax on the right-hand side of an arrow operator has been relaxed; a dynamic function call no longer needs to start with a variable reference or a parenthesized expression, it can also be (for example) an inline function expression or a map or array constructor. [Issues 1716 1829 PRs 1763 1830 25 February 2025]

Arrow expressions apply a function to a value, using the value of the left-hand expression as the first argument to the function.

`ArrowExpr`	::=	`UnaryExpr (SequenceArrowTarget \| MappingArrowTarget)*`
`UnaryExpr`	::=	`("-" \| "+")* ValueExpr`
`SequenceArrowTarget`	::=	`"=>" ArrowTarget`
`ArrowTarget`	::=	`FunctionCall \| RestrictedDynamicCall`
`FunctionCall`	::=	`EQNameArgumentList`
		/* xgc: reserved-function-names */
		/* gn: parens */
`RestrictedDynamicCall`	::=	`(VarRef \| ParenthesizedExpr \| FunctionItemExpr \| MapConstructor \| ArrayConstructor) PositionalArgumentList`
`VarRef`	::=	`"$" EQName`
`ParenthesizedExpr`	::=	`"(" Expr? ")"`
`FunctionItemExpr`	::=	`NamedFunctionRef \| InlineFunctionExpr`
`NamedFunctionRef`	::=	`EQName "#" IntegerLiteral`
		/* xgc: reserved-function-names */
`InlineFunctionExpr`	::=	`MethodAnnotation* ("function" \| "fn") FunctionSignature? FunctionBody`
`MapConstructor`	::=	`"map"? "{" (MapConstructorEntry ** ",") "}"`
`ArrayConstructor`	::=	`SquareArrayConstructor \| CurlyArrayConstructor`
`PositionalArgumentList`	::=	`"(" PositionalArguments? ")"`
`PositionalArguments`	::=	`(Argument ++ ",")`
`MappingArrowTarget`	::=	`"=!>" ArrowTarget`

The arrow syntax is particularly helpful when applying multiple functions to a value in turn. For example, the following expression invites syntax errors due to misplaced parentheses:

tokenize((normalize-unicode(upper-case($string))),"\s+")

In the following reformulation, it is easier to see that the parentheses are balanced:

$string => upper-case() => normalize-unicode() => tokenize("\s+")

When the operator is written as =!>, the function is applied to each item in the sequence in turn. Assuming that $string is a single string, the above example could equally be written:

$string =!> upper-case() =!> normalize-unicode() =!> tokenize("\s+")

The difference between the two operators is seen when the left-hand operand evaluates to a sequence:

(1, 2, 3) => avg()

returns a value of only one item, 2, the average of all three items, whereas

(1, 2, 3) =!> avg()

returns the original sequence of three items, (1, 2, 3), each item being the average of itself. The following example:

"The cat sat on the mat"
=> tokenize()
=!> concat(".")
=!> upper-case()
=> string-join(" ")

returns "THE. CAT. SAT. ON. THE. MAT.". The first arrow could be written either as => or =!> because the operand is a singleton; the next two arrows have to be =!> because the function is applied to each item in the tokenized sequence individually; the final arrow must be => because the string-join function applies to the sequence as a whole.

Note:

It may be useful to think of this as a map/reduce pipeline. The functions introduced by =!> are mapping operations; the function introduced by => is a reduce operation.
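
The difference between the two operators can be sketched in Python. This is an informal model only: `seq_arrow` and `map_arrow` are hypothetical names, sequences are represented as Python lists, and the functions `avg`, `str.split`, and `str.join` stand in for the XPath functions used in the examples above.

```python
from typing import Any, Callable, List

def seq_arrow(value: Any, func: Callable[..., Any], *args: Any) -> Any:
    """Model of =>: the whole left-hand value becomes the first argument."""
    return func(value, *args)

def map_arrow(value: Any, func: Callable[..., Any], *args: Any) -> List[Any]:
    """Model of =!>: apply func to each item and concatenate the results."""
    items = value if isinstance(value, list) else [value]
    out: List[Any] = []
    for item in items:
        result = func(item, *args)
        out.extend(result if isinstance(result, list) else [result])
    return out

def avg(seq: Any) -> float:
    """Average of a sequence; a single item is treated as a singleton sequence."""
    items = seq if isinstance(seq, list) else [seq]
    return sum(items) / len(items)

whole = seq_arrow([1, 2, 3], avg)      # (1, 2, 3) => avg()
per_item = map_arrow([1, 2, 3], avg)   # (1, 2, 3) =!> avg()

# "The cat sat on the mat" => tokenize() =!> concat(".") =!> upper-case() => string-join(" ")
words = seq_arrow("The cat sat on the mat", str.split)
dotted = map_arrow(words, lambda s, suffix: s + suffix, ".")
upper = map_arrow(dotted, str.upper)
sentence = seq_arrow(upper, lambda seq, sep: sep.join(seq), " ")
```

In map/reduce terms, the two `map_arrow` calls are the mapping stages and the final `seq_arrow` call is the reduction.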

The following example introduces an inline function to the pipeline:

(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn($a) { $a + 1 }() => sum()

This is equivalent to sum((1 to 5) ! (math:sqrt(xs:double(.)) + 1)).

The same effect can be achieved using a focus function:

(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn { . + 1 }() => sum()

It could also be expressed using the mapping operator !:

(1 to 5) ! xs:double(.) ! math:sqrt(.) ! (. + 1) => sum()
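
As an informal cross-check that these formulations describe the same computation, the following Python sketch writes it three ways. The variable names are hypothetical; `float` and `math.sqrt` stand in for xs:double and math:sqrt.

```python
import math

numbers = range(1, 6)  # models (1 to 5)

# A function with a named parameter, like fn($a) { $a + 1 }
add_one = lambda a: a + 1
with_param = sum(add_one(math.sqrt(float(n))) for n in numbers)

# One mapping step at a time, like the chain of ! operators
as_double = [float(n) for n in numbers]
roots = [math.sqrt(x) for x in as_double]
incremented = [x + 1 for x in roots]
with_steps = sum(incremented)

# A single nested expression, like sum((1 to 5) ! (math:sqrt(xs:double(.)) + 1))
nested = sum(math.sqrt(float(n)) + 1 for n in numbers)
```

All three variables hold the same total, since each form applies the same per-item operations before the final summation.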

Note:

The ArgumentList may include PlaceHolders, though this is not especially useful. For example, the expression "$" => concat(?) is equivalent to concat("$", ?): its value is a function that prepends a supplied string with a $ symbol.
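
The effect of a placeholder on the right-hand side of an arrow resembles partial application. A minimal Python sketch, using `functools.partial` and a hypothetical `concat` helper restricted to strings:

```python
from functools import partial

def concat(*parts: str) -> str:
    """Hypothetical stand-in for fn:concat, restricted to strings."""
    return "".join(parts)

# Models "$" => concat(?): the first argument is fixed, the second is left open.
prepend_dollar = partial(concat, "$")
```

Calling `prepend_dollar("100")` then behaves like calling `concat("$", "100")`.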

Note:

The ArgumentList may include keyword arguments if the function is identified statically (that is, by name). For example, the following is valid: $xml => xml-to-json(indent := true()) => parse-json(escape := false()).

The sequence arrow operator thus applies the supplied function to the left-hand operand as a whole, while the mapping arrow operator applies the function to each item in the value of the left-hand operand individually. In the case where the result of the left-hand operand is a single item, the two operators have the same effect.

Note:

The mapping arrow symbol =!> is intended to suggest a combination of function application (=>) and sequence mapping (!) combined in a single operation.

The construct on the right-hand side of the arrow operator (=>) can either be a static function call, or a restricted form of dynamic function call. The restrictions are there to ensure that the two forms can be distinguished by the parser with limited lookahead. For a dynamic call, the function item to be called can be expressed as a variable reference, an inline function expression, a named function reference, a map constructor, or an array constructor. Any other expression used to return the required function item must be enclosed in parentheses.

I Change Log (Non-Normative)

Use the arrows to browse significant changes since the 3.1 version of this specification.
See 1 Introduction
Sections with significant changes are marked Δ in the table of contents.
See 1 Introduction
Setting the default namespace for elements and types to the special value ##any causes an unprefixed element name to act as a wildcard, matching by local name regardless of namespace.
See 3.2.7.2 Element Types
The terms FunctionType, ArrayType, MapType, and RecordType replace FunctionTest, ArrayTest, MapTest, and RecordTest, with no change in meaning.
See 3.2.8.1 Function Types
Record types are added as a new kind of ItemType, constraining the value space of maps.
See 3.2.8.3 Record Types
Function coercion now allows a function with arity N to be supplied where a function of arity greater than N is expected. For example this allows the function true#0 to be supplied where a predicate function is required.
See 3.4.4 Function Coercion
PR 1817 1853
An inline function may be annotated as a %method, giving it access to its containing map.
See 4.5.6 Inline Function Expressions
See 4.5.6.1 Methods
See 4.13.3 Lookup Expressions
The symbols × and ÷ can be used for multiplication and division.
See 4.8 Arithmetic Expressions
The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double; instead, the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.
See 4.10.1 Value Comparisons
Operators such as < and > can use the full-width forms ＜ and ＞ to avoid the need for XML escaping.
See 4.10.2 General Comparisons
The lookup operator ? can now be followed by a string literal, for cases where map keys are strings other than NCNames. It can also be followed by a variable reference.
See 4.13.3 Lookup Expressions
PR 1864 1877
The key specifier can reference an item type or sequence type, to select values of that type only. This is especially useful when processing trees of maps and arrays, as encountered when processing JSON input.
See 4.13.3 Lookup Expressions
PR 1763 1830
The syntax on the right-hand side of an arrow operator has been relaxed; a dynamic function call no longer needs to start with a variable reference or a parenthesized expression, it can also be (for example) an inline function expression or a map or array constructor.
See 4.20 Arrow Expressions
The arrow operator => is now complemented by a “mapping arrow” operator =!> which applies the supplied function to each item in the input sequence independently.
See 4.20.2 Mapping Arrow Expressions
PR 1023 1128
It has been clarified that function coercion applies even when the supplied function item matches the required function type. This is to ensure that arguments supplied when calling the function are checked against the signature of the required function type, which might be stricter than the signature of the supplied function item.
See 3.4.4 Function Coercion
Parameter names may be included in a function signature; they are purely documentary.
See 3.2.8.1 Function Types
PR tba
Predicates in filter expressions for maps and arrays can now be numeric.
See 4.13.4 Filter Expressions for Maps and Arrays
The static typing feature has been dropped.
See 5 Conformance
The syntax record() is allowed; the only thing it matches is an empty map.
See 3.2.8.3 Record Types
The context value static type, which was there purely to assist in static typing, has been dropped.
See 2.2.1 Static Context
Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.
See 4.6.4.1 Axes
The syntax document-node(N), where N is a NameTestUnion, is introduced as an abbreviation for document-node(element(N)). For example, document-node(*) matches any well-formed XML document (as distinct from a document fragment).
See 3.2.7 Node Types
The rules for subtyping of document node types have been refined.
See 3.3.2.4.2 Subtyping Nodes: Document Nodes
The coercion rules now reorder the entries in a map when the required type is a record type.
See 3.4 Coercion Rules
PR 28
Multiple for and let clauses can be combined in an expression without an intervening return keyword.
See 4.12.1 For Expressions
See 4.12.2 Let Expressions
PR 159
Keyword arguments are allowed on static function calls, as well as positional arguments.
See 4.5.1.1 Static Function Call Syntax
PR 202
The presentation of the rules for the subtype relationship between sequence types and item types has been substantially rewritten to improve clarity; no change to the semantics is intended.
See 3.3 Subtype Relationships
PR 230
The rules for “errors and optimization” have been tightened up to disallow many cases of optimizations that alter error behavior. In particular there are restrictions on reordering the operands of and and or, and of predicates in filter expressions, in a way that might allow the processor to raise dynamic errors that the author intended to prevent.
See 2.4.5 Guarded Expressions
PR 254
The term "function conversion rules" used in 3.1 has been replaced by the term "coercion rules".
See 3.4 Coercion Rules
The coercion rules allow “relabeling” of a supplied atomic item where the required type is a derived atomic type: for example, it is now permitted to supply the value 3 when calling a function that expects an instance of xs:positiveInteger.
See 3.4 Coercion Rules
PR 284
Alternative syntax for conditional expressions is available: if (condition) { X }.
See 4.14 Conditional Expressions
PR 286
Element and attribute tests can include alternative names: element(chapter|section), attribute(role|class).
See 3.2.7 Node Types
The NodeTest in an AxisStep now allows alternatives: ancestor::(section|appendix)
See 3.2.7 Node Types
Element and attribute tests of the form element(N) and attribute(N) now allow N to be any NameTest, including a wildcard.
See 3.2.7.2 Element Types
See 3.2.7.3 Attribute Types
PR 324
String templates provide a new way of constructing strings: for example `{$greeting}, {$planet}!` is equivalent to $greeting || ', ' || $planet || '!'
See 4.9.2 String Templates
PR 326
Support for higher-order functions is now a mandatory feature (in 3.1 it was optional).
See 5 Conformance
PR 344
A for member clause is added to FLWOR expressions to allow iteration over an array.
See 4.12.1 For Expressions
PR 368
The concept of the context item has been generalized, so it is now a context value. That is, it is no longer constrained to be a single item.
See 2.2.2 Dynamic Context
PR 433
Numeric literals can now be written in hexadecimal or binary notation; and underscores can be included for readability.
See 4.2.1.1 Numeric Literals
PR 519
The rules for tokenization have been largely rewritten. In some cases the revised specification may affect edge cases that were handled in different ways by different 3.1 processors, which could lead to incompatible behavior.
See A.3 Lexical structure
PR 521
New abbreviated syntax is introduced (focus function) for simple inline functions taking a single argument. An example is fn { ../@code }
See 4.5.6 Inline Function Expressions
PR 603
The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.
See 2.4.6 Implausible Expressions
See 4.6.4.3 Implausible Axis Steps
PR 606
Element and attribute tests of the form element(A|B) and attribute(A|B) are now allowed.
See 3.2.7.2 Element Types
See 3.2.7.3 Attribute Types
PR 691
Enumeration types are added as a new kind of ItemType, constraining the value space of strings.
See 3.2.6 Enumeration Types
PR 728
The syntax record(*) is allowed; it matches any map.
See 3.2.8.3 Record Types
PR 815
The coercion rules now allow conversion in either direction between xs:hexBinary and xs:base64Binary.
See 3.4 Coercion Rules
PR 837
A deep lookup operator ?? is provided for searching trees of maps and arrays.
See 4.13.3 Lookup Expressions
PR 911
The coercion rules now allow any numeric type to be implicitly converted to any other, for example an xs:double is accepted where the required type is xs:decimal.
See 3.4 Coercion Rules
PR 996
The value of a predicate in a filter expression can now be a sequence of integers.
See 4.4 Filter Expressions
PR 1031
An otherwise operator is introduced: A otherwise B returns the value of A, unless it is an empty sequence, in which case it returns the value of B.
See 4.15 Otherwise Expressions
PR 1071
In map constructors, the keyword map is now optional, so map { 0: false(), 1: true() } can now be written { 0: false(), 1: true() }, provided it is used in a context where this creates no ambiguity.
See 4.13.1.1 Map Constructors
PR 1125
Lookup expressions can now take a modifier (such as keys, values, or pairs) enabling them to return structured results rather than a flattened sequence.
See 4.13.3 Lookup Expressions
PR 1131
A positional variable can be defined in a for expression.
See 4.12.1 For Expressions
The type of a variable used in a for expression can be declared.
See 4.12.1 For Expressions
The type of a variable used in a let expression can be declared.
See 4.12.2 Let Expressions
PR 1132
Choice item types (an item type allowing a set of alternative item types) are introduced.
See 3.2.5 Choice Item Types
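As an informal example, assuming the parenthesized form with | separating the alternatives (each line is a separate expression):

```xpath
5 instance of (xs:string | xs:integer)     (: true :)
5.5 instance of (xs:string | xs:integer)   (: false: 5.5 is an xs:decimal :)
```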
PR 1163
Filter expressions for maps and arrays are introduced.
See 4.13.4 Filter Expressions for Maps and Arrays
PR 1181
The default namespace for elements and types can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace.
See 2.2.1 Static Context
If the default namespace for elements and types has the special value ##any, then an unprefixed name in a NameTest acts as a wildcard, matching names in any namespace or none.
See 4.6.4.2 Node Tests
PR 1197
The keyword fn is allowed as a synonym for function in function types, to align with changes to inline function declarations.
See 3.2.8.1 Function Types
In inline function expressions, the keyword function may be abbreviated as fn.
See 4.5.6 Inline Function Expressions
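A short sketch showing the abbreviated keyword in both positions; the variable $inc is hypothetical, and the type declaration on the let variable relies on another 4.0 feature:

```xpath
(: fn used in the function type and in the inline function expression :)
let $inc as fn(xs:integer) as xs:integer := fn($x) { $x + 1 }
return $inc(41)
(: yields 42 :)
```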
PR 1212
XPath 3.0 included empty-sequence and item as reserved function names, and XPath 3.1 added map and array. This was unnecessary since these names never appear followed by a left parenthesis at the start of an expression. They have therefore been removed from the list. New keywords introducing item types, such as record and enum, have not been included in the list.
See A.4 Reserved Function Names
PR 1217
Predicates in filter expressions for maps and arrays can now be numeric.
See 4.13.4 Filter Expressions for Maps and Arrays
PR 1249
A for key/value clause is added to FLWOR expressions to allow iteration over maps.
See 4.12.1 For Expressions
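A hedged sketch of iterating over the entries of a map, assuming the clause takes the form for key ... value ... in; the variable names $k and $v and the map content are hypothetical:

```xpath
for key $k value $v in map { "a": 1, "b": 2 }
return $k || "=" || $v
(: yields "a=1" and "b=2" :)
```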
PR 1250
Several decimal format properties, including minus sign, exponent separator, percent, and per-mille, can now be rendered as arbitrary strings rather than being confined to a single character.
See 2.2.1.2 Decimal Formats
PR 1265
The rules regarding the document-uri property of nodes returned by the fn:collection function have been relaxed.
See 2.2.2 Dynamic Context
PR 1344
Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped.
See 2.2.1 Static Context
The static typing option has been dropped.
See 2.3 Processing Model
PR 1361
The term atomic value has been replaced by atomic item.
See 2.1.2 Values
PR 1384
If a type declaration is present, the supplied values in the input sequence are now coerced to the required type. Type declarations are now permitted in XPath as well as XQuery.
See 4.16 Quantified Expressions
PR 1496
The context value static type, which was there purely to assist in static typing, has been dropped.
See 2.2.1 Static Context
PR 1498
The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.
See 2.1 Terminology
The EBNF notation has been extended to allow the constructs (A ++ ",") (one or more occurrences of A, comma-separated) and (A ** ",") (zero or more occurrences of A, comma-separated).
See 2.1.1 Grammar Notation
The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.
See A.1 EBNF
See A.1.1 Notation
PR 1501
The coercion rules now apply recursively to the members of an array and the entries in a map.
See 3.4 Coercion Rules
PR 1532
Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.
See 4.6.4.1 Axes
PR 1577
The syntax record() is allowed; the only thing it matches is an empty map.
See 3.2.8.3 Record Types
PR 1686
With the pipeline operator ->, the result of an expression can be bound to the context value before evaluating another expression.
See 4.18 Pipeline operator
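As an informal illustration (each line is a separate expression, and the operands are illustrative only), the left-hand result becomes the context value for the right-hand expression:

```xpath
(1, 2, 3) -> sum(.)         (: the whole sequence is the context value; yields 6 :)
"hello" -> upper-case(.)    (: yields "HELLO" :)
```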
PR 1696
Parameter names may be included in a function signature; they are purely documentary.
See 3.2.8.1 Function Types
PR 1703
Ordered maps are introduced.
See 4.13.1 Maps
The order of key-value pairs in the map constructor is now retained in the constructed map.
See 4.13.1.1 Map Constructors
PR 1874
The coercion rules now reorder the entries in a map when the required type is a record type.
See 3.4 Coercion Rules
PR 1898
The rules for subtyping of document node types have been refined.
See 3.3.2.4.2 Subtyping Nodes: Document Nodes

XML Path Language (XPath) 4.0 WG Review Draft

W3C Editor's Draft 23 February 2026