Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
XML is a versatile markup language, capable of labeling the information content of diverse data sources, including structured and semi-structured documents, relational databases, and object repositories. A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware. This specification describes a query language called XQuery, which is designed to be broadly applicable across many types of XML data sources.
A list of changes made since XQuery 3.1 can be found in J Change Log.
This is a draft prepared by the QT4CG (officially registered in W3C as the XSLT Extensions Community Group). Comments are invited.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
As noted in 2.1.3 Values, every value in XQuery 4.0 is regarded as a sequence of zero, one, or more items. The type system of XQuery 4.0, described in this section, classifies the kinds of value that the language can handle, and the operations permitted on different kinds of value.
The type system of XQuery 4.0 is related to the type system of [XML Schema 1.0] or [XML Schema 1.1] in two ways:
Atomic items in XQuery 4.0 (which are one kind of item) have atomic types such as xs:string, xs:boolean, and xs:integer. These types are taken directly from their definitions in [XML Schema 1.0] or [XML Schema 1.1].
XNodes (which are another kind of item) have a property called a type annotation which determines the type of their content. The type annotation is a schema type. The type annotation of a node must not be confused with the item type of the node. For example, an element <age>23</age> might have been validated against a schema that defines this element as having xs:integer content. If this is the case, the type annotation of the node will be xs:integer, and in the XQuery 4.0 type system, the node will match the item type element(age, xs:integer).
This chapter of the specification starts by defining sequence types and item types, which describe the range of values that can be bound to variables, used in expressions, or passed to functions. It then describes how these relate to schema types, that is, the simple and complex types defined in an XSD schema.
Note:
In many situations the terms item type and sequence type are used interchangeably to refer either to the type itself, or to the syntactic construct that designates the type: so in the expression $x instance of xs:string*, the construct xs:string* uses the SequenceType syntax to designate a sequence type whose instances are sequences of strings. When more precision is required, the specification is careful to use the terms item type and sequence type to refer to the actual types, while using the production names ItemType and SequenceType to refer to the syntactic designators of these types.
[Definition: An item type is a type that can be expressed using the ItemType syntax, which forms part of the SequenceType syntax. Item types match individual items.]
Note:
While this definition is adequate for the purpose of defining the syntax of XQuery 4.0, it ignores the fact that there are also item types that cannot be expressed using XQuery 4.0 syntax: specifically, item types that reference an anonymous simple type or complex type defined in a schema. Such types can appear as type annotations on nodes following schema validation.
In most cases, the set of items matched by an item type consists either exclusively of atomic items, exclusively of nodes, or exclusively of function items. Exceptions include the generic types item(), which matches all items, xs:error, which matches no items, and choice item types, which can match any combination of types.
[Definition: An item type designator is a syntactic construct conforming to the grammar rule ItemType. An item type designator is said to designate an item type.]
Note:
Two item type designators may designate the same item type. For example, element() and element(*) are equivalent, as are attribute(A) and attribute(A, xs:anySimpleType).
Lexical QNames appearing in an item type designator (other than within a function assertion) are expanded using the default type namespace rule. Equality of QNames is defined by the eq operator.
This section defines the syntax and semantics of different ItemTypes in terms of the values that they match.
Note:
For an explanation of the EBNF grammar notation (and in particular, the operators ++ and **), see A.1 EBNF.
An item type designator written simply as an EQName (that is, a TypeName) is interpreted as follows:
If the name is written as a lexical QName, then it is expanded using the default type namespace rule.
If the expanded name matches a named item type in the static context, then it is taken as a reference to the corresponding item type. The rules that apply are the rules for the expanded item type definition.
Otherwise, it must match the name of a type in the in-scope schema types in the static context: specifically, an atomic type or a pure union type. See 3.5 Schema Types for details.
Note:
A name in the xs namespace will always fall into this category, since the namespace is reserved. See 2.1.4 Namespaces and QNames.
If the name cannot be resolved to a type, a static error is raised [err:XPST0051].
[Definition: An EnumerationType accepts a fixed set of string values.]
EnumerationType ::= "enum" "(" (StringLiteral ++ ",") ")"
StringLiteral   ::= AposStringLiteral | QuotStringLiteral
                    /* ws: explicit */
An enumeration type has a value space consisting of a set of xs:string values. When matching strings against an enumeration type, strings are always compared using the Unicode codepoint collation.
For example, if an argument of a function declares the required type as enum("red", "green", "blue"), then the string "green" is accepted, while "yellow" is rejected with a type error.
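The behavior just described can be sketched as follows. This is a minimal illustration only: the function name local:paint and its body are hypothetical and do not appear elsewhere in this specification.

```xquery
(: Hypothetical function whose parameter has an enumeration type as its required type. :)
declare function local:paint(
  $color as enum("red", "green", "blue")
) as xs:string {
  "Painting in " || $color
};

(: "green" is one of the enumerated values, so this call is accepted. :)
local:paint("green")

(: By contrast, local:paint("yellow") would be rejected with a type error,
   because "yellow" is not one of the enumerated values. :)
```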
Technically, enumeration types are defined as follows:
[Definition: An enumeration type with a single enumerated value E (such as enum("red")) matches an item S if and only if (a) S is an instance of xs:string, and (b) S is equal to E when compared using Unicode codepoint collation. This is referred to as a singleton enumeration type.]
Note:
When matching a string S against an enumeration type, then apart from the requirement that S is an instance of xs:string, the type annotation of S is immaterial.
A singleton enumeration type whose enumerated value is E is a subtype of xs:string and of every subtype of xs:string that has E in its value space.
Two singleton enumeration types are the same type if and only if they have the same (single) enumerated value, as determined using the Unicode codepoint collation.
An enumeration type with multiple enumerated values is a union of singleton enumeration types, so enum("red", "green", "blue") is equivalent to (enum("red") | enum("green") | enum("blue")).
In consequence, an enumeration type T is a subtype of an enumeration type U if the enumerated values of T are a subset of the enumerated values of U: see 3.3.2 Subtypes of Item Types.
An enumeration type is a generalized atomic type.
It follows from these rules that the expression "red" instance of enum("red", "green", "blue") returns true. By contrast, xs:untypedAtomic("red") instance of enum("red", "green", "blue") returns false; but the coercion rules ensure that where a variable or function declaration specifies an enumeration type as the required type, an xs:untypedAtomic or xs:anyURI value equal to one of the enumerated values will be accepted.
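The difference between matching and coercion can be sketched as below. The variable name $c is an assumption made purely for illustration.

```xquery
(: The declared type is an enumeration type; the supplied value is xs:untypedAtomic.
   Under the coercion rules, the declaration is accepted because the value
   is equal to one of the enumerated values. :)
declare variable $c as enum("red", "green", "blue") := xs:untypedAtomic("red");

(
  "red" instance of enum("red", "green", "blue"),                    (: true :)
  xs:untypedAtomic("red") instance of enum("red", "green", "blue"),  (: false: instance of does not coerce :)
  $c                                                                 (: accepted by coercion at binding time :)
)
```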
Note:
Some consequences of these rules may not be immediately apparent.
Suppose that an XQuery query contains the declarations:
declare type my:color := enum("red", "green", "orange");
declare type my:fruit := enum("apple", "orange", "banana");
declare variable $orange-color as my:color := "orange";
declare variable $orange-fruit as my:fruit := "orange";
The same applies with the equivalent XSLT syntax:
<xsl:item-type name="my:color" as="enum('red', 'green', 'orange')"/>
<xsl:item-type name="my:fruit" as="enum('apple', 'orange', 'banana')"/>
<xsl:variable name="orange-color" as="my:color" select="'orange'"/>
<xsl:variable name="orange-fruit" as="my:fruit" select="'orange'"/>
Now, the value of $orange-color is an atomic item whose datum is the string "orange", and whose type annotation is xs:string. Similarly, the value of $orange-fruit is an atomic item whose datum is the string "orange", and whose type annotation is xs:string. That is, the values of the two variables are indistinguishable and interchangeable in every way. In particular, both values are instances of my:color, and both are instances of my:fruit.
This way of handling enumeration values has advantages and disadvantages. On the positive side, it means that enumeration subsets and supersets work cleanly: a value that is an instance of enum("red", "green", "orange") can be used where an instance of enum("red", "orange", "yellow", "green", "blue", "indigo", "violet") is expected. The downside is that labeling a string as an instance of an enumeration type does not provide type safety: a function that expects an instance of my:color can be called with any string that matches one of the required colors, whether or not it has an appropriate type annotation. A function that expects a color can be successfully called passing a fruit, if they happen to have the same name.
In the terminology of computer science, XDM atomic types derived by restriction are nominative types, allowing two types with identical properties but different names to be treated as different types with different instances. By contrast, enumeration types are structural types, where membership of the type is determined purely by a predicate applied to the value.
In consequence, instances of an enumeration type are not annotated as such. The type annotation of such an instance may be xs:string or any type derived by restriction from xs:string, but it will not be the enumeration type itself, which is anonymous.
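The lack of type safety described in this note can be sketched as follows, reusing the my:color and my:fruit declarations. The namespace URI bound to the prefix my and the function local:describe are hypothetical, introduced only for this illustration.

```xquery
declare namespace my = "http://example.com/my";  (: hypothetical namespace URI :)

declare type my:color := enum("red", "green", "orange");
declare type my:fruit := enum("apple", "orange", "banana");
declare variable $orange-fruit as my:fruit := "orange";

(: Hypothetical function that expects a color. :)
declare function local:describe($c as my:color) as xs:string {
  "The color is " || $c
};

(
  local:describe($orange-fruit),       (: accepted: the string "orange" matches my:color :)
  $orange-fruit instance of my:color,  (: true :)
  $orange-fruit instance of my:fruit   (: true :)
)
```

Because membership is decided structurally, nothing in the value of $orange-fruit records that it was declared as a fruit.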
[Definition: Given two sequence types or item types, the rules in this section determine if one is a subtype of the other. If a type A is a subtype of type B, it follows that every value matched by A is also matched by B.]
Note:
The relationship subtype(A, A) is always true: every type is a subtype of itself.
Note:
The converse is not necessarily true: we cannot infer that if every value matched by A is also matched by B, then A is a subtype of type B. For example, A might be defined as the set of strings matching the regular expression [A-Z]*, while B is the set of strings matching the regular expression [A-Za-z]*; no subtype relationship holds between these types.
The rules for deciding whether one sequence type is a subtype of another are given in 3.3.1 Subtypes of Sequence Types. The rules for deciding whether one item type is a subtype of another are given in 3.3.2 Subtypes of Item Types.
Note:
The subtype relationship is not acyclic. There are cases where subtype(A, B) and subtype(B, A) are both true. This implies that A and B have the same value space, but they can still be different types. For example, this applies when A is a union type with member types xs:string and xs:integer, while B is a union type with member types xs:integer and xs:string. These are different types ("23" cast as A produces a string, while "23" cast as B produces an integer, because casting is attempted to each member type in order), but both types have the same value space.
We use the notation A ⊆ B, or itemtype-subtype(A, B), to indicate that an item type A is a subtype of an item type B. This section defines the rules for deciding whether any two item types have this relationship.
The rules in this section apply to item types, not to item type designators. For example, if the name STR has been defined in the static context as a named item type referring to the type xs:string, then anything said here about the type xs:string applies equally whether it is designated as xs:string or as STR, or indeed as the parenthesized forms (xs:string) or (STR).
References to named item types are handled as described in 3.3.2.10 Subtyping of Named Item Types.
The relationship A ⊆ B is true if and only if at least one of the conditions listed in the following subsections applies:
If A is a singleton enumeration type permitting the string value V, then A ⊆ B is true if any of the following apply:
B is xs:string.
B is any subtype of xs:string whose value space includes the datum of V (regardless of the type annotation of V).
For example, enum("Z") is a subtype of each of the types xs:string, xs:token, and xs:NCName.
Note:
Because a non-singleton enumeration type is defined as a choice type, A ⊆ B also holds if A is enum("red") and B is enum("red", "green"). See 3.3.2.2 Subtyping of Choice Item Types.
Note:
The type enum("red", "green") is not a subtype of xs:NCName, despite the fact that all the enumerated values are valid NCNames. This is because instances of xs:NCName must have a type annotation of xs:NCName or a subtype thereof, whereas instances of enum("red", "green") are not subject to this constraint.
Note:
A type T derived by restriction from xs:string, for example a type with the facet length="0" (which permits only the zero-length string), is not a subtype of any enumeration type, even if every string in the value space of T is an instance of the enumeration type.
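The asymmetry described in the notes above can be observed with instance of expressions. The following is a sketch for illustration, not a normative test:

```xquery
(
  "red" instance of enum("red", "green"),             (: true: an xs:string equal to an enumerated value :)
  "red" instance of xs:NCName,                        (: false: the type annotation is xs:string :)
  xs:NCName("red") instance of enum("red", "green"),  (: true: xs:NCName is derived from xs:string :)
  xs:NCName("red") instance of xs:NCName              (: true :)
)
```

The second result shows why enum("red", "green") cannot be a subtype of xs:NCName: the plain string "red" is an instance of the enumeration type but not of xs:NCName.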
PR 691 2154
Enumeration types are added as a new kind of ItemType, constraining the value space of strings.
Setting the default namespace for elements and types to the special value ##any causes an unprefixed element name to act as a wildcard, matching by local name regardless of namespace.
The terms FunctionType, ArrayType, MapType, and RecordType replace FunctionTest, ArrayTest, MapTest, and RecordTest, with no change in meaning.
Record types are added as a new kind of ItemType, constraining the value space of maps.
Function coercion now allows a function with arity N to be supplied where a function of arity greater than N is expected. For example this allows the function true#0 to be supplied where a predicate function is required.
PR 1817 1853
An inline function may be annotated as a %method, giving it access to its containing map.
See 4.5.6 Inline Function Expressions
See 4.5.6.1 Methods
The symbols × and ÷ can be used for multiplication and division.
The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double; instead, the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.
Operators such as < and > can use the full-width forms ＜ and ＞ to avoid the need for XML escaping.
Operator is-not is introduced, as a complement to the operator is.
Operators precedes and follows are introduced as synonyms for operators << and >>.
PR 1480 1989
When the element name matches a language keyword such as div or value, it must now be written as a QName literal. This is a backwards incompatible change.
See 4.12.3.1 Computed Element Constructors
When the attribute name matches a language keyword such as by or of, it must now be written as a QName literal. This is a backwards incompatible change.
PR 1513 2028
When the processing instruction name matches a language keyword such as try or validate, it must now be written with a preceding # character. This is a backwards incompatible change.
See 4.12.3.5 Computed Processing Instruction Constructors
When the namespace prefix matches a language keyword such as as or at, it must now be written with a preceding # character. This is a backwards incompatible change.
The lookup operator ? can now be followed by a string literal, for cases where map keys are strings other than NCNames. It can also be followed by a variable reference.
PR 1763 1830
The syntax on the right-hand side of an arrow operator has been relaxed; a dynamic function call no longer needs to start with a variable reference or a parenthesized expression, it can also be (for example) an inline function expression or a map or array constructor.
The arrow operator => is now complemented by a “mapping arrow” operator =!> which applies the supplied function to each item in the input sequence independently.
All implementations must now predeclare the namespace prefixes math, map, array, and err. In XQuery 3.1 it was permitted but not required to predeclare these namespaces.
PR 254 2050
The supplied context value is now coerced to the required type specified in the main module using the coercion rules.
Function definitions in the static context may now have optional parameters, provided this does not cause ambiguity across multiple function definitions with the same name. Optional parameters are given a default value, which can be any expression, including one that depends on the context of the caller (so an argument can default to the context value).
PR 682 TODO
The values true() and false() are allowed in function annotations, as well as negated numeric literals and QName literals.
PR 1023 1128
It has been clarified that function coercion applies even when the supplied function item matches the required function type. This is to ensure that arguments supplied when calling the function are checked against the signature of the required function type, which might be stricter than the signature of the supplied function item.
A dynamic function call can now be applied to a sequence of functions, and in particular to an empty sequence. This makes it easier to chain a sequence of calls.
Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped.
The syntax document-node(N), where N is a NameTestUnion, is introduced as an abbreviation for document-node(element(N)). For example, document-node(*) matches any well-formed XML document (as distinct from a document fragment).
See 3.2.7 Node Types
QName literals are new in 4.0.
Path expressions are extended to handle JNodes (found in trees of maps and arrays) as well as XNodes (found in trees representing parsed XML).
PR 159
Keyword arguments are allowed on static function calls, as well as positional arguments.
PR 202
The presentation of the rules for the subtype relationship between sequence types and item types has been substantially rewritten to improve clarity; no change to the semantics is intended.
PR 230
The rules for “errors and optimization” have been tightened up to disallow many cases of optimizations that alter error behavior. In particular there are restrictions on reordering the operands of and and or, and of predicates in filter expressions, in a way that might allow the processor to raise dynamic errors that the author intended to prevent.
PR 254
The term "function conversion rules" used in 3.1 has been replaced by the term "coercion rules".
The coercion rules allow “relabeling” of a supplied atomic item where the required type is a derived atomic type: for example, it is now permitted to supply the value 3 when calling a function that expects an instance of xs:positiveInteger.
The value bound to a variable in a let clause is now converted to the declared type by applying the coercion rules.
The coercion rules are now used when binding values to variables (both global variable declarations and local variable bindings). This aligns XQuery with XSLT, and means that the rules for binding to variables are the same as the rules for binding to function parameters.
PR 284
Alternative syntax for conditional expressions is available: if (condition) { X }.
PR 286
Element and attribute tests can include alternative names: element(chapter|section), attribute(role|class).
See 3.2.7 Node Types
The NodeTest in an AxisStep now allows alternatives: ancestor::(section|appendix)
See 3.2.7 Node Types
Element and attribute tests of the form element(N) and attribute(N) now allow N to be any NameTest, including a wildcard.
PR 324
String templates provide a new way of constructing strings: for example `{$greeting}, {$planet}!` is equivalent to $greeting || ', ' || $planet || '!'
PR 326
Support for higher-order functions is now a mandatory feature (in 3.1 it was optional).
See 6 Conformance
PR 344
A for member clause is added to FLWOR expressions to allow iteration over an array.
PR 364
Switch expressions now allow a case clause to match multiple atomic items.
PR 368
The concept of the context item has been generalized, so it is now a context value. That is, it is no longer constrained to be a single item.
PR 433
Numeric literals can now be written in hexadecimal or binary notation; and underscores can be included for readability.
PR 483
The start clause in window expressions has become optional, as well as the when keyword and its associated expression.
PR 493
A new variable $err:map is available, capturing all error information in one place.
PR 519
The rules for tokenization have been largely rewritten. In some cases the revised specification may affect edge cases that were handled in different ways by different 3.1 processors, which could lead to incompatible behavior.
PR 521
New abbreviated syntax is introduced (focus function) for simple inline functions taking a single argument. An example is fn { ../@code }
PR 587
Switch and typeswitch expressions can now be written with curly brackets, to improve readability.
PR 603
The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.
PR 606
Element and attribute tests of the form element(A|B) and attribute(A|B) are now allowed.
PR 635
The rules for the consistency of schemas imported by different query modules, and for consistency between imported schemas and those used for validating input documents, have been defined with greater precision. It is now recognized that these schemas will not always be identical, and that validation with respect to different schemas may produce different outcomes, even if the components of one are a subset of the components of the other.
PR 659
In previous versions the interpretation of location hints in import schema declarations was entirely at the discretion of the processor. To improve interoperability, XQuery 4.0 recommends (but does not mandate) a specific strategy for interpreting these hints.
PR 678
The comparand expression in a switch expression can be omitted, allowing the switch cases to be provided as arbitrary boolean expressions.
PR 691
Enumeration types are added as a new kind of ItemType, constraining the value space of strings.
PR 728
The syntax record(*) is allowed; it matches any map.
PR 753
The default namespace for elements and types can now be declared to be fixed for a query module, meaning it is unaffected by a namespace declaration appearing on a direct element constructor.
PR 815
The coercion rules now allow conversion in either direction between xs:hexBinary and xs:base64Binary.
PR 820
The value bound to a variable in a for clause is now converted to the declared type by applying the coercion rules.
PR 911
The coercion rules now allow any numeric type to be implicitly converted to any other, for example an xs:double is accepted where the required type is xs:decimal.
PR 943
A FLWOR expression may now include a while clause, which causes early exit from the iteration when a condition is encountered.
PR 996
The value of a predicate in a filter expression can now be a sequence of integers.
PR 1031
An otherwise operator is introduced: A otherwise B returns the value of A, unless it is an empty sequence, in which case it returns the value of B.
PR 1071
In map constructors, the keyword map is now optional, so map { 0: false(), 1: true() } can now be written { 0: false(), 1: true() }, provided it is used in a context where this creates no ambiguity.
PR 1132
Choice item types (an item type allowing a set of alternative item types) are introduced.
PR 1163
Filter expressions for maps and arrays are introduced.
PR 1181
The default namespace for elements and types can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace.
If the default namespace for elements and types has the special value ##any, then an unprefixed name in a NameTest acts as a wildcard, matching names in any namespace or none.
PR 1197
The keyword fn is allowed as a synonym for function in function types, to align with changes to inline function declarations.
In inline function expressions, the keyword function may be abbreviated as fn.
PR 1212
XQuery 3.0 included empty-sequence and item as reserved function names, and XQuery 3.1 added map and array. This was unnecessary since these names never appear followed by a left parenthesis at the start of an expression. They have therefore been removed from the list. New keywords introducing item types, such as record and enum, have not been included in the list.
PR 1217
Predicates in filter expressions for maps and arrays can now be numeric.
PR 1249
A for key/value clause is added to FLWOR expressions to allow iteration over a map.
PR 1250
Several decimal format properties, including minus sign, exponent separator, percent, and per-mille, can now be rendered as arbitrary strings rather than being confined to a single character.
PR 1254
The rules concerning the interpretation of xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes have been tightened up.
PR 1265
The rules regarding the document-uri property of nodes returned by the fn:collection function have been relaxed.
PR 1342
The ordered { E } and unordered { E } expressions are retained for backwards compatibility reasons, but in XQuery 4.0 they are deprecated and have no useful effect.
See 4.15 Ordered and Unordered Expressions
The ordering mode declaration is retained for backwards compatibility reasons, but in XQuery 4.0 it is deprecated and has no useful effect.
PR 1344
Parts of the static context that were there purely to assist in static typing, such as the statically known documents, were no longer referenced and have therefore been dropped.
The static typing option has been dropped.
The static typing feature has been dropped.
See 6 Conformance
PR 1361
The term atomic value has been replaced by atomic item.
See 2.1.3 Values
PR 1384
If a type declaration is present, the supplied values in the input sequence are now coerced to the required type. Type declarations are now permitted in XPath as well as XQuery.
PR 1432
In earlier versions, the static context for the initializing expression excluded the variable being declared. This restriction has been lifted.
PR 1470
$err:stack-trace provides information about the current state of execution.
PR 1496
The context value static type, which was there purely to assist in static typing, has been dropped.
PR 1498
The EBNF operators ++ and ** have been introduced, for more concise representation of sequences using a character such as "," as a separator. The notation is borrowed from Invisible XML.
See 2.1 Terminology
The EBNF notation has been extended to allow the constructs (A ++ ",") (one or more occurrences of A, comma-separated), and (A ** ",") (zero or more occurrences of A, comma-separated).
See A.1 EBNF
See A.1.1 Notation
PR 1501
The coercion rules now apply recursively to the members of an array and the entries in a map.
PR 1532
Four new axes have been defined: preceding-or-self, preceding-sibling-or-self, following-or-self, and following-sibling-or-self.
See 4.6.4.1 Axes
PR 1577
The syntax record() is allowed; the only thing it matches is an empty map.
PR 1686
With the pipeline operator ->, the result of an expression can be bound to the context value before evaluating another expression.
PR 1696
Parameter names may be included in a function signature; they are purely documentary.
PR 1703
Ordered maps are introduced.
See 4.14.1 Maps
The order of key-value pairs in the map constructor is now retained in the constructed map.
PR 1874
The coercion rules now reorder the entries in a map when the required type is a record type.
PR 1898
The rules for subtyping of document node types have been refined.
PR 1914
A finally clause can be supplied, which will always be evaluated after the expressions of the try/catch clauses.
PR 1956
Private variables declared in a library module are no longer required to be in the module namespace.
Private functions declared in a library module are no longer required to be in the module namespace.
PR 1982
Whitespace is now required after the opening (# of a pragma. This is an incompatible change, made to ensure that an expression such as error(#err:XPTY0004) can be parsed as a function call taking a QName literal as its argument value.
PR 1991
Named record types used in the signatures of built-in functions are now available as standard in the static context.
PR 2026
The module feature is no longer an optional feature; processing of library modules is now required.
See 6 Conformance
PR 2030
The technical details of how validation works have been moved to the Functions and Operators specification. The XQuery validate expression is now defined in terms of the new xsd-validator function.
PR 2031
The terms XNode and JNode are introduced; the existing term node remains in use as a synonym for XNode where the context does not specify otherwise.
See 2.1.3 Values
JNodes are introduced.
PR 2055
Sequences, arrays, and maps can be destructured in a let clause to extract their components into multiple variables.
PR 2094
A general expression is allowed within a map constructor; this facilitates the creation of maps in which the presence or absence of particular keys is decided dynamically.
PR 2115
This section describes and formalizes a convention that was already in use, but not explicitly stated, in earlier versions of the specification.