
This draft contains only sections that have differences from the version that it modified.

W3C

XQuery 4.0: An XML Query Language

W3C Editor's Draft 13 March 2026

This version:
https://qt4cg.org/specifications/xquery-40/
Most recent version of XQuery:
https://qt4cg.org/specifications/xquery-40/
Most recent Recommendation of XQuery:
https://www.w3.org/TR/2017/REC-xquery-31-20170321/
Editor:
Michael Kay, Saxonica <mike@saxonica.com>

Please check the errata for any errors or issues reported since publication.

See also translations.


Abstract

XML is a versatile markup language, capable of labeling the information content of diverse data sources, including structured and semi-structured documents, relational databases, and object repositories. A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware. This specification describes a query language called XQuery, which is designed to be broadly applicable across many types of XML data sources.

A list of changes made since XQuery 3.1 can be found in K Change Log.

Status of this Document

This is a draft prepared by the QT4CG (officially registered in W3C as the XSLT Extensions Community Group). Comments are invited.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


5 Modules and Prologs

Module::=VersionDecl? (LibraryModule | MainModule)
MainModule::=Prolog QueryBody
LibraryModule::=ModuleDecl Prolog
Prolog::=((DefaultNamespaceDecl | Setter | NamespaceDecl | Import) Separator)* ((ContextValueDecl | AnnotatedDecl | OptionDecl) Separator)*
Setter::=BoundarySpaceDecl | DefaultCollationDecl | BaseURIDecl | ConstructionDecl | OrderingModeDecl | EmptyOrderDecl | CopyNamespacesDecl | DecimalFormatDecl
Import::=SchemaImport | ModuleImport
Separator::=";"
QueryBody::=Expr

A query can be assembled from one or more fragments called modules. [Definition: A module is a fragment of XQuery code that conforms to the Module grammar and can independently undergo the static analysis phase described in 2.3.3 Expression Processing. Each module is either a main module or a library module.]

[Definition: A main module consists of a Prolog followed by a Query Body.] A query has exactly one main module. In a main module, the Query Body is evaluated with respect to the static and dynamic contexts of the main module in which it is found, and its value is the result of the query.

[Definition: A module that does not contain a Query Body is called a library module. A library module consists of a module declaration followed by a Prolog.] A library module cannot be evaluated directly; instead, it provides function and variable declarations that can be imported into other modules.

The XQuery syntax does not allow a module to contain both a module declaration and a Query Body.

[Definition: A Prolog is a series of declarations and imports that define the processing environment for the module that contains the Prolog.] Each declaration or import is followed by a semicolon. A Prolog is organized into two parts.

The first part of the Prolog consists of setters, imports, namespace declarations, and default namespace declarations. [Definition: Setters are declarations that set the value of some property that affects query processing, such as construction mode or default collation.] Namespace declarations and default namespace declarations affect the interpretation of lexical QNames within the query. Imports are used to import definitions from schemas and modules. [Definition: The target namespace of a module is the namespace of the objects (such as elements or functions) that it defines. ]

The second part of the Prolog consists of declarations of variables, functions, and options. These declarations appear at the end of the Prolog because they may be affected by declarations and imports in the first part of the Prolog.

[Definition: The Query Body, if present, consists of an expression that defines the result of the query.] Evaluation of expressions is described in 4 Expressions. A module can be evaluated only if it has a Query Body.
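As an illustration of this layout, the following is a minimal sketch of a main module. The namespace URI, the prefix ex, and the names $ex:greeting and ex:shout are hypothetical and were chosen only for this example.

```xquery
xquery version "4.0";

(: Prolog, first part: setters, namespace declarations, imports :)
declare boundary-space strip;
declare namespace ex = "http://example.com/ns";

(: Prolog, second part: variable, function, and option declarations :)
declare variable $ex:greeting as xs:string := "hello";
declare function ex:shout($s as xs:string) as xs:string {
  upper-case($s)
};

(: Query Body: its value is the result of the query :)
ex:shout($ex:greeting)
```

Under these assumptions the query evaluates to the string "HELLO". Replacing the version declaration's successor with a module declaration and removing the Query Body would turn the same Prolog into a library module, which could then only be imported, not evaluated.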

5.20 Named Record Types

Although item type declarations, as described in 5.19 Item Type Declarations, can be used to give names to record types as well as any other item type, named record types as described in this section provide a more concise syntax, plus additional functionality. In particular:

  • Named record types can be recursive.

  • Named record types implicitly create a constructor function that can be used to create instances of the record type.

  • A field in a named record type can be a function that has implicit access to the record on which it is defined, rather like methods in object-oriented languages.

The syntax is as follows:

AnnotatedDecl::="declare" Annotation* (VarDecl | FunctionDecl | ItemTypeDecl | NamedRecordTypeDecl)
Annotation::="%" EQName ("(" (AnnotationValue ++ ",") ")")?
NamedRecordTypeDecl::="record" EQName "(" (ExtendedFieldDeclaration ** ",") ExtensibleFlag? ")"
ExtendedFieldDeclaration::=FieldDeclaration (":=" ExprSingle)?
FieldDeclaration::=FieldName "?"? ("as" SequenceType)?
FieldName::=NCName | StringLiteral

A named record declaration serves as both a named item type and as a function definition, and it therefore inherits rules from both these roles. In particular:

  1. Its name must not be the same as the name of any other named item type, or any generalized atomic type, that is present in the same static context [err:XQST0048].

  2. If the declaration appears within a library module then its name must be in the target namespace of the library module [err:XQST0048].

  3. As a function, it must not have an arity range that overlaps the arity range of any other function declaration having the same name in the same static context.

  4. The order of field declarations is significant, because it determines the order of arguments in a call to the constructor function.

  5. The fields must have distinct names. [err:XPST0021]

  6. In order to work as both a record type and a function declaration, the names of the fields must be simple NCNames in no namespace; the names must not be written as string literals [err:XPST0003].

    Note:

    This is described here as a semantic constraint, but an implementation might choose to impose it at the level of the grammar.

  7. If an initializing expression is present in an ExtendedFieldDeclaration, it must follow the rules for the initializing expression of a parameter in a function declaration, given in 5.18.3 Function Parameters. In particular, if any field has an initializing expression then all following fields must have an initializing expression.

  8. Any annotations that are present, such as %public or %private, apply both to the item type declaration and to the function declaration.
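As an illustration of several of these rules, the following sketch shows a library module that declares a recursive named record type. The namespace URI, the prefix t, and the names t:node and t:sum are hypothetical and were chosen only for this example.

```xquery
xquery version "4.0";
module namespace t = "http://example.com/tree";

(: The record name is in the target namespace of the library module (rule 2).
   Field names are plain NCNames (rule 6), and the two optional fields
   come last. The %public annotation applies to both the item type and
   the constructor function (rule 8). :)
declare %public record t:node(
  value  as xs:integer,
  left?  as t:node,
  right? as t:node
);

(: Sums the values in a tree; a lookup on an absent optional field
   yields the empty sequence, which ends the recursion. :)
declare function t:sum($n as t:node?) as xs:integer {
  if (empty($n)) then 0
  else $n?value + t:sum($n?left) + t:sum($n?right)
};
```

A main module that imports this library could then evaluate t:sum(t:node(1, t:node(2), t:node(3))), which under these assumptions would be expected to return 6.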

5.20.2 Constructor Functions for Named Record Types

The construct:

declare record cx:complex(r as xs:double, i as xs:double := 0);

implicitly defines the function:

declare function cx:complex($r as xs:double, $i as xs:double := 0) as cx:complex {
  map:merge(( { "r": $r }, { "i": $i } ))
};

So the call cx:complex(3, 2) produces the value { "r": 3e0, "i": 2e0 }, while the call cx:complex(3) produces the value { "r": 3e0, "i": 0e0 }.

The order of entries in the map corresponds to the order of field declarations in the record type. This means, for example, that when the map is serialized using the JSON output method, the order of entries in the output will correspond to the order of field declarations.

If a field is declared as optional, by including a question mark after the name, and if it has no initializer, then the initializer := () is added implicitly. If the declared type of an optional field does not permit an empty sequence, then the declared type of the function parameter is adjusted by changing the occurrence indicator (from absent to ? or from + to *) in order to make the empty sequence an acceptable value.

Furthermore, if a field is optional and has no explicit initializer, the relevant entry in the constructed map will be absent when the value supplied (implicitly or explicitly) to the function argument is an empty sequence. This is achieved by modifying the function body. Given the declaration:

declare record cx:complex(r as xs:double, i? as xs:double);

the equivalent function declaration is:

declare function cx:complex($r as xs:double, $i as xs:double? := ()) as cx:complex {
  map:merge((
    { "r": $r },
    if (exists($i)) { { "i": $i } }
  ), { "retain-order" : true() })
};

If any field is either declared optional, or has an explicit initializer, then all subsequent fields must also either be declared optional, or have an explicit initializer [err:XQST0148].
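The ordering constraint can be illustrated with a short sketch. The namespace URI and the names geo:point and geo:bad are hypothetical; the rejected declaration is left inside a comment so that the query as a whole remains valid.

```xquery
xquery version "4.0";
declare namespace geo = "http://example.com/geo";

(: Accepted: after the first field with an initializer (y), every
   later field either has an initializer or is declared optional. :)
declare record geo:point(
  x      as xs:double,
  y      as xs:double := 0,
  label? as xs:string
);

(: Expected to be rejected with err:XQST0148 if uncommented, because the
   required field y follows a field that has an initializer:

   declare record geo:bad(x as xs:double := 0, y as xs:double);
:)

(: Only the mandatory argument is supplied; y takes its default,
   and no label entry is expected in the resulting map. :)
geo:point(1.5)
```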

If the record type is declared as extensible (by the presence of a final ,*), then an additional parameter is added to the function declaration. The name of the parameter is options (provided this name is available for use), its declared type is map(*), and its default value is {} (an empty map). The function body is then modified as shown in the following example. Given the declaration:

declare record p:person(first as xs:string, last as xs:string, *);

the equivalent function declaration is:

declare function p:person(
  $first   as xs:string,
  $last    as xs:string,
  $options as map(*) := {}
) as p:person {
  map:merge((
    { "first": $first },
    { "last": $last },
    $options
  ),
  { "duplicates": "use-first" })
};

The effect of the duplicates option here is that when two values are supplied for the same field, one as a direct argument in the function call and the other in the options map, the value supplied as a direct argument is used in preference. The effect of the ordering option is that the resulting map has an entry order in which the named fields appear first, in order of declaration, followed by the extension entries supplied in $options, retaining the entry order of the $options map.

If the name options is already in use for one of the fields, then the first available name from the sequence ("options1", "options2", ...) is used instead for the additional function parameter.
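The following sketch shows a call on the constructor of an extensible record type. It assumes that field names are written without a leading $, as the grammar requires; the namespace URI bound to the prefix p and the sample values are hypothetical.

```xquery
xquery version "4.0";
declare namespace p = "http://example.com/person";

(: Extensible record: the constructor gains a third parameter, $options. :)
declare record p:person(first as xs:string, last as xs:string, *);

(: "first" is supplied both directly and in the options map; per the
   "use-first" rule described above, the direct argument "Ada" should win,
   and "born" should be carried over as an extension entry. :)
p:person("Ada", "Lovelace", { "born": 1815, "first": "Augusta" })
```

Under the rules described above, the result would be expected to contain the entries first ("Ada"), last ("Lovelace"), and born (1815), in that order.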

More formally, the equivalent function declaration is derived as follows:

  • The function annotations are the annotations on the named record declaration.

  • The function name is the QName of the named record declaration: an unprefixed name is expanded using the default namespace for elements and types. The namespace of the resulting QName must be the same as the module namespace if the declaration appears in a library module, and in any event, the name must be in some namespace.

  • The parameters of the function declaration are derived from the fields of the named record declaration, in order.

    • The name of the parameter is the name of the field (always an NCName).

    • The declared type of the parameter is the declared type of the field, if present; but if the field is optional, indicated by a question mark (?) after its name, and has no initializer, then the occurrence indicator is adjusted to permit an empty sequence, as described earlier.

    • The default value for the parameter is given by the initializing expression in the ExtendedFieldDeclaration, if present. If the field is optional and has no initializer, then it is given a default value of (), the empty sequence.

  • If the record type is extensible, then a further parameter is added at the end:

    • The name of the parameter is the first available name from the sequence ("options", "options1", "options2", ...).

    • The declared type of the parameter is map(*).

    • The default value of the parameter is {}, the empty map.

  • The return type of the function is the name of the record declaration, with no occurrence indicator.

  • The body of the function is a call of the function map:merge with two arguments:

    • The first argument is a parenthesized expression containing a comma-separated sequence of subexpressions, containing one subexpression for each field, in order, and optionally a further subexpression for the options parameter if present.

    • By default, the relevant subexpression is the map constructor { "N": $N } where N is the field name.

    • If the field is optional and is declared without an explicit initializer, then the relevant subexpression takes the form if (exists($N)) { { "N": $N } } where N is the field name.

    • The optional final subexpression, if present, takes the form $options, where $options is the name allocated to the final parameter.

    • The second argument in the call of the function map:merge is the map { "duplicates": "use-first", "retain-order": true() }.
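The derivation can be traced on a declaration that combines a required field, a field with an initializer, an optional field, and the extensibility flag. This is a sketch only: the namespace URI, the name geo:place, and the sample values are hypothetical, and the comments beside each field state the parameter that the rules above would be expected to produce.

```xquery
xquery version "4.0";
declare namespace geo = "http://example.com/geo";

declare record geo:place(
  name    as xs:string,               (: required parameter $name :)
  country as xs:string := "unknown",  (: parameter $country, default "unknown" :)
  note?   as xs:string,               (: parameter $note as xs:string? := () :)
  *                                   (: extra parameter $options as map(*) := {} :)
);

(: Three ways of calling the derived four-parameter constructor. :)
(
  geo:place("Oslo"),
  geo:place("Oslo", "Norway", "capital"),
  geo:place("Oslo", "Norway", (), { "population": 700000 })
)
```

Following the rules above, the first call should yield a map with only name and country entries, the second should add a note entry, and the third should have no note entry (the argument is an empty sequence) but should carry the population entry from the options map.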

Note that a question mark ? after the field name indicates that the field is optional from the point of view of conformance of an item to the record type. The presence of an initializer indicates that it is optional from the point of view of a call on the constructor function. The two things are independent of each other. For example:

  • record(longitude, latitude, altitude?)

    Defines a record type in which the altitude entry may be absent, and a constructor function with three arguments, of which the last is optional; if the function is called with two arguments (or with the third argument set to an empty sequence), then there will be no altitude entry in the resulting map.

  • record(longitude, latitude, altitude := 0)

    Defines a record type in which all three fields will always be present, and a constructor function in which the third argument can be omitted, defaulting to zero.

  • record(longitude, latitude, altitude? := ())

    Defines a record type in which the altitude entry may be absent from the record, and a constructor function in which the third argument can be omitted; but because a default value has been supplied explicitly, the constructed map will always have an entry for altitude.
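The difference between the three forms can be checked with a short sketch that gives each one a name. The namespace URI and the names geo:a, geo:b, and geo:c are hypothetical, and the map prefix is assumed to be bound as in the function bodies shown earlier in this section.

```xquery
xquery version "4.0";
declare namespace geo = "http://example.com/geo";

declare record geo:a(longitude, latitude, altitude?);
declare record geo:b(longitude, latitude, altitude := 0);
declare record geo:c(longitude, latitude, altitude? := ());

(: Does each constructed map contain an "altitude" entry when the
   third argument is omitted? :)
(
  map:contains(geo:a(10, 60), "altitude"),
  map:contains(geo:b(10, 60), "altitude"),
  map:contains(geo:c(10, 60), "altitude")
)
```

According to the descriptions above, the three tests would be expected to return false, true, and true respectively; in the last case the entry is present but its value is the empty sequence.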

Note:

Although the constructor function for a named record type produces a map in which the order of entries corresponds to the order of field declarations in the record type, the order of entries in a map is immaterial when testing whether a map matches the record type: the entries can be in any order.
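A minimal sketch of this point, reusing the cx:complex declaration from earlier in this section. The namespace URI bound to the prefix cx is an assumption, since the text does not give one.

```xquery
xquery version "4.0";
declare namespace cx = "http://example.com/complex";

declare record cx:complex(r as xs:double, i as xs:double := 0);

(: A map built by hand with "i" before "r"; entry order plays no part
   in the instance-of test, so both tests are expected to be true. :)
(
  { "i": 2e0, "r": 3e0 } instance of cx:complex,
  cx:complex(3, 2) instance of cx:complex
)
```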