Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: XML.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines the XQuery and XPath Data Model 4.0, which is the data model of [XML Path Language (XPath) 4.0], [XSL Transformations (XSLT) Version 4.0], and [XQuery 4.0: An XML Query Language], and any other specifications that reference it. This document is the result of joint work by the [XSLT Working Group] and the [XML Query Working Group].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is governed by the 1 March 2017 W3C Process Document.
This is a Recommendation of the W3C. It was jointly developed by the W3C XML Query Working Group and the W3C XSLT Working Group, each of which is part of the XML Activity.
This Editor's Draft specifies the XQuery and XPath Data Model (XDM) version 4.0, a fully compatible extension of XDM version 3.1.
This specification is designed to be referenced normatively from other specifications defining a host language for it; it is not intended to be implemented outside a host language. The implementability of this specification has been tested in the context of its normative inclusion in host languages defined by the XQuery 3.1 and XSLT 3.0 specifications; see the XQuery 3.1 implementation report (and, in the future, the WGs expect that there will also be an XSLT 3.0 implementation report) for details.
No substantive changes have been made to this specification since its publication as a Proposed Recommendation.
Please report errors in this document using W3C's public Bugzilla system (instructions can be found at https://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[XDM31]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at https://lists.w3.org/Archives/Public/public-qt-comments/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (W3C XML Query Working Group) and a public list of any patent disclosures (W3C XSLT Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
This section outlines a number of general concepts that apply throughout this specification.
In this document, examples and material labeled as “Note” are provided for explanatory purposes and are not normative.
Every value manipulated by XPath, XQuery, or XSLT is a sequence comprising zero or more items.
[Definition: A sequence type constrains the set of permitted sequences, by defining the permitted item types and the permitted number of items in the sequence (exactly zero, exactly one, zero-or-more, one-or-more, zero-or-one).]
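As an informal illustration of the definition above, the following Python sketch checks a sequence against an item test and an occurrence constraint. It is not part of the data model: the names `OCCURRENCE`, `matches_sequence_type`, and `is_integer` are hypothetical, and a Python list stands in for an XDM sequence.

```python
# Occurrence constraints mapped to (minimum, maximum) item counts.
# None means "no upper bound". These labels are illustrative only.
OCCURRENCE = {
    "exactly-zero": (0, 0),
    "exactly-one": (1, 1),
    "zero-or-one": (0, 1),
    "zero-or-more": (0, None),
    "one-or-more": (1, None),
}


def matches_sequence_type(seq, item_test, occurrence):
    """Return True if seq satisfies both the cardinality and the item test."""
    low, high = OCCURRENCE[occurrence]
    if len(seq) < low:
        return False
    if high is not None and len(seq) > high:
        return False
    # Every item in the sequence must be an instance of the item type.
    return all(item_test(item) for item in seq)


def is_integer(item):
    """A stand-in item test; bool is excluded because it subclasses int."""
    return isinstance(item, int) and not isinstance(item, bool)
```

Under these assumptions, `matches_sequence_type([1, 2, 3], is_integer, "one-or-more")` holds, while the empty sequence fails the same test because it violates the cardinality constraint.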
Every item is an instance of one or more item types:
All items are instances of the type item().
Every node is an instance of the type node(), and more specifically it is an instance of one of seven node kinds: document(), element(*), attribute(*), text(), comment(), processing-instruction(), or namespace(). Nodes may also be instances of more specific types characterized by the node name and type annotation.
Every atomic item is an instance of a specific atomic type determined by its type annotation; it is also an instance of every type from which that type is derived by restriction (directly or indirectly), and of every union type that includes that type as a member type.
Every function item is an instance of the generic type function(*), and also of a specific function type defining the types of the function's parameters and the type of the result.
A map item, as well as being a function, is also an instance of the generic map type map(*), of more specific map types map(K, V) defining the types of the keys and values, and perhaps of one or more record types that associate a type with specific key values.
An array item, as well as being a function, is also an instance of the generic array type array(*), and also of more specific array types array(M) defining the type of the array's members.
This section describes how item types relate to each other.
The diagrams below show how nodes, functions, primitive simple types, and user-defined types fit together into a type system. In the diagrams, connecting lines represent relationships between derived types and the types from which they are derived; the latter are always higher and to the left of the former.
The types xs:IDREFS, xs:NMTOKENS, xs:ENTITIES, and xs:numeric, together with user-defined list types and user-defined union types, are special: they are list or union types rather than types derived by extension or restriction.
The first diagram illustrates the relationship of various item types. Item types in the data model form a directed graph, rather than a hierarchy or lattice: in the relationship defined by the derived-from(A, B) function, some types are derived from more than one other type. Examples include functions (function(xs:string) as xs:int is substitutable for function(xs:NCName) as xs:int and also for function(xs:string) as xs:decimal), and union types (A is substitutable for the union type (A | B) and also for the union type (A | C)). In XDM, item types include node types, function types, and built-in atomic types. The list, which shows only hierarchic relationships, is therefore a simplification of the full model.
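The point that item types form a directed graph rather than a hierarchy can be sketched informally in Python. The example below is an assumption-laden illustration, not the specification's derived-from function: the table `SUPERTYPES` and the function `is_substitutable` are hypothetical, and the type names are the placeholder names A, B, and C used in the paragraph above.

```python
# Direct "is substitutable for" edges between hypothetical item types.
# A has two direct supertypes, so the structure is a graph, not a tree.
SUPERTYPES = {
    "A": {"(A | B)", "(A | C)"},
    "B": {"(A | B)"},
    "C": {"(A | C)"},
    "(A | B)": {"item()"},
    "(A | C)": {"item()"},
    "item()": set(),
}


def is_substitutable(sub, sup):
    """Return True if sup is reachable from sub; a type reaches itself."""
    seen = set()
    stack = [sub]
    while stack:
        current = stack.pop()
        if current == sup:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Follow every direct supertype, since there may be more than one.
        stack.extend(SUPERTYPES.get(current, ()))
    return False
```

Here `is_substitutable("A", "(A | B)")` and `is_substitutable("A", "(A | C)")` both hold, which a single-parent hierarchy could not express.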
item (abstract)
  anyAtomicType (built-in atomic)
  node (node)
    attribute (node)
      user-defined attribute types (user-defined)
    document (node)
      user-defined document types (user-defined)
    element (node)
      user-defined element types (user-defined)
    text (node)
    comment (node)
    processing-instruction (node)
    namespace (node)
  function(*) (function item)
    user-defined function item types (user-defined)
    array(*) (function item)
      user-defined array types (user-defined)
    map(*) (function item)
      user-defined map types (user-defined)
      user-defined record types (user-defined)
Legend: each subtype is indented beneath its supertype. The label in parentheses gives the kind of type: abstract types (abstract), built-in atomic types (built-in atomic), node types (node), function item types (function item), and user-defined types (user-defined).
The XPath Data Model is the abstraction over which XPath expressions are evaluated. Historically, all of the items in the data model could be derived directly (nodes) or indirectly (typed values, sequences) from an XML document. However, as the XPath expression language has matured, new features have been added which require additional types of items to appear in the data model. These items have no direct XML serialization, but they are nevertheless part of the data model.
The next diagram shows all of the atomic types, including the primitive simple types and the built-in types derived from the primitive simple types. This includes all the built-in datatypes defined in [Schema Part 2]. Atomic types act both as item types (meaning they can be used to declare the types of variables and function arguments), and as schema types (meaning they can be used as type annotations on nodes).
anyAtomicType
  anyURI
  base64Binary
  boolean
  date
  dateTime
    dateTimeStamp
  decimal
    integer
      long
        int
          short
            byte
      nonNegativeInteger
        positiveInteger
        unsignedLong
          unsignedInt
            unsignedShort
              unsignedByte
      nonPositiveInteger
        negativeInteger
  double
  duration
    dayTimeDuration
    yearMonthDuration
  float
  gDay
  gMonth
  gMonthDay
  gYear
  gYearMonth
  hexBinary
  NOTATION
  QName
  string
    normalizedString
      token
        NMTOKEN
        Name
          NCName
            ENTITY
            ID
            IDREF
        language
  time
  untypedAtomic
Legend: each subtype is indented beneath its supertype. All types shown are built-in atomic types.
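The derivation relationships among the built-in atomic types can be illustrated informally with a base-type lookup. The Python sketch below covers only a few of the types listed above; the table `BASE_TYPE` and the functions `derivation_chain` and `derives_from` are hypothetical names introduced for illustration, and the xs: prefix is added to the type names by assumption.

```python
# Each atomic type mapped to the type it is directly derived from.
# Only a subset of the built-in types is included in this sketch.
BASE_TYPE = {
    "xs:byte": "xs:short",
    "xs:short": "xs:int",
    "xs:int": "xs:long",
    "xs:long": "xs:integer",
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:NCName": "xs:Name",
    "xs:Name": "xs:token",
    "xs:token": "xs:normalizedString",
    "xs:normalizedString": "xs:string",
    "xs:string": "xs:anyAtomicType",
    "xs:double": "xs:anyAtomicType",
}


def derivation_chain(type_name):
    """Return the type followed by each successive base type."""
    chain = [type_name]
    while chain[-1] in BASE_TYPE:
        chain.append(BASE_TYPE[chain[-1]])
    return chain


def derives_from(type_name, ancestor):
    """Return True if ancestor is the type itself or one of its base types."""
    return ancestor in derivation_chain(type_name)
```

With this table, an item annotated as xs:byte is also an instance of xs:integer and xs:decimal, whereas xs:double is not derived from xs:decimal.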
[Definition: A map item (also called simply a map) is an item that represents an ordered sequence of key/value pairs, in which the keys are unique.] In other languages this is sometimes called a hash, dictionary, or associative array. The keys are atomic items, and each key in the map is unique (there is no other key to which it is equal). Each key is associated with a value that may be any sequence of zero or more items. There is no uniqueness constraint on values, only on keys. The semantics of equality when comparing keys are described in Section 13.2.1 (fn:atomic-equal) of the Functions and Operators specification.
[Definition: The key/value pairs in a map are referred to as entries.]
[Definition: A map containing exactly one entry is referred to as a single-entry map.]
[Definition: A map containing no entries is referred to as an empty map.]
Note:
Maps have no intrinsic identity separate from their content. A map can be given a transient identity, represented by an id property in its label, by applying the fn:pin function. This property is expected to be used in defining operations for deep update of maps.
[Definition: The order of entries in a map is referred to as entry order.] The entry order affects the result of functions such as map:keys and map:for-each, and also determines the order of entries when a map is serialized using the JSON output method.
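The properties described above (unique keys, values that are arbitrary sequences, and a defined entry order) can be sketched informally with a Python dictionary, which preserves insertion order. This is only an approximation: Python's own key equality stands in for fn:atomic-equal, tuples stand in for sequences, and the helper `make_map` is a hypothetical name. Rejecting duplicate keys is one possible policy, chosen here purely for illustration.

```python
def make_map(entries):
    """Build a map from (key, value-sequence) pairs, rejecting duplicate keys."""
    result = {}
    for key, value in entries:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        # Each value is a sequence of zero or more items.
        result[key] = tuple(value)
    return result


example = make_map([("b", [1, 2]), ("a", []), ("c", ["x"])])

# Iterating over the dictionary yields keys in entry order,
# which here is the order in which the entries were supplied.
keys_in_entry_order = list(example)
```

In this sketch the key "a" is associated with the empty sequence, and the entry order is b, a, c rather than any sorted order.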
Constructor and accessor functions for maps are defined in the following sections.
empty-map Constructor
dm:empty-map() as map(*)
The dm:empty-map constructor returns an empty map, that is, a map containing no key/value pairs.
Constructors are added, and the single accessor function is now an iterator over the members of the array. [Issue 1335 20 July 2024]
[Definition: An array item (also called simply an array) is a value that represents an array.] [Definition: An array is an ordered list of values; these values are called the members of the array.] Unlike an item in a sequence, a member of an array can be any value (including a sequence or an array). The number of members in an array is called its size, and members are referenced by their position, in the range 1 to the size of the array.
[Definition: An array containing exactly one member is referred to as a single-member array.]
[Definition: An array containing no members is referred to as an empty array.]
Note:
Arrays have no intrinsic identity separate from their content. An array can be given a transient identity, represented by an id property in its label, by applying the fn:pin function. This property is expected to be used in defining operations for deep update of arrays.
Constructor and accessor functions for arrays are defined in the following sections.
empty-array Constructor
dm:empty-array() as array(*)
The dm:empty-array constructor returns an empty array, that is, an array item containing no members.
The function is exposed in XPath as an empty array constructor, written [] or array {}.
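The array model described in this section can be sketched informally in Python. The class `XdmArray` below is a hypothetical illustration, not an API defined by this specification: each member is held as a tuple standing in for a sequence, and positions run from 1 to the size of the array.

```python
class XdmArray:
    """A minimal stand-in for an array item whose members are sequences."""

    def __init__(self, members=()):
        # Each member is itself a sequence of zero or more items.
        self._members = tuple(tuple(member) for member in members)

    @property
    def size(self):
        """The number of members in the array."""
        return len(self._members)

    def get(self, position):
        """Return the member at a 1-based position."""
        if not 1 <= position <= self.size:
            raise IndexError(f"position {position} out of range 1..{self.size}")
        return self._members[position - 1]


empty = XdmArray()                       # analogous to the empty array
three = XdmArray([[1, 2], [], ["x"]])    # members: (1, 2), (), ("x",)
nested = XdmArray([[three]])             # a member may contain an array
```

Note that the second member of `three` is the empty sequence, which could not appear as an item inside a flat sequence.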
When a property has no value, we say that it is absent.
An array item (also called simply an array) is a value that represents an array.
An atomic item is a pair (T, D) where T (the type annotation) is an atomic type, and D (the datum) is a point in the value space of T.
An atomic type is either a primitive simple type with variety atomic, or a type derived by restriction from another atomic type.
A character is any Unicode character.
A codepoint is a non-negative integer assigned to a character by the Unicode consortium, or reserved for future assignment to a character.
Two schemas X and Y are compatible if the union of X and Y is a valid schema.
The datum of an atomic item is a point in the value space of its type, which is also a point in the value space of the primitive type from which that type is derived.
A tree whose root node is a document node is referred to as a document.
A document order is defined among all the nodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order in which nodes appear in the XML serialization of a document.
The key/value pairs in a map are referred to as entries.
A map containing exactly one entry is referred to as a single-entry map.
An array containing no members is referred to as an empty array.
A map containing no entries is referred to as an empty map.
An array containing exactly one member is referred to as a single-member array.
The order of entries in a map is referred to as entry order.
An expanded QName is a triple consisting of a possibly absent prefix, a possibly absent namespace URI, and a local name.
A tree whose root node is not a document node is referred to as a fragment.
The arity of a function item is the number of its parameters.
A function item is an item that can be called.
A function signature represents the type of a function.
Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementer for each particular implementation.
Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementer for any particular implementation.
An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than ‘valid’ for the [validity] property in the PSVI.
Every instance of the data model is a sequence.
An item is either a node, a function, or an atomic item.
An item type represents a class of items.
A labeled item is a pair (S, L) where S (called the subject) is any item, and L (called the label) is a map containing supplementary information about the item.
A map item (also called simply a map) is an item that represents an ordered sequence of key/value pairs, in which the keys are unique.
An array is an ordered list of values; these values are called the members of the array.
This specification uses the term Namespace URI to refer to a namespace name, whether or not it is a valid URI or IRI.
There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.
The primitive simple types are the types defined in 2.2.1 Types adopted from XML Schema.
The root node is the topmost node of a tree, the node with no parent.
Following the terminology of [Schema Part 1], a schema is defined as a set of schema components. Schema components include, for example, element declarations and type definitions.
A schema type corresponds to a type definition component as defined in XSD.
A sequence is an ordered collection of zero or more items.
A sequence type constrains the set of permitted sequences, by defining the permitted item types and the permitted number of items in the sequence (exactly zero, exactly one, zero-or-more, one-or-more, zero-or-one).
Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.
A string is a sequence of zero or more characters.
The term type annotation has two slightly different meanings. For an atomic item, the type annotation of the value is the most specific atomic type that it is an instance of (it is also an instance of every type from which that type is derived). For an element or attribute node, the type annotation is the schema type (a simple or complex type) against which the node has been validated, defaulting to xs:untypedAtomic for unvalidated attribute nodes, and xs:untyped for unvalidated element nodes.
Because every value is a sequence, the term value is used synonymously with sequence.
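Several of the definitions above describe items as pairs or triples: an atomic item is a pair (T, D), a labeled item is a pair (S, L), and an expanded QName is a triple. The Python sketch below shows one way these shapes might be represented. All class and field names are hypothetical, `None` is used by assumption to model an absent property, and the label content shown in the usage note is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class ExpandedQName:
    """A triple: possibly absent prefix, possibly absent namespace URI, local name."""
    prefix: Optional[str]
    namespace_uri: Optional[str]
    local_name: str


@dataclass(frozen=True)
class AtomicItem:
    """A pair (T, D): a type annotation and a datum in that type's value space."""
    type_annotation: str
    datum: Any


@dataclass
class LabeledItem:
    """A pair (S, L): a subject item and a label map of supplementary information."""
    subject: Any
    label: dict = field(default_factory=dict)
```

For example, `LabeledItem(AtomicItem("xs:integer", 42), {"id": "hypothetical-id"})` pairs an atomic item with a label carrying an id property, and `ExpandedQName(None, None, "title")` models a name whose prefix and namespace URI are both absent.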