Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: XML.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines the XQuery and XPath Data Model 4.0, which is the data model of [XML Path Language (XPath) 4.0], [XSL Transformations (XSLT) Version 4.0], and [XQuery 4.0: An XML Query Language], and any other specifications that reference it. This document is the result of joint work by the [XSLT Working Group] and the [XML Query Working Group].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is governed by the 1 March 2017 W3C Process Document.
This is a Recommendation of the W3C. It was jointly developed by the W3C XML Query Working Group and the W3C XSLT Working Group, each of which is part of the XML Activity.
This Editor's Draft specifies the XQuery and XPath Data Model (XDM) version 4.0, a fully compatible extension of XDM version 3.1.
This specification is designed to be referenced normatively from other specifications defining a host language for it; it is not intended to be implemented outside a host language. The implementability of this specification has been tested in the context of its normative inclusion in host languages defined by the XQuery 3.1 and XSLT 3.0 specifications; see the XQuery 3.1 implementation report (and, in the future, the WGs expect that there will also be an XSLT 3.0 implementation report) for details.
No substantive changes have been made to this specification since its publication as a Proposed Recommendation.
Please report errors in this document using W3C's public Bugzilla system (instructions can be found at https://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[XDM31]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at https://lists.w3.org/Archives/Public/public-qt-comments/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (W3C XML Query Working Group) and a public list of any patent disclosures (W3C XSLT Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
[Definition: A JNode is a kind of item used to represent a value within the context of a tree of maps and arrays. A root JNode represents a map or array; a non-root JNode represents a member of an array, an entry in a map, or an item in a sequence.]
The type JNode is a subtype of GNode (a generic node). JNodes have identity, and are organized into trees called JTrees.
[Definition: Every JNode has a property ·content· which is an arbitrary value (that is, in general, a sequence).]
In addition a JNode that is not a root JNode has the following properties:
·parent·: a JNode
·selector·: an atomic item
·kind·: one of "array", "map", or "sequence"
In effect, there are five kinds of JNode (though they are not distinguished as different types in the type system):
A root JNode that represents a map.
A root JNode that represents an array.
A non-root JNode that represents an entry in a map (specifically, the entry whose key is given by the ·selector· property).
A non-root JNode that represents a member of an array (specifically, the member whose 1-based index is given by the ·selector· property).
A non-root JNode that represents an item in a sequence (specifically, the item whose 1-based position is given by the ·selector· property).
[Definition: The accessor j-children, applied to a JNode $P, returns a sequence of non-root JNodes representing the children of $P.]
Values are classified as leaf or non-leaf. A value is classified as non-leaf if and only if at least one item in the value is a non-empty map or array. If the ·content· property of a JNode is classified as leaf, then the j-children accessor returns an empty sequence.
Note:
If the j-children accessor of a JNode returns an empty sequence, then it is necessary to examine the ·content· property in order to distinguish whether the value is (for example) an empty sequence, an atomic item, an empty map, or an empty array.
Sequences of length greater than one do not arise in trees that result from parsing JSON, but they can arise in arbitrary XDM trees, because an array member can be an arbitrary XDM value, as can the value associated with a key in a map. A JNode whose ·content· property is a singleton item other than a map or array has no children: it acts as a leaf node in the tree.
Specifically, a node is a leaf node in the tree (that is, a node with no children) if its ·content· is one of the following:
An empty sequence
An empty array
An empty map
A singleton sequence other than an array or a map.
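The four leaf cases listed above can be sketched as a small predicate. This is an illustrative Python model only, not part of the data model: it assumes an XDM value is modelled as a Python tuple of items, an array as a `list`, and a map as a `dict`; the names `as_value` and `is_leaf` are hypothetical.

```python
def as_value(x):
    """Normalize a Python object to a value: a tuple of items (assumed model)."""
    return x if isinstance(x, tuple) else (x,)


def is_leaf(content):
    """Return True if content is one of the four leaf cases listed above."""
    value = as_value(content)
    if len(value) == 0:
        return True            # an empty sequence
    if len(value) >= 2:
        return False           # not one of the listed leaf cases
    item = value[0]
    if isinstance(item, (list, dict)):
        return len(item) == 0  # an empty array or an empty map
    return True                # a singleton that is neither an array nor a map


leaf_examples = [(), [], {}, 1, "XXX"]
non_leaf_examples = [("x", "y"), [1], {"a": 1}]
```

Under these assumptions every value in `leaf_examples` is a leaf and every value in `non_leaf_examples` is not.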
If the ·content· property of a JNode $P is classified as non-leaf, then the j-children accessor returns a sequence of JNodes containing one JNode for each item of a sequence of two or more items, each member of an array, or each entry of a map, depending on the form of the ·content·. More specifically, the result is determined by the expression below. This expression uses the following notation:
The function jnode-content is used to access the ·content· property of a JNode.
The function dm:JNode is used to construct (or obtain) a JNode whose properties correspond to the names of the argument keywords.
  let $content := jnode-content($P)
  return
    if (count($content) ge 2)
    then for $item at $position in $content
         return dm:JNode(parent := $P,
                         kind := "sequence",
                         selector := $position,
                         content := $item)
    else if ($content instance of array(*))
    then for member $member at $index in $content
         return dm:JNode(parent := $P,
                         kind := "array",
                         selector := $index,
                         content := $member)
    else if ($content instance of map(*))
    then for key $key value $value in $content
         return dm:JNode(parent := $P,
                         kind := "map",
                         selector := $key,
                         content := $value)
    else ()
The order of JNodes returned by the j-children accessor is significant, and is as defined by the above expression.
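To make the three-way branching on sequence, array, and map concrete, here is a minimal Python sketch of the same logic. It is an approximation under stated assumptions, not a normative definition: values are modelled as tuples of items, arrays as `list`, maps as `dict`, and the names `JNode`, `as_value`, and `j_children` are hypothetical stand-ins for dm:JNode and the j-children accessor.

```python
from dataclasses import dataclass
from typing import Any, Optional


def as_value(x):
    """Normalize a Python object to a value: a tuple of items (assumed model)."""
    return x if isinstance(x, tuple) else (x,)


@dataclass(eq=False)  # eq=False: nodes compare by object identity in this sketch
class JNode:
    content: tuple
    parent: Optional["JNode"] = None
    kind: Optional[str] = None
    selector: Any = None


def j_children(p):
    """Return the child JNodes of p, in order."""
    content = p.content
    if len(content) >= 2:
        # One child per item of a multi-item sequence, selected by position.
        return [JNode(as_value(item), p, "sequence", pos)
                for pos, item in enumerate(content, start=1)]
    if len(content) == 1:
        item = content[0]
        if isinstance(item, list):
            # One child per array member, selected by 1-based index.
            return [JNode(as_value(member), p, "array", index)
                    for index, member in enumerate(item, start=1)]
        if isinstance(item, dict):
            # One child per map entry, selected by key.
            return [JNode(as_value(value), p, "map", key)
                    for key, value in item.items()]
    return []


root = JNode(as_value({"c": [(10, 20), 30]}))
(entry,) = j_children(root)            # the single map entry "c"
first, second = j_children(entry)      # the two array members
grandchildren = j_children(first)      # the items of the sequence (10, 20)
```

Here `entry` has kind "map" and selector "c", `first` and `second` have kind "array" with selectors 1 and 2, and `grandchildren` holds two nodes of kind "sequence".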
In a JTree derived from JSON, every member of an array will always be either a single item (a string, number, boolean, array, or map), or an empty sequence representing the JSON value null. Similarly, every entry in a map will have a key that is a single xs:string, and a corresponding value that is either a single item (as above) or an empty sequence representing null.
Consider a tree constructed by parsing the following JSON input:
  [
    {"a": 1, "b": "XXX", "c": true, "d": null},
    {"a": 2, "b": "YYY", "c": false, "d": null}
  ]
Then:
The root JNode R has a ·content· property that is an array of two maps.
The result of the dm:j-children accessor applied to R is a sequence of two JNodes M1 and M2, each representing one of the two maps.
For M1:
·parent· is R.
·content· is the first map.
·kind· is "array".
·selector· is 1.
The dm:j-children accessor returns a sequence of four JNodes, as follows:
All four have ·parent· set to M1.
All four have ·kind· set to "map".
The ·content· properties are respectively 1, "XXX", true(), and ().
The ·selector· properties are respectively "a", "b", "c", and "d".
For each of these JNodes, the dm:j-children accessor returns an empty sequence.
For M2:
·parent· is R.
·content· is the second map.
·kind· is "array".
·selector· is 2.
The dm:j-children accessor returns a sequence of four JNodes, as follows:
All four have ·parent· set to M2.
All four have ·kind· set to "map".
The ·content· properties are respectively 2, "YYY", false(), and ().
The ·selector· properties are respectively "a", "b", "c", and "d".
For each of these JNodes, the dm:j-children accessor returns an empty sequence.
Note that all non-root JNodes in a JTree that represents parsed JSON input will have the ·kind· property set to either "map" or "array". This is because every construct in JSON (arrays, objects, strings, numbers, booleans, and null) maps to an XDM sequence of length 0 or 1, so sequences of length 2 or more do not arise.
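The JSON example can be replayed with a short Python sketch. It assumes a simplified model in which JSON null becomes an empty tuple (standing in for the empty sequence), arrays are `list`, and maps are `dict`; the helper names `from_json`, `as_value`, and `children` are hypothetical, and `children` returns plain (kind, selector, content) triples rather than real JNodes.

```python
import json


def from_json(x):
    """Map parsed JSON to the assumed model: null becomes the empty sequence ()."""
    if x is None:
        return ()
    if isinstance(x, list):
        return [from_json(member) for member in x]
    if isinstance(x, dict):
        return {key: from_json(value) for key, value in x.items()}
    return x


def as_value(x):
    """Normalize a Python object to a value: a tuple of items."""
    return x if isinstance(x, tuple) else (x,)


def children(content):
    """Return (kind, selector, content) triples for the children of a value."""
    value = as_value(content)
    if len(value) >= 2:
        return [("sequence", i, (item,)) for i, item in enumerate(value, start=1)]
    if len(value) == 1 and isinstance(value[0], list):
        return [("array", i, as_value(m)) for i, m in enumerate(value[0], start=1)]
    if len(value) == 1 and isinstance(value[0], dict):
        return [("map", k, as_value(v)) for k, v in value[0].items()]
    return []


TEXT = """[
  {"a": 1, "b": "XXX", "c": true, "d": null},
  {"a": 2, "b": "YYY", "c": false, "d": null}
]"""

root = from_json(json.loads(TEXT))
m1, m2 = children(root)        # the two maps, as members of the array
m1_entries = children(m1[2])   # the four entries of the first map
m2_entries = children(m2[2])   # the four entries of the second map
kinds = {kind for kind, _, _ in [m1, m2] + m1_entries + m2_entries}
```

In this sketch `kinds` contains only "array" and "map", matching the observation that the "sequence" kind does not arise for parsed JSON.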
Consider a JTree that wraps a map constructed by the following XQuery expression:
  {
    "a": 1,
    "b": ("x", "y"),
    "c": [(10, 20), 30],
    "d": (10, [20, 30]),
    "e": (<p/>, <q/>)
  }
Then:
The root JNode R has a ·content· property that is a map with five entries.
The result of the dm:j-children accessor applied to R is a sequence of five JNodes J1, J2, J3, J4, and J5, one for each entry in the map.
The JNodes comprising this tree are as indicated in the following table:
| Id | ·content· | ·kind· | ·selector· | ·parent· | ·children· |
|---|---|---|---|---|---|
| R | { "a": 1, "b": ("x", "y"), "c": [(10, 20), 30], "d": (10, [20, 30]), "e": (<p/>, <q/>) } | () | () | () | J1, J2, J3, J4, J5 |
| J1 | 1 | "map" | "a" | R | () |
| J2 | "x", "y" | "map" | "b" | R | J21, J22 |
| J21 | "x" | "sequence" | 1 | J2 | () |
| J22 | "y" | "sequence" | 2 | J2 | () |
| J3 | [(10, 20), 30] | "map" | "c" | R | J31, J32 |
| J31 | (10, 20) | "array" | 1 | J3 | J311, J312 |
| J311 | 10 | "sequence" | 1 | J31 | () |
| J312 | 20 | "sequence" | 2 | J31 | () |
| J32 | 30 | "array" | 2 | J3 | () |
| J4 | (10, [20, 30]) | "map" | "d" | R | J41, J42 |
| J41 | 10 | "sequence" | 1 | J4 | () |
| J42 | [20, 30] | "sequence" | 2 | J4 | J421, J422 |
| J421 | 20 | "array" | 1 | J42 | () |
| J422 | 30 | "array" | 2 | J42 | () |
| J5 | <p/>, <q/> | "map" | "e" | R | J51, J52 |
| J51 | <p/> | "sequence" | 1 | J5 | () |
| J52 | <q/> | "sequence" | 2 | J5 | () |
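The rows of the table can be regenerated mechanically, which is a useful cross-check on the ·kind· and ·selector· columns. The following Python sketch does so under several assumptions: values are tuples, arrays are `list`, maps are `dict`, the two element nodes are replaced by the placeholder strings "<p/>" and "<q/>", and the Id numbering scheme (parent Id plus child number) is inferred from the table. All function names are hypothetical.

```python
def as_value(x):
    """Normalize a Python object to a value: a tuple of items."""
    return x if isinstance(x, tuple) else (x,)


def children(content):
    """Return (kind, selector, content) triples for the children of a value."""
    value = as_value(content)
    if len(value) >= 2:
        return [("sequence", i, (item,)) for i, item in enumerate(value, start=1)]
    if len(value) == 1 and isinstance(value[0], list):
        return [("array", i, as_value(m)) for i, m in enumerate(value[0], start=1)]
    if len(value) == 1 and isinstance(value[0], dict):
        return [("map", k, as_value(v)) for k, v in value[0].items()]
    return []


def rows(content, node_id="R", kind=None, selector=None, parent=None):
    """Yield (id, kind, selector, parent id) for a node and its descendants, pre-order."""
    yield (node_id, kind, selector, parent)
    prefix = "J" if node_id == "R" else node_id
    for n, (k, s, c) in enumerate(children(content), start=1):
        yield from rows(c, f"{prefix}{n}", k, s, node_id)


EXAMPLE = {
    "a": 1,
    "b": ("x", "y"),
    "c": [(10, 20), 30],
    "d": (10, [20, 30]),
    "e": ("<p/>", "<q/>"),  # placeholders for the two element nodes
}

table = list(rows(EXAMPLE))
by_id = {row[0]: row for row in table}
```

For instance, `by_id["J42"]` is `("J42", "sequence", 2, "J4")` and `by_id["J421"]` is `("J421", "array", 1, "J42")`, in agreement with the table.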
Note:
Sequences of length 2 or more have children; sequences of length 0 or 1 have none. The reason for this is to ensure that the tree of JNodes is always finite. This can make processing a little difficult in a structure where it is sensible to treat a sequence in the same way regardless of its length (for example, the authors of a book). One solution is to use the descendant axis in place of the child axis. Another solution is to use the child-plus axis, which expands sequences of length one in the same way as sequences of length 2 or more.
For a JNode representing the root of a JTree, the ·parent· and ·selector· properties will always be absent.
For a non-root JNode, the ·parent·, ·kind·, and ·selector· properties will always be present.
The identity of a root JNode is established when the root JNode is constructed, so that every operation that constructs a root JNode returns a JNode with distinct identity. The identity of a non-root JNode is a function of its ·parent· and ·selector· properties: two non-root JNodes are identical by definition if and only if their ·parent·s are identical and their ·selector· properties are equal as determined by the dm:atomic-equal function.
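The identity rule can be illustrated with a small Python sketch. It rests on assumptions that go beyond the text: a selector is taken to be unique among the children of one parent, so that parent plus selector suffice as an identity key; Python's `==` stands in, only approximately, for dm:atomic-equal; and the class and method names are hypothetical.

```python
class JNode:
    """Illustrative node whose equality models JNode identity."""

    def __init__(self, content, parent=None, selector=None):
        self.content = content
        self.parent = parent
        self.selector = selector

    def _key(self):
        if self.parent is None:
            # Root: identity is fixed at construction, one per constructed object.
            return ("root", id(self))
        # Non-root: identity is derived from the parent's identity and the selector.
        return ("child", self.parent._key(), self.selector)

    def __eq__(self, other):
        return isinstance(other, JNode) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


data = {"a": 1}
r1 = JNode(data)            # two roots wrapping the same map ...
r2 = JNode(data)            # ... still have distinct identity
c1 = JNode(1, r1, "a")
c2 = JNode(1, r1, "a")      # same parent and selector: same identity as c1
c3 = JNode(1, r2, "a")      # different parent: different identity
```

In this sketch `r1 != r2`, `c1 == c2`, and `c1 != c3`, even though all three children carry the same content.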
A root JNode is constructed, wrapping a map or array, by a call on the jtree function applied to that map or array. This can be called explicitly, but it is also called implicitly in a number of situations: for example when a map or array is used as the left-hand operand of the path operator /.
Note:
An efficient implementation is likely to construct JNodes lazily, as and when a map or array is reached by downward navigation using XPath axis steps, which implicitly invoke the dm:j-children accessor. Typical implementations of maps and arrays do not include parent pointers to a containing map or array; these are contained only transiently in the JNode that wraps a map or array reached by downward navigation in the JTree. A benefit of not storing parent pointers is that a map or array does not need to be copied in order to participate in multiple trees.
This implementation strategy mirrors the concept of a Zipper data structure commonly encountered in other functional programming languages. The general idea is that while the core data structure maintains references in one direction only (in this case from parent to child), an operation that navigates the data structure can retain additional information about the path that was followed, effectively allowing access to nodes of the structure that were visited en route. This additional information is reified in the ·parent·, ·kind·, and ·selector· properties of a non-root JNode.
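The following Python sketch illustrates the lazy, zipper-like strategy described in this note. It is a simplification made for illustration: only single maps (`dict`) and arrays (`list`) are handled, multi-item sequences are ignored, and the `JNode` class with its `children` and `path` methods is hypothetical rather than any real API.

```python
class JNode:
    """Transient wrapper: the wrapped value itself stores no parent pointer."""

    def __init__(self, content, parent=None, selector=None):
        self.content = content
        self.parent = parent
        self.selector = selector

    def children(self):
        """Create child wrappers on demand, one level at a time."""
        c = self.content
        if isinstance(c, list):
            return [JNode(m, self, i) for i, m in enumerate(c, start=1)]
        if isinstance(c, dict):
            return [JNode(v, self, k) for k, v in c.items()]
        return []

    def path(self):
        """Selectors from the root down to this node, recovered via the wrappers."""
        node, steps = self, []
        while node.parent is not None:
            steps.append(node.selector)
            node = node.parent
        return steps[::-1]


# One map participates in two different trees without being copied.
shared = {"title": "XDM", "tags": ["data", "model"]}
tree_a = JNode([shared, 1])
tree_b = JNode({"doc": shared})

in_a = tree_a.children()[0]
in_b = tree_b.children()[0]
deep = in_b.children()[1].children()[1]   # the second member of "tags", reached via tree_b
```

Both `in_a.content` and `in_b.content` are the very same `shared` object, while `in_a.path()` and `in_b.path()` differ because the path information lives only in the wrappers.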
When a property has no value, we say that it is absent.
An array item (also called simply an array) is a function item that represents an array.
An atomic item is a pair (T, D) where T (the type annotation) is an atomic type, and D (the datum) is a point in the value space of T.
An atomic type is either a primitive simple type with variety atomic, or a type derived by restriction from another atomic type.
A character is any Unicode character.
A codepoint is a non-negative integer assigned to a character by the Unicode consortium, or reserved for future assignment to a character.
Two schemas X and Y are compatible if the union of X and Y is a valid schema.
The datum of an atomic item is a point in the value space of its type, which is also a point in the value space of the primitive type from which that type is derived.
An XTree whose root node is a document node is referred to as a document.
A document order is defined among all the GNodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some GNodes is implementation-dependent. Informally, document order is the order in which GNodes appear when serialized.
An array containing no members is referred to as an empty array.
A map containing no entries is referred to as an empty map.
The key/value pairs in a map are referred to as entries.
The order of entries in a map is referred to as entry order.
An expanded QName is a triple consisting of a possibly absent prefix, a possibly absent namespace URI, and a local name.
An XTree whose root node is not a document node is referred to as a fragment.
A function annotation consists of an annotation name (an instance of xs:QName) and an annotation value (an arbitrary sequence of atomic items).
The arity of a function item is the number of its parameters.
A function item is an item that can be called with zero or more arguments to return a result.
A function signature represents the type of a function.
The term generic node or GNode is a collective term for XNodes (more commonly called simply nodes) representing the parts of an XML document, and JNodes, often used to represent the parts of a JSON document.
Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementer for each particular implementation.
Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementer for any particular implementation.
An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than ‘valid’ for the [validity] property in the PSVI.
Every instance of the data model is a sequence.
An item is either a node, a function, or an atomic item.
An item type represents a class of items.
The accessor j-children, applied to a JNode $P, returns a sequence of non-root JNodes representing the children of $P.
A JNode is a kind of item used to represent a value within the context of a tree of maps and arrays. A root JNode represents a map or array; a non-root JNode represents a member of an array, an entry in a map, or an item in a sequence.
A tree that is rooted at a parentless JNode is referred to as a JTree.
A map item (also called simply a map) is a function item that represents an ordered sequence of key/value pairs, in which the keys are unique.
An array is an ordered list of values; these values are called the members of the array.
This specification uses the term Namespace URI to refer to a namespace name, whether or not it is a valid URI or IRI.
For continuity, and because the term is used in many other specifications, the term node is used as a synonym for XNode in cases where the meaning is clear from the context.
The primitive simple types are the types defined in 4.1.1 Types adopted from XML Schema.
The term root GNode refers to the topmost GNode of a tree, that is, a GNode with no parent.
Following the terminology of [Schema Part 1], a schema is defined as a set of schema components. Schema components include, for example, element declarations and type definitions.
A schema type corresponds to a type definition component as defined in XSD.
A sequence is an ordered collection of zero or more items.
A sequence type constrains the set of permitted sequences, by defining the permitted item types and the permitted number of items in the sequence (exactly zero, exactly one, zero-or-more, one-or-more, zero-or-one).
A map containing exactly one entry is referred to as a single-entry map.
An array containing exactly one member is referred to as a single-member array.
A singleton sequence is a sequence of length one, that is, a sequence containing exactly one item.
Document order is stable, which means that the relative order of two GNodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.
A string is a sequence of zero or more characters.
The term type annotation has two slightly different meanings. For an atomic item, the type annotation of the value is the most specific atomic type that it is an instance of (it is also an instance of every type from which that type is derived). For an element or attribute node, the type annotation is the schema type (a simple or complex type) against which the node has been validated, defaulting to xs:untypedAtomic for unvalidated attribute nodes, and xs:untyped for unvalidated element nodes.
Because every value is a sequence, the term value is used synonymously with sequence.
Every JNode has a property ·content· which is an arbitrary value (that is, in general, a sequence).
An XNode is an item that represents one of the seven kinds of construct found in an XML document: elements, attributes, text nodes, comment nodes, processing instructions, namespaces, and the document node itself.
A tree that is rooted at a parentless XNode is referred to as an XTree.