XQuery and XPath Data Model 4.0

1 Introduction

This document defines the XQuery and XPath Data Model 4.0, which is the data model of [XML Path Language (XPath) 4.0][XPath 4.0], [XSL Transformations (XSLT) Version 4.0], and [XQuery 4.0: An XML Query Language].

The XQuery and XPath Data Model 4.0 (henceforth “data model”) serves two purposes. First, it defines the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages. A language is closed with respect to a data model if the value of every expression in the language is guaranteed to be in the data model. XSLT 4.0, XQuery 4.0, and XPath 4.0 are all closed with respect to the data model.

The data model describes items similar to those of the [Infoset] (henceforth “Infoset”). It is written to provide a data model suitable for XPath, XQuery and XSLT, which was not a goal of the Infoset, and this leads to a number of differences, some of which are:

Support for XML Schema types. The XML Schema recommendations define features, such as structures ([Schema Part 1]) and simple data types ([Schema Part 2]), that extend the Infoset with precise type information.
Representation of collections of documents and of complex values.
Support for typed atomic items.
Support for ordered, heterogeneous sequences.
Support for additional data types such as maps, arrays, and functions.

As with the Infoset, the XQuery and XPath Data Model 4.0 specifies what information in the documents is accessible but does not specify the programming-language interfaces or bindings used to represent or access the data.

The data model can represent various values, including not only the input and the output of a stylesheet or query but all values of expressions used during the intermediate calculations. Examples include the input document or document repository (represented as a document node or a sequence of document nodes), the result of a path expression (represented as a sequence of nodes), the result of an arithmetic or a logical expression (represented as an atomic item), a sequence expression resulting in a sequence of items, etc.

This document provides a precise definition of the properties of nodes in the XQuery and XPath Data Model 4.0, how they are accessed, and how they relate to values in the Infoset and PSVI.

2 Terminology and Notation

This section introduces terminology and notational conventions that apply throughout the document.

In this document, material labeled as a note or example is provided for explanatory purposes and is not normative.

2.2 Notation

To explain the data model, this specification uses both prose and a defined set of accessor functions. The accessors are shown with the prefix dm:. This prefix is always shown in italics to emphasize that these functions are abstract; they exist to explain the interface between the data model and specifications that rely on the data model: they are not accessible directly from the host language.

Several prefixes are used throughout this document for notational convenience. The following bindings are assumed.

xs: bound to http://www.w3.org/2001/XMLSchema
xsi: bound to http://www.w3.org/2001/XMLSchema-instance
fn: bound to http://www.w3.org/2005/xpath-functions

In practice, any prefix that is bound to the appropriate URI may be used.

The signatures of accessor functions are shown using the same style as [XQuery and XPath Functions and Operators 4.0]; that style is described in [Functions and Operators 4.0], Section 1.5 Function signatures and descriptions.

This document relies on the [Infoset] and Post-Schema-Validation Infoset (PSVI). Information items and properties are indicated by the styles information item and [infoset property], respectively.

Some aspects of type assignment rely on the ability to access properties of the schema components. Such properties are indicated by curly brackets, e.g., {component property}. Note that this does not mean a lightweight schema processor cannot be used; it only means that the application must have some mechanism to access the necessary properties.

4 Schemas and Types

4.1 Schema Information

The data model supports strongly typed languages such as [XML Path Language (XPath) 4.0][XPath 4.0] and [XQuery 4.0: An XML Query Language] that have a type system based on [Schema Part 1]. To achieve this, the data model includes (by reference) the Schema Component Model described in [Schema Part 1].

Note:

The Schema Component Model includes a number of kinds of component, such as type definitions and element and attribute declarations, and defines the properties and relationships of these components. Many of these components and properties are not used by the language specifications that rely on XDM, and where this is the case, there is no requirement for implementations to make them visible. However, this specification makes no attempt to define the minimal subset of the schema component model that is needed to support the semantics of XPath and XQuery processing.

There are two main areas where the language semantics depend on information in schema components:

Expressions are evaluated with respect to a static context, which includes schema components, specifically type definitions, element declarations, and attribute declarations. The names of such components may be used in language constructs only if the components are present in the static context.
Values including element and attribute nodes, and atomic items, have a property called a type annotation whose value is a type: this is a reference to a type definition in the Schema Component Model.

The diagram below illustrates the schema type system, in which all types are derived from xs:anyType.

XML Schema types (abstract)
- anyType (built-in complex)
  - Simple types (abstract)
    - anySimpleType (built-in list)
      - Atomic types (abstract)
        anyAtomicType (built-in atomic)
      - list types (abstract)
        ENTITIES (built-in list)
        IDREFS (built-in list)
        NMTOKENS (built-in list)
        user-defined list types (user-defined)
      - union types (abstract)
        numeric (built-in complex)
        user-defined union types (user-defined)
        user-defined enumeration types (user-defined)
    - complex types (complex)
      - untyped (built-in complex)
      - user-defined complex types (user-defined)

Legend:

Supertype
- subtype

Abstract types (abstract)
Built-in atomic types (built-in atomic)
Built-in complex types (built-in complex)
Built-in list types (built-in list)
User-defined types (user-defined)

4.1.1 Types adopted from XML Schema

[Definition: A schema type corresponds to a type definition component as defined in XSD.] Schema types are either complex types or simple types; simple types are either atomic types, list types, or union types.

The data model adopts the following schema types:

The 19 primitive atomic types defined in Section 3.2 Primitive datatypes of [Schema Part 2].
Three built-in list types: xs:NMTOKENS, xs:IDREFS, and xs:ENTITIES.
The following types, which were originally defined in [XQuery 1.0 and XPath 2.0 Data Model (XDM)][XDM 4.0] and were subsequently adopted by [Schema 1.1 Part 2]: xs:anyAtomicType, xs:dayTimeDuration, xs:yearMonthDuration.
In the case of a processor that supports [Schema 1.1 Part 2], the new union type xs:error (a type with no instances) and the new derived type xs:dateTimeStamp.
The following types, which use the xs: namespace and are defined here in the data model but not in XML Schema: xs:untypedAtomic, and xs:numeric, a union type whose members are xs:double, xs:float and xs:decimal.

Schema types fulfill a role different from item types. Schema types other than atomic types arise in the data model only as type annotations on element and attribute nodes. Nodes are not instances of schema types in the sense of the XPath instance of operator; but an element or attribute node may be an instance of the item type element(*, S) or attribute(*, S) where S is a schema type. The node matches this item type if its type annotation is S, or a type derived from S, which will be the case if the node has been validated against type S in the course of schema validation.

Schema types and item types form overlapping categories:

Atomic types belong to both categories.
Node types and function types are item types, but they are not schema types.
Complex types, list types, and union types are schema types, but they are not item types.

5 Atomic Items

Changes in 4.0

The term atomic value has been replaced by atomic item. [Issue 1337 PR 1361 2 August 2024]

[Definition: An atomic item is a pair (T, D) where T (the type annotation) is an atomic type, and D (the datum) is a point in the value space of T.]

[Definition: The datum of an atomic item is a point in the value space of its type, which is also a point in the value space of the primitive type from which that type is derived.] There are 20 primitive atomic types (19 defined in XSD, plus xs:untypedAtomic), and these have non-overlapping value spaces, so each datum belongs to exactly one primitive atomic type.

Note:

The term value space is defined in [Schema 1.1 Part 2] as a set of values. The term datum is used here in preference to value, because value has a different meaning in this data model.

[Definition: An atomic type is either a primitive simple type with variety atomic, or a type derived by restriction from another atomic type.] (Types derived by list or union are not atomic.)

Note:

Atomic types include the 19 primitive atomic types defined in XSD (such as xs:string, xs:boolean, and xs:decimal), the built-in non-primitive types defined in XSD (such as xs:integer, xs:NCName, and xs:dayTimeDuration), atomic types derived from these in a user-defined schema, and the special type xs:untypedAtomic.
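
As an informal illustration only (it is not part of the data model), the pair structure of an atomic item and the derivation chain of its type can be sketched in Python. The class names AtomicType and AtomicItem and their methods are hypothetical, introduced here purely for explanation; only the facts that an atomic item is a pair (T, D) and that xs:integer is derived from xs:decimal come from the definitions above and from XSD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AtomicType:
    """Hypothetical model of an atomic type: a name plus the type it restricts."""
    name: str
    base: Optional["AtomicType"] = None  # None marks a primitive atomic type

    def primitive(self) -> "AtomicType":
        """Walk the derivation chain up to the primitive type."""
        t = self
        while t.base is not None:
            t = t.base
        return t

    def derives_from(self, other: "AtomicType") -> bool:
        """True if self is `other` or is derived by restriction from it."""
        t = self
        while t is not None:
            if t == other:
                return True
            t = t.base
        return False


@dataclass(frozen=True)
class AtomicItem:
    """An atomic item as the pair (T, D): type annotation and datum."""
    type_annotation: AtomicType
    datum: object


XS_DECIMAL = AtomicType("xs:decimal")
XS_INTEGER = AtomicType("xs:integer", XS_DECIMAL)
XS_STRING = AtomicType("xs:string")

item = AtomicItem(XS_INTEGER, 30)
```

In this sketch the datum 30 belongs to the value space of xs:integer and therefore also to that of its primitive type, xs:decimal, which is what item.type_annotation.primitive() returns.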

[Definition: The primitive simple types are the types defined in 4.1.1 Types adopted from XML Schema.]

[Definition: The term type annotation has two slightly different meanings. For an atomic item, the type annotation of the value is the most specific atomic type that it is an instance of (it is also an instance of every type from which that type is derived). For an element or attribute node, the type annotation is the schema type (a simple or complex type) against which the node has been validated, defaulting to xs:untypedAtomic for unvalidated attribute nodes, and xs:untyped for unvalidated element nodes.]

Named types are identified in the data model by an expanded QName. A schema may also contain anonymous types, and these may be used as type annotations on nodes and atomic items; anonymous types, however, cannot be referenced explicitly in programs.

[Definition: An expanded QName is a triple consisting of a possibly absent prefix, a possibly absent namespace URI, and a local name.] See 5.4 QNames and NOTATIONS.
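
The triple structure can be illustrated with a small Python sketch. The class name ExpandedQName is hypothetical, and the sketch assumes (as the host languages do when comparing QNames) that the prefix is carried along but is not significant for equality; that comparison rule is not stated in this section.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ExpandedQName:
    """Hypothetical model of an expanded QName triple."""
    local_name: str
    namespace_uri: Optional[str] = None  # None models an absent namespace URI
    # Assumption: the prefix is retained but ignored by == and hash().
    prefix: Optional[str] = field(default=None, compare=False)


XSD_NS = "http://www.w3.org/2001/XMLSchema"
a = ExpandedQName("integer", XSD_NS, "xs")
b = ExpandedQName("integer", XSD_NS, "xsd")
c = ExpandedQName("integer")  # no namespace, no prefix
```

Here a and b compare equal although their prefixes differ, while c, which has no namespace URI, is a different name.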

5.1 String Values

An atomic item can be constructed from a lexical representation. Given a string and an atomic type, the atomic item is constructed in such a way as to be consistent with schema validation. If the string does not represent a valid value of the type, an error is raised. When xs:untypedAtomic is specified as the type, no validation takes place. The details of the construction are described in Section 22 Constructor functions and the related Section 23 Casting of [XQuery and XPath Functions and Operators 4.0].

A string value can be constructed from an atomic item. Such a value is constructed by converting the atomic item to its string representation as described in [Functions and Operators 4.0], Section 23 Casting.
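
The two directions described above can be sketched informally in Python for a few types. The function names construct and string_value and the exception CastError are hypothetical; the sketch covers only xs:integer, xs:boolean, and xs:untypedAtomic, and it approximates the rules of [Functions and Operators 4.0] rather than reproducing them.

```python
import re


class CastError(ValueError):
    """Raised when a string is not a valid lexical form for the target type."""


def construct(lexical: str, type_name: str):
    """Build a (type, datum) pair from a lexical representation."""
    if type_name == "xs:untypedAtomic":
        # No validation takes place for xs:untypedAtomic.
        return (type_name, lexical)
    # Collapse XML whitespace before checking the lexical form.
    collapsed = re.sub(r"[ \t\n\r]+", " ", lexical).strip(" ")
    if type_name == "xs:integer":
        if not re.fullmatch(r"[+-]?[0-9]+", collapsed):
            raise CastError(f"{lexical!r} is not a valid xs:integer")
        return (type_name, int(collapsed))
    if type_name == "xs:boolean":
        if collapsed not in ("true", "false", "1", "0"):
            raise CastError(f"{lexical!r} is not a valid xs:boolean")
        return (type_name, collapsed in ("true", "1"))
    raise NotImplementedError(type_name)


def string_value(item) -> str:
    """Convert a (type, datum) pair back to a string representation."""
    type_name, datum = item
    if type_name == "xs:boolean":
        return "true" if datum else "false"
    return str(datum)
```

Under these assumptions, constructing an xs:integer from "0030" and converting it back yields "30": the original lexical form is not retained, only a valid representation of the same datum.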

7 XML Documents and Nodes

7.3 Document Construction

This section describes the constraints on documents (that is, trees of nodes).

The data model supports well-formed XML documents conforming to [Namespaces in XML] or [Namespaces in XML 1.1]. Documents that are not well-formed are, by definition, not XML. XML documents that do not conform to [Namespaces in XML] or [Namespaces in XML 1.1] are not supported (nor are they supported by [Infoset]).

In other words, the data model supports the following classes of XML documents:

Well-formed documents conforming to [Namespaces in XML] or [Namespaces in XML 1.1],
DTD-valid documents conforming to [Namespaces in XML] or [Namespaces in XML 1.1], and
W3C XML Schema-validated documents.

This document describes how to construct a document (a tree of nodes) from an infoset ([Infoset]) or a Post Schema Validation Infoset (PSVI), the augmented infoset produced by an XML Schema validation episode.

A document can also be constructed directly through application APIs, or from non-XML sources such as relational tables in a database. Data model construction from sources other than an Infoset or PSVI is implementation-defined. Regardless of how an instance of the data model is constructed, every node and atomic item in the data model must have a typed value that is consistent with its type.

The data model supports some kinds of values that are not supported by [Infoset]. Examples of these are document fragments and sequences of document nodes. The data model also supports values that are not nodes. Examples of these are sequences of atomic items, or sequences mixing nodes and atomic items. These are necessary to be able to represent the results of intermediate expressions in the data model during expression processing.

7.3.3 Construction from a PSVI

An instance of the data model can be constructed from a PSVI, whose element and attribute information items have been strictly assessed, laxly assessed, or have not been assessed. Constructing an instance of the data model from a PSVI must be consistent with the description provided in this section and with the description provided for each node kind.

Data model construction requires that the PSVI provide unique names for all anonymous schema types.

Note:

[Schema Part 1] does not require all schema processors to provide unique names for anonymous schema types. In order to build an instance of the data model from a PSVI produced by a processor that does not provide the names, some post-processing will be required in order to ensure that they are all uniquely identified before construction begins.

[Definition: An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than ‘valid’ for the [validity] property in the PSVI.]

The data model supports incompletely validated documents. Elements and attributes that are not valid are treated as having unknown types.

The most significant difference between Infoset construction and PSVI construction occurs in the area of schema type assignment. Other differences can also arise from schema processing: default attribute and element values may be provided, white space normalization of element content may occur, and the user-supplied lexical form of elements and attributes with atomic schema types may be lost.

7.3.3.1 Mapping PSVI Additions to Node Properties

A PSVI element or attribute information item may have a [validity] property. The [validity] property may be “valid”, “invalid”, or “notKnown” and reflects the outcome of schema-validity assessment. In the data model, precise schema type information is exposed for element and attribute nodes that are “valid”. Nodes that are not “valid” are treated as if they were simply well-formed XML and only very general schema type information is associated with them.

7.3.3.1.3 Relationship Between Typed-Value and String-Value

Element and attribute nodes have both typed-value and string-value properties (the terms typed value and string value are defined in Section 2.5.2 Typed Value and String Value of [XML Path Language (XPath) 4.0][XPath 4.0]). However, implementations are allowed some flexibility in how these properties are stored. An implementation may choose to store the string-value property only and derive the typed-value property from it, or to store the typed-value property only and derive the string-value property from it, or to store both the string-value property and the typed-value property.

To permit these various implementation strategies, some variations in the string value of a node are defined as insignificant. Implementations that store only the typed value of a node are permitted to return a string value that is different from the original lexical form of the node content. For example, consider the following element:

<offset xsi:type="xs:integer">0030</offset>

Assuming that the node is valid, it has a typed value of 30 as an xs:integer. An implementation may return either “30” or “0030” as the string value of the node. Any string that is a valid lexical representation of the typed value is acceptable. In this specification, we express this rule by saying that the relationship between the string value of a node and its typed value must be “consistent with schema validation.”
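
For the xs:integer case used in this example, the rule can be sketched as a simple check in Python. The function name is hypothetical and the check is limited to xs:integer; it accepts any string that is a valid lexical form of the given typed value.

```python
import re


def consistent_with_schema_validation(string_value: str, typed_value: int) -> bool:
    """True if string_value is a valid xs:integer lexical form of typed_value."""
    text = string_value.strip(" \t\n\r")
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        return False
    return int(text) == typed_value
```

Both "30" and "0030" pass this check for the typed value 30, so either is an acceptable string value for the offset element above; "31" and "30.0" do not.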

If an implementation stores only the string value of a node, the following considerations apply:

Where union types occur, the implementation must be able to deliver the typed value as an instance of the appropriate member type. For example, if the type of an element node is my:integer-or-string, which is defined as a union of xs:integer and xs:string, and the string value of the node is “47”, the implementation must be able to deliver the typed value of the node as either the integer 47 or the string "47", depending on which member type validated the element.
Where types of xs:QName, xs:NOTATION, or types derived from one of these types occur, the implementation must be able to deliver the typed value as a triple consisting of a local name, a namespace prefix, and a namespace URI, even though the namespace URI is not part of the string-value (see 5.4 QNames and NOTATIONS).
Where an element with a complex type and element-only content occurs, it is an error to attempt to access the typed-value of the element node.
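
The union-type consideration can be illustrated with a Python sketch of my:integer-or-string. The function names are hypothetical, and the sketch assumes that, in the absence of any other information, the first member type in declaration order that accepts the string is the one that validates it.

```python
import re


def parse_integer(text: str) -> int:
    """Parse an xs:integer lexical form, raising ValueError if invalid."""
    text = text.strip(" \t\n\r")
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        raise ValueError(text)
    return int(text)


def parse_string(text: str) -> str:
    """Every string is a valid xs:string."""
    return text


def typed_value_of_union(string_value: str, member_types):
    """Return (member type name, datum) for the first member type that accepts the string."""
    for name, parse in member_types:
        try:
            return (name, parse(string_value))
        except ValueError:
            continue
    raise ValueError(f"{string_value!r} matches no member type")


# Hypothetical definition of my:integer-or-string as an ordered list of members.
INTEGER_OR_STRING = [("xs:integer", parse_integer), ("xs:string", parse_string)]
```

With this ordering, "47" is delivered as the integer 47 and "forty" as a string. An implementation that stores only the string value therefore also needs to know which member type applied, because reversing the member order would deliver "47" as a string.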

If an implementation stores only the typed value of a node, it must be prepared to construct string values from not only the node, but in some cases also the descendants of that node. For example, an element with a complex type and element-only content has no typed value but does have a string value that is the concatenation of the string values of all its text node descendants in document order.
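
The concatenation rule for element-only content can be illustrated with Python's standard xml.etree.ElementTree module. ElementTree is not an implementation of this data model; it is used here only because its itertext() method visits text in document order. The sample document and the function name string_value are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# An element with element-only content: no text of its own, only child elements.
order = ET.fromstring("<order><id>42</id><qty>3</qty><note>rush</note></order>")


def string_value(element: ET.Element) -> str:
    """Concatenate the text of all descendants in document order."""
    return "".join(element.itertext())
```

For this sample, string_value(order) is "423rush", even though the order element itself has no typed value.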

A further caveat applies if an implementation stores the typed value of a node. If a new data model is constructed by copying portions of another data model, and the copy operation does not preserve inherited namespaces, and the type is a union type that is sensitive to the namespace context, then the typed value may be different from what would be obtained by revalidating the node within its new namespace context. Although this may stretch the semantics of “consistent with schema validation”, we accept this possibility; it is not an error.

7.4 Node Kinds

There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment. Each kind of node is described in the following sections.

Each section consists of an overview of the node, followed by information on its accessors and on construction from an Infoset or a PSVI. The final section provides a mapping to Infosets. No mapping is provided, nor can it be provided, for producing a PSVI. Validation must be used to obtain a PSVI for a (portion of a) data model instance.

All nodes must satisfy the following general constraints:

Every node must have a unique identity, distinct from all other nodes.
The children property of a node must not contain two consecutive text nodes.
The children property of a node must not contain any empty text nodes.
No node may appear more than once in the children or attributes properties of a node.
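
The two constraints on text nodes can be illustrated by a Python sketch that normalizes a list of children. The classes Text and Element and the function normalize_children are hypothetical; the sketch models only the content of nodes, not their identity.

```python
from dataclasses import dataclass


@dataclass
class Text:
    """Hypothetical stand-in for a text node."""
    content: str


@dataclass
class Element:
    """Hypothetical stand-in for an element node."""
    name: str


def normalize_children(children):
    """Drop empty text nodes and merge runs of adjacent text nodes."""
    result = []
    for child in children:
        if isinstance(child, Text):
            if child.content == "":
                continue  # no empty text nodes
            if result and isinstance(result[-1], Text):
                # no two consecutive text nodes: merge into the previous one
                result[-1] = Text(result[-1].content + child.content)
                continue
        result.append(child)
    return result
```

For example, the children Text("a"), Text(""), Text("b"), Element("x") normalize to Text("ab") followed by Element("x").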

7.4.4 Namespace nodes

7.4.4.1 Overview

Each namespace node represents the binding of a namespace URI to a namespace prefix or to the default namespace. Implementations that do not use namespace nodes may represent the same information using the namespaces property of an element node. Namespaces have the following properties:

prefix, possibly empty
uri
parent, possibly empty

Namespace nodes must satisfy the following constraints.

If a namespace node N is among the namespaces of an element E, then the parent of N must be E.
If a namespace node N has a parent element E, then N must be among the namespaces of E.
A namespace node must not have the name xmlns nor the string-value http://www.w3.org/2000/xmlns/.

The data model permits namespace nodes without parents; see below.

In XPath 1.0, namespace nodes were directly accessible by applications, by means of the namespace axis. In XPath 3.1 the namespace axis is deprecated, and it is not available at all in XQuery 3.1. XPath 3.1 implementations are not required to expose the namespace axis, though they may do so if they wish to offer backwards compatibility.

The information held in namespace nodes is instead made available to applications using functions defined in [XQuery and XPath Functions and Operators 4.0]. Some properties of namespace nodes are not exposed by these functions: in particular, properties related to the identity of namespace nodes, their parentage, and their position in document order. Implementations that do not expose the namespace axis can therefore avoid the overhead of maintaining this information.

Implementations that expose the namespace axis must provide unique namespace nodes for each element. Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by [Namespaces in XML]) and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes; however, a namespace node is not a child of its parent element. In implementations that expose the namespace axis, elements never share namespace nodes.

Note:

In implementations that do not expose the namespace axis, there is no means by which the host language can tell if namespace nodes are shared or not and, in such circumstances, sharing namespace nodes may be a very reasonable implementation strategy.
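
The set of bindings that an element's namespace nodes represent can be sketched in Python. The function name in_scope_namespaces and its input format (one dictionary of declarations per element on the path from the root, with the empty string standing for the default namespace) are assumptions made for this illustration.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def in_scope_namespaces(declarations_root_to_element):
    """Compute prefix -> URI bindings in scope for the last element on the path.

    Each entry is a dict of the declarations on one element, outermost first.
    An empty URI models an undeclaration such as xmlns="".
    """
    bindings = {"xml": XML_NS}  # the xml prefix is implicitly declared
    for declarations in declarations_root_to_element:
        for prefix, uri in declarations.items():
            if uri == "":
                bindings.pop(prefix, None)
            else:
                bindings[prefix] = uri
    return bindings


outer = {"": "http://example.com/a", "p": "http://example.com/p"}
inner = {"": ""}  # the inner element undeclares the default namespace
```

An implementation that exposes the namespace axis would create one namespace node per entry of the resulting dictionary for each element; one that does not can share or compute this information freely.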

7.5 Accessors

A set of accessors is defined on nodes in the data model. For consistency, all the accessors are defined on every kind of node, although several accessors return a constant empty sequence on some kinds of nodes.

In order for processors to be able to operate on instances of the data model, the model must expose the properties of the items it contains. The data model does this by defining a family of accessor functions. These are not functions in the literal sense; they are not available for users or applications to call directly. Rather they are descriptions of the information that an implementation of the data model must expose to applications. Functions and operators available to end users are described in [XQuery and XPath Functions and Operators 4.0].

Some typed values in the data model are absent. Attempting to access an absent typed value is an error. Behavior in these cases is implementation-defined and the host language is responsible for determining the result.

8 Function Items, Maps, and Arrays

8.1 Function Items

Changes in 4.0

Introduced the concept of function identity. [Issue 520 PR 525 30 May 2023]
The parameter names property of a function item is dropped; the property never had any effect on the semantics of the language. [Issue 1896]

[Definition: A function item is an item that can be called with zero or more arguments to return a result.] Function items have no serialization.

Note:

XDM 4.0 uses the term function item where XDM 3.1 used function. There is no distinction in meaning, but function item is preferred for clarity, because the unqualified term function has additional meanings in relation to function definitions in XSLT and XQuery.

A function item has the following properties:

name: An expanded QName, possibly absent.
identity: An abstract property that can be used to test whether two variables refer to the same function or to different functions. This property is exposed only for this purpose.
Note:
Currently, the concept of function identity is used for two purposes: firstly, when functions appear in the arguments supplied to the fn:deep-equal function; and secondly, in establishing whether the arguments and results of a function are "the same" when deciding whether the function is deterministic.
signature: [Definition: A function signature represents the type of a function.] The signature of a function item comprises:
- The required types of its parameters (each one being a SequenceType)
- The required type of the function result (also a SequenceType)
annotations: A sequence of zero or more function annotations. [Definition: A function annotation consists of an annotation name (an instance of xs:QName) and an annotation value (an arbitrary sequence of atomic items).] Annotations are ordered and it is permitted for two annotations to share the same name.
body: The body of a function provides the logic to map the arguments supplied in a function call into an instance of the function’s result type.
The function body is generally one of the following:
- a user-written construct in XPath, XQuery, XSLT, or some other host language known to the processor.
- vendor-supplied logic internal to the processor.
- external logic written in some third-party programming language, to be invoked by the processor using implementation-dependent mechanisms.
These categories are not mutually exclusive; they may be used in combination.
Note:
The term “function body” replaces “function implementation”, to avoid confusion with the use of the term “implementation” in phrases such as “implementation-defined”.
captured context: This includes a static and dynamic context for evaluation of the function body, as described in [XPath 4.0], Section 2.2 Expression Context. In particular it includes a set of nonlocal variable bindings (a mapping from xs:QName to item()*), which provides a value for each of the function’s free variables (i.e., variables referenced by the function’s body, other than locals and parameters).
Note:
Where the function body is implemented in XPath, XQuery, or XSLT, the captured context includes the static context for the user-written code (for example, its in-scope namespaces) as well as any nonlocal variable bindings.
Functions implemented internally by the processor may capture specific parts of the static or dynamic context, for example fn:position#0 captures the value of the context position.

[Definition: The arity of a function item is the number of its parameters.] The number of parameter types in a function’s signature must equal the function’s arity.

All function items match the generic function type function(*), which is itself a subtype of item(). A function signature defines a more specific function type, which is always a subtype of function(*). A function signature function(T₁, T₂, T₃, ...) as T_R is a subtype of another function signature function(U₁, U₂, U₃, ...) as U_R if (a) the two signatures have the same arity, (b) the return type T_R is a subtype of U_R, and (c) for each pair of parameter types, T_n is a supertype of U_n. The rules are explained more fully in [XPath 4.0], Section 3.3 Subtype Relationships. For example:

function(item()) as item() is a subtype of function(*)
function(item()) as xs:integer is a subtype of function(item()) as item()
function(item()) as item() is a subtype of function(xs:string) as item()
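
The three conditions (equal arity, covariant result type, contravariant parameter types) can be checked mechanically. The following Python sketch is a simplification under stated assumptions: it uses a small hand-written fragment of the type hierarchy, ignores occurrence indicators, and represents function(*) by a sentinel object. All names in it are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# A small fragment of the item type hierarchy: each type maps to its supertype.
SUPERTYPE = {
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:string": "xs:anyAtomicType",
    "xs:anyAtomicType": "item()",
}

ANY_FUNCTION = object()  # stands for the generic function type function(*)


def is_item_subtype(t: str, u: str) -> bool:
    """True if item type t is u or is derived from u in the fragment above."""
    while t is not None:
        if t == u:
            return True
        t = SUPERTYPE.get(t)
    return False


@dataclass(frozen=True)
class Signature:
    params: Tuple[str, ...]
    result: str


def is_function_subtype(t, u) -> bool:
    """Check whether function type t is a subtype of function type u."""
    if u is ANY_FUNCTION:
        return True
    if t is ANY_FUNCTION:
        return False
    if len(t.params) != len(u.params):          # (a) same arity
        return False
    if not is_item_subtype(t.result, u.result):  # (b) covariant result
        return False
    # (c) contravariant parameters: each T_n must be a supertype of U_n
    return all(is_item_subtype(un, tn) for tn, un in zip(t.params, u.params))


f_item_item = Signature(("item()",), "item()")
f_item_int = Signature(("item()",), "xs:integer")
f_str_item = Signature(("xs:string",), "item()")
```

Applied to the three examples above, is_function_subtype returns True in each case, and it returns False when the third example is reversed, because xs:string is not a supertype of item().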

8.2 Map Items

Changes in 4.0

Constructors are added, and the single accessor function is now an iterator over the key/value pairs in the map. [Issue 1335 20 July 2024]
Ordered maps are introduced. [Issue 1651 PR 1703 14 January 2025]

[Definition: A map item (also called simply a map) is a function item that represents an ordered sequence of key/value pairs, in which the keys are unique.] In other languages this is sometimes called a hash, dictionary, or associative array. The keys are atomic items, and each key in the map is unique (there is no other key to which it is equal). Each key is associated with a value that may be any sequence of zero or more items. There is no uniqueness constraint on values, only on keys. The semantics of equality when comparing keys are described in [Functions and Operators 4.0], Section 14.2.1 fn:atomic-equal.

[Definition: The key/value pairs in a map are referred to as entries.]

Considered as a function item, a map is a function from atomic items to values: if M is a map and K is an atomic item, then the function call M(K) returns the value associated with the key K, if present, or an empty sequence otherwise. More specifically, the properties of a map when considered as a function item are:

name: absent.
identity: implementation dependent.
signature: function(xs:anyAtomicType) as item()*.
annotations: none.
body: equivalent to a call on map:get.
captured context: empty.
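
The behavior of a map considered as a function can be sketched in Python. The class XdmMap is hypothetical; the sketch uses Python's own equality in place of fn:atomic-equal (which differs in detail, so only string keys are used here), models sequences as tuples, and assumes that the entry order is the order in which entries were supplied.

```python
class XdmMap:
    """Hypothetical model of a map item: ordered entries with unique keys."""

    def __init__(self, entries):
        # Python dicts preserve insertion order, used here as the entry order.
        self._entries = {}
        for key, value in entries:
            if key in self._entries:
                raise ValueError(f"duplicate key: {key!r}")
            self._entries[key] = tuple(value)

    def __call__(self, key):
        """M(K): the associated value, or the empty sequence if K is absent."""
        return self._entries.get(key, ())

    def keys(self):
        """The keys in entry order."""
        return list(self._entries)


m = XdmMap([("b", [1, 2]), ("a", [])])
```

Note that m("a") and m("missing") both return the empty sequence: calling the map as a function cannot distinguish an absent key from a key whose value is empty.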

[Definition: A map containing exactly one entry is referred to as a single-entry map.]

[Definition: A map containing no entries is referred to as an empty map.]

Note:

Note the distinction between a singleton map, which is a sequence of length one containing a single map item, and a single-entry map, which is a map containing exactly one entry.

[Definition: The order of entries in a map is referred to as entry order.] The entry order affects the result of functions such as map:keys and map:for-each, and also determines the order of entries when a map is serialized using the JSON output method.

Constructor and accessor functions for maps are defined in the following sections.

9 Conformance

The data model is intended primarily as a component that can be used by other specifications. Therefore, the data model relies on specifications that use it (such as [XML Path Language (XPath) 4.0][XPath 4.0], [XSL Transformations (XSLT) Version 4.0], and [XQuery 4.0: An XML Query Language]) to specify conformance criteria for the data model in their respective environments. Specifications that set conformance criteria for their use of the data model must not relax the constraints expressed in this specification.

Authors of conformance criteria for the use of the data model should pay particular attention to the following features of the data model:

Support for the normative construction from an infoset described in 7.3.2 Construction from an Infoset.
Support for the normative construction from a PSVI described in 7.3.3 Construction from a PSVI.
Support for XML 1.0 and XML 1.1.
Support for data types in XML Schema 1.0 and XML Schema 1.1.
How namespaces are supported, through nodes or through the alternative, implementation-dependent representation.

Note:

In addition, the dm:is-id and dm:base-uri accessors are required by functions in [XQuery and XPath Functions and Operators 4.0]. These refer to the specifications [xml:id] and [XML Base] respectively.

B References

B.1 Normative References

XML: Extensible Markup Language (XML) 1.0 (Fifth Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 26 Nov 2008. This version is http://www.w3.org/TR/2008/REC-xml-20081126/. The latest version is available at http://www.w3.org/TR/xml.
Infoset: XML Information Set (Second Edition), John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204. The latest version is available at http://www.w3.org/TR/xml-infoset.
Namespaces in XML: Namespaces in XML 1.0 (Third Edition), Tim Bray, Dave Hollander, Andrew Layman, et al., Editors. World Wide Web Consortium, 08 Dec 2009. This version is http://www.w3.org/TR/2009/REC-xml-names-20091208/. The latest version is available at http://www.w3.org/TR/xml-names/.
Namespaces in XML 1.1: Namespaces in XML 1.1 (Second Edition), Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available at http://www.w3.org/TR/xml-names11/.
xml:id: xml:id Version 1.0, Jonathan Marsh, Daniel Veillard, and Norman Walsh, Editors. World Wide Web Consortium, 09 Sep 2005. This version is http://www.w3.org/TR/2005/REC-xml-id-20050909/. The latest version is available at http://www.w3.org/TR/xml-id/.
XQuery 1.0 and XPath 2.0 Data Model (XDM): XQuery 1.0 and XPath 2.0 Data Model (XDM) (Second Edition), Norman Walsh, Mary Fernández, Ashok Malhotra, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/. The latest version is available at https://www.w3.org/TR/xpath-datamodel/.
XPath 4.0: XML Path Language (XPath) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery and XPath Functions and Operators 4.0: XQuery and XPath Functions and Operators 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
Schema Part 1: XML Schema Part 1: Structures Second Edition, Henry Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
Schema Part 2: XML Schema Part 2: Datatypes Second Edition, Paul V. Biron and Ashok Malhotra, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.
Schema 1.1 Part 1: W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures, Sandy Gao, Michael Sperberg-McQueen, Henry Thompson, et al., Editors. World Wide Web Consortium, 05 Apr 2012. This version is http://www.w3.org/TR/2012/REC-xmlschema11-1-20120405/. The latest version is available at http://www.w3.org/TR/xmlschema11-1/.
Schema 1.1 Part 2: W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes, David Peterson, Sandy Gao, Ashok Malhotra, et al., Editors. World Wide Web Consortium, 05 Apr 2012. This version is http://www.w3.org/TR/2012/REC-xmlschema11-2-20120405/. The latest version is available at http://www.w3.org/TR/xmlschema11-2/.
XSLT and XQuery Serialization 4.0: XSLT and XQuery Serialization 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery 1.0 and XPath 2.0 Formal Semantics: XQuery 1.0 and XPath 2.0 Formal Semantics (Second Edition), Jérôme Siméon, Denise Draper, Peter Fankhauser, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is https://www.w3.org/TR/2010/REC-xquery-semantics-20101214/. The latest version is available at https://www.w3.org/TR/xquery-semantics/.
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
RFC 3986: Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, and L. Masinter. Network Working Group, IETF, Jan 2005.
RFC 3987: Internationalized Resource Identifiers (IRIs), M. Duerst and M. Suignard. Network Working Group, IETF, Jan 2005.
Character Model: Character Model for the World Wide Web 1.0: Fundamentals, Martin Dürst, François Yergeau, Richard Ishida, et al., Editors. World Wide Web Consortium, 15 Feb 2005. This version is http://www.w3.org/TR/2005/REC-charmod-20050215/. The latest version is available at http://www.w3.org/TR/charmod/.

I PSVI Construction Summary (Non-Normative)

This section summarizes data model construction from a PSVI for each kind of information item. General notes occur elsewhere.

I.1 Document Construction Information Items

Data model construction requires that the PSVI provide unique names for all anonymous schema types.

Note:

The data model supports incompletely validated documents. Elements and attributes that are not valid are treated as having unknown types.

7.3.3.1 Mapping PSVI Additions to Node Properties

7.3.3.1.3 Relationship Between Typed-Value and String-Value

<offset xsi:type="xs:integer">0030</offset>

If an implementation stores only the string value of a node, the following considerations apply:

Where union types occur, the implementation must be able to deliver the typed value as an instance of the appropriate member type. For example, if the type of an element node is my:integer-or-string, which is defined as a union of xs:integer and xs:string, and the string value of the node is “47”, the implementation must be able to deliver the typed value of the node as either the integer 47 or the string "47", depending on which member type validated the element.
Where types of xs:QName, xs:NOTATION, or types derived from one of these types occur, the implementation must be able to deliver the typed value as a triple consisting of a local name, a namespace prefix, and a namespace URI, even though the namespace URI is not part of the string-value (see 5.4 QNames and NOTATIONS).
Where an element with a complex type and element-only content occurs, it is an error to attempt to access the typed-value of the element node.
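The considerations above can be sketched in code. The following Python fragment is an illustrative assumption, not a normative algorithm: the class `StoredNode`, its fields, the function `typed_value`, and the namespace URI `http://example.com/my` are all hypothetical. It models an implementation that keeps only the string value plus enough type information to derive the typed value on demand, using the `offset` element and the my:integer-or-string union from the text as inputs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredNode:
    """Hypothetical node record that stores only the string value plus
    the type information needed to recover the typed value."""
    string_value: str
    type_name: str
    # For union types: the member type that actually validated the node.
    validating_member: Optional[str] = None
    # For xs:QName values: prefix -> namespace URI bindings in scope.
    in_scope_namespaces: Optional[Dict[str, str]] = None
    # True for a complex type with element-only content.
    element_only: bool = False


def typed_value(node: StoredNode):
    """Derive the typed value from the stored string value."""
    if node.element_only:
        # Accessing the typed value of element-only content is an error.
        raise TypeError("element-only content has no typed value")

    # For a union, the member type that validated the node decides.
    effective = node.validating_member or node.type_name

    if effective == "xs:integer":
        return int(node.string_value)
    if effective == "xs:string":
        return node.string_value
    if effective == "xs:QName":
        prefix, sep, local = node.string_value.partition(":")
        if not sep:
            # No colon: the whole string is the local name.
            prefix, local = "", node.string_value
        # The URI is not in the string value; it comes from the bindings.
        uri = (node.in_scope_namespaces or {}).get(prefix, "")
        return (local, prefix, uri)
    raise ValueError(f"type not handled in this sketch: {effective}")


offset = StoredNode("0030", "xs:integer")
as_integer = StoredNode("47", "my:integer-or-string", validating_member="xs:integer")
as_string = StoredNode("47", "my:integer-or-string", validating_member="xs:string")
qname = StoredNode("my:item", "xs:QName",
                   in_scope_namespaces={"my": "http://example.com/my"})

print(typed_value(offset))
print(typed_value(as_integer), repr(typed_value(as_string)))
print(typed_value(qname))
```

The design point of the sketch is that the string value alone is not sufficient: the validating member type and the in-scope namespace bindings must be retained alongside it, otherwise the union and QName cases cannot be reconstructed.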

XQuery and XPath Data Model 4.0

W3C Editor's Draft 23 February 2026
