Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: Specification in XML format and XML function catalog.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 3.1]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.
A summary of changes since version 3.1 is provided at H Changes since 3.1.
This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
Sections with significant changes are marked Δ in the table of contents. New functions introduced in this version are marked ➕ in the table of contents.
The purpose of this document is to define functions and operators for inclusion in XPath 4.0, XQuery 4.0, and XSLT 4.0. The exact syntax used to call these functions and operators is specified in [XML Path Language (XPath) 4.0], [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0].
This document defines three classes of functions:
General purpose functions, available for direct use in user-written queries, stylesheets, and XPath expressions, whose arguments and results are values defined by the [XQuery and XPath Data Model (XDM) 3.1].
Constructor functions, used for creating instances of a datatype from values of (in general) a different datatype. These functions are also available for general use; they are named after the datatype that they return, and they always take a single argument.
Functions that specify the semantics of operators defined in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language]. These exist for specification purposes only, and are not intended for direct calling from user-written code.
[XML Schema Part 2: Datatypes Second Edition] defines a number of primitive and derived datatypes, collectively known as built-in datatypes. This document defines functions and operations on these datatypes as well as the other types (for example, nodes and sequences of nodes) defined in Section 2.7 Schema Information DM31 of the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operations are available for use in [XML Path Language (XPath) 4.0], [XQuery 4.0: An XML Query Language] and any other host language that chooses to reference them. In particular, they may be referenced in future versions of XSLT and related XML standards.
[XSD 1.1 Part 2] adds to the datatypes defined in [XML Schema Part 2: Datatypes Second Edition]. It introduces a new derived type xs:dateTimeStamp, and it incorporates as built-in types the two types xs:yearMonthDuration and xs:dayTimeDuration which were previously XDM additions to the type system. In addition, XSD 1.1 clarifies and updates many aspects of the definitions of the existing datatypes: for example, it extends the value space of xs:double to allow both positive and negative zero, and extends the lexical space to allow +INF; it modifies the value space of xs:Name to permit additional Unicode characters; it allows year zero and disallows leap seconds in xs:dateTime values; and it allows any character string to appear as the value of an xs:anyURI item. Implementations of this specification may support either XSD 1.0 or XSD 1.1 or both.
In some cases, this specification references XSD for the semantics of operations such as the effect of matching using regular expressions, or conversion of atomic items to strings. In most such cases there is no intended technical difference between the XSD 1.0 and XSD 1.1 specifications, but the 1.1 version often provides clearer explanations and sometimes also corrects technical errors. In such cases this specification often chooses to reference the XSD 1.1 specification. This should not be taken as implying that it is necessary to invoke an XSD 1.1 processor.
References to specific sections of some of the above documents are indicated by cross-document links in this document. Each such link consists of a pointer to a specific section followed by a superscript specifying the linked document. The superscripts have the following meanings: XQ [XQuery 4.0: An XML Query Language], XT [XSL Transformations (XSLT) Version 4.0], XP [XML Path Language (XPath) 4.0], and DM [XQuery and XPath Data Model (XDM) 4.0].
The functions and operators defined in this document are contained in one of several namespaces (see [Namespaces in XML]) and referenced using an xs:QName.
This document uses conventional prefixes to refer to these namespaces. User-written applications can choose a different prefix to refer to the namespace, so long as it is bound to the correct URI. The host language may also define a default namespace for function calls, in which case function names in that namespace need not be prefixed at all. In many cases the default namespace will be http://www.w3.org/2005/xpath-functions, allowing a call on the fn:name function (for example) to be written as name() rather than fn:name(); in this document, however, all example function calls are explicitly prefixed.
The URIs of the namespaces and the conventional prefixes associated with them are:
http://www.w3.org/2001/XMLSchema for constructors — associated with xs.
The section 22 Constructor functions defines constructor functions for the built-in datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and in Section 2.7 Schema Information DM31 of [XQuery and XPath Data Model (XDM) 3.1]. These datatypes and the corresponding constructor functions are in the XML Schema namespace, http://www.w3.org/2001/XMLSchema, and are named in this document using the xs prefix.
http://www.w3.org/2005/xpath-functions for functions — associated with fn.
The namespace prefix used in this document for most functions that are available to users is fn.
http://www.w3.org/2005/xpath-functions/math for functions — associated with math.
This namespace is used for some mathematical functions. The namespace prefix used in this document for these functions is math. These functions are available to users in exactly the same way as those in the fn namespace.
http://www.w3.org/2005/xpath-functions/map for functions — associated with map.
This namespace is used for some functions that manipulate maps (see 18.4 Functions that Operate on Maps). The namespace prefix used in this document for these functions is map. These functions are available to users in exactly the same way as those in the fn namespace.
http://www.w3.org/2005/xpath-functions/array for functions — associated with array.
This namespace is used for some functions that manipulate arrays (see 19.2 Functions that Operate on Arrays). The namespace prefix used in this document for these functions is array. These functions are available to users in exactly the same way as those in the fn namespace.
http://www.w3.org/2005/xqt-errors — associated with err.
There are no functions in this namespace; it is used for error codes.
This document uses the prefix err to represent the namespace URI http://www.w3.org/2005/xqt-errors, which is the namespace for all XPath and XQuery error codes and messages. This namespace prefix is not predeclared and its use in this document is not normative.
http://www.w3.org/2010/xslt-xquery-serialization — associated with output.
There are no functions in this namespace: it is used for serialization parameters, as described in [XSLT and XQuery Serialization 3.1].
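The prefix conventions above can be modelled with a small lookup table. The following Python sketch is illustrative only: the helper name `expand_function_name` and its parameters are assumptions made for this example, not part of any host language. It shows that an unprefixed name resolves through the default function namespace, and that any prefix works as long as it is bound to the correct URI.

```python
# Conventional prefixes used in this document, mapped to their namespace URIs.
CONVENTIONAL_PREFIXES = {
    "xs": "http://www.w3.org/2001/XMLSchema",
    "fn": "http://www.w3.org/2005/xpath-functions",
    "math": "http://www.w3.org/2005/xpath-functions/math",
    "map": "http://www.w3.org/2005/xpath-functions/map",
    "array": "http://www.w3.org/2005/xpath-functions/array",
    "err": "http://www.w3.org/2005/xqt-errors",
    "output": "http://www.w3.org/2010/xslt-xquery-serialization",
}


def expand_function_name(lexical_name, bindings=None, default_function_namespace=None):
    """Return (namespace URI, local name) for a possibly prefixed name.

    Hypothetical helper: `bindings` maps prefixes to URIs, and
    `default_function_namespace` is applied to unprefixed names.
    """
    if bindings is None:
        bindings = CONVENTIONAL_PREFIXES
    if default_function_namespace is None:
        default_function_namespace = CONVENTIONAL_PREFIXES["fn"]
    prefix, separator, local = lexical_name.partition(":")
    if not separator:
        # No prefix: the default function namespace applies.
        return (default_function_namespace, lexical_name)
    return (bindings[prefix], local)


# name() and fn:name() denote the same function when the default
# function namespace is the fn namespace.
print(expand_function_name("name") == expand_function_name("fn:name"))  # True
```

A user-chosen prefix such as `f`, bound to the same URI, expands to the same pair as `fn`.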
Functions defined with the op prefix are described here to underpin the definitions of the operators in [XML Path Language (XPath) 4.0], [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0]. These functions are not available directly to users, and there is no requirement that implementations should actually provide these functions. For this reason, no namespace is associated with the op prefix. For example, multiplication is generally associated with the * operator, but it is described as a function in this document:
op:numeric-multiply($arg1 as xs:numeric, $arg2 as xs:numeric) as xs:numeric
Sometimes there is a need to use an operator as a function. To meet this requirement, the function fn:op takes any simple binary operator as its argument, and returns a corresponding function. So, for example, fn:for-each-pair($seq1, $seq2, fn:op("+")) performs a pairwise addition of the values in two input sequences.
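As a rough analogy, the idea of turning an operator symbol into a function can be sketched in Python. The names `op` and `for_each_pair` below are stand-ins written for this illustration, and the operator table covers only three symbols; the sketch does not reproduce the type rules of the XPath operators.

```python
import operator

# Hypothetical lookup table: a few binary operator symbols and Python stand-ins.
_BINARY_OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}


def op(symbol):
    """Return a two-argument function for the given operator symbol."""
    return _BINARY_OPERATORS[symbol]


def for_each_pair(seq1, seq2, action):
    """Apply `action` to corresponding items of the two sequences.

    zip() stops at the end of the shorter sequence.
    """
    return [action(a, b) for a, b in zip(seq1, seq2)]


print(for_each_pair([1, 2, 3], [10, 20, 30], op("+")))  # [11, 22, 33]
```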
Note:
The above namespace URIs are not expected to change from one version of this document to another. The contents of these namespaces may be extended to allow additional functions (and errors, and serialization parameters) to be defined.
The diagrams in this section show how nodes, functions, primitive simple types, and user defined types fit together into a type system. This type system comprises two distinct subsystems that both include the primitive atomic types. In the diagrams, connecting lines represent relationships between derived types and the types from which they are derived; the former are always below and to the right of the latter.
The types xs:IDREFS, xs:NMTOKENS, and xs:ENTITIES, the type xs:numeric, and all user-defined list types and union types are special: they are lists or unions rather than types derived by extension or restriction.
The first diagram illustrates the relationship of various item types.
Item types are used to characterize the various types of item that can appear in a sequence (nodes, atomic items, and functions), and they are therefore used in declaring the types of variables or the argument types and result types of functions.
In XDM, item types include node types, function types, and built-in atomic types. Item types form a directed graph, rather than a hierarchy or lattice: in the relationship defined by the derived-from(A, B) function, some types are derived from more than one other type. Examples include functions (function(xs:string) as xs:int is substitutable for function(xs:NCName) as xs:int and also for function(xs:string) as xs:decimal), and choice types (A is substitutable for the choice type (A | B) and also for (A | C)). Record types provide an alternative way of categorizing maps: the instances of record(longitude, latitude) overlap with the instances of map(xs:string, xs:double). The diagram, which shows only hierarchic relationships, is therefore a simplification of the full model.
item (abstract)
  anyAtomicType (built-in atomic)
  GNode (node)
    XNode (node)
      attribute (node)
        user-defined attribute types (user-defined)
      document (node)
        user-defined document types (user-defined)
      element (node)
        user-defined element types (user-defined)
      text (node)
      comment (node)
      processing-instruction (node)
      namespace (node)
    JNode (node)
  function(*) (function item)
    user-defined function item types (user-defined)
    array(*) (function item)
      user-defined array types (user-defined)
    map(*) (function item)
      user-defined map types (user-defined)
      user-defined record types (user-defined)
Legend:
Supertype
subtype
Abstract types (abstract)
Built-in atomic types (built-in atomic)
Node types (node)
Function item types (function item)
User-defined types (user-defined)
The terminology used to describe the functions and operators on types defined in [XML Schema Part 2: Datatypes Second Edition] is defined in the body of this specification. The terms defined in this section are used in building those definitions.
Note:
Following in the tradition of [XML Schema Part 2: Datatypes Second Edition], the terms type and datatype are used interchangeably.
This section is concerned with the question of whether two calls on a function, with the same arguments, may produce different results.
In this section the term function, unless otherwise specified, applies equally to function definitionsXP (which can be the target of a static function call) and function itemsDM (which can be the target of a dynamic function call).
[Definition] An execution scope is a sequence of calls to the function library during which certain aspects of the state are required to remain invariant. For example, two calls to fn:current-dateTime within the same execution scope will return the same result. The execution scope is defined by the host language that invokes the function library. In XSLT, for example, any two function calls executed during the same transformation are in the same execution scope (except that static expressions, such as those used in use-when attributes, are in a separate execution scope).
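The invariance required within an execution scope can be pictured as state that is sampled once and then reused. The following Python sketch is a simplified model under that assumption; the class name `ExecutionScope` and its method are hypothetical and do not correspond to any host-language API.

```python
from datetime import datetime, timezone


class ExecutionScope:
    """Hypothetical holder for state that stays fixed within one scope."""

    def __init__(self):
        # The current date and time is sampled once, when the scope starts.
        self._current_date_time = datetime.now(timezone.utc)

    def current_date_time(self):
        """Model of fn:current-dateTime: the same value on every call."""
        return self._current_date_time


scope = ExecutionScope()
first_call = scope.current_date_time()
second_call = scope.current_date_time()
print(first_call == second_call)  # True
```

A second `ExecutionScope` instance would sample the clock again, which corresponds to a separate execution scope.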
The following definition explains more precisely what it means for two function calls to return the same result:
[Definition] Two values $V1 and $V2 are defined to be identical if they contain the same number of items and the items are pairwise identical. Two items are identical if and only if one of the following conditions applies:
Both items are atomic items, of precisely the same type, and the values are equal as defined using the eq operator, using the Unicode codepoint collation when comparing strings.
Both items are nodes, and represent the same node.
Both items are maps, both maps have the same number of entries, and for every entry E1 in the first map there is an entry E2 in the second map such that the keys of E1 and E2 are the same key, and the corresponding values V1 and V2 are identical.
Both items are arrays, both arrays have the same number of members, and the members are pairwise identical.
Both items are function items, neither item is a map or array, and the two function items have the same function identity. The concept of function identity is explained in Section 7.18.1 Function ItemsDM.
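The conditions above can be read as a recursive comparison. The following Python sketch models them under several simplifying assumptions that are not part of this specification: sequences are tuples, arrays are lists whose members are tuples, maps are dictionaries whose values are tuples, nodes are instances of a hypothetical `Node` class, function identity is Python object identity, and "precisely the same type" is approximated by the Python type.

```python
class Node:
    """Stand-in for an XDM node; node identity is Python object identity."""


def values_identical(v1, v2):
    """Values are sequences, modelled here as tuples of items."""
    return len(v1) == len(v2) and all(
        items_identical(a, b) for a, b in zip(v1, v2))


def items_identical(a, b):
    if isinstance(a, Node) or isinstance(b, Node):
        return a is b  # both must be the same node
    if isinstance(a, dict) and isinstance(b, dict):
        # Maps: same number of entries, same keys, identical values.
        return len(a) == len(b) and all(
            key in b and values_identical(a[key], b[key]) for key in a)
    if isinstance(a, list) and isinstance(b, list):
        # Arrays: same number of members, pairwise identical.
        return len(a) == len(b) and all(
            values_identical(m1, m2) for m1, m2 in zip(a, b))
    if callable(a) or callable(b):
        return a is b  # function identity
    # Atomic items: precisely the same type, and equal.
    return type(a) is type(b) and a == b


print(items_identical(1, 1))    # True
print(items_identical(1, 1.0))  # False
```

In this model the integer 1 and the double 1.0 compare equal but are not identical, because their types differ.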
Some functions produce results that depend not only on their explicit arguments, but also on the static and dynamic context.
[Definition] A function definitionXP may have the property of being context-dependent: the result of such a function depends on the values of properties in the static and dynamic evaluation context of the caller as well as on the actual supplied arguments (if any). A function definition may be context-dependent for some arities in its arity range, and context-independent for others: for example fn:name#0 is context-dependent while fn:name#1 is context-independent.
[Definition] A function definitionXP that is not context-dependent is called context-independent.
The main categories of context-dependent functions are:
Functions that explicitly deliver the value of a component of the static or dynamic context, for example fn:static-base-uri, fn:default-collation, fn:position, or fn:last.
Functions with an optional parameter whose default value is taken from the static or dynamic context of the caller, usually either the context value (for example, fn:node-name) or the default collation (for example, fn:index-of).
Functions that use the static context of the caller to expand or disambiguate the values of supplied arguments: for example fn:doc expands its first argument using the static base URI of the caller, and xs:QName expands its first argument using the in-scope namespaces of the caller.
[Definition] A function is focus-dependent if its result depends on the focusXP31 (that is, the context item, position, or size) of the caller.
[Definition] A function that is not focus-dependent is called focus-independent.
Note:
Some functions depend on aspects of the dynamic context that remain invariant within an execution scope, such as the implicit timezone. Formally this is treated in the same way as any other context dependency, but internally, the implementation may be able to take advantage of the fact that the value is invariant.
Note:
User-defined functions in XQuery and XSLT may depend on the static context of the function definition (for example, the in-scope namespaces) and also in a limited way on the dynamic context (for example, the values of global variables). However, the only way they can depend on the static or dynamic context of the caller — which is what concerns us here — is by defining optional parameters whose default values are context-dependent.
Note:
Because the focus is a specific part of the dynamic context, all focus-dependent functions are also context-dependent. A context-dependent function, however, may be either focus-dependent or focus-independent.
A function definition that is context-dependent can be used as the target of a named function reference, can be partially applied, and can be found using fn:function-lookup. The principle in such cases is that the static context used for the function evaluation is taken from the static context of the named function reference, partial function application, or the call on fn:function-lookup; and the dynamic context for the function evaluation is taken from the dynamic context of the evaluation of the named function reference, partial function application, or the call of fn:function-lookup. These constructs all deliver a function itemDM having a captured context based on the static and dynamic context of the construct that created the function item. This captured context forms part of the closure of the function item.
The result of a dynamic call to a function item never depends on the static or dynamic context of the dynamic function call, only (where relevant) on the captured context held within the function item itself.
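The captured context behaves much like the variables closed over by a nested function. The following Python sketch is an analogy only: `default_collation_item` and `dynamic_call` are hypothetical names, the context is reduced to a single dictionary entry, and the collation names are placeholders.

```python
def default_collation_item(creation_context):
    """Hypothetical stand-in for the function item produced by a named
    function reference such as fn:default-collation#0."""
    captured = creation_context["default-collation"]  # fixed at creation time

    def function_item():
        # No context parameter: a dynamic call cannot see the caller's context.
        return captured

    return function_item


def dynamic_call(function_item, caller_context):
    """Model of a dynamic function call; caller_context is deliberately unused."""
    return function_item()


item = default_collation_item({"default-collation": "collation-A"})
print(dynamic_call(item, {"default-collation": "collation-B"}))  # collation-A
```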
The fn:function-lookup function is a special case because it is potentially dependent on everything in the static and dynamic context. This is because the static and dynamic context of the call to fn:function-lookup form the captured context of the function item that fn:function-lookup returns.
[Definition] A function that is guaranteed to produce identical results from repeated calls within a single execution scope if the explicit and implicit arguments are identical is referred to as deterministic.
[Definition] A function that is not deterministic is referred to as nondeterministic.
All functions defined in this specification are deterministic unless otherwise stated. Exceptions include the following:
[Definition] Some functions (such as fn:distinct-values, fn:unordered, map:keys, and map:for-each) produce results in an implementation-defined or implementation-dependent order. In such cases two calls with the same arguments are not guaranteed to produce the results in the same order. These functions are said to be nondeterministic with respect to ordering.
Some functions (such as fn:analyze-string, fn:parse-xml, fn:parse-xml-fragment, fn:parse-html, and fn:json-to-xml) construct a tree of nodes to represent their results. There is no guarantee that repeated calls with the same arguments will return the same identical node (in the sense of the is operator). However, if non-identical nodes are returned, their content will be the same in the sense of the fn:deep-equal function. Such a function is said to be nondeterministic with respect to node identity.
Some functions (such as fn:doc and fn:collection) create new nodes by reading external documents. Such functions are guaranteed to be deterministic with the exception that an implementation is allowed to make them nondeterministic as a user option.
Where the results of a function are described as being (to a greater or lesser extent) implementation-defined or implementation-dependent, this does not by itself remove the requirement that the results should be deterministic: that is, that repeated calls with the same explicit and implicit arguments must return identical results.
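Nondeterminism with respect to node identity can be observed in any parser that builds a fresh tree on each call. The following Python sketch uses the standard `xml.etree.ElementTree` module as a stand-in for fn:parse-xml; comparing serialized output is only a rough approximation of fn:deep-equal, chosen here for brevity.

```python
import xml.etree.ElementTree as ET


def parse_xml(text):
    """Stand-in for fn:parse-xml: each call builds a new tree."""
    return ET.fromstring(text)


first = parse_xml("<a><b>1</b></a>")
second = parse_xml("<a><b>1</b></a>")

# The two results are distinct objects (compare the `is` operator on nodes)...
print(first is second)  # False
# ...but their content is the same.
print(ET.tostring(first) == ET.tostring(second))  # True
```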
[Definition] The function fn:concat is defined to be variadic: it accepts any number of arguments. No other function has this property.
Accessors and their semantics are described in [XQuery and XPath Data Model (XDM) 3.1]. Some of these accessors are exposed to the user through the functions described below.
Each of these functions has an arity-zero signature which is equivalent to the arity-one form, with the context value supplied as the implicit first argument. In addition, each of the arity-one functions accepts an empty sequence as the argument, in which case it generally delivers an empty sequence as the result: the exception is fn:string, which delivers a zero-length string.
| Function | Accessor | Accepts | Returns |
|---|---|---|---|
| fn:node-name | node-name | node (optional) | xs:QName (optional) |
| fn:nilled | nilled | node (optional) | xs:boolean (optional) |
| fn:string | string-value | item (optional) | xs:string |
| fn:data | typed-value | zero or more items | a sequence of atomic items |
| fn:base-uri | base-uri | node (optional) | xs:anyURI (optional) |
| fn:document-uri | document-uri | node (optional) | xs:anyURI (optional) |
| Function | Meaning |
|---|---|
| fn:node-name | Returns the name of a node, as an xs:QName. |
| fn:nilled | Returns true for an element that is nilled. |
| fn:string | Returns the value of $value represented as an xs:string. |
| fn:data | Returns the result of atomizing a sequence. This process flattens arrays, and replaces nodes by their typed values. |
| fn:base-uri | Returns the base URI of a node. |
| fn:document-uri | Returns the URI of a resource where a document can be found, if available. |
Returns the name of a node, as an xs:QName.
fn:node-name($node as node()? := .) as xs:QName?
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If $node is the empty sequence, the empty sequence is returned.
Otherwise, the function returns the result of the dm:node-name accessor as defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 7.5.10 node-name AccessorDM).
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not an instance of the sequence type node()?, type error [err:XPTY0004]XP.
For element and attribute nodes, the name of the node is returned as an xs:QName, retaining the prefix, namespace URI, and local part.
For processing instructions, the name of the node is returned as an xs:QName in which the prefix and namespace URI are absentDM.
For a namespace node, the function returns an empty sequence if the node represents the default namespace; otherwise it returns an xs:QName in which prefix and namespace URI are absentDM and the local part is the namespace prefix being bound.
For all other kinds of node, the function returns the empty sequence.
| Variables | |
|---|---|
| let $e := <doc> <p id="alpha" xml:id="beta">One</p> <p id="gamma" xmlns="http://example.com/ns">Two</p> <ex:p id="delta" xmlns:ex="http://example.com/ns">Three</ex:p> <?pi 3.14159?> </doc> | |
| Expression | Result |
|---|---|
| | QName("", "p") |
| | QName("http://example.com/ns", "p") |
| | QName("http://example.com/ns", "ex:p") |
| | QName("", "pi") |
| | () |
| | QName("", "id") |
| | #xml:id |
Returns true for an element that is nilled.
fn:nilled($node as node()? := .) as xs:boolean?
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If $node is the empty sequence, the function returns the empty sequence.
Otherwise, the function returns the result of the dm:nilled accessor as defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 7.5.8 nilled AccessorDM).
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not an instance of the sequence type node()?, type error [err:XPTY0004]XP.
If $node is not an element node, the function returns the empty sequence.
If $node is an untyped element node, the function returns false.
In practice, the function returns true only for an element node that has the attribute xsi:nil="true" and that is successfully validated against a schema that defines the element to be nillable; the detailed rules, however, are defined in [XQuery and XPath Data Model (XDM) 3.1].
Returns the value of $value represented as an xs:string.
fn:string($value as item()? := .) as xs:string
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
In the zero-argument version of the function, $value defaults to the context value. That is, calling fn:string() is equivalent to calling fn:string(.).
If $value is the empty sequence, the function returns the zero-length string.
If $value is an XNodeDM, the function returns the string value of the node, as obtained using the dm:string-value accessor defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 7.5.12 string-value AccessorDM).
If $value is a JNodeDM, the function returns the result of string(JNode-value($value)). This will fail in the case where JNode-value($value) is a map or an array.
If $value is an atomic item, the function returns the result of the expression $value cast as xs:string (see 23 Casting).
In all other cases, a dynamic error occurs (see below).
The following errors may be raised when $value is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not an instance of the sequence type item()?, type error [err:XPTY0004]XP.
A type error is raised [err:FOTY0014] if $value is a function item (this includes maps and arrays).
Every node has a string value, even an element with element-only content (which has no typed value). Moreover, casting an atomic item to a string always succeeds. Functions, maps, and arrays have no string value, so these are the only arguments that satisfy the type signature but cause failure. Applying the string function to a JNode succeeds if the JNode wraps a simple value such as a string, number, or boolean, or if it wraps an XNode, but it fails in the case where the JNode wraps a map or an array.
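The case analysis for fn:string can be sketched as a simple dispatch on the kind of argument. The following Python model is illustrative and rests on assumptions that are not part of this specification: `None` stands for the empty sequence, `xml.etree.ElementTree` elements stand for element nodes, Python dictionaries, lists, and callables stand for maps, arrays, and other function items, and only integers, booleans, and strings are handled as atomic items. The casting rules for other atomic types and the handling of JNodes are not modelled.

```python
import xml.etree.ElementTree as ET


class FOTY0014(TypeError):
    """Stand-in for the error raised when the argument is a function item."""


def xpath_string(value):
    if value is None:                      # empty sequence
        return ""
    if isinstance(value, ET.Element):      # element node: concatenated text
        return "".join(value.itertext())
    if callable(value) or isinstance(value, (dict, list)):
        raise FOTY0014("function items, maps, and arrays have no string value")
    if isinstance(value, bool):            # xs:boolean casts to true or false
        return "true" if value else "false"
    return str(value)                      # integers and strings only


para = ET.fromstring(
    '<para>There lived a <term author="Tolkien">hobbit</term>.</para>')
print(xpath_string(para))  # There lived a hobbit.
print(xpath_string(23))    # 23
```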
| Variables | |
|---|---|
| let $para := <para>There lived a <term author="Tolkien">hobbit</term>.</para> | |
| Expression | Result |
|---|---|
| | "23" |
| | "false" |
| | "Paris" |
| | Raises error XPTY0004. |
| | Raises error FOTY0014. |
| | Raises error FOTY0014. |
| | "30" |
| | "There lived a hobbit." |
Returns the result of atomizing a sequence. This process flattens arrays, and replaces nodes by their typed values.
fn:data($input as item()* := .) as xs:anyAtomicType*
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
The result of fn:data is the sequence of atomic items produced by applying the following rules to each item in $input:
If the item is an atomic item, it is appended to the result sequence.
If the item is an XNodeDM, the typed value of the node is appended to the result sequence. The typed value is a sequence of zero or more atomic items: specifically, the result of the dm:typed-value accessor as defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 7.5.14 typed-value AccessorDM).
If the item is a JNodeDM, the atomized value of its ¶value property is appended to the result sequence.
If the item is an array, the result of applying fn:data to each member of the array, in order, is appended to the result sequence.
A type error is raised [err:FOTY0012] if an item in the sequence $input is a node that does not have a typed value.
A type error is raised [err:FOTY0013] if an item in the sequence $input is a function item other than an array.
A type error is raised [err:XPDY0002]XP if $input is omitted and the context value is absentDM.
The process of applying the fn:data function to a sequence is referred to as atomization. In many cases an explicit call on fn:data is not required, because atomization is invoked implicitly when a node or sequence of nodes is supplied in a context where an atomic item or sequence of atomic items is required.
The result of atomizing an empty sequence is an empty sequence.
The result of atomizing an empty array is an empty sequence.
| Variables | |
|---|---|
| let $para := <para>There lived a <term author="Tolkien">hobbit</term>.</para> | |
| Expression | Result |
|---|---|
| | 123 |
| | 123, 456 |
| | 1, 2, 3, 4 |
| | xs:untypedAtomic("There lived a hobbit.") |
| | xs:untypedAtomic("Tolkien") |
| | Raises error FOTY0013. |
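Atomization as described above is a recursive flattening. The following Python sketch models it under stated assumptions: the input sequence is any iterable of items, arrays are Python lists whose members are single items or nested arrays, untyped element nodes are `xml.etree.ElementTree` elements whose typed value is approximated by a plain string, and maps and other function items are dictionaries and callables. JNodes and schema-typed nodes are not modelled.

```python
import xml.etree.ElementTree as ET


class FOTY0013(TypeError):
    """Stand-in for the error raised for a function item other than an array."""


def atomize(items):
    result = []
    for item in items:
        if isinstance(item, list):
            # Array: atomize each member, in order.
            result.extend(atomize(item))
        elif isinstance(item, ET.Element):
            # Untyped element node: its string value, as a plain string.
            result.append("".join(item.itertext()))
        elif callable(item) or isinstance(item, dict):
            raise FOTY0013("maps and other function items cannot be atomized")
        else:
            # Atomic item: appended unchanged.
            result.append(item)
    return result


print(atomize(([[1, 2], [3, 4]],)))  # [1, 2, 3, 4]
```

Atomizing an empty sequence or an empty array yields an empty list in this model, matching the rules stated above.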
Returns the base URI of a node.
fn:base-uri($node as node()? := .) as xs:anyURI?
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
The zero-argument version of the function returns the base URI of the context node: it is equivalent to calling fn:base-uri(.).
The single-argument version of the function behaves as follows:
If $node is the empty sequence, the function returns the empty sequence.
Otherwise, the function returns the value of the dm:base-uri accessor applied to the node $node. This accessor is defined, for each kind of node, in the XDM specification (see Section 7.5.2 base-uri AccessorDM).
Note:
As explained in XDM, document, element and processing-instruction nodes have a base-uri property which may be empty. The base-uri property for all other node kinds is the empty sequence. The dm:base-uri accessor returns the base-uri property of a node if it exists and is non-empty; otherwise it returns the result of applying the dm:base-uri accessor to its parent, recursively. If the node does not have a parent, or if the recursive ascent up the ancestor chain encounters a parentless node whose base-uri property is empty, the empty sequence is returned. In the case of namespace nodes, however, the result is always an empty sequence — it does not depend on the base URI of the parent element.
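The recursive ascent described in this note can be sketched as follows. The Python model is a simplification written for illustration: the `Node` class, its fields, and the example URI are hypothetical, and `None` stands for both an empty base-uri property and the empty sequence.

```python
class Node:
    """Hypothetical node with a kind, an optional base-uri property, and a parent."""

    def __init__(self, kind, base_uri=None, parent=None):
        self.kind = kind
        self.base_uri = base_uri  # None models an empty base-uri property
        self.parent = parent


def dm_base_uri(node):
    """Model of the dm:base-uri accessor as summarized in the note."""
    if node.kind == "namespace":
        return None               # never inherited from the parent element
    if node.base_uri:
        return node.base_uri      # the node's own non-empty property
    if node.parent is None:
        return None               # parentless node with an empty property
    return dm_base_uri(node.parent)


doc = Node("document", base_uri="http://example.com/doc.xml")
element = Node("element", parent=doc)
attribute = Node("attribute", parent=element)
namespace = Node("namespace", parent=element)

print(dm_base_uri(attribute))  # http://example.com/doc.xml
print(dm_base_uri(namespace))  # None
```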
See also fn:static-base-uri.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not an instance of the sequence type node()?, type error [err:XPTY0004]XP.
Returns the URI of a resource where a document can be found, if available.
fn:document-uri($node as node()? := .) as xs:anyURI?
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If $node is the empty sequence, the function returns the empty sequence.
If $node is not a document node, the function returns the empty sequence.
Otherwise, the function returns the value of the document-uri accessor applied to $node, as defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 7.4.1.2 AccessorsDM).
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not an instance of the sequence type node()?, type error [err:XPTY0004]XP.
In the 3.1 version of this specification, it was mandated that two distinct documents could not have the same document-uri property: more specifically, it was guaranteed that for any document node $D, either document-uri($D) would be absent, or doc(document-uri($D)) would return $D.
For various reasons, this constraint has proved impractical. Different parts of an application may read the same external resource in different ways, for example with or without validation or whitespace stripping, leading to different document nodes derived from the same external resource having the same document-uri property. In addition, the specification explicitly allows implementations, at user request, to relax the requirements for determinism of resource access functions, which makes it possible for multiple calls of functions such as fn:doc, fn:json-doc, or fn:collection to return different results for the same supplied URI.
Although the uniqueness of the document-uri property is no longer an absolute constraint, it is still desirable that implementations should where possible respect the principle that URIs are usable as identifiers for resources.
In the case of a document node $D returned by the fn:doc function, it will generally be the case that fn:document-uri($D) returns a URI $U such that a call on fn:doc($U) in the same dynamic context will return the same document node $D. The URI $U will not necessarily be the same URI that was originally passed to the fn:doc function, since several URIs may identify the same resource.
It is recommended that implementations of fn:collection ensure that, for any document in the returned collection whose document-uri property is non-empty, a call on fn:doc supplying this URI returns the same document node.
This section specifies further functions on nodes. Nodes are formally defined in Section 6 Nodes DM31.
| Function | Meaning |
|---|---|
| fn:name | Returns the name of a node, as an xs:string that is either the zero-length string, or has the lexical form of an xs:QName. |
| fn:local-name | Returns the local part of the name of $node as an xs:string that is either the zero-length string, or has the lexical form of an xs:NCName. |
| fn:namespace-uri | Returns the namespace URI part of the name of $node, as an xs:anyURI value. |
| fn:lang | This function tests whether the language of $node, or the context value if the second argument is omitted, as specified by xml:lang attributes is the same as, or is a sublanguage of, the language specified by $language. |
| fn:root | Returns the root of the tree to which $node belongs. This will usually, but not necessarily, be a document node. The function can be applied both to XNodesDM and to JNodesDM. |
| fn:path | Returns a path expression that can be used to select the supplied node relative to the root of its containing document. |
| fn:has-children | Returns true if the supplied node has one or more child nodes (of any kind). |
| fn:siblings | Returns the supplied node together with its siblings, in document order. |
Returns the local part of the name of $node as an xs:string that is either the zero-length string, or has the lexical form of an xs:NCName.
fn:local-name( | ||
$node | as | := . |
) as | ||
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If the argument is supplied and is the empty sequence, the function returns the zero-length string.
If the node identified by $node has no name (that is, if it is a document node, a comment, a text node, or a namespace node having no name), the function returns the zero-length string.
Otherwise, the function returns the local part of the expanded-QName of the node identified by $node, as determined by the dm:node-name accessor defined in Section 7.5.10 node-name AccessorDM. This will be an xs:string whose lexical form is an xs:NCName.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not a single node, type error [err:XPTY0004]XP.
| Variables | |
|---|---|
let $e := <doc> <p id="alpha" xml:id="beta">One</p> <p id="gamma" xmlns="http://example.com/ns">Two</p> <ex:p id="delta" xmlns:ex="http://example.com/ns">Three</ex:p> <?pi 3.14159?> </doc> | |
| Expression | Result |
|---|---|
| "p" |
| "p" |
| "p" |
| "pi" |
| "" |
| "id" |
| "id" |
Returns the namespace URI part of the name of $node, as an xs:anyURI value.
fn:namespace-uri( | ||
$node | as | := . |
) as | ||
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If the node identified by $node is neither an element nor an attribute node, or if it is an element or attribute node whose expanded-QName (as determined by the dm:node-name accessor in Section 7.5.10 node-name AccessorDM) is in no namespace, then the function returns the zero-length xs:anyURI value.
Otherwise, the result will be the namespace URI part of the expanded-QName of the node identified by $node, as determined by the dm:node-name accessor defined in Section 7.5.10 node-name AccessorDM, returned as an xs:anyURI value.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not an instance of the sequence type node()?, type error [err:XPTY0004]XP.
| Variables | |
|---|---|
let $e := <doc> <p id="alpha" xml:id="beta">One</p> <p id="gamma" xmlns="http://example.com/ns">Two</p> <ex:p id="delta" xmlns:ex="http://example.com/ns">Three</ex:p> <?pi 3.14159?> </doc> | |
| Expression | Result |
|---|---|
| "" |
| "http://example.com/ns" |
| "http://example.com/ns" |
| "" |
| "" |
| "" |
| "http://www.w3.org/XML/1998/namespace" |
Returns the root of the tree to which $node belongs. This will usually, but not necessarily, be a document node. The function can be applied both to XNodesDM and to JNodesDM.
fn:root( | ||
$node | as | := . |
) as | ||
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the function is called without an argument, the context value (.) is used as the default argument.
If the (explicit or implicit) argument is an XNodeDM, the function returns the value of the expression ($arg/ancestor-or-self::node())[1].
If the (explicit or implicit) argument is a JNodeDM, the function returns the value of the expression $arg?ancestor-or-self::*[last()].
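Both rules amount to selecting the outermost node on the ancestor-or-self axis. The following sketch models that as a walk up the parent chain. The `TreeNode` class is a hypothetical stand-in for either kind of node, and `None` models the empty sequence.

```python
class TreeNode:
    # Hypothetical node with a parent link; not an XDM API.
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def root(node):
    """Return the outermost ancestor-or-self of node, or None for an empty input."""
    if node is None:
        return None
    while node.parent is not None:
        node = node.parent
    return node
```

A parentless node is its own root, which is why the result is not necessarily a document node.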
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not an instance of the sequence type GNode()?, type error [err:XPTY0004]XP.
These examples use some variables which could be defined in [XQuery 4.0: An XML Query Language] as: | |
let $i := <tool>wrench</tool>
let $o := <order>{ $i }<quantity>5</quantity></order>
let $odoc := document { $o }
let $newi := $o/tool | |
Or they could be defined in [XSL Transformations (XSLT) Version 4.0] as: | |
<xsl:variable name="i" as="element()">
<tool>wrench</tool>
</xsl:variable>
<xsl:variable name="o" as="element()">
<order>
<xsl:copy-of select="$i"/>
<quantity>5</quantity>
</order>
</xsl:variable>
<xsl:variable name="odoc">
<xsl:copy-of select="$o"/>
</xsl:variable>
<xsl:variable name="newi" select="$o/tool"/> | |
| |
| |
| |
| |
The final three examples could be made type-safe by wrapping their operands with |
This section specifies functions on sequences of nodes.
| Function | Meaning |
|---|---|
| fn:distinct-ordered-nodes | Removes duplicate GNodes and sorts the input into document order. |
| fn:innermost | Returns every node within the input sequence that is not an ancestor of another member of the input sequence; the nodes are returned in document order with duplicates eliminated. |
| fn:outermost | Returns every node within the input sequence that has no ancestor that is itself a member of the input sequence; the nodes are returned in document order with duplicates eliminated. |
Removes duplicate GNodes and sorts the input into document order.
fn:distinct-ordered-nodes( | ||
$nodes | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
Any duplicate GNodes (that is, XNodes or JNodes) in the input (based on node identity) are discarded. The remaining GNodes are returned in document orderXP.
Document order is implementation-dependent (but stable) for GNodes in different trees. If some GNode in tree A precedes some GNode in tree B, then every GNode in A precedes every GNode in B.
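The behavior can be modeled as identity-based deduplication followed by a stable sort on a document-order key. In this sketch the `GNode` class and the `position` index are hypothetical. The key is a pair (tree number, position within tree), which is one way to guarantee that all nodes of one tree sort before all nodes of another.

```python
class GNode:
    # Hypothetical node; identity is Python object identity.
    def __init__(self, name):
        self.name = name


def distinct_ordered_nodes(nodes, order_key):
    """Discard duplicates by identity, then sort by the supplied document-order key."""
    seen = set()
    unique = []
    for node in nodes:
        if id(node) not in seen:
            seen.add(id(node))
            unique.append(node)
    return sorted(unique, key=order_key)


# A flat model of <doc><a/><b/><c/><d/><c/><e/></doc> in document order.
doc = [GNode(n) for n in ["doc", "a", "b", "c", "d", "c", "e"]]
position = {id(n): (0, i) for i, n in enumerate(doc)}

a_nodes = [n for n in doc if n.name == "a"]
b_nodes = [n for n in doc if n.name == "b"]
c_nodes = [n for n in doc if n.name == "c"]

result = distinct_ordered_nodes(
    c_nodes + b_nodes + a_nodes + b_nodes,
    lambda n: position[id(n)],
)
names = [n.name for n in result]
```

The two `c` elements are distinct nodes, so both survive; the repeated `b` node is the same node selected twice, so it appears once.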
| Expression: | let $x := parse-xml('<doc><a/><b/><c/><d/><c/><e/></doc>')
return distinct-ordered-nodes(($x//c, $x//b, $x//a, $x//b)) ! name() |
|---|---|
| Result: | "a", "b", "c", "c" (The two |
| Expression: | let $x := {"a":{"a":{"a":1}}}
return distinct-ordered-nodes(
$x ? descendant::a ? descendant::a)
=> count() |
| Result: | 3(The innermost map entry |
This section specifies arithmetic operators on the numeric datatypes defined in [XML Schema Part 2: Datatypes Second Edition].
The operators described in this section are defined on the following atomic types.
decimal
integer
double
float
Legend:
Supertype
subtype
Built-in atomic types
They also apply to types derived by restriction from the above types.
The type xs:numeric is defined as a union type whose member types are (in order) xs:double, xs:float, and xs:decimal. This type is implicitly imported into the static context, so it can also be used in defining the signature of user-written functions. Apart from the fact that it is implicitly imported, it behaves exactly like a user-defined type with the same definition. This means, for example:
If the expected type of a function parameter is given as xs:numeric, the actual value supplied can be an instance of any of these three types, or any type derived from these three by restriction (this includes the built-in type xs:integer, which is derived from xs:decimal).
If the expected type of a function parameter is given as xs:numeric, and the actual value supplied is xs:untypedAtomic (or a node whose atomized value is xs:untypedAtomic), then it will be cast to the union type xs:numeric using the rules in 23.3.7 Casting to union types. Because the lexical space of xs:double subsumes the lexical space of the other member types, and xs:double is listed first, the effect is that if the untyped atomic item is in the lexical space of xs:double, it will be converted to an xs:double, and if not, a dynamic error occurs.
When the return type of a function is given as xs:numeric, the actual value returned will be an instance of one of the three member types (and perhaps also of types derived from these by restriction). The rules for the particular function will specify how the type of the result depends on the values supplied as arguments. In many cases, for the functions in this specification, the result is defined to be the same type as the first argument.
Note:
This specification uses [IEEE 754-2019] arithmetic for xs:float and xs:double values. One consequence of this is that some operations result in the value NaN (not a number), which has the unusual property that it is not equal to itself. Another consequence is that some operations return the value negative zero. This differs from [XML Schema Part 2: Datatypes Second Edition], which defines NaN as being equal to itself and defines only a single zero in the value space. The text accompanying several functions defines behavior for both positive and negative zero inputs and outputs in the interest of alignment with [IEEE 754-2019]. A conformant implementation must respect these semantics. In consequence, the expression -0.0e0 (which is actually a unary minus operator applied to an xs:double value) will always return negative zero: see 4.2.8 op:numeric-unary-minus. As a concession to implementations that rely on implementations of XSD 1.0, however, when casting from string to double the lexical form -0 may be converted to positive zero, though negative zero is recommended.
XML Schema 1.1 introduces support for positive and negative zero as distinct values, and also uses the [IEEE 754-2019] semantics for comparisons involving NaN.
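The two IEEE 754 properties mentioned in this note can be observed in any language that exposes IEEE doubles. The following Python fragment is an illustration only; Python floats are IEEE 754 binary64 values, and the variable names are arbitrary.

```python
import math

nan = float("nan")
negative_zero = -0.0

# NaN is not equal to itself.
nan_equals_itself = (nan == nan)

# Negative zero compares equal to positive zero ...
zeros_compare_equal = (negative_zero == 0.0)

# ... but the sign is still observable.
sign_of_negative_zero = math.copysign(1.0, negative_zero)

# Python's own string-to-float conversion retains the sign of "-0".
sign_of_parsed_zero = math.copysign(1.0, float("-0"))
```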
It is possible to convert strings to values of type xs:integer, xs:float, xs:decimal, or xs:double using the constructor functions described in 22 Constructor functions or using cast expressions as described in 23 Casting.
In addition the fn:number function is available to convert strings to values of type xs:double. It differs from the xs:double constructor function in that any value outside the lexical space of the xs:double datatype is converted to the xs:double value NaN.
| Function | Meaning |
|---|---|
| fn:number | Returns the value indicated by $value or, if $value is not specified, the context value after atomization, converted to an xs:double. |
| fn:parse-integer | Converts a string to an integer, recognizing any radix in the range 2 to 36. |
Returns the value indicated by $value or, if $value is not specified, the context value after atomization, converted to an xs:double.
fn:number( | ||
$value | as | := . |
) as | ||
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
Calling the zero-argument version of the function is defined to give the same result as calling the single-argument version with the context value (.). That is, fn:number() is equivalent to fn:number(.), as defined by the rules that follow.
If $value is the empty sequence or if $value cannot be converted to an xs:double, the xs:double value NaN is returned.
Otherwise, $value is converted to an xs:double following the rules of 23.1.3.2 Casting to xs:double. If the conversion to xs:double fails, the xs:double value NaN is returned.
A type error is raised [err:XPDY0002]XP if $value is omitted and the context value is absentDM.
As a consequence of the rules given above, a type error is raised [err:XPTY0004]XP if the context value cannot be atomized, or if the result of atomizing the context value is a sequence containing more than one atomic item.
XSD 1.1 allows the string +INF as a representation of positive infinity; XSD 1.0 does not. It is implementation-defined whether XSD 1.1 is supported.
Generally fn:number returns NaN rather than raising a dynamic error if the argument cannot be converted to xs:double. However, a type error is raised in the usual way if the supplied argument cannot be atomized or if the result of atomization does not match the required argument type.
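The NaN-on-failure behavior can be sketched as below. This is a simplified model under stated assumptions: only strings, booleans, and Python numbers are handled, `None` models the empty sequence, and the lexical check follows XSD 1.0 (so `+INF` is rejected). The regular expression is this sketch's own rendering of the xs:double lexical space.

```python
import math
import re

# Lexical space of xs:double under XSD 1.0 (assumption: +INF is not accepted).
_XS_DOUBLE = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?|-?INF|NaN"
)


def number(value=None):
    """Model of fn:number: convert to a double, returning NaN instead of failing."""
    if value is None:
        return math.nan
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, float):
        return value
    # Collapse leading and trailing XML whitespace before the lexical check.
    text = str(value).strip(" \t\r\n")
    if not _XS_DOUBLE.fullmatch(text):
        return math.nan
    if text == "NaN":
        return math.nan
    if text.endswith("INF"):
        return -math.inf if text.startswith("-") else math.inf
    return float(text)
```

The explicit lexical check matters because Python's `float` accepts forms such as `"infinity"` or `"1_000"` that are outside the xs:double lexical space.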
| Variables | |
|---|---|
let $e := <e price="12.1" discount="NONE"/> | |
| Expression | Result |
|---|---|
| 1.2e1 |
| 1.2e1 |
| xs:double('INF') |
| xs:double('NaN') |
| xs:double('NaN') |
| 1.21e1 |
| xs:double('NaN') |
| xs:double('NaN') |
| 1.0e1, 1.1e1, 1.2e1 |
This section defines functions and operators on the xs:boolean datatype.
The following functions are defined on boolean values:
| Function | Meaning |
|---|---|
| fn:boolean | Computes the effective boolean value of the sequence $input. |
| fn:not | Returns true if the effective boolean value of $input is false, or false if it is true. |
Computes the effective boolean value of the sequence $input.
fn:boolean( | ||
$input | as | |
) as | ||
The function computes the effective boolean value of a sequence, defined according to the following rules. See also Section 2.5.4 Effective Boolean ValueXP.
If $input is the empty sequence, fn:boolean returns false.
If $input is a sequence whose first item is a GNodeDM (a generalized node), fn:boolean returns true.
If $input is a singleton value of type xs:boolean or of a type derived from xs:boolean, fn:boolean returns $input.
If $input is a singleton value of type xs:untypedAtomic, xs:string, xs:anyURI, or a type derived from xs:string or xs:anyURI, fn:boolean returns false if the operand value has zero length; otherwise it returns true.
If $input is a singleton value of any numeric type or a type derived from a numeric type, fn:boolean returns false if the operand value is NaN or is numerically equal to zero; otherwise it returns true.
In all cases other than those listed above, fn:boolean raises a type error [err:FORG0006].
The result of this function is not necessarily the same as $input cast as xs:boolean. For example, fn:boolean("false") returns the value true whereas "false" cast as xs:boolean (which can also be written xs:boolean("false")) returns false.
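The rules for the effective boolean value can be sketched as a single dispatch. This is an illustrative model, not a conformant implementation: a Python list models the input sequence, the `Node` class is a hypothetical stand-in for any GNode, `str` models the string-like types, and `int` and `float` model the numeric types (decimals are not modeled).

```python
import math


class Node:
    """Hypothetical stand-in for a GNode."""


def boolean(items):
    """Model of fn:boolean over a list that represents an XDM sequence."""
    if not items:
        return False
    first = items[0]
    if isinstance(first, Node):
        return True
    if len(items) == 1:
        # bool must be tested before int, because bool is a subclass of int in Python.
        if isinstance(first, bool):
            return first
        if isinstance(first, str):
            return len(first) > 0
        if isinstance(first, float) and math.isnan(first):
            return False
        if isinstance(first, (int, float)):
            return first != 0
    raise TypeError("FORG0006: no effective boolean value for this input")
```

In this model `boolean(["false"])` is true because the string is non-empty, which mirrors the difference from casting noted above.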
| Variables | |
|---|---|
let $abc := ("a", "b", "") | |
| Expression | Result |
|---|---|
| true() |
| false() |
| false() |
| |
| |
A sequence is an ordered collection of zero or more items. An item is a node, an atomic item, or a function, such as a map or an array. The terms sequence and item are defined formally in [XQuery 4.0: An XML Query Language] and [XML Path Language (XPath) 4.0].
The functions in this section perform comparisons between the items in one or more sequences.
| Function | Meaning |
|---|---|
| fn:atomic-equal | Determines whether two atomic items are equal, under the rules used for comparing keys in a map. |
| fn:deep-equal | This function assesses whether two sequences are deep-equal to each other. To be deep-equal, they must contain items that are pairwise deep-equal; and for two items to be deep-equal, they must either be atomic items that compare equal, or nodes of the same kind, with the same name, whose children are deep-equal, or maps with matching entries, or arrays with matching members. |
| fn:compare | Returns -1, 0, or 1, depending on whether the first value is less than, equal to, or greater than the second value. |
| fn:distinct-values | Returns the values that appear in a sequence, with duplicates eliminated. |
| fn:duplicate-values | Returns the values that appear in a sequence more than once. |
| fn:index-of | Returns a sequence of positive integers giving the positions within the sequence $input of items that are equal to $target. |
| fn:starts-with-subsequence | Determines whether one sequence starts with another, using a supplied callback function to compare items. |
| fn:ends-with-subsequence | Determines whether one sequence ends with another, using a supplied callback function to compare items. |
| fn:contains-subsequence | Determines whether one sequence contains another as a contiguous subsequence, using a supplied callback function to compare items. |
When comments and processing instructions are ignored, any text nodes either side of the comment or processing instruction are now merged prior to comparison. [Issue 930 PR 933 16 January 2024]
The $options parameter has been added, absorbing the $collation parameter. [Issues 934 1167 PR 1191 21 May 2024]
A callback function can be supplied for comparing individual items. [Issues 99 1142 PRs 1120 1150 9 April 2024]
This function assesses whether two sequences are deep-equal to each other. To be deep-equal, they must contain items that are pairwise deep-equal; and for two items to be deep-equal, they must either be atomic items that compare equal, or nodes of the same kind, with the same name, whose children are deep-equal, or maps with matching entries, or arrays with matching members.
fn:deep-equal( | ||
$input1 | as , | |
$input2 | as , | |
$options | as | := {} |
) as | ||
The two-argument form of this function is deterministic, context-dependent, and focus-independent. It depends on collations and implicit timezone.
The three-argument form of this function is deterministic, context-dependent, and focus-independent. It depends on collations, static base URI, and implicit timezone.
The $options argument, if present, defines additional parameters controlling how the comparison is done. If it is supplied as a map, then the option parameter conventions apply.
For backwards compatibility reasons, the $options argument can also be set to a string containing a collation name. Supplying a string $S for this argument is equivalent to supplying the map { 'collation': $S }. Omitting the argument, or supplying the empty sequence, is equivalent to supplying an empty map.
If the two sequences ($input1 and $input2) are both empty, the function returns true.
If the two sequences are of different lengths, the function returns false.
If the two sequences are of the same length, the comparison is controlled by the ordered option:
By default, the option is true: The function returns true if and only if every item in the sequence $input1 is deep-equal to the item at the same position in the sequence $input2.
If the option is set to false, the function returns true if and only if every item in the sequence $input1 is deep-equal to an item at some position in the sequence $input2, and vice versa.
The rules for deciding whether two items are deep-equal appear below.
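The top-level treatment of the two sequences can be sketched as follows, with the item-level comparison abstracted as a callback. The function name and the `items_equal` parameter are hypothetical; the sketch covers only the sequence-level rules given above.

```python
def sequences_deep_equal(seq1, seq2, items_equal, ordered=True):
    """Compare two sequences, given a function that decides item-level deep-equality."""
    if len(seq1) != len(seq2):
        return False
    if ordered:
        # Positional comparison; two empty sequences are trivially equal.
        return all(items_equal(a, b) for a, b in zip(seq1, seq2))
    # Unordered: each item must have a match somewhere in the other sequence, both ways.
    forward = all(any(items_equal(a, b) for b in seq2) for a in seq1)
    backward = all(any(items_equal(b, a) for a in seq1) for b in seq2)
    return forward and backward
```

Note that the unordered rule, as modeled here, is not a multiset comparison: `[1, 1, 2]` and `[1, 2, 2]` satisfy it, because every item of each sequence has some match in the other.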
The entries that may appear in the $options map are as follows. The detailed rules for the interpretation of each option appear later.
record( | |
base-uri? | as xs:boolean, |
collation? | as xs:string, |
comments? | as xs:boolean, |
debug? | as xs:boolean, |
id-property? | as xs:boolean, |
idrefs-property? | as xs:boolean, |
in-scope-namespaces? | as xs:boolean, |
items-equal? | as fn(item(), item()) as xs:boolean?, |
map-order? | as xs:boolean, |
namespace-prefixes? | as xs:boolean, |
nilled-property? | as xs:boolean, |
normalization-form? | as xs:string?, |
ordered? | as xs:boolean, |
processing-instructions? | as xs:boolean, |
timezones? | as xs:boolean, |
type-annotations? | as xs:boolean, |
type-variety? | as xs:boolean, |
typed-values? | as xs:boolean, |
unordered-elements? | as xs:QName*, |
whitespace? | as enum("preserve", "strip", "normalize") |
) | |
| Key | Meaning |
|---|---|
| base-uri | Determines whether the base-uri of a node is significant. |
| collation | Identifies a collation which is used at all levels of recursion when strings are compared (but not when names are compared), according to the rules in 5.3.7 Choosing a collation. If the argument is not supplied, or if it is empty, then the default collation from the dynamic context of the caller is used. |
| comments | Determines whether comments are significant. |
| debug | Requests diagnostics in the case where the function returns false. When this option is set and the two inputs are found to be not equal, the implementation should output messages (in an implementation-dependent format and to an implementation-dependent destination) indicating the nature of the differences that were found. |
| id-property | Determines whether the id property of elements and attributes is significant. |
| idrefs-property | Determines whether the idrefs property of elements and attributes is significant. |
| in-scope-namespaces | Determines whether the in-scope namespaces of elements are significant. |
| items-equal | A user-supplied function to test whether two items are considered equal. The function can return true or false to indicate that two items are or are not equal, overriding the normal rules that would apply to those items; or it can return an empty sequence, to indicate that the normal rules should be followed. Note that returning () is not equivalent to returning false. |
| map-order | Determines whether the order of entries in maps is significant. |
| namespace-prefixes | Determines whether namespace prefixes in xs:QName values (particularly the names of elements and attributes) are significant. |
| nilled-property | Determines whether the nilled property of elements and attributes is significant. |
| normalization-form | If present, indicates that text and attributes are converted to the specified Unicode normalization form prior to comparison. The value is as for the corresponding argument of fn:normalize-unicode. |
| ordered | Controls whether the top-level order of the items of the input sequences is considered. |
| processing-instructions | Determines whether processing instructions are significant. |
| timezones | Determines whether timezones in date/time values are significant. |
| type-annotations | Determines whether type annotations are significant. |
| type-variety | Determines whether the variety of the type annotation of an element (whether it has complex content or simple content) is significant. |
| typed-values | Determines whether nodes are compared using their typed values rather than their string values. |
| unordered-elements | A list of QNames of elements considered to be unordered: that is, their child elements may appear in any order. |
| whitespace | Determines the extent to which whitespace is treated as significant. The value preserve retains all whitespace. The value strip ignores text nodes consisting entirely of whitespace. The value normalize ignores whitespace text nodes in the same way as the strip option, and additionally compares text and attribute nodes after normalizing whitespace in accordance with the rules of the fn:normalize-space function. The detailed rules, given below, also take into account type annotations and xml:space attributes. |
Note:
As a general rule for boolean options (but not invariably), the value true indicates that the comparison is more strict.
In the following rules, where a recursive call on fn:deep-equal is made, this is assumed to use the same values of $options as the original call.
The rules reference a function equal-strings which compares two xs:string or xs:anyURI values as follows:
If the whitespace option is set to normalize, then each string is processed by calling the fn:normalize-space function.
If the normalization-form option is present, each string is then normalized by calling the fn:normalize-unicode function, supplying the specified normalization form.
The two strings are then compared for equality under the requested collation.
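For readers more familiar with general-purpose languages, the three steps can be sketched in Python. This is an approximation under stated assumptions: the options are passed as keyword arguments rather than a map, and the final comparison uses plain codepoint equality in place of a requested collation.

```python
import re
import unicodedata


def normalize_space(s):
    """Model of fn:normalize-space: collapse runs of XML whitespace and trim."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def equal_strings(s1, s2, whitespace="preserve", normalization_form=None):
    """Model of equal-strings; assumes the Unicode codepoint collation."""
    def prepare(s):
        if whitespace == "normalize":
            s = normalize_space(s)
        if normalization_form:
            s = unicodedata.normalize(normalization_form, s)
        return s
    return prepare(s1) == prepare(s2)
```

For example, `"e"` followed by a combining acute accent (U+0301) and the precomposed character U+00E9 compare equal only when a normalization form such as NFC is requested.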
More formally, the equal-strings function is equivalent to the following implementation in XQuery:
declare function equal-strings(
$string1 as xs:string,
$string2 as xs:string,
$options as map(*)
) as xs:boolean {
let $n1 := if ($options?normalization-form)
then normalize-unicode(?, $options?normalization-form)
else identity#1
let $n2 := if ($options?whitespace = "normalize")
then normalize-space#1
else identity#1
return compare($n1($n2($string1)), $n1($n2($string2)), $options?collation) eq 0
}
The rules for deciding whether two items $i1 and $i2 are deep-equal are as follows.
Labels (see Section 3.3 Labeled ItemsDM) are ignored. Specifically, if $i1 or $i2 is a labeled item then it is replaced by its subject.
The two items are first compared using the function supplied in the items-equal option. If this returns true then the items are deep-equal. If it returns false then the items are not deep-equal. If it returns an empty sequence (which is always the case if the option is not explicitly specified) then the two items are deep-equal if one or more of the following conditions are true:
All of the following conditions are true:
$i1 is an atomic item.
$i2 is an atomic item.
Either the type-annotations option is false, or both atomic items have the same type annotation.
One of the following conditions is true:
If both $i1 and $i2 are instances of xs:string, xs:untypedAtomic, or xs:anyURI, equal-strings($i1, $i2, $options) returns true.
If both $i1 and $i2 are instances of xs:date, xs:time or xs:dateTime, $i1 eq $i2 returns true.
If both $i1 and $i2 are instances of xs:hexBinary or xs:base64Binary, $i1 eq $i2 returns true.
Otherwise, fn:atomic-equal($i1, $i2) returns true.
Note:
If $i1 and $i2 are not comparable, that is, if the expression ($i1 eq $i2) would raise an error, then the function returns false; it does not report an error.
One of the following conditions is true:
Option namespace-prefixes is false.
Neither $i1 nor $i2 is of type xs:QName or xs:NOTATION.
$i1 and $i2 are qualified names with the same namespace prefix.
One of the following conditions is true:
Option timezones is false.
Neither $i1 nor $i2 is of type xs:date, xs:time, xs:dateTime, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, or xs:gDay.
Neither $i1 nor $i2 has a timezone component.
Both $i1 and $i2 have a timezone component and the timezone components are equal.
All of the following conditions are true:
$i1 is a map.
$i2 is a map.
Both maps have the same number of entries.
For every entry in the first map, there is an entry in the second map that:
has the same key (note that the collation is not used when comparing keys), and
has the same associated value (compared using the fn:deep-equal function, recursively).
Either map-order is false, or the entries in both maps appear in the same order, that is, the Nth key in the first map is the same key as the Nth key in the second map, for all N.
All the following conditions are true:
$i1 is an array.
$i2 is an array.
Both arrays have the same number of members (array:size($i1) eq array:size($i2)).
Members in the same position of both arrays are deep-equal to each other: that is, every $p in 1 to array:size($i1) satisfies deep-equal($i1($p), $i2($p), $options).
All the following conditions are true:
$i1 is a function item and is not a map or array.
$i2 is a function item and is not a map or array.
$i1 and $i2 have the same function identity. The concept of function identity is explained in Section 7.18.1 Function ItemsDM.
All the following conditions are true:
$i1 is a node (specifically, an XNode).
$i2 is a node (specifically, an XNode).
Both nodes have the same node kind.
Either the base-uri option is false, or both nodes have the same value for their base URI property, or both nodes have an absent base URI.
Let significant-children($parent) be the sequence of nodes obtained by applying the following steps to the children of $parent, in turn:
Comment nodes are discarded if the option comments is false.
Processing instruction nodes are discarded if the option processing-instructions is false.
Adjacent text nodes are merged.
Whitespace-only text nodes are discarded if both the following conditions are true:
The option whitespace is set to strip or normalize; and
The text node is not within the scope of an element that has the attribute xml:space="preserve".
Note:
Whitespace text nodes will already have been discarded if $parent is a schema-validated element node whose type annotation is a complex type with an element-only or empty content model.
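The steps that define significant-children can be sketched as a small filter pipeline. In this illustrative model each child is a hypothetical `(kind, text)` tuple, the `xml_space_preserve` flag stands for being within the scope of xml:space="preserve", and the default option values shown are assumptions of the sketch rather than values taken from this specification.

```python
XML_WHITESPACE = " \t\r\n"


def significant_children(children, comments=False, processing_instructions=False,
                         whitespace="preserve", xml_space_preserve=False):
    """Apply the discard and merge steps to a list of (kind, text) tuples."""
    # Steps 1 and 2: drop comments and processing instructions when not significant.
    kept = [
        (kind, text) for kind, text in children
        if not (kind == "comment" and not comments)
        and not (kind == "processing-instruction" and not processing_instructions)
    ]
    # Step 3: merge text nodes that have become adjacent.
    merged = []
    for kind, text in kept:
        if kind == "text" and merged and merged[-1][0] == "text":
            merged[-1] = ("text", merged[-1][1] + text)
        else:
            merged.append((kind, text))
    # Step 4: drop whitespace-only text nodes when the options call for it.
    if whitespace in ("strip", "normalize") and not xml_space_preserve:
        merged = [
            (kind, text) for kind, text in merged
            if not (kind == "text" and text.strip(XML_WHITESPACE) == "")
        ]
    return merged
```

Because merging happens after comments and processing instructions are discarded, text on either side of an ignored comment is compared as a single text node.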
One of the following conditions is true.
Both nodes are document nodes, and the sequence significant-children($i1) is deep-equal to the sequence significant-children($i2).
Both nodes are element nodes, and all the following conditions are true:
The two nodes have the same name, that is (node-name($i1) eq node-name($i2)).
Either the option namespace-prefixes is false, or both element names have the same prefix.
Either the option in-scope-namespaces is false, or both element nodes have the same in-scope namespace bindings.
Either the option type-annotations is false, or both element nodes have the same type annotation.
Either the option id-property is false, or both element nodes have the same value for their is-id property.
Either the option idrefs-property is false, or both element nodes have the same value for their is-idrefs property.
Either the option nilled-property is false, or both element nodes have the same value for their nilled property.
One of the following conditions is true:
The option type-variety is false.
Both nodes are annotated as having simple content. For this purpose simple content means either a simple type or a complex type with simple content.
Both nodes are annotated as having complex content. For this purpose complex content means a complex type whose variety is mixed, element-only, or empty.
Note:
It is a consequence of this rule that, by default, validating a document D against a schema will usually (but not necessarily) result in a document that is not deep-equal to D. The exception is when the schema allows all elements to have mixed content.
The two nodes have the same number of attributes, and for every attribute $a1 in $i1/@* there exists an attribute $a2 in $i2/@* such that node-name($a1) eq node-name($a2) and $a1 and $a2 are deep-equal.
Note:
Attributes, like other items, may be compared using the supplied items-equal function. However, this function will not be called to compare two attribute nodes unless they have the same name.
One of the following conditions holds:
Both element nodes are annotated as having simple content (as defined above), the typed-values option is true, and the typed value of $i1 is deep-equal to the typed value of $i2.
Note:
The typed value of an element node is used only when the element has simple content, which means that no error can occur as a result of atomizing a node with no typed value.
Both element nodes are annotated as having simple content (as defined above), the typed-values option is false, and the equal-strings function returns true when applied to the string value of $i1 and the string value of $i2.
Both element nodes have a type annotation that is a complex type with element-only, mixed, or empty content, the (common) element name is not present in the unordered-elements option, and the sequence significant-children($i1) is deep-equal to the sequence significant-children($i2).
Both element nodes have a type annotation that is a complex type with element-only, mixed, or empty content, the (common) element name is present in the unordered-elements option, and the sequence significant-children($i1) is deep-equal to some permutation of the sequence significant-children($i2).
Note:
Elements annotated as xs:untyped fall into this category.
Including an element name in the unordered-elements list is unlikely to be useful except when the relevant elements have element-only content, but this is not a requirement: the rules apply equally to elements with mixed content, or even (trivially) to elements with empty content.
Both nodes are attribute nodes, and all the following conditions are true:
The two attribute nodes have the same name, that is (node-name($i1) eq node-name($i2)).
Either the option namespace-prefixes is false, or both attribute names have the same prefix.
Either the option type-annotations is false, or both attribute nodes have the same type annotation.
Either the option id-property is false, or both attribute nodes have the same value for their is-id property.
Either the option idrefs-property is false, or both attribute nodes have the same value for their is-idrefs property.
Let T be true if the option typed-values is true and both attributes $i1 and $i2 have a type annotation other than xs:untypedAtomic.
Then either T is true and the typed value of $i1 is deep-equal to the typed value of $i2, or T is false and the equal-strings function returns true when applied to the string value of $i1 and the string value of $i2.
Both nodes are processing instruction nodes, and all the following conditions are true:
The two nodes have the same name, that is (node-name($i1) eq node-name($i2)).
The equal-strings function returns true when applied to the string value of $i1 and the string value of $i2.
Both nodes are namespace nodes, and all the following conditions are true:
The two nodes either have the same name or are both nameless, that is fn:deep-equal(node-name($i1), node-name($i2)).
The string value of $i1 is equal to the string value of $i2 when compared using the Unicode codepoint collation.
Note:
Namespace nodes are not considered directly unless they appear in the top-level sequences passed explicitly to the fn:deep-equal function.
Both nodes are comment nodes, and the equal-strings function returns true when applied to their string values.
Both nodes are text nodes, and the equal-strings function returns true when applied to their string values.
All the following conditions are true:
$i1 is a JNode.
$i2 is a JNode.
The ¶value property of $i1 is deep-equal to the ¶value property of $i2.
Note:
The other properties of the two JNodes, such as ¶parent and ¶selector, are ignored. As with XNodes, deep equality considers only the subtree rooted at the node, and not its position within a containing tree.
In all other cases the result is false.
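The significant-children steps described above can be modelled outside XPath. The following Python sketch is illustrative only and is not part of the function definition. It assumes that children are represented as hypothetical (kind, value) tuples, that the option defaults are those suggested by the notes (comments and processing instructions ignored, whitespace preserved), and that the caller supplies a flag saying whether xml:space="preserve" is in scope.

```python
XML_WHITESPACE = " \t\r\n"


def significant_children(children, comments=False, processing_instructions=False,
                         whitespace="preserve", space_preserved=False):
    """Model of significant-children over a list of (kind, value) tuples.

    The tuple representation and all parameter names are assumptions made
    for this sketch; they are not defined by the specification.
    """
    # Steps 1 and 2: discard comments and processing instructions unless
    # the corresponding option is true.
    kept = [
        (kind, value) for kind, value in children
        if not (kind == "comment" and not comments)
        and not (kind == "processing-instruction" and not processing_instructions)
    ]
    # Step 3: merge adjacent text nodes.
    merged = []
    for kind, value in kept:
        if kind == "text" and merged and merged[-1][0] == "text":
            merged[-1] = ("text", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    # Step 4: discard whitespace-only text nodes when whitespace is strip or
    # normalize and xml:space="preserve" is not in scope.
    if whitespace in ("strip", "normalize") and not space_preserved:
        merged = [
            (kind, value) for kind, value in merged
            if not (kind == "text" and value.strip(XML_WHITESPACE) == "")
        ]
    return merged
```

Because comments are removed before text nodes are merged, text that was split by a comment is compared as a single text node.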
A type error is raised [err:XPTY0004]XP if the value of $options includes an entry whose key is defined in this specification, and whose value is not of the permitted type for that key.
A dynamic error is raised [err:FOJS0005] if the value of $options includes an entry whose key is defined in this specification, and whose value is not a permitted value for that key.
By default, whitespace in text nodes and attributes is considered significant. There are various ways whitespace differences can be ignored:
If nodes have been schema-validated, setting the typed-values option to true causes the typed values rather than the string values to be compared. This will typically cause whitespace to be ignored except where the type of the value is xs:string.
Setting the whitespace option to normalize causes all text and attribute nodes to have leading and trailing whitespace removed, and intermediate whitespace reduced to a single character.
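As a rough illustration of what normalization does to a single string value, the following Python sketch collapses runs of XML whitespace and trims the ends. It is a model only; the function name is hypothetical and it assumes that "whitespace" means the four XML whitespace characters.

```python
import re

# XML whitespace is limited to space, tab, carriage return, and line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_whitespace(value: str) -> str:
    """Collapse each run of XML whitespace to one space, then trim both ends."""
    return _XML_WS.sub(" ", value).strip(" ")
```

Under this model the attribute values "bold" and " bold" normalize to the same string, which is why they compare equal when the whitespace option is normalize.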
By default, two nodes are not required to have the same type annotation, and they are not required to have the same in-scope namespaces. They may also differ in their parent, their base URI, and the values returned by the is-id and is-idrefs accessors (see Section 7.5.5 is-id AccessorDM and Section 7.5.6 is-idrefs AccessorDM). The order of children is significant, but the order of attributes is insignificant.
By default, the contents of comments and processing instructions are significant only if these nodes appear directly as items in the two sequences being compared. The content of a comment or processing instruction that appears as a descendant of an item in one of the sequences being compared does not affect the result. In previous versions of this specification, the presence of a comment or processing instruction, if it caused text to be split across two text nodes, might affect the result; this has been changed in 4.0 so that adjacent text nodes are merged after comments and processing instructions have been stripped.
Comparing items of different kinds (for example, comparing an atomic item to a node, or a map to an array, or an integer to an xs:date) returns false; it does not raise an error. So the result of fn:deep-equal(1, current-dateTime()) is false.
The items-equal callback function may be used to override the default rules for comparing individual items. For example, it might return true unconditionally when comparing two @timestamp attributes, if there is no expectation that the two trees will have identical timestamps. Given two nodes $n1 and $n2, it might compare them using the is operator, so that instead of comparing the descendants of the two nodes, the function simply checks whether they are the same node. Given two function items $f1 and $f2 it might return true unconditionally, knowing that there is no effective way to test if the functions are equivalent. Given two numeric values, it might return true if they are equal to six decimal places.
It is good practice for the items-equal callback function to be reflexive, symmetric, and transitive; if it is not, then the fn:deep-equal function itself will lack these qualities. Reflexive means that every item (including NaN) should be equal to itself; symmetric means that items-equal(A, B) should return the same result as items-equal(B, A), and transitive means that items-equal(A, B) and items-equal(B, C) should imply items-equal(A, C).
Setting the ordered option to false or supplying the unordered-elements option may result in poor performance when comparing long sequences, especially if the items-equal callback function is supplied.
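The cost of unordered comparison, and the reason the callback should behave as an equivalence relation, can be seen in a simple model. The Python sketch below is one possible strategy, not the algorithm any implementation is required to use: it pairs each item of the first sequence with the first unmatched equal item of the second. The function and parameter names are hypothetical. This greedy pairing finds a matching permutation whenever one exists only if items_equal is symmetric and transitive, and it makes a quadratic number of callback calls in the worst case.

```python
def unordered_equal(seq1, seq2, items_equal=lambda a, b: a == b) -> bool:
    """Return True if seq1 matches some permutation of seq2.

    Assumes items_equal is an equivalence relation; otherwise greedy
    pairing may miss a valid permutation.
    """
    if len(seq1) != len(seq2):
        return False
    remaining = list(seq2)
    for item in seq1:
        for index, candidate in enumerate(remaining):
            if items_equal(item, candidate):
                # Consume the matched item so it cannot be paired twice.
                del remaining[index]
                break
        else:
            # No unmatched item in seq2 is equal to this item.
            return False
    return True
```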
| Variables | |
|---|---|
let $at := <attendees> <name last="Parker" first="Peter"/> <name last="Barker" first="Bob"/> <name last="Parker" first="Peter"/> </attendees> | |
| Expression: |
|
|---|---|
| Result: | false() |
| Expression: |
|
| Result: | false() |
| Expression: |
|
| Result: | true() |
| Expression: |
|
| Result: | false() |
| Expression: | deep-equal(
$at//name[@first="Bob"],
$at//name[@last="Barker"],
options := { 'items-equal': op('is') }
) |
| Result: | true() (Tests whether the two input sequences contain exactly the same nodes.) |
| Expression: |
|
| Result: | true() |
| Expression: |
|
| Result: | false() |
| Expression: | deep-equal(
{ 1: 'a', 2: 'b' },
{ 2: 'b', 1: 'a' }
) |
| Result: | true() |
| Expression: | deep-equal(
(1, 2, 3, 4),
(1, 4, 3, 2),
options := { 'ordered': false() }
) |
| Result: | true() |
| Expression: | deep-equal(
(1, 1, 2, 3),
(1, 2, 3, 3),
options := { 'ordered': false() }
) |
| Result: | false() |
| Expression: | deep-equal(
parse-xml("<a xmlns='AA'/>"),
parse-xml("<p:a xmlns:p='AA'/>")
) |
| Result: | true() (By default, namespace prefixes are ignored). |
| Expression: | deep-equal(
parse-xml("<a xmlns='AA'/>"),
parse-xml("<p:a xmlns:p='AA'/>"),
options := { 'namespace-prefixes': true() }
) |
| Result: | false() (False because the namespace prefixes differ). |
| Expression: | deep-equal(
parse-xml("<a xmlns='AA'/>"),
parse-xml("<p:a xmlns:p='AA'/>"),
options := { 'in-scope-namespaces': true() }
) |
| Result: | false() (False because the in-scope namespace bindings differ). |
| Expression: | deep-equal(
parse-xml("<a><b/><c/></a>"),
parse-xml("<a><c/><b/></a>")
) |
| Result: | false() (By default, order of elements is significant). |
| Expression: | deep-equal(
parse-xml("<a><b/><c/></a>"),
parse-xml("<a><c/><b/></a>"),
options := { 'unordered-elements': #a }
) |
| Result: | true() (The |
| Expression: | deep-equal(
parse-xml("<para style='bold'><span>x</span></para>"),
parse-xml("<para style=' bold'> <span>x</span></para>")
) |
| Result: | false() (By default, both the leading whitespace in the |
| Expression: | deep-equal(
parse-xml("<para style='bold'><span>x</span></para>"),
parse-xml("<para style=' bold'> <span>x</span></para>"),
options := { 'whitespace': 'normalize' }
) |
| Result: | true() (The |
| Expression: | deep-equal(
(1, 2, 3),
(1.0007, 1.9998, 3.0005),
options := { 'items-equal': fn($x, $y) {
if (($x, $y) instance of xs:numeric+) {
abs($x - $y) lt 0.001
}
} }
) |
| Result: | true() (For numeric values, the callback function tests whether they are approximately equal. For any other items, it returns an empty sequence, so the normal comparison rules apply.) |
| Expression: | deep-equal(
(1, 2, 3, 4, 5),
(1, 2, 3, 8, 5),
options := { 'items-equal': fn($x, $y) {
trace((), `comparing { $x } and { $y }`)
} }
) |
| Result: | false() (The callback function traces which items are being compared, without changing the result of the comparison.) |
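The way a callback result of the empty sequence falls back to the normal comparison rules can be modelled as follows. This Python sketch is illustrative only: None stands in for the empty sequence, the default rule is reduced to Python equality, and all names are hypothetical.

```python
from numbers import Real


def approx_equal(x, y):
    """Callback: decide numeric pairs, return None (no opinion) otherwise."""
    if isinstance(x, Real) and isinstance(y, Real):
        return abs(x - y) < 0.001
    return None


def compare_items(x, y, items_equal=None):
    """Use the callback's verdict if it gives one, else the default rule."""
    if items_equal is not None:
        verdict = items_equal(x, y)
        if verdict is not None:
            return verdict
    # Stand-in for the default comparison rules.
    return x == y


def sequences_equal(seq1, seq2, items_equal=None):
    """Ordered, pairwise comparison of two sequences."""
    return len(seq1) == len(seq2) and all(
        compare_items(a, b, items_equal) for a, b in zip(seq1, seq2)
    )
```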
This section defines a number of functions used to find elements by ID or IDREF value, or to generate identifiers.
| Function | Meaning |
|---|---|
fn:id | Returns the sequence of element nodes that have an ID value matching the value of one or more of the IDREF values supplied in $values. |
fn:element-with-id | Returns the sequence of element nodes that have an ID value matching the value of one or more of the IDREF values supplied in $values. |
fn:idref | Returns the sequence of element or attribute nodes with an IDREF value matching the value of one or more of the ID values supplied in $values. |
fn:generate-id | This function returns a string that uniquely identifies a given GNode. |
Returns the sequence of element nodes that have an ID value matching the value of one or more of the IDREF values supplied in $values.
fn:id( | ||
$values | as , | |
$node | as | := . |
) as | ||
The one-argument form of this function is deterministic, context-dependent, and focus-dependent.
The two-argument form of this function is deterministic, context-independent, and focus-independent.
The function returns a sequence, in document order with duplicates eliminated, containing every element node E that satisfies all the following conditions:
E is in the target document. The target document is the document containing $node, or the document containing the context value (.) if the second argument is omitted. The behavior of the function if $node is omitted is exactly the same as if the context value had been passed as $node.
E has an ID value equal to one of the candidate IDREF values, where:
An element has an ID value equal to V if either or both of the following conditions are true:
The is-id property (see Section 7.5.5 is-id AccessorDM) of the element node is true, and the typed value of the element node is equal to V under the rules of the eq operator using the Unicode codepoint collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).
The element has an attribute node whose is-id property (see Section 7.5.5 is-id AccessorDM) is true and whose typed value is equal to V under the rules of the eq operator using the Unicode code point collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).
Each xs:string in $values is parsed as if it were of type IDREFS, that is, each xs:string in $values is treated as a whitespace-separated sequence of tokens, each acting as an IDREF. These tokens are then included in the list of candidate IDREFs. If any of the tokens is not a lexically valid IDREF (that is, if it is not lexically an xs:NCName), it is ignored. Formally, the candidate IDREF values are the strings in the sequence given by the expression:
for $s in $values return tokenize(normalize-space($s), ' ')[. castable as xs:IDREF]
If several elements have the same ID value, then E is the one that is first in document order.
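The derivation of candidate IDREF values from $values can be sketched in Python as below. This is a model only, with hypothetical names. In particular, the pattern used to test tokens is an ASCII-only approximation of xs:NCName, adopted here as a simplifying assumption; the real lexical rules admit many non-ASCII name characters.

```python
import re

# XML whitespace: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

# ASSUMPTION: ASCII-only approximation of the NCName production.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def candidate_idrefs(values):
    """Tokenize each string on whitespace and keep tokens that look like NCNames."""
    candidates = []
    for s in values:
        normalized = _XML_WS.sub(" ", s).strip(" ")
        for token in normalized.split(" "):
            # Tokens that are not lexically valid are ignored, not reported.
            if _NCNAME_ASCII.fullmatch(token):
                candidates.append(token)
    return candidates
```

A single string such as "ID21256 E21256" therefore contributes two candidates, and tokens such as "9bad" or "x:y" contribute none.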
A dynamic error is raised [err:FODC0001] if $node, or the context value if the second argument is absent, is a node in a tree whose root is not a document node.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not a single node, type error [err:XPTY0004]XP.
The effect of this function is anomalous in respect of element nodes with the is-id property. For legacy reasons, this function returns the element that has the is-id property, whereas it would be more appropriate to return its parent, that being the element that is uniquely identified by the ID. A new function fn:element-with-id has been introduced with the desired behavior.
If the data model is constructed from an Infoset, an attribute will have the is-id property if the corresponding attribute in the Infoset had an attribute type of ID: typically this means the attribute was declared as an ID in a DTD.
If the data model is constructed from a PSVI, an element or attribute will have the is-id property if its typed value is a single atomic item of type xs:ID or a type derived by restriction from xs:ID.
No error is raised in respect of a candidate IDREF value that does not match the ID of any element in the document. If no candidate IDREF value matches the ID value of any element, the function returns the empty sequence.
It is not necessary that the supplied argument should have type xs:IDREF or xs:IDREFS, or that it should be derived from a node with the is-idrefs property.
An element may have more than one ID value. This can occur with synthetic data models or with data models constructed from a PSVI where the element and one of its attributes are both typed as xs:ID.
If the source document is well-formed but not valid, it is possible for two or more elements to have the same ID value. In this situation, the function will select the first such element.
It is also possible in a well-formed but invalid document to have an element or attribute that has the is-id property but whose value does not conform to the lexical rules for the xs:ID type. Such a node will never be selected by this function.
| Variables | |
|---|---|
let $emp := validate lax {
document {
<employee xml:id="ID21256"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<empnr xsi:type="xs:ID">E21256</empnr>
<first>John</first>
<last>Brown</last>
</employee>
}
} | |
| Expression | Result |
|---|---|
$emp/id('ID21256')/name() | "employee" (The |
$emp/id('E21256')/name() | "empnr" (Assuming the |
Returns the sequence of element nodes that have an ID value matching the value of one or more of the IDREF values supplied in $values.
fn:element-with-id( | ||
$values | as , | |
$node | as | := . |
) as | ||
The one-argument form of this function is deterministic, context-dependent, and focus-dependent.
The two-argument form of this function is deterministic, context-independent, and focus-independent.
Note:
The effect of this function is identical to fn:id in respect of elements that have an attribute with the is-id property. However, it behaves differently in respect of element nodes with the is-id property. Whereas the fn:id function, for legacy reasons, returns the element that has the is-id property, this function returns the element identified by the ID, which is the parent of the element having the is-id property.
The function returns a sequence, in document order with duplicates eliminated, containing every element node E that satisfies all the following conditions:
E is in the target document. The target document is the document containing $node, or the document containing the context value (.) if the second argument is omitted. The behavior of the function if $node is omitted is exactly the same as if the context value had been passed as $node.
E has an ID value equal to one of the candidate IDREF values, where:
An element has an ID value equal to V if either or both of the following conditions are true:
The element has a child element node whose is-id property (see Section 7.5.5 is-id AccessorDM) is true and whose typed value is equal to V under the rules of the eq operator using the Unicode code point collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).
The element has an attribute node whose is-id property (see Section 7.5.5 is-id AccessorDM) is true and whose typed value is equal to V under the rules of the eq operator using the Unicode code point collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).
Each xs:string in $values is parsed as if it were of type IDREFS, that is, each xs:string in $values is treated as a whitespace-separated sequence of tokens, each acting as an IDREF. These tokens are then included in the list of candidate IDREFs. If any of the tokens is not a lexically valid IDREF (that is, if it is not lexically an xs:NCName), it is ignored. Formally, the candidate IDREF values are the strings in the sequence given by the expression:
for $s in $values return tokenize(normalize-space($s), ' ')[. castable as xs:IDREF]
If several elements have the same ID value, then E is the one that is first in document order.
A dynamic error is raised [err:FODC0001] if $node, or the context value if the second argument is omitted, is a node in a tree whose root is not a document node.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not a single node, type error [err:XPTY0004]XP.
This function is equivalent to the fn:id function except when dealing with ID-valued element nodes. Whereas the fn:id function selects the element containing the identifier, this function selects its parent.
If the data model is constructed from an Infoset, an attribute will have the is-id property if the corresponding attribute in the Infoset had an attribute type of ID: typically this means the attribute was declared as an ID in a DTD.
If the data model is constructed from a PSVI, an element or attribute will have the is-id property if its typed value is a single atomic item of type xs:ID or a type derived by restriction from xs:ID.
No error is raised in respect of a candidate IDREF value that does not match the ID of any element in the document. If no candidate IDREF value matches the ID value of any element, the function returns the empty sequence.
It is not necessary that the supplied argument should have type xs:IDREF or xs:IDREFS, or that it should be derived from a node with the is-idrefs property.
An element may have more than one ID value. This can occur with synthetic data models or with data models constructed from a PSVI where the element and one of its attributes are both typed as xs:ID.
If the source document is well-formed but not valid, it is possible for two or more elements to have the same ID value. In this situation, the function will select the first such element.
It is also possible in a well-formed but invalid document to have an element or attribute that has the is-id property but whose value does not conform to the lexical rules for the xs:ID type. Such a node will never be selected by this function.
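The difference between fn:id and fn:element-with-id for ID-valued element nodes can be shown with a toy tree model. The Python sketch below is illustrative only: the Elem class, its fields, and both lookup functions are hypothetical, typed values are reduced to plain strings, and each lookup handles a single candidate value.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Elem:
    name: str
    text: str = ""                 # stands in for the typed value
    is_id: bool = False            # is-id property of the element itself
    id_attrs: List[str] = field(default_factory=list)  # values of is-id attributes
    children: List["Elem"] = field(default_factory=list)


def walk(e: Elem) -> Iterator[Elem]:
    """Yield e and its descendants in document order."""
    yield e
    for child in e.children:
        yield from walk(child)


def id_lookup(root: Elem, value: str) -> Optional[Elem]:
    """Model of fn:id: an ID-valued element identifies itself."""
    for e in walk(root):
        if value in e.id_attrs or (e.is_id and e.text == value):
            return e  # first match in document order
    return None


def element_with_id_lookup(root: Elem, value: str) -> Optional[Elem]:
    """Model of fn:element-with-id: an ID-valued element identifies its parent."""
    for e in walk(root):
        if value in e.id_attrs or any(c.is_id and c.text == value for c in e.children):
            return e  # first match in document order
    return None
```

For an employee element with an ID attribute and an ID-valued empnr child, the two lookups agree on the attribute value but differ on the child value: the first returns empnr, the second returns employee.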
| Variables | |
|---|---|
let $emp := validate lax {
document {
<employee xml:id="ID21256"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<empnr xsi:type="xs:ID">E21256</empnr>
<first>John</first>
<last>Brown</last>
</employee>
}
} | |
| Expression: | $emp/element-with-id('ID21256')/name() |
|---|---|
| Result: | "employee" (The |
| Expression: |
|
| Result: | "employee" (Assuming the |
Returns the sequence of element or attribute nodes with an IDREF value matching the value of one or more of the ID values supplied in $values.
fn:idref( | ||
$values | as , | |
$node | as | := . |
) as | ||
The one-argument form of this function is deterministic, context-dependent, and focus-dependent.
The two-argument form of this function is deterministic, context-independent, and focus-independent.
The function returns a sequence, in document order with duplicates eliminated, containing every element or attribute node $N that satisfies all the following conditions:
$N is in the target document. The target document is the document containing $node, or the document containing the context value (.) if the second argument is omitted. The behavior of the function if $node is omitted is exactly the same as if the context value had been passed as $node.
$N has an IDREF value equal to one of the candidate ID values, where:
A node $N has an IDREF value equal to V if both of the following conditions are true:
The is-idrefs property (see Section 7.5.6 is-idrefs AccessorDM) of $N is true.
The sequence
tokenize(normalize-space(string($N)), ' ')
contains a string that is equal to V under the rules of the eq operator using the Unicode code point collation (http://www.w3.org/2005/xpath-functions/collation/codepoint).
Each xs:string in $values is parsed as if it were lexically of type xs:ID. These xs:strings are then included in the list of candidate xs:IDs. If any of the strings in $values is not a lexically valid xs:ID (that is, if it is not lexically an xs:NCName), it is ignored. More formally, the candidate ID values are the strings in the sequence:
$values[. castable as xs:NCName]
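The test for whether a node has an IDREF value equal to V can be modelled as follows. This Python sketch is illustrative only; the function names are hypothetical, and the node is reduced to its string value and its is-idrefs flag.

```python
import re

# XML whitespace: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def idref_tokens(string_value: str):
    """Model of tokenize(normalize-space(string($N)), ' ')."""
    normalized = _XML_WS.sub(" ", string_value).strip(" ")
    return normalized.split(" ") if normalized else []


def has_idref_value(string_value: str, is_idrefs: bool, v: str) -> bool:
    """True if the node is flagged is-idrefs and one of its tokens equals v."""
    return is_idrefs and v in idref_tokens(string_value)
```

Matching is on whole tokens, so a node whose string value is "E30561" does not match the candidate "E3".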
A dynamic error is raised [err:FODC0001] if $node, or the context value if the second argument is omitted, is a node in a tree whose root is not a document node.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not a single node, type error [err:XPTY0004]XP.
An element or attribute typically acquires the is-idrefs property by being validated against the schema type xs:IDREF or xs:IDREFS, or (for attributes only) by being described as of type IDREF or IDREFS in a DTD.
Because the function is sensitive to the way in which the data model is constructed, calls on this function are not always interoperable.
No error is raised in respect of a candidate ID value that does not match the IDREF value of any element or attribute in the document. If no candidate ID value matches the IDREF value of any element or attribute, the function returns the empty sequence.
It is possible for two or more nodes to have an IDREF value that matches a given candidate ID value. In this situation, the function will return all such nodes. However, each matching node will be returned at most once, regardless of how many candidate ID values it matches.
It is possible in a well-formed but invalid document to have a node whose is-idrefs property is true but that does not conform to the lexical rules for the xs:IDREF type. The effect of the above rules is that ill-formed candidate ID values and ill-formed IDREF values are ignored.
If the data model is constructed from a PSVI, the typed value of a node that has the is-idrefs property will contain at least one atomic item of type xs:IDREF (or a type derived by restriction from xs:IDREF). It may also contain atomic items of other types. These atomic items are treated as candidate ID values if two conditions are met: their lexical form must be valid as an xs:NCName, and there must be at least one instance of xs:IDREF in the typed value of the node. If these conditions are not satisfied, such values are ignored.
| Variables | |
|---|---|
let $emp := validate lax {
document {
<employees xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<employee xml:id="ID21256">
<empnr xsi:type="xs:ID">E21256</empnr>
<first>Anil</first>
<last>Singh</last>
<deputy xsi:type="xs:IDREF">E30561</deputy>
</employee>
<employee xml:id="ID30561">
<empnr xsi:type="xs:ID">E30561</empnr>
<first>John</first>
<last>Brown</last>
<manager xsi:type="xs:IDREF">ID21256</manager>
</employee>
</employees>
}
} | |
| Expression: | $emp/(
element-with-id('ID21256')/@xml:id => idref()
)/ancestor::employee/last
=> string() |
|---|---|
| Result: | "Brown" (Assuming that |
| Expression: | $emp/(
element-with-id('E30561')/empnr => idref()
)/ancestor::employee/last
=> string() |
| Result: | "Singh" (Assuming that |
This function returns a string that uniquely identifies a given GNode.
fn:generate-id( | ||
$node | as | := . |
) as | ||
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If the argument is the empty sequence, the result is the zero-length string.
In other cases, the function returns a string that uniquely identifies a given node. More formally, it is guaranteed that within a single execution scope, fn:codepoint-equal(fn:generate-id($N), fn:generate-id($M)) returns true if and only if ($M is $N) returns true.
The returned identifier must consist of ASCII alphanumeric characters and must start with an alphabetic character. Thus, the string is syntactically an XML name.
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP
If the context value is not an instance of the sequence type nodeGNode()?, type error [err:XPTY0004]XP.
An implementation is free to generate an identifier in any convenient way provided that it always generates the same identifier for the same GNode and that different identifiers are always generated from different GNodes. An implementation is under no obligation to generate the same identifiers each time a document is transformed or queried.
There is no guarantee that a generated unique identifier will be distinct from any unique IDs specified in the source document.
There is no inverse to this function; it is not directly possible to find the GNode with a given generated ID. Of course, it is possible to search a given sequence of GNodes using an expression such as $nodes[generate-id()=$id].
It is advisable, but not required, for implementations to generate IDs that are distinct even when compared using a case-blind collation.
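One way to satisfy the contract described above is a per-execution counter keyed on node identity. The Python sketch below is an illustration of that idea under stated assumptions, not a description of any actual implementation: the class name is hypothetical, Python object identity stands in for node identity, and None stands in for the empty sequence.

```python
import itertools


class IdGenerator:
    """Model of generate-id within a single execution scope."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)
        # Keep nodes alive so that Python never reuses an identity.
        self._nodes = []

    def generate_id(self, node=None) -> str:
        if node is None:
            # Empty sequence: the result is the zero-length string.
            return ""
        key = id(node)
        if key not in self._ids:
            # A letter followed by digits: ASCII alphanumeric, starts
            # with an alphabetic character.
            self._ids[key] = f"d{next(self._counter)}"
            self._nodes.append(node)
        return self._ids[key]
```

Using a lower-case prefix and digits only also keeps the identifiers distinct under a case-blind comparison. A new IdGenerator produces identifiers unrelated to those of an earlier one, matching the rule that identifiers need not be stable across executions.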
The primary use case for this function is to generate hyperlinks. For example, when generating HTML, an anchor for a given section | |
| |
and a link to that section can then be produced with code such as: | |
| |
Note that anchors generated in this way will not necessarily be the same each time a document is republished. | |
Since the keys in a map must be atomic items, it is possible to use generated IDs as surrogates for nodes when constructing a map. For example, in some implementations, testing whether a node | |
| |
and then testing for membership of the node-set using: | |
|
These functions convert between the lexical representation and XPath and XQuery data model representation of various file formats.
These functions convert between the lexical representation of XML and the tree representation.
(The fn:serialize function also handles HTML and JSON output, but is included in this section for editorial convenience.)
| Function | Meaning |
|---|---|
fn:parse-xml | This function takes as input an XML document, and returns the document node at the root of an XDM tree representing the parsed document. |
fn:parse-xml-fragment | This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment. |
fn:serialize | This function serializes the supplied input sequence $input as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string. |
fn:xsd-validator | Given an XSD schema, delivers a function item that can be invoked to validate a document or element node against this schema. |
This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned. [Issue 2029 PR 2030 28 May 2025]
This section describes a process called XSD validation, which validates a supplied node against a supplied XSD schema. The validation process refers to the process defined in [XML Schema Part 1: Structures Second Edition] or [XSD 1.1 Part 1].
The validation process takes the following inputs:
A schema to be used for validation, called the effective schema.
A boolean indicating whether any xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes are to be taken into consideration.
A document, element, or attribute node to be validated; this is called the operand node.
A validation mode, which is one of strict, lax, or by-type.
Note:
XSLT also allows the value strip, but this does not invoke validation (instead, it invokes stripping of existing type annotations, and re-annotation of nodes as xs:untyped.)
If the validation mode is by-type, then a schema type to be used for validating the operand node. This may be any simple or complex type present in the effective schema: it must not be xs:untyped or xs:untypedAtomic.
Note:
An XQuery ValidateExpr allows the type to be specified as xs:untyped or xs:untypedAtomic, but this does not invoke validation (instead, it invokes stripping of existing type annotations and re-annotation of nodes as untyped.)
The output of the validation process comprises one or more of the following:
A boolean indicating whether the operand node was found to be valid.
If the operand node was found to be valid, a deep copy of the operand node augmented with type annotations corresponding to the types against which the nodes were validated. The copy may also include expanded values for element and attribute defaults defined in the schema.
This creates a new node with its own identity and with no parent.
The base URI property of every node in the resulting XDM tree is the same as the base URI property of the corresponding node in the input tree.
If the operand node was not found to be valid, then optionally, a set of error diagnostics in implementation-defined format.
The operand node must be one of:
An element node
An attribute node
A well-formed document node, that is, a document node having among its children exactly one element node and zero or more comment and processing instruction nodes.
The term validation root is used to refer to the operand node if it is an element or attribute node, or to the single element child of the operand node when the operand node is a document node.
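The operand-node rules above can be sketched in a few lines. This is only an illustration, not part of the validation process itself: the `Node` class and the `validation_root` function are hypothetical stand-ins for an XDM node and for the check that an implementation would perform.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical minimal stand-in for an XDM node (not a real API)."""
    kind: str                      # "document", "element", "attribute", "comment", ...
    name: str = ""
    children: list = field(default_factory=list)


def validation_root(operand: Node) -> Node:
    """Return the validation root of an operand node, or raise ValueError."""
    if operand.kind in ("element", "attribute"):
        # Element and attribute operands are their own validation root.
        return operand
    if operand.kind == "document":
        elements = [c for c in operand.children if c.kind == "element"]
        disallowed = [
            c for c in operand.children
            if c.kind not in ("element", "comment", "processing-instruction")
        ]
        # A well-formed document node has exactly one element child and
        # otherwise only comment and processing instruction children.
        if len(elements) == 1 and not disallowed:
            return elements[0]
        raise ValueError("document node is not well-formed")
    raise ValueError(f"a {operand.kind} node cannot be validated")
```

For a document node whose children are a comment and one element, the function returns that element; a document node with a text child is rejected.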
Note that a schema is defined as a collection of schema components (for example, element and attribute declarations, complex and simple type definitions). In some cases the schema that is used is the set of schema components found in the in-scope schema definitionsXP, but this is not the only possibility.
The result of the validation process is defined by the following rules.
The invoking application determines whether the validity assessment process takes account of any xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes in the tree being validated. If it does so, then it should adhere to the following rules:
Any schema loaded using these attributes must be compatibleDM with the existing effective schema.
Any schema loaded using these attributes must not override or redefine any schema components in the effective schema.
Any schema components loaded using this mechanism must be used for this validity assessment only, and must not affect the outcome of any subsequent validity assessments of other documents.
Note:
A processor may choose to cache such schema components but the existence of such a cache should only affect performance, not the validation outcome.
A consequence of validating a document using schema components that are not in the static context is that nodes may be annotated with types that are not in the static context. But the rules for schema compatibilityDM mean that this is not a problem.
If the instance being validated contains any xml:id attributes, such attributes are validated against the type xs:ID, making the containing element eligible as a target for the id function. Uniqueness checking of elements and attributes typed as xs:ID, however, is carried out only if the operand node is a document node.
If the operand node is a document node:
The children of the document node must consist of exactly one element node and zero or more comment and processing instruction nodes, in any order.
The element node child is validated, as described below.
The validation rule “Validation Root Valid (ID/IDREF)” is applied to the single element node child of the document node. This means that validation will fail if there are non-unique ID values or dangling IDREF values in the document tree.
Note:
This rule is not applied when the operand node is an element or attribute node.
There is no check that the tree contains unparsed entities whose names match the values of nodes of type xs:ENTITY or xs:ENTITIES. This is because it is not possible (either in XSLT or XQuery) to construct a tree containing unparsed entities. It is possible to add unparsed entity declarations to the result document by referencing a suitable DOCTYPE during serialization.
All other children of the document node (comments and processing instructions) are copied unchanged, and the results become the children of a new document node, which is returned as the validation result.
If the operand node is an element node, then:
For specification purposes, because the XSD specifications require the input document to be expressed as an XML Information Set ([XML Infoset]), the operand node is first converted to an Infoset according to the “Infoset Mapping” rules defined in [XQuery and XPath Data Model (XDM) 4.0]. Note that this process discards any existing type annotations.
Validity assessment is carried out on the root element information item of the resulting Infoset, using the supplied schema. The process of validation applies recursively to contained elements and attributes to the extent required by the supplied schema.
Note:
A practical implementation is unlikely to perform any physical conversion, but the process is defined this way in order to align with the XSD specification.
If the validation mode is by-type, then Schema-validity assessment is carried out according to the rules defined in [XML Schema Part 1: Structures Second Edition] or [XSD 1.1 Part 1] Part 1, section 3.3.4 "Element Declaration Validation Rules", “Validation Rule: Schema-Validity Assessment (Element)”, clauses 1.2 and 2, using this type definition as the “processor-stipulated type definition” for validation.
If validation mode is strict, then strict validation is carried out as described in [XML Schema Part 1: Structures Second Edition] Part 1, section 5.2, “Assessing Schema-Validity”, item 2, or its counterpart in XSD 1.1. This means that the root element information item in the Infoset must either:
have a name that matches a top-level element declaration in the effective schema, or
have an xsi:type attribute whose value matches the name of a top-level type definition in the effective schema
If there is no such element declaration or type definition, the element is assessed as invalid.
If validation mode is lax, then schema-validity assessment is carried out in accordance with [XML Schema Part 1: Structures Second Edition] Part 1, section 5.2, “Assessing Schema-Validity”, item 3, or its counterpart in XSD 1.1.
If validation mode is lax and the root element information item has neither a top-level element declaration nor an xsi:type attribute, XSD 1.0 and XSD 1.1 define the recursive checking of children and attributes as optional. This specification prescribes that this recursive checking is required.
Note:
This means, for example, that when an instance document is structured as having an envelope in one namespace wrapping a payload in a different namespace, and when schema definitions are available for the payload but not for the envelope, lax validation of the envelope may trigger validation of the payload.
If the operand node is an element node, the validation rules named “Validation Root Valid (ID/IDREF)” are not applied. This means that document-level constraints relating to uniqueness and referential integrity are not enforced.
There is no check that the document contains unparsed entities whose names match the values of nodes of type xs:ENTITY or xs:ENTITIES.
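As a rough illustration of the strict-mode precondition described above, the following sketch checks whether a root element can be assessed at all. The function name and the representation of names as `(namespace, local-name)` tuples are assumptions made for this example; a real processor would consult its schema component model instead of plain sets.

```python
def strict_root_is_assessable(root_name, xsi_type, element_decls, type_defs):
    """Return True if strict validation has something to validate against.

    root_name     -- (namespace, local-name) of the root element
    xsi_type      -- (namespace, local-name) named by xsi:type, or None
    element_decls -- names of top-level element declarations in the schema
    type_defs     -- names of top-level type definitions in the schema
    """
    if root_name in element_decls:
        return True
    # Otherwise an xsi:type naming a top-level type definition is required.
    return xsi_type is not None and xsi_type in type_defs
```

If this check fails in strict mode, the element is assessed as invalid; in lax mode, assessment instead proceeds recursively to the children and attributes.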
If the operand node is an attribute node, in particular when it is a parentless attribute node, then validation cannot be defined directly in terms of the XSD-defined validation process. Instead, conceptually, a copy of the attribute is first added to an element node that is created for the purpose, and namespace fixup is performed on this element node to ensure that it has an in-scope namespace binding for the prefix and namespace of the attribute name. The name of this element is of no consequence, but it must be the same as the name of a synthesized element declaration of the form:
<xs:element name="E">
<xs:complexType>
<xs:sequence/>
<xs:attribute ref="A"/>
</xs:complexType>
</xs:element>
where A is the name of the attribute being validated.
This synthetic element is then validated using the procedure given above for validating elements, and if it is found to be valid, a copy of the validated attribute is made, retaining its type annotation, but detaching it from the containing element (and thus, from any in-scope namespace bindings).
The XDM data model does not permit an attribute node with no parent to have a typed value that includes a namespace-qualified name, that is, a value whose type is derived from xs:QName or xs:NOTATION. This restriction is imposed because these types rely on the in-scope namespaces of a containing element to resolve namespace prefixes. Therefore, a parentless attribute is considered to be invalid against such a type.
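The synthesized element declaration used for attribute validation is mechanical to produce. The sketch below builds it as a string; the function name and the default element name `E` are illustrative assumptions, and the `xs` prefix is assumed to be bound to the XSD namespace by the surrounding schema document.

```python
from xml.sax.saxutils import quoteattr


def synthetic_element_declaration(attribute_ref: str, element_name: str = "E") -> str:
    """Build the wrapper element declaration for validating one attribute.

    attribute_ref is the lexical QName of the attribute declaration (A in the
    text above); element_name is arbitrary, but must match the name of the
    synthetic element that carries the attribute copy.
    """
    return (
        f"<xs:element name={quoteattr(element_name)}>\n"
        "  <xs:complexType>\n"
        "    <xs:sequence/>\n"
        f"    <xs:attribute ref={quoteattr(attribute_ref)}/>\n"
        "  </xs:complexType>\n"
        "</xs:element>"
    )
```

Calling `synthetic_element_declaration("my:status")` yields a declaration whose only permitted content is the attribute `my:status`.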
The outcome of the validation expression depends on the validity property of the root element information item in the PSVI that results from the XSD validation process.
If the validity property of the root element information item is valid, or if validation mode is lax and the validity property of the root element information item is notKnown, the PSVI is converted back into a data model instance as described in [XQuery and XPath Data Model (XDM) 4.0] Section 3.3, “Construction from a PSVI”. The resulting node (a new node of the same kind as the operand node) is returned as the result of the validate expression.
Otherwise, the operand node is deemed invalid.
Note:
During conversion of the PSVI into an XDM instance after validation, any element information items whose validity property is notKnown are converted into element nodes with type annotation xs:anyType, and any attribute information items whose validity property is notKnown are converted into attribute nodes with type annotation xs:untypedAtomic, as described in Section 6.5.3.1.17.3.3.1.1 Element and Attribute Node TypesDM.
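The decision rule for the outcome can be summarized as a small predicate. This is a sketch only: the function name is hypothetical, and the validity values are written as the strings used in the text above.

```python
def validation_succeeds(validity: str, mode: str) -> bool:
    """Decide whether the operand node is deemed valid.

    validity -- the validity property of the root element information item:
                "valid", "invalid" or "notKnown"
    mode     -- the validation mode: "strict", "lax" or "by-type"
    """
    if validity == "valid":
        return True
    # Lax validation also accepts a root whose validity could not be assessed.
    return mode == "lax" and validity == "notKnown"
```

When the predicate is true, the PSVI is converted back into a data model instance; otherwise the operand node is deemed invalid.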
The functions in this section convert between the lexical representation of HTML and the XDM tree representation.
| Function | Meaning |
|---|---|
| fn:parse-html | This function takes as input an HTML document, and returns the document node at the root of an XDM tree representing the parsed document. |
| fn:html-doc | Reads an external resource containing HTML, and returns the result of parsing the resource as HTML. |
The fn:parse-html function conceptually works in two phases:
The lexical HTML (supplied as a string) is parsed into an HTML DOM as defined by the HTML5 specification: see [HTML: Living Standard] and [DOM: Living Standard].
The resulting DOM is converted to an XDM tree as described in this section. This is described by defining the actions of the accessor functions defined in Section 6.77.5 AccessorsDM.
Note:
Because the [DOM: Living Standard] and [HTML: Living Standard] are not fixed, it is implementation-defined which versions are used.
Note:
An implementation must match the semantics of the mapping described in this section, but the specific way it achieves that is implementation-dependent.
Some possible implementation strategies are:
Parse the HTML to an HTML DOM and then convert the HTML DOM to an XDM node tree.
Parse the HTML to an HTML DOM and then implement a wrapper or facade that presents an XDM interface to the HTML DOM.
Parse the lexical HTML directly to an XDM node tree, bypassing the HTML DOM.
The [DOM: Living Standard] defines parsing algorithms for two different formats, which it refers to as the HTML and XML serializations (or concrete syntaxes). The XML serialization is an XML document which typically uses the namespace http://www.w3.org/1999/xhtml and the content type application/xhtml+xml, and is popularly referred to as XHTML. The HTML parsing algorithm constructs an HTML DOM HTMLDocument document object for the HTML document. The XHTML parsing algorithm constructs an HTML DOM XMLDocument object for the HTML document, following XML parsing rules. This mapping supports both of these document types.
The [DOM: Living Standard] specification defines HTML DOM nodes that are mapped to XDM nodes as follows:
The HTML DOM Document interface maps to Section 6.6.17.4.1 Document nodesDM.
The HTML DOM Element interface maps to Section 6.6.27.4.2 Element nodesDM.
The HTML DOM Attr interface maps to Section 6.6.37.4.3 Attribute nodesDM.
Note:
Any HTML DOM Attr instances in an HTML DOM HTMLDocument that represent namespace declarations will have been filtered out: see 15.2.1.1 attributes Accessor.
The HTML DOM ProcessingInstruction interface maps to Section 6.6.57.4.5 Processing instruction nodesDM.
Note:
The HTML parsing algorithm does not generate processing instruction nodes. If encountered they are parsed as comment nodes. The HTML DOM ProcessingInstruction interface is relevant only when the XHTML parsing algorithm is used.
The HTML DOM Comment interface maps to Section 6.6.67.4.6 Comment nodesDM.
The HTML DOM Text interface maps to Section 6.6.77.4.7 Text nodesDM. Adjacent HTML DOM Text nodes are combined into a single Section 6.6.77.4.7 Text nodesDM.
Note:
The HTML DOM CDATASection interface is an instance of HTML DOM Text, so CDATA sections also map to Section 6.6.77.4.7 Text nodesDM.
The use of CDATA sections can result in the HTML DOM containing adjacent text nodes, which the mapping to XDM will merge into a single node.
Note:
The HTML DOM DocumentFragment interface is not supported as an XML node. There are two places in the HTML DOM where this is used:
The HTML DOM ShadowRoot interface is not present in the main HTML DOM tree. It is only accessible via JavaScript.
The template element’s content property contains the child nodes of the template element. The behaviour of this is defined by the include-template-content key in the $options map.
If an implementation allows these nodes to be passed in via an API or similar mechanism, their behaviour is implementation-defined.
The result of the Section 6.7.17.5.1 attributes AccessorDM dm:attributes($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Element then the result is the value of the Element.attributes property mapped to a sequence as described below;
Otherwise, the result is an empty sequence.
An HTML DOM NamedNodeMap is mapped to a sequence as follows:
NamedNodeMap.length is the length of the sequence, where a length of 0 results in an empty sequence;
NamedNodeMap.item(n) is the nth element of the sequence.
That sequence is then filtered as follows:
If the Attr.namespaceURI property is "http://www.w3.org/2000/xmlns/", the attribute is not included in this sequence;
If the Attr.localName property is "xmlns", the attribute is not included in this sequence;
If the Attr.localName property starts with "xmlns:", the attribute is not included in this sequence;
Otherwise, the attribute is included in this sequence using the XDM mapping rules described in this section.
Note:
The HTML DOM Element.attributes property includes namespace and non-namespace attributes in the list when the HTML or XML parser is used. As such, the namespace attributes have to be filtered from the resulting XDM attribute sequence.
Note:
When the resulting document is an HTML DOM HTMLDocument, the Attr.localName and Attr.name properties of HTML DOM Attr nodes are both set to the qualified name. This includes namespace declarations which are filtered out by the logic in this section.
Note:
The Attr.localName property will be ASCII lowercase. The [HTML: Living Standard] section 13.2.5.33, Attribute name state specifies that ASCII upper alpha characters are appended to the attribute’s name in lowercase.
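The filtering rules for namespace declarations can be sketched as follows. The `Attr` tuple is a hypothetical stand-in for the HTML DOM Attr interface, carrying only the properties that the rules inspect.

```python
from typing import List, NamedTuple, Optional

XMLNS_NAMESPACE = "http://www.w3.org/2000/xmlns/"


class Attr(NamedTuple):
    """Hypothetical stand-in for an HTML DOM Attr."""
    namespace_uri: Optional[str]
    local_name: str
    value: str


def is_namespace_declaration(attr: Attr) -> bool:
    """Apply the three exclusion rules listed above."""
    return (
        attr.namespace_uri == XMLNS_NAMESPACE
        or attr.local_name == "xmlns"
        or attr.local_name.startswith("xmlns:")
    )


def xdm_attributes(attrs: List[Attr]) -> List[Attr]:
    """Keep, in order, the attributes that are not namespace declarations."""
    return [a for a in attrs if not is_namespace_declaration(a)]
```

The second and third rules matter for documents built by the HTML parsing algorithm, where a declaration such as `xmlns:xlink` can appear as an attribute with no namespace URI.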
The result of the Section 6.7.27.5.2 base-uri AccessorDM dm:base-uri($node) for an HTML DOM Node is the value of the Node.baseURI property mapped as follows:
If the value is null or an empty string, then the result is an empty sequence;
Otherwise, the string value is cast to an xs:anyURI.
The result of the Section 6.7.37.5.3 children AccessorDM dm:children($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Document then the result is the value of the Node.childNodes property mapped to a sequence;
If the node is an instance of HTML DOM HTMLTemplateElement then the result is determined as follows:
If the include-template-content key of the parse-html-options map is false(), the result is an empty sequence;
Select the HTML DOM DocumentFragment from the HTMLTemplateElement.content property;
The HTML DOM DocumentFragment’s Node.childNodes property is mapped to a sequence;
If the node is an instance of HTML DOM Element then the result is the value of the Node.childNodes property mapped to a sequence;
Otherwise, the result is an empty sequence.
An HTML DOM NodeList is mapped to a sequence as follows:
NodeList.length is the length of the sequence, where a length of 0 results in an empty sequence;
NodeList.item(n) is the nth element of the sequence.
That sequence is then filtered as follows:
If the child is an instance of HTML DOM DocumentType, that child is not included in this sequence;
A sequence of consecutive HTML DOM Text nodes is combined into a single XDM text node;
Otherwise, the HTML DOM Node nodes are mapped to XDM according to the rules in this section.
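The filtering of a child list can be sketched as a single pass. Nodes are modelled here as hypothetical `(kind, value)` tuples rather than real DOM objects, which is enough to show the two rules: DocumentType children are dropped, and runs of consecutive text nodes are merged.

```python
def xdm_children(child_nodes):
    """Map a list of (kind, value) DOM children to XDM children."""
    result = []
    for kind, value in child_nodes:
        if kind == "doctype":
            # DocumentType nodes have no XDM counterpart.
            continue
        if kind == "text" and result and result[-1][0] == "text":
            # Merge with the preceding text node instead of adding a new one.
            result[-1] = ("text", result[-1][1] + value)
        else:
            result.append((kind, value))
    return result
```

For example, a text node followed by a CDATA section (also a Text node in the DOM) comes out as one XDM text node whose content is the concatenation of the two.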
The result of the Section 6.7.47.5.4 document-uri AccessorDM dm:document-uri($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Document then the result is the value of the Document.documentURI property mapped as follows:
If the value is null or an empty string, then the result is an empty sequence;
Otherwise, the string value is cast to an xs:anyURI.
Otherwise, the result is an empty sequence.
The result of the Section 6.7.57.5.5 is-id AccessorDM dm:is-id($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Attr then:
If the Attr.name property (its qualified name) is "id", then:
If the Attr.value is castable to an xs:NCName, the result is true;
Otherwise, the result is false;
Otherwise, the result is false;
Otherwise, the result is false.
Note:
In [HTML: Living Standard] section 3.2.5, Global attributes, the id attribute is defined as being unique in the element’s tree, containing at least one character, and not having any ASCII whitespace characters. This means that an HTML id attribute may not conform to an xs:NCName.
If an HTML id is not a valid xs:NCName then that attribute is not an XML ID.
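The is-id decision can be sketched as below. This is a deliberately simplified illustration: the pattern accepts only ASCII names, whereas xs:NCName also allows many non-ASCII letters, and the whitespace trimming is an assumption intended to mimic the whitespace collapsing applied when casting to xs:NCName. The function name and parameters are hypothetical.

```python
import re

# ASCII-only approximation of the NCName production (a simplification).
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_id(node_kind: str, qualified_name: str, value: str) -> bool:
    """Approximate dm:is-id for an HTML DOM node."""
    if node_kind != "attribute" or qualified_name != "id":
        return False
    # Assumption: leading and trailing XML whitespace is removed before the
    # NCName check, as whitespace collapsing would do during a cast.
    candidate = value.strip(" \t\r\n")
    return _NCNAME_ASCII.fullmatch(candidate) is not None
```

Under this sketch `id="main-nav"` is an ID, while `id="1st"` and `id="a b"` are ordinary attributes even though they are present in the HTML source.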
The result of the Section 6.7.67.5.6 is-idrefs AccessorDM dm:is-idrefs($node) for an HTML DOM Node is an empty sequence.
The result of the Section 6.7.77.5.7 namespace-nodes AccessorDM dm:namespace-nodes($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Element then the result is an implementation-dependent sequence of namespace nodes that is sufficient to define the namespace context of the node.
Otherwise, the result is the empty sequence.
For the XHTML parsing algorithm, this will be equivalent to constructing the namespace nodes from an XML infoset, PSVI, or similar mapping.
For the HTML parsing algorithm, the [HTML: Living Standard] specification defines the namespace context in various places:
Section 2.1.3 XML compatibility defines the default element namespace to be http://www.w3.org/1999/xhtml.
Section 4.8.15 MathML defines rules for embedded MathML content in HTML documents. Section 13.1.2 Elements defines these elements as foreign elements, placing them in the MathML namespace (http://www.w3.org/1998/Math/MathML). The default element namespace for these elements is the MathML namespace.
Section 4.8.16 SVG defines rules for embedded SVG content in HTML documents. Section 13.1.2 Elements defines these elements as foreign elements, placing them in the SVG namespace (http://www.w3.org/2000/svg). The default element namespace for these elements is the SVG namespace.
Section 13.1.2.3 Attributes defines several namespaced attributes available on foreign elements. If any of these namespaced attributes are present, a namespace node for that namespace must be present on the element.
The supported namespace prefixes are:
xlink in the http://www.w3.org/1999/xlink namespace;
xml in the http://www.w3.org/XML/1998/namespace namespace; and
xmlns in the http://www.w3.org/2000/xmlns/ namespace.
No other namespaces are supported by the HTML parser.
Note:
Section number references to [HTML: Living Standard] may change over time.
The result of the Section 6.7.87.5.8 nilled AccessorDM dm:nilled($node) for an HTML DOM Node is false().
The result of the Section 6.7.97.5.9 node-kind AccessorDM dm:node-kind($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Document then the result is "document".
If the node is an instance of HTML DOM Element then the result is "element".
If the node is an instance of HTML DOM Attr then the result is "attribute".
If the node is an instance of HTML DOM ProcessingInstruction then the result is "processing-instruction".
If the node is an instance of HTML DOM Comment then the result is "comment".
If the node is an instance of HTML DOM Text then the result is "text".
The result of the Section 6.7.107.5.10 node-name AccessorDM dm:node-name($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Element then the result is determined as follows:
The local name is the value of the Element.localName property. This is derived as follows:
The local name is initially set to the ASCII lowercase tag name. The [HTML: Living Standard] section 13.2.5.8, Tag name state specifies that ASCII upper alpha characters are appended to the element’s name in lowercase.
If the local name is an SVG element name, the case-sensitive name is used. [HTML: Living Standard] section 13.2.6.5, The rules for parsing tokens in foreign content has a table mapping the lowercase element names to their SVG names.
If the local name contains a character that is not a valid XML NameStartChar or NameChar, then an implementation-defined replacement string is used. The result must be a valid NCName.
Note:
[HTML: Living Standard] section 13.2.9 Coercing an HTML DOM into an infoset uses a Unnnnnn escape sequence. That would map : to U00003A.
This local name escaping applies only to the HTML parsing algorithm. If the XHTML parsing algorithm is used, the localName and prefix will be correctly set for QName-based node names.
The namespace prefix is the value of the Element.prefix property, or empty if the value is null;
The namespace URI is the value of the Element.namespaceURI property, or empty if the value is null.
If the element is an HTML element, the namespace URI is "http://www.w3.org/1999/xhtml".
If the element is an SVG element, the namespace URI is "http://www.w3.org/2000/svg".
If the element is a MathML element, the namespace URI is "http://www.w3.org/1998/Math/MathML".
If the node is an instance of HTML DOM Attr then the result is determined as follows:
The attribute name is the tokenized attribute name. The [HTML: Living Standard] section 13.2.5.33, Attribute name state specifies that ASCII upper alpha characters are appended to the attribute’s name in lowercase.
The local name is the value of the Attr.localName property. This is derived as follows:
The local name is initially set to the attribute name.
If the local name is an SVG or MathML attribute name, the case-sensitive name is used. [HTML: Living Standard] section 13.2.6.1, Creating and inserting nodes has a table mapping the lowercase attribute names to their SVG/MathML names.
If the local name is an allowed xlink, xml, or xmlns attribute name the local name is the value of the local name column of the attribute name mapping table in [HTML: Living Standard] section 13.2.6.1, Creating and inserting nodes.
If the local name contains a character that is not a valid XML NameStartChar or NameChar, then an implementation-defined replacement string is used. The result must be a valid NCName.
Note:
[HTML: Living Standard] section 13.2.9 Coercing an HTML DOM into an infoset uses a Unnnnnn escape sequence. That would map : to U00003A.
This local name escaping applies only to the HTML parsing algorithm. If the XHTML parsing algorithm is used, the localName and prefix will be correctly set for QName-based node names.
The namespace prefix is the value of the Attr.prefix property, or empty if the value is null.
If the attribute name is an allowed xlink, xml, or xmlns attribute name the namespace prefix is the value of the prefix column of the attribute name mapping table in [HTML: Living Standard] section 13.2.6.1, Creating and inserting nodes.
The namespace URI is the value of the Attr.namespaceURI property, or empty if the value is null;
If the attribute name is an allowed xlink, xml, or xmlns attribute name the namespace URI is the value of the namespace column of the attribute name mapping table in [HTML: Living Standard] section 13.2.6.1, Creating and inserting nodes.
If the node is an instance of HTML DOM ProcessingInstruction then the result is an xs:QName constructed as follows:
The local name is the value of the ProcessingInstruction.target property;
The namespace prefix is empty;
The namespace URI is empty;
Otherwise, the result is an empty sequence.
Note:
When the resulting document is an HTML DOM HTMLDocument, the Element.localName and Element.name properties of HTML DOM Element nodes are both set to the qualified name.
Note:
When the resulting document is an HTML DOM HTMLDocument, the Attr.localName and Attr.name properties of HTML DOM Attr nodes are both set to the qualified name.
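The replacement string used for invalid name characters is implementation-defined. The sketch below shows one possible choice, the Unnnnnn convention mentioned in the notes above, applied to a local name. The function name is hypothetical; the code point ranges are those of the XML NameStartChar and NameChar productions with the colon excluded, so that the result is an NCName.

```python
# NameStartChar ranges from XML 1.0, without ":" (so that results are NCNames).
_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A), (0xC0, 0xD6), (0xD8, 0xF6),
    (0xF8, 0x2FF), (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Additional characters allowed after the first position (NameChar).
_EXTRA_NAME_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    return any(low <= code_point <= high for low, high in ranges)


def escape_local_name(name: str) -> str:
    """Replace each character that is invalid at its position with Unnnnnn."""
    parts = []
    for index, char in enumerate(name):
        code_point = ord(char)
        valid = _in_ranges(code_point, _START_RANGES) or (
            index > 0 and _in_ranges(code_point, _EXTRA_NAME_RANGES)
        )
        parts.append(char if valid else f"U{code_point:06X}")
    return "".join(parts)
```

With this choice the name `a:b` becomes `aU00003Ab`, matching the mapping of `:` to `U00003A` given above. An implementation is free to choose a different replacement, provided the result is a valid NCName.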
The result of the Section 6.7.117.5.11 parent AccessorDM dm:parent($node) for an HTML DOM Node is as follows:
Let $parent be the Node.parentNode property of the node;
If $parent is an instance of HTML DOM DocumentFragment, then for each HTML DOM HTMLTemplateElement $template in the parsed DOM tree:
Let $content be the value of the HTMLTemplateElement.content property of $template;
If $content is the same node as $parent, then the result is $template using the XDM mapping rules described in this section;
If there are no more $template nodes, then the result is an empty sequence;
If $parent is null, then the result is an empty sequence;
Otherwise, the result is $parent using the XDM mapping rules described in this section.
Note:
The current node can have an HTML DOM DocumentFragment parent node only if the include-template-content key of the html-parser-options is true().
Note:
The HTML DOM DocumentFragment’s Node.parentNode property is null, and a DocumentFragment attached to HTMLTemplateElement.content property does not have a host property connecting the fragment back to the template element.
If a future version of [DOM: Living Standard] adds a DocumentFragment.host property that references the node’s template element, or the implementation has access to that internal property, the implementation may choose to use that instead of traversing the parsed HTML tree.
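The parent lookup, including the search for the owning template element, can be sketched as follows. `DomNode` is a hypothetical stand-in for HTML DOM nodes, carrying only the properties that the algorithm uses; `None` plays the role of both null and the empty sequence.

```python
class DomNode:
    """Hypothetical minimal stand-in for an HTML DOM node."""

    def __init__(self, kind, name=""):
        self.kind = kind            # "document", "element", "document-fragment", ...
        self.name = name
        self.parent_node = None     # models Node.parentNode
        self.child_nodes = []       # models Node.childNodes
        self.content = None         # models HTMLTemplateElement.content

    def append(self, child):
        child.parent_node = self
        self.child_nodes.append(child)
        return child


def iter_templates(node):
    """Yield every template element below node, including nested content."""
    if node.kind == "element" and node.name == "template":
        yield node
        if node.content is not None:
            yield from iter_templates(node.content)
    for child in node.child_nodes:
        yield from iter_templates(child)


def xdm_parent(node, document):
    """Return the node that dm:parent maps to, or None for an empty sequence."""
    parent = node.parent_node
    if parent is None:
        return None
    if parent.kind == "document-fragment":
        # The fragment has no link back to its template, so search for it.
        for template in iter_templates(document):
            if template.content is parent:
                return template
        return None
    return parent
```

The search is linear in the size of the tree. As the note above says, an implementation with access to an internal host reference could avoid the traversal.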
The result of the Section 6.7.127.5.12 string-value AccessorDM dm:string-value($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Document, then use the algorithm described in 15.2.1.12.1 Tree string construction;
If the node is an instance of HTML DOM Element, then use the algorithm described in 15.2.1.12.1 Tree string construction;
If the node is an instance of HTML DOM Text, then use the algorithm described in 15.2.1.12.2 Text node string construction;
Otherwise, the result is the value of the Node.nodeValue property.
The result of the Section 6.7.137.5.13 type-name AccessorDM dm:type-name($node) for an HTML DOM Node is as follows:
If the node is an instance of HTML DOM Element then the result is xs:untyped.
If the node is an instance of HTML DOM Attr then the result is xs:untypedAtomic.
If the node is an instance of HTML DOM Text then the result is xs:untypedAtomic.
Otherwise, the result is an empty sequence.
The result of the Section 6.7.147.5.14 typed-value AccessorDM dm:typed-value($node) for an HTML DOM Node is as follows:
Let $string-value be the 15.2.1.12 string-value Accessor for the node;
If the node is an instance of HTML DOM Document then the result is $string-value as an xs:untypedAtomic;
If the node is an instance of HTML DOM Element then the result is $string-value as an xs:untypedAtomic;
If the node is an instance of HTML DOM Attr then the result is $string-value as an xs:untypedAtomic;
If the node is an instance of HTML DOM Text then the result is $string-value as an xs:untypedAtomic;
Otherwise, the result is $string-value.
The result of the Section 6.7.157.5.15 unparsed-entity-public-id AccessorDM dm:unparsed-entity-public-id($node) for an HTML DOM Node is an empty sequence.
The result of the Section 6.7.167.5.16 unparsed-entity-system-id AccessorDM dm:unparsed-entity-system-id($node) for an HTML DOM Node is an empty sequence.
The functions listed in this section parse or serialize JSON data.
JSON is a popular format for exchange of structured data on the web: it is specified in [RFC 7159]. This section describes facilities allowing JSON data to be converted to and from XDM values.
This specification describes two ways of representing JSON data losslessly using XDM constructs. The first method uses XDM maps to represent JSON objects, and XDM arrays to represent JSON arrays. The second method represents all JSON constructs using XDM element and attribute nodes.
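To make the two representations concrete, the sketch below parses a JSON text into nested dictionaries and lists (loosely analogous to XDM maps and arrays) and then converts the same data into an element tree. It is an illustration in Python, not a definition of either representation: the function name `to_xml` is hypothetical, and the element names, the `key` attribute, and the namespace are assumed to follow the vocabulary used by fn:json-to-xml. Details such as escaping and the exact lexical form of numbers are ignored.

```python
import json
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"


def to_xml(value, key=None):
    """Convert a parsed JSON value into an element-based representation."""
    if isinstance(value, dict):
        elem = ET.Element(f"{{{FN_NS}}}map")
        for child_key, child_value in value.items():
            elem.append(to_xml(child_value, child_key))
    elif isinstance(value, list):
        elem = ET.Element(f"{{{FN_NS}}}array")
        for member in value:
            elem.append(to_xml(member))
    elif isinstance(value, bool):
        # Checked before numbers, because bool is a subclass of int in Python.
        elem = ET.Element(f"{{{FN_NS}}}boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem = ET.Element(f"{{{FN_NS}}}null")
    elif isinstance(value, (int, float)):
        elem = ET.Element(f"{{{FN_NS}}}number")
        elem.text = json.dumps(value)
    else:
        elem = ET.Element(f"{{{FN_NS}}}string")
        elem.text = value
    if key is not None:
        # Entries of a JSON object carry their key as an attribute.
        elem.set("key", key)
    return elem
```

Here `json.loads(text)` plays the role of the first representation and `to_xml(json.loads(text))` the second. Both retain the same keys, members, and scalar values.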
| Function | Meaning |
|---|---|
| fn:parse-json | Parses input supplied in the form of a JSON text, returning the results typically in the form of a map or array. |
| fn:json-doc | Reads an external resource containing JSON, and returns the result of parsing the resource as JSON. |
| fn:json-to-xml | Parses a string supplied in the form of a JSON text, returning the results in the form of an XML document node. |
| fn:xml-to-json | Converts an XML tree, whose format corresponds to the XML representation of JSON defined in this specification, into a string conforming to the JSON grammar. |
| fn:pin | Adapts a map or array so that retrieval operations retain additional information. |
| fn:label | Returns the label associated with a labeled item, as a map. |
Note also:
The function fn:serialize has an option to generate JSON output from a structure of maps and arrays.
The function fn:element-to-map enables arbitrary XML node trees to be converted to trees of maps and arrays suitable for serializing as JSON.
Adapts a map or array so that retrieval operations retain additional information.
fn:pin( | ||
$input | as | |
) as | ||
This function is nondeterministic, context-independent, and focus-independent.
The function creates a deep copy of the supplied map or array, adapted so that navigation within the deep copy returns items that are labeled with additional information about their position within the containing tree structure.
Note:
The formal specification of the function describes it as constructing a deep copy of the entire tree, but a practical implementation is likely to use a lazy evaluation strategy, so the only costs incurred are for items actually selected within the tree.
The function makes use of the concept of labeled items, an extension to the data model described in Section 3.3 Labeled ItemsDM.
The supplied value of $input must be either a map or an array.
The result is as follows:
If $input is a map M, the result is a map M′ derived from M as follows:
Any existing label on M is discarded.
M′ acquires a label having the property pinned set to the value true, and the property id set to an arbitrary xs:string value that is unique within the execution scope.
For every key-value pair (K, V) in M, M′ will have a key-value pair (K, V′) in which the key K is unchanged, and the value V′ is derived from V by applying the function derived-value(M', K, V), defined below.
The entry orderDM of M is retained in M′.
If $input is an array A, the result is an array A′ derived from A as follows:
Any existing label on A is discarded.
A′ acquires a label having the property pinned set to the value true, and the property id set to an arbitrary xs:string value that is unique within the execution scope.
For every member V in A, whose 1-based index position in A is X, A′ will have a member V′ derived from V by applying the function derived-value(A', X, V), defined below.
The id property described in the previous paragraphs is allocated only to the top-level map or array (the one supplied as an explicit argument to the fn:pin function). The function is nondeterministic: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same id property is allocated on both occasions.
If $input is anything other than a map or an array, a type error is raised.
The function derived-value(P, K, V) has the following logic. For every item J in V, V′ will contain an item J′ that is derived from J as follows:
Let TEMP be:
If J is a map or array, then fn:pin(J).
Note:
The id property of TEMP is not used, so there is no need to generate it.
Otherwise, J.
J′ is then a labeled item having the same subject as TEMP, together with a label having the following properties:
| Property | Value |
|---|---|
| pinned | true |
| key | K |
| position | The 1-based position of J within V. |
| parent | P |
| ancestors | A zero-arity function item delivering the value of (?parent, ?parent ! label(.)?ancestors()). |
| path | A zero-arity function item delivering the value of (?parent ! label(.)?path(), ?key). |
The effect of calling pin on a map or array is that subsequent retrieval operations within the pinned map or array return labeled results, whose labels contain useful information about where the results were found. For example, an expression such as json-doc($source)??name will return the values of all entries in the JSON tree having the key "name"; but very little can be done with this information because the result is simply a sequence of (typically) strings with no context. By contrast, the result of pin(json-doc($source))??name is the same set of strings, labeled with information about where they were found. For example, if $result is the result of the expression pin(json-doc($source))??name, then:
$result => label()?parent?ssn locates the map that contained each name, and returns the value of the ssn entry in that map.
$result => label()?ancestors()?course returns the values of any course entries in containing maps.
$result => label()?path() returns a sequence of map keys and array index values representing the location of the found entries within the JSON structure.
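The labeling behavior described above can be modelled outside XPath. The following Python sketch is illustrative only: the class `Labeled` and the function `deep_lookup` are hypothetical names, Python dicts and lists stand in for maps and arrays, and only the key, parent, ancestors and path properties of the label are modelled.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional, Tuple


@dataclass
class Labeled:
    """Hypothetical stand-in for a labeled item: a value plus where it was found."""
    value: Any
    key: Any                     # map key or 1-based array index; None at the root
    parent: Optional["Labeled"]  # the labeled container; None at the root

    def ancestors(self) -> Iterator["Labeled"]:
        """Containers from the nearest outwards, like label(.)?ancestors()."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def path(self) -> Tuple[Any, ...]:
        """Keys and indexes leading from the root to this item, like label(.)?path()."""
        steps = [self.key] + [a.key for a in self.ancestors()]
        return tuple(step for step in reversed(steps) if step is not None)


def deep_lookup(container: Labeled, wanted: Any) -> Iterator[Labeled]:
    """Yield every entry whose key is `wanted`, at any depth, keeping its label."""
    value = container.value
    if isinstance(value, dict):
        children = [Labeled(v, k, container) for k, v in value.items()]
    elif isinstance(value, list):
        children = [Labeled(v, i, container) for i, v in enumerate(value, start=1)]
    else:
        return  # atomic values have no children
    for child in children:
        if child.key == wanted:
            yield child
        yield from deep_lookup(child, wanted)


# Assumed sample data, loosely following the "name" and "ssn" wording above.
source = {"students": [{"name": "Ann", "ssn": "111"},
                       {"name": "Bob", "ssn": "222"}]}
found = list(deep_lookup(Labeled(source, None, None), "name"))
print([f.value for f in found])                # the selected names
print([f.parent.value["ssn"] for f in found])  # the ssn entry of each containing map
print([f.path() for f in found])               # where each name was found
```

As in the specification, the label is attached lazily, at the moment an item is selected, so no copy of the whole tree is made up front.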
Editorial note:
The id property on the root of a pinned map or array is intended to support deep update operations, which have not yet been defined.
| Expression: |
|
|---|---|
| Result: | "c" |
| Expression: |
|
| Result: | 1, 3, 4 |
| Expression: | let $data := {
"fr": { "capital": "Paris", "languages": [ "French" ] },
"de": { "capital": "Berlin", "languages": [ "German" ] }
}
return pin($data)??languages[. = 'German'] ! label(.)?path()[1] |
| Result: | "de" |
Returns the label associated with a labeled item, as a map.
fn:label( | ||
$input | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
If $input is an empty sequence, the function returns an empty sequence.
If $input is an item that has no label, the function returns an empty map.
If $input is a labeled item, the function returns the label, as a map.
The function makes use of the concept of labeled items, an extension to the data model described in Section 3.3 Labeled ItemsDM.
The data model allows any item to be labeled, and allows the label to be any map with string-valued keys. Currently the only operation that creates labeled values is the fn:pin function. For examples illustrating the use of fn:label, see fn:pin.
The functions included in this section operate on function items, that is, values referring to a function.
[Definition] Functions that accept functions among their arguments, or that return functions in their result, are described in this specification as higher-order functions.
Note:
Some functions such as fn:parse-json allow the option of supplying a callback function for example to define exception behavior. Where this is not essential to the use of the function, the function has not been classified as higher-order for this purpose; in applications where function items cannot be created, these particular options will not be available.
| Function | Meaning |
|---|---|
| fn:function-lookup | Returns a function item having a given name and arity, if there is one. |
| fn:function-name | Returns the name of the function identified by a function item. |
| fn:function-arity | Returns the arity of the function identified by a function item. |
| fn:function-identity | Returns a string representing the identity of a function item. |
| fn:function-annotations | Returns the annotations of the function item. |
Returns a string representing the identity of a function item.
fn:function-identity( | ||
$function | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
The fn:function-identity function returns a string that represents the identity of $function.
The returned string has the property that fn:function-identity($f1) and fn:function-identity($f2) are codepoint-equal if and only if $f1 and $f2 have the same function identity. Apart from this property, the result is implementation-dependent.
Any label attached to a function item is ignored (see Section 3.3 Labeled ItemsDM). Specifically, if L is a labeled item then fn:function-identity(L) returns the function identity of the subject of L.
In the case of maps and arrays, the result obeys the following rule: if $X and $Y are both maps or arrays, then fn:function-identity($X) must not be codepoint-equal to fn:function-identity($Y) unless $X and $Y are indistinguishable, that is, unless every operator or function applied to $X returns the same result as for $Y. Even in this case, however, the result of the comparison fn:function-identity($X) eq fn:function-identity($Y) is implementation-dependent.
This function enables applications to test whether two expressions or variables reference the same function item. This may be useful, for example, to allow caching of function results to avoid repeated evaluation. The results of previous function invocations might be held in a map whose key is the function identity.
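The caching idea can be sketched in Python. This is an analogy rather than an implementation of fn:function-identity: the helper names `function_identity` and `cached_call` are hypothetical, and Python's built-in `id()` plays the role of the function identity, which is only safe while the function object is kept alive.

```python
from typing import Any, Callable, Dict, List, Tuple

_results: Dict[Tuple[str, Any], Any] = {}
_keep_alive: List[Callable] = []  # id() values can be reused once an object is freed


def function_identity(f: Callable) -> str:
    """Hypothetical analogue: equal strings if and only if the same function object."""
    return f"fn-{id(f)}"


def cached_call(f: Callable, arg: Any) -> Any:
    """Call f(arg) once per (function identity, argument) pair."""
    key = (function_identity(f), arg)
    if key not in _results:
        _keep_alive.append(f)
        _results[key] = f(arg)
    return _results[key]


calls = []

def slow_square(n: int) -> int:
    calls.append(n)  # record each real evaluation
    return n * n

cached_call(slow_square, 4)
cached_call(slow_square, 4)  # answered from the cache
print(len(calls))
```

A second function with an identical body has a different identity, so its results are cached separately.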
The function identity, by definition, is generated upon the creation of a function item. Specific expressions that create function items have their own rules for the identity of the returned functions: for example, it is guaranteed that evaluation of a function reference to a system function with no captured context (such as fn:abs#1) will always return the same function item.
It is not meaningful to store or compare the result of calling fn:function-identity across different execution scopes, because the string used to represent the function identity will generally vary from one execution scope to another.
The result of an expression such as function-identity(abs#1) eq function-identity(abs(?)) may be either true or false, because it is implementation-dependent whether abs#1 and abs(?) return the same function item.
Similarly, function-identity({ 1:() }) eq function-identity(map:entry(1, ())) may be either true or false.
Labels on function items are ignored because they typically represent information about how the function item was retrieved, rather than about the item itself. For example, a function item held in a map might be retrieved using a variety of lookup expressions, which may return the same function item but with different labels.
| Expression | Result |
|---|---|
| true() |
| false() |
| false() |
| false() |
The following functions take function items as an argument.
| Function | Meaning |
|---|---|
| fn:apply | Makes a dynamic call on a function with an argument list supplied in the form of an array. |
| fn:do-until | Processes a supplied value repeatedly, continuing when some condition is false, and returning the value that satisfies the condition. |
| fn:every | Returns true if every item in the input sequence matches a supplied predicate. |
| fn:filter | Returns those items from the sequence $input for which the supplied function $predicate returns true. |
| fn:fold-left | Processes the supplied sequence from left to right, applying the supplied function repeatedly to each item in turn, together with an accumulated result value. |
| fn:fold-right | Processes the supplied sequence from right to left, applying the supplied function repeatedly to each item in turn, together with an accumulated result value. |
| fn:for-each | Applies the function item $action to every item from the sequence $input in turn, returning the concatenation of the resulting sequences in order. |
| fn:for-each-pair | Applies the function item $action to successive pairs of items taken one from $input1 and one from $input2, returning the concatenation of the resulting sequences in order. |
| fn:highest | Returns those items from a supplied sequence that have the highest value of a sort key, where the sort key can be computed using a caller-supplied function. |
| fn:index-where | Returns the positions in an input sequence of items that match a supplied predicate. |
| fn:lowest | Returns those items from a supplied sequence that have the lowest value of a sort key, where the sort key can be computed using a caller-supplied function. |
| fn:partial-apply | Performs partial application of a function item by binding values to selected arguments. |
| fn:partition | Partitions a sequence of items into a sequence of non-empty arrays containing the same items, starting a new partition when a supplied condition is true. |
| fn:scan-left | Produces the sequence of successive partial results from the evaluation of fn:fold-left with the same arguments. |
| fn:scan-right | Produces the sequence of successive partial results from the evaluation of fn:fold-right with the same arguments. |
| fn:some | Returns true if at least one item in the input sequence matches a supplied predicate. |
| fn:sort | Sorts a supplied sequence, based on the value of a sort key supplied as a function. |
| fn:sort-by | Sorts a supplied sequence, based on the value of a number of sort keys supplied as functions. |
| fn:sort-with | Sorts a supplied sequence, according to the order induced by the supplied comparator functions. |
| fn:subsequence-where | Returns a contiguous sequence of items from $input, with the start and end points located by applying predicates. |
| fn:take-while | Returns items from the input sequence prior to the first one that fails to match a supplied predicate. |
| fn:transitive-closure | Returns all the nodes reachable from a given start node by applying a supplied function repeatedly. |
| fn:while-do | Processes a supplied value repeatedly, continuing while some condition remains true, and returning the first value that does not satisfy the condition. |
With all these functions, if the caller-supplied function fails with a dynamic error, this error is propagated as an error from the higher-order function itself.
Performs partial application of a function item by binding values to selected arguments.
fn:partial-apply( | ||
$function | as , | |
$arguments | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
The result is a function obtained by binding values to selected arguments of the function item $function. The arguments to be bound are represented by entries in the $arguments map: an entry with key $i and value $v causes the argument at position $i (1-based) to be bound to $v.
Any entries in $arguments whose keys are greater than the arity of $function are ignored.
If $arguments is an empty map then the function returns $function unchanged.
For example, the effect of calling fn:partial-apply($f, { 2: $x }) is the same as the effect of the partial application $f(?, $x, ?, ?, ...). The coercion rules are applied to the supplied arguments in the usual way.
Unlike a partial application using place-holder arguments:
The arity of $function need not be statically known.
It is possible to bind all the arguments of $function: the effect is to return a zero-arity function.
The result is a partially applied functionXP having the following properties (which are defined in Section 7.18.1 Function ItemsDM):
name: absent.
identity: A new function identity distinct from the identity of any other function item.
Note:
See also Section 4.5.7 Function IdentityXP.
arity: The arity of $function minus the number of parameters in $function that map to supplied arguments in $arguments.
parameter names: The names of the parameters of $function that do not map to supplied arguments in $arguments.
signature: The parameters in the returned function are the parameters of $function that do not map to supplied arguments in $arguments, retaining order. The result type of the returned function is the same as the result type of $function.
An implementation that can determine a more specific signature (for example, through use of type analysis) is permitted to do so.
body: The body of $function.
captured context: The static and dynamic context of $function, augmented, for each supplied argument, with a binding of the converted argument value to the corresponding parameter name.
A type error is raised if any of the supplied arguments, after applying the coercion rules, does not match the required type of the corresponding function parameter.
In addition, a dynamic error may be raised if any of the supplied arguments does not match other constraints on the value of that argument (for example, if the value supplied for a parameter expecting a regular expression is not a valid regular expression); or if the processor is able to establish that evaluation of the resulting function will fail for any other reason (for example, if an error is raised while evaluating a subexpression in the function body that depends only on explicitly supplied and defaulted parameters).
See also Section 4.5.4 Partial Function ApplicationXP.
The function is useful where the arity of a function item is not known statically, or where all arguments in a function are to be bound, returning a zero-arity function.
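The binding rules can be modelled in Python as a rough sketch. The name `partial_apply` is a hypothetical stand-in, a dict with positive integer keys stands in for the $arguments map, and the arity is read with `inspect.signature`, which is assumed to work for the plain Python functions used here. Coercion and type checking are not modelled.

```python
import inspect
from typing import Any, Callable, Dict


def partial_apply(function: Callable, arguments: Dict[int, Any]) -> Callable:
    """Bind the 1-based argument positions named in `arguments`."""
    if not arguments:
        return function  # an empty map returns the function unchanged
    arity = len(inspect.signature(function).parameters)
    # Entries whose keys exceed the arity are ignored.
    bound = {i: v for i, v in arguments.items() if i <= arity}
    free_positions = [i for i in range(1, arity + 1) if i not in bound]

    def applied(*free_args: Any) -> Any:
        if len(free_args) != len(free_positions):
            raise TypeError(f"expected {len(free_positions)} argument(s)")
        merged = dict(bound)
        merged.update(zip(free_positions, free_args))  # fill the unbound slots in order
        return function(*(merged[i] for i in range(1, arity + 1)))

    return applied


def make_date_time(date: str, time: str) -> str:
    """Assumed toy function standing in for a two-argument constructor."""
    return f"{date}T{time}"


midnight = partial_apply(make_date_time, {2: "00:00:00"})
print(midnight("2025-03-01"))

# Binding every argument yields a zero-arity function.
fixed = partial_apply(make_date_time, {1: "2025-03-01", 2: "12:00:00"})
print(fixed())
```

Because the positions are data rather than syntax, the caller does not need to know the arity of the function when writing the call.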
| Expression: | let $f := partial-apply(dateTime#2, {2: xs:time('00:00:00') })
return $f(xs:date('2025-03-01')) |
|---|---|
| Result: | xs:dateTime('2025-03-01T00:00:00') |
Maps were introduced as a new datatype in XDM 3.1. This section describes functions that operate on maps.
A map is a kind of item.
[Definition] A map consists of a sequence of entries, also known as key-value pairs. Each entry comprises a key which is an arbitrary atomic item, and an arbitrary sequence called the associated value.
[Definition] Within a map, no two entries have the same key. Two atomic items K1 and K2 are the same key for this purpose if the function call fn:atomic-equal($K1, $K2) returns true.
It is not necessary that all the keys in a map should be of the same type (for example, they can include a mixture of integers and strings).
Maps are immutable, and have no identity separate from their content. For example, the map:remove function returns a map that differs from the supplied map by the omission (typically) of one entry, but the supplied map is not changed by the operation. Two calls on map:remove with the same arguments return maps that are indistinguishable from each other; there is no way of asking whether these are “the same map”.
A map can also be viewed as a function from keys to associated values. To achieve this, a map is also a function item. The function corresponding to the map has the signature function($key as xs:anyAtomicType) as item()*. Calling the function has the same effect as calling the map:get function: the expression $map($key) returns the same result as map:get($map, $key). For example, if $books-by-isbn is a map whose keys are ISBNs and whose associated values are book elements, then the expression $books-by-isbn("0470192747") returns the book element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to higher-order functions that expect a function item as one of their arguments.
The functions defined in this section use a conventional namespace prefix map, which is assumed to be bound to the namespace URI http://www.w3.org/2005/xpath-functions/map.
The function call map:get($map, $key) can be used to retrieve the value associated with a given key.
There is no operation to atomize a map or convert it to a string. The function fn:serialize can in some cases be used to produce a JSON representation of a map.
| Function | Meaning |
|---|---|
| map:build | Returns a map that typically contains one entry for each item in a supplied input sequence. |
| map:contains | Tests whether a supplied map contains an entry for a given key. |
| map:empty | Returns true if the supplied map contains no entries. |
| map:entries | Returns a sequence containing all the key-value pairs present in a map, each represented as a single-entry map. |
| map:entry | Returns a single-entry map that represents a single key-value pair. |
| map:filter | Selects entries from a map, returning a new map. |
| map:find | Searches the supplied input sequence and any contained maps and arrays for a map entry with the supplied key, and returns the corresponding values. |
| map:for-each | Applies a supplied function to every entry in a map, returning the sequence concatenationXP of the results. |
| map:get | Returns the value associated with a supplied key in a given map. |
| map:items | Returns a sequence containing all the values present in a map, in order. |
| map:keys | Returns a sequence containing all the keys present in a map. |
| map:keys-where | Returns a sequence containing selected keys present in a map. |
| map:merge | Returns a map that combines the entries from a number of existing maps. |
| map:of-pairs | Returns a map that combines data from a sequence of key-value pair maps. |
| map:pair | Returns a key-value pair map that represents a single key-value pair. |
| map:pairs | Returns a sequence containing all the key-value pairs present in a map, each represented as a key-value pair map. |
| map:put | Returns a map containing all the contents of the supplied map, but with an additional entry, which replaces any existing entry for the same key. |
| map:remove | Returns a map containing all the entries from a supplied map, except those having a specified key. |
| map:size | Returns the number of entries in the supplied map. |
Returns a map that typically contains one entry for each item in a supplied input sequence.
map:build( | ||
$input | as , | |
$key | as | := fn:identity#1, |
$value | as | := fn:identity#1, |
$options | as | := {} |
) as | ||
This function is deterministic, context-independent, and focus-independent.
Informally, the function processes each item in $input in order. It calls the $key function on that item to obtain a sequence of key values, and the $value function to obtain an associated value. Then, for each key value:
If the key is not already present in the target map, the processor adds a new key-value pair to the map, with that key and that value.
If the key is already present, the processor combines the new value for the key with the existing value; the way they are combined is determined by the duplicates option.
By default, when two duplicate entries occur:
A single combined entry will be present in the result.
This entry will contain the sequence concatenationXP of the supplied values.
The position of the combined entry in the entry orderDM of the result map will correspond to the position of the first of the duplicates.
The key of the combined entry will correspond to the key of one of the duplicates: it is implementation-dependent which one is chosen. (It is possible for two keys to be considered duplicates even if they differ: for example, they may have different type annotations, or they may be xs:dateTime values in different timezones.)
The $options argument can be used to control the way in which duplicate keys are handled. The allowed options, and their meanings, are the same as for the map:of-pairs function. The option parameter conventions apply.
The effect of the function is equivalent to the result of the following XPath expression.
for-each(
  $input,
  fn($item, $pos) {
    for-each($key($item, $pos), fn($k) {
      map:pair($k, $value($item, $pos))
    })
  }
)
=> map:of-pairs($options)
An error is raised [err:FOJS0003] if the value of $options indicates that duplicates are to be rejected, and a duplicate key is encountered.
An error is raised [err:FOJS0005] if the value of $options includes an entry whose key is defined in this specification, and whose value is not a permitted value for that key.
The default function for both $key and $value is the identity function. Although it is permitted to default both, this serves little purpose: usually at least one of these arguments will be supplied.
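The informal rules above can be modelled in Python as a rough sketch. The name `map_build` is hypothetical, sequences are represented as Python lists, and the `duplicates` parameter models only the case where the duplicates option is a combining function. With no combining function, values for duplicate keys are concatenated, which is the default behavior described above.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def map_build(
    input_items: Iterable[Any],
    key: Optional[Callable[[Any, int], List[Any]]] = None,
    value: Optional[Callable[[Any, int], Any]] = None,
    duplicates: Optional[Callable[[Any, Any], Any]] = None,
) -> Dict[Any, Any]:
    """Model of the informal description; positions are 1-based."""
    key = key or (lambda item, pos: [item])      # identity, as a one-key sequence
    value = value or (lambda item, pos: [item])  # identity, as a one-item sequence
    combine = duplicates or (lambda old, new: old + new)  # default: concatenate
    result: Dict[Any, Any] = {}
    for pos, item in enumerate(input_items, start=1):
        new_value = value(item, pos)
        for k in key(item, pos):  # the key function may return several keys
            if k in result:
                # The entry keeps the position of the first duplicate.
                result[k] = combine(result[k], new_value)
            else:
                result[k] = new_value
    return result


# Group the integers 1 to 10 by remainder modulo 3.
groups = map_build(range(1, 11), key=lambda n, pos: [n % 3])
print(groups)

# Total string length per initial letter, combining duplicates by addition.
lengths = map_build(
    ["apple", "apricot", "banana", "blueberry", "cherry"],
    key=lambda s, pos: [s[0]],
    value=lambda s, pos: len(s),
    duplicates=lambda a, b: a + b,
)
print(lengths)
```

Python dicts preserve insertion order, so the model also reflects the rule that a combined entry takes the position of the first of the duplicates.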
| Expression: |
|
|---|---|
| Result: | {} |
| Expression: |
|
| Result: | { 0: (3, 6, 9), 1: (1, 4, 7, 10), 2: (2, 5, 8) }(Returns a map with one entry for each distinct value of |
| Expression: | map:build( 1 to 5, value := format-integer(?, "w") ) |
| Result: | { 1: "one", 2: "two", 3: "three", 4: "four", 5: "five" }(Returns a map with five entries. The function to compute the key is an identity function, the function to compute the value invokes |
| Expression: | map:build(
("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"),
substring(?, 1, 1)
) |
| Result: | {
"A": ("April", "August"),
"D": ("December"),
"F": ("February"),
"J": ("January", "June", "July"),
"M": ("March", "May"),
"N": ("November"),
"O": ("October"),
"S": ("September")
} |
| Expression: | map:build(
("apple", "apricot", "banana", "blueberry", "cherry"),
substring(?, 1, 1),
string-length#1,
{ "duplicates": op("+") }
) |
| Result: | { "a": 12, "b": 15, "c": 6 }(Constructs a map where the key is the first character of an input item, and where the corresponding value is the total string-length of the items starting with that character.) |
| Expression: | map:build(
('Wang', 'Liu', 'Zhao'),
key := fn($name, $pos) { $name },
value := fn($name, $pos) { $pos }
) |
| Result: | { "Wang": 1, "Liu": 2, "Zhao": 3 }(Returns an inverted index for the input sequence with the string stored as key and the position stored as value.) |
| Expression: | let $titles := <titles>
<title>A Beginner’s Guide to <ix>Java</ix></title>
<title>Learning <ix>XML</ix></title>
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
</titles>
return map:build($titles/title, fn($title) { $title/ix }) |
| Result: | {
"Java": (
<title>A Beginner’s Guide to <ix>Java</ix></title>,
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
),
"XML": (
<title>Learning <ix>XML</ix></title>,
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
)
} |
The following expression creates a map whose keys are employee | |
map:build(//employee, fn { @ssn }) | |
The following expression creates a map whose keys are employee | |
map:build(//employee, fn { @location }, fn { 1 }, { "duplicates": op("+") }) | |
The following expression creates a map whose keys are employee | |
map:build(
//employee,
key := fn { @location },
options := { "duplicates": fn($a, $b) { highest(($a, $b), fn { xs:decimal(@salary) }) } }
) | |
The following expression creates a map allowing efficient access to every element in a document by means of its | |
map:build(//*, generate-id#1) | |
The following expression creates a map allowing efficient access to values in a recursive JSON structure using hierarchic paths: | |
let $tree := parse-json('{
"type": "package",
"name": "org",
"content": [
{ "type": "package",
"name": "xml",
"content": [
{ "type": "package",
"name": "sax",
"content": [
{ "type": "class",
"name": "Attributes"},
{ "type": "class",
"name": "ContentHandler"},
{ "type": "class",
"name": "XMLReader"}
]
}]
}]
}')
return map:build($tree ? descendant::~[record(type, name, *)],
fn{?ancestor-or-self::name => reverse() => string-join(".")},
fn{`{?type} {?name}`}) | |
The result is the map: | |
{ "org.xml.sax.Attributes": "class Attributes",
"org.xml.sax.ContentHandler": "class ContentHandler",
"org.xml.sax.XMLReader": "class XMLReader" } | |
A JNodeDM is a wrapper around a map or array, or around a value that appears within the content of a map or array. JNodes are described at Section 8.4 JNodesDM. Wrapping a map or array in a JNode enables the use of XPath lookup expressions such as $jnode?descendant::title, as described at Section 4.13.3 Lookup ExpressionsXP.
In addition to the functions defined in this section, functions that operate on JNodes include:
Delivers a root JNodeDM wrapping a map or array, enabling the use of lookup expressions to navigate a JTreeDM rooted at that map or array.
fn:JNode( | ||
$input | as | |
) as | ||
This function is nondeterministic, context-independent, and focus-independent.
The function creates a JNodeDM that wraps the supplied map or array. Specifically, it creates a root JNode whose ¶value property is $input, and whose ¶parent, ¶position, and ¶selector properties are absent.
This has the effect that lookup expressions starting from this JNode retain information for subsequent navigation.
A JNode has unique identity. If two maps or arrays M1 and M2 have the same function identity, as determined by the function-identity function, then JNode(M1) is JNode(M2) must return true: that is, the same JNode must be delivered for both.
It is to some extent implementation-defined whether two maps or arrays have the same function identity. Processors should ensure as a minimum that when a variable $m is bound to a map or array, calling JNode($m) more than once (with the same variable reference) will deliver the same JNode each time.
The effect of the coercion rules is technically that if an existing JNode is supplied as $input, the wrapped value will be extracted, and then rewrapped as a JNode: in practice, this can be short-circuited by returning the supplied JNode unchanged.
Although fn:JNode is available as a function for user applications to call explicitly, it is also invoked implicitly by some expressions, notably when a lookup expression is written in a form such as $map?child::*. Specifically, if the left-hand operand of the lookup operator is a map or array, and the right-hand side uses an explicit axis such as child::, then the supplied map or array is implicitly wrapped in a JNode. The same is true when the deep lookup operator ?? is used.
The effect of applying fn:JNode to a map or array is that subsequent retrieval operations within the wrapped map or array return results that retain useful information about where the results were found. For example, consider an expression such as json-doc($source)??name. In this case the call on fn:JNode is implicit. This expression returns a set of JNodes representing all entries in the JTree having the key "name"; each of these JNodes contains not only the value of the relevant "name" entry, but also the key (which in this simple example is always "name") and the containing map. This means, for example, that if $result is the result of the expression json-doc($source) ?? name, then:
$result ? .. ? ssn locates the map that contained each name, and returns the value of the ssn entry in that map.
$result ? ancestor::course returns the values of any course entries in containing maps.
$result ? ancestor::* => selector() returns a sequence of map keys and array index values representing the location of the found entries within the JSON structure.
An alternative way of wrapping a map or array, rather than calling JNode($X), is to use the lookup expression $X?..
| Expression: |
|
|---|---|
| Result: | "c" |
| Expression: |
|
| Result: | 1, 2, 3, 4 |
| Expression: | let $data := {
"fr": { "capital": "Paris", "languages": [ "French" ] },
"de": { "capital": "Berlin", "languages": [ "German" ] }
}
return (JNode($data) ?? languages[. = 'German'] ? .. ? capital) => string() |
| Result: | "Berlin" |
Returns the ¶value property of a JNode.
fn:JNode-value( | ||
$input | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
If $input is an empty sequence, the function returns an empty sequence.
Otherwise, the function returns the ¶value property of $input.
In many cases it is unnecessary to make an explicit call on JNode-value, because the coercion rules will take care of this automatically. For example, in an expression such as $X ? descendant::name [matches(., '^J')], the call on matches is supplied with a JNode as its first argument; atomization ensures that the actual value being passed to the first argument of matches is the atomized value of the ¶value property.
One case where the function call may be needed is when computing the effective boolean value. As with XNodes, writing if (?child::*[1]) ... tests for the existence of a child; it does not test its value. To test its value, write if (JNode-value(?child::*[1])) ..., or equivalently if (xs:boolean(?child::*[1])) ....
| Expression: | let $array := [1, 3, 4.5, 7, "eight", 10]
return $array ? child::~xs:integer =!> JNode-value() |
|---|---|
| Result: | 1, 3, 7, 10 |
| Expression: | let $map := {'Mo': 'Monday', 'Tu': 'Tuesday', 'We': 'Wednesday'}
return $map ? child::("Mo", "We", "Fr", "Su") =!> JNode-value() |
| Result: | "Monday", "Wednesday" |
| Expression: | let $array := [[4, 18], [30, 4, 22]]
return $array ? descendant::*[. gt 25][1] ? ancestor-or-self::* =!> JNode-value() => reverse() |
| Result: | [[4, 18], [30, 4, 22]], [30, 4, 22], 30 |
Returns the ¶selector property of a JNode.
fn:JNode-selector( | ||
$input | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
If $input is an empty sequence, the function returns an empty sequence.
If $input is a root JNode (one in which the ¶selector property is absent), the function returns an empty sequence.
Otherwise, the function returns the ¶selector property of $input. In the case where the parent JNode wraps a map, this will be the key of the relevant entry within that map; in the case where the parent JNode wraps an array, it will be the 1-based index of the relevant member of the array.
| Expression: | let $array := [1, 3, 4.5, 7, "eight", 10]
return $array ? child::~xs:integer =!> JNode-selector() |
|---|---|
| Result: | 1, 2, 4, 6 |
| Expression: | let $map := {'Mo': 'Monday', 'Tu': 'Tuesday', 'We': 'Wednesday'}
return $map ? child::("Mo", "We", "Fr", "Su") =!> JNode-selector() |
| Result: | "Mo", "We" |
| Expression: | let $array := [[4, 18], [30, 4, 22]]
return $array ? descendant::*[. gt 25][1] ? ancestor-or-self::* =!> JNode-selector() => reverse() |
| Result: | 2, 1 |
Returns the ¶position property of a JNode.
fn:JNode-position( | ||
$input | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
If $input is an empty sequence, the function returns an empty sequence.
If $input is a root JNode (one in which the ¶position property is absent), the function returns an empty sequence.
Otherwise, the function returns the ¶position property of $input. The value of this property will be 1 (one) except in cases where the value of an entry in a map, or a member in an array, is a sequence that contains multiple items including maps and/or arrays; in such cases the position will be the 1-based position of the relevant map or array.
This function is relevant only when there are maps whose entries are multi-item sequences that include maps and arrays, or arrays whose members include such multi-item sequences. Such structures are uncommon, and never arise from parsing of JSON source text. It is generally best to avoid such structures by using arrays rather than sequences within array and map content; apart from other considerations, this allows the data to be serialized in JSON format.
If an entry within a map, or a member of an array, contains a sequence of items that mixes arrays and maps with other content (for example the array [1, 2, ([1,2], [3,4], 5)]), then a lookup using the child axis will only construct JNodes in respect of those items that are non-empty maps or arrays. This may leave gaps in the position numbering sequence, as illustrated in the examples below.
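The numbering rule can be modelled in Python as a rough sketch. The function name `child_rows` is hypothetical, a Python list of items stands in for the sequence held in a map entry, Python lists and dicts stand in for arrays and maps, and a tuple stands in for a nested multi-item sequence.

```python
from typing import Any, Dict, List


def child_rows(sequence: List[Any]) -> List[Dict[str, Any]]:
    """List the children reachable through each non-empty array or map in `sequence`."""
    rows = []
    for position, item in enumerate(sequence, start=1):
        if isinstance(item, list) and item:
            members = list(enumerate(item, start=1))  # selector is the 1-based index
        elif isinstance(item, dict) and item:
            members = list(item.items())              # selector is the key
        else:
            # Atomic items and empty containers contribute nothing,
            # but the position counter still advances past them.
            continue
        for selector, value in members:
            rows.append({"position": position, "selector": selector, "value": value})
    return rows


# The value of entry "b" in the first example: the empty array at
# position 2 and the integer at position 3 leave a gap in the numbering.
b_arrays = [[40, 50, 60], [], 0, [70, 80, (90, 100)]]
print([row["position"] for row in child_rows(b_arrays)])

# The value of entry "b" in the second example: the empty map at position 2 is skipped.
b_maps = [{"x": 40, "y": 50, "z": 60}, {}, {"x": 70, "y": 80, "z": (90, 100)}]
print([row["position"] for row in child_rows(b_maps)])
```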
| Expression: | let $input := {
"a": [10, 20, 30],
"b": ([40, 50, 60], [], 0, [70, 80, (90, 100)])
}
return $input ? child::b ? *
! { "position": JNode-position(.),
"index": JNode-selector(.),
"value": JNode-value(.)
} |
|---|---|
| Result: | { "position": 1, "index": 1, "value": 40 },
{ "position": 1, "index": 2, "value": 50 },
{ "position": 1, "index": 3, "value": 60 },
{ "position": 4, "index": 1, "value": 70 },
{ "position": 4, "index": 2, "value": 80 },
{ "position": 4, "index": 3, "value": (90, 100) } |
| Expression: | let $input := {
"a": {"x": 10, "y": 20, "z": 30},
"b": ( {"x": 40, "y": 50, "z": 60},
{},
{"x": 70, "y": 80, "z": (90, 100)})
}
return $input ? child::b ? *
! { "position": JNode-position(.),
"key": JNode-selector(.),
"value": JNode-value(.)
} |
| Result: | { "position": 1, "key": "x", "value": 40 },
{ "position": 1, "key": "y", "value": 50 },
{ "position": 1, "key": "z", "value": 60 },
{ "position": 3, "key": "x", "value": 70 },
{ "position": 3, "key": "y", "value": 80 },
{ "position": 3, "key": "z", "value": (90, 100) } |
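The gaps in the position numbering can be pictured with a small model. The following Python sketch is illustrative only and is not part of this specification. It assumes an encoding in which a Python tuple stands for an XDM sequence, a list for an array, and a dict for a map; the helper name `contributing_positions` is hypothetical. It reports the 1-based positions of the items that are non-empty maps or arrays, which are the only items for which the child axis constructs JNodes.

```python
def contributing_positions(sequence):
    """Return the 1-based positions of items that are non-empty maps or arrays.

    Assumed encoding (not defined by the specification): a tuple is an XDM
    sequence, a list is an array, and a dict is a map.
    """
    positions = []
    for position, item in enumerate(sequence, start=1):
        # Atomic items and empty maps/arrays keep their position number
        # but contribute no JNodes, which is what produces the gaps.
        if isinstance(item, (list, dict)) and len(item) > 0:
            positions.append(position)
    return positions


# The value of entry "b" in the first example above.
arrays = ([40, 50, 60], [], 0, [70, 80, (90, 100)])
# The value of entry "b" in the second example above.
maps = ({"x": 40, "y": 50, "z": 60}, {}, {"x": 70, "y": 80, "z": (90, 100)})

print(contributing_positions(arrays))  # [1, 4]
print(contributing_positions(maps))    # [1, 3]
```

The positions 2 and 3 are skipped in the first case, and position 2 in the second, matching the position values shown in the results above.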
A function is provided to make a modified copy of a tree rooted at either an XNode or JNode.
Updates the contents of a tree of XNodes or JNodes, returning a modified copy.
fn:update( | ||
$root | as , | |
$select | as , | |
$action | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
If $root is an empty sequence, the function returns an empty sequence.
Informally, the function returns a modified copy of $root, in which any GNodes appearing in the value of $select are modified by applying the function $action.
The result of the $action function must be compatible with its input. Specifically,
If the input is an attribute node, the result must be a sequence of zero or more attribute nodes. These replace any existing attribute nodes having the same name; but if two or more replacement attribute nodes on the same element have the same name, then an error is raised.
If the input is an element, text, comment, or processing instruction node, then the result must be a sequence of element, text, comment, or processing instruction nodes.
If the input is a JNode representing an entry in a map (that is, the JNode J has JNode-parent(J) instance of JNode(map(*))), the result must be a map. The entries in this map replace any existing map entries with the same key, but if two or more replacement map entries have the same key, then an error is raised.
If the input is a JNode representing a member of an array (that is, the JNode J has JNode-parent(J) instance of JNode(array(*))), the result must be an array. The members of this array replace the selected array member.
Note:
The GNode supplied to the $action function will always be in one of the above categories.
The effect of the function is equivalent to the following XSLT 4.0 implementation, except for error handling:
<xsl:function name="fn:update" as="GNode()?">
<xsl:param name="root" as="GNode()?"/>
<xsl:param name="select" as="GNode()*"/>
<xsl:param name="action" as="fn(GNode()) as GNode()*"/>
<!-- Function to process an individual GNode. If it is a selected GNode,
call the supplied $action function. Otherwise, call fn:update
to process it recursively using the same options -->
<xsl:variable
name="process" as="fn(GNode()) as GNode()*"
select="fn{ if (. intersect $select)
then $action(.)
else update(., $select, $action) }"/>
<xsl:choose>
<!-- Processing for XNodes -->
<xsl:when test="$root instance of node()">
<xsl:copy select="$root">
<xsl:sequence select="$root ! (@*, node()) ! $process(.)"/>
</xsl:copy>
</xsl:when>
<!-- Processing for JNodes that wrap maps -->
<xsl:when test="$root instance of JNode(map(*))"
select="map:merge( $root ? child::* ! $process(.) ) => JNode()"/>
<!-- Processing for JNodes that wrap arrays -->
<xsl:when test="$root instance of JNode(array(*))"
select="array:join( $root ? child::* ! $process(.) ) => JNode()"/>
<!-- Processing anything else -->
<xsl:otherwise select="()"/>
</xsl:choose>
</xsl:function>
The $select argument identifies those GNodes within the tree rooted at $root that are to be replaced.
A GNode selected by the $select argument is effectively ignored if:
it does not have $root as an ancestor; or
it has an ancestor GNode that is itself selected in $select.
A dynamic error occurs if the replacement for a selected GNode is unsuitable. For example, an error will arise if an attribute is replaced by an element, or an element by an attribute (unless it happens to be the last attribute or the first element).
When updating an XTree, each node in the new tree has new node identity. Although optimizations are always possible, the complexities of handling node identity, type annotations, in-scope namespaces, and parent pointers mean that in practice, it is likely that the function will make a physical copy of the entire tree.
By contrast, when updating a JTree, none of these complexities arise, and by using persistent (also known as immutable or functional) data structures, an implementation may well be able to reuse those parts of the JTree that are not affected by changes, meaning that the cost in time and space will be proportional to the extent of the change, not to the size of the input tree.
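The JTree case can be modelled in a few lines. The following Python sketch is illustrative only and is not the equivalent implementation given above. It assumes that maps are Python dicts, arrays are lists, and that selected nodes are identified by a path of keys and 1-based indexes rather than by JNode identity. The function name `update`, the path encoding, and the two-argument `action(selector, value)` signature are assumptions made for the sketch. Detection of duplicate replacement keys is omitted.

```python
def update(value, select, action, path=()):
    """Return a modified copy of a nested dict/list structure.

    select is a set of paths (tuples of map keys and 1-based array indexes).
    action(selector, value) must return a dict when replacing a map entry
    and a list when replacing an array member.
    """
    if isinstance(value, dict):
        result = {}
        for key, child in value.items():
            child_path = path + (key,)
            if child_path in select:
                replacement = action(key, child)
                if not isinstance(replacement, dict):
                    raise TypeError("a map entry must be replaced by a map")
                result.update(replacement)
            else:
                result[key] = update(child, select, action, child_path)
        return result
    if isinstance(value, list):
        result = []
        for index, child in enumerate(value, start=1):
            child_path = path + (index,)
            if child_path in select:
                replacement = action(index, child)
                if not isinstance(replacement, list):
                    raise TypeError("an array member must be replaced by an array")
                result.extend(replacement)
            else:
                result.append(update(child, select, action, child_path))
        return result
    # Atomic values are returned unchanged.
    return value


tree = {1: ["a", "b", "c"], 2: ["x", "y", "z"]}
second_members = {(1, 2), (2, 2)}

print(update(tree, second_members, lambda selector, item: [item.upper()]))
# {1: ['a', 'B', 'c'], 2: ['x', 'Y', 'z']}
print(update(tree, second_members, lambda selector, item: []))
# {1: ['a', 'c'], 2: ['x', 'z']}
```

Because the traversal stops descending once a selected node has been replaced, a selected node that lies beneath another selected node is never visited, which mirrors the rule that such nodes are effectively ignored.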
| Expression: | let $tree := parse-xml('<a><b v="1"/><b v="2"/></a>')
return update($tree, $tree//@v, fn{.+1}) |
|---|---|
| Result: | parse-xml('<a><b v="2"/><b v="3"/></a>') (Modifies the value of selected attributes.) |
| Expression: | let $tree := parse-xml('<a><b v="1"/><b v="2"/></a>')
return update($tree, $tree//@v, fn{()}) |
| Result: | parse-xml('<a><b/><b/></a>') (Deletes selected attributes.) |
| Expression: | let $tree := <a><b v="1"/><b v="2"/></a>
return update($tree, $tree//@v[. gt 1], fn{., attribute #new {'true'}}) |
| Result: | <a><b v="1"/><b v="2" new="true"/></a> (Inserts new attributes under specified conditions.) |
| Expression: | let $tree := <a><b/><b>12</b></a>
return update($tree, $tree//b[empty(child::node())], fn{., text {'default'}}) |
| Result: | <a><b>default</b><b>12</b></a> (Expands empty elements with default values.) |
| Expression: | let $tree := {1: ["a", "b", "c"], 2: ["x", "y", "z"]}
return update($tree, $tree/?*?2, fn{[upper-case(.)]}) |
| Result: | {1: ["a", "B", "c"], 2: ["x", "Y", "z"]} (Updates selected members of selected arrays.) |
| Expression: | let $tree := {1: ["a", "b", "c"], 2: ["x", "y", "z"]}
return update($tree, $tree/?*?2, fn{[]}) |
| Result: | {1: ["a", "c"], 2: ["x", "z"]} (Deletes selected members of selected arrays.) |
The functions in this section deliver information about schema types (including simple types and complex types). These may represent built-in types (such as xs:dateTime), user-defined types found in the static context (typically because they appear in an imported schema), or types used as type annotations on schema-validated nodes.
For more information on schema types, see 1.8.2 Schema Type Hierarchy. The properties of a schema type are described in terms of the properties of a Simple Type Definition or Complex Type Definition component as described in Section 3.16.1 The Simple Type Definition Schema Component XS11-1 and Section 3.4.1 The Complex Type Definition Schema Component XS11-1 respectively. Not all properties are exposed.
The structured representation of a schema type is described in 21.1.1 Record fn:schema-type-record.
Note:
Simple properties of a schema type that can be expressed as strings or booleans are represented in this record structure directly as atomic field values, while complex properties whose values are themselves types (for example, base-type and primitive-type) are represented as functions. This is done partly to make it easier for implementations to compute complex properties on demand rather than in advance, and partly to ensure that the overall structure is always acyclic. For example, the primitive type of xs:decimal is itself xs:decimal, and if this were represented as a field value without a guarding function, serialization of the map using the JSON output method would not terminate.
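The reason for guarding type-valued properties with functions can be seen in a small model. The following Python sketch is illustrative only. It represents a record as a dict and a type name as a plain string (the actual record uses an xs:QName); the tables `BASE_TYPE` and `PRIMITIVE_TYPE` and the function `schema_type` are hypothetical names covering only a handful of built-in types. As an assumption of the model, xs:anyType is treated as the top of the hierarchy, where the base-type function returns None to stand for the empty sequence.

```python
# Base-type links for a few built-in types, as defined by XML Schema.
BASE_TYPE = {
    "xs:nonNegativeInteger": "xs:integer",
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:anyAtomicType": "xs:anySimpleType",
    "xs:anySimpleType": "xs:anyType",
    "xs:anyType": None,  # assumption: top of the hierarchy in this model
}
PRIMITIVE_TYPE = {
    "xs:nonNegativeInteger": "xs:decimal",
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:decimal",
}


def schema_type(name):
    """Build a record whose type-valued fields are computed on demand."""
    base = BASE_TYPE[name]
    record = {
        "name": name,
        # A function rather than a nested record: nothing is built until called.
        "base-type": lambda: schema_type(base) if base is not None else None,
    }
    if name in PRIMITIVE_TYPE:
        primitive = PRIMITIVE_TYPE[name]
        record["primitive-type"] = lambda: schema_type(primitive)
    return record


decimal = schema_type("xs:decimal")
# The primitive type of xs:decimal is xs:decimal itself; the cycle is only
# followed when the function is called, so the record remains finite.
print(decimal["primitive-type"]()["name"])  # xs:decimal

non_negative = schema_type("xs:nonNegativeInteger")
print(non_negative["base-type"]()["base-type"]()["name"])  # xs:decimal
```

Had the primitive type been stored directly as a nested record, constructing the record for xs:decimal would require the record for xs:decimal, and so on without end.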
| Function | Meaning |
|---|---|
| fn:schema-type | Returns a record containing information about a named schema type in the static context. |
| fn:type-of | Returns information about the type of a value, as a string. |
| fn:atomic-type-annotation | Returns a record containing information about the type annotation of an atomic value. |
| fn:node-type-annotation | Returns a record containing information about the type annotation of an element or attribute node. |
This record type represents the properties of a simple or complex type in a schema.
| Name | Meaning |
|---|---|
| The name of the type. Empty in the case of an anonymous type. Corresponds to {name}XS11-1 and {target namespace}XS11-1 in the XSD component model for simple and complex type components.
|
| True for a simple type, false for a complex type.
|
| Function item returning the base type (the type from which this type is derived by restriction or extension). The function is always present, and returns an empty sequence in the case of the type
|
| For an atomic type, a function item returning the primitive type from which this type is ultimately derived. Corresponds to the {primitive type definition}XS11-1 in the XSD component model for simple types. Absent if the type is non atomic, or if it is the simple type
|
| For a simple type, one of
|
| For a simple type with variety
|
| For a complex type with variety
|
| For a generalized atomic typeXP, a function item that can be called to establish whether the supplied atomic item is an instance of this type. In all other cases, absent.
|
| For a simple type, a function item that can be used to construct instances of this type. In the case of a named type that is present in the dynamic context, the result is the same function as returned by
|
| The record type is extensible (it may contain additional fields beyond those listed). |
Returns a record containing information about a named schema type in the static context.
fn:schema-type( | ||
$name | as | |
) as schema-type-record? | ||
This function is deterministic, context-dependent, and focus-independent.
If the static context (specifically, the in-scope schema typesXP) includes a schema type whose name matches $name, the function returns a schema-type-record containing information about that schema type. If not, it returns an empty sequence.
| Expression: |
|
|---|---|
| Result: | #xs:integer |
| Expression: |
|
| Result: | #xs:decimal |
| Expression: |
|
| Result: | #xs:nonNegativeInteger |
| Expression: |
|
| Result: | true() |
| Expression: |
|
| Result: | "union" |
| Expression: |
|
| Result: | #xs:double, #xs:float, #xs:decimal |
Returns information about the type of a value, as a string.
fn:type-of( | ||
$value | as | |
) as | ||
This function is deterministic, context-independent, and focus-independent.
The function returns a string, whose lexical form will always match the grammar of SequenceTypeXP, representing a sequence type that matches $value.
If $value is the empty sequence, the function returns the string "empty-sequence()".
Otherwise, the returned string is the concatenation of:
A string representing the distinct item types that are present in $value, formed as follows:
For each item in $value, construct a string representing its item type as described below.
Eliminate duplicate strings from this list by applying the fn:distinct-values function, forming a sequence of strings $ss.
If $ss contains only one string, use that string.
Otherwise, return the result of the expression `({ fn:string-join($ss, "|") })`.
An occurrence indicator: absent if $value contains exactly one item, or "+" if it contains more than one item.
The string representing the type of an individual item J is constructed as follows:
If J is an XNodeDM, the result is one of the following strings, determined by the node kind of the node (see Section 7.5.9 node-kind AccessorDM):
"document-node()", "element()", "attribute()", "text()", "processing-instruction()", "comment()", "namespace-node()"
If J is a JNodeDM, the result is in the form JNode(T), where T is the result of applying the type-of function to the ¶value property of J.
If J is an atomic item, the result is a string chosen as follows:
Let T be the type denoted by the type annotation of J.
If T is an anonymous type, set T to the base type of T, and repeat until a type is reached that is not anonymous.
If the name of T is in the namespace http://www.w3.org/2001/XMLSchema, return the string "xs:local" where local is the local part of the name of T.
Otherwise, return the name of T in the form of a URIQualifiedNameXP (that is, "Q{uri}local", or "Q{}local" if the name is in no namespace).
If J is a function item:
If J is an array, return "array(*)".
If J is a map, return "map(*)".
Otherwise, return "function(*)".
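The construction of the result string can be sketched as follows. This Python model is illustrative only and covers atomic items, arrays, maps, and other function items; XNodes and JNodes are omitted. It assumes that a Python tuple stands for the sequence $value, that Python int, Decimal, float, str, and bool stand for xs:integer, xs:decimal, xs:double, xs:string, and xs:boolean, and that duplicates are eliminated in order of first occurrence. The names `item_type` and `type_of` are hypothetical.

```python
from decimal import Decimal


def item_type(item):
    """Return the string for a single item (atomic items and function items only)."""
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(item, bool):
        return "xs:boolean"
    if isinstance(item, int):
        return "xs:integer"
    if isinstance(item, Decimal):
        return "xs:decimal"
    if isinstance(item, float):
        return "xs:double"
    if isinstance(item, str):
        return "xs:string"
    if isinstance(item, list):
        return "array(*)"
    if isinstance(item, dict):
        return "map(*)"
    if callable(item):
        return "function(*)"
    raise TypeError("item kind not covered by this sketch")


def type_of(value):
    """Return a sequence type string for a tuple of items."""
    if len(value) == 0:
        return "empty-sequence()"
    # Distinct item type strings, in order of first occurrence.
    distinct = list(dict.fromkeys(item_type(item) for item in value))
    if len(distinct) == 1:
        core = distinct[0]
    else:
        core = "(" + "|".join(distinct) + ")"
    # Occurrence indicator: absent for one item, "+" for more than one.
    return core + ("" if len(value) == 1 else "+")


print(type_of(()))                    # empty-sequence()
print(type_of((1, 2)))                # xs:integer+
print(type_of((1, Decimal("2.5"))))   # (xs:integer|xs:decimal)+
print(type_of(([1, 2],)))             # array(*)
```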
If the $value argument is omitted and the context value is absentDM, the function raises type error [err:XPDY0002]XP.
In general, an item matches more than one type, and there are cases where there is no single matching type that is more specific than all the others. This is especially true with functions, maps, and arrays. This function therefore selects one of the types that matches the item, which is not necessarily the most specific type.
This function should not be used as a substitute for an instance of test. The precise type annotation of the result of an expression is not always predictable, because processors are free to deliver a more specific type than is mandated by the specification. For example, if $n is of type xs:positiveInteger, then the result of abs($n) is guaranteed to be an instance of xs:integer, but an implementation might reasonably return the supplied value unchanged: that is, a value whose actual type annotation is xs:positiveInteger. Similarly the type annotation of the value returned by position() might be xs:long rather than xs:integer.
Implementations should, however, refrain from exposing types that are purely internal. For example, an implementation might have an optimized internal representation for strings consisting entirely of ASCII characters, or for single-character strings; if this is the case then the type annotation returned by this function should be a user-visible supertype such as xs:string.
| Variables | |
|---|---|
let $e := <doc> <p id="alpha" xml:id="beta">One</p> <p id="gamma" xmlns="http://example.com/ns">Two</p> <ex:p id="delta" xmlns:ex="http://example.com/ns">Three</ex:p> <?pi 3.14159?> </doc> | |
| Expression | Result |
|---|---|
| "element()" |
| "element()+" |
| "attribute()" |
| "processing-instruction()" |
| "empty-sequence()" |
| "(element()|processing-instruction())+" |
| "xs:integer" |
| "xs:integer+" |
| "(xs:integer|xs:decimal)+" |
| "array(*)" |
| "map(*)" |
| "function(*)" |
| "JNode(array(*))" |
Returns a record containing information about the type annotation of an atomic value.
fn:atomic-type-annotation( | ||
$value | as | |
) as schema-type-record | ||
This function is deterministic, context-independent, and focus-independent.
Given an atomic value, the function returns a schema-type-record containing information about the atomic type represented by its type annotationDM.
The result will always have ?is-simple = true() and ?variety = "atomic". In a non-schema-aware environment the type will always be a built-in atomic type in the xs namespace: see 1.8.3 Atomic Type Hierarchy. Where a schema is in use, however, the result may be an atomic type defined in the schema, which may be an anonymous type.
Note that under the function coercion rules, it is possible to supply a node as the argument, which will then be atomized. In simple cases the type annotation on the atomized value will be the same as the type annotation on the node. But this is not always true: for example the type annotation on the node might be a complex type with simple content, while the type annotation on its atomized value is the corresponding simple content type. To get the type annotation on the node, use the function fn:node-type-annotation.
This function should not be used as a substitute for an instance of test. The precise type annotation of the result of an expression is not always predictable, because processors are free to deliver a more specific type than is mandated by the specification. For example, if $n is of type xs:positiveInteger, then the result of abs($n) is guaranteed to be an instance of xs:integer, but an implementation might reasonably return the supplied value unchanged: that is, a value whose actual type annotation is xs:positiveInteger. Similarly the type annotation of the value returned by position() might be xs:long rather than xs:integer.
Implementations should, however, refrain from exposing types that are purely internal. For example, an implementation might have an optimized internal representation for strings consisting entirely of ASCII characters, or for single-character strings; if this is the case then the type annotation returned by this function should be a user-visible supertype such as xs:string.
| Expression: | atomic-type-annotation(23) ? name |
|---|---|
| Result: | #xs:integer |
| Expression: | let $x := 23, $y := 93.7 return atomic-type-annotation($x) ? matches($y) |
| Result: | false() |
| Expression: | atomic-type-annotation(xs:numeric('23.2')) ? name |
| Result: | #xs:double |
Returns a record containing information about the type annotation of an element or attribute node.
fn:node-type-annotation( | ||
$node | as | |
) as schema-type-record | ||
This function is deterministic, context-independent, and focus-independent.
Given an element or attribute node, the function returns a schema-type-record containing information about the schema type represented by its type annotationDM.
For an element that has not been schema-validated, the type annotation is always xs:untyped.
For an attribute that has not been schema-validated, the type annotation is always xs:untypedAtomic.
The type annotation of an attribute node is always a simple type; the type annotation of an element node may be simple or complex.
| Expression: | let $e := parse-xml("<e/>")/*
return node-type-annotation($e) ? name |
|---|---|
| Result: | #xs:untyped |
| Expression: | let $a := parse-xml("<e a='3'/>")//@a
return node-type-annotation($a) ? name |
| Result: | #xs:untypedAtomic |
| Expression: | let $x := json-to-xml('[23, 24]', { 'validate': true() })
return node-type-annotation($x/*) ? name |
| Result: | #fn:arrayType |
| Expression: | let $x := json-to-xml('[23, 24]', { 'validate': true() })
let $n23 := $x//fn:number[. = 23]
let $type := node-type-annotation($n23)
return ($type ? name,
$type ? base-type() ? name,
$type ? base-type() ? base-type() ? name) |
| Result: | #fn:numberType, #fn:finiteNumberType, #xs:double |
Constructor functions are used to convert a supplied value to a given type, and the name of the function is the same as the name of the target type. This section describes constructor functions corresponding to the following types:
Simple types (atomic types, union types, and list types as defined in [XML Schema Part 2: Datatypes Second Edition]), which are present in the static context either because they appear in the in-scope schema typesXP or because they appear as named item typesXP.
These constructor functions always take a single argument.
Record types defined as named item typesXP.
These take one argument for each named field of the record type. Constructor functions for record types are defined in 22.6 Constructor functions for named record types.
Constructor functions are defined for all user-defined named simple types, and for most built-in atomic, list, and union types. The only named simple types that have no constructor function are those that have no instances other than instances of their derived types: specifically, xs:anySimpleType, xs:anyAtomicType, and xs:NOTATION.
Every built-in atomic type that is defined in [XML Schema Part 2: Datatypes Second Edition], except xs:anyAtomicType and xs:NOTATION, has an associated constructor function. The type xs:untypedAtomic, defined in Section 2.7 Schema Information DM31 and the two derived types xs:yearMonthDuration and xs:dayTimeDuration defined in Section 2.7 Schema Information DM31 also have associated constructor functions. Implementations may additionally provide a constructor function for the new datatype xs:dateTimeStamp introduced in [XSD 1.1 Part 2].
A constructor function is not defined for xs:anyAtomicType as there are no atomic items with type annotation xs:anyAtomicType at runtime, although this can be a statically inferred type. A constructor function is not defined for xs:NOTATION since it is defined as an abstract type in [XML Schema Part 2: Datatypes Second Edition]. If the static context (See Section 2.1.1 Static Context XP31) contains a type derived from xs:NOTATION then a constructor function is defined for it. See 22.5 Constructor functions for user-defined atomic and union types.
The form of the constructor function for an atomic type eg:TYPE is:
eg:TYPE( | ||
$value | as | := . |
) as | ||
If $value is the empty sequence, the empty sequence is returned. For example, the signature of the constructor function corresponding to the xs:unsignedInt type defined in [XML Schema Part 2: Datatypes Second Edition] is:
xs:unsignedInt( | ||
$value | as | := . |
) as | ||
Calling the constructor function xs:unsignedInt(12) returns the xs:unsignedInt value 12. Another call of that constructor function that returns the same xs:unsignedInt value is xs:unsignedInt("12"). The same result would also be returned if the constructor function were to be called with a node that had a typed value equal to the xs:unsignedInt 12. The standard features described in Section 2.4.2 Atomization XP31 would atomize the node to extract its typed value and then call the constructor with that value. If the value passed to a constructor is not in the lexical space of the datatype to be constructed, and cannot be converted to a value in the value space of the datatype under the rules in this specification, then a dynamic error is raised [err:FORG0001].
The semantics of the constructor function xs:TYPE(arg) are identical to the semantics of arg cast as xs:TYPE?. See 23 Casting.
If the argument to a constructor function is a literal, the result of the function may be evaluated statically; if an error is found during such evaluation, it may be reported as a static error.
Special rules apply to constructor functions for xs:QName and types derived from xs:QName and xs:NOTATION. See 22.2 Constructor functions for xs:QName and xs:NOTATION.
The argument is optional, and defaults to the context value (which will be atomized if necessary).
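The behavior described for xs:unsignedInt can be pictured with a small model. The following Python sketch is illustrative only and is not a conforming implementation. It assumes that None stands for the empty sequence, that the lexical check can be simplified to an optionally signed run of ASCII digits surrounded by optional whitespace, and that the hypothetical exception `ConstructorError` stands in for the dynamic error err:FORG0001. Atomization of nodes and defaulting to the context value are not modelled.

```python
import re


class ConstructorError(ValueError):
    """Stands in for the dynamic error err:FORG0001 (an assumption of this sketch)."""


_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")
MAX_UNSIGNED_INT = 4294967295  # 2**32 - 1


def unsigned_int(value):
    """Model of xs:unsignedInt($value): None in, None out; otherwise convert."""
    if value is None:
        return None
    if isinstance(value, int):
        number = int(value)
    elif isinstance(value, str):
        text = value.strip(" \t\r\n")
        if not _INTEGER_LEXICAL.fullmatch(text):
            raise ConstructorError(f"not a valid lexical form: {value!r}")
        number = int(text)
    else:
        raise TypeError("argument kind not covered by this sketch")
    if not 0 <= number <= MAX_UNSIGNED_INT:
        raise ConstructorError(f"out of range for xs:unsignedInt: {number}")
    return number


print(unsigned_int(12))    # 12
print(unsigned_int("12"))  # 12
print(unsigned_int(None))  # None
```

Both the integer 12 and the string "12" yield the same value, and an empty argument yields an empty result, as described above.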
The following constructor functions for the built-in atomic types are supported:
xs:string( | ||
$value | as | := . |
) as | ||
xs:boolean( | ||
$value | as | := . |
) as | ||
xs:decimal( | ||
$value | as | := . |
) as | ||
xs:float( | ||
$value | as | := . |
) as | ||
Implementations should return negative zero for xs:float("-0.0E0"). But because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.
xs:double( | ||
$value | as | := . |
) as | ||
Implementations should return negative zero for xs:double("-0.0E0"). But because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.
xs:duration( | ||
$value | as | := . |
) as | ||
xs:dateTime( | ||
$value | as | := . |
) as | ||
xs:time( | ||
$value | as | := . |
) as | ||
xs:date( | ||
$value | as | := . |
) as | ||
xs:gYearMonth( | ||
$value | as | := . |
) as | ||
xs:gYear( | ||
$value | as | := . |
) as | ||
xs:gMonthDay( | ||
$value | as | := . |
) as | ||
xs:gDay( | ||
$value | as | := . |
) as | ||
xs:gMonth( | ||
$value | as | := . |
) as | ||
xs:hexBinary( | ||
$value | as | := . |
) as | ||
xs:base64Binary( | ||
$value | as | := . |
) as | ||
xs:anyURI( | ||
$value | as | := . |
) as | ||
xs:QName( | ||
$value | as | := . |
) as | ||
See 22.2 Constructor functions for xs:QName and xs:NOTATION for special rules.
xs:normalizedString( | ||
$value | as | := . |
) as | ||
xs:token( | ||
$value | as | := . |
) as | ||
xs:language( | ||
$value | as | := . |
) as | ||
xs:NMTOKEN( | ||
$value | as | := . |
) as | ||
xs:Name( | ||
$value | as | := . |
) as | ||
xs:NCName( | ||
$value | as | := . |
) as | ||
xs:ID( | ||
$value | as | := . |
) as | ||
xs:IDREF( | ||
$value | as | := . |
) as | ||
xs:ENTITY( | ||
$value | as | := . |
) as | ||
See 23.1.10 Casting to xs:ENTITY for rules related to constructing values of type xs:ENTITY and types derived from it.
xs:integer( | ||
$value | as | := . |
) as | ||
xs:nonPositiveInteger( | ||
$value | as | := . |
) as | ||
xs:negativeInteger( | ||
$value | as | := . |
) as | ||
xs:long( | ||
$value | as | := . |
) as | ||
xs:int( | ||
$value | as | := . |
) as | ||
xs:short( | ||
$value | as | := . |
) as | ||
xs:byte( | ||
$value | as | := . |
) as | ||
xs:nonNegativeInteger( | ||
$value | as | := . |
) as | ||
xs:unsignedLong( | ||
$value | as | := . |
) as | ||
xs:unsignedInt( | ||
$value | as | := . |
) as | ||
xs:unsignedShort( | ||
$value | as | := . |
) as | ||
xs:unsignedByte( | ||
$value | as | := . |
) as | ||
xs:positiveInteger( | ||
$value | as | := . |
) as | ||
xs:yearMonthDuration( | ||
$value | as | := . |
) as | ||
xs:dayTimeDuration( | ||
$value | as | := . |
) as | ||
xs:untypedAtomic( | ||
$value | as | := . |
) as | ||
xs:dateTimeStamp( | ||
$value | as | := . |
) as | ||
Available only if the implementation supports XSD 1.1.
Special rules apply to constructor functions for the types xs:QName and xs:NOTATION, for two reasons:
Values cannot belong directly to the type xs:NOTATION, only to its subtypes.
The lexical representation of these types uses namespace prefixes, whose meaning is context-dependent.
These constraints result in the following rules:
There is no constructor function for xs:NOTATION. Constructors are defined, however, for xs:QName, for types derived or constructed from xs:QName, and for types derived or constructed from xs:NOTATION.
When converting from an xs:string, the prefix within the lexical xs:QName supplied as the argument is resolved to a namespace URI using the statically known namespaces from the static context. If the lexical xs:QName has no prefix, the namespace URI of the resulting expanded-QName is the default namespace for elements and types, taken from the static context. Components of the static context are defined in Section 2.1.1 Static Context XP31. A dynamic error is raised [err:FONS0004] if the prefix is not bound in the static context. As described in Section 2.1 Terminology DM31, the supplied prefix is retained as part of the expanded-QName value.
When a constructor function for a namespace-sensitive type is used as a literal function item or in a partial function application (for example, xs:QName#1 or xs:QName(?)) the namespace bindings that are relevant are those from the static context of the literal function item or partial function application. When a constructor function for a namespace-sensitive type is obtained by means of the fn:function-lookup function, the relevant namespace bindings are those from the static context of the call on fn:function-lookup.
Note:
When the supplied argument to the xs:QName constructor function is a node, the node is atomized in the usual way, and if the result is xs:untypedAtomic it is then converted as if a string had been supplied. The effect might not be what is desired. For example, given the attribute xsi:type="my:type", the expression xs:QName(@xsi:type) might fail on the grounds that the prefix my is undeclared. This is because the namespace bindings are taken from the static context (that is, from the query or stylesheet), and not from the source document containing the @xsi:type attribute. The solution to this problem is to use the function call resolve-QName(@xsi:type, .) instead.
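The difference between the two sets of namespace bindings can be shown with a small model. The following Python sketch is illustrative only. It assumes that a set of namespace bindings is a dict from prefix to namespace URI, and that an expanded QName can be represented as a (URI, prefix, local name) tuple. The function `resolve_lexical_qname`, the exception `UnboundPrefixError` (standing in for err:FONS0004), and the example URI http://example.com/my-types are hypothetical.

```python
class UnboundPrefixError(LookupError):
    """Stands in for the dynamic error err:FONS0004 (an assumption of this sketch)."""


def resolve_lexical_qname(lexical, prefix_bindings, default_namespace=""):
    """Resolve a lexical QName against the supplied prefix bindings.

    Returns (namespace URI, prefix, local name); the prefix is retained.
    """
    prefix, colon, local = lexical.strip().partition(":")
    if not colon:
        # No prefix: the whole string is the local name.
        return (default_namespace, "", prefix)
    if prefix not in prefix_bindings:
        raise UnboundPrefixError(f"prefix {prefix!r} is not bound")
    return (prefix_bindings[prefix], prefix, local)


# Bindings declared in the query or stylesheet (the static context).
static_context = {"xs": "http://www.w3.org/2001/XMLSchema"}
# Bindings in scope on the element carrying xsi:type="my:type".
source_element = {
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "my": "http://example.com/my-types",
}

try:
    resolve_lexical_qname("my:type", static_context)
except UnboundPrefixError as error:
    print("static context:", error)

print(resolve_lexical_qname("my:type", source_element))
# ('http://example.com/my-types', 'my', 'type')
```

The first call corresponds to xs:QName(@xsi:type), which consults the static context; the second corresponds to resolve-QName(@xsi:type, .), which consults the bindings of the source element.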
Each of the three built-in list types defined in [XML Schema Part 2: Datatypes Second Edition], namely xs:NMTOKENS, xs:ENTITIES, and xs:IDREFS, has an associated constructor function.
The function signatures are as follows:
xs:NMTOKENS( | ||
$value | as | := . |
) as | ||
xs:ENTITIES( | ||
$value | as | := . |
) as | ||
xs:IDREFS( | ||
$value | as | := . |
) as | ||
The semantics are equivalent to casting to the corresponding types from xs:string.
All three of these types have the facet minLength = 1 meaning that there must always be at least one item in the list. The return type, however, allows for the fact that when the argument to the function is an empty sequence, the result is an empty sequence.
Note:
In the case of atomic types, it is possible to use an expression such as xs:date(@date-of-birth) to convert an attribute value to an instance of xs:date, knowing that this will work both in the case where the attribute is already annotated as xs:date, and also in the case where it is xs:untypedAtomic. This approach does not work with list types, because it is not permitted to use a value of type xs:NMTOKEN* as input to the constructor function xs:NMTOKENS. Instead, it is necessary to use conditional logic that performs the conversion only in the case where the input is untyped: if (@x instance of attribute(*, xs:untypedAtomic)) then xs:NMTOKENS(@x) else data(@x)
There is a constructor function for the union type xs:numeric defined in [XQuery and XPath Data Model (XDM) 3.1]. The function signature is:
xs:numeric( | ||
$value | as | := . |
) as | ||
The semantics are determined by the rules in 23.3.7 Casting to union types. These rules have the effect that:
If the argument is an instance of xs:double, xs:float, or xs:decimal, then the result is an instance of the same primitive type, with the same value;
If the argument is an instance of xs:boolean, the result is the xs:double value 0.0e0 or 1.0e0;
If the argument is an instance of xs:string or xs:untypedAtomic, then:
If the value is in the lexical space of xs:double, the result will be the corresponding xs:double value;
Otherwise, a dynamic error [err:FORG0001] occurs;
Note:
The result will never be an instance of xs:float, xs:decimal, or xs:integer. This is because xs:double appears first in the list of member types of xs:numeric, and its lexical space subsumes the lexical space of the other numeric types. Thus, unlike XPath numeric literals, the result does not depend on the lexical form of the supplied value. The reason for this design choice is to retain compatibility with the function conversion rules: functions such as fn:abs and fn:round are declared to expect an instance of xs:numeric as their first or only argument, and compatibility with the function conversion rules defined in earlier versions of these specifications demands that when an untyped atomic item (or untyped node) is supplied as the argument, it is converted to an xs:double value even if its lexical form is that (say) of an integer.
In all other cases, a dynamic error [err:FORG0001] occurs.
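The rules above can be sketched as follows. This Python model is illustrative only. It assumes that Python float stands for xs:double (xs:float is not distinguished), Decimal for xs:decimal, int for xs:integer, and None for the empty sequence. The regular expression is a simplified rendering of the xs:double lexical space, and the hypothetical exception `ConstructorError` stands in for err:FORG0001.

```python
import re
from decimal import Decimal


class ConstructorError(ValueError):
    """Stands in for the dynamic error err:FORG0001 (an assumption of this sketch)."""


# Simplified lexical space of xs:double: decimal notation with an optional
# exponent, or one of the special values INF, -INF, NaN.
_DOUBLE_LEXICAL = re.compile(
    r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?|-?INF|NaN"
)


def numeric(value):
    """Model of xs:numeric($value)."""
    if value is None:
        return None
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, Decimal, float)):
        # Already numeric: returned unchanged.
        return value
    if isinstance(value, str):
        text = value.strip(" \t\r\n")
        if _DOUBLE_LEXICAL.fullmatch(text):
            return float(text)
        raise ConstructorError(f"not in the lexical space of xs:double: {value!r}")
    raise ConstructorError("argument cannot be converted to xs:numeric")


print(type(numeric("23")).__name__)  # float
print(numeric(True))                 # 1.0
print(numeric(Decimal("1.5")))       # 1.5
```

The string "23" produces a double even though its lexical form is that of an integer, which is the behavior explained in the note above.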
In the case of an implementation that supports XSD 1.1, there is a constructor function associated with the built-in union type xs:error.
The function signature is as follows:
xs:error( | ||
$value | as | := . |
) as | ||
The semantics are equivalent to casting to the corresponding union type (see 23.3.7 Casting to union types).
Note:
Because xs:error has no member types, and therefore has an empty value space, casting will always fail with a dynamic error except in the case where the supplied argument is an empty sequence, in which case the result is also an empty sequence.
For every named user-defined simple type in the static context (See Section 2.1.1 Static Context XP31), there is a constructor function whose name is the same as the name of the type.
For named atomic types, the rules are the same as the rules for constructing built-in derived atomic types defined in 22.1 Constructor functions for XML Schema built-in atomic types. For a named atomic type T, the signature of the function takes the form T($value as xs:anyAtomicType? := .) as T?, and the semantics are the same as casting to derived types: see 23.3.1 Casting to derived types.
For named union types, the rules follow the same principles as the rules for constructing built-in union types defined in 22.4 Constructor functions for XML Schema built-in union types. For a named union type U, the signature of the function takes the form U($value as xs:anyAtomicType? := .) as U?, and the semantics are the same as casting to union types: see 23.3.7 Casting to union types.
For named list types, the rules follow the same principles as the rules for constructing built-in list types defined in 22.3 Constructor functions for XML Schema built-in list types. For a named list type L, where the item type of L is I, the signature of the function takes the form L($value as xs:string? := .) as I*, and the semantics are the same as casting to list types: see 23.3.8 Casting to list types.
Constructor functions are available both for named types defined in an imported schema (that is, named simple types in the in-scope schema typesXP), and for types defined by means of named item typesXP. Specifically, named enumeration types follow the same rules as schema types derived by restricting xs:string, and named local union types follow the same rules as union types defined in a schema.
Special rules apply to constructor functions for namespace-sensitive types, that is, atomic types derived from xs:QName and xs:NOTATION, list types that have a namespace-sensitive item type, and union types that have a namespace-sensitive member type. See 22.2 Constructor functions for xs:QName and xs:NOTATION.
Consider a situation where the static context contains an atomic type called hatSize defined in a schema whose target namespace is bound to the prefix eg. In such a case the following constructor function is available to users:
eg:hatSize($value as xs:anyAtomicType?) as eg:hatSize?
The resulting function may be used in an expression such as eg:hatSize("10½").
Note:
To construct an instance of a user-defined type that is not in a namespace, it is possible to use an EQName (for example Q{}hatsize(17)). Alternatives are to use a cast expression (17 cast as hatsize) or (if the host language allows it) to undeclare the default function namespace.
Both XQuery 4.0 and XSLT 4.0 provide syntax to declare named record types; such a declaration implicitly adds a constructor function for values of that type to the static context (see Section 2.1.1 Static Context XP31).
For example, if there is a named item type with the XQuery definition:
declare record my:location ( latitude as xs:double, longitude as xs:double )
then there will be a function definition equivalent to:
declare function my:location (
$latitude as xs:double,
$longitude as xs:double
) as my:location {
{ 'latitude': $latitude, 'longitude': $longitude }
}
Equivalently using XSLT syntax, if there is a named item type with the XSLT definition:
<xsl:record name="my:location" as="record(latitude as xs:double, longitude as xs:double)"/>
then there will be a function definition equivalent to:
<xsl:function name="my:location" as="my:location">
<xsl:param name="latitude" as="xs:double"/>
<xsl:param name="longitude" as="xs:double"/>
<xsl:map>
<xsl:map-entry key="'latitude'" select="$latitude"/>
<xsl:map-entry key="'longitude'" select="$longitude"/>
</xsl:map>
</xsl:function>
The rules defining the relationship of the function definition to the record type are given for XQuery 4.0 in Section 5.20.2 Constructor Functions for Named Record TypesXQ.
Editorial note:
TODO: Add cross-reference to XSLT here. Anticipates resolution of issue #1485.
Constructor functions and cast expressions accept an expression and return a value of a given type. They both convert a source value SV, of a source type ST, to a target value TV, of the given target type TT.
Constructor functions and cast expressions have identical semantics but different syntax. The name of the constructor function is the same as the name of the built-in [XML Schema Part 2: Datatypes Second Edition] datatype or the datatype defined in Section 2.7 Schema Information DM31 of [XQuery and XPath Data Model (XDM) 3.1] (see 22.1 Constructor functions for XML Schema built-in atomic types) or the user-derived datatype (see 22.5 Constructor functions for user-defined atomic and union types) that is the target for the conversion, and the semantics are exactly the same as for a cast expression; for example, xs:date("2003-01-01") means exactly the same as "2003-01-01" cast as xs:date?.
The cast expression takes a type name to indicate the target type of the conversion. See Section 3.14.2 Cast XP31. If the type name allows the empty sequence and the expression to be cast is the empty sequence, the empty sequence is returned. If the type name does not allow the empty sequence and the expression to be cast is the empty sequence, a type error is raised [err:XPTY0004]XP.
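The equivalence between the two syntaxes, and the treatment of the empty sequence, can be sketched as follows. This is an illustrative fragment only, assuming an XQuery 3.1 or later processor; the literal values are arbitrary.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: The constructor function and the cast expression give the same value :)
  xs:date("2003-01-01") eq ("2003-01-01" cast as xs:date?),
  (: The "?" in the target type allows an empty operand, giving an empty result :)
  empty(() cast as xs:date?),
  (: The constructor function likewise maps the empty sequence to itself :)
  empty(xs:date(()))
)
```

Writing the target as xs:date (without the "?") in the second comparison would instead raise the type error described above.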
Where the argument to a cast is a literal, the result of the function may be evaluated statically; if an error is encountered during such evaluation, it may be reported as a static error.
The general rules for casting from primitive types to primitive types are defined in 23.1 Casting from primitive types to primitive types, and subsections describe the rules for specific target types. The general rules for casting from xs:string (and xs:untypedAtomic) follow in 23.2 Casting from xs:string and xs:untypedAtomic. Casting to non-primitive types, including atomic types derived by restriction, union types, and list types, is described in 23.3 Casting involving non-primitive types. Casting from derived types is defined in 23.3.4 Casting from derived types to parent types, 23.3.5 Casting within a branch of the type hierarchy and 23.3.6 Casting across the type hierarchy.
Casting is not supported to or from xs:anySimpleType. Casting to xs:anySimpleType is not permitted and raises a static error: [err:XPST0080]XP.
Similarly, casting is not supported to or from xs:anyAtomicType and will raise a static error: [err:XPST0080]XP. There are no atomic items with the type annotation xs:anyAtomicType, although this can be a statically inferred type.
This section now uses the term primitive type strictly to refer to the 20 atomic types that are not derived by restriction from another atomic type: that is, the 19 primitive atomic types defined in XSD, plus xs:untypedAtomic. The three types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration, which have custom casting rules but are not strictly-speaking primitive, are now handled in other subsections. [Issue 1401 PR 1409]
This section defines casting between primitive types (specifically, the 19 primitive types defined in [XML Schema Part 2: Datatypes Second Edition] plus xs:untypedAtomic). The type conversions that are supported between primitive atomic types are indicated in the table below; casts between other (non-primitive) types are defined in terms of these primitives.
Where the target type TT is a primitive type, the result TV will always be an instance of TT. The result may also be an instance of a type derived from TT: for example, casting an xs:NCName value SV to xs:string may return SV unchanged, with its original type annotation.
In this table, there is a row for each primitive type acting as the source of the conversion and there is a column for each primitive type acting as the target of the conversion. The intersections of rows and columns contain one of three characters:
Y indicates that a conversion from values of the type to which the row applies to the type to which the column applies is supported;
N indicates that there are no supported conversions from values of the type to which the row applies to the type to which the column applies;
M indicates that a conversion from values of the type to which the row applies to the type to which the column applies may succeed for some values in the value space and fail for others.
There is no row or column for xs:untypedAtomic because the casting rules are exactly the same as for xs:string. When casting from xs:string or xs:untypedAtomic the semantics in 23.2 Casting from xs:string and xs:untypedAtomic apply, regardless of target type.
[XML Schema Part 2: Datatypes Second Edition] defines xs:NOTATION as an abstract type. Thus, casting to xs:NOTATION from any other type including xs:NOTATION is not permitted and raises a static error [err:XPST0080]XP. However, casting from one subtype of xs:NOTATION to another subtype of xs:NOTATION is permitted.
Casting is not supported to or from xs:anySimpleType. Thus, there is no row or column for this type in the table below. For any node that has not been validated or has been validated as xs:anySimpleType, the typed value of the node is an atomic item of type xs:untypedAtomic. There are no atomic items with the type annotation xs:anySimpleType at runtime. Casting to xs:anySimpleType is not permitted and raises a static error: [err:XPST0080]XP.
Similarly, casting is not supported to or from xs:anyAtomicType and will raise a static error: [err:XPST0080]XP. There are no atomic items with the type annotation xs:anyAtomicType at runtime, although this can be a statically inferred type.
If casting is attempted from an ST to a TT for which casting is not supported, as defined in the table below, a type error is raised [err:XPTY0004]XP.
In the following table, the columns and rows are identified by short codes that identify simple types as follows:
aURI = xs:anyURI
b64 = xs:base64Binary
bool = xs:boolean
dat = xs:date
gDay = xs:gDay
dbl = xs:double
dec = xs:decimal
dT = xs:dateTime
dur = xs:duration
flt = xs:float
hxB = xs:hexBinary
gMD = xs:gMonthDay
gMon = xs:gMonth
NOT = xs:NOTATION
QN = xs:QName
str = xs:string
tim = xs:time
gYM = xs:gYearMonth
gYr = xs:gYear
In the following table, the notation “S\T” indicates that the source (“S”) of the conversion is indicated in the column below the notation and that the target (“T”) is indicated in the row to the right of the notation.
| S\T | str | flt | dbl | dec | dur | dT | tim | dat | gYM | gYr | gMD | gDay | gMon | bool | b64 | hxB | aURI | QN | NOT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | Y | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M | M |
| flt | Y | Y | Y | M | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N |
| dbl | Y | Y | Y | M | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N |
| dec | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N |
| dur | Y | N | N | N | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N |
| dT | Y | N | N | N | N | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N |
| tim | Y | N | N | N | N | N | Y | N | N | N | N | N | N | N | N | N | N | N | N |
| dat | Y | N | N | N | N | Y | N | Y | Y | Y | Y | Y | Y | N | N | N | N | N | N |
| gYM | Y | N | N | N | N | N | N | N | Y | N | N | N | N | N | N | N | N | N | N |
| gYr | Y | N | N | N | N | N | N | N | N | Y | N | N | N | N | N | N | N | N | N |
| gMD | Y | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N | N | N | N |
| gDay | Y | N | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N | N | N |
| gMon | Y | N | N | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N | N |
| bool | Y | Y | Y | Y | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N |
| b64 | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | Y | Y | N | N | N |
| hxB | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | Y | Y | N | N | N |
| aURI | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | Y | N | N |
| QN | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | Y | M |
| NOT | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N | Y | M |
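The three table entries can be probed with a castable expression, which reports whether the corresponding cast would succeed. The fragment below is a sketch only, assuming an XQuery 3.1 or later processor that does not apply static typing (a statically typed processor could reject the unsupported case at compile time).

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: Y: xs:boolean to xs:double is supported for every value :)
  true() castable as xs:double,
  (: N: xs:date to xs:double is never supported :)
  not(xs:date("2003-01-01") castable as xs:double),
  (: M: xs:string to xs:double depends on the lexical form :)
  "12.5" castable as xs:double,
  not("twelve" castable as xs:double),
  (: M: xs:double to xs:decimal fails for NaN :)
  xs:double("1.5") castable as xs:decimal,
  not(xs:double("NaN") castable as xs:decimal)
)
```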
Any atomic item SV can be cast to xs:untypedAtomic.
The effect is the same as casting to xs:string (see 23.1.2 Casting to xs:string) and then returning the xs:untypedAtomic value comprising the same sequence of characters.
Any atomic item SV can be cast to xs:string.
The resulting xs:string value TV depends on the source type ST as follows.
If SV is an instance of xs:string, TV is an instance of xs:string comprising the same sequence of characters as SV.
Note:
The implementation is free to return SV unchanged, including its original type annotation.
If SV is an instance of xs:anyURI, the result TV is an instance of xs:string comprising the same sequence of characters as SV. No escaping of special characters takes place.
If SV is an instance of xs:QName or xs:NOTATION:
if the qualified name has a prefix, then TV is the concatenation of the prefix of SV, a single colon (:), and the local name of SV.
otherwise TV is the local name of SV.
If SV is an instance of xs:numeric, the rules in 23.1.2.1 Casting numeric values to xs:string apply.
If SV is an instance of xs:dateTime, xs:date or xs:time, the rules in 23.1.2.2 Casting date/time values to xs:string apply.
If ST is xs:duration, or any subtype thereof including xs:yearMonthDuration and xs:dayTimeDuration, then the rules in 23.1.2.3 Casting xs:duration values to xs:string apply.
In all other cases, TV is the [XML Schema Part 2: Datatypes Second Edition] canonical representation of SV. For datatypes that do not have a canonical representation defined, an implementation-dependent canonical representation may be used.
To cast as xs:untypedAtomic the value is cast as xs:string, as described above, and the type annotation changed to xs:untypedAtomic.
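The non-numeric cases above can be illustrated with a few comparisons. This is a sketch assuming an XQuery 3.1 or later processor; the namespace URI, the eg prefix and the example URI are hypothetical values chosen for the illustration.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: xs:QName with a prefix: prefix, colon, local name :)
  xs:string(QName("http://example.com/ns", "eg:item")) eq "eg:item",
  (: xs:QName without a prefix: local name only :)
  xs:string(QName("http://example.com/ns", "item")) eq "item",
  (: xs:anyURI: the characters are copied as they are, with no escaping or unescaping :)
  xs:string(xs:anyURI("http://example.com/a%20b")) eq "http://example.com/a%20b",
  (: xs:untypedAtomic: same characters as the xs:string cast, different annotation :)
  xs:untypedAtomic(true()) instance of xs:untypedAtomic,
  xs:string(xs:untypedAtomic(true())) eq "true"
)
```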
The following rules apply when the source type ST is xs:decimal, xs:double, or xs:float, or any subtype of these including xs:integer.
If SV is an instance of xs:decimal, then the canonical representation of SV is returned, as defined in [XSD 1.1 Part 2]. Specifically, see decimalCanonicalMap.
Note:
Unlike previous versions of this specification, no special rule is given for the case where SV is an instance of xs:integer. This is because the general rule for xs:decimal gives the same result. The result in this case will be a sequence of decimal digits in the range U+0030 (DIGIT ZERO, 0) to U+0039 (DIGIT NINE, 9), optionally preceded by a minus sign, with no leading zeroes. For example: 42, -1, 0, or 1000000000.
Note:
An xs:decimal that is equal to an integer is converted to a string as if it were first cast to an xs:integer. Specifically, there will be no decimal point and no fractional part.
If the value is not equal to an integer, then there will be a decimal point and a fractional part, which will be a sequence of decimal digits with no trailing zeroes. For example: 42.3, -1.5, or 0.00001.
If SV is an instance of xs:float or xs:double, then:
TV will be an xs:string in the lexical space of xs:double or xs:float that when converted to an xs:double or xs:float under the rules of 23.2 Casting from xs:string and xs:untypedAtomic produces a value that is equal to SV, or is NaN if SV is NaN. In addition, TV must satisfy the constraints in the following sub-bullets.
If SV has an absolute value that is greater than or equal to 0.000001 (one millionth) and less than 1000000 (one million), then the value is converted to an xs:decimal and the resulting xs:decimal is converted to an xs:string according to the rules above, as though using an implementation of xs:decimal that imposes no limits on the totalDigits or fractionDigits facets.
If SV has the value positive or negative zero, TV is "0" or "-0" respectively.
If SV is positive or negative infinity, TV is the string "INF" or "-INF" respectively.
In other cases, the result consists of a mantissa, which has the lexical form of an xs:decimal, followed by the letter "E", followed by an exponent which has the lexical form of an xs:integer. Leading zeroes and "+" signs are prohibited in the exponent. For the mantissa, there must be a decimal point, and there must be exactly one digit before the decimal point, which must be non-zero. The "+" sign is prohibited. There must be at least one digit after the decimal point. Apart from this mandatory digit, trailing zero digits are prohibited.
Note:
The above rules allow more than one representation of the same value. For example, the xs:float value whose exact decimal representation is 1.26743223E15 might be represented by any of the strings "1.26743223E15", "1.26743222E15" or "1.26743224E15" (inter alia). It is implementation-dependent which of these representations is chosen.
Note:
The string representations of numeric values are backwards compatible with XPath 1.0 except for the special values positive and negative infinity, negative zero and values outside the range 1.0e-6 to 1.0e+6.
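A few representative conversions are sketched below, assuming an XQuery 3.1 or later processor. The xs:double values are deliberately chosen to be exactly representable in binary, so that the implementation-dependent choice among alternative representations mentioned above does not arise.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: xs:integer and whole-number xs:decimal values: digits only :)
  xs:string(xs:integer(42)) eq "42",
  xs:string(xs:decimal("42.0")) eq "42",
  (: Fractional xs:decimal: no trailing zeroes :)
  xs:string(xs:decimal("-1.50")) eq "-1.5",
  (: xs:double within [0.000001, 1000000): formatted as an xs:decimal :)
  xs:string(xs:double("0.5")) eq "0.5",
  xs:string(xs:double("1e5")) eq "100000",
  (: xs:double at or above one million: mantissa, "E", exponent :)
  xs:string(xs:double("1e6")) eq "1.0E6",
  xs:string(xs:double("1.5e7")) eq "1.5E7",
  (: Special values :)
  xs:string(xs:double("-INF")) eq "-INF",
  xs:string(xs:double("NaN")) eq "NaN"
)
```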
If SV is an instance of xs:dateTime, xs:date, xs:time, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, or xs:gDay, then TV is the canonical representation of SV as defined in [XSD 1.1 Part 2].
Note:
The result TV includes the original timezone if a timezone is present.
All these data types contain different combinations of the components year, month, day, hour, minute, second, and timezone; all the components relevant to the data type (with the exception of the timezone) are output, and the results are concatenated together with suitable punctuation. Specifically:
The year component is represented as an xs:string of four digits, or more if needed. A leading minus sign is present for BCE years.
The month, day, hour and minute components are represented as two digits (with a leading zero if needed). For example, February is represented as 02.
The hours component will never be "24": midnight is always represented as "00:00:00".
The second component is output as a two-digit integer if it is a whole number (for example, 30, 05, or 00), or if it is fractional, as two digits followed by a decimal point followed by as many digits as are necessary, with no trailing zeroes (for example 30.5 or 00.001).
The timezone component, if present, is cast to xs:string by applying the function eg:convertTZtoString given in 23.1.5 Casting to date and time types. Examples are Z, +01:00, -05:00, or +05:30.
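The component rules listed above can be checked with a handful of conversions. This is a sketch assuming an XQuery 3.1 or later processor; the dates and times are arbitrary.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: Two-digit month and day :)
  xs:string(xs:date("2003-02-05")) eq "2003-02-05",
  (: BCE years carry a leading minus sign :)
  xs:string(xs:date("-0044-03-15")) eq "-0044-03-15",
  (: Midnight is never written with an hours component of 24 :)
  xs:string(xs:time("24:00:00")) eq "00:00:00",
  (: Fractional seconds lose their trailing zeroes :)
  xs:string(xs:dateTime("2003-01-01T10:05:00.500Z")) eq "2003-01-01T10:05:00.5Z",
  (: A zero timezone offset is written as Z; other offsets are kept as supplied :)
  xs:string(xs:time("12:00:00+00:00")) eq "12:00:00Z",
  xs:string(xs:time("12:00:00-05:00")) eq "12:00:00-05:00"
)
```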
If SV is an instance of xs:duration (including its subtypes xs:yearMonthDuration and xs:dayTimeDuration), then TV is the canonical representation of SV as defined in [XSD 1.1 Part 2]. Specifically, see durationCanonicalMap.
Note:
The rules have the effect of normalizing the value so that the number of months is always less than 12, the number of hours less than 24, and the number of minutes and seconds less than 60. Zero-valued components are omitted. Fractional seconds follow the same rules as xs:decimal. For example, the duration P15MT30H is represented as P1Y3M1DT6H. A zero-length duration is output as PT0S.
Note:
At the time of writing, the published XSD 1.1 recommendation contains cut-and-paste errors in the definition of the dayTimeDuration canonical mapping. The binding of variable s should be to dt's ·seconds· (not ·months·) component, and the return expression given as sgn & 'P' & ·duYearMonthCanonicalFragmentMap·(|s|) should read sgn & 'P' & ·duDayTimeCanonicalFragmentMap·(|s|).
In reading these XSD formulations, be aware that a & b represents string concatenation, while |s| computes the absolute value of a number.
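The normalization described in the first note can be observed directly. The fragment below is a sketch assuming an XQuery 3.1 or later processor; the first comparison repeats the example given in the note.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: 15 months become 1 year 3 months; 30 hours become 1 day 6 hours :)
  xs:string(xs:duration("P15MT30H")) eq "P1Y3M1DT6H",
  (: Minutes are kept below 60 :)
  xs:string(xs:dayTimeDuration("PT90M")) eq "PT1H30M",
  (: The sign applies to the whole duration :)
  xs:string(xs:dayTimeDuration("-PT36H")) eq "-P1DT12H",
  (: Fractional seconds follow the xs:decimal rules :)
  xs:string(xs:duration("PT1.500S")) eq "PT1.5S",
  (: A zero-length duration :)
  xs:string(xs:dayTimeDuration("PT0S")) eq "PT0S"
)
```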
This section defines the rules for casting to the primitive numeric types xs:float, xs:double, and xs:decimal. Rules for casting to the derived type xs:integer are given in 23.3.2 Casting to xs:integer.
When a value of any simple type is cast as xs:float, the xs:float value TV is derived from the ST and the SV as follows:
If ST is xs:float, then TV is SV and the conversion is complete.
If ST is xs:double, then TV is obtained as follows:
if SV is the xs:double value INF, -INF, NaN, positive zero, or negative zero, then TV is the xs:float value INF, -INF, NaN, positive zero, or negative zero respectively.
otherwise, SV can be expressed in the form m × 2^e where the mantissa m and exponent e are signed xs:integers whose value range is defined in [XML Schema Part 2: Datatypes Second Edition], and the following rules apply:
if m (the mantissa of SV) is outside the permitted range for the mantissa of an xs:float value (-(2^24 - 1) to +(2^24 - 1)), then it is divided by 2^N where N is the lowest positive xs:integer that brings the result of the division within the permitted range, and the exponent e is increased by N. This is integer division (in effect, the binary value of the mantissa is truncated on the right). Let M be the mantissa and E the exponent after this adjustment.
if E exceeds 104 (the maximum exponent value in the value space of xs:float) then TV is the xs:float value INF or -INF depending on the sign of M.
if E is less than -149 (the minimum exponent value in the value space of xs:float) then TV is the xs:float value positive or negative zero depending on the sign of M.
otherwise, TV is the xs:float value M × 2^E.
If ST is xs:decimal or xs:integer, then TV is xs:float(SV cast as xs:string) and the conversion is complete.
If ST is xs:boolean, SV is converted to 1.0E0 if SV is true and to 0.0E0 if SV is false and the conversion is complete.
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
Note:
XSD 1.1 adds the value +INF to the lexical space, as an alternative to INF. XSD 1.1 also adds negative zero to the value space.
Note:
Implementations should return negative zero for xs:float("-0.0E0"). However, because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.
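The main branches of the xs:float rules can be exercised as follows. This is a sketch assuming an XQuery 3.1 or later processor; the numeric values are arbitrary, and the zero comparison holds whether the processor returns positive or negative zero.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: A value that fits in xs:float is unchanged :)
  xs:float(xs:double("0.5")) eq xs:float("0.5"),
  (: A mantissa one bit too wide (2^24 + 1) is shortened to fit :)
  xs:float(xs:double("16777217")) eq xs:float("16777216"),
  (: Exponent too large: an infinity with the sign of the source :)
  xs:float(xs:double("1e300")) eq xs:float("INF"),
  xs:float(xs:double("-1e300")) eq xs:float("-INF"),
  (: Exponent too small: zero :)
  xs:float(xs:double("1e-300")) eq xs:float("0"),
  (: xs:boolean source :)
  xs:float(true()) eq xs:float("1"),
  (: NaN never compares equal to itself, so its string form is tested instead :)
  xs:string(xs:float(xs:double("NaN"))) eq "NaN"
)
```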
When a value of any simple type is cast as xs:double, the xs:double value TV is derived from the ST and the SV as follows:
If ST is xs:double, then TV is SV and the conversion is complete.
If ST is xs:float or a type derived from xs:float, then TV is obtained as follows:
if SV is the xs:float value INF, -INF, NaN, positive zero, or negative zero, then TV is the xs:double value INF, -INF, NaN, positive zero, or negative zero respectively.
otherwise, SV can be expressed in the form m × 2^e where the mantissa m and exponent e are signed xs:integer values whose value range is defined in [XML Schema Part 2: Datatypes Second Edition], and TV is the xs:double value m × 2^e.
If ST is xs:decimal or xs:integer, then TV is xs:double(SV cast as xs:string) and the conversion is complete.
If ST is xs:boolean, SV is converted to 1.0E0 if SV is true and to 0.0E0 if SV is false and the conversion is complete.
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
Note:
XSD 1.1 adds the value +INF to the lexical space, as an alternative to INF. XSD 1.1 also adds negative zero to the value space.
Note:
Implementations should return negative zero for xs:double("-0.0E0"). However, because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.
This section defines the rules for casting to the primitive type xs:decimal. The rules are also invoked implicitly as part of the process of converting to types derived from xs:decimal. There are special rules, however, if the target type TT is xs:integer, or a type derived from xs:integer: those rules are given in 23.3.2 Casting to xs:integer.
When the target type TT is xs:decimal, the resulting xs:decimal value TV is derived from ST and SV as follows:
If ST is xs:decimal or a subtype thereof (including xs:integer), then the result TV has the same datum as SV. The type annotation may be xs:decimal or any subtype of xs:decimal for which this is a valid instance, including the original type ST.
If ST is xs:float or xs:double, then TV is the xs:decimal value, within the set of xs:decimal values that the implementation is capable of representing, that is numerically closest to SV. If two values are equally close, then the one that is closest to zero is chosen. If SV is too large to be accommodated as an xs:decimal, (see [XML Schema Part 2: Datatypes Second Edition] for implementation-defined limits on numeric values) a dynamic error is raised [err:FOCA0001]. If SV is one of the special xs:float or xs:double values NaN, INF, or -INF, a dynamic error is raised [err:FOCA0002].
If ST is xs:boolean, the result TV is 1.0 if SV is 1 or true and 0.0 if SV is 0 or false. The type annotation of the result may be any subtype of xs:decimal whose value space includes the integer values 0 and 1.
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
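The xs:decimal rules, including the error case for the special floating-point values, can be sketched as below. The fragment assumes an XQuery 3.1 or later processor; the err prefix is declared explicitly rather than assumed to be predeclared, and try/catch is used only to turn the error into a comparable string.

```xquery
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Each item in the resulting sequence is expected to be true. :)
(
  (: An exactly representable xs:double keeps its value :)
  xs:decimal(xs:double("1.5")) eq 1.5,
  (: xs:boolean sources give 1 or 0 :)
  xs:decimal(true()) eq 1,
  xs:decimal(false()) eq 0,
  (: An xs:integer source keeps its datum :)
  xs:decimal(xs:integer(42)) eq 42.0,
  (: NaN and the infinities have no xs:decimal equivalent :)
  (try { xs:string(xs:decimal(xs:double("NaN"))) }
   catch err:FOCA0002 { "FOCA0002" }) eq "FOCA0002",
  (try { xs:string(xs:decimal(xs:float("-INF"))) }
   catch err:FOCA0002 { "FOCA0002" }) eq "FOCA0002"
)
```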
This section defines the rules for casting to the primitive duration type xs:duration. Rules for casting to the derived types xs:yearMonthDuration and xs:dayTimeDuration are given in 23.3.3 Casting to xs:yearMonthDuration and xs:dayTimeDuration.
If the source value SV is an instance of xs:duration (including instances of subtypes such as xs:yearMonthDuration and xs:dayTimeDuration), then the datum of the result TV is the same as the datum of SV, and the type annotation is xs:duration or any subtype thereof that includes this datum in its value space (in particular, it may be the same as the type annotation of SV).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
In several situations, casting to date and time types requires the extraction of a component from SV or from the result of fn:current-dateTime and converting it to an xs:string. These conversions must follow certain rules. For example, converting an xs:integer year value requires converting to an xs:string with four or more characters, preceded by a minus sign if the value is negative.
This document defines four functions to perform these conversions. These functions are for illustrative purposes only and make no recommendations as to style or efficiency. References to these functions from the following text are not normative.
The arguments to these functions come from functions defined in this document. Thus, the functions below assume that they are correct and do no range checking on them.
declare function eg:convertYearToString($year as xs:integer) as xs:string {
let $plusMinus := if ($year >= 0) then "" else "-"
let $yearString := abs($year) cast as xs:string
let $length := string-length($yearString)
return if ($length = 1) then concat($plusMinus, "000", $yearString)
else if ($length = 2) then concat($plusMinus, "00", $yearString)
else if ($length = 3) then concat($plusMinus, "0", $yearString)
else concat($plusMinus, $yearString)
};

declare function eg:convertTo2CharString($value as xs:integer) as xs:string {
let $string := $value cast as xs:string
return if (string-length($string) = 1) then concat("0", $string)
else $string
};

declare function eg:convertSecondsToString($seconds as xs:decimal) as xs:string {
let $string := $seconds cast as xs:string
let $intLength := string-length(($seconds cast as xs:integer) cast as xs:string)
return if ($intLength = 1) then concat("0", $string)
else $string
};

declare function eg:convertTZtoString($tz as xs:dayTimeDuration?) as xs:string {
if (empty($tz)) then ""
else if ($tz eq xs:dayTimeDuration('PT0S')) then "Z"
else let $tzh := hours-from-duration($tz)
let $tzm := minutes-from-duration($tz)
let $plusMinus := if ($tz ge xs:dayTimeDuration('PT0S')) then "+" else "-" (: test the whole offset, so that -00:30 is signed correctly :)
let $tzhString := eg:convertTo2CharString(abs($tzh))
let $tzmString := eg:convertTo2CharString(abs($tzm))
return concat($plusMinus, $tzhString, ":", $tzmString)
};

Conversion from primitive types to date and time types follows the rules below.
When a value of any primitive type is cast as xs:dateTime, the xs:dateTime value TV is derived from ST and SV as follows:
If ST is xs:dateTime, then TV is SV.
If ST is xs:date, then let SYR be eg:convertYearToString( year-from-date(SV)), let SMO be eg:convertTo2CharString( month-from-date(SV)), let SDA be eg:convertTo2CharString( day-from-date(SV)) and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:dateTime( concat(SYR, '-', SMO, '-', SDA, 'T00:00:00', STZ) ).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:time, the xs:time value TV is derived from ST and SV as follows:
If ST is xs:time, then TV is SV.
If ST is xs:dateTime, then TV is xs:time( concat( eg:convertTo2CharString( hours-from-dateTime(SV)), ':', eg:convertTo2CharString( minutes-from-dateTime(SV)), ':', eg:convertSecondsToString( seconds-from-dateTime(SV)), eg:convertTZtoString( timezone-from-dateTime(SV)) )).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:date, the xs:date value TV is derived from ST and SV as follows:
If ST is xs:date, then TV is SV.
If ST is xs:dateTime, then let SYR be eg:convertYearToString( year-from-dateTime(SV)), let SMO be eg:convertTo2CharString( month-from-dateTime(SV)), let SDA be eg:convertTo2CharString( day-from-dateTime(SV)) and let STZ be eg:convertTZtoString(timezone-from-dateTime(SV)); TV is xs:date( concat(SYR, '-', SMO, '-', SDA, STZ) ).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
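The casts between xs:dateTime, xs:date and xs:time described above can be illustrated as follows. This is a sketch assuming an XQuery 3.1 or later processor; results are compared as strings so that the outcome does not depend on the implicit timezone.

```xquery
(: Each item in the resulting sequence is expected to be true. :)
(
  (: xs:date to xs:dateTime: the time part is midnight, the timezone is kept :)
  xs:string(xs:dateTime(xs:date("2003-01-01+01:00"))) eq "2003-01-01T00:00:00+01:00",
  (: A source without a timezone gives a result without one :)
  xs:string(xs:dateTime(xs:date("2003-01-01"))) eq "2003-01-01T00:00:00",
  (: xs:dateTime to xs:date: the time part is dropped, not shifted to UTC :)
  xs:string(xs:date(xs:dateTime("2003-01-01T23:30:00-05:00"))) eq "2003-01-01-05:00",
  (: xs:dateTime to xs:time: the date part is dropped :)
  xs:string(xs:time(xs:dateTime("2003-01-01T23:30:00.25Z"))) eq "23:30:00.25Z"
)
```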
When a value of any primitive type is cast as xs:gYearMonth, the xs:gYearMonth value TV is derived from ST and SV as follows:
If ST is xs:gYearMonth, then TV is SV.
If ST is xs:dateTime, then let SYR be eg:convertYearToString( year-from-dateTime(SV)), let SMO be eg:convertTo2CharString( month-from-dateTime(SV)) and let STZ be eg:convertTZtoString( timezone-from-dateTime(SV)); TV is xs:gYearMonth( concat(SYR, '-', SMO, STZ) ).
If ST is xs:date, then let SYR be eg:convertYearToString( year-from-date(SV)), let SMO be eg:convertTo2CharString( month-from-date(SV)) and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:gYearMonth( concat(SYR, '-', SMO, STZ) ).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:gYear, the xs:gYear value TV is derived from ST and SV as follows:
If ST is xs:gYear, then TV is SV.
If ST is xs:dateTime, let SYR be eg:convertYearToString( year-from-dateTime(SV)) and let STZ be eg:convertTZtoString( timezone-from-dateTime(SV)); TV is xs:gYear(concat(SYR, STZ)).
If ST is xs:date, let SYR be eg:convertYearToString( year-from-date(SV)); and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:gYear(concat(SYR, STZ)).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:gMonthDay, the xs:gMonthDay value TV is derived from ST and SV as follows:
If ST is xs:gMonthDay, then TV is SV.
If ST is xs:dateTime, then let SMO be eg:convertTo2CharString( month-from-dateTime(SV)), let SDA be eg:convertTo2CharString( day-from-dateTime(SV)) and let STZ be eg:convertTZtoString( timezone-from-dateTime(SV)); TV is xs:gMonthDay( concat( '--', SMO, '-', SDA, STZ) ).
If ST is xs:date, then let SMO be eg:convertTo2CharString( month-from-date(SV)), let SDA be eg:convertTo2CharString( day-from-date(SV)) and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:gMonthDay( concat( '--', SMO, '-', SDA, STZ) ).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:gDay, the xs:gDay value TV is derived from ST and SV as follows:
If ST is xs:gDay, then TV is SV.
If ST is xs:dateTime, then let SDA be eg:convertTo2CharString( day-from-dateTime(SV)) and let STZ be eg:convertTZtoString( timezone-from-dateTime(SV)); TV is xs:gDay( concat( '---', SDA, STZ)).
If ST is xs:date, then let SDA be eg:convertTo2CharString( day-from-date(SV)) and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:gDay( concat( '---', SDA, STZ)).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
When a value of any primitive type is cast as xs:gMonth, the xs:gMonth value TV is derived from ST and SV as follows:
If ST is xs:gMonth, then TV is SV.
If ST is xs:dateTime, then let SMO be eg:convertTo2CharString( month-from-dateTime(SV)) and let STZ be eg:convertTZtoString( timezone-from-dateTime(SV)); TV is xs:gMonth( concat( '--' , SMO, STZ)).
If ST is xs:date, then let SMO be eg:convertTo2CharString( month-from-date(SV)) and let STZ be eg:convertTZtoString( timezone-from-date(SV)); TV is xs:gMonth( concat( '--', SMO, STZ)).
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
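The derivations above for xs:gYear, xs:gMonthDay, xs:gDay, and xs:gMonth all concatenate a fixed prefix, zero-padded components, and a timezone string. The following Python sketch illustrates the resulting lexical forms. It is not normative: the three helpers are stand-ins whose behavior is assumed here (the eg: functions themselves are defined elsewhere in this specification), and representing the timezone as an offset in minutes is purely an illustrative choice.

```python
def two_digits(n):
    """Zero-pad to two characters (stand-in for eg:convertTo2CharString)."""
    return f"{n:02d}"


def year_string(year):
    """At least four digits, with a leading '-' for negative years
    (assumed behavior of eg:convertYearToString)."""
    sign = "-" if year < 0 else ""
    return f"{sign}{abs(year):04d}"


def tz_string(offset_minutes):
    """'' when no timezone is present, 'Z' for UTC, otherwise +hh:mm or -hh:mm
    (assumed behavior of eg:convertTZtoString)."""
    if offset_minutes is None:
        return ""
    if offset_minutes == 0:
        return "Z"
    sign = "+" if offset_minutes > 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"


def g_lexical_forms(year, month, day, offset_minutes=None):
    """Lexical forms obtained when a date with these components is cast
    to each of the four types discussed above."""
    tz = tz_string(offset_minutes)
    return {
        "xs:gYear": year_string(year) + tz,
        "xs:gMonthDay": "--" + two_digits(month) + "-" + two_digits(day) + tz,
        "xs:gDay": "---" + two_digits(day) + tz,
        "xs:gMonth": "--" + two_digits(month) + tz,
    }
```

For a date of 7 March 2024 with a timezone of -05:00, the sketch yields 2024-05:00, --03-07-05:00, ---07-05:00, and --03-05:00 respectively.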
When the target type TT is xs:boolean, the resulting xs:boolean value TV is derived from the source value SV as follows:
If SV is an instance of xs:boolean, then TV is SV.
If SV is an instance of xs:numeric and SV is 0, +0, -0, 0.0, 0.0E0 or NaN, then TV is false.
If SV is an instance of xs:numeric and SV is not one of the above values, then TV is true.
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
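As an informal illustration of the numeric rules above, the following Python sketch maps a numeric value to a boolean. The function name is hypothetical, and Python's int, float, and Decimal are used here only as rough stand-ins for xs:integer, xs:double, and xs:decimal.

```python
import math
from decimal import Decimal


def numeric_to_boolean(sv):
    """Zero of either sign and NaN map to false; every other number maps to true."""
    if isinstance(sv, float) and math.isnan(sv):
        return False
    return sv != 0
```

Note that negative zero compares equal to zero, so it falls into the false case, while the infinities map to true.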
Values of type xs:base64Binary can be cast as xs:hexBinary and vice versa, since the two types have the same value space. Casting to xs:base64Binary and xs:hexBinary is also supported from the same type and from xs:untypedAtomic, xs:string and subtypes of xs:string using [XML Schema Part 2: Datatypes Second Edition] semantics.
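Because both types denote the same sequences of octets, a cast between them only changes the lexical representation. The Python sketch below illustrates this with hypothetical function names. It is a simplification: strict base64 decoding rejects the embedded whitespace that the xs:base64Binary lexical space permits, and bytes.fromhex tolerates whitespace that xs:hexBinary does not.

```python
import base64


def base64_to_hex(lexical):
    """Decode a base64 lexical form and re-encode the octets as uppercase hex."""
    octets = base64.b64decode(lexical, validate=True)
    return octets.hex().upper()


def hex_to_base64(lexical):
    """Decode a hex lexical form and re-encode the same octets as base64."""
    octets = bytes.fromhex(lexical)
    return base64.b64encode(octets).decode("ascii")
```

For example, the base64 form SGVsbG8= and the hex form 48656C6C6F both represent the five octets of the ASCII string Hello.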
Casting to xs:anyURI is supported only from the same type, xs:untypedAtomic or xs:string.
When a value of any primitive type is cast as xs:anyURI, the xs:anyURI value TV is derived from the ST and SV as follows:
If ST is xs:untypedAtomic or xs:string, see 23.2 Casting from xs:string and xs:untypedAtomic.
Casting from xs:string or xs:untypedAtomic to xs:QName or xs:NOTATION is described in 23.2 Casting from xs:string and xs:untypedAtomic.
It is also possible to cast from xs:NOTATION to xs:QName, or from xs:QName to any type derived by restriction from xs:NOTATION. (Casting to xs:NOTATION itself is not allowed, because xs:NOTATION is an abstract type.) The resulting xs:QName or xs:NOTATION has the same prefix, local name, and namespace URI parts as the supplied value.
Note:
See 22.2 Constructor functions for xs:QName and xs:NOTATION for a discussion of how the combination of atomization and casting might not produce the desired effect.
[XML Schema Part 2: Datatypes Second Edition] says that “The value space of ENTITY is the set of all strings that match the NCName production ... and have been declared as an unparsed entity in a document type definition.” However, [XSL Transformations (XSLT) Version 4.0] and [XQuery 4.0: An XML Query Language] do not check that constructed values of type xs:ENTITY match declared unparsed entities. Thus, this rule is relaxed in this specification and, in casting to xs:ENTITY and types derived from it, no check is made that the values correspond to declared unparsed entities.
When casting from a string to a duration or time or dateTime, it is now specified that when there are more digits in the fractional seconds than the implementation is able to retain, excess digits are truncated. Rounding upwards (which could affect the number of minutes or hours in the value) is not permitted. [Issue 1089 PR 1090 19 March 2024]
This section applies when the supplied value SV is an instance of xs:string or xs:untypedAtomic, including types derived from these by restriction. If the value is xs:untypedAtomic, it is treated in exactly the same way as a string containing the same sequence of characters.
The supplied string is mapped to a typed value of the target type as defined in [XML Schema Part 2: Datatypes Second Edition]. Whitespace normalization is applied as indicated by the whiteSpace facet for the datatype. The resulting whitespace-normalized string must be a valid lexical form for the datatype. The semantics of casting follow the rules of XML Schema validation. For example, "13" cast as xs:unsignedInt returns the xs:unsignedInt typed value 13. This could also be written xs:unsignedInt("13").
The target type can be any simple type other than an abstract type. Specifically, it can be a type whose variety is atomic, union, or list. In each case the effect of casting to the target type is the same as constructing an element with the supplied value as its content, validating the element using the target type as the governing type, and atomizing the element to obtain its typed value.
When the target type is a derived type that is restricted by a pattern facet, the lexical form is first checked against the pattern before further casting is attempted (see 23.3.1 Casting to derived types). If the lexical form does not conform to the pattern, a dynamic error [err:FORG0001] is raised.
For example, consider a user-defined type my:boolean which is derived by restriction from xs:boolean and specifies the pattern facet value="0|1". The expression "true" cast as my:boolean would fail with a dynamic error [err:FORG0001].
Facets other than pattern are checked after the conversion. For example if there is a user-defined datatype called my:height defined as a restriction of xs:integer with the facet <maxInclusive value="84"/>, then the expression "100" cast as my:height would fail with a dynamic error [err:FORG0001].
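The ordering described above (pattern facet first, then conversion, then the remaining facets) can be sketched in Python as follows. Everything here is illustrative: CastError, cast_string, and the two converter functions are hypothetical names, and only the pattern and maxInclusive facets are modeled.

```python
import re


class CastError(Exception):
    """Stands in for a dynamic error; args[0] carries the error code."""


def xs_boolean(lexical):
    table = {"true": True, "1": True, "false": False, "0": False}
    if lexical not in table:
        raise CastError("FORG0001")
    return table[lexical]


def xs_integer(lexical):
    if re.fullmatch(r"[+-]?[0-9]+", lexical) is None:
        raise CastError("FORG0001")
    return int(lexical)


def cast_string(value, convert, pattern=None, max_inclusive=None):
    lexical = " ".join(value.split())  # whiteSpace facet: collapse
    # Step 1: the pattern facet is tested on the lexical form, before conversion.
    if pattern is not None and re.fullmatch(pattern, lexical) is None:
        raise CastError("FORG0001")
    # Step 2: map the lexical form to a value of the base type.
    typed = convert(lexical)
    # Step 3: value-space facets are checked after the conversion.
    if max_inclusive is not None and typed > max_inclusive:
        raise CastError("FORG0001")
    return typed


def my_boolean(value):
    """my:boolean from the example above: xs:boolean restricted by pattern 0|1."""
    return cast_string(value, xs_boolean, pattern="0|1")


def my_height(value):
    """my:height from the example above: xs:integer with maxInclusive 84."""
    return cast_string(value, xs_integer, max_inclusive=84)
```

With these definitions, my_boolean("true") fails at step 1 even though "true" is a valid xs:boolean, and my_height("100") fails at step 3 even though "100" is a valid xs:integer.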
Casting to the types xs:NOTATION, xs:anySimpleType, or xs:anyAtomicType is not permitted because these types are abstract (they have no immediate instances).
Special rules apply when casting to namespace-sensitive types. The types xs:QName and xs:NOTATION are namespace-sensitive. Any type derived by restriction from a namespace-sensitive type is itself namespace-sensitive, as is any union type having a namespace-sensitive type among its members, and any list type having a namespace-sensitive type as its item type. For details, see 22.2 Constructor functions for xs:QName and xs:NOTATION.
Note:
Since version 3.0 of this specification, casting has been allowed between xs:QName and xs:NOTATION in either direction; this was not permitted in previous Recommendations. Version 3.0 also removed the rule that only a string literal (rather than a dynamic string) may be cast to an xs:QName.
When casting to a numeric type:
If the value is too large or too small to be accurately represented by the implementation, it is handled as an overflow or underflow as defined in 4.2 Arithmetic operators on numeric values.
If the target type is xs:float or xs:double, the string -0 (and equivalents such as -0.0 or -000) should be converted to the value negative zero. However, if the implementation is reliant on an implementation of XML Schema 1.0 in which negative zero is not part of the value space for these types, these lexical forms may be converted to positive zero.
In casting to xs:decimal or to a type derived from xs:decimal, if the value is not too large or too small but nevertheless cannot be represented accurately with the number of decimal digits available to the implementation, the implementation may round to the nearest representable value or may raise a dynamic error [err:FOCA0006]. The choice of rounding algorithm and the choice between rounding and error behavior is implementation-defined.
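The two permitted behaviors for xs:decimal values that exceed the available precision (rounding, or raising [err:FOCA0006]) can be illustrated with Python's decimal module. This is only a sketch: the function name, the 18-digit default, the on_inexact switch, and the choice of round-half-even are all assumptions made for the example, since the specification leaves these choices implementation-defined.

```python
from decimal import Context, Decimal, Inexact, ROUND_HALF_EVEN


def string_to_decimal(lexical, digits=18, on_inexact="round"):
    """Convert a decimal lexical form using at most `digits` significant digits."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_EVEN)
    if on_inexact == "error":
        # Make the context signal instead of silently rounding.
        ctx.traps[Inexact] = True
    try:
        return ctx.create_decimal(lexical)
    except Inexact:
        raise ValueError("FOCA0006")
```

A processor modeled this way would behave consistently for a given configuration, which is what matters for the requirement that the choice be documented.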
When casting to xs:duration, xs:dateTime, or xs:time, if the seconds component has more fractional digits than are supported by the implementation, excess digits must be truncated. This rule ensures that components other than the seconds component are unaffected: for example xs:dateTime('2023-12-31T23:59:59.999999999') is guaranteed to deliver an xs:dateTime value whose year component is 2023 rather than 2024.
Note:
Implementations are required to support millisecond precision or greater.
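The difference between truncating and rounding the seconds component can be seen in the following Python sketch. The function name and the default of three fractional digits are assumptions for illustration (three digits corresponds to the millisecond minimum mentioned in the note).

```python
from decimal import Decimal, ROUND_DOWN


def truncate_seconds(seconds_lexical, digits=3):
    """Drop fractional digits beyond `digits` without rounding upwards."""
    quantum = Decimal(10) ** -digits  # for example 0.001 when digits is 3
    return Decimal(seconds_lexical).quantize(quantum, rounding=ROUND_DOWN)
```

Applied to the seconds value 59.999999999 from the example above, truncation gives 59.999, whereas rounding to three digits would give 60.000 and carry into the minutes, hours, day, and year.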
In casting to xs:date, xs:dateTime, xs:gYear, or xs:gYearMonth (or types derived from these), if the value is too large or too small to be represented by the implementation, a dynamic error [err:FODT0001] is raised.
In casting to a duration value, if the value is too large or too small to be represented by the implementation, a dynamic error [err:FODT0002] is raised.
For xs:anyURI, the extent to which an implementation validates the lexical form of xs:anyURI is implementation-dependent.
If the cast fails for any other reason, a dynamic error [err:FORG0001] is raised.
Casting from xs:string and xs:untypedAtomic to any other type (primitive or non-primitive) has been described in 23.2 Casting from xs:string and xs:untypedAtomic. This section defines how other casts to non-primitive types operate, including casting to types derived by restriction, to union types, and to list types.
Casting a value to a derived type can be separated into a number of cases. In these rules:
The types xs:integer, xs:yearMonthDuration, and xs:dayTimeDuration are treated as quasi-primitive types (alongside the 20 truly primitive types).
For any atomic type T, let P(T) denote the most specific primitive or quasi-primitive type such that itemType-subtype(T, P(T)) is true.
The rules are then:
When the source type ST is the same type as the target type TT: this case always succeeds, returning the source value SV unchanged.
When itemType-subtype(ST, TT) is true: see 23.3.4 Casting from derived types to parent types.
When TT is the quasi-primitive type xs:integer and SV is an instance of xs:numeric: see 23.3.2 Casting to xs:integer.
When TT is the quasi-primitive type xs:yearMonthDuration or xs:dayTimeDuration and SV is an instance of xs:duration: see 23.3.3 Casting to xs:yearMonthDuration and xs:dayTimeDuration.
When P(ST) is the same type as P(TT): see 23.3.5 Casting within a branch of the type hierarchy.
Otherwise (P(ST) is not the same type as P(TT)): see 23.3.6 Casting across the type hierarchy.
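The case analysis above can be read as an ordered decision procedure. The Python sketch below illustrates it over a deliberately tiny fragment of the type hierarchy. The BASE table, the function names, and the returned labels are all hypothetical, any type name absent from BASE is treated as primitive, and string or xs:untypedAtomic sources (which are handled by separate rules) are out of scope.

```python
# Derived type -> base type, for a small illustrative fragment only.
BASE = {
    "xs:integer": "xs:decimal",
    "xs:long": "xs:integer",
    "xs:int": "xs:long",
    "xs:short": "xs:int",
    "xs:yearMonthDuration": "xs:duration",
    "xs:dayTimeDuration": "xs:duration",
}
QUASI_PRIMITIVE = {"xs:integer", "xs:yearMonthDuration", "xs:dayTimeDuration"}
NUMERIC_PRIMITIVES = {"xs:decimal", "xs:float", "xs:double"}


def ancestors(t):
    """The type itself followed by its chain of base types."""
    chain = [t]
    while chain[-1] in BASE:
        chain.append(BASE[chain[-1]])
    return chain


def is_subtype(s, t):
    return t in ancestors(s)


def p(t):
    """Most specific primitive or quasi-primitive type of which t is a subtype."""
    for candidate in ancestors(t):
        if candidate in QUASI_PRIMITIVE or candidate not in BASE:
            return candidate


def choose_rule(st, tt):
    """Return a label for the first rule that applies, in the order given above."""
    if st == tt:
        return "same type"
    if is_subtype(st, tt):
        return "derived type to parent type"
    if tt == "xs:integer" and ancestors(st)[-1] in NUMERIC_PRIMITIVES:
        return "casting to xs:integer"
    if tt in ("xs:yearMonthDuration", "xs:dayTimeDuration") and is_subtype(st, "xs:duration"):
        return "casting to duration subtype"
    if p(st) == p(tt):
        return "within a branch"
    return "across the hierarchy"
```

Because xs:integer is treated as quasi-primitive, a cast from xs:long to xs:short stays within a branch, while a cast from xs:decimal to xs:short goes across the hierarchy by way of xs:integer.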
When an atomic item SV is cast as xs:integer, the resulting xs:integer value TV is obtained as follows:
If ST is xs:decimal, xs:float or xs:double, then TV is SV with the fractional part discarded and the value converted to xs:integer. Thus, casting 3.1456 returns 3 while -17.89 returns -17. Casting 3.124E1 returns 31. If SV is too large to be accommodated as an integer, (see [XML Schema Part 2: Datatypes Second Edition] for implementation-defined limits on numeric values) a dynamic error is raised [err:FOCA0003]. If SV is one of the special xs:float or xs:double values NaN, INF, or -INF, a dynamic error is raised [err:FOCA0002].
In all other cases, the general rules of 23.3.1 Casting to derived types apply.
Note:
When casting to a subtype of xs:integer (for example, xs:long), the rules in 23.3.1 Casting to derived types apply. Note, however, that these rules treat xs:integer as a quasi-primitive type.
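The truncation rule for casting xs:decimal, xs:float, and xs:double values to xs:integer can be sketched in Python as follows. The function name is hypothetical. Python integers are unbounded, so the overflow case [err:FOCA0003] never arises in this sketch, although a real implementation with a bounded integer range would need to check for it.

```python
import math
from decimal import Decimal


def numeric_to_integer(sv):
    """Discard the fractional part (truncate toward zero)."""
    if isinstance(sv, float) and (math.isnan(sv) or math.isinf(sv)):
        raise ValueError("FOCA0002")
    return math.trunc(sv)
```

Truncation toward zero is what distinguishes this cast from a floor operation: -17.89 becomes -17, not -18.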
When the source value SV is an instance of xs:duration (including any subtype of xs:duration), then:
If the target type TT is xs:yearMonthDuration, the result is an instance of xs:yearMonthDuration whose months component is equal to the months component of SV. The seconds component of SV is ignored.
If the target type TT is xs:dayTimeDuration, the result is an instance of xs:dayTimeDuration whose seconds component is equal to the seconds component of SV. The months component of SV is ignored.
In all other cases, the general rules of 23.3.1 Casting to derived types apply.
Note:
In general, casting to xs:yearMonthDuration or xs:dayTimeDuration loses information.
Note:
When casting to a subtype of xs:dayTimeDuration or xs:yearMonthDuration, the rules in 23.3.1 Casting to derived types apply. Note, however, that these rules treat xs:dayTimeDuration and xs:yearMonthDuration as quasi-primitive types.
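The following Python sketch models a duration as a pair of components (months and seconds) to show which component each cast keeps and which it discards. The Duration class and both function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Duration:
    months: int       # years * 12 + months
    seconds: Decimal  # days, hours, minutes, and seconds, expressed in seconds


def to_year_month_duration(sv):
    """Keep the months component; the seconds component is ignored."""
    return Duration(sv.months, Decimal(0))


def to_day_time_duration(sv):
    """Keep the seconds component; the months component is ignored."""
    return Duration(0, sv.seconds)
```

For instance, the duration P1Y2M3DT4H has 14 months and 273600 seconds. Casting it to xs:yearMonthDuration keeps only the 14 months (P1Y2M), and casting it to xs:dayTimeDuration keeps only the 273600 seconds (P3DT4H).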
It is always possible to cast an atomic item A to a type T if the relation A instance of T is true, provided that T is not an abstract type.
For example, it is possible to cast an xs:unsignedShort to an xs:unsignedInt, to an xs:integer, to an xs:decimal, or to a union type whose member types are xs:integer and xs:double.
Since the value space of the original type is a subset of the value space of the target type, such a cast is always successful.
For the expression A instance of T to be true, T must be either an atomic type, or a union type that has no constraining facets. It cannot be a list type, nor a union type derived by restriction from another union type, nor a union type that has a list type among its member types.
The result will have the same value as the original, but will have a new type annotation:
If T is an atomic type, then the type annotation of the result is T.
If T is a union type, then the type of the result is an atomic type M such that M is one of the atomic types in the transitive membership of the union type T and A instance of M is true; if there is more than one type M that satisfies these conditions (which could happen, for example, if T is the union of two overlapping types such as xs:int and xs:positiveInteger) then the first one is used, taking the member types in the order in which they appear within the definition of the union type.
It is possible to cast an SV to a TT if the type of the SV and the TT type are both derived by restriction (directly or indirectly) from the same primitive type, provided that the supplied value conforms to the constraints implied by the facets of the target type. This includes the case where the target type is derived from the type of the supplied value, as well as the case where the type of the supplied value is derived from the target type. For example, an instance of xs:byte can be cast as xs:unsignedShort, provided the value is not negative.
If the value does not conform to the facets defined for the target type, then a dynamic error is raised [err:FORG0001]. See [XML Schema Part 2: Datatypes Second Edition]. In the case of the pattern facet (which applies to the lexical space rather than the value space), the pattern is tested against the canonical representation of the value, as defined for the source type (or the result of casting the value to an xs:string, in the case of types that have no canonical representation defined for them).
Note that this will cause casts to fail if the pattern excludes the canonical lexical representation of the source type. For example, if the type my:distance is defined as a restriction of xs:decimal with a pattern that requires two digits after the decimal point, casting of an xs:integer to my:distance will always fail, because the canonical representation of an xs:integer does not conform to this pattern.
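The my:distance example can be made concrete with a short Python sketch. The specific regular expression is an assumption, since the text above says only that the pattern requires two digits after the decimal point, and the function name is hypothetical.

```python
import re
from decimal import Decimal

# Assumed pattern for my:distance: digits, a point, then exactly two digits.
MY_DISTANCE_PATTERN = r"[0-9]+\.[0-9]{2}"


def integer_to_my_distance(value):
    """Test the pattern against the canonical representation of the source value."""
    canonical = str(value)  # canonical xs:integer form has no decimal point
    if re.fullmatch(MY_DISTANCE_PATTERN, canonical) is None:
        raise ValueError("FORG0001")
    return Decimal(canonical)
```

Since no canonical xs:integer representation contains a decimal point, every call to this function raises the error, which mirrors the behavior described above.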
In some cases, casting from a parent type to a derived type requires special rules. See 23.1.4 Casting to duration types for rules regarding casting to xs:yearMonthDuration and xs:dayTimeDuration. See 23.1.10 Casting to xs:ENTITY for casting to xs:ENTITY and types derived from it.
When the ST and the TT are derived, directly or indirectly, from different primitive types, this is called casting across the type hierarchy. Casting across the type hierarchy is logically equivalent to three separate steps performed in order. Errors can occur in either of the latter two steps.
Cast the SV, up the hierarchy, to the primitive type of the source, as described in 23.3.4 Casting from derived types to parent types.
If SV is an instance of xs:string or xs:untypedAtomic, check its value against the pattern facet of TT, and raise a dynamic error [err:FORG0001] if the check fails.
Let P(TT) be the most specific primitive or quasi-primitive type of which TT is a subtype, as described in 23.3.1 Casting to derived types.
Cast the value to P(TT), as described in 23.1 Casting from primitive types to primitive types if P(TT) is primitive, or as described in 23.3.1 Casting to derived types if P(TT) is quasi-primitive.
If TT is derived from xs:NOTATION, assume for the purposes of this rule that casting to xs:NOTATION succeeds.
Cast the value down to the target type TT, as described in 23.3.5 Casting within a branch of the type hierarchy.
If the target type of a cast expression (or a constructor function) is a type with variety union, the supplied value must be one of the following:
A value of type xs:string or xs:untypedAtomic. This case follows the general rules for casting from strings, and has already been described in 23.2 Casting from xs:string and xs:untypedAtomic.
If the union type has a pattern facet, the pattern is tested against the supplied value after whitespace normalization, using the whiteSpace normalization rules of the member datatype against which validation succeeds.
A value that is an instance of one of the atomic types in the transitive membership of the union type, and of the union type itself. This case has already been described in 23.3.4 Casting from derived types to parent types.
This situation only applies when the value is an instance of the union type, which means it will never apply when the union is derived by facet-based restriction from another union type.
A value that is castable to one or more of the atomic types in the transitive membership of the union type (in the sense that the castable as operator returns true).
In this case the supplied value is cast to each atomic type in the transitive membership of the union type in turn (in the order in which the member types appear in the declaration) until one of these casts is successful; if none of them is successful, a dynamic error occurs [err:FORG0001]. If the union type has constraining facets then the resulting value must satisfy these facets, otherwise a dynamic error occurs [err:FORG0001].
If the union type has a pattern facet, the pattern is tested against the canonical representation of the result value.
Only the atomic types in the transitive membership of the union type are considered. The union type may have list types in its transitive membership, but (unless the supplied value is of type xs:string or xs:untypedAtomic, in which case the rules in 23.2 Casting from xs:string and xs:untypedAtomic apply), any list types in the membership are effectively ignored.
If more than one of these conditions applies, then the casting is done according to the rules for the first condition that applies.
If none of these conditions applies, the cast fails with a dynamic error [err:FORG0001].
Example: consider a type U whose member types are xs:integer and xs:date.
The expression "123" cast as U returns the xs:integer value 123.
The expression current-date() cast as U returns the current date as an instance of xs:date.
The expression 23.1 cast as U returns the xs:integer value 23.
Example: consider a type V whose member types are xs:short and xs:negativeInteger.
The expression "-123" cast as V returns the xs:short value -123.
The expression "-100000" cast as V returns the xs:negativeInteger value -100000.
The expression 93.7 cast as V returns the xs:short value 93.
The expression "93.7" cast as V raises a dynamic error [err:FORG0001] on the grounds that the string "93.7" is not in the lexical space of the union type.
Example: consider a type W that is derived from the above type V by restriction, with a pattern facet of -?\d\d.
The expression "12" cast as W returns the xs:short value 12.
The expression "123" cast as W raises a dynamic error [err:FORG0001] on the grounds that the string "123" does not match the pattern facet.
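The examples for the union type with member types xs:short and xs:negativeInteger, and for its pattern-restricted derivative, can be reproduced with the following Python sketch. It tries each member type in declaration order and returns the first successful result together with the name of the member type that accepted it. All function and class names are hypothetical, only these two member types are modeled, and a Python str stands in for an xs:string or xs:untypedAtomic source.

```python
import math
import re


class CastError(Exception):
    """Stands in for a dynamic error; args[0] carries the error code."""


def to_integer(sv):
    if isinstance(sv, str):
        lexical = sv.strip()
        if re.fullmatch(r"[+-]?[0-9]+", lexical) is None:
            raise CastError("FORG0001")
        return int(lexical)
    if isinstance(sv, float) and not math.isfinite(sv):
        raise CastError("FOCA0002")
    return math.trunc(sv)


def to_short(sv):
    n = to_integer(sv)
    if not -32768 <= n <= 32767:
        raise CastError("FORG0001")
    return n


def to_negative_integer(sv):
    n = to_integer(sv)
    if n >= 0:
        raise CastError("FORG0001")
    return n


# Member types in declaration order.
V_MEMBERS = [("xs:short", to_short), ("xs:negativeInteger", to_negative_integer)]


def cast_to_union(sv, members, pattern=None):
    for name, cast in members:
        try:
            tv = cast(sv)
        except CastError:
            continue  # try the next member type
        if pattern is not None:
            # Strings are tested after whitespace normalization; other values
            # are tested using the canonical form of the result.
            lexical = " ".join(sv.split()) if isinstance(sv, str) else str(tv)
            if re.fullmatch(pattern, lexical) is None:
                raise CastError("FORG0001")
        return name, tv
    raise CastError("FORG0001")
```

Passing pattern=r"-?[0-9][0-9]" models the restricted type: the string "12" is accepted as an xs:short, while "123" is rejected by the pattern even though it is a valid xs:short.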
If the target type of a cast expression (or a constructor function) is a type with variety list, the supplied value must be of type xs:string or xs:untypedAtomic. The rules follow the general principle for all casts from xs:string outlined in 23.2 Casting from xs:string and xs:untypedAtomic.
If the supplied value is not of type xs:string or xs:untypedAtomic, a type error is raised [err:XPTY0004]XP.
The semantics of the operation are consistent with validation: that is, the effect of casting a string S to a list type L is the same as constructing an element or attribute node whose string value is S, validating it using L as the governing type, and atomizing the resulting node. The result will always be either failure, or a sequence of zero or more atomic items each of which is an instance of the item type of L (or if the item type of L is a union type, an instance of one of the atomic types in its transitive membership).
If the item type of the list type is namespace-sensitive, then the namespace bindings in the static context will be used to resolve any namespace prefix, in the same way as when the target type is xs:QName.
If the list type has a pattern facet, the pattern must match the supplied value after collapsing whitespace (an operation equivalent to the use of the fn:normalize-space function).
For example, the expression "A B C D" cast as xs:NMTOKENS produces a sequence of four xs:NMTOKEN values, ("A", "B", "C", "D").
For example, given a user-defined type my:coordinates defined as a list of xs:integer with the facet <xs:length value="2"/>, the expression my:coordinates("2 -1") will return a sequence of two xs:integer values (2, -1), while the expression my:coordinates("1 2 3") will result in a dynamic error because the length of the list does not conform to the length facet. The expression my:coordinates("1.0 3.0") will also fail because the strings 1.0 and 3.0 are not in the lexical space of xs:integer.
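The my:coordinates example can be sketched in Python as follows. The function name is hypothetical, only a list of xs:integer with an optional length facet is modeled, and the error messages are illustrative rather than normative error codes.

```python
import re


def cast_to_integer_list(sv, length=None):
    """Split on XML whitespace, convert each token, then check the length facet."""
    if not isinstance(sv, str):
        raise TypeError("XPTY0004: list types can only be cast from strings")
    tokens = re.findall(r"[^ \t\r\n]+", sv)
    items = []
    for token in tokens:
        if re.fullmatch(r"[+-]?[0-9]+", token) is None:
            raise ValueError(f"{token!r} is not in the lexical space of xs:integer")
        items.append(int(token))
    if length is not None and len(items) != length:
        raise ValueError("length facet violated")
    return items


def my_coordinates(sv):
    """my:coordinates from the example above: list of xs:integer, length 2."""
    return cast_to_integer_list(sv, length=2)
```

Here my_coordinates("2 -1") returns the two integers 2 and -1, while "1 2 3" fails the length facet and "1.0 3.0" fails because the tokens are not valid xs:integer lexical forms.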
It is implementation-defined which version of Unicode is supported, but it is recommended that the most recent version of Unicode be used. (See Conformance.)
It is implementation-defined whether the type system is based on XML Schema 1.0 or XML Schema 1.1. (See Conformance.)
It is implementation-defined whether definitions that rely on XML (for example, the set of valid XML characters) should use the definitions in XML 1.0 or XML 1.1. (See Conformance.)
Implementations may attach an implementation-defined meaning to options in the map that are not described in this specification. These options should use values of type xs:QName as the option names, using an appropriate namespace. (See Options.)
It is implementation-defined which version of [The Unicode Standard] is supported, but it is recommended that the most recent version of Unicode be used. (See Strings, characters, and codepoints.)
[Definition] Some functions (such as fn:distinct-values, fn:unordered, map:keys, and map:for-each) produce results in an implementation-defined or implementation-dependent order. In such cases two calls with the same arguments are not guaranteed to produce the results in the same order. These functions are said to be nondeterministic with respect to ordering. (See Properties of functions.)
Where the results of a function are described as being (to a greater or lesser extent) implementation-defined or implementation-dependent, this does not by itself remove the requirement that the results should be deterministic: that is, that repeated calls with the same explicit and implicit arguments must return identical results. (See Properties of functions.)
In addition, the values of $input, typically serialized and converted to an xs:string, and $label (if supplied and non-empty) may be output to an implementation-defined destination. (See fn:trace.)
Consider a situation in which a user wants to investigate the actual value passed to a function. Assume that in a particular execution, $v is an xs:decimal with value 124.84. Writing fn:trace($v, 'the value of $v is:') will return $v. The processor may output "124.84" and "the value of $v is:" to an implementation-defined destination. (See fn:trace.)
Similar to fn:trace, the values of $input, typically serialized and converted to an xs:string, and $label (if supplied and non-empty) may be output to an implementation-defined destination. (See fn:message.)
They may provide an implementation-defined mechanism that allows users to choose between raising an error and returning a result that is modulo the largest representable integer value. See [ISO 10967]. (See Arithmetic operators on numeric values.)
For xs:decimal values, let N be the number of digits of precision supported by the implementation, and let M (M <= N) be the minimum limit on the number of digits required for conformance (18 digits for XSD 1.0, 16 digits for XSD 1.1). Then for addition, subtraction, and multiplication operations, the returned result should be accurate to N digits of precision, and for division and modulus operations, the returned result should be accurate to at least M digits of precision. The actual precision is implementation-defined. If the number of digits in the mathematical result exceeds the number of digits that the implementation retains for that operation, the result is truncated or rounded in an implementation-defined manner. (See Arithmetic operators on numeric values.)
The [IEEE 754-2019] specification also describes handling of two exception conditions called divideByZero and invalidOperation. The IEEE divideByZero exception is raised not only by a direct attempt to divide by zero, but also by operations such as log(0). The IEEE invalidOperation exception is raised by attempts to call a function with an argument that is outside the function’s domain (for example, sqrt(-1) or log(-1)). Although IEEE defines these as exceptions, it also defines “default non-stop exception handling” in which the operation returns a defined result, typically positive or negative infinity, or NaN. With this function library, these IEEE exceptions do not cause a dynamic error at the application level; rather they result in the relevant function or operator returning the defined non-error result. The underlying IEEE exception may be notified to the application or to the user by some implementation-defined warning condition, but the observable effect on an application using the functions and operators defined in this specification is simply to return the defined result (typically -INF, +INF, or NaN) with no error. (See Arithmetic operators on numeric values.)
The [IEEE 754-2019] specification distinguishes two NaN values: a quiet NaN and a signaling NaN. These two values are not distinguishable in the XDM model: the value spaces of xs:float and xs:double each include only a single NaN value. This does not prevent the implementation distinguishing them internally, and triggering different implementation-defined warning conditions, but such distinctions do not affect the observable behavior of an application using the functions and operators defined in this specification. (See Arithmetic operators on numeric values.)
The implementation may adopt a different algorithm provided that it is equivalent to this formulation in all cases where implementation-dependent or implementation-defined behavior does not affect the outcome, for example, the implementation-defined precision of the result of xs:decimal division. (See op:numeric-integer-divide.)
There may be implementation-defined limits on the precision available. If the requested $precision is outside this range, it should be adjusted to the nearest value supported by the implementation. (See fn:round.)
There may be implementation-defined limits on the precision available. If the requested $precision is outside this range, it should be adjusted to the nearest value supported by the implementation. (See fn:round-half-to-even.)
There may be implementation-defined limits on the precision available. If the requested $precision is outside this range, it should be adjusted to the nearest value supported by the implementation. (See fn:divide-decimals.)
XSD 1.1 allows the string +INF as a representation of positive infinity; XSD 1.0 does not. It is implementation-defined whether XSD 1.1 is supported. (See fn:number.)
Any other format token, which indicates a numbering sequence in which that token represents the number 1 (one) (but see the note below). It is implementation-defined which numbering sequences, additional to those listed above, are supported. If an implementation does not support a numbering sequence represented by the given token, it must use a format token of 1. (See fn:format-integer.)
For all format tokens other than a digit-pattern, there may be implementation-defined lower and upper bounds on the range of numbers that can be formatted using this format token; indeed, for some numbering sequences there may be intrinsic limits. For example, the format token U+2460 (CIRCLED DIGIT ONE, ①) has a range imposed by the Unicode character repertoire — zero to 20 in Unicode versions prior to 3.2, or zero to 50 in subsequent versions. For the numbering sequences described above any upper bound imposed by the implementation must not be less than 1000 (one thousand) and any lower bound must not be greater than 1. Numbers that fall outside this range must be formatted using the format token 1. (See fn:format-integer.)
The set of languages for which numbering is supported is implementation-defined. If the $language argument is absent, or is set to an empty sequence, or is invalid, or is not a language supported by the implementation, then the number is formatted using the default language from the dynamic context. (See fn:format-integer.)
...either a or t, to indicate alphabetic or traditional numbering respectively, the default being implementation-defined. (See fn:format-integer.)
The string of characters between the parentheses, if present, is used to select between other possible variations of cardinal or ordinal numbering sequences. The interpretation of this string is implementation-defined. No error occurs if the implementation does not define any interpretation for the defined string. (See fn:format-integer.)
It is implementation-defined what combinations of values of the format token, the language, and the cardinal/ordinal modifier are supported. If ordinal numbering is not supported for the combination of the format token, the language, and the string appearing in parentheses, the request is ignored and cardinal numbers are generated instead. (See fn:format-integer.)
The use of the a or t modifier disambiguates between numbering sequences that use letters. In many languages there are two commonly used numbering sequences that use letters. One numbering sequence assigns numeric values to letters in alphabetic sequence, and the other assigns numeric values to each letter in some other manner traditional in that language. In English, these would correspond to the numbering sequences specified by the format tokens a and i. In some languages, the first member of each sequence is the same, and so the format token alone would be ambiguous. In the absence of the a or t modifier, the default is implementation-defined. (See fn:format-integer.)
The static context provides a set of decimal formats. One of the decimal formats is unnamed, the others (if any) are identified by a QName. There is always an unnamed decimal format available, but its contents are implementation-defined. (See Defining a decimal format.)
IEEE states that the preferred quantum is language-defined. In this specification, it is implementation-defined. (See Trigonometric and exponential functions.)
IEEE defines various rounding algorithms for inexact results, and states that the choice of rounding direction, and the mechanisms for influencing this choice, are language-defined. In this specification, the rounding direction and any mechanisms for influencing it are implementation-defined. (See Trigonometric and exponential functions.)
The map returned by the fn:random-number-generator function may contain additional entries beyond those specified here, but it must match the record type defined above. The meaning of any additional entries is implementation-defined. To avoid conflict with any future version of this specification, the keys of any such entries should start with an underscore character. (See fn:random-number-generator.)
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted. (See fn:codepoints-to-string.)
If two query parameters use the same keyword then the last one wins. If a query parameter uses a keyword or value which is not defined in this specification then the meaning is implementation-defined. If the implementation recognizes the meaning of the keyword and value then it should interpret it accordingly. If it does not recognize the keyword or value, then it should reject the collation as unsupported if the fallback parameter is present with the value no; otherwise it should ignore the unrecognized parameter. (See The Unicode Collation Algorithm.)
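The parameter-handling rules above can be modelled outside XPath. The following is a minimal Python sketch, not part of the specification. It assumes that parameters are written as keyword=value pairs separated by semicolons after the `?` of the UCA collation URI, and the set of recognized keywords is a hypothetical subset chosen for illustration. Only unrecognized keywords are modelled, not unrecognized values.

```python
UCA_BASE = "http://www.w3.org/2013/collation/UCA"

# Hypothetical subset of keywords that this illustrative processor recognizes.
RECOGNIZED = {"fallback", "lang", "strength", "numeric"}


class UnsupportedCollation(Exception):
    """Raised when the collation URI has to be rejected as unsupported."""


def parse_uca_params(uri):
    """Return the recognized query parameters of a UCA collation URI."""
    base, _, query = uri.partition("?")
    if base != UCA_BASE:
        raise UnsupportedCollation(uri)

    params = {}
    for pair in filter(None, query.split(";")):  # assumed ';' separator
        keyword, _, value = pair.partition("=")
        params[keyword] = value  # a repeated keyword overwrites: last one wins

    # fallback=no asks for rejection rather than silent fallback.
    strict = params.get("fallback") == "no"

    accepted = {}
    for keyword, value in params.items():
        if keyword in RECOGNIZED:
            accepted[keyword] = value
        elif strict:
            raise UnsupportedCollation("unrecognized parameter: " + keyword)
        # otherwise the unrecognized parameter is ignored
    return accepted
```

With this sketch, a URI ending in `?lang=se;lang=de` yields `lang=de`, and an unknown keyword is dropped unless `fallback=no` is also present, in which case the URI is rejected.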
The following query parameters are defined. If any parameter is absent, the default is implementation-defined except where otherwise stated. The meaning given for each parameter is non-normative; the normative specification is found in [UTS #35]. (See The Unicode Collation Algorithm.)
Because the set of collations that are supported is implementation-defined, an implementation has the option to support all collation URIs, in which case it will never raise this error. (See Choosing a collation.)
The properties available are as defined for the Unicode Collation Algorithm (see 5.3.4 The Unicode Collation Algorithm). Additional implementation-defined properties may be specified as described in the rules for UCA collation URIs. (See fn:collation.)
It is possible to define collations that do not have the ability to generate collation keys. Supplying such a collation will cause the function to fail. The ability to generate collation keys is an implementation-defined property of the collation. (See fn:collation-key.)
Conforming implementations must support normalization form NFC and may support normalization forms NFD, NFKC, NFKD, and FULLY-NORMALIZED. They may also support other normalization forms with implementation-defined semantics. (See fn:normalize-unicode.)
It is implementation-defined which version of Unicode (and therefore, of the normalization algorithms and their underlying data) is supported by the implementation. See [UAX #15] for details of the stability policy regarding changes to the normalization rules in future versions of Unicode. If the input string contains codepoints that are unassigned in the relevant version of Unicode, or for which no normalization rules are defined, the fn:normalize-unicode function leaves such codepoints unchanged. If the implementation supports the requested normalization form then it must be able to handle every input string without raising an error. (See fn:normalize-unicode.)
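As an illustration of how these choices surface in practice, here is a minimal Python sketch (not XPath) built on the standard `unicodedata` module. That module offers NFC, NFD, NFKC, and NFKD, and is tied to one version of the Unicode Character Database. The function names `normalize` and `unicode_version` are hypothetical and used only for this example.

```python
import unicodedata

# Forms offered by this illustrative processor; NFC is the mandatory one.
SUPPORTED_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize(text, form="NFC"):
    """Normalize text, rejecting forms this processor does not support."""
    if form not in SUPPORTED_FORMS:
        raise ValueError("unsupported normalization form: " + form)
    # For a supported form, every input string is handled without error.
    return unicodedata.normalize(form, text)


def unicode_version():
    """Report which Unicode version backs the normalization data."""
    return unicodedata.unidata_version
```

For example, `normalize("e\u0301")` composes the letter and the combining acute accent into the single codepoint U+00E9, while a request for a form outside `SUPPORTED_FORMS` is refused.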
It is possible to define collations that do not have the ability to decompose a string into units suitable for substring matching. An argument to a function defined in this section may be a URI that identifies a collation that is able to compare two strings, but that does not have the capability to split the string into collation units. Such a collation may cause the function to fail, or to give unexpected results, or it may be rejected as an unsuitable argument. The ability to decompose strings into collation units is an implementation-defined property of the collation. The fn:collation-available function can be used to ask whether a particular collation has this property. (See Functions based on substring matching.)
The result of the function will always be such that validation against this schema would succeed. However, it is implementation-defined whether the result is typed or untyped, that is, whether the elements and attributes in the returned tree have type annotations that reflect the result of validating against this schema. (See fn:analyze-string.)
Some URI schemes are hierarchical and some are non-hierarchical. Implementations must treat the following schemes as non-hierarchical: jar, mailto, news, tag, tel, and urn. Whether additional schemes are known to be non-hierarchical is implementation-defined. If a scheme is not known to be non-hierarchical, it must be treated as hierarchical. (See Parsing and building URIs.)
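The classification rule can be sketched as follows in Python. This is an illustration only: the function name, the `extra_non_hierarchical` parameter (standing in for any implementation-defined additions), and the lower-casing of the scheme are assumptions of the sketch.

```python
# Schemes that every implementation must treat as non-hierarchical.
REQUIRED_NON_HIERARCHICAL = frozenset(
    {"jar", "mailto", "news", "tag", "tel", "urn"}
)


def is_hierarchical(scheme, extra_non_hierarchical=frozenset()):
    """Return True unless the scheme is known to be non-hierarchical."""
    known = REQUIRED_NON_HIERARCHICAL | frozenset(extra_non_hierarchical)
    # Assumption: schemes are compared case-insensitively.
    return scheme.lower() not in known
```

An unknown scheme falls through to the hierarchical default, so `is_hierarchical("data")` is true unless the implementation lists `data` among its additional schemes.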
If the omit-default-ports option is true, the port is discarded and set to the empty sequence if the port number is the same as the default port for the given scheme. Implementations should recognize the default ports for http (80), https (443), ftp (21), and ssh (22). Exactly which ports are recognized is implementation-defined. (See fn:parse-uri.)
If the omit-default-ports option is true then the $port is set to the empty sequence if the port number is the same as the default port for the given scheme. Implementations should recognize the default ports for http (80), https (443), ftp (21), and ssh (22). Exactly which ports are recognized is implementation-defined. (See fn:build-uri.)
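A minimal Python sketch of the omit-default-ports behaviour follows. The table holds only the four ports named above. The function name is hypothetical, and `None` stands in for the empty sequence.

```python
# Default ports that implementations should recognize; a real processor
# may recognize more (that set is implementation-defined).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "ssh": 22}


def effective_port(scheme, port, omit_default_ports=False):
    """Return the port to keep, or None when it is a discarded default."""
    if (
        omit_default_ports
        and port is not None
        and DEFAULT_PORTS.get(scheme) == port
    ):
        return None  # stands in for the empty sequence
    return port
```

A port is dropped only when the option is set and the scheme's default is known, so `effective_port("https", 443, True)` gives `None` while `effective_port("https", 8443, True)` keeps 8443.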
Processors may support a greater range and/or precision. The limits are implementation-defined. (See Limits and precision.)
Similarly, a processor may be unable to represent accurately the result of dividing a duration by 2, or multiplying a duration by 0.5. A processor that limits the precision of the seconds component of duration values must deliver a result that is as close as possible to the mathematically precise result, given these limits; if two values are equally close, the one that is chosen is implementation-defined. (See Limits and precision.)
All conforming processors must support year values in the range 1 to 9999, and a minimum fractional second precision of 1 millisecond or three digits (i.e., s.sss). However, processors may set larger implementation-defined limits on the maximum number of digits they support in these two situations. Processors may also choose to support the year 0 and years with negative values. The results of operations on dates that cross the year 0 are implementation-defined. (See Limits and precision.)
Similarly, a processor that limits the precision of the seconds component of date and time or duration values may need to deliver a rounded result for arithmetic operations. Such a processor must deliver a result that is as close as possible to the mathematically precise result, given these limits: if two values are equally close, the one that is chosen is implementation-defined. (See Limits and precision.)
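To make the tie case concrete, consider a processor limited to millisecond precision that divides PT0.003S by 2. The exact result, 0.0015 seconds, lies midway between 0.001 and 0.002. The Python sketch below (an illustration, with a hypothetical function name and a `fraction_digits` parameter standing in for the processor's limit) lists the representable values closest to an exact result. When it returns two values, either is a permitted answer.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR


def closest_representable(seconds, fraction_digits=3):
    """Return the representable value(s) nearest to an exact Decimal result."""
    quantum = Decimal(1).scaleb(-fraction_digits)  # 0.001 for three digits
    lower = seconds.quantize(quantum, rounding=ROUND_FLOOR)
    upper = seconds.quantize(quantum, rounding=ROUND_CEILING)
    if lower == upper:
        return [lower]  # already representable
    below, above = seconds - lower, upper - seconds
    if below < above:
        return [lower]
    if above < below:
        return [upper]
    return [lower, upper]  # a tie: the choice is implementation-defined
```

Calling `closest_representable(Decimal("0.0015"))` returns both neighbours, whereas `Decimal("0.0014")` has the single closest value 0.001.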
...the format token n, N, or Nn, indicating that the value of the component is to be output by name, in lower-case, upper-case, or title-case respectively. Components that can be output by name include (but are not limited to) months, days of the week, timezones, and eras. If the processor cannot output these components by name for the chosen calendar and language then it must use an implementation-defined fallback representation. (See The picture string.)
...indicates alphabetic or traditional numbering respectively, the default being implementation-defined. This has the same meaning as in the second argument of fn:format-integer. (See The picture string.)
The sequence of characters in the (adjusted) first presentation modifier is reversed (for example, 999'### becomes ###'999). If the result is not a valid decimal digit pattern, then the output is implementation-defined. (See Formatting Fractional Seconds.)
The output for these components is entirely implementation-defined. The default presentation modifier for these components is n, indicating that they are output as names (or conventional abbreviations), and the chosen names will in many cases depend on the chosen language: see 10.8.4.8 The language, calendar, and place arguments. (See Formatting Other Components.)
The set of languages, calendars, and places that are supported in the date formatting functions is implementation-defined. When any of these arguments is omitted or is an empty sequence, an implementation-defined default value is used. (See The language, calendar, and place arguments.)
The choice of the names and abbreviations used in any given language is implementation-defined. For example, one implementation might abbreviate July as Jul while another uses Jly. In German, one implementation might represent Saturday as Samstag while another uses Sonnabend. Implementations may provide mechanisms allowing users to control such choices. (See The language, calendar, and place arguments.)
The choice of the names and abbreviations used in any given language for calendar units such as days of the week and months of the year is implementation-defined. (See The language, calendar, and place arguments.)
The calendar value if present must be a valid EQName (dynamic error: [err:FOFD1340]). If it is a lexical QName then it is expanded into an expanded QName using the statically known namespaces; if it has no prefix then it represents an expanded-QName in no namespace. If the expanded QName is in no namespace, then it must identify a calendar with a designator specified below (dynamic error: [err:FOFD1340]). If the expanded QName is in a namespace then it identifies the calendar in an implementation-defined way. (See The language, calendar, and place arguments.)
At least one of the above calendars must be supported. It is implementation-defined which calendars are supported. (See The language, calendar, and place arguments.)
The requirement to deliver a deterministic result has performance implications, and for this reason implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call of the function must either return a deterministic result or must raise a dynamic error [err:FODC0003]. (See fn:doc.)
Various aspects of this processing are implementation-defined. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:... (See fn:doc.)
It is implementation-defined whether DTD validation and/or schema validation is applied to the source document. (See fn:doc.)
The effect of a fragment identifier in the supplied URI is implementation-defined. One possible interpretation is to treat the fragment identifier as an ID attribute value, and to return a document node having the element with the selected ID value as its only child. (See fn:doc.)
By default, this function is deterministic. This means that repeated calls on the function with the same argument will return the same result. However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call to this function must either return a deterministic result or must raise a dynamic error [err:FODC0003]. (See fn:collection.)
By default, this function is deterministic. This means that repeated calls on the function with the same argument will return the same result. However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call to this function must either return a deterministic result or must raise a dynamic error [err:FODC0003]. (See fn:uri-collection.)
It is no longer automatically an error if the resource (after decoding) contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted. (See fn:unparsed-text.)
The processor may use implementation-defined heuristics to determine the likely encoding. (See fn:unparsed-text.)
The fact that the resolution of URIs is defined by a mapping in the dynamic context means that in effect, various aspects of the behavior of this function are implementation-defined. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:... (See fn:unparsed-text.)
The fact that the resolution of URIs is defined by a mapping in the dynamic context means that in effect, various aspects of the behavior of this function are implementation-defined. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:... (See fn:unparsed-binary.)
The collation used for matching names is implementation-defined, but must be the same as the collation used to ensure that the names of all environment variables are unique. (See fn:environment-variable.)
Except to the extent defined by these options, the precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used. (See fn:parse-xml.)
Options set in $options may be supplemented or modified based on configuration options defined externally using implementation-defined mechanisms. (See fn:parse-xml.)
Except as explicitly defined, the precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used. (See fn:parse-xml-fragment.)
If the second argument is omitted, or is supplied in the form of an output:serialization-parameters element, then the values of any serialization parameters that are not explicitly specified are implementation-defined, and may depend on the context. (See fn:serialize.)
A list of target namespaces identifying schema components to be used for validation. The way in which the processor locates schema components for the specified target namespaces is implementation-defined. A zero-length string denotes a no-namespace schema.... (See fn:xsd-validator.)
Set to the decimal value 1.0 or 1.1 to indicate which version of XSD is to be used. The default is implementation-defined. A processor may use a later version of XSD than the version requested, but must not use an earlier version.... (See fn:xsd-validator.)
The XSD specification allows a schema to be used for validation even when it contains unresolved references to absent schema components. It is implementation-defined whether this function allows the schema to be incomplete in this way. For example, some processors might allow validation using a schema in which an element declaration contains a reference to a type declaration that is not present in the schema, provided that the element declaration is never needed in the course of a particular validation episode. (See fn:xsd-validator.)
...error-details as map(*)*. This field is present only when (a) the option return-error-details was set to true, and (b) the supplied document was found to be invalid. The value is a sequence of maps, each containing details of one invalidity that was found. The precise details of the invalidities are implementation-defined, but they may include the following fields, if the information is available:... (See fn:xsd-validator.)
Because the [DOM: Living Standard] and [HTML: Living Standard] are not fixed, it is implementation-defined which versions are used. (See XDM Mapping from HTML DOM Nodes.)
If an implementation allows these nodes to be passed in via an API or similar mechanism, their behaviour is implementation-defined. (See XDM Mapping from HTML DOM Nodes.)
If the local name contains a character that is not a valid XML NameStartChar or NameChar, then an implementation-defined replacement string is used. The result must be a valid NCName. (See node-name Accessor.)
The default behaviour is implementation-defined. (See fn:parse-html.)
The input may contain deviations from the grammar of [RFC 7159], which are handled in an implementation-defined way. (Note: some popular extensions include allowing quotes on keys to be omitted, allowing a comma to appear after the last item in an array, allowing leading zeroes in numbers, and allowing control characters such as tab and newline to be present in unescaped form.) Since the extensions accepted are implementation-defined, an error may be raised [err:FOJS0001] if the input does not conform to the grammar. (See fn:parse-json.)
The supplied function is called to process the string value of any JSON number in the input. By default, numbers are processed by converting to xs:double using the XPath casting rules. Supplying the value xs:decimal#1 will instead convert to xs:decimal (which potentially retains more precision, but disallows exponential notation), while supplying a function that casts to (xs:decimal | xs:double) will treat the value as xs:decimal if there is no exponent, or as xs:double otherwise. Supplying the value fn:identity#1 causes the value to be retained unchanged as an xs:untypedAtomic. If the liberal option is false (the default), then the supplied number-parser is called if and only if the value conforms to the JSON grammar for numbers (for example, a leading plus sign and redundant leading zeroes are not allowed). If the liberal option is true then it is also called if the value conforms to an implementation-defined extension of this grammar. (See fn:parse-json.)
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted. (See fn:json-doc.)
The input may contain deviations from the grammar of [RFC 7159], which are handled in an implementation-defined way. (Note: some popular extensions include allowing quotes on keys to be omitted, allowing a comma to appear after the last item in an array, allowing leading zeroes in numbers, and allowing control characters such as tab and newline to be present in unescaped form.) Since the extensions accepted are implementation-defined, an error may be raised (see below) if the input does not conform to the grammar. (See fn:json-to-xml.)
Default: Implementation-defined. (See fn:json-to-xml.)
Indicates that the resulting XDM instance must be typed; that is, the element and attribute nodes must carry the type annotations that result from validation against the schema given at D.2 Schema for the result of fn:json-to-xml, or against an implementation-defined schema if the liberal option has the value true. (See fn:json-to-xml.)
The result of the function will always be such that validation against this schema would succeed. However, it is implementation-defined whether the result is typed or untyped, that is, whether the elements and attributes in the returned tree have type annotations that reflect the result of validating against this schema. (See fn:csv-to-xml.)
Additional, implementation-defined options may be available, for example, to control aspects of the XML serialization, to specify the grammar start symbol, or to produce output formats other than XML. (See fn:invisible-xml.)
If the arguments to fn:function-lookup identify a function that is present in the static context of the function call, the function will always return the same function that a static reference to this function would bind to. If there is no such function in the static context, then the results depend on what is present in the dynamic context, which is implementation-defined. (See fn:function-lookup.)
Default: The version given in the prolog of the library module; or implementation-defined if this is absent. (See fn:load-xquery-module.)
A sequence of URIs (in the form of xs:string values) which may be used or ignored in an implementation-defined way.... (See fn:load-xquery-module.)
Values for vendor-defined configuration options for the XQuery processor used to process the request. The key is the name of an option, expressed as a QName: the namespace URI of the QName should be a URI controlled by the vendor of the XQuery processor. The meaning of the associated value is implementation-defined. Implementations should ignore options whose names are in an unrecognized namespace. The option parameter conventions do not apply to this contained map.... (See fn:load-xquery-module.)
It is implementation-defined whether constructs in the library module are evaluated in the same execution scope as the calling module. (See fn:load-xquery-module.)
The library module that is loaded may import schema declarations using an import schema declaration. It is implementation-defined whether schema components in the in-scope schema definitions of the calling module are automatically added to the in-scope schema definitions of the dynamically loaded module. The in-scope schema definitions of the calling and called modules must be consistent, according to the rules defined in Section 2.2.5 Consistency Constraints XQ31. (See fn:load-xquery-module.)
Default: Implementation-defined. (See fn:transform.)
If the implementation provides a way of writing or invoking functions with side-effects, this post-processing function might be used to save a copy of the result document to persistent storage. For example, if the implementation provides access to the EXPath File library [EXPath], then a serialized document might be written to filestore by calling the file:write function. Similar mechanisms might be used to issue an HTTP POST request that posts the result to an HTTP server, or to send the document to an email recipient. The semantics of calling functions with side-effects are entirely implementation-defined. (See fn:transform.)
Calls to fn:transform can potentially have side-effects even in the absence of the post-processing option, because the XSLT specification allows a stylesheet to invoke extension functions that have side-effects. The semantics in this case are implementation-defined. (See fn:transform.)
A string intended to be used as the static base URI of the principal stylesheet module. This value must be used if no other static base URI is available. If the supplied stylesheet already has a base URI (which will generally be the case if the stylesheet is supplied using stylesheet-node or stylesheet-location) then it is implementation-defined whether this parameter has any effect. If the value is a relative reference, it is resolved against the executable base URIXP of the fn:transform function call.... (See fn:transform.)
Values for vendor-defined configuration options for the XSLT processor used to process the request. The key is the name of an option, expressed as a QName: the namespace URI of the QName should be a URI controlled by the vendor of the XSLT processor. The meaning of the associated value is implementation-defined. Implementations should ignore options whose names are in an unrecognized namespace. Default is an empty map.... (See fn:transform.)
It is implementation-defined whether the XSLT transformation is executed within the same execution scope as the calling code. (See fn:transform.)
XSLT 1.0 does not define any error codes, so this is the likely outcome with an XSLT 1.0 processor. XSLT 2.0 and 3.0 do define error codes, but some APIs do not expose them. If multiple errors are signaled by the transformation (which is most likely to happen with static errors) then the error code should where possible be that of one of these errors, chosen arbitrarily; the processor may make details of additional errors available to the application in an implementation-defined way. (See fn:transform.)
It is to some extent implementation-defined whether two maps or arrays have the same function identity. Processors should ensure as a minimum that when a variable $m is bound to a map or array, calling JNode($m) more than once (with the same variable reference) will deliver the same JNode each time. (See fn:JNode.)
If ST is xs:float or xs:double, then TV is the xs:decimal value, within the set of xs:decimal values that the implementation is capable of representing, that is numerically closest to SV. If two values are equally close, then the one that is closest to zero is chosen. If SV is too large to be accommodated as an xs:decimal (see [XML Schema Part 2: Datatypes Second Edition] for implementation-defined limits on numeric values), a dynamic error is raised [err:FOCA0001]. If SV is one of the special xs:float or xs:double values NaN, INF, or -INF, a dynamic error is raised [err:FOCA0002]. (See Casting to xs:decimal.)
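The rule "nearest representable value, ties toward zero" can be modelled with Python's `decimal` module, whose `ROUND_HALF_DOWN` mode rounds to nearest with ties going toward zero. This is a sketch under stated assumptions: the set of representable xs:decimal values is modelled as a fixed number of fractional digits (the hypothetical `fraction_digits` parameter), the default 28-digit context precision stands in for the implementation's size limit, and the error codes appear only as exception messages.

```python
import math
from decimal import Decimal, InvalidOperation, ROUND_HALF_DOWN


def double_to_decimal(sv, fraction_digits=6):
    """Model casting an xs:double to a decimal of limited precision."""
    if math.isnan(sv) or math.isinf(sv):
        raise ValueError("FOCA0002")  # NaN, INF and -INF cannot be cast
    quantum = Decimal(1).scaleb(-fraction_digits)
    try:
        # Decimal(sv) is the exact binary value of the double; quantize
        # picks the nearest representable value, ties toward zero.
        return Decimal(sv).quantize(quantum, rounding=ROUND_HALF_DOWN)
    except InvalidOperation:
        # Result needs more digits than this model's 28-digit limit.
        raise OverflowError("FOCA0001")
```

With no fractional digits available, 2.5 becomes 2 and -2.5 becomes -2, since both neighbours are equally close and the one nearer zero wins.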
In casting to xs:decimal or to a type derived from xs:decimal, if the value is not too large or too small but nevertheless cannot be represented accurately with the number of decimal digits available to the implementation, the implementation may round to the nearest representable value or may raise a dynamic error [err:FOCA0006]. The choice of rounding algorithm and the choice between rounding and error behavior is implementation-defined. (See Casting from xs:string and xs:untypedAtomic.)
If ST is xs:decimal, xs:float or xs:double, then TV is SV with the fractional part discarded and the value converted to xs:integer. Thus, casting 3.1456 returns 3 while -17.89 returns -17. Casting 3.124E1 returns 31. If SV is too large to be accommodated as an integer (see [XML Schema Part 2: Datatypes Second Edition] for implementation-defined limits on numeric values), a dynamic error is raised [err:FOCA0003]. If SV is one of the special xs:float or xs:double values NaN, INF, or -INF, a dynamic error is raised [err:FOCA0002]. (See Casting to xs:integer.)
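Discarding the fractional part is truncation toward zero, not rounding down. The three examples above can be reproduced with the following Python sketch. It is illustrative only: the function name is hypothetical, the error code appears only as an exception message, and the FOCA0003 overflow case is not modelled because Python integers are unbounded.

```python
import math


def to_xs_integer(sv):
    """Model casting a numeric value to xs:integer by truncation."""
    if math.isnan(sv) or math.isinf(sv):
        raise ValueError("FOCA0002")  # NaN, INF and -INF cannot be cast
    return math.trunc(sv)  # drops the fractional part, toward zero
```

Note that `to_xs_integer(-17.89)` yields -17, where a floor operation would give -18.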
The tz timezone database, available at http://www.iana.org/time-zones. It is implementation-defined which version of the database is used. (See IANA Timezone Database.)
Unicode Standard Annex #15: Unicode Normalization Forms. Ed. Mark Davis and Ken Whistler, Unicode Consortium. The current version is 9.0.0, dated 2016-02-24. As with [The Unicode Standard], the version to be used is implementation-defined. Available at: http://www.unicode.org/reports/tr15/. (See UAX #15.)
Unicode Standard Annex #29: Unicode Text Segmentation. Ed. Josh Hadley, Unicode Consortium. The current version is 15.1.0, dated 2023-08-16. As with [The Unicode Standard], the version to be used is implementation-defined. Available at: http://www.unicode.org/reports/tr29/. (See UAX #29.)
The Unicode Consortium, Reading, MA, Addison-Wesley, 2016. The Unicode Standard as updated from time to time by the publication of new versions. See http://www.unicode.org/standard/versions/ for the latest version and additional information on versions of the standard and of the Unicode Character Database. The version of Unicode to be used is implementation-defined, but implementations are recommended to use the latest Unicode version; currently, Version 9.0.0. (See The Unicode Standard.)
Unicode Technical Standard #10: Unicode Collation Algorithm. Ed. Mark Davis and Ken Whistler, Unicode Consortium. The current version is 9.0.0, dated 2016-05-18. As with [The Unicode Standard], the version to be used is implementation-defined. Available at: http://www.unicode.org/reports/tr10/. (See UTS #10.)
Unicode Technical Standard #35: Unicode Locale Data Markup Language. Ed. Mark Davis et al., Unicode Consortium. The current version is 29, dated 2016-03-15. As with [The Unicode Standard], the version to be used is implementation-defined. Available at: http://www.unicode.org/reports/tr35/. (See UTS #35.)
PR 1620 1886
Options are added to customize the form of the output.
See 2.2.6 fn:path
PR 1547 1551
New in 4.0
PR 629 803
New in 4.0
See 3.2.2 fn:message
PR 1260 1275
A third argument has been added, providing control over the rounding mode.
See 4.4.4 fn:round
New in 4.0
See 4.4.7 fn:is-NaN
PR 1049 1151
Decimal format parameters can now be supplied directly as a map in the third argument, rather than referencing a format defined in the static context.
PR 1205 1230
New in 4.0
See 4.8.2 math:e
See 4.8.16 math:sinh
See 4.8.17 math:cosh
See 4.8.18 math:tanh
The 3.1 specification suggested that every value in the result range should have the same chance of being chosen. This has been corrected to say that the distribution should be arithmetically uniform (because there are as many xs:double values between 0.01 and 0.1 as there are between 0.1 and 1.0).
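The parenthetical claim can be checked by counting xs:double values directly. For non-negative IEEE 754 doubles, the 64-bit pattern read as an integer increases by one from each value to the next, so subtracting two patterns counts the values between them. The Python sketch below relies on that property. The helper names are hypothetical.

```python
import struct


def ordinal(x):
    """Bit pattern of a non-negative double, read as a signed integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def doubles_between(lo, hi):
    """Number of double values in the half-open range [lo, hi)."""
    return ordinal(hi) - ordinal(lo)


low_decade = doubles_between(0.01, 0.1)
high_decade = doubles_between(0.1, 1.0)
ratio = low_decade / high_decade
```

The two counts agree to within a few percent even though the second interval is ten times wider, so giving every xs:double value the same chance would heavily favour small numbers. An arithmetically uniform distribution avoids that bias.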
PR 261 306 993
New in 4.0
See 5.4.1 fn:char
New in 4.0
PR 937 995 1190
New in 4.0
See 5.4.13 fn:hash
New in 4.0
PR 1423 1413
New in 4.0
New in 4.0
Reformulated in 4.0 in terms of the new fn:in-scope-namespaces function; the semantics are unchanged.
Reformulated in 4.0 in terms of the new fn:in-scope-namespaces function; the semantics are unchanged.
New in 4.0
New in 4.0
See 14.1.12 fn:slice
New in 4.0. The function is identical to the internal op:same-key function in 3.1.
PR 1120 1150
A callback function can be supplied for comparing individual items.
Changed in 4.0 to use transitive equality comparisons for numeric values.
PR 614 987
New in 4.0
New in 4.0. Originally proposed under the name fn:uniform.
New in 4.0. Originally proposed under the name fn:unique.
PR 1117 1279
The $options parameter has been added.
New in 4.0
PR 259 956
A new function is available for processing input data in HTML format.
See 15.2 Functions on HTML Data
New in 4.0
An option is provided to control how JSON numbers should be formatted.
Additional options are available, as defined by fn:parse-json.
New in 4.0
New in 4.0
New in 4.0
New in 4.0
See 17.2.3 fn:every
New in 4.0
New in 4.0
New in 4.0
New in 4.0
New in 4.0
See 17.2.16 fn:some
PR 521 761
New in 4.0
New in 4.0
New in 4.0
PR 478 515
New in 4.0
New in 4.0
New in 4.0
See 18.4.15 map:pair
New in 4.0
PR 1575 1906
A new function fn:element-to-map is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input.
New in 4.0
PR 968 1295
New in 4.0
PR 476 1087
New in 4.0
PR 360 476
New in 4.0
New in 4.0
New in 4.0
Supplying an empty sequence as the value of an optional argument is equivalent to omitting the argument.
PR 533 719 834
New functions are available for processing input data in CSV (comma separated values) format.
PR 289 1901
A third argument is added, allowing user control of how absent keys should be handled.
See 18.4.9 map:get
A third argument is added, allowing user control of how index-out-of-bounds conditions should be handled.
A new collation URI is defined for Unicode case-insensitive comparison and ordering.
The specification now describes how an initial BOM should be handled.
PR 1727 1740
It is no longer guaranteed that the new key replaces the existing key.
See 18.4.17 map:put
PR 173
New in 4.0
See 17.3.4 fn:op
PR 203
New in 4.0
See 18.4.1 map:build
PR 207
New in 4.0
PR 222
New in 4.0
See 14.2.7 fn:starts-with-subsequence
PR 250
New in 4.0
See 14.1.3 fn:foot
See 14.1.15 fn:trunk
PR 258
New in 4.0
PR 313
The second argument can now be a sequence of integers.
See 14.1.8 fn:remove
PR 314
New in 4.0
PR 326
Higher-order functions are no longer an optional feature.
See 1.2 Conformance
PR 419
New in 4.0
PR 434
New in 4.0
The function has been extended to allow output in a radix other than 10, for example in hexadecimal.
PR 482
Deleted an inaccurate statement concerning the behavior of NaN.
PR 507
New in 4.0
PR 546
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
See 5.2.1 fn:codepoints-to-string
It is no longer automatically an error if the resource (after decoding) contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
The rules regarding use of non-XML characters in JSON texts have been relaxed.
See 15.3.3 JSON character repertoire
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
PR 631
New in 4.0
PR 662
Constructor functions now have a zero-arity form; the first argument defaults to the context item.
PR 680
The case-insensitive collation is now defined normatively within this specification, rather than by reference to the HTML "living specification", which is subject to change. The collation can now be used for ordering comparisons as well as equality comparisons.
PR 702
The function can now take any number of arguments (previously it had to be two or more), and the arguments can be sequences of strings rather than single strings.
See 5.4.4 fn:concat
PR 710
Changes the function to return a sequence of key-value pairs rather than a map.
PR 727
It has been clarified that loading a module has no effect on the static or dynamic context of the caller.
PR 795
New in 4.0
PR 828
The $predicate callback function accepts an optional position argument.
See 17.2.4 fn:filter
The $action callback function accepts an optional position argument.
The $predicate callback function now accepts an optional position argument.
The $action callback function now accepts an optional position argument.
PR 881
The way that fn:min and fn:max compare numeric values of different types has changed. The most noticeable effect is that when these functions are applied to a sequence of xs:integer or xs:decimal values, the result is an xs:integer or xs:decimal, rather than the result of converting this to an xs:double.
See 14.4.3 fn:max
See 14.4.4 fn:min
PR 901
All three arguments are now optional, and each argument can be set to an empty sequence. Previously if $description was supplied, it could not be empty.
See 3.1.1 fn:error
The $label argument can now be set to an empty sequence. Previously if $label was supplied, it could not be empty.
See 3.2.1 fn:trace
The third argument can now be supplied as an empty sequence.
The second argument can now be an empty sequence.
The optional second argument can now be supplied as an empty sequence.
The 3rd, 4th, and 5th arguments are now optional; previously the function required either 2 or 5 arguments.
The optional third argument can now be supplied as an empty sequence.
PR 905
The rule that multiple calls on fn:doc supplying the same absolute URI must return the same document node has been clarified; in particular the rule does not apply if the dynamic context for the two calls requires different processing of the documents (such as schema validation or whitespace stripping).
See 14.6.1 fn:doc
PR 909
The function has been expanded in scope to handle comparison of values other than strings.
PR 924
Rules have been added clarifying that users should not be allowed to change the schema for the fn namespace.
See D Schemas
PR 925
The decimal format name can now be supplied as a value of type xs:QName, as an alternative to supplying a lexical QName as an instance of xs:string.
PR 932
The specification now prescribes a minimum precision and range for durations.
PR 933
When comments and processing instructions are ignored, any text nodes either side of the comment or processing instruction are now merged prior to comparison.
PR 940
New in 4.0
PR 953
Constructor functions for named record types have been introduced.
PR 962
New in 4.0
PR 969
New in 4.0
See 18.4.3 map:empty
PR 984
New in 4.0
See 9.4.1 fn:seconds
PR 987
The order of results is now prescribed; it was previously implementation-dependent.
PR 988
New in 4.0
See 15.3.8 fn:pin
See 15.3.9 fn:label
PR 1022
Regular expressions can include comments (starting and ending with #) if the c flag is set.
See 6.1 Regular expression syntax
See 6.2 Flags
PR 1028
An option is provided to control how the JSON null value should be handled.
PR 1032
New in 4.0
See 14.1.17 fn:void
PR 1046
New in 4.0
PR 1059
Use of an option keyword that is not defined in the specification and is not known to the implementation now results in a dynamic error; previously it was ignored.
See 1.7 Options
PR 1068
New in 4.0
PR 1072
The return type is now specified more precisely.
PR 1090
When casting from a string to a duration or time or dateTime, it is now specified that when there are more digits in the fractional seconds than the implementation is able to retain, excess digits are truncated. Rounding upwards (which could affect the number of minutes or hours in the value) is not permitted.
PR 1093
New in 4.0
PR 1117
The $options parameter has been added.
PR 1182
The $predicate callback function may return an empty sequence (meaning false).
See 17.2.3 fn:every
See 17.2.4 fn:filter
See 17.2.16 fn:some
PR 1191
New in 4.0
See 2.3.1 fn:distinct-ordered-nodes
The $options parameter has been added, absorbing the $collation parameter.
PR 1250
For selected properties including percent and exponent-separator, it is now possible to specify a single-character marker to be used in the picture string, together with a multi-character rendition to be used in the formatted output.
PR 1257
The $options parameter has been added.
PR 1262
New in 4.0
PR 1265
The constraints on the result of the function have been relaxed.
PR 1280
As a result of changes to the coercion rules, the number of supplied arguments can be greater than the number required: extra arguments are ignored.
See 17.2.1 fn:apply
PR 1288
Additional error conditions have been defined.
PR 1296
New in 4.0
PR 1333
A new option is provided to allow the content of the loaded module to be supplied as a string.
PR 1353
An option has been added to suppress the escaping of the solidus (forwards slash) character.
PR 1358
New in 4.0
PR 1361
The term atomic value has been replaced by atomic item.
See 1.9 Terminology
PR 1393
Changes the function to return a sequence of key-value pairs rather than a map.
PR 1409
This section now uses the term primitive type strictly to refer to the 20 atomic types that are not derived by restriction from another atomic type: that is, the 19 primitive atomic types defined in XSD, plus xs:untypedAtomic. The three types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration, which have custom casting rules but are not strictly-speaking primitive, are now handled in other subsections.
See 23.1 Casting from primitive types to primitive types
The rules for conversion of dates and times to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since these deliver exactly the same result as the XPath 3.1 rules.
See 23.1.2.2 Casting date/time values to xs:string
The rules for conversion of durations to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since the XSD 1.1 rules deliver exactly the same result as the XPath 3.1 rules.
See 23.1.2.3 Casting xs:duration values to xs:string
PR 1455
Numbers now retain their original lexical form, except for any changes needed to satisfy JSON syntax rules (for example, stripping leading zero digits).
PR 1473
New in 4.0
PR 1481
The function has been extended to handle other Gregorian types such as xs:gYearMonth.
See 10.5.1 fn:year-from-dateTime
See 10.5.2 fn:month-from-dateTime
The function has been extended to handle other Gregorian types such as xs:gMonthDay.
See 10.5.3 fn:day-from-dateTime
The function has been extended to handle other types including xs:time.
See 10.5.4 fn:hours-from-dateTime
See 10.5.5 fn:minutes-from-dateTime
The function has been extended to handle other types such as xs:gYearMonth.
PR 1504
New in 4.0
Optional $separator added.
PR 1523
New functions are provided to obtain information about built-in types and types defined in an imported schema.
New in 4.0
See 21.1.2 fn:schema-type
PR 1545
New in 4.0
PR 1565
The default for the escape option has been changed to false. The 3.1 specification gave the default value as true, but this appears to have been an error, since it was inconsistent with examples given in the specification and with tests in the test suite.
PR 1570
New in 4.0
PR 1587
New in 4.0
PR 1611
The spec has been corrected to note that the function depends on the implicit timezone.
PR 1671
New in 4.0
PR 1703
The order of entries in maps is retained.
Ordered maps are introduced.
Enhanced to allow for ordered maps.
See 18.4.7 map:find
See 18.4.17 map:put
PR 1711
It is explicitly stated that the limits for $precision are implementation-defined.
See 4.4.4 fn:round
PR 1727
For consistency with the new functions map:build and map:of-pairs, the handling of duplicates may now be controlled by supplying a user-defined callback function as an alternative to the fixed values for the earlier duplicates option.
PR 1734
In 3.1, given a mixed input sequence such as (1, 3, 4.2e0), the specification was unclear whether it was permitted to add the first two integer items using integer arithmetic, rather than converting all items to doubles before performing any arithmetic. The 4.0 specification is clear that this is permitted; but since the items can be reordered before being added, this is not required.
See 14.4.2 fn:avg
See 14.4.5 fn:sum
PR 1825
New in 4.0
PR 1856
Word boundaries can be matched. Lookahead and lookbehind assertions are supported. Assertions (including ^ and $) can no longer be followed by a quantifier.
See 6.1 Regular expression syntax
It is now permitted for the regular expression to match a zero-length string.
See 6.3.2 fn:replace
The output of the function is extended to allow the representation of captured groups found within lookahead assertions.
It is now permitted for the regular expression to match a zero-length string.
PR 1879
Additional options to control DTD and XInclude processing have been added.
PR 1897
The $replacement argument can now be a function that computes the replacement strings.
See 6.3.2 fn:replace
PR 1906
New in 4.0
See 18.5.10 fn:element-to-map-plan
New in 4.0
PR 1910
An $options parameter is added. Note that the rules for the $options parameter control aspects of processing that were implementation-defined in earlier versions of this specification. An implementation may provide configuration options designed to retain backwards-compatible behavior when no explicit options are supplied.
See 14.6.1 fn:doc
PR 1991
Named record types used in the signatures of built-in functions are now available as standard in the static context.
PR 2001
New in 4.0
PR 2013
Support for binary input has been added.
See 15.1.2 fn:parse-xml-fragment
New in 4.0
Support for binary input has been added.
New in 4.0
PR 2030
This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned.
PR 2031
Introduced the concept of JNodes.
New in 4.0
See 20.1.1 fn:JNode
This section summarizes the extent to which this specification is compatible with previous versions.
Version 4.0 of this function library is fully backwards compatible with version 3.1, except as noted below:
In fn:deep-equal, and in other functions such as fn:distinct-values that refer to fn:deep-equal, the rules for comparing values of different numeric types (for example, xs:double and xs:decimal) have changed. In previous versions of the specification, xs:decimal values were converted to xs:double, leading to a possible loss of precision. This could make comparisons non-transitive, leading to problems when grouping, and potentially (depending on the sort algorithm) with sorting. The problem has been fixed by requiring comparisons to be performed based on the exact mathematical value without any loss of precision.
This means, for example, that deep-equal(0.2, 0.2e0) is now false, whereas in previous versions it was true. The two values are not mathematically equal, because the exact decimal equivalent of the xs:double value written as 0.2e0 is 0.200000000000000011102230246251565404236316680908203125.
The corresponding change has not been made to the = and eq operators, because it was found to be too disruptive. For example, if the context node is the element <e price="10.0" discount="0.2"/>, there is an expectation that the expression @price - @discount = 9.8 should return true. But (assuming untyped data), the result of the subtraction is an xs:double whose precise value is 9.800000000000000710542735760100185871124267578125, so comparing the two values as decimals would return false.
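The arithmetic behind these two examples can be checked outside XPath. The following is a minimal Python sketch, not part of this specification. It assumes that a Python float is an IEEE 754 double (as xs:double is) and uses decimal.Decimal as a stand-in for xs:decimal; the helper name exact_equal is hypothetical.

```python
from decimal import Decimal


def exact_equal(a, b):
    """Compare two numbers by exact mathematical value, with no rounding."""
    # Decimal(float) converts the binary double exactly.
    return Decimal(a) == Decimal(b)


# The double written as 0.2e0 is not exactly one fifth.
print(Decimal(0.2))
print(exact_equal(0.2, Decimal("0.2")))          # False under exact comparison

# 10.0 - 0.2 computed in double arithmetic.
difference = 10.0 - 0.2
print(Decimal(difference))
print(difference == float(Decimal("9.8")))       # True when 9.8 is converted to double
print(exact_equal(difference, Decimal("9.8")))   # False under exact comparison
```

The last two lines show why the = and eq operators were left alone: the comparison succeeds when the decimal is converted to a double, and fails when the exact values are compared.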
In previous versions, unrecognized options supplied to the $options parameter of functions such as fn:parse-json were silently ignored. In 4.0, they are rejected as a type error, unless they are QNames with a non-absent namespace, or are extensions recognized by the implementation.
In version 4.0, omitting the $value of fn:error has the same effect as setting it to an empty sequence. In 3.1, the effects could be different (the effect of omitting the argument was implementation-defined).
In version 3.1, the fn:deep-equal function did not merge adjacent text nodes after stripping comments and processing instructions, so the elements <a>abc<!--note1-->def</a> and <a>abcde<!--note2-->f</a> were considered non-equal. In version 4.0, the text nodes are merged prior to comparison, so these two elements compare equal.
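The effect of merging can be illustrated with a small model. The following Python sketch is an assumption-laden illustration, not the fn:deep-equal algorithm: element content is modelled as a list of (kind, value) pairs, and the function names strip_only and strip_and_merge are hypothetical.

```python
def strip_only(children):
    """3.1-style model: drop comments and processing instructions only."""
    return [(kind, value) for kind, value in children
            if kind not in ("comment", "pi")]


def strip_and_merge(children):
    """4.0-style model: drop comments and PIs, then merge adjacent text nodes."""
    merged = []
    for kind, value in strip_only(children):
        if kind == "text" and merged and merged[-1][0] == "text":
            # Concatenate with the preceding text node.
            merged[-1] = ("text", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    return merged


# Content of the two <a> elements from the example above.
first = [("text", "abc"), ("comment", "note1"), ("text", "def")]
second = [("text", "abcde"), ("comment", "note2"), ("text", "f")]

print(strip_only(first) == strip_only(second))            # False
print(strip_and_merge(first) == strip_and_merge(second))  # True
```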
The format of numeric values in the output of fn:xml-to-json may be different. In version 3.1, the supplied value was parsed as an xs:double and then serialized using the casting rules, resulting in an input value of 10000000 being output as 1e7. In version 4.0, the value is output as is, except for any changes (such as stripping of leading zeroes or a leading plus sign) that might be needed to ensure the result is valid JSON.
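The 4.0 behavior can be sketched as a purely lexical adjustment. The following Python fragment is illustrative only: the function name keep_lexical_form is hypothetical, and the pattern covers only an assumed subset of numeric forms (optional sign, digits, optional fraction, optional exponent), not the full rules of fn:xml-to-json.

```python
import re

# Assumed subset of numeric lexical forms; other forms are rejected here.
_NUMBER = re.compile(r"([+-]?)0*(\d+)((?:\.\d+)?(?:[eE][+-]?\d+)?)")


def keep_lexical_form(text):
    """Return the number as written, minus a leading plus sign and leading zeroes."""
    match = _NUMBER.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported numeric form: {text!r}")
    sign, integer_part, rest = match.groups()
    # JSON allows a leading minus sign but not a leading plus sign.
    return ("-" if sign == "-" else "") + integer_part + rest


print(keep_lexical_form("10000000"))   # 10000000
print(keep_lexical_form("+0012.50"))   # 12.50
```

Under this model the input 10000000 is passed through unchanged, rather than being reformatted as 1e7 as described for version 3.1.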
In version 4.0, the function signature of fn:namespace-uri-for-prefix constrains the first argument to be either an xs:NCName or a zero-length string (the new coercion rules mean that any string in the form of an xs:NCName is acceptable). If a string is supplied that does not meet these requirements, a type error will be raised. In version 3.1, this was not an error: it came under the rule that when no namespace binding existed for the supplied prefix, the function would return an empty sequence.
Furthermore, because the expected type of this parameter is no longer xs:string, the special coercion rules for xs:string parameters in XPath 1.0 compatibility mode no longer apply. For example, supplying xs:duration('PT1H') as the first argument will now raise a type error, rather than looking for a namespace binding for the prefix PT1H.
Version 4.0 makes it clear that the casting of a value other than xs:string or xs:untypedAtomic to a list type (whether using a cast expression or a constructor function) is a type error [err:XPTY0004]XP. Previously this was defined as an error, but the kind of error and the error code were left unspecified. Accordingly, the function signatures of the constructor functions for built-in list types have been changed to use an argument type of xs:string?.
The way that fn:min and fn:max compare numeric values of different types has changed. The most noticeable effect is that when these functions are applied to a sequence of xs:integer or xs:decimal values, the result is an xs:integer or xs:decimal, rather than the result of converting this to an xs:double.
The type of the third argument of fn:format-number has changed from xs:string to (xs:string | xs:QName). Because the expected type of this parameter is no longer xs:string, the special coercion rules for xs:string parameters no longer apply. For example, it is no longer possible to supply an instance of xs:anyURI or (when XPath 1.0 compatibility mode is in force) an instance of xs:boolean or xs:duration.
When map:put replaces an entry in a map with a new value for an existing key, in the case where the existing key and the new key differ (for example, if they have different type annotations), it is no longer guaranteed that the new entry includes the new key rather than the existing key.
In regular expressions, the assertions ^ and $ can no longer be followed by a quantifier. This is because (a) a quantifier that allows zero occurrences means that the assertion will always match, and (b) a quantifier that allows multiple occurrences has no effect. Processors may provide an option that allows such regular expressions to be accepted for compatibility reasons.
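Points (a) and (b) can be observed in another regular expression dialect. The following sketch uses Python's re module, which is not the XPath regular expression language; it is offered only as an analogy. The assertion is wrapped in a non-capturing group, (?:^), so that a quantifier can be applied to it.

```python
import re

# (a) A quantifier that allows zero occurrences makes the assertion optional,
#     so it no longer constrains where the match can start.
anchored = re.search(r"^b", "ab")
optional_anchor = re.search(r"(?:^)?b", "ab")
print(anchored)                  # None: "b" is not at the start of the string
print(optional_anchor.start())   # 1: the optional assertion is simply skipped

# (b) A quantifier that allows multiple occurrences adds nothing:
#     the assertion consumes no input, so repeating it changes nothing.
once = re.search(r"^ab", "ab")
repeated = re.search(r"(?:^)+ab", "ab")
print(once.span() == repeated.span())   # True
```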
For compatibility issues regarding earlier versions, see the 3.1 version of this specification.