This draft contains only sections that have differences from the version that it modified.

W3C

XPath and XQuery Functions and Operators 4.0

W3C Editor's Draft 12 March 2026

This version:
https://qt4cg.org/specifications/xpath-functions-40/
Latest version of XPath and XQuery Functions and Operators 4.0:
https://qt4cg.org/specifications/xpath-functions-40/
Most recent Recommendation of XPath and XQuery Functions and Operators:
https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

This document is also available in these non-normative formats: Specification in XML format and XML function catalog.


Abstract

This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 4.0]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 4.0]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.

A summary of changes since version 3.1 is provided at H Changes since 3.1.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.

The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.

As the Community Group moves towards publishing dated, stable drafts, features that the group considers likely to be removed or substantially changed are marked “at risk” in their changes section. In this draft:

The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


12 Processing nodes

12.2 Other properties of nodes

This section specifies further functions that return properties of nodes. Nodes are formally defined in 6 Nodes DM31.

Function | Meaning
fn:has-children | Returns true if the supplied GNode has one or more child nodes (of any kind).
fn:in-scope-namespaces | Returns the in-scope namespaces of an element node, as a map.
fn:in-scope-prefixes | Returns the prefixes of the in-scope namespaces for an element node.
fn:lang | This function tests whether the language of $node, or the context value if the second argument is omitted, as specified by xml:lang attributes is the same as, or is a sublanguage of, the language specified by $language.
fn:local-name | Returns the local part of the name of $node as an xs:string that is either the zero-length string, or has the lexical form of an xs:NCName.
fn:name | Returns the name of a node, as an xs:string that is either the zero-length string, or has the lexical form of an xs:QName.
fn:namespace-uri | Returns the namespace URI part of the name of $node, as an xs:anyURI value.
fn:namespace-uri-for-prefix | Returns the namespace URI of one of the in-scope namespaces for $element, identified by its namespace prefix.
fn:path | Returns a path expression that can be used to select the supplied node relative to the root of its containing document.
fn:root | Returns the root of the tree to which $node belongs. The function can be applied both to XNodesDM and to JNodesDM.
fn:siblings | Returns the supplied GNode together with its siblings, in document order.

12.2.1 fn:has-children

Changes in 4.0 (next | previous)

  1. Generalized to work with JNodes as well as XNodes.  [Issue 2100 PR 2149 12 August 2025]

Summary

Returns true if the supplied GNode has one or more child nodes (of any kind).

Signature
fn:has-children(
  $node as gnode()? := .
) as xs:boolean
Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the argument is omitted, it defaults to the context value (.).

Provided that the supplied argument $node matches the expected type gnode()?, the result of the function call fn:has-children($node) is defined to be the same as the result of the expression fn:exists($node/child::gnode()).
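
The equivalence above can be modelled outside XPath. The following Python sketch is illustrative only and is not part of the function definition: the GNode class and the has_children function are hypothetical names, None stands in for the empty sequence, and attributes and namespace nodes are simply not modelled because they are not children.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GNode:
    """Hypothetical stand-in for a GNode: a kind label plus its child nodes."""
    kind: str
    children: List["GNode"] = field(default_factory=list)


def has_children(node: Optional[GNode]) -> bool:
    """Model of fn:exists($node/child::gnode()); None models the empty sequence."""
    return node is not None and len(node.children) > 0


# Model of <doc><p>One</p><p/></doc>: the first p has a text child, the second has none.
p1 = GNode("element", [GNode("text")])
p2 = GNode("element")
doc = GNode("element", [p1, p2])
```

In this model, has_children(doc) and has_children(p1) are true, while has_children(p2) and has_children(None) are false.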

Error Conditions

The following errors may be raised when $node is omitted:

  • If the context value is absentDM, type error [err:XPDY0002]XP

  • If the context value is not an instance of the sequence type gnode()?, type error [err:XPTY0004]XP.

Notes

If $node is the empty sequence the result is false.

The motivation for this function is to support streamed evaluation. According to the streaming rules in [XSL Transformations (XSLT) Version 4.0], the following construct is not streamable:

<xsl:if test="exists(row)">
  <ulist>
    <xsl:for-each select="row">
      <item><xsl:value-of select="."/></item>
    </xsl:for-each>
  </ulist>
</xsl:if>

This is because it makes two downward selections to read the child row elements. The use of fn:has-children in the xsl:if conditional is intended to circumvent this restriction.

Although the function was introduced to support streaming use cases, it has general utility as a convenience function.

If the supplied argument is a map or an array, it will automatically be coerced to a JNode.

Examples
Variables
let $e := <doc>
  <p id="alpha">One</p>
  <p/>
  <p>Three</p>
  <?pi 3.14159?>
</doc>
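Expression:
has-children($e)
Result:
true()
Expression:
has-children($e//p[1])
Result:
true()
Expression:
has-children($e//p[2])
Result:
false()
Expression:
has-children($e//p[3])
Result:
true()
Expression:
has-children($e//processing-instruction())
Result:
false()
Expression:
has-children($e//p[1]/text())
Result:
false()
Expression:
has-children($e//p[1]/@id)
Result:
false()
Expression:
jtree([1,2,3]) => has-children()
Result:
true()
Expression:
[1,2,3] => has-children()
Result:
true()
Expression:
jtree([]) => has-children()
Result:
false()
Expression:
[] => has-children()
Result:
false()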

12.2.9 fn:path

Changes in 4.0 (next | previous)

  1. Options are added to customize the form of the output.  [Issues 332 1660 PRs 1620 1886 29 November 2024]

  2. The function is extended to handle JNodes.  [Issue 2100 PR 2149 5 August 2025]

Summary

Returns a path expression that can be used to select the supplied node relative to the root of its containing document.

Signature
fn:path(
  $node as gnode()? := .,
  $options as map(*)? := {}
) as xs:string?
Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

The two-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

The behavior of the function if the $node argument is omitted is exactly the same as if the context value (.) had been passed as the argument.

If $node is the empty sequence, the function returns the empty sequence.

The $options argument, if present, defines additional parameters controlling how the output is formatted. The option parameter conventions apply. The options available are as follows:

record(
  origin? as gnode()?,
  lexical? as xs:boolean,
  namespaces? as map((xs:NCName | enum('')), xs:anyURI)?,
  indexes? as xs:boolean
)
Key | Meaning

origin?

A GNode, which must be an ancestor of $node. If present, the returned path will be a relative path that selects $node starting from the supplied origin node, rather than from the root of the containing tree.
  • Type: gnode()?

  • Default: ()

lexical?

If true, the names of element nodes in the path are represented by the result of a call on the name function applied to each element. The result in this case does not contain sufficient information to identify the namespace URI of the element.
  • Type: xs:boolean

  • Default: false

namespaces?

A map from namespace prefixes to namespace URIs, such as might be returned by the function fn:in-scope-namespaces. If a prefix is available for a given URI, it is used in preference to using Q{uri}local notation.
  • Type: map((xs:NCName | enum('')), xs:anyURI)?

  • Default: ()

indexes?

If true, the returned path includes the index positions of nodes. If false, only the node names are included.
  • Type: xs:boolean

  • Default: true

Let R be the GNode supplied in the origin option, or the root GNode of the tree containing $node otherwise.

If $node is a document node, or a JNode with no parent, the function returns the string "/".

Otherwise, the function returns a string that consists of a sequence of steps, one for each ancestor-or-self of $node that is not an ancestor-or-self of R.

If R is an XNode other than a document node and the origin option is absent or empty, then this string is preceded by a string notionally representing a call to the fn:root function, expressed as follows:

  • If the lexical option is present with the value true, then the string "fn:root()".

  • If the namespaces option is present and defines a mapping from a non-empty prefix P to the namespace URI http://www.w3.org/2005/xpath-functions, then "P:root()".

  • If the namespaces option is present and defines a mapping from the empty string to the namespace URI http://www.w3.org/2005/xpath-functions, then "root()".

  • Otherwise, "Q{http://www.w3.org/2005/xpath-functions}root()".

Each step is the concatenation of:

  1. The character "/", which is omitted for the first step if the origin option is present;

  2. A string whose form depends on the kind of node selected by that step, as follows:

    1. For an element node, the concatenation of:

      1. A representation of the element name, chosen as follows:

        1. If the lexical option is present with the value true, then the result of applying the name function to the element node.

        2. Otherwise, if the namespaces option is present and the element is in a namespace U and the namespaces option includes a mapping from a prefix P to the namespace U, then the string P:L, where L is the local part of the element name. If there is more than one such prefix, then one of them is chosen arbitrarily.

        3. Otherwise, if the namespaces option is present and the element is in a namespace U and the namespaces option includes a mapping from the zero-length string to the namespace U, then the local part of the element name.

        4. Otherwise, if the namespaces option is present and the element is in no namespace and the namespaces option includes no mapping from the zero-length string to any namespace, then the local part of the element name.

        5. Otherwise, the string Q{U}L, where U is the namespace URI of the element name or the empty string if the element is in no namespace, and L is the local part of the element name.

      2. Unless the indexes option is present with the value false, a string in the form [position] where position is an integer representing the one-based position of the selected node among its like-named siblings.

    2. For an attribute node, the concatenation of:

      1. The character "@"

      2. If the lexical option is present with the value true, then the result of applying the name function to the attribute node.

      3. Otherwise, if the attribute node is in no namespace, the local part of the attribute name.

      4. Otherwise, if the namespaces option is present, and if it includes a mapping from a non-empty namespace prefix P to the namespace URI of the attribute, then a string in the form P:L, where L is the local part of the attribute name. If there is more than one such prefix, then one of them is chosen arbitrarily.

      5. Otherwise, the string Q{U}L, where U is the namespace URI of the attribute name, and L is the local part of the attribute name.

    3. For a text node: text()[position] where position is an integer representing the position of the selected node among its text node siblings.

      The suffix [position] is omitted if the indexes option is present with the value false.

    4. For a comment node: comment()[position] where position is an integer representing the position of the selected node among its comment node siblings.

      The suffix [position] is omitted if the indexes option is present with the value false.

    5. For a processing-instruction node: processing-instruction(local)[position] where local is the name of the processing instruction node and position is an integer representing the position of the selected node among its like-named processing-instruction node siblings.

      The suffix [position] is omitted if the indexes option is present with the value false.

    6. For a namespace node:

      1. If the namespace node has a name: namespace::prefix, where prefix is the local part of the name of the namespace node (which represents the namespace prefix).

      2. If the namespace node has no name (that is, if it represents the default namespace): namespace::*[Ulocal-name() = ""]

        Here Ulocal-name() represents a call on the function fn:local-name and is formatted using the same conventions as the call on fn:root described earlier.

    7. For a JNode where the ·content· property of the parent is an array, then as the string *[N] where N is the value of the ·selector· property.

    8. For any other JNode (including the case where the ·content· property of the parent is a map):

      1. If the value is an xs:string, xs:untypedAtomic, or xs:anyURI that is castable to xs:NCName, then the result of casting the value to xs:NCName.

      2. If the value is an xs:string, xs:untypedAtomic, or xs:anyURI that is not castable to xs:NCName, then as the string get("S") where S is the string value.

      3. If the value is numeric, then as the string get(N) where N is the result of casting the numeric value to xs:string.

      4. If the value is an xs:QName, then as the string get(#Q{uri}local) where uri and local are the namespace URI and local name parts of the QName.

      5. If the value is an xs:boolean, then as the string get(true()) or get(false()).

      6. If the value is of any other type, then as the string get(xs:T("S")) where T is the local part of the most specific built-in atomic type of which the value is an instance, and S is the result of casting the value to xs:string.

      TODO: Better handling of the case where the parent is neither a map nor an array, for example where it is a sequence of several maps or several arrays. It's hard to provide a better path for these when there is no AxisStep for selecting within such values.
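
The element-step rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the normative algorithm: element_name and element_path are hypothetical helper names, the tree is held in xml.etree.ElementTree, the outermost element is assumed to be the child of a document node (so no root() prefix is generated), and only element steps with the namespaces and indexes options are handled. The origin and lexical options, attributes, text, comments, processing instructions and namespace nodes are omitted.

```python
import xml.etree.ElementTree as ET


def element_name(tag, namespaces=None):
    """Render an ElementTree tag ('{uri}local' or 'local') as a step name."""
    uri, local = tag[1:].split("}", 1) if tag.startswith("{") else ("", tag)
    if namespaces is not None:
        if uri:
            prefixes = [p for p, ns in namespaces.items() if ns == uri]
            named = [p for p in prefixes if p]
            if named:
                return f"{named[0]}:{local}"  # any matching prefix may be chosen
            if "" in prefixes:
                return local  # default namespace maps to this URI
        elif "" not in namespaces:
            return local  # no-namespace element, no default namespace mapping
    return f"Q{{{uri}}}{local}"


def element_path(document_element, target, indexes=True, namespaces=None):
    """Build the path to `target`, one step per ancestor-or-self element."""
    parent = {child: elem for elem in document_element.iter() for child in elem}
    steps = []
    node = target
    while node is not None:
        step = "/" + element_name(node.tag, namespaces)
        if indexes:
            up = parent.get(node)
            like_named = [node] if up is None else [c for c in up if c.tag == node.tag]
            step += f"[{like_named.index(node) + 1}]"  # one-based position
        steps.append(step)
        node = parent.get(node)
    return "".join(reversed(steps))


doc = ET.fromstring('<p xmlns="http://example.com/one">Freude<br/>Tochter<br/>Wir</p>')
second_br = list(doc)[1]
print(element_path(doc, second_br))
print(element_path(doc, second_br, indexes=False,
                   namespaces={"N": "http://example.com/one"}))
```

The two calls print /Q{http://example.com/one}p[1]/Q{http://example.com/one}br[2] and /N:p/N:br respectively.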

Error Conditions

The following errors may be raised when $node is omitted:

  • If the context value is absentDM, type error [err:XPDY0002]XP

  • If the context value is not an instance of the sequence type gnode()?, type error [err:XPTY0004]XP.

If the value of the origin option is a node that is not an ancestor of $node (or in the absence of $node, the context value), dynamic error [err:FOPA0001].

Notes

Using the namespaces option to shorten the generated path is often convenient, but the resulting path may be unusable if the input tree contains multiple bindings for the same prefix.

Similarly, using the lexical option is convenient if there is no need for precise namespace information: it is especially suitable when the containing node tree declares no namespaces.

If the supplied argument is a map or an array, it will automatically be coerced to a JNode. This is not useful, however, because the result is a root JNode, for which the function returns the path /.

Examples
Variables
let $e := document {            
  <p xmlns="http://example.com/one" xml:lang="de" author="Friedrich von Schiller">
Freude, schöner Götterfunken,<br/>
Tochter aus Elysium,<br/>
Wir betreten feuertrunken,<br/>
Himmlische, dein Heiligtum.
</p>}
let $emp := 
  <employee xml:id="ID21256">
     <empnr>E21256</empnr>
     <first>John</first>
     <last>Brown</last>
  </employee>
Expression:
path($e)
Result:
'/'
Expression:
path($e/*:p)
Result:
'/Q{http://example.com/one}p[1]'
Expression:
path($e/*:p, { 'namespaces': in-scope-namespaces($e/*) })
Result:
'/p[1]'
Expression:
path($e/*:p, { 'indexes': false() })
Result:
'/Q{http://example.com/one}p'
Expression:
path($e/*:p/@xml:lang)
Result:
'/Q{http://example.com/one}p[1]/@Q{http://www.w3.org/XML/1998/namespace}lang'
Expression:
path($e//@xml:lang, { 'namespaces': in-scope-namespaces($e/*) })
Result:
'/p[1]/@xml:lang'
Expression:
path($e/*:p/@author)
Result:
'/Q{http://example.com/one}p[1]/@author'
Expression:
path($e/*:p/*:br[2])
Result:
'/Q{http://example.com/one}p[1]/Q{http://example.com/one}br[2]'
Expression:
path($e/*:p/*:br[2], {
  'namespaces': { 'N': 'http://example.com/one' },
  'indexes': false() 
})
Result:
'/N:p/N:br'
Expression:
path($e//text()[starts-with(normalize-space(), 'Tochter')])
Result:
'/Q{http://example.com/one}p[1]/text()[2]'
Expression:
path($e/*:p/*:br[2], { 'lexical': true() })
Result:
'/p[1]/br[2]'
Expression:
path($e/*:p/*:br[2], { 'lexical': true(), 'origin': $e/*:p })
Result:
'br[2]'
Expression:
path($emp)
Result:
'Q{http://www.w3.org/2005/xpath-functions}root()'
Expression:
path($emp/@xml:id)
Result:
'Q{http://www.w3.org/2005/xpath-functions}root()/@Q{http://www.w3.org/XML/1998/namespace}id'
Expression:
path($emp/empnr)
Result:
'Q{http://www.w3.org/2005/xpath-functions}root()/Q{}empnr[1]'
Expression:
path($emp/empnr, { 'lexical': true() })
Result:
'fn:root()/empnr[1]'
Expression:
path($emp/empnr, {
  'namespaces': {
    'fn': 'http://www.w3.org/2005/xpath-functions',
    '': ''
  }
})
Result:
'fn:root()/empnr[1]'
Expression:
let $in := [{"b":[3,4]}]
return path($in/*[1]/b/*[2])
Result:
"/*[1]/b/*[2]"
Expression:
let $in := [[{'a':1}], [{'a':2}]]
return path($in//a[. = 2])
Result:
"/*[2]/*[1]/a"
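
The JNode steps in the last two examples can be reproduced with a short Python sketch. It is an approximation under several assumptions: jnode_step and jnode_path are hypothetical names, Python lists and dicts stand in for arrays and maps, the NCName test is an ASCII-only regular expression rather than a true xs:NCName cast, and only string, integer and boolean keys are handled. How quotation marks inside a key are escaped is not covered.

```python
import re

# ASCII-only approximation of the xs:NCName lexical form (an assumption).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def jnode_step(container, selector):
    """Render one step, given the parent's content and the child's selector."""
    if isinstance(container, list):
        return f"*[{selector}]"  # array member: one-based position
    if isinstance(selector, bool):  # test bool before int: bool is an int subclass
        return f"get({'true' if selector else 'false'}())"
    if isinstance(selector, int):
        return f"get({selector})"
    if isinstance(selector, str):
        return selector if NCNAME.fullmatch(selector) else f'get("{selector}")'
    raise TypeError("selector type not covered by this sketch")


def jnode_path(root, selectors):
    """Walk `selectors` (array positions or map keys) down from `root`."""
    steps, current = [], root
    for sel in selectors:
        steps.append("/" + jnode_step(current, sel))
        current = current[sel - 1] if isinstance(current, list) else current[sel]
    return "".join(steps) or "/"  # a root JNode has the path "/"


print(jnode_path([{"b": [3, 4]}], [1, "b", 2]))
print(jnode_path([[{"a": 1}], [{"a": 2}]], [2, 1, "a"]))
```

The two calls print /*[1]/b/*[2] and /*[2]/*[1]/a, matching the results shown above.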

12.2.11 fn:siblings

Changes in 4.0 (next | previous)

  1. New in 4.0  [Issues 1542 1552 PRs 1547 1551 5 November 2024]

Summary

Returns the supplied GNode together with its siblings, in document order.

Signature
fn:siblings(
  $node as gnode()? := .
) as gnode()*
Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the $node argument is omitted, it defaults to the context value (.).

If the value of $node is the empty sequence, the function returns the empty sequence.

If $node is a child of some parent GNode P, the function returns all the children of P (including $node), in document order, as determined by the value of $node/child::gnode().

Otherwise (specifically, if $node is parentless, or if it is an attribute or namespace node), the function returns $node.

Formal Equivalent

The effect of the function is equivalent to the result of the following XPath expression.

if ($node intersect $node/parent::gnode()/child::gnode())
then $node/parent::gnode()/child::gnode()
else $node
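
The same logic can be modelled in Python. This is a hedged sketch rather than a definition: the Node class and siblings function are hypothetical, None models the empty sequence, and attributes are modelled as nodes that have a parent but do not appear among that parent's children.

```python
class Node:
    """Hypothetical tree node; attributes have a parent but are not children."""

    def __init__(self, label, children=(), attributes=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        self.attributes = list(attributes)
        for n in self.children + self.attributes:
            n.parent = self


def siblings(node):
    """Return all children of the parent if `node` is one of them, else the node itself."""
    if node is None:
        return []
    parent = node.parent
    if parent is not None and any(c is node for c in parent.children):
        return list(parent.children)  # already in document order
    return [node]


# Model of <doc x="X"><a>A</a>text<?pi 3.14159?></doc>
a, text, pi, x = Node("a"), Node("text"), Node("pi"), Node("@x")
doc = Node("doc", children=[a, text, pi], attributes=[x])
```

Calling siblings on a, text or pi returns all three children; calling it on the attribute x or on the parentless doc returns just that node.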
Error Conditions

The following errors may be raised when $node is omitted:

  • If the context value is absentDM, type error [err:XPDY0002]XP

  • If the context value is not an instance of the sequence type gnode()?, type error [err:XPTY0004]XP.

Notes

The result of siblings($n) (except in error cases) is the same as the result of $n/(preceding-sibling::node() | following-sibling-or-self::node()). It is also the same as $n/(preceding-sibling-or-self::node() | following-sibling::node()).

As with names such as parent and child, the word sibling used here as a technical term is not a precise match to its use in describing human family relationships, but is chosen for convenience.

Examples
Variables
let $e := <doc x="X"><a>A</a>text<?pi 3.14159?></doc>
Expression:
siblings($e//a) ! string()
Result:
"A", "text", "3.14159"
Expression:
siblings($e//processing-instruction('pi')) ! string()
Result:
"A", "text", "3.14159"
Expression:
siblings($e//@x) ! string()
Result:
"X"
Expression:
[[1,2], [11,12], [13,14]]//jnode()[.='12'] => siblings() => sum()
Result:
23

12.3 Functions on sequences of nodes

This section specifies functions on sequences of nodes.

Function | Meaning
fn:distinct-ordered-nodes | Removes duplicate GNodes and sorts the input into document order.
fn:innermost | Returns every GNode within the input sequence that is not an ancestor of another member of the input sequence; the GNodes are returned in document order with duplicates eliminated.
fn:outermost | Returns every GNode within the input sequence that has no ancestor that is itself a member of the input sequence; the nodes are returned in document order with duplicates eliminated.

12.3.3 fn:outermost

Changes in 4.0 (next | previous)

  1. Generalized to work with JNodes as well as XNodes.  [Issue 2100 PR 2149 12 August 2025]

Summary

Returns every GNode within the input sequence that has no ancestor that is itself a member of the input sequence; the nodes are returned in document order with duplicates eliminated.

Signature
fn:outermost(
  $nodes as gnode()*
) as gnode()*
Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The effect of the function call fn:outermost($nodes) is defined to be equivalent to the result of the expression:

$nodes[not(ancestor::gnode() intersect $nodes)]/.

That is, the function takes as input a sequence of GNodes, and returns every GNode within the sequence that does not have another GNode within the sequence as an ancestor; the GNodes are returned in document order with duplicates eliminated.
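
A small Python model may help to make the rule concrete. It is a sketch under assumptions, not an implementation of the function: Node, document_order and outermost are hypothetical names, all supplied nodes are assumed to belong to a single tree whose root is passed explicitly, and a pre-order traversal is used as the document order of that tree.

```python
class Node:
    """Hypothetical tree node with a parent pointer and ordered children."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for c in self.children:
            c.parent = self


def document_order(root):
    """Pre-order traversal, used here as document order for a single tree."""
    yield root
    for child in root.children:
        yield from document_order(child)


def outermost(nodes, root):
    """Keep supplied nodes with no supplied ancestor, deduplicated, in document order."""
    selected = {id(n) for n in nodes}

    def has_selected_ancestor(n):
        p = n.parent
        while p is not None:
            if id(p) in selected:
                return True
            p = p.parent
        return False

    return [n for n in document_order(root)
            if id(n) in selected and not has_selected_ancestor(n)]


# Model of <doc><div id='a'><div id='b'><div id='c'/></div></div><div id='d'/></doc>
c = Node("c")
b = Node("b", [c])
a = Node("a", [b])
d = Node("d")
doc = Node("doc", [a, d])
print([n.label for n in outermost([d, c, b, a, a], doc)])
```

The call prints ['a', 'd']: b and c are dropped because a is among the supplied nodes, the duplicate a is removed, and the survivors are returned in document order regardless of the input order.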

Notes

The formulation $nodes except $nodes/descendant::node() might appear to be simpler, but does not correctly account for attribute nodes, as these are not descendants of their parent element.

The motivation for the function was based on XSLT streaming use cases. There are cases where the [XSL Transformations (XSLT) Version 4.0] streaming rules allow the construct outermost(//section) but do not allow //section; the function can therefore be useful in cases where it is known that sections will not be nested, as well as cases where the application actually wishes to process all sections except those that are nested within another.

If the supplied argument includes a map or an array, it will automatically be coerced to a JNode.

Examples
Expression:
parse-xml("<doc>
     <div id='a'><div id='b'><div id='c'/></div></div>
  </doc>")//div
   => outermost() => for-each(fn{string(@id)})
Result:
"a"
Expression:
[[[1], [2]], [[3], [4]], [[5], [6]]]//jnode(*, array(*))
   => outermost() =!> array:size()
Result:
3
Expression:
[[[1], [2]], [[3], [4]], [[5], [6]]]//self::array(*)
   => outermost() =!> array:size()
Result:
3

14 Processing maps

Maps were introduced as a new datatype in XDM 3.1. This section describes functions that operate on maps.

A map is a kind of item.

[Definition] A map consists of a sequence of entries, also known as key-value pairs. Each entry comprises a key which is an arbitrary atomic item, and an arbitrary sequence called the associated value.

[Definition] Within a map, no two entries have the same key. Two atomic items K1 and K2 are the same key for this purpose if the function call fn:atomic-equal($K1, $K2) returns true.

It is not necessary that all the keys in a map should be of the same type (for example, they can include a mixture of integers and strings).

Maps are immutable, and have no identity separate from their content. For example, the map:remove function returns a map that differs from the supplied map by the omission (typically) of one entry, but the supplied map is not changed by the operation. Two calls on map:remove with the same arguments return maps that are indistinguishable from each other; there is no way of asking whether these are “the same map”.

A map can also be viewed as a function from keys to associated values. To achieve this, a map is also a function item. The function corresponding to the map has the signature function($key as xs:anyAtomicType) as item()*. Calling the function has the same effect as calling the map:get function: the expression $map($key) returns the same result as get($map, $key). For example, if $books-by-isbn is a map whose keys are ISBNs and whose associated values are book elements, then the expression $books-by-isbn("0470192747") returns the book element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to higher-order functions that expect a function item as one of their arguments.
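
The two properties just described (immutability, and a map doubling as a function) can be illustrated with a rough Python analogy. The names map_remove and as_function are hypothetical, the book values are placeholder strings, and the empty sequence is modelled as an empty tuple. Python key equality is not the same as fn:atomic-equal, so this is an analogy rather than an equivalent.

```python
from types import MappingProxyType


def map_remove(m, key):
    """Model of map:remove: build a new read-only mapping; `m` itself is unchanged."""
    return MappingProxyType({k: v for k, v in m.items() if k != key})


def as_function(m):
    """View a map as a function from key to value; a missing key gives ()."""
    return lambda key: m.get(key, ())


# Placeholder values stand in for book elements.
books_by_isbn = MappingProxyType({
    "0470192747": "<book id='b1'/>",
    "0000000000": "<book id='b2'/>",
})

smaller = map_remove(books_by_isbn, "0470192747")
lookup = as_function(books_by_isbn)
print(lookup("0470192747"))
print(len(books_by_isbn), len(smaller))
```

The original mapping still has two entries after map_remove is called, and the function view can be handed to any code that expects a callable, which mirrors passing a map to a higher-order function.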

14.5 Converting elements to maps

Changes in 4.0 (next | previous)

  1. A new function fn:element-to-map is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input.   [Issues 528 1645 1646 1647 1648 1658 1658 1797 PRs 1575 1906 19 November 2024]

The fn:element-to-map function converts a tree rooted at an XML element node to a corresponding tree of maps, in a form suitable for serialization as JSON. In effect it provides a mechanism for converting XML to JSON.

This section describes the mappings used by this function.

This mapping is designed with three objectives:

  • It should be possible to represent any XML element as a map suitable for JSON serialization.

  • The resulting JSON should be intuitive and easy to use.

  • The JSON should be consistent and stable: small variations in the input should not result in large variations in the output.

Achieving all three objectives requires design compromises. It also requires sacrificing some other desiderata. In consequence:

The requirement for consistency and stability is particularly challenging. An element such as <name>John</name> maps naturally to the map { "name": "John" }; but adding an attribute (so it becomes <name role="first">John</name>) then requires an incompatible change in the JSON representation. The format could be made extensible by converting <name>John</name> to { "name": {"#content":"John"} } and <name role="first">John</name> to { "name": { "@role":"first", "#content":"John" } }, but this imposes unwanted complexity on the simplest cases. The solution adopted is threefold:

  • It is possible to analyze a corpus of XML documents to develop a conversion plan, which can then be applied consistently to individual input documents, whether or not these documents were present in the corpus. The conversion plan can be serialized and subsequently reused, so that it can be applied to input documents that might not have existed at the time the conversion plan was formulated.

  • Alternatively, the function can make use of schema information where available, so it considers not just the structure of an individual element instance, but the rules governing the element type.

  • It is possible to override the choices made by the system, and explicitly specify the format to be used for elements or attributes having a given name.

14.5.1 Element Layouts

The key challenge in mapping XML to JSON is in deciding how element content is to be represented. To illustrate the variety of mappings that are possible, the following table lists some examples of typical XML elements and their JSON equivalents:

XML element:
<hr/>
JSON equivalent:
"hr": ""
XML element:
<date-of-birth>2023-05-18</date-of-birth>
JSON equivalent:
"date-of-birth": "2023-05-18"
XML element:
<box width="5" height="10"/>
JSON equivalent:
"box": { "@width": "5", "@height": "10" }
XML element:
<label id="t41">Warning!</label>
JSON equivalent:
"label": { "@id": "t41", "#content": "Warning!" }
XML element:
<box>
    <width>5</width>
    <height>10</height>
</box>
JSON equivalent:
"box": {
    "width": 5,
    "height": 10
}
XML element:
<polygon>
    <point x="0" y="0"/>
    <point x="1" y="0"/>
    <point x="1" y="1"/>
    <point x="0" y="1"/>
</polygon>
JSON equivalent:
"polygon": [
    { "x": 0, "y": 0 },
    { "x": 1, "y": 0 },
    { "x": 1, "y": 1 },
    { "x": 0, "y": 1 }
]

This specification defines a number of named mappings, called layouts, and allows the layout for a particular element to be selected in a number of different ways:

  • The layout to be used for specific elements can be explicitly selected by supplying a conversion plan as input to the fn:element-to-map function.

  • It is possible to construct a conversion plan by analyzing a corpus of documents using the fn:element-to-map-plan function.

  • It is also possible to construct a conversion plan manually, or to modify the conversion plan produced by the fn:element-to-map-plan function before use.

  • In the absence of an explicit conversion plan, if the data has been schema-validated, the layout is inferred from the content model for the element type as defined in the schema.

  • When the data is untyped and no specific layout has been selected, a default layout is chosen based on the properties of the individual element instance.

The advantage of using schema information is that it gives a consistent representation for all elements of a particular type, even if they vary in content: for example if an element type allows optional attributes, the JSON representation will be consistent between those elements that have attributes and those without. In the absence of a schema, consistency can be achieved by supplying a conversion plan that applies uniformly to multiple documents.

The different layouts available are defined in the following sections. For each layout there is a table showing:

  • Layout name: the name to be used to select this layout in a conversion plan supplied to the fn:element-to-map function.

  • Usage: the situations for which this layout is designed.

  • Example input: an example of a typical element for which this layout is appropriate, shown as serialized XML.

  • Example output: the result of converting this example, shown as serialized JSON. The result is always shown as a singleton map, which is how it will appear when the layout is used for the top-level elements supplied in the $elements argument; when used to convert a descendant element, the corresponding key-value pair may appear as part of a larger map, depending on the layout chosen for its parent element.

    Note:

    The fn:element-to-map function produces a map as its result, but it is convenient to illustrate the form of the map by showing the effect of serializing the map as JSON.

  • Mapping rules: The rules for mapping the XML element to an XDM map representation.

  • Mapping for nilled elements: special rules that apply to an element having the attribute xsi:nil="true". These rules only apply if the element has been schema-validated.

  • Errors: situations where the layout cannot be used, and where attempting to use it will fail. For example, the empty layout cannot be used for an element that is not empty. In such a situation the recovery action is as follows, in order:

    1. Attributes are dropped, and if this is sufficient to enable the layout to be used, then the element is converted without its attributes.

    2. If the type of an element or attribute in the conversion plan is given as boolean or numeric, but the actual value of the element or attribute is not castable to xs:boolean or xs:numeric respectively, then the node is output ignoring the type property, that is, as an instance of xs:untypedAtomic.

    3. If the conversion plan supplies a fallback layout (an entry with key "*"), then the fallback layout is used.

    4. The element-to-map function fails with a dynamic error.

The rules for selecting the layout for a particular element are given later, in 14.5.6 Selecting an element layout.

Note that it is possible to request any layout for any element. If an inappropriate layout is chosen for a particular element (for example, empty layout for an element that is not empty), then the rules for that layout specify what happens. It is possible to specify a fallback layout for use when the selected layout fails: this will typically be a layout such as xml or mixed that can handle any element.

Note:

Acknowledgements for this categorization: see [Goessner]. Although Goessner's categories have been used, the detailed mappings vary from his proposal.

14.5.1.2 Layout: Empty Content with Attributes
Layout name

empty-plus

Usage

Intended for XML elements that have no content but may have attributes.

Example input
<hr class="ccc" id="zzz"/>
Example output
{ "hr": { "@class": "ccc", "@id": "zzz" } }
Mapping rules

The content is represented by a map containing one entry for each attribute in the XML element; if there are no attributes, the content is represented as the empty map. The rules for attribute names are defined in 14.5.7 Element and Attribute Names, and the rules for attribute content in 14.5.8 Element and Attribute Content.

Mapping for nilled elements

An additional key-value pair "#content": #fn:null is added, which serializes in JSON as "#content": null. For example <hr id="x" xsi:nil="true"/> becomes { "hr": { "@id": "x", "#content": #fn:null } }.

Errors

Child comment nodes, processing instructions, and whitespace-only text nodes are discarded.

If any other child nodes are present, this layout fails.
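
As a rough illustration of the mapping rules above, the following Python sketch models the empty-plus layout for an untyped element. It is not part of this specification: the function name empty_plus and the use of Python's xml.etree.ElementTree are assumptions made for the example, plain strings stand in for xs:untypedAtomic values, and nilled elements and namespaced attribute names are not handled.

```python
import xml.etree.ElementTree as ET

def empty_plus(elem, marker="@"):
    """Illustrative model of the empty-plus layout (hypothetical helper)."""
    # The default ElementTree parser already drops comments and processing
    # instructions, so only child elements or non-whitespace text remain
    # as reasons for the layout to fail.
    if len(elem) > 0 or (elem.text or "").strip():
        raise ValueError("empty-plus layout fails: element is not empty")
    return {elem.tag: {marker + name: value for name, value in elem.attrib.items()}}

result = empty_plus(ET.fromstring('<hr class="ccc" id="zzz"/>'))
print(result)  # {'hr': {'@class': 'ccc', '@id': 'zzz'}}
```

An element with no attributes yields an empty map as its content, matching the rule stated above.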

14.5.1.3 Layout: Simple Content
Layout name

simple

Usage

Intended for XML elements that have simple content and no attributes.

Example input
<date>2023-05-30</date>
Example output
{ "date": "2023-05-30" }
Mapping rules

The element is atomized and the resulting atomized value is handled as described in 14.5.8 Element and Attribute Content. If atomization fails, the element is treated as if it were untyped.

Note:

If the element is untyped, the atomized value will always appear in the result as an instance of xs:untypedAtomic.

Mapping for nilled elements

The content is represented by the value #fn:null, which is serialized as the JSON value null. For example, <name xsi:nil="true"/> becomes { "name": #fn:null }.

Errors

Attributes are discarded, along with child comment nodes and processing instructions; whitespace is retained.

If any child elements are present, this layout fails.

14.5.1.4 Layout: Simple Content with Attributes
Layout name

simple-plus

Usage

Intended for XML elements that have simple content and (optionally) attributes.

Example input
<price currency="USD">23.50</price>
Example output
{ "price": { "@currency": "USD", "#content": 23.50 } }
Mapping rules

The element is represented by a map containing one entry for each of its attributes, plus an entry with key "#content" representing the result of atomizing the element. The atomized value is handled as described in 14.5.8 Element and Attribute Content.

The rules for attribute names are defined in 14.5.7 Element and Attribute Names, and the rules for attribute content in 14.5.8 Element and Attribute Content.

Note:

If the element is untyped, the value of each attribute, and of "#content", will always be an instance of xs:untypedAtomic.

If the element has been schema-validated, the types of the items in the atomized value are retained.

Mapping for nilled elements

The "#content" property is represented by the value #fn:null, which is serialized in JSON as null.

Errors

Child comment nodes and processing instructions are discarded; whitespace is retained.

If any child elements are present, this layout fails.
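
The simple-plus mapping can be sketched in Python as follows. This is an informal model rather than a definition: the helper name simple_plus and the use of xml.etree.ElementTree are assumptions, nilled elements are not handled, and the content is left as a plain string (standing in for xs:untypedAtomic) because applying a boolean or numeric type is a separate step.

```python
import xml.etree.ElementTree as ET

def simple_plus(elem, marker="@"):
    """Illustrative model of the simple-plus layout (hypothetical helper)."""
    if len(elem) > 0:
        raise ValueError("simple-plus layout fails: child elements present")
    content = {marker + name: value for name, value in elem.attrib.items()}
    # Whitespace in the text is retained; an element with no text yields "".
    content["#content"] = elem.text or ""
    return {elem.tag: content}

result = simple_plus(ET.fromstring('<price currency="USD">23.50</price>'))
print(result)  # {'price': {'@currency': 'USD', '#content': '23.50'}}
```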

14.5.1.9 Layout: Mixed
Layout name

mixed

Usage

Intended for XML elements that contain mixed content (that is, elements that contain both child elements and child text nodes, intermingled). The element may or may not have attributes.

Example input
<para id="x">This is a <i>fine</i> mess!</para>
Example output
{ "para": [
     { "@id": "x" },
     "This is a ",
     { "i": "fine" },
     " mess!"
   ] }
Mapping rules

The content is represented by an XDM array containing one entry for each attribute in the XML element, and one entry for each child node, in order.

Each attribute node is represented within this array by a single-entry map: the rules for attribute names are defined in 14.5.7 Element and Attribute Names, and the rules for attribute content in 14.5.8 Element and Attribute Content.

Child nodes are represented within the array as follows:

  • A text node child is represented as an atomic item of type xs:untypedAtomic.

  • An element node child is represented as a map containing a single entry, with the key representing the element name and the value representing the element's content, formatted according to the chosen layout for that element.

  • A comment node is represented as a map containing a single entry whose key is the string "#comment", and whose corresponding value is an atomic item of type xs:string containing the text of the comment.

  • A processing instruction node is represented as a map containing a single entry whose key is the string "#processing-instruction" and whose value is a map with two entries: the first has the key "#target" with the value being the name of the processing instruction as an atomic item of type xs:NCName; the second has the key "#data" with the value being an atomic item of type xs:string containing the string value of the processing instruction node.

Whitespace text nodes are retained.

Mapping for nilled elements

A nilled element is indicated by including an additional map { "#content": #fn:null } in the array, after any attributes. For example, <para id="p2" xsi:nil="true"/> becomes { "para": [ { "@id": "p2" }, { "#content": #fn:null } ] }. In JSON the value #fn:null is serialized as null.

Errors

All children are retained, including comments, processing instructions, and text nodes, whether or not they are whitespace-only.

This layout never fails.
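
The ordering of entries in the mixed layout (attributes first, then child nodes in document order) can be illustrated with a short Python sketch. The helper names mixed and simple_child are hypothetical, and xml.etree.ElementTree is used only for convenience: its default parser discards comments and processing instructions, so the "#comment" and "#processing-instruction" entries described above are not produced here, and nilled elements are not handled.

```python
import xml.etree.ElementTree as ET

def simple_child(elem):
    """Stand-in layout for child elements: text content only (hypothetical)."""
    return {elem.tag: elem.text or ""}

def mixed(elem, child_layout, marker="@"):
    """Illustrative model of the mixed layout (hypothetical helper)."""
    # One single-entry map per attribute, ahead of any child nodes.
    items = [{marker + name: value} for name, value in elem.attrib.items()]
    if elem.text:
        items.append(elem.text)            # text before the first child element
    for child in elem:
        items.append(child_layout(child))  # child element, using its own layout
        if child.tail:
            items.append(child.tail)       # text following that child element
    return {elem.tag: items}

para = ET.fromstring('<para id="x">This is a <i>fine</i> mess!</para>')
result = mixed(para, simple_child)
print(result)  # {'para': [{'@id': 'x'}, 'This is a ', {'i': 'fine'}, ' mess!']}
```

Note that the text following the i element keeps its leading space, since whitespace in text nodes is retained.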

14.5.2 Creating a conversion plan

It is possible to create a conversion plan by analyzing a collection of sample input documents. The function fn:element-to-map-plan is supplied with a collection of nodes (which will normally be element or document nodes), and it examines all the elements within the trees rooted at these nodes, looking for commonalities among like-named elements.

The output of this function (the conversion plan) holds information about how elements and attributes (identified by name) should be converted.

For elements, the information is primarily a mapping from element names (xs:QName instances) to layout names. In some cases additional information beyond the layout name is also included. The conversion plan is represented as an XDM map, whose structure is defined in this specification. A conversion plan can be constructed directly, or the plan produced by calling fn:element-to-map-plan can be modified before use. The plan can be serialized using the JSON output method and reloaded so that the same plan is used whenever a query or stylesheet is executed.

The fn:element-to-map-plan function selects a layout for a given element name N by applying the following rules:

  1. Let $EE be the set of all elements named N, specifically $input/descendant-or-self::*[node-name(.) eq N].

  2. If empty($EE/(* | text())) (that is, if there are no child elements or text nodes) then:

    1. If empty($EE/@*) (that is, if there are no attributes), then the layout is empty: see 14.5.1.1 Layout: Empty Content.

    2. Otherwise, the layout is empty-plus: see 14.5.1.2 Layout: Empty Content with Attributes.

  3. If empty($EE/*) (that is, if there are no child elements) then:

    1. If empty($EE/@*) (that is, if there are no attributes) then the layout is simple: see 14.5.1.3 Layout: Simple Content.

    2. Otherwise, simple-plus: see 14.5.1.4 Layout: Simple Content with Attributes.

    3. The plan also includes the property type. If all the elements in $EE are castable as xs:boolean, then the type is boolean; otherwise, if all the elements in $EE are castable as xs:numeric, then the type is numeric; otherwise, the type is string.

  4. If empty($EE/text()[normalize-space()]) (that is, there are no text node children other than whitespace), then:

    1. If all-equal($EE/*/node-name()) and exists($EE/*[2]) (that is, if all child elements have the same name, and at least one element has multiple child elements), then:

      1. If empty($EE/@*) (that is, if there are no attributes) then list: see 14.5.1.5 Layout: Simple List.

      2. Otherwise, list-plus: see 14.5.1.6 Layout: List with Attributes.

    2. If every $e in $EE satisfies all-different($e/*/node-name()) (that is, the child elements are uniquely named among their siblings), then record: see 14.5.1.7 Layout: Record.

    3. Otherwise, sequence: see 14.5.1.8 Layout: Sequence.

  5. Otherwise, mixed: see 14.5.1.9 Layout: Mixed.
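
The selection rules above can be modelled informally in Python. This sketch is an approximation under stated assumptions, not the normative algorithm: the names choose_layout and layouts are hypothetical, xml.etree.ElementTree stands in for the XDM tree, elements are grouped by their ElementTree tag, the type property is not computed, and the special treatment of attributes in the xsi namespace is not modelled.

```python
import xml.etree.ElementTree as ET

def has_text(e):
    """True if e has any text node child, including whitespace-only text."""
    return bool(e.text) or any(c.tail for c in e)

def has_nonspace_text(e):
    """True if e has a text node child containing non-whitespace characters."""
    return bool((e.text or "").strip()) or any((c.tail or "").strip() for c in e)

def choose_layout(ee):
    """Choose a layout for the list ee of all elements sharing one name."""
    attrs = any(e.attrib for e in ee)
    if not any(len(e) > 0 or has_text(e) for e in ee):       # rule 2
        return "empty-plus" if attrs else "empty"
    if not any(len(e) > 0 for e in ee):                      # rule 3
        return "simple-plus" if attrs else "simple"
    if not any(has_nonspace_text(e) for e in ee):            # rule 4
        child_names = {c.tag for e in ee for c in e}
        if len(child_names) == 1 and any(len(e) > 1 for e in ee):
            return "list-plus" if attrs else "list"
        if all(len({c.tag for c in e}) == len(e) for e in ee):
            return "record"
        return "sequence"
    return "mixed"                                           # rule 5

def layouts(root):
    """Group all elements in the tree by name and choose a layout for each."""
    groups = {}
    for e in root.iter():
        groups.setdefault(e.tag, []).append(e)
    return {name: choose_layout(ee) for name, ee in groups.items()}

doc = ET.fromstring(
    '<cities>'
    '<city id="LDN"><name>London</name><size>18.2</size></city>'
    '<city id="PRS"><name>Paris</name><size>19.1</size></city>'
    '</cities>')
print(layouts(doc))
# {'cities': 'list', 'city': 'record', 'name': 'simple', 'size': 'simple'}
```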

For elements with simple content (more specifically, elements where the chosen layout is simple or simple-plus) the conversion plan also includes an entry indicating whether the content should be represented as a boolean, a number, or a string. If every instance of the element name has content that is castable to xs:boolean, the plan indicates "type": "boolean". If every instance of the element name has content that is castable to xs:numeric, the plan indicates "type": "numeric". In other cases, the plan indicates "type": "string"; however, this may be omitted because it is the default.

For attributes, the conversion plan identifies whether attributes (with a given name) should be represented as booleans, numbers, or strings; alternatively, it may indicate that attributes with a given name should be discarded. For every distinct attribute name present in the input, an entry is output associating the attribute name with one of the types boolean or numeric; the entry is generally omitted when the values are to be represented as strings, though the type can also be given explicitly as string. An entry with type boolean is generated for an attribute name if all the attributes with that name are castable as xs:boolean. Similarly, an entry with type numeric is generated for an attribute name if all the attributes with that name are castable as xs:numeric. In other cases, the attributes are treated as being of type string. Entries with type string may be omitted, since that is the default. The entry for an attribute may also specify "type": "skip" to indicate that the attribute should be discarded.
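
The way a type is inferred from a set of sample values can be sketched as follows. The function name infer_type is hypothetical, and the two tests only approximate "castable as xs:boolean" and "castable as xs:numeric" using the usual XSD lexical forms (the regular expression is an assumption of this example, not a quotation from any specification).

```python
import re

_BOOLEAN = {"true", "false", "1", "0"}
# Approximation of the lexical forms accepted for xs:double (assumption).
_NUMERIC = re.compile(
    r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?|[+-]?INF|NaN")

def infer_type(values):
    """Infer "boolean", "numeric", or "string" from sample string values."""
    vals = [v.strip() for v in values]
    if all(v in _BOOLEAN for v in vals):
        return "boolean"            # boolean is tested first
    if all(_NUMERIC.fullmatch(v) for v in vals):
        return "numeric"
    return "string"

print(infer_type(["true", "0"]))        # boolean
print(infer_type(["23.50", "1e3"]))     # numeric
print(infer_type(["020 7946 0000"]))    # string
```

Because the boolean test comes first, a name whose values are all "0" or "1" is classified as boolean rather than numeric; this is one reason a generated plan may need adjusting by hand.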

A plan that is produced by analyzing a corpus of input documents can then be customized by the user if required. For example:

  • If simple layout is chosen for a particular element name, but it is known that some documents might be encountered in which that element has attributes, then simple might be changed to simple-plus.

  • If record layout is chosen for a particular element name, but it is known that some documents might be encountered in which child elements can be repeated, then record might be changed to sequence.

  • If a generated plan determines that phone numbers should be represented as numbers, it might be modified to treat them as strings.

The conversion plan is a map of type map(xs:string, record(*)). The key is an element or attribute name, representing element names in the form Q{uri}local, and attributes in the form @Q{uri}local: in both cases the Q{uri} part must be omitted for a name in no namespace. Strings are used as keys in preference to xs:QName instances to allow the plan to be serialized in JSON format.

A more detailed definition of the structure is given in 14.5.4 Structure of the conversion plan.

A small example might be (in its JSON serialization):

{ "bookList": { "layout": "list", "child": "book" },
  "book": { "layout": "record" },
  "author": { "layout": "simple" },
  "title": { "layout": "simple" },
  "price": { "layout": "simple", "type": "numeric" },
  "hardback": { "layout": "simple", "type": "boolean" },
  "@out-of-print": { "type": "boolean" },
  "@Q{http://www.w3.org/2001/XMLSchema-instance}nil": { "type": "skip" }
}

14.5.3 Attributes in the xsi namespace

This section defines modifications to the above rules that apply to elements having attributes in the xsi namespace (that is, http://www.w3.org/2001/XMLSchema-instance).

  • When analyzing a corpus using fn:element-to-map-plan, elements having the attribute xsi:nil="true" are ignored. If all elements with a given name have this attribute, allocate the layout mixed.

  • When deciding whether an element has any attributes (for example to decide between the layouts empty and empty-plus), all attributes in the xsi namespace are ignored.

  • When converting an individual element to a map, all attributes in the xsi namespace are ignored.

  • Notwithstanding the above, elements having the nilled property (which essentially means they are schema-validated and have the attribute xsi:nil="true"), are treated specially by each of the possible element layouts.

14.5.4 Structure of the conversion plan

This section provides a definition of the structure of the conversion plan that is output by the fn:element-to-map-plan function, and used as input to the fn:element-to-map function.

The structure is defined by the following item type:

map( xs:string,
     record ( layout? as enum("empty", "empty-plus", "simple", "simple-plus",
                              "list", "list-plus",
                              "record", "sequence", "mixed",
                              "xml", "error", "deep-skip"),
              child? as xs:string,
              type? as enum("boolean", "numeric", "string", "skip")
              * )
)

The rules relating to this structure are as follows:

  1. The keys of the map entries are strings of the form:

    1. local-name representing the name of an element in no namespace.

    2. Q{uri}local-name representing the name of an element in a namespace.

    3. * representing a fallback rule for use with elements where either (a) there is no more specific rule, or (b) processing using the selected layout fails.

    4. @local-name representing the name of an attribute in no namespace.

    5. @Q{uri}local-name representing the name of an attribute in a namespace.

    Any entries whose keys are not in this format will be ignored.

  2. The layout entry is present if and only if the key represents the name of an element.

  3. The child entry is present if and only if the value of layout is list or list-plus. It represents an element name in the format local-name for a name in no namespace, or Q{uri}local-name for a name in a namespace.

  4. The type entry is present if, and only if, one of the following conditions applies:

    1. The key represents the name of an attribute.

    2. The layout is simple or simple-plus. In this case the value must not be "skip".

If additional entries (beyond those described above) are present in any of the maps, they are ignored, provided that the map is coercible to the given type definition.
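
A plan loaded from JSON can be checked against these rules before use. The following Python sketch is one possible reading, with assumptions labelled: the helper name check_plan is hypothetical, the regular expression only roughly recognizes the five key formats (it does not verify NCName syntax), the fallback key "*" is treated like an element name, and rule 4 is read permissively so that type may be omitted when it would be string, as the earlier example plan does.

```python
import re

LAYOUTS = {"empty", "empty-plus", "simple", "simple-plus", "list", "list-plus",
           "record", "sequence", "mixed", "xml", "error", "deep-skip"}
TYPES = {"boolean", "numeric", "string", "skip"}
# Rough stand-in for the key formats; NCName syntax is not checked.
KEY = re.compile(r"\*|@?(Q\{[^{}]*\})?[^@{}\s]+")

def check_plan(plan):
    """Return a list of rule violations found in a plan (hypothetical helper)."""
    problems = []
    for key, entry in plan.items():
        if not KEY.fullmatch(key):
            continue  # entries with unrecognised keys are ignored
        is_attribute = key.startswith("@")
        layout = entry.get("layout")
        if is_attribute and layout is not None:
            problems.append(f"{key}: an attribute entry cannot have a layout")
        if not is_attribute and layout not in LAYOUTS:
            problems.append(f"{key}: missing or unknown layout")
        if ("child" in entry) != (layout in ("list", "list-plus")):
            problems.append(f"{key}: child belongs with list or list-plus only")
        value_type = entry.get("type")
        if value_type is not None:
            if value_type not in TYPES:
                problems.append(f"{key}: unknown type")
            elif not is_attribute and layout not in ("simple", "simple-plus"):
                problems.append(f"{key}: type needs layout simple or simple-plus")
            elif not is_attribute and value_type == "skip":
                problems.append(f"{key}: skip is only allowed for attributes")
    return problems

plan = {
    "bookList": {"layout": "list", "child": "book"},
    "book": {"layout": "record"},
    "price": {"layout": "simple", "type": "numeric"},
    "@out-of-print": {"type": "boolean"},
    "@Q{http://www.w3.org/2001/XMLSchema-instance}nil": {"type": "skip"},
}
print(check_plan(plan))  # []
```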

The fallback rule (with key "*") is used to process elements whose name has no specific entry, and also for elements where normal processing fails (for example when the selected layout is "empty", but the element has children). If no fallback rule is present then "error" is assumed: this causes processing to fail with a dynamic error. The fallback rule will typically set the layout property to one of the following:

  • error: this causes the function to fail with a dynamic error.

  • deep-skip: this causes the element and its content (recursively) to be omitted from the output.

  • mixed: this causes the element to be output using layout mixed.

  • xml: this causes the element to be output using layout xml, which represents the content as a string containing serialized XML.

However, any layout may be used as the fallback; if it fails, the error is unrecoverable.

14.5.5 Schema-based conversion

As an alternative to constructing a conversion plan by analyzing a corpus of specimen documents, conversion may be controlled using type annotations derived from schema validation.

If the function element-to-map encounters an element whose name is not present in the conversion plan (including the case where no plan is supplied), and if the element has a type annotation T other than xs:anyType or xs:untyped, then the following rules apply:

Note:

This section uses the notation {prop} to refer to properties of schema components, as defined in [XSD 1.1 Part 1]. The schema component model from XSD 1.1 is used; when XSD 1.0 is used for validation, some properties such as {open content} will inevitably be absent.

  1. Let zeroLength(ST) be true for a simple type ST if any of the following conditions is true:

    1. ST.{variety} = list, and ST.{facets} includes a length or maxLength facet whose value is 0 (zero).

    2. ST.{variety} = atomic, and ST.{facets} includes a length or maxLength facet whose value is 0 (zero).

    3. ST.{variety} = atomic, and ST.{facets} includes an enumeration facet constraining the value to be zero-length.

    4. ST.{variety} = atomic, and ST.{facets} includes a pattern facet with the value "" (a zero-length string).

  2. If T is a simple type:

    1. If zeroLength(T), then the selected layout is empty (see 14.5.1.1 Layout: Empty Content).

    2. Otherwise, the selected layout is simple (see 14.5.1.3 Layout: Simple Content), and the selected type is boolean if T is derived from xs:boolean; numeric if T is derived from xs:decimal, xs:double, or xs:float; or string otherwise.

  3. Otherwise (if T is a complex type):

    1. Let $noAttributes be true if T.{attribute uses} is empty and T.{attribute wildcard} is absent.

    2. If T.{content type}.{variety} = empty, then:

      1. If $noAttributes and if empty layout is not disabled, then the selected layout is empty (see 14.5.1.1 Layout: Empty Content).

      2. Otherwise, the selected layout is empty-plus (see 14.5.1.2 Layout: Empty Content with Attributes).

    3. If T.{content type}.{variety} = simple (a complex type with simple content), then:

      1. Let ST be T.{content type}.{simple type definition} (the corresponding simple type).

      2. If zeroLength(ST), then:

        1. If $noAttributes, the selected layout is empty (see 14.5.1.1 Layout: Empty Content).

        2. Otherwise, the selected layout is empty-plus (see 14.5.1.2 Layout: Empty Content with Attributes).

      3. Otherwise:

        1. If $noAttributes, the selected layout is simple (see 14.5.1.3 Layout: Simple Content).

        2. Otherwise the selected layout is simple-plus (see 14.5.1.4 Layout: Simple Content with Attributes).

        3. In both cases the selected type is one of boolean, numeric, or string, chosen in the same way as for elements having a simple type.

    4. If T.{content type}.{variety} = element-only (a complex type with an element-only content model):

      1. Let $noWildcards be true if T.{content type}.{open content} is absent, and T.{content type}.{particle}, expanded recursively, contains no wildcard term.

      2. Let $childCardinalities be a set of (xs:QName, xs:double) pairs representing the expanded names of the element declaration terms within T.{content type}.{particle}, expanded recursively, and for each one, the maximum number of occurrences of elements with that name, computed using the value of the {maxOccurs} property of the particles at each level, taking the value unbounded as positive infinity.

      3. If $noWildcards is true, and if $childCardinalities contains a single entry, and that entry has a cardinality greater than one, then:

        1. If $noAttributes then the selected layout is list (see 14.5.1.5 Layout: Simple List).

        2. Otherwise, the selected layout is list-plus (see 14.5.1.6 Layout: List with Attributes).

      4. If $noWildcards is true, and if every entry in $childCardinalities has a cardinality of one, then the selected layout is record (see 14.5.1.7 Layout: Record).

      5. Otherwise, the selected layout is sequence (see 14.5.1.8 Layout: Sequence).

    5. Otherwise (that is, when T.{content type}.{variety} = mixed), the selected layout is mixed (see 14.5.1.9 Layout: Mixed).

For attribute nodes, the selected type is boolean if the type annotation is derived from xs:boolean; numeric if the type annotation is derived from xs:decimal, xs:double, or xs:float; and string otherwise.
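
For the element-only case (step 3.4 above), the decision between list, list-plus, record, and sequence depends only on $noWildcards, $noAttributes, and $childCardinalities. A minimal Python sketch of that decision follows; the function name element_only_layout is hypothetical, and the cardinalities are assumed to be supplied as a dictionary from child element name to maximum occurrence count, with math.inf representing unbounded. Deriving these values from a schema is outside the scope of the example.

```python
import math

def element_only_layout(child_cardinalities, no_wildcards, no_attributes):
    """Choose a layout for a complex type with element-only content."""
    counts = list(child_cardinalities.values())
    if no_wildcards and len(counts) == 1 and counts[0] > 1:
        return "list" if no_attributes else "list-plus"
    if no_wildcards and all(count == 1 for count in counts):
        return "record"
    return "sequence"

print(element_only_layout({"point": math.inf}, True, True))        # list
print(element_only_layout({"name": 1, "size": 1}, True, False))    # record
print(element_only_layout({"ref": math.inf, "string": 1}, True, True))  # sequence
```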

14.5.6 Selecting an element layout

The various layouts available for elements are described in 14.5.1 Element Layouts. This section defines the rules for selecting an element layout for a given element E. The rules are applied in order.

  1. If an explicit layout is given for the element name of E in the conversion plan supplied to the fn:element-to-map function call, then that layout is used. If the selected layout is deep-skip, then no output is produced for that element. If the selected layout is error, then the function fails with a dynamic error. If the selected layout fails for the element instance, then the fallback layout (identified with the key "*" in the conversion plan) is used; in the absence of a fallback layout, the function fails with a dynamic error.

  2. Otherwise (when no explicit layout is given for E), if the type annotation of the element is something other than xs:untyped or xs:anyType, then a schema-determined layout is used as defined in 14.5.5 Schema-based conversion.

  3. Otherwise, if the conversion plan supplies a fallback layout (identified with the key "*"), then the fallback layout is used.

  4. If the above rules do not provide a layout for E, then a conversion plan for E is determined by applying the rules in 14.5.2 Creating a conversion plan, with an input that contains the single element E and no others. (Only the element E itself is considered, not its descendants.)
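
The precedence among these four rules can be summarized in a few lines of Python. This is a simplified sketch: the function name select_layout and its parameters are hypothetical, the schema-determined and inferred layouts are assumed to have been computed elsewhere and passed in, and the switch to the fallback layout when a selected layout fails for a particular element instance is not modelled.

```python
def select_layout(name, plan, schema_layout=None, inferred_layout=None):
    """Pick a layout for one element (hypothetical helper).

    schema_layout: layout derived from the type annotation, or None if untyped.
    inferred_layout: layout obtained by analysing the element on its own.
    """
    entry = plan.get(name, {})
    if "layout" in entry:            # rule 1: explicit entry in the plan
        return entry["layout"]
    if schema_layout is not None:    # rule 2: schema-determined layout
        return schema_layout
    if "*" in plan:                  # rule 3: fallback entry
        return plan["*"]["layout"]
    return inferred_layout           # rule 4: analyse the element itself

plan = {"book": {"layout": "record"}, "*": {"layout": "xml"}}
print(select_layout("book", plan, schema_layout="sequence"))    # record
print(select_layout("note", plan, schema_layout="simple"))      # simple
print(select_layout("note", plan, inferred_layout="mixed"))     # xml
print(select_layout("note", {}, inferred_layout="mixed"))       # mixed
```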

14.5.7 Element and Attribute Names

The name-format option gives control over how element and attribute names are formatted. There are four options:

  • The default option (which may be explicitly requested by specifying "name-format": "default") retains the namespace URI for any element that is either (a) the top-level element of a tree being converted, or (b) has a name that is in a different namespace from its parent element. In such cases the format "Q{uri}local" is used. For other elements, the name is output using the local part of the element name alone. For attributes, the form "Q{uri}local" is used for an attribute in a namespace, and the local name alone is used for a no-namespace name. Namespace prefixes are not retained.

  • The option eqname uses the format "Q{uri}local" for all element and attribute names that are in a namespace, or the local name alone for all names that are not in a namespace.

  • The option local discards all namespace information: all elements and attributes are output using the local name alone.

  • The option lexical outputs element and attribute names in the form obtained by calling the function fn:name. If the name has a prefix, the prefix is retained in the output. However, the output contains no information that enables the prefix to be associated with a namespace URI, so this format is suitable only when prefixes in the input documents are used predictably.

Regardless of the chosen name-format, and regardless of the above rules, attributes in the xml namespace (http://www.w3.org/XML/1998/namespace) are output using a lexical QName, with the prefix xml.

Attribute names in the output are typically prefixed with the character "@". The option attribute-marker allows this to be changed to a different prefix or none.

Whichever format of names is chosen, if the rules for the selected layout would result in an output map having two entries with the same key, the conflict is resolved by combining these entries into an array. For example if name-format is set to local then the element <data x:val="3" y:val="4"/> becomes either { "data": { "@val": ["3", "4"] } } or (because attribute order is unpredictable) { "data": { "@val": ["4", "3"] } }.
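
The attribute naming rules and the conflict rule can be illustrated together. In this Python sketch the names attribute_key and add_entry are hypothetical, the namespace URIs under example.com are invented for the example, and element names (whose default format depends on the parent element) are deliberately left out. The merge step is simplistic: it assumes that an existing list value is the result of an earlier merge.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"

def attribute_key(uri, local, prefix="", name_format="default", marker="@"):
    """Format an attribute name as a map key (hypothetical helper)."""
    if uri == XML_NS:
        name = "xml:" + local            # always a lexical QName
    elif not uri or name_format == "local":
        name = local
    elif name_format == "lexical":
        name = prefix + ":" + local
    else:                                # "default" and "eqname"
        name = "Q{" + uri + "}" + local
    return marker + name

def add_entry(target, key, value):
    """Add an entry, combining values into a list when the key repeats."""
    if key not in target:
        target[key] = value
    elif isinstance(target[key], list):
        target[key].append(value)        # assumes the list came from a merge
    else:
        target[key] = [target[key], value]

# Models <data x:val="3" y:val="4"/> with name-format set to local.
content = {}
for uri, local, prefix, value in [("http://example.com/x", "val", "x", "3"),
                                  ("http://example.com/y", "val", "y", "4")]:
    add_entry(content, attribute_key(uri, local, prefix, name_format="local"), value)
print({"data": content})  # {'data': {'@val': ['3', '4']}}
```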

14.5.8 Element and Attribute Content

The conversion plan may indicate that element content is to be output as type string, numeric, or boolean: the default is string. In the case of untyped elements and attributes, the value is output as an instance of a string, numeric, or boolean type, according to this prescription. Specifically:

  • If the prescribed type is boolean and the value is castable as xs:boolean, then it is output as an instance of xs:boolean.

  • If the prescribed type is numeric and the value is castable as xs:numeric, then it is output as an instance of xs:integer, xs:decimal, or xs:double depending on the lexical form of the value, following the same rules as for XPath numeric literals. For example, "-1" becomes an xs:integer, 12.00 becomes an xs:decimal, and 1e-3 becomes an xs:double. The special xs:double values NaN and INF (which cannot be used as numeric literals) are also recognized.

  • In all other cases the value is output as an instance of xs:untypedAtomic, retaining its original lexical form.
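
The three cases above can be sketched in Python, using int, decimal.Decimal, and float as stand-ins for xs:integer, xs:decimal, and xs:double, and a plain string as a stand-in for xs:untypedAtomic. The function name typed_value and the regular expressions are assumptions of this example; they approximate the XPath numeric literal forms rather than reproducing the grammar exactly.

```python
import re
from decimal import Decimal

def typed_value(text, prescribed="string"):
    """Convert an untyped value according to the prescribed type."""
    s = text.strip()
    if prescribed == "boolean" and s in ("true", "false", "1", "0"):
        return s in ("true", "1")
    if prescribed == "numeric":
        if re.fullmatch(r"[+-]?[0-9]+", s):
            return int(s)                     # integer literal form
        if re.fullmatch(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)", s):
            return Decimal(s)                 # decimal literal form
        if re.fullmatch(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)[eE][+-]?[0-9]+", s):
            return float(s)                   # double literal form
        if s in ("NaN", "INF", "-INF"):
            return float(s)                   # special xs:double values
    return text                               # original lexical form retained

print(repr(typed_value("-1", "numeric")))      # -1
print(repr(typed_value("12.00", "numeric")))   # Decimal('12.00')
print(repr(typed_value("1e-3", "numeric")))    # 0.001
print(repr(typed_value("n/a", "numeric")))     # 'n/a'
```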

Where the element or attribute is schema-validated, however:

  1. If an element has the nilled property (that is, xsi:nil="true"), then the mapping for nilled elements with the chosen layout is used.

  2. Let AV be the typed value of the node (that is, the result of atomization).

  3. If, however, an element is annotated with a type that does not allow atomization (specifically, a complex type with element-only content) then let AV be the string value of the element, as an atomic item of type xs:untypedAtomic.

  4. If an attribute is annotated as having a simple type of {variety} list, or if an element using layout simple or simple-plus is annotated as having either a simple type of {variety} list or a complex type with simple content of {variety} list then the atomized value AV is represented in the result as the array represented by the XPath expression array{AV}. This applies whether or not the atomized value actually contains multiple atomic items. The individual atomic items in the array retain their type, for example items of type xs:date remain items of type xs:date in the result.

  5. In all other cases AV will be a single atomic item, and this value is used as is, retaining its type.

Note:

Atomic items in the result of the fn:element-to-map function may thus be of any atomic type. The type information is lost if the result is subsequently serialized as JSON.

14.5.9 Lost XDM Information

This section is non-normative. Its purpose is to explain what information available in the XDM nodes supplied as input to the fn:element-to-map function is missing from the output.

  • Element and attribute names: If the chosen name-format is default or eqname, then local names and namespace URIs of elements and attributes are retained, but namespace prefixes are lost. If the chosen name-format is lexical, then prefixes are retained but namespace URIs are lost. If the chosen name-format is local then only local names are retained; namespace URIs and prefixes are lost.

    In addition, element names are lost when the parent element is mapped using list layout: see 14.5.1.5 Layout: Simple List.

  • In-scope namespaces: All information about in-scope namespaces (and in particular, bindings for namespaces that are declared but not used in element and attribute names) is lost.

  • Comments and processing instructions: Comments and processing instructions are lost except when they appear as children of elements that are mapped using the sequence, mixed or xml layouts.

  • Text nodes: Whitespace text nodes are discarded when they appear as children of elements that are mapped using the empty, empty-plus, list, list-plus, record, or sequence layouts. Non-whitespace text nodes are never discarded.

  • Additional node properties: The values of the is-id, is-idref, and is-nilled properties of a node are lost.

  • Type annotations: The values of type annotations on elements are lost. Type annotations on atomized values of schema-validated nodes, however, are retained.

  • Element order: The order of child elements is lost when record layout is used and the element has multiple children with the same name.

  • XSI attributes: Attributes in the xsi namespace (for example, xsi:type and xsi:nil) are not represented in the result.

14.5.10 Examples

The following examples show the effect of transforming some simple XML documents with default options, and then serializing the result as JSON with indent set to true. The actual indentation is implementation dependent.

XDM element
<a x='1' b='2'/>
JSON serialization of result
{ "a":{
    "@x": "1",
    "@b": "2"
  } }
XDM element
<a><x>1</x><y>2</y></a>
JSON serialization of result
{ "a":{
    "x": "1",
    "y": "2"
  } }
XDM element
<polygon>
   <point x='0' y='0'/>
   <point x='0' y='1'/>
   <point x='1' y='1'/>
   <point x='1' y='0'/>
</polygon>
JSON serialization of result
{ "polygon":[
       {"@x": "0", "@y": "0"},
       {"@x": "0", "@y": "1"},
       {"@x": "1", "@y": "1"},
       {"@x": "1", "@y": "0"}
  ] }
XDM element
<cities>
   <city id="LDN">
      <name>London</name>
      <size>18.2</size>
   </city>
   <city id="PRS">
      <name>Paris</name>
      <size>19.1</size>
   </city>
   <city id="BLN">
      <name>Berlin</name>
      <size>14.6</size>
   </city>
</cities>
JSON serialization of result
{ "cities":[
    {
      "@id": "LDN",
      "name": "London",
      "size": "18.2"
    },
    {
      "@id": "PRS",
      "name": "Paris",
      "size": "19.1"
    },
    {
      "@id": "BLN",
      "name": "Berlin",
      "size": "14.6"
    }
  ] }

The following more complex example demonstrates a case where the default conversion is inadequate (for example, it wrongly assumes that for the third production, the order of child elements is immaterial). A better result, shown below, can be achieved by using a schema-aware conversion.

XDM element
<g:grammar language="xquery"
    xmlns:g="http://www.w3.org/XPath/grammar">
  <g:production
    name="FunctionBody">
    <g:ref name="EnclosedExpr"/>
    <g:ref name="Block"/>
  </g:production>
  <g:production
    name="EnclosedExpr">
    <g:ref name="Lbrace"/>
    <g:ref name="Expr"/>
    <g:optional>
      <g:ref name="Expr"/>
    </g:optional>
    <g:ref name="Rbrace"/>
  </g:production>
  <g:production
    name="SimpleReturnClause">
    <g:string>return</g:string>
    <g:ref name="ExprSingle"/>
  </g:production>
</g:grammar>
JSON serialization of result
[{ "Q{http://www.w3.org/XPath/grammar}grammar": [
  { "@language": "xquery" },
  { "production": [
    { "@name": "FunctionBody" },
    { "ref": { "@name": "EnclosedExpr" } },
    { "ref": { "@name": "Block" } }
  ] },
  { "production": [
    { "@name": "EnclosedExpr" },
    { "ref": { "@name": "Lbrace" } },
    { "ref": { "@name": "Expr" } },
    { "optional":[
      { "ref": { "@name": "Expr" } }
    ] },
    { "ref": { "@name": "Rbrace" } }
  ] },
  { "production": [
    { "@name": "SimpleReturnClause" },
    { "string": "return" },
    { "ref": { "@name": "ExprSingle" } }
  ] }
] }]

Note:

In the above example, the schema used to validate the source document was simplified to eliminate options that do not actually arise in this input instance (such as the g:string element having attributes). This is a legitimate technique that may be useful when trying to obtain the simplest possible JSON representation.

Further improvements to the usability of the JSON output could be achieved by doing some simple transformation of the XML prior to conversion. For example, the name attribute of various productions could be converted to a child element, and <ref name="x"/> could be transformed to <ref>x</ref>.

14.5.10 fn:element-to-map-plan

Changes in 4.0 (next | previous)

  1. New in 4.0  [Issue 1797 PR 1906 29 April 2025]

Summary

Analyzes sample data to generate a conversion plan suitable for use by the element-to-map function.

Signature
fn:element-to-map-plan(
$input as (document-node() | element())*
) as map(xs:string, record(*))
Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The function takes as input a collection of document and element nodes and analyzes the trees rooted at these nodes to determine a conversion plan for converting elements in these trees to maps, suitable for serialization in JSON format. The conversion plan can be used as-is by supplying it directly to the element-to-map function; alternatively it can be amended before use. The plan can also be serialized to a file (in JSON format) allowing the same plan to be used repeatedly for transforming documents with a similar structure to those in the sample provided.

The rules followed by the function, and the detailed format of the conversion plan, are described in 14.5.2 Creating a conversion plan.

Formal Equivalent

The effect of the function is equivalent to the result of the following XQuery expression.

let $data-type := fn($nodes as node()*) {
  if (every($nodes ! (. castable as xs:boolean))) then "boolean"
  else if (every($nodes ! (. castable as xs:numeric))) then "numeric"
  else ()
}
let $name := fn($node as node()) {
  if (namespace-uri($node)) 
  then expanded-QName(node-name($node))
  else local-name($node)
}  
return (
  for $ee in $input/descendant-or-self::*
  group by $n := $name($ee)
  return { $n :
           if (empty($ee/(*|text())))
             then { 'layout' : if (empty($ee/@*)) 
                               then 'empty' 
                               else 'empty-plus' } 
           else if (empty($ee/*)) 
             then map:merge((
                    if (empty($ee/@*)) 
                      then {'layout': 'simple'}
                      else {'layout': 'simple-plus'},
                    $data-type($ee) ! { 'type': . }
                 ))
           else if (empty($ee/text()[normalize-space()])) 
             then if (all-equal($ee/*/node-name()) and exists($ee/*[2]))
                    then { 'layout': if (empty($ee/@*)) 
                                     then 'list' 
                                     else 'list-plus',
                           'child': $name(head($ee/*))
                         }
                    else { 'layout' : if (every($ee ! all-different(*/node-name())))
                                      then 'record'
                                      else 'sequence'
                         }             
           else {'layout': 'mixed'}
        },
  for $a in $input//@*
  group by $n := $name($a)
  let $t := $data-type($a)
  return $t ! { `@{$n}`: { 'type': $t } }
) => map:merge()
Notes

The conversion plan is organized by element and attribute name, so its effectiveness depends on the $input collection being homogeneous in its structure, and representative of the documents that will subsequently be converted using the element-to-map function.

This function is separate from the element-to-map function for a number of reasons:

  • The collection of documents that need to be analyzed to establish an effective conversion plan might be much smaller than the set of documents actually being converted.

  • Conversely, it might be that only a small number of documents need to be converted at a particular time, but the conversion plan used needs to take into account variations that might exist within a larger corpus.

  • If JSON output is required in a particular format, it might be necessary to fine-tune the automatically generated conversion plan to take account of these requirements.

  • It might be necessary to devise a conversion plan that can be used to convert individual documents as they arrive over a period of time, and to ensure that the same conversion rules are applied to each document even though documents might exhibit variations in structure.

  • The conversion plan is human-readable, which can help in understanding why the output of element-to-map is in a particular form.

Examples
Expression:
element-to-map-plan(<a><b>3</b><b>4</b></a>)
Result:
{ 'a': { 'layout': 'list', 'child': 'b' },
  'b': { 'layout': 'simple', 'type': 'numeric' }
}
Expression:
element-to-map-plan((<a x="2">red</a>, <a x="3">blue</a>))
Result:
{ 'a': { 'layout': 'simple-plus' },
  '@x': { 'type': 'numeric' }
}
Expression:
element-to-map-plan(
   <a xmlns="http://example.ns">H<sub>2</sub>SO<sub>4</sub></a>
)
Result:
{ 'Q{http://example.ns}a': { 'layout': 'mixed' },
  'Q{http://example.ns}sub': { 'layout': 'simple', 'type': 'numeric' }
}
Expression:
element-to-map-plan((<a><b/><b/></a>, <a><b/><c/></a>))
Result:
{ 'a': { 'layout': 'sequence' },
  'b': { 'layout': 'empty' },
  'c': { 'layout': 'empty' }
}
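
The formal equivalent given for fn:element-to-map-plan can also be modelled outside XQuery. The following is a minimal Python sketch of the same grouping and layout tests, written against the standard library's xml.etree.ElementTree. The helper names (plan_for, data_type, key_name, text_nodes) are hypothetical, and the "castable as" tests are approximated with simple string checks rather than the full xs:boolean and xs:numeric casting rules. It is an illustration under those assumptions, not a conforming implementation.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_WS = " \t\r\n"
# Assumption: an approximation of the lexical forms castable to xs:numeric.
_NUMERIC = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?|-?INF|NaN")

def data_type(values):
    """Return 'boolean', 'numeric', or None for a group of string values."""
    vals = [v.strip(XML_WS) for v in values]
    if all(v in ("true", "false", "1", "0") for v in vals):
        return "boolean"
    if all(_NUMERIC.fullmatch(v) for v in vals):
        return "numeric"
    return None

def key_name(tag):
    """ElementTree writes namespaced names as '{uri}local'; the plan uses 'Q{uri}local'."""
    return "Q" + tag if tag.startswith("{") else tag

def text_nodes(el):
    """Non-empty text children of el (its own text plus the tails of its children)."""
    return [t for t in [el.text] + [child.tail for child in el] if t]

def plan_for(roots):
    elements, attributes = defaultdict(list), defaultdict(list)
    for root in roots:
        for el in root.iter():                      # descendant-or-self::*
            elements[key_name(el.tag)].append(el)
            for name, value in el.attrib.items():
                attributes[key_name(name)].append(value)
    plan = {}
    for name, group in elements.items():
        plus = "-plus" if any(el.attrib for el in group) else ""
        children = [c for el in group for c in el]
        texts = [t for el in group for t in text_nodes(el)]
        if not children and not texts:
            entry = {"layout": "empty" + plus}
        elif not children:
            entry = {"layout": "simple" + plus}
            t = data_type([el.text or "" for el in group])
            if t:
                entry["type"] = t
        elif not any(t.strip(XML_WS) for t in texts):
            tags = [c.tag for c in children]
            if len(set(tags)) == 1 and any(len(el) >= 2 for el in group):
                entry = {"layout": "list" + plus, "child": key_name(tags[0])}
            else:
                # 'record' only if no element in the group repeats a child name
                distinct = all(len({c.tag for c in el}) == len(el) for el in group)
                entry = {"layout": "record" if distinct else "sequence"}
        else:
            entry = {"layout": "mixed"}
        plan[name] = entry
    for name, values in attributes.items():
        t = data_type(values)
        if t:
            plan["@" + name] = {"type": t}
    return plan
```

For instance, plan_for([ET.fromstring("<a><b>3</b><b>4</b></a>")]) produces a dictionary with a 'list' layout for a (child b) and a 'simple' layout of type 'numeric' for b, mirroring the first example.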

14.5.11 fn:element-to-map

Changes in 4.0 (next | previous)

  1. New in 4.0.  [Issue 1797 PR 1906 29 April 2025]

Summary

Converts an element node into a map that is suitable for JSON serialization.

Signature
fn:element-to-map(
$element as element()?,
$options as map(*)? := {}
) as map(xs:string, item()?)?
Properties

This function is deterministic, context-independent, and focus-independent.

Rules

This function returns a map derived from the element node supplied in $element. The map is in a form that is suitable for JSON serialization, thus providing a mechanism for conversion of arbitrary XML to JSON.

The map that is returned will always be a single-entry map; the key of this entry will be a string representing the element name, and the value of the entry will be a representation of the element's attributes and children.

The entries that may appear in the $options map are as follows. The option parameter conventions apply.

record(
plan? as map(xs:string, record(layout?, child?, type?)),
attribute-marker? as xs:string,
name-format? as xs:string
)
Key / Value / Meaning

plan?

A conversion plan, supplied as a map whose keys represent element and attribute names. The plan might be generated using the function element-to-map-plan, or it might be constructed in some other way. The format of the plan is described in 14.5.2 Creating a conversion plan.
  • Type: map(xs:string, record(layout?, child?, type?))

  • Default: {}

attribute-marker?

A string that is prepended to any key value in the output that represents an XDM attribute node in the input. The string may be empty. If, after applying the requested prefix (or no prefix), there is a conflict between the names of attributes and child elements, then the requested prefix (or lack thereof) is ignored and the default prefix "@" is used.
  • Type: xs:string

  • Default: "@"

name-format?

Indicates how the names of element and attribute nodes are handled.
  • Type: xs:string

  • Default: "default"

Permitted values:
  • lexical: Names are output in the form produced by the fn:name function.
  • local: Names are output in the form produced by the fn:local-name function.
  • eqname: Names in a namespace are output in the form "Q{uri}local". Names in no namespace are output using the local name alone.
  • default: An element name is output as a local name alone if either (a) it is a top-level element and is in no namespace, or (b) it is in the same namespace as its parent element. An attribute name is output as a local name alone if it is in no namespace. All other names are output in the format "Q{uri}local" if in a namespace, or "Q{}local" if in no namespace. "Top-level" here means that the element is the one passed explicitly in the $element argument, as distinct from a descendant of that element.
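
The four name-format values can be illustrated with a small Python model. This is a sketch only: the function format_name and all of its parameters are hypothetical, the empty string stands for "no namespace", and parent_uri=None is assumed to mean that there is no parent element to compare against.

```python
def format_name(local, uri="", fmt="default", *, prefix="",
                is_attribute=False, top_level=False, parent_uri=None):
    """Return the output key for a node name under the given name-format value."""
    if fmt == "lexical":
        # Form produced by fn:name: prefix:local, or local if there is no prefix
        return f"{prefix}:{local}" if prefix else local
    if fmt == "local":
        # Form produced by fn:local-name
        return local
    if fmt == "eqname":
        return f"Q{{{uri}}}{local}" if uri else local
    if fmt == "default":
        if is_attribute:
            bare = not uri
        else:
            bare = (top_level and not uri) or (
                parent_uri is not None and uri == parent_uri)
        # Names that are not output bare use Q{uri}local, or Q{}local in no namespace
        return local if bare else f"Q{{{uri}}}{local}"
    raise ValueError(f"unknown name-format: {fmt}")
```

Under this model, a top-level element in a namespace is written in "Q{uri}local" form, while its children in the same namespace are written using the local name alone.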

If $element is the empty sequence, the result is the empty sequence.

The principles for conversion from elements to maps are described in 14.5.1 Element Layouts, and the rules for selecting an element layout for each element are given in 14.5.5 Selecting an element layout.

In general, every descendant element within the tree rooted at the supplied $element maps to a key-value pair in which the key represents the element name, and the corresponding value represents the attributes and children of the element. This key-value pair will be added to the content representing its parent element, in a way that depends on the parent element's layout.

The representation of a node of any other kind depends on the layout chosen for its parent element.

Error Conditions

A dynamic error [err:FOJS0008] occurs if any element cannot be processed using the selected layout for that element, unless fallback processing is defined; or if error action is explicitly requested for an element.

Any error in the conversion plan is treated as a type error [err:XPTY0004]XP whether or not it is technically a contravention of the defined type for the value. This relieves users and implementers of the burden of distinguishing different kinds of error in the plan.

Examples
Expression:

element-to-map(())

Result:

()

Expression:

element-to-map(<foo>bar</foo>)

Result:

{ "foo": "bar" }

Expression:
element-to-map(
    <list>
      <item value='1'/>
      <item value='2'/>
    </list>, { 'attribute-marker': '' }
  )
Result:
{ "list": [ 
    { "value": "1" },
    { "value": "2" }
  ] }
Expression:
element-to-map(
    <name>
      <first>Jane</first>
      <last>Smith</last>
    </name>
  )
Result:
{ "name": { 
  "first": "Jane",
  "last": "Smith" 
} }
Expression:
element-to-map(
    <name xmlns="http://example.ns/">
      <first>Jane</first>
      <middle>Elizabeth</middle>
      <middle>Mary</middle>
      <last>Smith</last>
    </name>, 
    { 'plan': {'Q{http://example.ns/}name': { 'layout': 'record' }},
      'name-format' : 'local'
    }
  )
Result:
{ "name": { 
    "first": "Jane",
    "middle": ["Elizabeth", "Mary"],
    "last": "Smith" 
  } 
}
Expression:
element-to-map(
    <name xmlns="http://example.ns/">
      <first>Jane</first>
      <middle>Elizabeth</middle>
      <middle>Mary</middle>
      <last>Smith</last>
    </name>, 
    { 'plan': {'Q{http://example.ns/}name': { 'layout': 'record' },
               'Q{http://example.ns/}middle': { 'layout': 'deep-skip' }
              },
      'name-format' : 'local'
    }
  )
Result:
{ "name": { 
    "first": "Jane",
    "last": "Smith" 
  } 
}

22 Constructor functions

Changes in 4.0 (next | previous)

  1. Constructor functions now have a zero-arity form; the first argument defaults to the context item.   [Issue 658 PR 662 29 August 2023]

Constructor functions are used to convert a supplied value to a given type, and the name of the function is the same as the name of the target type. This section describes constructor functions corresponding to the following types:

Constructor functions are defined for all user-defined named simple types, and for most built-in atomic, list, and union types. The only named simple types that have no constructor function are those that have no instances other than instances of their derived types: specifically, xs:anySimpleType, xs:anyAtomicType, and xs:NOTATION.

22.1 Constructor functions for XML Schema built-in atomic types

Every built-in atomic type that is defined in [XML Schema Part 2: Datatypes Second Edition], except xs:anyAtomicType and xs:NOTATION, has an associated constructor function. The type xs:untypedAtomic, defined in 2.7 Schema Information DM31, and the two derived types xs:yearMonthDuration and xs:dayTimeDuration, also defined in 2.7 Schema Information DM31, have associated constructor functions as well. Implementations may additionally provide a constructor function for the new datatype xs:dateTimeStamp introduced in [XSD 1.1 Part 2].

A constructor function is not defined for xs:anyAtomicType as there are no atomic items with type annotation xs:anyAtomicType at runtime, although this can be a statically inferred type. A constructor function is not defined for xs:NOTATION since it is defined as an abstract type in [XML Schema Part 2: Datatypes Second Edition]. If the static context (See 2.1.1 Static Context XP31) contains a type derived from xs:NOTATION then a constructor function is defined for it. See 22.5 Constructor functions for user-defined atomic and union types.

The form of the constructor function for an atomic type eg:TYPE is:

eg:TYPE(
$value as xs:anyAtomicType? := .
) as eg:TYPE?

If $value is the empty sequence, the empty sequence is returned. For example, the signature of the constructor function corresponding to the xs:unsignedInt type defined in [XML Schema Part 2: Datatypes Second Edition] is:

xs:unsignedInt(
$value as xs:anyAtomicType? := .
) as xs:unsignedInt?

Calling the constructor function xs:unsignedInt(12) returns the xs:unsignedInt value 12. Another call of that constructor function that returns the same xs:unsignedInt value is xs:unsignedInt("12").

The same result would also be returned if the constructor function were to be called with a node that had a typed value equal to the xs:unsignedInt 12. Because the declared parameter type for the argument is xs:anyAtomicType?, the coercion rules will atomize the supplied argument (see 2.4.2 Atomization XP31) to extract its typed value and then call the constructor with the atomized value.

If the value passed to a constructor function, after atomization, is not in the lexical space of the datatype to be constructed, and cannot be converted to a value in the value space of the datatype under the rules in this specification, then a dynamic error is raised [err:FORG0001].

The semantics of the constructor function xs:TYPE(arg) are identical to the semantics of arg cast as xs:TYPE?. See 23 Casting.
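
The behaviour described for xs:unsignedInt in the preceding paragraphs can be sketched in Python. This is a simplified model under stated assumptions: the names unsigned_int and CastError are hypothetical, None stands for the empty sequence, only integer and string arguments are handled, and string input is accepted only as a run of ASCII digits after surrounding whitespace is removed (signed lexical forms are deliberately not modelled).

```python
import re

UNSIGNED_INT_MAX = 4294967295  # 2**32 - 1, the largest xs:unsignedInt value

class CastError(ValueError):
    """Stands in for the dynamic error err:FORG0001 in this sketch."""

def unsigned_int(value=None):
    """Model of xs:unsignedInt($value): returns an int, or None for the empty sequence."""
    if value is None:
        return None
    if isinstance(value, str):
        text = value.strip(" \t\r\n")
        if not re.fullmatch(r"[0-9]+", text):
            raise CastError(f"not a valid unsignedInt literal: {value!r}")
        number = int(text)
    elif isinstance(value, int):
        number = int(value)
    else:
        raise CastError(f"unsupported argument type in this sketch: {type(value).__name__}")
    if not 0 <= number <= UNSIGNED_INT_MAX:
        raise CastError(f"out of range for unsignedInt: {number}")
    return number
```

In this model, unsigned_int(12) and unsigned_int("12") return the same value, and unsigned_int(None) returns None, matching the cases discussed above.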

If the argument to a constructor function is a literal, the result of the function may be evaluated statically; if an error is found during such evaluation, it may be reported as a static error.

Special rules apply to constructor functions for xs:QName and types derived from xs:QName and xs:NOTATION. See 22.2 Constructor functions for xs:QName and xs:NOTATION.

The argument is optional, and defaults to the context value (which will be atomized if necessary).

The following constructor functions for the built-in atomic types are supported:

  • xs:string(
    $value as xs:anyAtomicType? := .
    ) as xs:string?
  • xs:boolean(
    $value as xs:anyAtomicType? := .
    ) as xs:boolean?
  • xs:decimal(
    $value as xs:anyAtomicType? := .
    ) as xs:decimal?
  • xs:float(
    $value as xs:anyAtomicType? := .
    ) as xs:float?

    Implementations should return negative zero for xs:float("-0.0E0"). But because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.

  • xs:double(
    $value as xs:anyAtomicType? := .
    ) as xs:double?

    Implementations should return negative zero for xs:double("-0.0E0"). But because [XML Schema Part 2: Datatypes Second Edition] does not distinguish between the values positive zero and negative zero, implementations may return positive zero in this case.

  • xs:duration(
    $value as xs:anyAtomicType? := .
    ) as xs:duration?
  • xs:dateTime(
    $value as xs:anyAtomicType? := .
    ) as xs:dateTime?
  • xs:time(
    $value as xs:anyAtomicType? := .
    ) as xs:time?
  • xs:date(
    $value as xs:anyAtomicType? := .
    ) as xs:date?
  • xs:gYearMonth(
    $value as xs:anyAtomicType? := .
    ) as xs:gYearMonth?
  • xs:gYear(
    $value as xs:anyAtomicType? := .
    ) as xs:gYear?
  • xs:gMonthDay(
    $value as xs:anyAtomicType? := .
    ) as xs:gMonthDay?
  • xs:gDay(
    $value as xs:anyAtomicType? := .
    ) as xs:gDay?
  • xs:gMonth(
    $value as xs:anyAtomicType? := .
    ) as xs:gMonth?
  • xs:hexBinary(
    $value as xs:anyAtomicType? := .
    ) as xs:hexBinary?
  • xs:base64Binary(
    $value as xs:anyAtomicType? := .
    ) as xs:base64Binary?
  • xs:anyURI(
    $value as xs:anyAtomicType? := .
    ) as xs:anyURI?
  • xs:QName(
    $value as xs:anyAtomicType? := .
    ) as xs:QName?

    See 22.2 Constructor functions for xs:QName and xs:NOTATION for special rules.

  • xs:normalizedString(
    $value as xs:anyAtomicType? := .
    ) as xs:normalizedString?
  • xs:token(
    $value as xs:anyAtomicType? := .
    ) as xs:token?
  • xs:language(
    $value as xs:anyAtomicType? := .
    ) as xs:language?
  • xs:NMTOKEN(
    $value as xs:anyAtomicType? := .
    ) as xs:NMTOKEN?
  • xs:Name(
    $value as xs:anyAtomicType? := .
    ) as xs:Name?
  • xs:NCName(
    $value as xs:anyAtomicType? := .
    ) as xs:NCName?
  • xs:ID(
    $value as xs:anyAtomicType? := .
    ) as xs:ID?
  • xs:IDREF(
    $value as xs:anyAtomicType? := .
    ) as xs:IDREF?
  • xs:ENTITY(
    $value as xs:anyAtomicType? := .
    ) as xs:ENTITY?

    See 23.1.10 Casting to xs:ENTITY for rules related to constructing values of type xs:ENTITY and types derived from it.

  • xs:integer(
    $value as xs:anyAtomicType? := .
    ) as xs:integer?
  • xs:nonPositiveInteger(
    $value as xs:anyAtomicType? := .
    ) as xs:nonPositiveInteger?
  • xs:negativeInteger(
    $value as xs:anyAtomicType? := .
    ) as xs:negativeInteger?
  • xs:long(
    $value as xs:anyAtomicType? := .
    ) as xs:long?
  • xs:int(
    $value as xs:anyAtomicType? := .
    ) as xs:int?
  • xs:short(
    $value as xs:anyAtomicType? := .
    ) as xs:short?
  • xs:byte(
    $value as xs:anyAtomicType? := .
    ) as xs:byte?
  • xs:nonNegativeInteger(
    $value as xs:anyAtomicType? := .
    ) as xs:nonNegativeInteger?
  • xs:unsignedLong(
    $value as xs:anyAtomicType? := .
    ) as xs:unsignedLong?
  • xs:unsignedInt(
    $value as xs:anyAtomicType? := .
    ) as xs:unsignedInt?
  • xs:unsignedShort(
    $value as xs:anyAtomicType? := .
    ) as xs:unsignedShort?
  • xs:unsignedByte(
    $value as xs:anyAtomicType? := .
    ) as xs:unsignedByte?
  • xs:positiveInteger(
    $value as xs:anyAtomicType? := .
    ) as xs:positiveInteger?
  • xs:yearMonthDuration(
    $value as xs:anyAtomicType? := .
    ) as xs:yearMonthDuration?
  • xs:dayTimeDuration(
    $value as xs:anyAtomicType? := .
    ) as xs:dayTimeDuration?
  • xs:untypedAtomic(
    $value as xs:anyAtomicType? := .
    ) as xs:untypedAtomic?
  • xs:dateTimeStamp(
    $value as xs:anyAtomicType? := .
    ) as xs:dateTimeStamp?

    Available only if the implementation supports XSD 1.1.
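
The remarks on xs:float("-0.0E0") and xs:double("-0.0E0") in the list above concern the distinction that IEEE 754 arithmetic draws between positive and negative zero. As an illustration only, the following Python sketch shows that distinction using Python's built-in float type (an IEEE 754 double on common platforms). The helper name is_negative_zero is hypothetical, and Python's float() parser is used as a stand-in for a constructor function, not as a model of the casting rules.

```python
import math

def is_negative_zero(x: float) -> bool:
    """True if x is zero and carries a negative sign bit."""
    return x == 0.0 and math.copysign(1.0, x) < 0

negative = float("-0.0E0")   # parsed from the same lexical form as in the text
positive = float("0.0E0")

# The two zeros compare equal, so the sign must be inspected explicitly.
zeros_compare_equal = (negative == positive)
```

Because the two values compare equal, a caller that needs to know which zero an implementation returned has to test the sign explicitly, as is_negative_zero does here.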

H Changes since 3.1 (Non-Normative)

H.1 Summary of Changes

  1. If a section of this specification has been updated since version 3.1, an overview of the changes is provided, along with links to navigate to the next or previous change.

    See 1 Introduction

  2. Sections with significant changes are marked with a ✭ symbol in the table of contents. New functions are indicated by ✚.

    See 1 Introduction

  3. PR 1504 2329 

    New in 4.0

    See 2.1.7 fn:insert-separator

  4. New in 4.0

    See 2.1.10 fn:replicate

  5. New in 4.0

    See 2.1.12 fn:slice

  6. PR 1120 1150 

    A callback function can be supplied for comparing individual items.

    See 2.2.4 fn:deep-equal

  7. Changed in 4.0 to use transitive equality comparisons for numeric values.

    See 2.2.5 fn:distinct-values

  8. PR 614 987 

    New in 4.0

    See 2.2.6 fn:duplicate-values

  9. New in 4.0. Originally proposed under the name fn:uniform

    See 2.4.2 fn:all-equal

  10. New in 4.0. Originally proposed under the name fn:unique

    See 2.4.3 fn:all-different

  11. New in 4.0

    See 2.5.3 fn:every

  12. New in 4.0

    See 2.5.9 fn:highest

  13. New in 4.0

    See 2.5.10 fn:index-where

  14. New in 4.0

    See 2.5.11 fn:lowest

  15. New in 4.0

    See 2.5.15 fn:scan-right

  16. New in 4.0

    See 2.5.16 fn:some

  17. PR 795 2228 

    New in 4.0

    See 2.5.19 fn:sort-with

  18. PR 521 761 

    New in 4.0

    See 2.5.22 fn:transitive-closure

  19. New in 4.0

    See 4.4.5 fn:is-NaN

  20. PR 1260 1275 

    A third argument has been added, providing control over the rounding mode.

    See 4.4.6 fn:round

  21. PR 1049 1151 

    Decimal format parameters can now be supplied directly as a map in the third argument, rather than referencing a format defined in the static context.

    See 4.7.2 fn:format-number

  22. PR 1205 1230 

    New in 4.0

    See 4.8.2 math:e

    See 4.8.8 math:cosh

    See 4.8.15 math:sinh

    See 4.8.18 math:tanh

  23. The 3.1 specification suggested that every value in the result range should have the same chance of being chosen. This has been corrected to say that the distribution should be arithmetically uniform (because there are as many xs:double values between 0.01 and 0.1 as there are between 0.1 and 1.0).

    See 4.9.2 fn:random-number-generator

  24. PR 261 306 993 

    New in 4.0

    See 5.4.1 fn:char

  25. New in 4.0

    See 5.4.2 fn:characters

  26. PR 937 995 1190 

    New in 4.0

    See 5.4.13 fn:hash

  27. PR 215 415  

    New in 4.0

    See 7.6.2 fn:parse-uri

  28. PR 1423 1413 

    New in 4.0

    See 7.6.3 fn:build-uri

  29. New in 4.0

    See 9.4.2 fn:build-dateTime

  30. New in 4.0

    See 9.6.16 fn:parts-of-dateTime

  31. New in 4.0

    See 12.2.2 fn:in-scope-namespaces

  32. PR 1620 1886 

    Options are added to customize the form of the output.

    See 12.2.9 fn:path

  33. PR 1547 1551 

    New in 4.0

    See 12.2.11 fn:siblings

  34. PR 969 1134 

    New in 4.0

    See 14.4.6 map:filter

  35. PR 478 515 

    New in 4.0

    See 14.4.12 map:keys-where

  36. PR 1575 1906 

    A new function fn:element-to-map is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input.

    See 14.5 Converting elements to maps

  37. New in 4.0

    See 15.2.3 array:empty

  38. PR 968 1295 

    New in 4.0

    See 15.2.13 array:index-of

  39. PR 476 1087 

    New in 4.0

    See 15.2.16 array:items

  40. PR 360 476 

    New in 4.0

    See 15.2.18 array:members

    See 15.2.19 array:of-members

  41. Supplying the empty sequence as the value of an optional argument is equivalent to omitting the argument.

    See 15.2.29 array:subarray

  42. PR 1117 1279 

    The $options parameter has been added.

    See 17.1.6 fn:unparsed-text-lines

  43. PR 259 956 

    A new function is available for processing input data in HTML format.

    See 17.3 Functions on HTML Data

    New in 4.0

    See 17.3.2 fn:parse-html

  44. PR 975 1058 1246 

    An option is provided to control how JSON numbers should be formatted.

    See 17.4.4 fn:parse-json

  45. Additional options are available, as defined by fn:parse-json.

    See 17.4.5 fn:json-doc

  46. PR 533 719 834 1066 

    New in 4.0

    See 17.5.4 fn:csv-to-arrays

    See 17.5.7 fn:parse-csv

  47. PR 533 719 834 1066 1605 

    New in 4.0

    See 17.5.10 fn:csv-to-xml

  48. PR 791 1256 1282 1405 

    New in 4.0

    See 17.6.1 fn:invisible-xml

  49. PR 629 803 

    New in 4.0

    See 21.2.2 fn:message

  50. PR 533 719 834 

    New functions are available for processing input data in CSV (comma separated values) format.

    See 17.5 Functions on CSV Data

  51. Comparison of mixed numeric types (for example xs:double and xs:decimal) now generally converts both values to xs:decimal.

    See 4.3 Comparing numeric values

  52. PR 289 1901 

    A third argument is added, allowing user control of how absent keys should be handled.

    See 14.4.9 map:get

    A third argument is added, allowing user control of how index-out-of-bounds conditions should be handled.

    See 15.2.11 array:get

  53. A new collation URI is defined for Unicode case-insensitive comparison and ordering.

    See 5.3.5 The Unicode case-insensitive collation

  54. PR 1727 1740 

    It is no longer guaranteed that the new key replaces the existing key.

    See 14.4.14 map:put

  55. The group may remove this function; it is considered at risk.

    See 15.2.18 array:members

    See 15.2.19 array:of-members

  56. PR 173 

    New in 4.0

    See 18.4 fn:op

  57. PR 203 

    New in 4.0

    See 14.4.1 map:build

  58. PR 207 

    New in 4.0

    See 10.1.2 fn:parse-QName

    See 10.2.4 fn:expanded-QName

  59. PR 222 

    New in 4.0

    See 2.2.3 fn:contains-subsequence

    See 2.2.7 fn:ends-with-subsequence

    See 2.2.9 fn:starts-with-subsequence

  60. PR 250 

    New in 4.0

    See 2.1.3 fn:foot

    See 2.1.15 fn:trunk

    See 15.2.2 array:build

    See 15.2.8 array:foot

    See 15.2.31 array:trunk

  61. PR 258 

    New in 4.0

    See 15.2.14 array:index-where

  62. PR 313 

    The second argument can now be a sequence of integers.

    See 2.1.9 fn:remove

  63. PR 319 

    New in 4.0. The function replaces the internal op:same-key function in 3.1

    See 2.2.1 fn:atomic-equal

  64. PR 326 

    Higher-order functions are no longer an optional feature.

    See 1.2 Conformance

  65. PR 360 

    New in 4.0

    See 14.4.4 map:entries

  66. PR 419 

    New in 4.0

    See 2.1.8 fn:items-at

  67. PR 434 

    New in 4.0

    See 4.5.2 fn:parse-integer

    The function has been extended to allow output in a radix other than 10, for example in hexadecimal.

    See 4.6.1 fn:format-integer

  68. PR 477 

    New in 4.0

    See 15.2.24 array:slice

  69. PR 482 

    Deleted an inaccurate statement concerning the behavior of NaN.

    See 4.3 Comparing numeric values

  70. PR 507 

    New in 4.0

    See 2.5.13 fn:partition

  71. PR 546 

    It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 5.2.1 fn:codepoints-to-string

    It is no longer automatically an error if the resource (after decoding) contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 17.1.5 fn:unparsed-text

    The rules regarding use of non-XML characters in JSON texts have been relaxed.

    See 17.4.3 JSON character repertoire

    See 17.4.4 fn:parse-json

    It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 17.4.5 fn:json-doc

  72. PR 609 

    New in 4.0

    See 15.2.28 array:split

  73. PR 631 

    New in 4.0

    See 7.1 fn:decode-from-uri

  74. PR 662 

    Constructor functions now have a zero-arity form; the first argument defaults to the context item.

    See 22 Constructor functions

  75. PR 680 

    The case-insensitive collation is now defined normatively within this specification, rather than by reference to the HTML "living specification", which is subject to change. The collation can now be used for ordering comparisons as well as equality comparisons.

    See 5.3.6 The HTML ASCII Case-Insensitive Collation

  76. PR 702 

    The function can now take any number of arguments (previously it had to be two or more), and the arguments can be sequences of strings rather than single strings.

    See 5.4.4 fn:concat

  77. PR 710 

    New in 4.0

    See 13.5 fn:function-annotations

  78. PR 727 

    It has been clarified that loading a module has no effect on the static or dynamic context of the caller.

    See 18.2 fn:load-xquery-module

  79. PR 828 

    The $predicate callback function accepts an optional position argument.

    See 2.5.4 fn:filter

    The $action callback function accepts an optional position argument.

    See 2.5.7 fn:for-each

    See 2.5.8 fn:for-each-pair

    The $predicate callback function now accepts an optional position argument.

    See 15.2.4 array:filter

    The $action callback function now accepts an optional position argument.

    See 15.2.9 array:for-each

    See 15.2.10 array:for-each-pair

  80. PR 881 

    The way that fn:min and fn:max compare numeric values of different types has changed. The most noticeable effect is that when these functions are applied to a sequence of xs:integer or xs:decimal values, the result is an xs:integer or xs:decimal, rather than the result of converting this to an xs:float or xs:double.

    See 2.4.5 fn:max

    See 2.4.6 fn:min

  81. PR 901 

    The optional third argument can now be supplied as the empty sequence.

    See 2.1.13 fn:subsequence

    The third argument can now be supplied as the empty sequence.

    See 5.4.6 fn:substring

    The second argument can now be the empty sequence.

    See 6.3.3 fn:tokenize

    The optional second argument can now be supplied as the empty sequence.

    See 7.5 fn:resolve-uri

    The 3rd, 4th, and 5th arguments are now optional; previously the function required either 2 or 5 arguments.

    See 9.9.1 fn:format-dateTime

    See 9.9.2 fn:format-date

    See 9.9.3 fn:format-time

    All three arguments are now optional, and each argument can be set to the empty sequence. Previously if $description was supplied, it could not be empty.

    See 21.1.1 fn:error

    The $label argument can now be set to the empty sequence. Previously if $label was supplied, it could not be empty.

    See 21.2.1 fn:trace

  82. PR 905 

    The rule that multiple calls on fn:doc supplying the same absolute URI must return the same document node has been clarified; in particular the rule does not apply if the dynamic context for the two calls requires different processing of the documents (such as schema validation or whitespace stripping).

    See 17.1.1 fn:doc

  83. PR 909 

    The function has been expanded in scope to handle comparison of values other than strings.

    See 2.2.2 fn:compare

  84. PR 924 

    Rules have been added clarifying that users should not be allowed to change the schema for the fn namespace.

    See D Schemas

  85. PR 925 

    The decimal format name can now be supplied as a value of type xs:QName, as an alternative to supplying a lexical QName as an instance of xs:string.

    See 4.7.2 fn:format-number

  86. PR 932 

    The specification now prescribes a minimum precision and range for durations.

    See 8.1.2 Limits and precision

  87. PR 933 

    When comments and processing instructions are ignored, any text nodes either side of the comment or processing instruction are now merged prior to comparison.

    See 2.2.4 fn:deep-equal

  88. PR 940 

    New in 4.0

    See 2.5.20 fn:subsequence-where

  89. PR 953 

    Constructor functions for named record types have been introduced.

    See 22.6 Constructor functions for named record types

  90. PR 962 

    New in 4.0

    See 2.5.2 fn:do-until

    See 2.5.23 fn:while-do

  91. PR 969 

    New in 4.0

    See 14.4.3 map:empty

  92. PR 984 

    New in 4.0

    See 8.4.1 fn:seconds

  93. PR 987 

    The order of results is now prescribed; it was previously implementation-dependent.

    See 2.2.5 fn:distinct-values

  94. PR 1022 

    Regular expressions can include comments (starting and ending with #) if the c flag is set.

    See 6.1 Regular expression syntax

    See 6.2 Flags

  95. PR 1028 

    An option is provided to control how the JSON null value should be handled.

    See 17.4.4 fn:parse-json

  96. PR 1032 

    New in 4.0

    See 2.1.17 fn:void

  97. PR 1046 

    New in 4.0

    See 2.5.21 fn:take-while

  98. PR 1059 

    Use of an option keyword that is not defined in the specification and is not known to the implementation now results in a dynamic error; previously it was ignored.

    See 1.7 Options

  99. PR 1068 

    New in 4.0

    See 5.4.3 fn:graphemes

  100. PR 1072 

    The return type is now specified more precisely.

    See 18.2 fn:load-xquery-module

  101. PR 1090 

    When casting from a string to a duration, time, or dateTime, it is now specified that if the fractional seconds contain more digits than the implementation is able to retain, the excess digits are truncated. Rounding upwards (which could affect the number of minutes or hours in the value) is not permitted.

    See 23.2 Casting from xs:string and xs:untypedAtomic
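
    The difference between truncating and rounding can be illustrated with Python's decimal module. This is a sketch, not the normative casting algorithm: the function name and the max_digits limit of 6 are assumptions chosen for the example, since the number of digits retained depends on the implementation.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def truncate_fractional_seconds(seconds_text, max_digits=6):
    """Keep at most max_digits fractional digits, discarding the rest.

    max_digits is a hypothetical implementation limit, not a value
    taken from the specification.
    """
    quantum = Decimal(1).scaleb(-max_digits)  # 0.000001 when max_digits is 6
    return Decimal(seconds_text).quantize(quantum, rounding=ROUND_DOWN)


# Truncation keeps the seconds field below 60.
print(truncate_fractional_seconds("59.9999999"))  # 59.999999

# Rounding would carry into the minutes field, which is what the rule forbids.
print(Decimal("59.9999999").quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP))  # 60.000000
```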

  102. PR 1093 

    New in 4.0

    See 5.3.9 fn:collation

  103. PR 1117 

    The $options parameter has been added.

    See 17.1.5 fn:unparsed-text

    See 17.1.7 fn:unparsed-text-available

  104. PR 1182 

    The $predicate callback function may return the empty sequence (meaning false).

    See 2.5.2 fn:do-until

    See 2.5.3 fn:every

    See 2.5.4 fn:filter

    See 2.5.10 fn:index-where

    See 2.5.16 fn:some

    See 2.5.21 fn:take-while

    See 2.5.23 fn:while-do

    See 14.4.6 map:filter

    See 14.4.12 map:keys-where

    See 15.2.4 array:filter

    See 15.2.14 array:index-where

  105. PR 1191 

    The $options parameter has been added, absorbing the $collation parameter.

    See 2.2.4 fn:deep-equal

    New in 4.0

    See 12.3.1 fn:distinct-ordered-nodes

  106. PR 1250 

    For selected properties including percent and exponent-separator, it is now possible to specify a single-character marker to be used in the picture string, together with a multi-character rendition to be used in the formatted output.

    See 4.7.2 fn:format-number

  107. PR 1257 

    The $options parameter has been added.

    See 17.2.1 fn:parse-xml

    See 17.2.2 fn:parse-xml-fragment

  108. PR 1262 

    New in 4.0

    See 5.3.10 fn:collation-available

  109. PR 1265 

    The constraints on the result of the function have been relaxed.

    See 12.1.2 fn:document-uri

  110. PR 1280 

    As a result of changes to the coercion rules, the number of supplied arguments can be greater than the number required: extra arguments are ignored.

    See 2.5.1 fn:apply

  111. PR 1288 

    Additional error conditions have been defined.

    See 17.2.1 fn:parse-xml

  112. PR 1296 

    New in 4.0

    See 2.5.14 fn:scan-left

  113. PR 1333 

    A new option is provided to allow the content of the loaded module to be supplied as a string.

    See 18.2 fn:load-xquery-module

  114. PR 1353 

    An option has been added to suppress the escaping of the solidus (forward slash) character.

    See 17.4.7 fn:xml-to-json

  115. PR 1358 

    New in 4.0

    See 9.4.3 fn:unix-dateTime

  116. PR 1361 

    The term atomic value has been replaced by atomic item.

    See 1.9 Terminology

  117. PR 1393 

    The function now returns a sequence of key-value pairs rather than a map.

    See 13.5 fn:function-annotations

  118. PR 1409 

    This section now uses the term primitive type strictly to refer to the 20 atomic types that are not derived by restriction from another atomic type: that is, the 19 primitive atomic types defined in XSD, plus xs:untypedAtomic. The three types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration, which have custom casting rules but are not, strictly speaking, primitive, are now handled in other subsections.

    See 23.1 Casting from primitive types to primitive types

    The rules for conversion of dates and times to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since these deliver exactly the same result as the XPath 3.1 rules.

    See 23.1.2.2 Casting date/time values to xs:string

    The rules for conversion of durations to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since the XSD 1.1 rules deliver exactly the same result as the XPath 3.1 rules.

    See 23.1.2.3 Casting xs:duration values to xs:string

  119. PR 1455 

    Numbers now retain their original lexical form, except for any changes needed to satisfy JSON syntax rules (for example, stripping leading zero digits).

    See 17.4.7 fn:xml-to-json

  120. PR 1473 

    New in 4.0

    See 2.1.5 fn:identity

  121. PR 1481 

    The function has been extended to handle other Gregorian types such as xs:gYearMonth.

    See 9.6.1 fn:year-from-dateTime

    See 9.6.2 fn:month-from-dateTime

    The function has been extended to handle other Gregorian types such as xs:gMonthDay.

    See 9.6.3 fn:day-from-dateTime

    The function has been extended to handle other types including xs:time.

    See 9.6.4 fn:hours-from-dateTime

    See 9.6.5 fn:minutes-from-dateTime

    The function has been extended to handle other types such as xs:gYearMonth.

    See 9.6.7 fn:timezone-from-dateTime

  122. PR 1523 

    New functions are provided to obtain information about built-in types and types defined in an imported schema.

    See 19 Processing types

    New in 4.0

    See 19.1.2 fn:schema-type

    See 19.1.4 fn:atomic-type-annotation

    See 19.1.5 fn:node-type-annotation

  123. PR 1545 

    New in 4.0

    See 9.7.4 fn:civil-timezone

  124. PR 1565 

    The default for the escape option has been changed to false. The 3.1 specification gave the default value as true, but this appears to have been an error, since it was inconsistent with examples given in the specification and with tests in the test suite.

    See 17.4.4 fn:parse-json

  125. PR 1570 

    New in 4.0

    See 19.1.3 fn:type-of

  126. PR 1587 

    New in 4.0

    See 17.1.8 fn:unparsed-binary

  127. PR 1611 

    The spec has been corrected to note that the function depends on the implicit timezone.

    See 2.2.2 fn:compare

  128. PR 1671 

    New in 4.0.

    See 4.4.3 fn:divide-decimals

  129. PR 1687 

    New in 4.0

    See 14.4.10 map:items

  130. PR 1703 

    Ordered maps are introduced.

    See 14.1 Ordering of Maps

    Enhanced to allow for ordered maps.

    See 14.4.6 map:filter

    See 14.4.7 map:find

    See 14.4.8 map:for-each

    See 14.4.14 map:put

    See 14.4.15 map:remove

    The order of entries in maps is retained.

    See 17.4.4 fn:parse-json

  131. PR 1711 

    It is explicitly stated that the limits for $precision are implementation-defined.

    See 4.4.6 fn:round

    See 4.4.7 fn:round-half-to-even

  132. PR 1727 

    For consistency with the new function map:build, the handling of duplicates may now be controlled by supplying a user-defined callback function as an alternative to the fixed values for the earlier duplicates option.

    See 14.4.13 map:merge

  133. PR 1734 

    In 3.1, given a mixed input sequence such as (1, 3, 4.2e0), the specification was unclear whether it was permitted to add the first two integer items using integer arithmetic, rather than converting all items to doubles before performing any arithmetic. The 4.0 specification is clear that this is permitted; but since the items can be reordered before being added, this is not required.

    See 2.4.4 fn:avg

    See 2.4.7 fn:sum

  134. PR 1801 

    New in 4.0

    See 13.4 fn:function-identity

  135. PR 1825 

    New in 4.0

    See 2.5.12 fn:partial-apply

  136. PR 1856 

    Word boundaries can be matched. Lookahead and lookbehind assertions are supported. Assertions (including ^ and $) can no longer be followed by a quantifier.

    See 6.1 Regular expression syntax

    The output of the function is extended to allow the representation of captured groups found within lookahead assertions.

    See 6.3.4 fn:analyze-string

  137. PR 1879 

    Additional options to control DTD and XInclude processing have been added.

    See 17.2.1 fn:parse-xml

  138. PR 1897 

    The $replacement argument can now be a function that computes the replacement strings.

    See 6.3.2 fn:replace

  139. PR 1906 

    New in 4.0

    See 14.5.10 fn:element-to-map-plan

    New in 4.0.

    See 14.5.11 fn:element-to-map

  140. PR 1910 

    An $options parameter is added. Note that the rules for the $options parameter control aspects of processing that were implementation-defined in earlier versions of this specification. An implementation may provide configuration options designed to retain backwards-compatible behavior when no explicit options are supplied.

    See 17.1.1 fn:doc

    See 17.1.2 fn:doc-available

  141. PR 1913 

    It is now permitted for the regular expression to match a zero-length string.

    See 6.3.2 fn:replace

    See 6.3.3 fn:tokenize

    See 6.3.4 fn:analyze-string

  142. PR 1933 

    New in 4.0

    See 17.2.5 fn:xsd-validator

  143. PR 1991 

    Named record types used in the signatures of built-in functions are now available as standard in the static context.

    See C Built-in named record types

  144. PR 2001 

    New in 4.0.

    See 2.5.18 fn:sort-by

    See 15.2.26 array:sort-by

  145. PR 2013 

    Support for binary input has been added.

    See 17.2.1 fn:parse-xml

    See 17.2.2 fn:parse-xml-fragment

    New in 4.0

    See 17.3.3 fn:html-doc

    See 17.5.8 fn:csv-doc

  146. PR 2030 

    This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned.

    See 17.2.4 XSD validation

  147. PR 2031 

    Introduced the concept of JNodes.

    See 16 Processing JNodes

    New in 4.0

    See 16.1.1 fn:jtree

    See 16.1.2 fn:jnode-content

    See 16.1.3 fn:jnode-selector

    See 16.1.4 fn:jnode-position

  148. PR 2149 

    Generalized to work with JNodes as well as XNodes.

    See 12.2.1 fn:has-children

    The function is extended to handle JNodes.

    See 12.2.9 fn:path

    Generalized to work with JNodes as well as XNodes.

    See 12.3.2 fn:innermost

    See 12.3.3 fn:outermost

  149. PR 2168 

    Atomic items of types xs:hexBinary and xs:base64Binary are now mutually comparable. In rare cases, where an application uses both types and assumes they are distinct, this can represent a backwards incompatibility.

    See 2.2.1 fn:atomic-equal

    See 2.2.4 fn:deep-equal

    See 2.2.5 fn:distinct-values

  150. PR 2223 

    An error may now be raised if the base URI is not a valid LEIRI reference.

    See 12.1.1 fn:base-uri

  151. PR 2224 

    The $action callback function now accepts an optional position argument.

    See 14.4.6 map:filter

    See 14.4.8 map:for-each

  152. PR 2228 

    New in 4.0

    See 15.2.27 array:sort-with

  153. PR 2249 

    The specification now describes in more detail how to determine the effective encoding value.

    See 17.1.5 fn:unparsed-text

  154. PR 2256 

    In the interests of consistency, the index-of function now defines equality to mean contextually equal. As a consequence, NaN is now considered equal to NaN.

    See 2.2.8 fn:index-of
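
    The effect on NaN can be modelled in a few lines of Python. This is a deliberately simplified sketch: the helper contextually_equal below is a hypothetical approximation that covers only the NaN case, whereas the specification's definition of contextual equality also involves matters such as collations and timezones.

```python
import math


def contextually_equal(a, b):
    """Hypothetical, simplified equality test in which NaN matches NaN."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b


def index_of(seq, target):
    """Return the 1-based positions of items equal to target."""
    return [pos for pos, item in enumerate(seq, start=1)
            if contextually_equal(item, target)]


nan = float("nan")
seq = [1.0, nan, 3.0, nan]

print(index_of(seq, nan))  # [2, 4]

# With ordinary numeric equality, NaN never matches, so nothing is found.
print([pos for pos, item in enumerate(seq, start=1) if item == nan])  # []
```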

  155. PR 2259 

    A new parameter canonical is available to give control over serialization of XML, XHTML, and JSON.

    See 17.2.3 fn:serialize

  156. PR 2286 

    The type of $value has been generalized to xs:anyAtomicType?.

    See 5.4.7 fn:string-length

    See 5.4.8 fn:normalize-space

  157. PR 2387 

    It is now recommended that out-of-range xs:double values should translate to positive or negative infinity.

    See 17.4.4 fn:parse-json
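
    As an analogy only, Python's standard json module behaves in the way recommended here when a JSON number exceeds the range of an IEEE 754 double. The snippet is not an implementation of fn:parse-json, and the key names big and small are made up for the example.

```python
import json
import math

# 1e999 is far outside the range of a 64-bit double.
doc = json.loads('{"big": 1e999, "small": -1e999}')

print(doc["big"], doc["small"])  # inf -inf
print(math.isinf(doc["big"]))    # True
```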