Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: Specification in XML format and XML function catalog.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 3.1]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.
A summary of changes since version 3.1 is provided at G Changes since version 3.1.
This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
Maps were introduced as a new datatype in XDM 3.1. This section describes functions that operate on maps.
A map is a kind of item.
[Definition] A map consists of a sequence of entries, also known as key-value pairs. Each entry comprises a key which is an arbitrary atomic item, and an arbitrary sequence called the associated value.
[Definition] Within a map, no two entries have the same key. Two atomic items K1 and K2 are the same key for this purpose if the function call fn:atomic-equal($K1, $K2) returns true.
It is not necessary that all the keys in a map should be of the same type (for example, they can include a mixture of integers and strings).
Maps are immutable, and have no identity separate from their content. For example, the map:remove function returns a map that differs from the supplied map by the omission (typically) of one entry, but the supplied map is not changed by the operation. Two calls on map:remove with the same arguments return maps that are indistinguishable from each other; there is no way of asking whether these are “the same map”.
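As an informal illustration only, the behavior described above can be modeled in Python. This is a sketch, not part of this specification: the helper name map_remove and the use of a read-only Python mapping are assumptions made for the example, and Python's key equality differs from fn:atomic-equal.

```python
from types import MappingProxyType


def map_remove(m, key):
    """Return a new read-only mapping without `key`; `m` itself is untouched."""
    return MappingProxyType({k: v for k, v in m.items() if k != key})


# Keys of different types may coexist in one map.
original = MappingProxyType({"a": 1, "b": 2, 3: "three"})

first = map_remove(original, "a")
second = map_remove(original, "a")

# The supplied map still has all three entries; both results have the same content.
print(dict(original))
print(dict(first) == dict(second))
```

The two results can only be compared by content, which mirrors the statement that maps have no identity separate from their content.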
A map can also be viewed as a function from keys to associated values. To achieve this, a map is also a function item. The function corresponding to the map has the signature function($key as xs:anyAtomicType) as item()*. Calling the function has the same effect as calling the map:get function: the expression $map($key) returns the same result as map:get($map, $key). For example, if $books-by-isbn is a map whose keys are ISBNs and whose associated values are book elements, then the expression $books-by-isbn("0470192747") returns the book element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to higher-order functions that expect a function item as one of their arguments.
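The dual nature of a map, as a collection of entries and as a function, can be sketched informally in Python. The class name FunctionMap, the placeholder book value, and the modeling of an XDM sequence as a tuple (with an empty tuple for an absent key) are all assumptions of this sketch, not definitions from this specification.

```python
class FunctionMap:
    """Read-only map that can also be called like a one-argument function."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def get(self, key):
        # Assumed analogue of map:get; an absent key yields an empty tuple here.
        return self._entries.get(key, ())

    def __call__(self, key):
        # Calling the map has the same effect as calling get().
        return self.get(key)


# Hypothetical data: the value stands in for a book element.
books_by_isbn = FunctionMap({"0470192747": ("book element for ISBN 0470192747",)})

# Because the map is callable, it can be handed to a higher-order function.
results = list(map(books_by_isbn, ["0470192747", "0000000000"]))
print(results)
```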
A new function fn:elements-to-maps is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input. [Issue 528]
The fn:elements-to-maps function converts XML element nodes to maps, in a form suitable for serialization as JSON. This section describes the mappings used by this function.
This mapping is designed with three objectives:
It should be possible to represent any XML element as a map suitable for JSON serialization.
The resulting JSON should be intuitive and easy to use.
The JSON should be consistent and stable: small changes in the input should not result in large changes in the output.
Achieving all three objectives requires design compromises. It also requires sacrificing some other desiderata. In consequence:
The conversion is not lossless (see 17.5.5 Lost XDM Information for details).
The conversion is not streamable.
The results are not necessarily compatible with those produced by other popular libraries.
The requirement for consistency and stability is particularly challenging. An element such as <name>John</name> maps naturally to the map { "name": "John" }; but adding an attribute (so it becomes <name role="first">John</name>) then requires an incompatible change in the JSON representation. The format could be made extensible by converting <name>John</name> to { "name": {"#content":"John"} } and <name role="first">John</name> to { "name": { "@role":"first", "#content":"John" } }, but this imposes unwanted complexity on the simplest cases. The solution adopted is threefold:
The function makes use of schema information where available, so it considers not just the structure of an individual element instance, but the rules governing the element type.
It is possible to request uniform layout for all elements sharing the same name, so the decision is based on the structure of all elements with a given name, not just an individual element.
It is possible to override the choice made by the system, and explicitly specify a layout to be used for elements having a given name.
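The stability problem that motivates this threefold solution can be made concrete with a small Python sketch. The two function names are hypothetical and neither function is the mapping defined by this specification; they only reproduce the two candidate representations of the name element discussed above.

```python
def compact(name, text, attrs=None):
    """Simplest mapping: a bare string unless attributes force a nested map."""
    if not attrs:
        return {name: text}
    value = {"@" + k: v for k, v in attrs.items()}
    value["#content"] = text
    return {name: value}


def extensible(name, text, attrs=None):
    """Always nest, so adding an attribute never changes the type of the value."""
    value = {"@" + k: v for k, v in (attrs or {}).items()}
    value["#content"] = text
    return {name: value}


# Adding an attribute changes the value from a string to a map ...
print(compact("name", "John"))
print(compact("name", "John", {"role": "first"}))

# ... whereas the extensible form stays a map, at the cost of verbosity.
print(extensible("name", "John"))
print(extensible("name", "John", {"role": "first"}))
```

A consumer of the compact form has to handle both a string and a map under the same key, which is the incompatibility the layout selection rules are designed to avoid.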
The key challenge in mapping XML to JSON is in deciding how element content is to be represented. To illustrate the variety of mappings that are possible, the following table lists some examples of typical XML elements and their JSON equivalents:
| XML element | JSON equivalent |
|---|---|
| <hr/> | "hr": "" |
| <date-of-birth>2023-05-18</date-of-birth> | "date-of-birth": "2023-05-18" |
| <box width="5" height="10"/> | "box": { "@width": "5", "@height": "10" } |
| <label id="t41">Warning!</label> | "label": { "@id": "t41", "#content": "Warning!" } |
| <box> <width>5</width> <height>10</height> </box> | "box": { "width": 5, "height": 10 } |
| <polygon> <point x="0" y="0"/> <point x="1" y="0"/> <point x="1" y="1"/> <point x="0" y="1"/> </polygon> | "polygon": [ { "x": 0, "y": 0 }, { "x": 1, "y": 0 }, { "x": 1, "y": 1 }, { "x": 0, "y": 1 } ] |
This specification defines a number of named mappings, called layouts, and allows the layout for a particular element to be selected in four different ways:
The layout to be used for a specific element name can be explicitly selected in the options to the fn:elements-to-maps function.
In the absence of an explicit selection, if the data has been schema-validated, the layout is inferred from the content model for the element type as defined in the schema.
When the data is untyped and no specific layout has been selected, a default layout is chosen based on the properties of the individual element instance.
If the uniform option is set to true, then the same layout will be used for all elements with a given name. This means that all elements need to be examined before any element is converted.
It is possible to disable some of the layouts so they will never be chosen by the automatic rules, but only when explicitly selected.
The advantage of using schema information is that it gives a consistent representation for all elements of a particular type, even if they vary in content: for example if an element type allows optional attributes, the JSON representation will be consistent between those elements that have attributes and those without. In the absence of a schema, consistency can be achieved either by using the uniform option, or by selecting a layout explicitly in the layouts option.
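The order of precedence among explicit selection, schema-based inference, and the instance-based default can be sketched as follows. This is an informal Python model: the function choose_layout, its parameters, and the placeholder layout names layout-a, layout-b and layout-c are all hypothetical, and the detailed selection rules (including the uniform option) are those given in 17.5.2 Selecting an Element Layout, not this sketch.

```python
def choose_layout(name, instance_default, explicit=None, from_schema=None):
    """Return the layout for element `name`, most specific source first."""
    if explicit and name in explicit:
        return explicit[name]        # 1. explicitly selected in the options
    if from_schema and name in from_schema:
        return from_schema[name]     # 2. inferred from the schema, if validated
    return instance_default          # 3. default based on the element instance


# Hypothetical inputs: placeholder layout names keyed by element name.
explicit = {"label": "layout-a"}
from_schema = {"label": "layout-b", "box": "layout-b"}

picked = [
    choose_layout("label", "layout-c", explicit, from_schema),
    choose_layout("box", "layout-c", explicit, from_schema),
    choose_layout("hr", "layout-c", explicit, from_schema),
]
print(picked)
```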
The different layouts available are defined in the following sections. For each layout there is a table showing:
Layout name: the name to be used to select this layout in the $options parameter of the fn:elements-to-maps function.
Usage: the situations for which this layout is designed.
Example input: an example of a typical element for which this layout is appropriate, shown as serialized XML.
Example output: the result of converting this example, shown as serialized JSON. The result is always shown as a singleton map, which is how it will appear when the layout is used for the top-level elements supplied in the $elements argument; when used to convert a descendant element, the corresponding key-value pair may appear as part of a larger map, depending on the layout chosen for its parent element.
Note:
The fn:elements-to-maps function produces maps as its result, but it is convenient to illustrate the form of the map by showing the effect of serializing the map as JSON.
Mapping rules: The rules for mapping the XML element to an XDM map representation.
Mapping for nilled elements: special rules that apply to an element having the attribute xsi:nil="true". These rules only apply if the element has been schema-validated.
Notes: General observations, especially concerning what information is retained by this mapping and what information is lost.
The rules for selecting the layout for a particular element are given later, in 17.5.2 Selecting an Element Layout.
Note that it is possible to use any layout for any element. Use of an inappropriate layout may result in information being discarded; but in some cases, discarding information may be the desired outcome.
Note:
Acknowledgements for this categorization: see [Goessner]. Although Goessner's categories have been used, the actual mappings vary from his proposal.
| Layout name | |
|---|---|
| Usage | Intended primarily for XML elements that contain multiple child elements, with different names, where the order of the child elements is not significant. Also used for elements whose content is a single element node child. The element may or may not have attributes. |
| Example input (1) | <employee id="x"> <date-of-birth>1984-03-20</date-of-birth> <location>Germany</location> <position>Janitor</position> </employee> |
| Example output (1) | { "employee": { "@id": "x", "date-of-birth": "1984-03-20", "location": "Germany", "position": "Janitor" } } |
| Example input (2) | <employee id="x"> <date-of-birth>1984-03-20</date-of-birth> <location>Germany</location> <position>Janitor</position> <position>Gardener</position> </employee> |
| Example output (2) | { "employee": { "@id": "x", "date-of-birth": "1984-03-20", "location": "Germany", "position": [ "Janitor", "Gardener" ] } } |
| Mapping rules | If the element has non-whitespace text node children, then it is output as if mixed layout were chosen (see 17.5.1.9 Layout: Mixed). This is fallback behavior for use when this layout is chosen inappropriately. In other cases, the content is represented by a map containing one entry for each attribute in the XML element, plus one entry for each child element, whose value is formatted according to the rules for that element. If two or more child elements have the same name, or names that are represented by the same string (taking into account the chosen name representation), they are combined into a single entry whose value is an array, as in example output (2). Because the child elements are converted to a map, their order is not retained. The entry order of the resulting map first contains entries derived from attributes (in unpredictable order), then entries derived from child elements, in order of first appearance. |
| Mapping for nilled elements | Alongside any attributes, the value includes the additional entry |
| Notes | Although this layout is intended primarily for elements whose children are unordered and uniquely named, it is also viable to use it in cases where elements can repeat, so long as order relative to other elements is not significant. Comments and processing instructions in the content are discarded. |
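The mapping rules in the table above can be approximated by the following Python sketch, which uses the standard xml.etree.ElementTree module. It is illustrative only and makes several simplifying assumptions: the function names record_layout and child_value are hypothetical, child elements are treated as leaves whose value is their text, namespaces and nilled elements are ignored, and the fallback to mixed layout is not implemented.

```python
import json
import xml.etree.ElementTree as ET


def child_value(child):
    """Hypothetical stand-in for 'the rules for that element': leaf text only."""
    return child.text or ""


def record_layout(element):
    """Map attributes to "@name" entries and child elements to entries by name."""
    content = {"@" + name: value for name, value in element.attrib.items()}
    for child in element:
        value = child_value(child)
        if child.tag not in content:
            content[child.tag] = value                       # first occurrence
        elif isinstance(content[child.tag], list):
            content[child.tag].append(value)                 # third and later
        else:
            content[child.tag] = [content[child.tag], value]  # second occurrence
    return {element.tag: content}


xml = """<employee id="x">
  <date-of-birth>1984-03-20</date-of-birth>
  <location>Germany</location>
  <position>Janitor</position>
  <position>Gardener</position>
</employee>"""

result = record_layout(ET.fromstring(xml))
print(json.dumps(result, indent=2))
```

Because a Python dict preserves insertion order, the entries derived from attributes come first, followed by entries for child elements in order of first appearance, and the two position elements are combined into a single array as in example output (2).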