Please check the errata for any errors or issues reported since publication.
This document is also available in these non-normative formats: Specification in XML format and XML function catalog.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 3.1]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.
A summary of changes since version 3.1 is provided at G Changes since 3.1.
This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
Accessors and their semantics are described in [XQuery and XPath Data Model (XDM) 3.1]. Some of these accessors are exposed to the user through the functions described below.
Each of these functions has an arity-zero signature which is equivalent to the arity-one form, with the context value supplied as the implicit first argument. In addition, each of the arity-one functions accepts an empty sequence as the argument, in which case it generally delivers an empty sequence as the result: the exception is fn:string, which delivers a zero-length string.
| Function | Accessor | Accepts | Returns |
|---|---|---|---|
| fn:node-name | node-name | node (optional) | xs:QName (optional) |
| fn:nilled | nilled | node (optional) | xs:boolean (optional) |
| fn:string | string-value | item (optional) | xs:string |
| fn:data | typed-value | zero or more items | a sequence of atomic items |
| fn:base-uri | base-uri | node (optional) | xs:anyURI (optional) |
| fn:document-uri | document-uri | node (optional) | xs:anyURI (optional) |
| Function | Meaning |
|---|---|
| fn:node-name | Returns the name of a node, as an xs:QName. |
| fn:nilled | Returns true for an element that is nilled. |
| fn:string | Returns the value of $value represented as an xs:string. |
| fn:data | Returns the result of atomizing a sequence. This process flattens arrays, and replaces nodes by their typed values. |
| fn:base-uri | Returns the base URI of a node. |
| fn:document-uri | Returns the URI of a resource where a document can be found, if available. |
Returns the name of a node, as an xs:QName.
fn:node-name(
  $node as node()? := .
) as xs:QName?
The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.
The one-argument form of this function is deterministic, context-independent, and focus-independent.
If the argument is omitted, it defaults to the context value (.).
If $node is the empty sequence, the empty sequence is returned.
Otherwise, the function returns the result of the dm:node-name accessor as defined in [XQuery and XPath Data Model (XDM) 3.1] (see Section 6.7.10 node-name AccessorDM).
The following errors may be raised when $node is omitted:
If the context value is absentDM, type error [err:XPDY0002]XP.
If the context value is not a single node, type error [err:XPTY0004]XP.
For element and attribute nodes, the name of the node is returned as an xs:QName, retaining the prefix, namespace URI, and local part.
For processing instructions, the name of the node is returned as an xs:QName in which the prefix and namespace URI are absentDM.
For a namespace node, the function returns an empty sequence if the node represents the default namespace; otherwise it returns an xs:QName in which prefix and namespace URI are absentDM and the local part is the namespace prefix being bound.
For all other kinds of node, the function returns the empty sequence.
| Variables |
|---|
| let $e := <doc> <p id="alpha" xml:id="beta">One</p> <p id="gamma" xmlns="http://example.com/ns">Two</p> <ex:p id="delta" xmlns:ex="http://example.com/ns">Three</ex:p> <?pi 3.14159?> </doc> |

| Expression | Result |
|---|---|
| | QName("", "p") |
| | QName("http://example.com/ns", "p") |
| | QName("http://example.com/ns", "ex:p") |
| | QName("", "pi") |
| | () |
| | QName("", "id") |
| | xs:QName("xml:id") |
In XPath 4.0, statically-known QNames can be expressed using a QName literal such as #xml:space. Where the QName is not known statically, the xs:QName constructor function can be used.
In addition to the xs:QName constructor function, QName values can be constructed by combining a namespace URI, prefix, and local name, or by resolving a lexical QName against the in-scope namespaces of an element node. This section defines functions that perform these operations. Leading and trailing whitespace, if present, is stripped from string arguments before the result is constructed.
| Function | Meaning |
|---|---|
| fn:QName | Returns an xs:QName value formed using a supplied namespace URI and lexical QName. |
| fn:parse-QName | Returns an xs:QName value formed by parsing an EQName. |
| fn:resolve-QName | Returns an xs:QName value (that is, an expanded-QName) by taking an xs:string that has the lexical form of an xs:QName (a string in the form "prefix:local-name" or "local-name") and resolving it using the in-scope namespaces for a given element. |
Returns an xs:QName value formed by parsing an EQName.
fn:parse-QName(
  $value as xs:string?
) as xs:QName?
This function is deterministic, context-dependent, and focus-independent. It depends on namespaces.
If $value is an empty sequence, the result is an empty sequence.
Otherwise, leading and trailing whitespace in $value is stripped.
If the resulting $value is castable to xs:NCName, the result is fn:QName("", $value): that is, a QName in no namespace.
Otherwise, if the resulting $value is in the lexical space of xs:QName (that is, if it is in the form prefix:local), the result is xs:QName($value). Note that this result depends on the in-scope prefixes in the static context, and may result in various error conditions.
Otherwise, if the resulting $value takes the form of an XPath BracedURILiteralXP (that is, Q{uri}local, where the uri part may be zero-length), then the result is fn:QName(uri, local).
The rules used for parsing a BracedURILiteralXP within a URIQualifiedNameXP are the XPath rules, not the XQuery rules (the XQuery rules require special characters such as < and & to be escaped).
A dynamic error is raised [err:FOCA0002] if the supplied value of $value, after whitespace normalization, does not match the XPath production EQNameXP.
A dynamic error is raised [err:FONS0004] if the supplied value of $value, after whitespace normalization, is in the form prefix:local (with a non-absent prefix), and the prefix cannot be resolved to a namespace URI using the in-scope namespace bindings from the static context.
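The parsing rules above can be modelled outside XPath. The following Python sketch is illustrative only and is not part of the specification: the function name `parse_qname`, the `namespaces` dictionary (standing in for the in-scope namespace bindings of the static context), and the ASCII-only NCName pattern are all assumptions made for the example. The result is a `(namespace URI, local name)` pair; unlike a real xs:QName, it does not retain the prefix.

```python
import re

# Simplified NCName pattern (assumption: ASCII letters, digits, '_', '-', '.');
# the real xs:NCName production admits many more Unicode characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"


def parse_qname(value, namespaces):
    """Return (namespace_uri, local_name) for an EQName string, or None."""
    if value is None:                  # empty sequence in, empty sequence out
        return None
    value = value.strip()              # strip leading and trailing whitespace
    if re.fullmatch(NCNAME, value):
        return ("", value)             # unprefixed name: a QName in no namespace
    match = re.fullmatch(f"({NCNAME}):({NCNAME})", value)
    if match:
        prefix, local = match.groups()
        if prefix not in namespaces:   # prefix has no namespace binding
            raise ValueError(f"FONS0004: no binding for prefix {prefix!r}")
        return (namespaces[prefix], local)
    match = re.fullmatch(r"Q\{([^{}]*)\}(" + NCNAME + ")", value)
    if match:                          # Q{uri}local, where uri may be zero-length
        return (match.group(1), match.group(2))
    raise ValueError(f"FOCA0002: {value!r} is not a valid EQName")
```

For instance, `parse_qname("ex:p", {"ex": "http://example.com/ns"})` and `parse_qname("Q{http://example.com/ns}p", {})` both yield the pair `("http://example.com/ns", "p")` in this model.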
These functions convert between the lexical representation and XPath and XQuery data model representation of various file formats.
The functions listed in this section parse or serialize JSON data.
JSON is a popular format for exchange of structured data on the web: it is specified in [RFC 7159]. This section describes facilities allowing JSON data to be converted to and from XDM values.
This specification describes two ways of representing JSON data losslessly using XDM constructs. The first method uses XDM maps to represent JSON objects, and XDM arrays to represent JSON arrays. The second method represents all JSON constructs using XDM element and attribute nodes.
| Function | Meaning |
|---|---|
| fn:parse-json | Parses a string supplied in the form of a JSON text, returning the results typically in the form of a map or array. |
| fn:json-doc | Reads an external resource containing JSON, and returns the result of parsing the resource as JSON. |
| fn:json-to-xml | Parses a string supplied in the form of a JSON text, returning the results in the form of an XML document node. |
| fn:xml-to-json | Converts an XML tree, whose format corresponds to the XML representation of JSON defined in this specification, into a string conforming to the JSON grammar. |
| fn:pin | Adapts a map or array so that retrieval operations retain additional information. |
| fn:label | Returns the label associated with a labeled item, as a map. |
Note also:
The function fn:serialize has an option to generate JSON output from a structure of maps and arrays.
The function fn:element-to-map enables arbitrary XML node trees to be converted to trees of maps and arrays suitable for serializing as JSON.
The rules regarding use of non-XML characters in JSON texts have been relaxed. [Issue 414 PR 546 25 July 2023]
An option is provided to control how the JSON null value should be handled. [Issue 960 PR 1028 20 February 2024]
An option is provided to control how JSON numbers should be formatted. [Issues 973 1037 PRs 975 1058 1246 12 March 2024]
The default for the escape option has been changed to false. The 3.1 specification gave the default value as true, but this appears to have been an error, since it was inconsistent with examples given in the specification and with tests in the test suite. [Issue 1555 PR 1565 11 November 2024]
The order of entries in maps is retained. [Issue 1651 PR 1703 14 January 2025]
Parses a string supplied in the form of a JSON text, returning the results typically in the form of a map or array.
fn:parse-json(
  $value as xs:string?,
  $options as map(*)? := {}
) as item()?
This function is deterministic, context-independent, and focus-independent.
If the second argument is omitted or an empty sequence, the result is the same as calling the two-argument form with an empty map as the value of the $options argument.
The first argument is a JSON text as defined in [RFC 7159], in the form of a string. The function parses this string to return an XDM value.
If $value is the empty sequence, the function returns the empty sequence.
Note:
The result will also be an empty sequence if $value is the string "null".
The $options argument can be used to control the way in which the parsing takes place. The option parameter conventions apply.
The entries that may appear in the $options map are as follows:
record(
  liberal? as xs:boolean,
  duplicates? as xs:string,
  escape? as xs:boolean,
  fallback? as (fn(xs:string) as xs:anyAtomicType)?,
  null? as item()*,
  number-parser? as (fn(xs:untypedAtomic) as item()?)?
)
| Key | Value | Meaning |
|---|---|---|
| liberal | | Determines whether deviations from the syntax of RFC 7159 are permitted. |
| | false | The input must consist of an optional byte order mark (which is ignored) followed by a string that conforms to the grammar of JSON-text in [RFC 7159]. An error must be raised [err:FOJS0001] if the input does not conform to the grammar. |
| | true | The input may contain deviations from the grammar of [RFC 7159], which are handled in an implementation-defined way. (Note: some popular extensions include allowing quotes on keys to be omitted, allowing a comma to appear after the last item in an array, allowing leading zeroes in numbers, and allowing control characters such as tab and newline to be present in unescaped form.) Since the extensions accepted are implementation-defined, an error may be raised [err:FOJS0001] if the input does not conform to the grammar. |
| duplicates | | Determines the policy for handling duplicate keys in a JSON object. To determine whether keys are duplicates, they are compared using the Unicode codepoint collation, after expanding escape sequences, unless the escape option is set to true, in which case keys are compared in escaped form. |
| | reject | An error is raised [err:FOJS0003] if duplicate keys are encountered. |
| | use-first | If duplicate keys are present in a JSON object, all but the first of a set of duplicates are ignored. |
| | use-last | If duplicate keys are present in a JSON object, all but the last of a set of duplicates are ignored. |
| escape | | Determines whether special characters are represented in the XDM output in backslash-escaped form. |
| | false | Any permitted character in the input, whether or not it is represented in the input by means of an escape sequence, is represented as an unescaped character in the result. Any other character or codepoint (for example, an unpaired surrogate) is passed to the fallback function as described below; in the absence of a fallback function, it is replaced by U+FFFD (REPLACEMENT CHARACTER, �). |
| | true | JSON escape sequences are used in the result to represent special characters in the JSON input, as defined below, whether or not they were represented using JSON escape sequences in the input. The characters that are considered “special” for this purpose are: \t), or a six-character escape sequence otherwise (for example \uDEAD). Characters other than these are not escaped in the result, even if they were escaped in the input. |
| fallback | | Provides a function which is called when the input contains an escape sequence that represents a character that is not a permitted character. It is an error to supply the fallback option if the escape option is present with the value true. |
| | User-supplied function | The function is called when the JSON input contains a character that is not a permitted character. It is called once for any surrogate that is not properly paired with another surrogate. The untyped atomic item supplied as the argument will always be a two- or six-character escape sequence, starting with a backslash, that conforms to the rules in the JSON grammar (as extended by the implementation if liberal:true() is specified): for example \b or \uFFFF or \uDEAD. By default, the escape sequence is replaced with the Unicode REPLACEMENT CHARACTER (U+FFFD). |
| null | | Determines how the JSON null value should be represented. |
| | Value | The supplied XDM value is used to represent the JSON null value. The default representation of null is an empty sequence, which works well in cases where setting a property of an object to null has the same meaning as omitting the property. It works less well in cases where null is used with some other meaning, because expressions such as the lookup operators ? and ?? flatten the result to a single sequence of items, which means that any entries whose value is an empty sequence effectively disappear. The property can be set to any XDM value; a suggested value is the xs:QName value fn:QName("http://www.w3.org/2005/xpath-functions", "null"), which is recognized by the JSON serialization method as representing the JSON value null. |
| number-parser | | Determines how numeric values should be processed. |
| | User-supplied function | The supplied function is called to process the string value of any JSON number in the input. By default, numbers are processed by converting to xs:double using the XPath casting rules. Supplying the value xs:decimal#1 will instead convert to xs:decimal (which potentially retains more precision, but disallows exponential notation), while supplying a function that casts to (xs:decimal \| xs:double) will treat the value as xs:decimal if there is no exponent, or as xs:double otherwise. Supplying the value fn:identity#1 causes the value to be retained unchanged as an xs:untypedAtomic. If the liberal option is false (the default), then the supplied number-parser is called if and only if the value conforms to the JSON grammar for numbers (for example, a leading plus sign and redundant leading zeroes are not allowed). If the liberal option is true then it is also called if the value conforms to an implementation-defined extension of this grammar. |
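The interaction between non-permitted characters and the fallback option (with escape set to false) can be approximated in Python. This is a rough, non-normative sketch under stated assumptions: `parse_json_string` is a hypothetical helper that handles only a JSON text consisting of a single string, "permitted character" is approximated by the XML 1.0 Char production, and the escape sequence passed to the fallback is always generated in six-character `\uXXXX` form (the rules above also allow two-character forms such as `\b`).

```python
import json
import re

# Assumption: "permitted character" is approximated by the XML 1.0 Char production.
NOT_PERMITTED = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def parse_json_string(text, fallback=None):
    """Parse a JSON text holding one string, handling non-permitted characters."""
    value = json.loads(text)  # properly paired surrogates are combined here
    if not isinstance(value, str):
        raise TypeError("this sketch only handles a JSON string")

    def replace(match):
        # Simplification: always pass the six-character \uXXXX form.
        escape = "\\u%04X" % ord(match.group())
        # Without a fallback, substitute U+FFFD (REPLACEMENT CHARACTER).
        return fallback(escape) if fallback else "\uFFFD"

    return NOT_PERMITTED.sub(replace, value)
```

In this model an unpaired surrogate such as `\uDEAD` reaches the fallback exactly once, while a properly paired surrogate sequence is decoded to a single character and never reaches it.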
The various structures that can occur in JSON are transformed recursively to XDM values as follows:
A JSON object is converted to a map. The entries in the map correspond to the key/value pairs in the JSON object. The key is always of type xs:string; the associated value may be of any type, and is the result of converting the JSON value by recursive application of these rules. For example, the JSON text { "x": 2, "y": 5 } is transformed to the value { "x": 2, "y": 5 }.
If duplicate keys are encountered in a JSON object, they are handled as determined by the duplicates option defined above.
The order of entries is retained.
A JSON array is transformed to an array whose members are the result of converting the corresponding member of the array by recursive application of these rules. For example, the JSON text [ "a", "b", null ] is transformed (by default) to the value [ "a", "b", () ].
A JSON string is converted to an xs:string value. The handling of special characters depends on the escape and fallback options, as described in the table above.
A JSON number is processed using the function supplied in the number-parser option; by default it is converted to an xs:double value using the rules for casting from xs:string to xs:double.
The JSON boolean values true and false are converted to the corresponding xs:boolean values.
The JSON value null is converted to the value given by the null option, which defaults to an empty sequence.
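The recursive conversion rules above have a close analogue in Python's standard `json` module, which can serve as an informal model. In the sketch below, `parse_json`, `number_parser` and `null` are hypothetical names chosen to mirror the number-parser and null options; Python's `float` stands in for xs:double, `None` for the empty sequence, and `dict`/`list` for maps and arrays. It is not an implementation of fn:parse-json.

```python
import json
from decimal import Decimal


def _reject_constant(name):
    # NaN and Infinity are not part of the JSON grammar.
    raise ValueError(f"FOJS0001: {name} is not valid JSON")


def parse_json(text, number_parser=float, null=None):
    """Model of the JSON-to-value mapping; dict order follows the input."""
    if text is None:                      # empty sequence in, empty sequence out
        return None

    def convert(value):
        if value is None:                 # JSON null -> value of the null option
            return null
        if isinstance(value, dict):       # JSON object -> map with string keys
            return {key: convert(item) for key, item in value.items()}
        if isinstance(value, list):       # JSON array -> array
            return [convert(item) for item in value]
        return value                      # strings, booleans, parsed numbers

    raw = json.loads(
        text,
        parse_int=number_parser,          # every JSON number goes through
        parse_float=number_parser,        # the same user-supplied parser
        parse_constant=_reject_constant,
    )
    return convert(raw)
```

Passing `number_parser=Decimal` corresponds loosely to supplying xs:decimal#1, and passing a sentinel value for `null` keeps null members visible in arrays instead of letting them collapse to nothing.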
A dynamic error [err:FOJS0001] occurs if the value of $value does not conform to the JSON grammar, unless the option "liberal":true() is present and the processor chooses to accept the deviation.
A dynamic error [err:FOJS0003] occurs if the option "duplicates": "reject" is present and the value of $value contains a JSON object with duplicate keys.
A dynamic error [err:FOJS0005] occurs if the $options map contains an entry whose key is defined in this specification and whose value is not valid for that key, or if it contains an entry with the key fallback when the option "escape":true() is also present.
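The three duplicates policies can be illustrated with a small Python model. The helper name `parse_json_with_policy` is hypothetical, and the policy is a required argument because this sketch makes no claim about the default. Keys are compared after escape sequences have been expanded, which corresponds to the behaviour described for escape set to false.

```python
import json

POLICIES = {"reject", "use-first", "use-last"}


def parse_json_with_policy(text, duplicates):
    """Parse JSON, resolving duplicate object keys according to a policy."""
    if duplicates not in POLICIES:
        raise ValueError(f"FOJS0005: invalid duplicates option {duplicates!r}")

    def build(pairs):
        # pairs holds the (key, value) entries of one JSON object, in input order.
        result = {}
        for key, value in pairs:
            if key in result:
                if duplicates == "reject":
                    raise ValueError(f"FOJS0003: duplicate key {key!r}")
                if duplicates == "use-first":
                    continue              # ignore all but the first duplicate
            result[key] = value           # new key, or use-last overwriting
        return result

    return json.loads(text, object_pairs_hook=build)
```

Because the hook receives keys in decoded form, the keys `"a"` and `"\u0061"` are treated as duplicates of each other in this model.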
The result of the function will be an instance of one of the following types. An instance of test (or in XQuery, typeswitch) can be used to distinguish them:
map(xs:string, item()?) for a JSON object
array(item()?) for a JSON array
xs:string for a JSON string
xs:double for a JSON number
xs:boolean for a JSON boolean
empty-sequence() for a JSON null (or for empty input)
If the input starts with a byte order mark, this function ignores it. The byte order mark may have been added to the data stream in order to facilitate decoding of an octet stream to a character string, but since this function takes a character string as input, the byte order mark serves no useful purpose.
The possibility of the input containing characters that are not valid in XML (for example, unpaired surrogates) arises only when such characters are expressed using JSON escape sequences. This is because the input to the function is an instance of xs:string, which by definition (see Section 4.1.5 XML and XSD VersionsDM) cannot contain unpaired surrogates.
The serializer provides an option to output data in json-lines format. This is a format for structured data containing one JSON value (usually but not necessarily a JSON object) on each line. There is no corresponding option to parse json-lines input, but this can be achieved using the expression unparsed-text-lines($uri) =!> parse-json(), which parses each line separately.
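The same line-by-line approach to json-lines input can be sketched in Python for comparison. The function name `parse_json_lines` is an assumption made for this example; the input is taken to be a string already read from the resource, and lines are split only on CR, LF and CRLF so that other separator characters inside JSON strings are left alone.

```python
import json
import re


def parse_json_lines(text):
    """Parse json-lines input: one JSON value per line, returned as a list."""
    lines = re.split(r"\r\n|\r|\n", text)
    if lines and lines[-1] == "":
        lines.pop()                       # a final newline does not add a line
    return [json.loads(line) for line in lines]
```

Each line is parsed independently, so a malformed line raises an error for that line without any ambiguity about where one value ends and the next begins.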
| Expression: | |
|---|---|
| Result: | { "x": 1e0, "y": [ 3e0, 4e0, 5e0 ] } |
| Expression: | |
| Result: | "abcd" |
| Expression: | |
| Result: | { "x": "\", "y": "%" } |
| Expression: | parse-json(
'{ "x": "\\", "y": "\u0025" }',
{ 'escape': true() }
) |
| Result: | { "x": "\\", "y": "%" } |
| Expression: | parse-json(
'{ "x": "\\", "y": "\u0000" }'
) |
| Result: | { "x": "\", "y": char(0xFFFD) } |
| Expression: | parse-json(
'{ "x": "\\", "y": "\u0000" }',
{ 'escape': true() }
) |
| Result: | { "x": "\\", "y": "\u0000" } |
| Expression: | parse-json(
'{ "x": "\\", "y": "\u0000" }',
{ 'fallback': fn($s) { '[' || $s || ']' } }
) |
| Result: | { "x": "\", "y": "[\u0000]" } |
| Expression: | parse-json(
"1984.2",
{ 'number-parser': fn { xs:integer(round(.)) } }
) |
| Result: | 1984 |
| Expression: | parse-json(
'[ 1, -1, 2 ]',
{ 'number-parser': fn { boolean(. >= 0) } }
) |
| Result: | [ true(), false(), true() ] |
| Expression: | parse-json('[ "a", null, "b" ]',
{ 'null': xs:QName("fn:null") }
) |
| Result: | [ "a", xs:QName("fn:null"), "b" ] |
The functions included in this section operate on function items, that is, values referring to a function.
[Definition] Functions that accept functions among their arguments, or that return functions in their result, are described in this specification as higher-order functions.
Note:
Some functions, such as fn:parse-json, allow the option of supplying a callback function, for example to define exception behavior. Where this is not essential to the use of the function, the function has not been classified as higher-order for this purpose; in applications where function items cannot be created, these particular options will not be available.
| Function | Meaning |
|---|---|
| fn:function-lookup | Returns a function item having a given name and arity, if there is one. |
| fn:function-name | Returns the name of the function identified by a function item. |
| fn:function-arity | Returns the arity of the function identified by a function item. |
| fn:function-identity | Returns a string representing the identity of a function item. |
| fn:function-annotations | Returns the annotations of the function item. |
Returns a function item having a given name and arity, if there is one.
fn:function-lookup(
  $name as xs:QName,
  $arity as xs:integer
) as fn(*)?
This function is deterministic, context-dependent, and focus-dependent.
A call to fn:function-lookup starts by looking for a function definitionXP in the named functions component of the dynamic context (specifically, the dynamic context of the call to fn:function-lookup), using the expanded QName supplied as $name and the arity supplied as $arity. There can be at most one such function definition.
If no function definition can be identified (by name and arity), then an empty sequence is returned.
If a function definition is identified, then a function item is obtained from the function definition using the same rules as for evaluation of a named function reference (see Section 4.5.5 Named Function ReferencesXP). The captured context of the returned function item (if it is context dependent) is the static and dynamic context of the call on fn:function-lookup.
If the arguments to fn:function-lookup identify a function that is present in the static context of the function call, the function will always return the same function that a static reference to this function would bind to. If there is no such function in the static context, then the results depend on what is present in the dynamic context, which is implementation-defined.
An error is raised if the identified function depends on components of the static or dynamic context that are not present, or that have unsuitable values. For example [err:XPDY0002]XP is raised for the call function-lookup(xs:QName("fn:name"), 0) if the context value is absent, and [err:FODC0001] is raised for the call function-lookup(xs:QName("fn:id"), 1) if the context value is not a single node in a tree that is rooted at a document node. The error that is raised is the same as the error that would be raised by the corresponding function if called with the same static and dynamic context.
This function can be useful where there is a need to make a dynamic decision on which of several statically known functions to call. It can thus be used as a substitute for polymorphism, in the case where the application has been designed so several functions implement the same interface.
The function can also be useful in cases where a query or stylesheet module is written to work with alternative versions of a library module. In such cases the author of the main module might wish to test whether an imported library module contains or does not contain a particular function, and to call a function in that module only if it is available in the version that was imported. A static call would cause a static error if the function is not available, whereas getting the function using fn:function-lookup allows the caller to take fallback action in this situation.
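The two use cases just described (dynamic selection among known functions, and graceful fallback when a function is unavailable) can be modelled in Python with a registry keyed by name and arity. Everything here is hypothetical and for illustration only: the `registry` dictionary stands in for the named functions component of the dynamic context, and its single entry is a toy stand-in for a two-argument substring function, not the real fn:substring.

```python
def function_lookup(registry, name, arity):
    """Return the function registered under (name, arity), or None if absent."""
    return registry.get((name, arity))


def call_if_available(registry, name, arity, *args):
    """Call the named function if it exists; otherwise take fallback action."""
    function = function_lookup(registry, name, arity)
    if function is None:
        return None                       # models returning an empty sequence
    return function(*args)


# Hypothetical registry: a toy two-argument substring with 1-based positions.
registry = {
    ("fn:substring", 2): lambda string, start: string[start - 1:],
}
```

With this registry, `function_lookup(registry, "fn:substring", 2)("abcd", 2)` yields `'bcd'`, whereas looking up a name that is not registered yields `None` rather than failing, which is the property that makes fallback logic possible.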
If the function that is retrieved by fn:function-lookup is context-dependent, that is, if it has dependencies on the static or dynamic context of its caller, the context that applies is the static and/or dynamic context of the call to the fn:function-lookup function itself. The context thus effectively forms part of the closure of the returned function. This mainly applies when the target of fn:function-lookup is a built-in function, because user-defined functions typically have no dependency on the static or dynamic context of the function call (an exception arises when the expressions used to define default values for parameters are context-dependent). The rule applies recursively, since fn:function-lookup is itself a context-dependent built-in function.
However, the static and dynamic context of the call to fn:function-lookup may play a role even when the selected function definition is not itself context dependent, if the expressions used to establish default parameter values are context dependent.
User-defined XSLT or XQuery functions should be accessible to fn:function-lookup only if they are statically visible at the location where the call to fn:function-lookup appears. This means that private functions, if they are not statically visible in the containing module, should not be accessible using fn:function-lookup.
The function identity is determined in the same way as for a named function reference. Specifically, if there is no context dependency, two calls on fn:function-lookup with the same name and arity must return the same function.
These specifications do not define any circumstances in which the dynamic context will contain functions that are not present in the static context, but neither do they rule this out. For example an API may provide the ability to add functions to the dynamic context, and such functions may potentially be context-dependent.
The mere fact that a function exists and has a name does not of itself mean that the function is present in the dynamic context. For example, functions obtained through use of the fn:load-xquery-module function are not added to the dynamic context.
| Expression: | |
|---|---|
| Result: | 'bcd' |

The expression let $f := function-lookup(xs:QName('zip:binary-entry'), 2) return if (exists($f)) then $f($source, $entry) else () returns the result of calling zip:binary-entry($source, $entry) if the function is available, or an empty sequence otherwise.
Maps were introduced as a new datatype in XDM 3.1. This section describes functions that operate on maps.
A map is a kind of item.
[Definition] A map consists of a sequence of entries, also known as key-value pairs. Each entry comprises a key which is an arbitrary atomic item, and an arbitrary sequence called the associated value.
[Definition] Within a map, no two entries have the same key. Two atomic items K1 and K2 are the same key for this purpose if the function call fn:atomic-equal($K1, $K2) returns true.
It is not necessary that all the keys in a map should be of the same type (for example, they can include a mixture of integers and strings).
Maps are immutable, and have no identity separate from their content. For example, the map:remove function returns a map that differs from the supplied map by the omission (typically) of one entry, but the supplied map is not changed by the operation. Two calls on map:remove with the same arguments return maps that are indistinguishable from each other; there is no way of asking whether these are “the same map”.
A map can also be viewed as a function from keys to associated values. To achieve this, a map is also a function item. The function corresponding to the map has the signature function($key as xs:anyAtomicType) as item()*. Calling the function has the same effect as calling the map:get function: the expression $map($key) returns the same result as map:get($map, $key). For example, if $books-by-isbn is a map whose keys are ISBNs and whose associated values are book elements, then the expression $books-by-isbn("0470192747") returns the book element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to higher-order functions that expect a function item as one of their arguments.
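Two properties described above, immutability and the ability to call a map as a function, can be illustrated with a small Python class. This is only an analogy: `XMap` is a hypothetical class, the empty sequence is modelled as an empty tuple, and the book values are placeholder strings rather than element nodes.

```python
from types import MappingProxyType


class XMap:
    """Immutable map that can also be called as a function from key to value."""

    def __init__(self, entries):
        # Copy the entries and expose them through a read-only view.
        self._entries = MappingProxyType(dict(entries))

    def __call__(self, key):
        # Calling the map behaves like a lookup; a missing key gives "empty".
        return self._entries.get(key, ())

    def remove(self, key):
        # Returns a new map; the map it is called on is left unchanged.
        return XMap({k: v for k, v in self._entries.items() if k != key})


books_by_isbn = XMap({"0470192747": "book-A", "0000000000": "book-B"})
```

Because an `XMap` is callable, it can be passed wherever a function is expected, for example `list(map(books_by_isbn, ["0470192747"]))`.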
The functions defined in this section use a conventional namespace prefix map, which is assumed to be bound to the namespace URI http://www.w3.org/2005/xpath-functions/map.
The function call map:get($map, $key) can be used to retrieve the value associated with a given key.
There is no operation to atomize a map or convert it to a string. The function fn:serialize can in some cases be used to produce a JSON representation of a map.
| Function | Meaning |
|---|---|
| map:build | Returns a map that typically contains one entry for each item in a supplied input sequence. |
| map:contains | Tests whether a supplied map contains an entry for a given key. |
| map:empty | Returns true if the supplied map contains no entries. |
| map:entries | Returns a sequence containing all the key-value pairs present in a map, each represented as a single-entry map. |
| map:entry | Returns a single-entry map that represents a single key-value pair. |
| map:filter | Selects entries from a map, returning a new map. |
| map:find | Searches the supplied input sequence and any contained maps and arrays for a map entry with the supplied key, and returns the corresponding values. |
| map:for-each | Applies a supplied function to every entry in a map, returning the sequence concatenationXP of the results. |
| map:get | Returns the value associated with a supplied key in a given map. |
| map:items | Returns a sequence containing all the values present in a map, in order. |
| map:keys | Returns a sequence containing all the keys present in a map. |
| map:keys-where | Returns a sequence containing selected keys present in a map. |
| map:merge | Returns a map that combines the entries from a number of existing maps. |
| map:of-pairs | Returns a map that combines data from a sequence of key-value pair maps. |
| map:pair | Returns a key-value pair map that represents a single key-value pair. |
| map:pairs | Returns a sequence containing all the key-value pairs present in a map, each represented as a key-value pair map. |
| map:put | Returns a map containing all the contents of the supplied map, but with an additional entry, which replaces any existing entry for the same key. |
| map:remove | Returns a map containing all the entries from a supplied map, except those having a specified key. |
| map:size | Returns the number of entries in the supplied map. |
Returns a map that combines data from a sequence of key-value pair maps.
map:of-pairs(
  $input as key-value-pair*,
  $options as map(*)? := {}
) as map(*)
This function is deterministic, context-independent, and focus-independent.
The function map:of-pairs returns a map which is formed by combining key-value pair maps supplied in the $input argument.
The $options argument can be used to control the way in which duplicate keys are handled. The option parameter conventions apply.
The entries that may appear in the $options map are as follows:
record(
  duplicates? as (enum("reject", "use-first", "use-last", "use-any", "combine") | fn(item()*, item()*) as item()*)?
)
| Key | Value | Meaning |
|---|---|---|
| duplicates | | Determines the policy for handling duplicate keys: specifically, the action to be taken if two entries in the input sequence have key values K1 and K2 where K1 and K2 are the same key. |
| | "reject" | Equivalent to supplying a function that raises a dynamic error with error code "FOJS0003". The effect is that duplicate keys result in an error. |
| | "use-first" | Equivalent to supplying the function fn($a, $b){ $a }. The effect is that the first of the duplicates is chosen. |
| | "use-last" | Equivalent to supplying the function fn($a, $b){ $b }. The effect is that the last of the duplicates is chosen. |
| | "use-any" | Equivalent to supplying the function fn($a, $b){ one-of($a, $b) } where one-of chooses either $a or $b in an implementation-dependent way. The effect is that it is implementation-dependent which of the duplicates is chosen. |
| | "combine" | Equivalent to supplying the function fn($a, $b){ $a, $b } (or equivalently, the function op(",")). The effect is that the result contains the sequence concatenationXP of the values having the same key, retaining order. |
| | function(*) | A function with signature fn(item()*, item()*) as item()*. The function is called for any entry in the input sequence that has the same key as a previous entry. The first argument is the existing value associated with the key; the second argument is the value associated with the key in the duplicate input entry, and the result is the new value to be associated with the key. The effect is cumulative: for example if there are three values X, Y, and Z associated with the same key, and the supplied function is F, then the result is an entry whose value is X => F(Y) => F(Z). |
The effect of the function is equivalent to the result of the following XPath expression, except in error cases.
let $one-of := fn($a, $b) {
(: select either $a or $b at implementation option :)
if (environment-variable("X")) then $a else $b
}
let $duplicates := $options?duplicates
let $combine := if ($duplicates instance of xs:string) then (
{
"reject": fn($a, $b) { error(xs:QName("err:FOJS0003")) },
"use-first": fn($a, $b) { $a },
"use-last": fn($a, $b) { $b },
"use-any": fn($a, $b) { $one-of($a, $b) },
"combine": fn($a, $b) { $a, $b }
}?$duplicates
) else if ($duplicates instance of fn(*)) then (
$duplicates
) else (
fn($a, $b) { $a, $b }
)
return fold-left(
$input,
{},
fn($out, $next) {
let $newVal := if (map:contains($out, $next?key)) then (
$combine($out?($next?key), $next?value)
) else (
$next?value
)
return map:put($out, $next?key, $newVal)
}
)
An error is raised [err:FOJS0003] if the value of $options indicates that duplicates are to be rejected, and a duplicate key is encountered.
In the formal equivalent shown above:
The call on error() is indicative; the implementation is free to raise the error in its own way.
The function $one-of($a, $b) is intended to illustrate that either $a or $b is returned, at the discretion of the implementation. A function body is provided for completeness, but it is not intended as a realistic implementation.
If the input is an empty sequence, the result is an empty map.
There is no requirement that the supplied key-value pairs should have the same or compatible types. The type of a map (for example map(xs:integer, xs:string)) is descriptive of the entries it currently contains, but is not a constraint on how the map may be combined with other maps.
When duplicate keys are encountered, the effect is that:
In the entry orderDM of the result map, the position of the entry containing the result of combining a set of entries with duplicate keys corresponds to the position of the first of the duplicates in the input sequence.
The key of the combined entry will correspond to the key of one of the duplicates: it is implementation-dependent which one is chosen. (Keys may be duplicates even though they differ: for example they may have different type annotations, or they might be xs:dateTime values in different timezones.)
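The folding behaviour described above can be modelled outside XPath. The following is a minimal Python sketch, not part of the specification: the names `of_pairs`, `POLICIES`, and `reject` are hypothetical, pairs are modelled as `(key, value)` tuples, and sequence values are modelled as Python lists when a named policy is used. It assumes that Python's insertion-ordered `dict` is an adequate stand-in for the entry order of an XDM map.

```python
from functools import reduce

def reject(a, b):
    # Models "reject": any duplicate key is an error (FOJS0003 in the spec).
    raise ValueError("FOJS0003: duplicate key")

# Named policies, each expressed as a two-argument combining function.
# With these policies, values are expected to be lists (modelling sequences).
POLICIES = {
    "reject": reject,
    "use-first": lambda a, b: a,
    "use-last": lambda a, b: b,
    "use-any": lambda a, b: a,      # either choice is allowed; this sketch picks the first
    "combine": lambda a, b: a + b,  # list concatenation, retaining order
}

def of_pairs(pairs, duplicates="combine"):
    """Fold (key, value) pairs into a dict, resolving duplicate keys."""
    combine = POLICIES[duplicates] if isinstance(duplicates, str) else duplicates

    def step(out, pair):
        key, value = pair
        new_value = combine(out[key], value) if key in out else value
        # Build a new dict rather than mutating, as map:put returns a new map.
        # An existing key keeps the position of its first occurrence.
        return {**out, key: new_value}

    return reduce(step, pairs, {})
```

Because the fold is cumulative, three values X, Y, and Z for the same key are combined as F(F(X, Y), Z), and the combined entry stays at the position of the first duplicate.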
| Variables | |
|---|---|
let $week := {
0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch",
4: "Donnerstag", 5: "Freitag", 6: "Samstag"
} | |
| Expression: |
|
|---|---|
| Result: | {} (Returns an empty map.) |
| Expression: |
|
| Result: | {
0: "Sonntag",
1: "Montag",
2: "Dienstag",
3: "Mittwoch",
4: "Donnerstag",
5: "Freitag",
6: "Samstag"
}(The function |
| Expression: | map:of-pairs((
{ "key": 0, "value": "no" },
{ "key": 1, "value": "yes" }
)) |
| Result: | { 0: "no", 1: "yes" } (Returns a map with two entries.) |
| Expression: | map:of-pairs((
map:pairs($week),
{ "key": 7, "value": "Unbekannt" }
)) |
| Result: | { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch",
4: "Donnerstag", 5: "Freitag", 6: "Samstag", 7: "Unbekannt" }(The value of the existing map is unchanged; the returned map contains all the entries from |
| Expression: | map:of-pairs((
map:pairs($week),
{ "key": 6, "value": "Sonnabend" }
)) |
| Result: | { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch",
4: "Donnerstag", 5: "Freitag", 6: ("Samstag", "Sonnabend") }(The value of the existing map is unchanged; the returned map contains all the entries from |
| Expression: | map:of-pairs(
(map:pairs($week), { "key": 6, "value": "Sonnabend" }),
{ "duplicates": "use-last" }
) |
| Result: | { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch",
4: "Donnerstag", 5: "Freitag", 6: "Sonnabend" }(The value of the existing map is unchanged; the returned map contains all the entries from |
| Expression: | map:of-pairs(
(map:pairs($week), { "key": 6, "value": "Sonnabend" }),
{ "duplicates": concat(?, '|', ?) }
) |
| Result: | { 0: "Sonntag", 1: "Montag", 2: "Dienstag", 3: "Mittwoch",
4: "Donnerstag", 5: "Freitag", 6: "Samstag|Sonnabend" }(In the result map, the value for key |
| Expression: | map:of-pairs(
(
map:pairs({ "England": 2, "Germany": 1 }),
map:pairs({ "France": 2, "Germany": 2 }),
map:pairs({ "England": 0, "France": 1 })
),
{ "duplicates": op("+") }
) |
| Result: | { "England": 2, "Germany": 3, "France": 3 } (The values for each distinct key are summed.) |
| Expression: | map:of-pairs(
(map:pair("red", 0), map:pair("green", 1), map:pair("blue", 2))
)
=> map:keys() |
| Result: | "red", "green", "blue" (The keys are returned in the order supplied.) |
| Expression: | { "red": 0, "green": 1, "blue": 2 }
=> map:pairs()
=> sort(keys := fn { ?key })
=> map:of-pairs()
=> map:keys() |
| Result: | "blue", "green", "red" (Takes any map and produces a map with the same entries, but sorted by key.) |
| Expression: | map:of-pairs(
(map:pair("red", 0), map:pair("green", 1), map:pair("blue", 2))
)
=> map:put("yellow", -1)
=> map:keys() |
| Result: | "red", "green", "blue", "yellow" (New entries are added at the end.) |
The following expression takes an existing map and sorts its entries into key order:
map:of-pairs(map:pairs($M) => sort(keys := fn { ?key }))
The fn:element-to-map function converts a tree rooted at an XML element node to a corresponding tree of maps, in a form suitable for serialization as JSON. In effect it provides a mechanism for converting XML to JSON.
This section describes the mappings used by this function.
This mapping is designed with three objectives:
It should be possible to represent any XML element as a map suitable for JSON serialization.
The resulting JSON should be intuitive and easy to use.
The JSON should be consistent and stable: small variations in the input should not result in large variations in the output.
Achieving all three objectives requires design compromises. It also requires sacrificing some other desiderata. In consequence:
The conversion is not lossless (see 18.5.8 Lost XDM Information for details).
The conversion is not streamable.
The results are not necessarily compatible with those produced by other popular libraries.
The requirement for consistency and stability is particularly challenging. An element such as <name>John</name> maps naturally to the map { "name": "John" }; but adding an attribute (so it becomes <name role="first">John</name>) then requires an incompatible change in the JSON representation. The format could be made extensible by converting <name>John</name> to { "name": {"#content":"John"} } and <name role="first">John</name> to { "name": { "@role":"first", "#content":"John" } }, but this imposes unwanted complexity on the simplest cases. The solution adopted is threefold:
It is possible to analyze a corpus of XML documents to develop a conversion plan, which can then be applied consistently to individual input documents, whether or not these documents were present in the corpus. The conversion plan can be serialized and subsequently reused, so that it can be applied to input documents that might not have existed at the time the conversion plan was formulated.
Alternatively, the function can make use of schema information where available, so it considers not just the structure of an individual element instance, but the rules governing the element type.
It is possible to override the choices made by the system, and explicitly specify the format to be used for elements or attributes having a given name.
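The stability problem described above can be illustrated with a small model. This is a hedged Python sketch rather than the specified algorithm: `naive_value` and `uniform_value` are hypothetical helpers, and namespaces, typing, and child elements are ignored. The first helper chooses a representation from each element instance; the second always uses the attribute-capable shape, as a plan fixed in advance might.

```python
import xml.etree.ElementTree as ET

def uniform_value(elem):
    """Always use the map shape, whether or not attributes are present."""
    value = {"@" + name: text for name, text in elem.attrib.items()}
    value["#content"] = elem.text or ""
    return value

def naive_value(elem):
    """Choose per instance: a plain string unless attributes force a map."""
    if not elem.attrib:
        return elem.text or ""
    return uniform_value(elem)

plain = ET.fromstring("<name>John</name>")
with_attr = ET.fromstring('<name role="first">John</name>')

# The instance-driven choice changes type when an attribute is added...
print(type(naive_value(plain)).__name__, type(naive_value(with_attr)).__name__)
# ...whereas the uniform choice always yields a map with a "#content" entry.
print(uniform_value(plain), uniform_value(with_attr))
```

A consumer of the first form has to handle both a string and a map for the same element name; the second form avoids this, at the cost of extra nesting in the simplest case.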
The key challenge in mapping XML to JSON is in deciding how element content is to be represented. To illustrate the variety of mappings that are possible, the following table lists some examples of typical XML elements and their JSON equivalents:
| XML element | JSON equivalent |
|---|---|
<hr/> | "hr": "" |
<date-of-birth>2023-05-18</date-of-birth> | "date-of-birth": "2023-05-18" |
<box width="5" height="10"/> | "box": { "@width": "5", "@height": "10" } |
<label id="t41">Warning!</label> | "label": { "@id": "t41", "#content": "Warning!" } |
<box>
<width>5</width>
<height>10</height>
</box> | "box": {
"width": 5,
"height": 10
} |
<polygon>
<point x="0" y="0"/>
<point x="1" y="0"/>
<point x="1" y="1"/>
<point x="0" y="1"/>
</polygon> | "polygon": [
{ "x": 0, "y": 0 },
{ "x": 1, "y": 0 },
{ "x": 1, "y": 1 },
{ "x": 0, "y": 1 }
] |
This specification defines a number of named mappings, called layouts, and allows the layout for a particular element to be selected in a number of different ways:
The layout to be used for a specific element can be explicitly selected by supplying a conversion plan as input to the fn:element-to-map function.
It is possible to construct a conversion plan by analyzing a corpus of documents using the fn:element-to-map-plan function.
It is also possible to construct a conversion plan manually, or to modify the conversion plan produced by the fn:element-to-map-plan function before use.
In the absence of an explicit conversion plan, if the data has been schema-validated, the layout is inferred from the content model for the element type as defined in the schema.
When the data is untyped and no specific layout has been selected, a default layout is chosen based on the properties of the individual element instance.
The advantage of using schema information is that it gives a consistent representation for all elements of a particular type, even if they vary in content: for example if an element type allows optional attributes, the JSON representation will be consistent between those elements that have attributes and those without. In the absence of a schema, consistency can be achieved by supplying a conversion plan that applies uniformly to multiple documents.
The different layouts available are defined in the following sections. For each layout there is a table showing:
Layout name: the name to be used to select this layout in a conversion plan supplied to the fn:element-to-map function.
Usage: the situations for which this layout is designed.
Example input: an example of a typical element for which this layout is appropriate, shown as serialized XML.
Example output: the result of converting this example, shown as serialized JSON. The result is always shown as a singleton map, which is how it will appear when the layout is used for the top-level elements supplied in the $elements argument; when used to convert a descendant element, the corresponding key-value pair may appear as part of a larger map, depending on the layout chosen for its parent element.
Note:
The fn:element-to-map function produces a map as its result, but it is convenient to illustrate the form of the map by showing the effect of serializing the map as JSON.
Mapping rules: The rules for mapping the XML element to an XDM map representation.
Mapping for nilled elements: special rules that apply to an element having the attribute xsi:nil="true". These rules only apply if the element has been schema-validated.
Errors: situations where the layout cannot be used, and where attempting to use it will fail. For example, the empty layout cannot be used for an element that is not empty. In such a situation the recovery action is as follows, in order:
Attributes are dropped, and if this is sufficient to enable the layout to be used, then the element is converted without its attributes.
If the type of an element or attribute in the conversion plan is given as boolean or numeric, but the actual value of the element or attribute is not castable to xs:boolean or xs:numeric respectively, then the node is output ignoring the type property, that is, as an instance of xs:untypedAtomic.
If the conversion plan supplies a fallback layout (an entry with key "*"), then the fallback layout is used.
The fn:element-to-map function fails with a dynamic error.
The rules for selecting the layout for a particular element are given later, in 18.5.5 Selecting an element layout.
Note that it is possible to request any layout for any element. If an inappropriate layout is chosen for a particular element (for example, empty layout for an element that is not empty), then the rules for that layout specify what happens. It is possible to specify a fallback layout for use when the selected layout fails: this will typically be a layout such as xml or mixed that can handle any element.
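The interaction between a selected layout and a fallback layout can be sketched as follows. This Python model is illustrative only and makes several assumptions: the plan is simplified to a dictionary from element name to layout name (the real conversion plan carries more information), `"*"` is the fallback key as described above, and `text_layout` is a hypothetical stand-in for a layout that can handle any element. The type-related recovery step is not modelled.

```python
import xml.etree.ElementTree as ET

class LayoutFailure(Exception):
    """Raised when a layout cannot represent the element."""

def empty_layout(elem):
    # Attributes are simply discarded; real content makes the layout fail.
    if len(elem) or (elem.text or "").strip():
        raise LayoutFailure(elem.tag)
    return ""

def text_layout(elem):
    # Hypothetical catch-all: the concatenated text of the element.
    return "".join(elem.itertext())

LAYOUTS = {"empty": empty_layout, "text": text_layout}

def convert(elem, plan):
    """Try the planned layout, then the "*" fallback, then give up."""
    chosen = plan.get(elem.tag)
    if chosen is not None:
        try:
            return {elem.tag: LAYOUTS[chosen](elem)}
        except LayoutFailure:
            pass  # fall through to the fallback layout, if any
    fallback = plan.get("*")
    if fallback is not None:
        return {elem.tag: LAYOUTS[fallback](elem)}
    raise LayoutFailure(f"no usable layout for <{elem.tag}>")
```

For example, `convert(ET.fromstring("<hr>oops</hr>"), {"hr": "empty", "*": "text"})` falls back to the catch-all layout, while the same call without the `"*"` entry raises `LayoutFailure`.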
Note:
Acknowledgements for this categorization: see [Goessner]. Although Goessner's categories have been used, the detailed mappings vary from his proposal.
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that have no content and no attributes. |
| Example input | <hr/> |
| Example output | { "hr": "" } |
| Mapping rules | The content is represented by the zero-length |
| Mapping for nilled elements | The content is represented by the QName |
| Errors | Attributes are discarded, along with child comment nodes, processing instructions, and whitespace-only text nodes. If any other child nodes are present, this layout fails. |
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that have no content but may have attributes. |
| Example input | <hr class="ccc" id="zzz"/> |
| Example output | { "hr": { "@class": "ccc", "@id": "zzz" } } |
| Mapping rules | The content is represented by a map containing one entry for each attribute in the XML element; if there are no attributes, the content is represented as an empty map. The rules for attribute names are defined in 18.5.6 Element and Attribute Names, and the rules for attribute content in 18.5.7 Element and Attribute Content. |
| Mapping for nilled elements | An additional key-value pair |
| Errors | Child comment nodes, processing instructions, and whitespace-only text nodes are discarded. If any other child nodes are present, this layout fails. |
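As an informal illustration of the mapping in the table above, the following Python sketch converts an element with attributes and no content. The function name `attributes_only` is hypothetical, namespaces and schema typing are ignored, and the default `xml.etree.ElementTree` parser is relied on to drop comments and processing instructions.

```python
import xml.etree.ElementTree as ET

def attributes_only(elem):
    """Map a content-free element to {name: {"@attr": value, ...}}."""
    # Whitespace-only text is ignored; any other content makes the layout fail.
    if len(elem) or (elem.text or "").strip():
        raise ValueError(f"<{elem.tag}> has content, so this layout fails")
    return {elem.tag: {"@" + name: value for name, value in elem.attrib.items()}}

print(attributes_only(ET.fromstring('<hr class="ccc" id="zzz"/>')))
```

An element with no attributes yields an empty inner map, matching the mapping rule.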
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that have simple content and no attributes. |
| Example input | <date>2023-05-30</date> |
| Example output | { "date": "2023-05-30" } |
| Mapping rules | The element is atomized and the resulting atomized value is handled as described in 18.5.7 Element and Attribute Content. If atomization fails, the element is treated as if it were untyped. Note: If the element is untyped, the atomized value will always appear in the result as an instance of |
| Mapping for nilled elements | The content is represented by the value |
| Errors | Attributes are discarded, along with child comment nodes and processing instructions; whitespace is retained. If any child elements are present, this layout fails. |
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that have simple content and (optionally) attributes. |
| Example input | <price currency="USD">23.50</price> |
| Example output | { "price": { "@currency": "USD", "#content": 23.50 } } |
| Mapping rules | The element is represented by a map containing one entry for each of its attributes, plus an entry with key The rules for attribute names are defined in 18.5.6 Element and Attribute Names, and the rules for attribute content in 18.5.7 Element and Attribute Content. Note: If the element is untyped, the value of each attribute, and of If the element has been schema-validated, the types of the items in the atomized value are retained. |
| Mapping for nilled elements | The |
| Errors | Child comment nodes and processing instructions are discarded; whitespace is retained. If any child elements are present, this layout fails. |
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that act as wrappers for a list of child elements, all having the same element name; neither the element itself nor any of its children should have any attributes. The expected child element name may be present in the conversion plan. The names of the child elements are not retained in the output. |
| Example input (1) | <dates> <date>2023-03-20</date> <date>2023-04-12</date> <date>2023-05-30</date> </dates> |
| Example output (1) | { "dates": [ "2023-03-20", "2023-04-12", "2023-05-30" ] } |
| Example input (2) | <dates> <date><year>2023</year><month>03</month><day>20</day></date> <date><year>2023</year><month>04</month><day>12</day></date> <date><year>2023</year><month>05</month><day>30</day></date> </dates> |
| Example output (2) | { "dates": [
{ "year": "2023", "month": "03", "day": "20" },
{ "year": "2023", "month": "04", "day": "12" },
{ "year": "2023", "month": "05", "day": "30" }
] } |
| Mapping rules | The content is represented by an array, whose members correspond one-to-one with the children of the element. Each child element is converted to a map as if it were a top-level element: the resulting map contains a single key-value pair. The key part is discarded, and the value part is used as a member in the resulting array. If there are no children then the content is represented by an empty array. |
| Mapping for nilled elements | The array is replaced by the value |
| Errors | Attributes are discarded for both the element itself, and its children. Comments, processing instructions, and whitespace text nodes in the content are discarded. This layout fails if any child element is present with a name that differs from the expected child element name, or if there are non-whitespace text node children. |
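The mapping in the table above can be sketched in Python as follows. This is a simplified model under stated assumptions: `list_layout` and `simple_value` are hypothetical names, the expected child name is passed explicitly, and by default each child is treated as having simple text content. A different `child_value` function could be supplied for children with structure, as in the second example.

```python
import xml.etree.ElementTree as ET

def simple_value(elem):
    # Value part of a child with simple content: its text, untyped.
    return elem.text or ""

def list_layout(elem, child_name, child_value=simple_value):
    """Map a wrapper element to {name: [value, ...]}, dropping child names."""
    if (elem.text or "").strip():
        raise ValueError("non-whitespace text child")
    members = []
    for child in elem:
        # Attributes on the wrapper and on its children are simply not copied.
        if child.tag != child_name or (child.tail or "").strip():
            raise ValueError("unexpected child element or text")
        members.append(child_value(child))
    return {elem.tag: members}

doc = ET.fromstring(
    "<dates> <date>2023-03-20</date> <date>2023-04-12</date> "
    "<date>2023-05-30</date> </dates>"
)
print(list_layout(doc, "date"))
```

An element with no children produces an empty array, and a child with any other name causes the layout to fail.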
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that act as wrappers for a list of child elements, all having the same element name. The wrapper element may have attributes, but the children should not. The name of the child elements is retained in the output. |
| Example input (1) | <dates id="x"> <date>2023-03-20</date> <date>2023-04-12</date> <date>2023-05-30</date> </dates> |
| Example output (1) | { "dates": { "@id": "x", "date": ["2023-03-20", "2023-04-12", "2023-05-30"] } } |
| Example input (2) | <dates id="x"> <date><year>2023</year><month>03</month><day>20</day></date> <date><year>2023</year><month>04</month><day>12</day></date> <date><year>2023</year><month>05</month><day>30</day></date> </dates> |
| Example output (2) | { "dates": {
"@id": "x",
"date": [
{ "year": "2023", "month": "03", "day": "20" },
{ "year": "2023", "month": "04", "day": "12" },
{ "year": "2023", "month": "05", "day": "30" }
] } } |
| Mapping rules | The content is represented by a map containing one entry for each attribute in the XML element, plus a property named after the child elements (the content property), whose value is an array containing the results of formatting the content in the same way as the If there are no children and the element is untyped (which can occur when this layout is chosen explicitly via the options to |
| Mapping for nilled elements | The array-valued entry in the result is replaced by the entry |
| Errors | Any attributes on the element's children are discarded. Comments, processing instructions, and whitespace text nodes in the content are discarded. This layout fails if any child element is present with a name that differs from the expected child element name, or if there are non-whitespace text node children. |
| Layout name |
|
|---|---|
| Usage | Intended primarily for XML elements that contain multiple child elements, with different names, where the order of the child elements is not significant. Also used for elements whose content is a single element node child. The element may or may not have attributes. |
| Example input (1) | <employee id="x"> <date-of-birth>1984-03-20</date-of-birth> <location>Germany</location> <position>Janitor</position> </employee> |
| Example output (1) | { "employee": { "@id": "x",
"date-of-birth": "1984-03-20",
"location": "Germany",
"position": "Janitor"
}
} |
| Example input (2) | <employee id="x"> <date-of-birth>1984-03-20</date-of-birth> <location>Germany</location> <position>Janitor</position> <position>Gardener</position> </employee> |
| Example output (2) | { "employee": { "@id": "x",
"date-of-birth": "1984-03-20",
"location": "Germany",
"position": [ "Janitor", "Gardener" ]
}
} |
| Mapping rules | The content is represented by a map containing one entry for each attribute in the XML element, plus one entry for each child element, whose value is formatted according to the rules for that element. If two or more child elements have the same name, or names that are represented by the same string (taking into account the chosen The entry orderDM of the resulting map first contains entries derived from attributes (in unpredictable order), then entries derived from child elements, in order of first appearance. |
| Mapping for nilled elements | Alongside any attributes, the value includes the additional entry |
| Errors | Although this layout is intended primarily for elements whose children are unordered and uniquely named, it is also viable to use it in cases where elements can repeat, so long as order relative to other elements is not significant. Comments, processing instructions, and whitespace text nodes in the content are discarded. This layout fails if there are non-whitespace text node children. |
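The following Python sketch models the mapping in the table above for the common case where every child has simple text content. The name `record_layout` is hypothetical, and the treatment of repeated child names (collecting the values into an array) is inferred from the second example output rather than from a complete statement of the rule.

```python
import xml.etree.ElementTree as ET

def record_layout(elem):
    """Map an element to {name: {"@attr": ..., child-name: value, ...}}."""
    texts = [elem.text] + [child.tail for child in elem]
    if any((t or "").strip() for t in texts):
        raise ValueError("non-whitespace text child")
    # Attribute entries come first, then children in order of first appearance.
    value = {"@" + name: text for name, text in elem.attrib.items()}
    for child in elem:
        item = child.text or ""
        if child.tag not in value:
            value[child.tag] = item
        elif isinstance(value[child.tag], list):
            value[child.tag].append(item)
        else:
            # Second occurrence of a name: switch to an array of values.
            value[child.tag] = [value[child.tag], item]
    return {elem.tag: value}

doc = ET.fromstring(
    '<employee id="x"> <date-of-birth>1984-03-20</date-of-birth> '
    "<location>Germany</location> <position>Janitor</position> "
    "<position>Gardener</position> </employee>"
)
print(record_layout(doc))
```

Because the entry for a repeated name stays where the name first appeared, the relative order of differently named children is lost, which is why this layout suits content whose order is not significant.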
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that contain a sequence of element node children, whose order is significant. The element may or may not have attributes. |
| Example input | <section id="x"> <head>Introduction</head> <p>Lorem ipsum.</p> <p>Dolor sit amet.</p> </section> |
| Example output | { "section": [
{ "@id": "x" },
{ "head": "Introduction" },
{ "p": "Lorem ipsum." },
{ "p": "Dolor sit amet." }
] } |
| Mapping rules | The mapping rules are identical to the rules for the |
| Mapping for nilled elements | A nilled element is indicated by including an additional map |
| Errors | This layout fails if there are non-whitespace text node children. |
| Layout name |
|
|---|---|
| Usage | Intended for XML elements that contain mixed content (that is, elements that contain both child elements and child text nodes, intermingled). The element may or may not have attributes. |
| Example input | <para id="x">This is a <i>fine</i> mess!</para> |
| Example output | { "para": [
{ "@id": "x" },
"This is a ",
{ "i": "fine" },
" mess!"
] } |
| Mapping rules | The content is represented by an XDM array containing one entry for each attribute in the XML element, and one entry for each child node, in order. Each attribute node is represented within this array by a single-entry map: the rules for attribute names are defined in 18.5.6 Element and Attribute Names, and the rules for attribute content in 18.5.7 Element and Attribute Content. Child nodes are represented within the array as follows:
Whitespace text nodes are retained. |
| Mapping for nilled elements | A nilled element is indicated by including an additional map |
| Errors | All children are retained, including comments, processing instructions, and text nodes, whether or not they are whitespace-only. This layout never fails. |
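A rough Python model of the mapping in the table above is shown below. It is a sketch under assumptions, not the specified algorithm: `mixed_layout` and `child_map` are hypothetical names, comments and processing instructions are not modelled (the default `xml.etree.ElementTree` parser drops them), and a child element is assumed to become a single-entry map, using its text when it has neither attributes nor children and this same layout otherwise.

```python
import xml.etree.ElementTree as ET

def child_map(child):
    # Assumption: a child with only text becomes {name: text}.
    if len(child) == 0 and not child.attrib:
        return {child.tag: child.text or ""}
    return mixed_layout(child)

def mixed_layout(elem):
    """Map an element to {name: [attribute maps, text, child maps, ...]}."""
    # One single-entry map per attribute, ahead of the content.
    members = [{"@" + name: value} for name, value in elem.attrib.items()]
    if elem.text:
        members.append(elem.text)      # text before the first child
    for child in elem:
        members.append(child_map(child))
        if child.tail:
            members.append(child.tail)  # text following this child
    return {elem.tag: members}

doc = ET.fromstring('<para id="x">This is a <i>fine</i> mess!</para>')
print(mixed_layout(doc))
```

Text nodes are copied without trimming, so the string that follows the `i` element keeps its leading space.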