XPath and XQuery Functions and Operators 4.0

15 Parsing and serializing

These functions convert between the lexical representation of various file formats and their representation in the XPath and XQuery data model (XDM).

15.1 Functions on XML Data

These functions convert between the lexical representation of XML and the tree representation.

(The fn:serialize function also handles HTML and JSON output, but is included in this section for editorial convenience.)

Function	Meaning
`fn:parse-xml`	This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document.
`fn:parse-xml-fragment`	This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment.
`fn:serialize`	This function serializes the supplied input sequence `$input` as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string.
`fn:xsd-validator`	Given an XSD schema, delivers a function item that can be invoked to validate a document or element node against this schema.

15.1.4 XSD validation

Changes in 4.0

This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned. [Issue 2029 PR 2030 28 May 2025]

This section describes a process called XSD validation, which validates a supplied node against a supplied XSD schema. The validation process refers to the process defined in [XML Schema Part 1: Structures Second Edition] or [XSD 1.1 Part 1].

The validation process takes the following inputs:

A schema to be used for validation, called the effective schema.
A boolean indicating whether any xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes are to be taken into consideration.
A document, element, or attribute node to be validated; this is called the operand node.
A validation mode, which is one of strict, lax, or by-type.
Note:
XSLT also allows the value strip, but this does not invoke validation (instead, it invokes stripping of existing type annotations, and re-annotation of nodes as xs:untyped.)
If the validation mode is by-type, then a schema type to be used for validating the operand node. This may be any simple or complex type present in the effective schema: it must not be xs:untyped or xs:untypedAtomic.
Note:
An XQuery ValidateExpr allows the type to be specified as xs:untyped or xs:untypedAtomic, but this does not invoke validation (instead, it invokes stripping of existing type annotations and re-annotation of nodes as untyped.)

The output of the validation process comprises one or more of the following:

A boolean indicating whether the operand node was found to be valid.
If the operand node was found to be valid, a deep copy of the operand node in which each node is augmented with a type annotation corresponding to the type against which it was validated. The copy may also include expanded values for element and attribute defaults defined in the schema.
This creates a new node with its own identity and with no parent.
The base URI property of every node in the resulting XDM tree is the same as the base URI property of the corresponding node in the input tree.
If the operand node was not found to be valid, then optionally, a set of error diagnostics in implementation-defined format.

The operand node must be one of:

An element node
An attribute node
A well-formed document node, that is, a document node having among its children exactly one element node and zero or more comment and processing instruction nodes.

The term validation root is used to refer to the operand node if it is an element or attribute node, or to the single element child of the operand node when the operand node is a document node.
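The definition of the validation root can be illustrated with a minimal Python sketch. The `Node` class and the `validation_root` function are hypothetical names introduced only for this illustration; they are not part of this specification or of any processor API, and node kinds are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an XDM node; only kind and children matter here."""
    kind: str  # "document", "element", "attribute", "comment", "processing-instruction", "text"
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def validation_root(operand: Node) -> Node:
    """Return the validation root for an operand node, or raise ValueError."""
    # Element and attribute operands are their own validation root.
    if operand.kind in ("element", "attribute"):
        return operand
    if operand.kind == "document":
        elements = [c for c in operand.children if c.kind == "element"]
        allowed = {"element", "comment", "processing-instruction"}
        # Exactly one element child; any other children must be comments or PIs.
        if len(elements) != 1 or any(c.kind not in allowed for c in operand.children):
            raise ValueError("document node is not well-formed")
        return elements[0]
    raise ValueError(f"unsupported operand node kind: {operand.kind}")
```

In this sketch a document node with two element children, or with a text node child, is rejected before any validation takes place.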

Note that a schema is defined as a collection of schema components (for example, element and attribute declarations, complex and simple type definitions). In some cases the schema that is used is the set of schema components found in the in-scope schema definitions^XP, but this is not the only possibility.

The result of the validation process is defined by the following rules.

The invoking application determines whether the validity assessment process takes account of any xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes in the tree being validated. If it does so, then it should adhere to the following rules:
1. Any schema loaded using these attributes must be compatible^DM with the existing effective schema.
2. Any schema loaded using these attributes must not override or redefine any schema components in the effective schema.
3. Any schema components loaded using this mechanism must be used for this validity assessment only, and must not affect the outcome of any subsequent validity assessments of other documents.
  Note:
  A processor may choose to cache such schema components but the existence of such a cache should only affect performance, not the validation outcome.
A consequence of validating a document using schema components that are not in the static context is that nodes may be annotated with types that are not in the static context. But the rules for schema compatibility^DM mean that this is not a problem.
If the instance being validated contains any xml:id attributes, such attributes are validated against the type xs:ID, making the containing element eligible as a target for the id function. Uniqueness checking of elements and attributes typed as xs:ID, however, is carried out only if the operand node is a document node.
If the operand node is a document node:
1. The children of the document node must consist of exactly one element node and zero or more comment and processing instruction nodes, in any order.
2. The element node child is validated, as described below.
3. The validation rule “Validation Root Valid (ID/IDREF)” is applied to the single element node child of the document node. This means that validation will fail if there are non-unique ID values or dangling IDREF values in the document tree.
  Note:
  This rule is not applied when the operand node is an element or attribute node.
4. There is no check that the tree contains unparsed entities whose names match the values of nodes of type xs:ENTITY or xs:ENTITIES. This is because it is not possible (either in XSLT or XQuery) to construct a tree containing unparsed entities. It is possible to add unparsed entity declarations to the result document by referencing a suitable DOCTYPE during serialization.
5. All other children of the document node (comments and processing instructions) are copied unchanged, and the results become the children of a new document node, which is returned as the validation result.
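The effect of applying “Validation Root Valid (ID/IDREF)” to a document can be sketched as follows. This is a simplified illustration, not the XSD rule itself: the function name `id_idref_violations` is hypothetical, and the sketch assumes that the ID and IDREF values have already been collected from the validated tree as plain strings.

```python
from collections import Counter
from typing import Iterable, List


def id_idref_violations(ids: Iterable[str], idrefs: Iterable[str]) -> List[str]:
    """List the document-level problems: non-unique IDs and dangling IDREFs."""
    id_list = list(ids)
    counts = Counter(id_list)
    # An ID value that occurs more than once is not unique.
    problems = [f"duplicate ID: {value}" for value, n in counts.items() if n > 1]
    known = set(id_list)
    # An IDREF value with no matching ID is dangling.
    problems += [f"dangling IDREF: {ref}" for ref in idrefs if ref not in known]
    return problems
```

Under the rules above, a check of this kind is relevant only when the operand node is a document node; a non-empty result would correspond to a validation failure.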
If the operand node is an element node, then:
1. For specification purposes, because the XSD specifications require the input document to be expressed as an XML Information Set ([XML Infoset]), the operand node is first converted to an Infoset according to the “Infoset Mapping” rules defined in [XQuery and XPath Data Model (XDM) 4.0]. Note that this process discards any existing type annotations.
  Validity assessment is carried out on the root element information item of the resulting Infoset, using the supplied schema. The process of validation applies recursively to contained elements and attributes to the extent required by the supplied schema.
  Note:
  A practical implementation is unlikely to perform any physical conversion, but the process is defined this way in order to align with the XSD specification.
2. If the validation mode is by-type, then schema-validity assessment is carried out according to the rules defined in [XML Schema Part 1: Structures Second Edition] or [XSD 1.1 Part 1], section 3.3.4 "Element Declaration Validation Rules", “Validation Rule: Schema-Validity Assessment (Element)”, clauses 1.2 and 2, using the supplied schema type as the “processor-stipulated type definition” for validation.
3. If validation mode is strict, then strict validation is carried out as described in [XML Schema Part 1: Structures Second Edition] Part 1, section 5.2, “Assessing Schema-Validity”, item 2, or its counterpart in XSD 1.1. This means that the root element information item in the Infoset must either:
  1. have a name that matches a top-level element declaration in the effective schema, or
  2. have an xsi:type attribute whose value matches the name of a top-level type definition in the effective schema.
  If there is no such element declaration or type definition, the element is assessed as invalid.
4. If validation mode is lax, then schema-validity assessment is carried out in accordance with [XML Schema Part 1: Structures Second Edition] Part 1, section 5.2, “Assessing Schema-Validity”, item 3, or its counterpart in XSD 1.1.
  If validation mode is lax and the root element information item has neither a top-level element declaration nor an xsi:type attribute, XSD 1.0 and XSD 1.1 define the recursive checking of children and attributes as optional. This specification prescribes that this recursive checking is required.
  Note:
  This means, for example, that when an instance document is structured as having an envelope in one namespace wrapping a payload in a different namespace, and when schema definitions are available for the payload but not for the envelope, lax validation of the envelope may trigger validation of the payload.
5. If the operand node is an element node, the validation rules named “Validation Root Valid (ID/IDREF)” are not applied. This means that document-level constraints relating to uniqueness and referential integrity are not enforced.
6. There is no check that the document contains unparsed entities whose names match the values of nodes of type xs:ENTITY or xs:ENTITIES.
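How the strict and lax modes treat the root element can be summarized in a small Python sketch. The function `root_assessment` and its return labels are hypothetical, names are modeled as plain strings, and the by-type mode is deliberately left out; the sketch shows only the decision described in items 3 and 4, not the assessment itself.

```python
from typing import Optional, Set


def root_assessment(
    mode: str,
    element_name: str,
    xsi_type: Optional[str],
    top_level_elements: Set[str],
    top_level_types: Set[str],
) -> str:
    """Classify how the root element information item is treated."""
    if mode not in ("strict", "lax"):
        raise ValueError("this sketch covers only the strict and lax modes")
    # A declaration is found via the element name or via xsi:type.
    declared = element_name in top_level_elements or (
        xsi_type is not None and xsi_type in top_level_types
    )
    if declared:
        return "assess"  # validate against the declaration or type found
    # strict: no declaration means invalid; lax: children must still be checked.
    return "invalid" if mode == "strict" else "check-children"
```

The "check-children" label corresponds to the requirement that, in lax mode, recursive checking of children and attributes is carried out even when nothing is known about the root element.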
If the operand node is an attribute node, in particular when it is a parentless attribute node, then validation cannot be defined directly in terms of the XSD-defined validation process. Instead, conceptually, a copy of the attribute is first added to an element node that is created for the purpose, and namespace fixup is performed on this element node to ensure that it has an in-scope namespace binding for the prefix and namespace of the attribute name. The name of this element is of no consequence, but it must be the same as the name of a synthesized element declaration of the form:
```
<xs:element name="E">
  <xs:complexType>
    <xs:sequence/>
    <xs:attribute ref="A"/>
  </xs:complexType>
</xs:element>
```
where A is the name of the attribute being validated.
This synthetic element is then validated using the procedure given above for validating elements, and if it is found to be valid, a copy of the validated attribute is made, retaining its type annotation, but detaching it from the containing element (and thus, from any in-scope namespace bindings).
The XDM data model does not permit an attribute node with no parent to have a typed value that includes a namespace-qualified name, that is, a value whose type is derived from xs:QName or xs:NOTATION. This restriction is imposed because these types rely on the in-scope namespaces of a containing element to resolve namespace prefixes. Therefore, a parentless attribute is considered to be invalid against such a type.
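The synthesized element declaration used for attribute validation can be generated mechanically. The following Python sketch uses the standard `xml.etree.ElementTree` module; the function name `synthesized_declaration` is hypothetical, and the sketch assumes an attribute name that needs no namespace prefix (a namespaced attribute would also require a suitable namespace declaration, which is omitted here).

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def synthesized_declaration(element_name: str, attribute_ref: str) -> str:
    """Build the wrapper element declaration for validating a single attribute."""
    ET.register_namespace("xs", XS)
    element = ET.Element(f"{{{XS}}}element", {"name": element_name})
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    # Empty content model: the wrapper element has no children.
    ET.SubElement(complex_type, f"{{{XS}}}sequence")
    # The only permitted attribute is the one being validated.
    ET.SubElement(complex_type, f"{{{XS}}}attribute", {"ref": attribute_ref})
    return ET.tostring(element, encoding="unicode")
```

Calling `synthesized_declaration("E", "A")` yields a declaration with the same structure as the one shown above, serialized with an `xs` namespace declaration on the outermost element.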
The outcome of the validation expression depends on the validity property of the root element information item in the PSVI that results from the XSD validation process.
1. If the validity property of the root element information item is valid, or if validation mode is lax and the validity property of the root element information item is notKnown, the PSVI is converted back into a data model instance as described in [XQuery and XPath Data Model (XDM) 4.0] Section 3.3, “Construction from a PSVI”. The resulting node (a new node of the same kind as the operand node) is returned as the result of the validate expression.
  Otherwise, the operand node is deemed invalid.

Note:

During conversion of the PSVI into an XDM instance after validation, any element information items whose validity property is notKnown are converted into element nodes with type annotation xs:anyType, and any attribute information items whose validity property is notKnown are converted into attribute nodes with type annotation xs:untypedAtomic, as described in Section 6.5.3.1.1 Element and Attribute Node Types^DM.
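The outcome rule and the treatment of notKnown items can be sketched in Python as follows. The function names `is_accepted` and `annotation_for_not_known` are hypothetical, and validity values and modes are modeled as the plain strings used in the text above.

```python
def is_accepted(validity: str, mode: str) -> bool:
    """True if the PSVI is converted back into a data model instance."""
    # "valid" is always accepted; "notKnown" is accepted only in lax mode.
    return validity == "valid" or (mode == "lax" and validity == "notKnown")


def annotation_for_not_known(node_kind: str) -> str:
    """Type annotation given to a node whose validity property is notKnown."""
    if node_kind == "element":
        return "xs:anyType"
    if node_kind == "attribute":
        return "xs:untypedAtomic"
    raise ValueError(f"no type annotation rule for node kind: {node_kind}")
```

For example, `is_accepted("notKnown", "strict")` is false, so under strict validation the operand node would be deemed invalid.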

H Changes since 3.1 (Non-Normative)

H.1 Summary of Changes

Use the arrows to browse significant changes since the 3.1 version of this specification.
See 1 Introduction
Sections with significant changes are marked Δ in the table of contents. New functions introduced in this version are marked ➕ in the table of contents.
See 1 Introduction
PR 1620 1886
Options are added to customize the form of the output.
See 2.2.6 fn:path
PR 1547 1551
New in 4.0
See 2.2.8 fn:siblings
PR 629 803
New in 4.0
See 3.2.2 fn:message
PR 1260 1275
A third argument has been added, providing control over the rounding mode.
See 4.4.4 fn:round
New in 4.0
See 4.4.7 fn:is-NaN
PR 1049 1151
Decimal format parameters can now be supplied directly as a map in the third argument, rather than referencing a format defined in the static context.
See 4.7.2 fn:format-number
PR 1205 1230
New in 4.0
See 4.8.2 math:e
See 4.8.16 math:sinh
See 4.8.17 math:cosh
See 4.8.18 math:tanh
The 3.1 specification suggested that every value in the result range should have the same chance of being chosen. This has been corrected to say that the distribution should be arithmetically uniform (because there are as many xs:double values between 0.01 and 0.1 as there are between 0.1 and 1.0).
See 4.9.2 fn:random-number-generator
PR 261 306 993
New in 4.0
See 5.4.1 fn:char
New in 4.0
See 5.4.2 fn:characters
PR 937 995 1190
New in 4.0
See 5.4.13 fn:hash
New in 4.0
See 7.6.2 fn:parse-uri
PR 1423 1413
New in 4.0
See 7.6.3 fn:build-uri
New in 4.0
See 11.2.6 fn:in-scope-namespaces
Reformulated in 4.0 in terms of the new fn:in-scope-namespaces function; the semantics are unchanged.
See 11.2.7 fn:in-scope-prefixes
Reformulated in 4.0 in terms of the new fn:in-scope-namespaces function; the semantics are unchanged.
See 11.2.8 fn:namespace-uri-for-prefix
New in 4.0
See 14.1.9 fn:replicate
New in 4.0
See 14.1.12 fn:slice
New in 4.0. The function is identical to the internal op:same-key function in 3.1.
See 14.2.1 fn:atomic-equal
PR 1120 1150
A callback function can be supplied for comparing individual items.
See 14.2.2 fn:deep-equal
Changed in 4.0 to use transitive equality comparisons for numeric values.
See 14.2.4 fn:distinct-values
PR 614 987
New in 4.0
See 14.2.5 fn:duplicate-values
New in 4.0. Originally proposed under the name fn:uniform.
See 14.4.6 fn:all-equal
New in 4.0. Originally proposed under the name fn:unique.
See 14.4.7 fn:all-different
PR 1117 1279
The $options parameter has been added.
See 14.6.6 fn:unparsed-text-lines
New in 4.0
See 15.1.5 fn:xsd-validator
PR 259 956
A new function is available for processing input data in HTML format.
See 15.2 Functions on HTML Data
New in 4.0
See 15.2.2 fn:parse-html
PR 975 1058 1246
An option is provided to control how JSON numbers should be formatted.
See 15.3.4 fn:parse-json
Additional options are available, as defined by fn:parse-json.
See 15.3.5 fn:json-doc
PR 533 719 834 1066
New in 4.0
See 15.4.4 fn:csv-to-arrays
See 15.4.7 fn:parse-csv
PR 533 719 834 1066 1605
New in 4.0
See 15.4.9 fn:csv-to-xml
PR 791 1256 1282 1405
New in 4.0
See 15.5.1 fn:invisible-xml
New in 4.0
See 17.2.3 fn:every
New in 4.0
See 17.2.9 fn:highest
New in 4.0
See 17.2.10 fn:index-where
New in 4.0
See 17.2.11 fn:lowest
New in 4.0
See 17.2.15 fn:scan-right
New in 4.0
See 17.2.16 fn:some
PR 521 761
New in 4.0
See 17.2.22 fn:transitive-closure
New in 4.0
See 18.4.6 map:filter
New in 4.0
See 18.4.10 map:items
PR 478 515
New in 4.0
See 18.4.12 map:keys-where
New in 4.0
See 18.4.14 map:of-pairs
New in 4.0
See 18.4.15 map:pair
New in 4.0
See 18.4.16 map:pairs
PR 1575 1906
A new function fn:element-to-map is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input.
See 18.5 Converting elements to maps
New in 4.0
See 19.2.3 array:empty
PR 968 1295
New in 4.0
See 19.2.13 array:index-of
PR 476 1087
New in 4.0
See 19.2.16 array:items
PR 360 476
New in 4.0
See 19.2.18 array:members
See 19.2.19 array:of-members
New in 4.0
See 19.2.24 array:slice
New in 4.0
See 19.2.27 array:split
Supplying an empty sequence as the value of an optional argument is equivalent to omitting the argument.
See 19.2.28 array:subarray
PR 533 719 834
New functions are available for processing input data in CSV (comma separated values) format.
See 15.4 Functions on CSV Data
PR 289 1901
A third argument is added, allowing user control of how absent keys should be handled.
See 18.4.9 map:get
A third argument is added, allowing user control of how index-out-of-bounds conditions should be handled.
See 19.2.11 array:get
A new collation URI is defined for Unicode case-insensitive comparison and ordering.
See 5.3.5 The Unicode case-insensitive collation
The specification now describes how an initial BOM should be handled.
See 14.6.5 fn:unparsed-text
PR 1727 1740
It is no longer guaranteed that the new key replaces the existing key.
See 18.4.17 map:put
This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned.
See 15.1.4 XSD validation
PR 173
New in 4.0
See 17.3.4 fn:op
PR 203
New in 4.0
See 18.4.1 map:build
PR 207
New in 4.0
See 11.1.2 fn:parse-QName
See 11.2.5 fn:expanded-QName
PR 222
New in 4.0
See 14.2.7 fn:starts-with-subsequence
See 14.2.8 fn:ends-with-subsequence
See 14.2.9 fn:contains-subsequence
PR 250
New in 4.0
See 14.1.3 fn:foot
See 14.1.15 fn:trunk
See 19.2.2 array:build
See 19.2.8 array:foot
See 19.2.30 array:trunk
PR 258
New in 4.0
See 19.2.14 array:index-where
PR 313
The second argument can now be a sequence of integers.
See 14.1.8 fn:remove
PR 314
New in 4.0
See 18.4.4 map:entries
PR 326
Higher-order functions are no longer an optional feature.
See 1.2 Conformance
PR 419
New in 4.0
See 14.1.7 fn:items-at
PR 434
New in 4.0
See 4.5.2 fn:parse-integer
The function has been extended to allow output in a radix other than 10, for example in hexadecimal.
See 4.6.1 fn:format-integer
PR 482
Deleted an inaccurate statement concerning the behavior of NaN.
See 4.3 Comparison operators on numeric values
PR 507
New in 4.0
See 17.2.13 fn:partition
PR 546
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
See 5.2.1 fn:codepoints-to-string
It is no longer automatically an error if the resource (after decoding) contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
See 14.6.5 fn:unparsed-text
The rules regarding use of non-XML characters in JSON texts have been relaxed.
See 15.3.3 JSON character repertoire
See 15.3.4 fn:parse-json
It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.
See 15.3.5 fn:json-doc
PR 631
New in 4.0
See 7.3 fn:decode-from-uri
PR 662
Constructor functions now have a zero-arity form; the first argument defaults to the context item.
See 21 Constructor functions
PR 680
The case-insensitive collation is now defined normatively within this specification, rather than by reference to the HTML "living specification", which is subject to change. The collation can now be used for ordering comparisons as well as equality comparisons.
See 5.3.6 The HTML ASCII Case-Insensitive Collation
PR 702
The function can now take any number of arguments (previously it had to be two or more), and the arguments can be sequences of strings rather than single strings.
See 5.4.4 fn:concat
PR 710
Changes the function to return a sequence of key-value pairs rather than a map.
See 17.1.5 fn:function-annotations
PR 727
It has been clarified that loading a module has no effect on the static or dynamic context of the caller.
See 17.3.2 fn:load-xquery-module
PR 795
New in 4.0
See 17.2.19 fn:sort-with
PR 828
The $predicate callback function accepts an optional position argument.
See 17.2.4 fn:filter
The $action callback function accepts an optional position argument.
See 17.2.6 fn:fold-right
See 17.2.7 fn:for-each
See 17.2.8 fn:for-each-pair
The $predicate callback function now accepts an optional position argument.
See 19.2.4 array:filter
The $action callback function now accepts an optional position argument.
See 19.2.7 array:fold-right
See 19.2.9 array:for-each
See 19.2.10 array:for-each-pair
PR 881
The way that fn:min and fn:max compare numeric values of different types has changed. The most noticeable effect is that when these functions are applied to a sequence of xs:integer or xs:decimal values, the result is an xs:integer or xs:decimal, rather than the result of converting this to an xs:double.
See 14.4.3 fn:max
See 14.4.4 fn:min
PR 901
All three arguments are now optional, and each argument can be set to an empty sequence. Previously if $description was supplied, it could not be empty.
See 3.1.1 fn:error
The $label argument can now be set to an empty sequence. Previously if $label was supplied, it could not be empty.
See 3.2.1 fn:trace
The third argument can now be supplied as an empty sequence.
See 5.4.6 fn:substring
The second argument can now be an empty sequence.
See 6.3.3 fn:tokenize
The optional second argument can now be supplied as an empty sequence.
See 7.1 fn:resolve-uri
The 3rd, 4th, and 5th arguments are now optional; previously the function required either 2 or 5 arguments.
See 10.8.1 fn:format-dateTime
See 10.8.2 fn:format-date
See 10.8.3 fn:format-time
The optional third argument can now be supplied as an empty sequence.
See 14.1.13 fn:subsequence
PR 905
The rule that multiple calls on fn:doc supplying the same absolute URI must return the same document node has been clarified; in particular the rule does not apply if the dynamic context for the two calls requires different processing of the documents (such as schema validation or whitespace stripping).
See 14.6.1 fn:doc
PR 909
The function has been expanded in scope to handle comparison of values other than strings.
See 14.2.3 fn:compare
PR 924
Rules have been added clarifying that users should not be allowed to change the schema for the fn namespace.
See D Schemas
PR 925
The decimal format name can now be supplied as a value of type xs:QName, as an alternative to supplying a lexical QName as an instance of xs:string.
See 4.7.2 fn:format-number
PR 932
The specification now prescribes a minimum precision and range for durations.
See 9.1.2 Limits and precision
PR 933
When comments and processing instructions are ignored, any text nodes either side of the comment or processing instruction are now merged prior to comparison.
See 14.2.2 fn:deep-equal
PR 940
New in 4.0
See 17.2.20 fn:subsequence-where
PR 953
Constructor functions for named record types have been introduced.
See 21.6 Constructor functions for named record types
PR 962
New in 4.0
See 17.2.2 fn:do-until
See 17.2.23 fn:while-do
PR 969
New in 4.0
See 18.4.3 map:empty
PR 984
New in 4.0
See 9.4.1 fn:seconds
PR 987
The order of results is now prescribed; it was previously implementation-dependent.
See 14.2.4 fn:distinct-values
PR 988
New in 4.0
See 15.3.8 fn:pin
See 15.3.9 fn:label
PR 1022
Regular expressions can include comments (starting and ending with #) if the c flag is set.
See 6.1 Regular expression syntax
See 6.2 Flags
PR 1028
An option is provided to control how the JSON null value should be handled.
See 15.3.4 fn:parse-json
PR 1032
New in 4.0
See 14.1.17 fn:void
PR 1046
New in 4.0
See 17.2.21 fn:take-while
PR 1059
Use of an option keyword that is not defined in the specification and is not known to the implementation now results in a dynamic error; previously it was ignored.
See 1.7 Options
PR 1068
New in 4.0
See 5.4.3 fn:graphemes
PR 1072
The return type is now specified more precisely.
See 17.3.2 fn:load-xquery-module
PR 1090
When casting from a string to a duration or time or dateTime, it is now specified that when there are more digits in the fractional seconds than the implementation is able to retain, excess digits are truncated. Rounding upwards (which could affect the number of minutes or hours in the value) is not permitted.
See 22.2 Casting from xs:string and xs:untypedAtomic
PR 1093
New in 4.0
See 5.3.9 fn:collation
PR 1117
The $options parameter has been added.
See 14.6.5 fn:unparsed-text
See 14.6.7 fn:unparsed-text-available
PR 1182
The $predicate callback function may return an empty sequence (meaning false).
See 17.2.2 fn:do-until
See 17.2.3 fn:every
See 17.2.4 fn:filter
See 17.2.10 fn:index-where
See 17.2.16 fn:some
See 17.2.21 fn:take-while
See 17.2.23 fn:while-do
See 18.4.6 map:filter
See 18.4.12 map:keys-where
See 19.2.4 array:filter
See 19.2.14 array:index-where
PR 1191
New in 4.0
See 2.3.1 fn:distinct-ordered-nodes
The $options parameter has been added, absorbing the $collation parameter.
See 14.2.2 fn:deep-equal
PR 1250
For selected properties including percent and exponent-separator, it is now possible to specify a single-character marker to be used in the picture string, together with a multi-character rendition to be used in the formatted output.
See 4.7.2 fn:format-number
PR 1257
The $options parameter has been added.
See 15.1.1 fn:parse-xml
See 15.1.2 fn:parse-xml-fragment
PR 1262
New in 4.0
See 5.3.10 fn:collation-available
PR 1265
The constraints on the result of the function have been relaxed.
See 2.1.6 fn:document-uri
PR 1280
As a result of changes to the coercion rules, the number of supplied arguments can be greater than the number required: extra arguments are ignored.
See 17.2.1 fn:apply
PR 1288
Additional error conditions have been defined.
See 15.1.1 fn:parse-xml
PR 1296
New in 4.0
See 17.2.14 fn:scan-left
PR 1333
A new option is provided to allow the content of the loaded module to be supplied as a string.
See 17.3.2 fn:load-xquery-module
PR 1353
An option has been added to suppress the escaping of the solidus (forwards slash) character.
See 15.3.7 fn:xml-to-json
PR 1358
New in 4.0
See 10.3.2 fn:unix-dateTime
PR 1361
The term atomic value has been replaced by atomic item.
See 1.9 Terminology
PR 1393
Changes the function to return a sequence of key-value pairs rather than a map.
See 17.1.5 fn:function-annotations
PR 1409
This section now uses the term primitive type strictly to refer to the 20 atomic types that are not derived by restriction from another atomic type: that is, the 19 primitive atomic types defined in XSD, plus xs:untypedAtomic. The three types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration, which have custom casting rules but are not strictly-speaking primitive, are now handled in other subsections.
See 22.1 Casting from primitive types to primitive types
The rules for conversion of dates and times to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since these deliver exactly the same result as the XPath 3.1 rules.
See 22.1.2.2 Casting date/time values to xs:string
The rules for conversion of durations to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since the XSD 1.1 rules deliver exactly the same result as the XPath 3.1 rules.
See 22.1.2.3 Casting xs:duration values to xs:string
PR 1455
Numbers now retain their original lexical form, except for any changes needed to satisfy JSON syntax rules (for example, stripping leading zero digits).
See 15.3.7 fn:xml-to-json
PR 1473
New in 4.0
See 14.1.5 fn:identity
PR 1481
The function has been extended to handle other Gregorian types such as xs:gYearMonth.
See 10.5.1 fn:year-from-dateTime
See 10.5.2 fn:month-from-dateTime
The function has been extended to handle other Gregorian types such as xs:gMonthDay.
See 10.5.3 fn:day-from-dateTime
The function has been extended to handle other types including xs:time.
See 10.5.4 fn:hours-from-dateTime
See 10.5.5 fn:minutes-from-dateTime
The function has been extended to handle other types such as xs:gYearMonth.
See 10.5.7 fn:timezone-from-dateTime
PR 1504
New in 4.0
See 14.1.11 fn:sequence-join
Optional $separator added.
See 19.2.17 array:join
PR 1523
New functions are provided to obtain information about built-in types and types defined in an imported schema.
See 20 Processing types
New in 4.0
See 20.1.2 fn:schema-type
See 20.1.4 fn:atomic-type-annotation
See 20.1.5 fn:node-type-annotation
PR 1545
New in 4.0
See 10.6.4 fn:civil-timezone
PR 1565
The default for the escape option has been changed to false. The 3.1 specification gave the default value as true, but this appears to have been an error, since it was inconsistent with examples given in the specification and with tests in the test suite.
See 15.3.4 fn:parse-json
PR 1570
New in 4.0
See 20.1.3 fn:type-of
PR 1587
New in 4.0
See 14.6.8 fn:unparsed-binary
PR 1611
The spec has been corrected to note that the function depends on the implicit timezone.
See 14.2.3 fn:compare
PR 1671
New in 4.0.
See 4.4.6 fn:divide-decimals
PR 1703
The order of entries in maps is retained.
See 15.3.4 fn:parse-json
Ordered maps are introduced.
See 18.1 Ordering of Maps
Enhanced to allow for ordered maps.
See 18.4.6 map:filter
See 18.4.7 map:find
See 18.4.8 map:for-each
See 18.4.17 map:put
See 18.4.18 map:remove
PR 1711
It is explicitly stated that the limits for $precision are implementation-defined.
See 4.4.4 fn:round
See 4.4.5 fn:round-half-to-even
PR 1727
For consistency with the new functions map:build and map:of-pairs, the handling of duplicates may now be controlled by supplying a user-defined callback function as an alternative to the fixed values for the earlier duplicates option.
See 18.4.13 map:merge
PR 1734
In 3.1, given a mixed input sequence such as (1, 3, 4.2e0), the specification was unclear whether it was permitted to add the first two integer items using integer arithmetic, rather than converting all items to doubles before performing any arithmetic. The 4.0 specification is clear that this is permitted; but since the items can be reordered before being added, this is not required.
See 14.4.2 fn:avg
See 14.4.5 fn:sum
PR 1825
New in 4.0
See 17.2.12 fn:partial-apply
PR 1856
Word boundaries can be matched. Lookahead and lookbehind assertions are supported. Assertions (including ^ and $) can no longer be followed by a quantifier.
See 6.1 Regular expression syntax
It is now permitted for the regular expression to match a zero-length string.
See 6.3.2 fn:replace
See 6.3.3 fn:tokenize
The output of the function is extended to allow the representation of captured groups found within lookahead assertions.
See 6.3.4 fn:analyze-string
It is now permitted for the regular expression to match a zero-length string.
See 6.3.4 fn:analyze-string
PR 1879
Additional options to control DTD and XInclude processing have been added.
See 15.1.1 fn:parse-xml
PR 1897
The $replacement argument can now be a function that computes the replacement strings.
See 6.3.2 fn:replace
PR 1906
New in 4.0
See 18.5.10 fn:element-to-map-plan
New in 4.0.
See 18.5.11 fn:element-to-map
PR 1910
An $options parameter is added. Note that the rules for the $options parameter control aspects of processing that were implementation-defined in earlier versions of this specification. An implementation may provide configuration options designed to retain backwards-compatible behavior when no explicit options are supplied.
See 14.6.1 fn:doc
See 14.6.2 fn:doc-available
PR 1991
Named record types used in the signatures of built-in functions are now available as standard in the static context.
See C Built-in named record types
PR 2001
New in 4.0.
See 17.2.18 fn:sort-by
See 19.2.26 array:sort-by
PR 2030
This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned.
See 15.1.4 XSD validation

W3C Editor's Draft 23 February 2026
