
This draft contains only sections that have differences from the version that it modified.

W3C

XPath and XQuery Functions and Operators 4.0

W3C Editor's Draft 23 February 2026

This version:
https://qt4cg.org/specifications/xpath-functions-40/
Latest version of XPath and XQuery Functions and Operators 4.0:
https://qt4cg.org/specifications/xpath-functions-40/
Most recent Recommendation of XPath and XQuery Functions and Operators:
https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

This document is also available in these non-normative formats: Specification in XML format and XML function catalog.


Abstract

This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 4.0]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 4.0]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.

A summary of changes since version 3.1 is provided at H Changes since 3.1.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.

The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.

The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


2 Processing sequences

A sequence is an ordered collection of zero or more items. An item is a node, an atomic item, or a function, such as a map or an array. The terms sequence and item are defined formally in [XQuery 4.0: An XML Query Language] and [XML Path Language (XPath) 4.0].

2.6 Incremental Evaluation

Some operations, most notably the generate function, may deliver sequences of unbounded length. Other examples include the do-while and until-do functions. In addition, many functions and operators that take an unbounded sequence as input may also deliver an unbounded sequence as their result: examples include the functions filter, for-each, index-of, index-where, insert-before, remove, and operations such as A!B and for $x in X return R.

[Definition] A function or operation is said to be incrementally evaluated with respect to one of its arguments or operands if the implementation is capable of processing an unbounded value for that argument or operand without exhausting available memory.

Note:

Other terms used for incremental evaluation include lazy evaluation and pipelining. However, these terms have different meanings in other contexts. Incremental evaluation of sequences (processing the initial items of a sequence without first materializing the entire sequence) is one example of a lazy evaluation strategy, while the term pipelining can also be used to refer to a sequence of processing steps in which each step runs to completion before the next one starts.

Some functions and operators may accept more than one unbounded sequence among their inputs. Examples are the comma operator (which performs sequence concatenation), and the functions for-each-pair and insert-before.

Incremental evaluation applies not only to sequences, but also to arrays. Although there is no equivalent to generate that directly delivers an unbounded array, the functions array:build and array:of-members produce an unbounded array if supplied with an unbounded sequence as their input.
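The notion of incremental evaluation can be illustrated outside XPath. The following is a minimal sketch in Python, not a description of how any XPath processor works: Python generators stand in for incrementally evaluated operations, and the helper names for_each, keep, and first_n are hypothetical. Each stage pulls one item at a time from its input, so the unbounded input is never materialized.

```python
from itertools import count, islice


def for_each(items, action):
    """Model of an incrementally evaluated for-each: one item in, one item out."""
    for item in items:
        yield action(item)


def keep(items, predicate):
    """Model of an incrementally evaluated filter."""
    for item in items:
        if predicate(item):
            yield item


def first_n(items, n):
    """Read only the first n items; the rest of the input is never requested."""
    return list(islice(items, n))


unbounded = count(1)  # 1, 2, 3, ... with no end
squares_of_evens = for_each(keep(unbounded, lambda x: x % 2 == 0), lambda x: x * x)
result = first_n(squares_of_evens, 5)
print(result)  # [4, 16, 36, 64, 100]
```

Replacing first_n with an operation that needs the whole input, such as sum, would never terminate here, which mirrors the distinction drawn in this section.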

Incremental operations can be divided into a number of categories:

  • For a small number of functions and operators, this specification mandates that the implementation must be incrementally evaluated. A conformant implementation of these functions and operators must not be constrained by available memory (though it may have other limits, such as a limit on the size of integer used to represent the value of position).

  • For a larger class of operations, the processor may be able to use incremental evaluation, but this is not mandated. For example, the specification does not mandate that the partition function must be pipelined. Similarly, it does not mandate that a general comparison (such as the = operator) must be incrementally evaluated with respect to either or both of the operands.

  • Some operations may be amenable to incremental evaluation depending on the conditions. For example, the union and intersect operators can be incrementally evaluated provided that the processor is able to determine that the operand sequences will always be in document order (this will be the case, for example, if the operands are path expressions). This analysis is outside the scope of the specification: in consequence it is implementation-dependent whether, and under what circumstances, such operations are incrementally evaluated.

TODO: define which functions and operators are guaranteed to be incrementally evaluated.

2.6.1 Unbounded sequences and optimization

As declarative expression-based languages, XPath, XSLT, and XQuery are designed to have referential transparency. This essentially means that any subexpression can be replaced by its value, or by a different expression having the same value. This property enables powerful optimizations through expression rewriting, including, for example:

  • Constant folding: that is, early compile-time evaluation of expressions whose value will be the same on every execution of a program;

  • Loop lifting: moving an expression out of a loop, and binding its result to a variable, to avoid evaluating the expression repeatedly;

  • Common subexpression elimination: recognising where the same expression appears more than once in the code, so that it need only be evaluated once.

  • Memoization: caching the results of a function call so that subsequent attempts to evaluate the same function with the same arguments do not require the value to be recomputed.

All these optimizations potentially become problematic when expressions evaluate to unbounded sequences. The fact that the value of such an expression cannot be held in finite memory effectively undermines the principle that the expression can always be replaced by its value.

This is not a new problem, because previous versions of XQuery and XSLT also allowed constructs (recursive functions, for example) that evaluated to unbounded sequences, and applications using such constructs might succeed on some implementations and fail on others. However, the explicit use of generators (via the generate function) makes it necessary to address the issues raised.

The specification does not attempt a complete solution to the problem. Rather, it takes a pragmatic approach, offering recommendations of good practice both to implementers and users.

Recommendations for implementers:

  • When implementing constant folding or memoization, abandon the attempted optimization if the value becomes too large.

  • When variables are introduced to avoid repeated evaluation of an expression, ensure that the variable itself is lazily evaluated so that evaluation does not fail when the expression is non-terminating.
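The first of these recommendations might be realized along the following lines. This is a sketch in Python under stated assumptions, not a technique mandated by this specification: the decorator name bounded_memo and its limit parameter are hypothetical, and Python iterators stand in for lazily evaluated sequences. The memoizer reads at most limit + 1 items; if the value turns out to be larger than the limit, it abandons caching and hands back the items already read followed by the lazy remainder.

```python
from itertools import chain, count, islice


def bounded_memo(limit):
    """Hypothetical memoizer: cache a result only if it has at most `limit` items."""
    def decorate(func):
        cache = {}

        def wrapper(*args):
            if args in cache:
                return iter(cache[args])      # replay the cached value
            produced = iter(func(*args))
            prefix = list(islice(produced, limit + 1))
            if len(prefix) <= limit:          # small enough: safe to cache
                cache[args] = prefix
                return iter(prefix)
            # Too large (possibly unbounded): abandon the optimization and
            # return the items already read followed by the lazy remainder.
            return chain(prefix, produced)

        wrapper.cache = cache
        return wrapper
    return decorate


@bounded_memo(limit=10)
def evens_below(n):
    return range(0, n, 2)       # bounded result


@bounded_memo(limit=10)
def evens_from(start):
    return count(start, 2)      # unbounded result


small = list(evens_below(10))
large_prefix = list(islice(evens_from(0), 12))
```

After these calls the result of evens_below(10) has been cached, while nothing has been cached for evens_from(0), and the caller can still consume as much of the unbounded result as it needs.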

Recommendations for users:

  • Avoid binding expressions to variables if the value might be unbounded.

  • Avoid using an expression whose result might be unbounded as the argument to a function call, except where the function explicitly guarantees incremental evaluation. (Note that these specifications offer no such guarantee in the case of user-defined functions, though an implementation might do so.)

2.6.2 Incremental evaluation and streaming

TODO: say something here about the relationship of incremental evaluation to XSLT streaming.

2.6.3 fn:generate

Changes in 4.0 (next | previous)

  1. New in 4.0  [Issue 708 PR 2350 17 December 2025]

Summary

Delivers a potentially unbounded sequence based on a supplied initial value, together with a function for computing the next value.

Signature
fn:generate(
$init as item(),
$step as function(item(), xs:integer) as item()?
) as item()*
Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The first item in the returned sequence is $init; the second item is the result of $init => $step(1); the third item is the result of $init => $step(1) => $step(2), and so on.

The $step function is called with two arguments, the current state, and the sequential position of the current state (starting at 1). The coercion rules allow an arity-1 function to be supplied if the second argument is not required.

If the $step function returns an empty sequence, this marks the end of the generated sequence. However, the generated sequence is potentially unbounded, and the specification defines that certain operations on sequences are incrementally evaluated, ensuring that unbounded sequences can be processed without exhausting available resources.

Formal Equivalent

The function delivers the same result as the following XQuery implementation.

declare function generate(
  $init   as item(),
  $step as function(item(), xs:integer) as item()?
) as item()* {
  ($init, generate-helper($init, $step, 1))
};
declare %private function generate-helper (
  $init   as item(),
  $step as function(item(), xs:integer) as item()?,
  $position as xs:integer
) as item()* {
  let $next := $step($init, $position)
  while exists($next)
  return ($next, generate-helper($next, $step, $position+1))
};
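For readers more familiar with general-purpose languages, the rules above can also be modelled in Python. This is an illustrative sketch only, not part of the formal definition: it assumes that None stands in for the empty sequence returned by $step, and that a Python generator stands in for an incrementally evaluated sequence.

```python
from itertools import islice


def generate(init, step):
    """Model of fn:generate: yields init, then successive results of step.

    None stands in for the empty sequence that ends the generated sequence.
    """
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)   # position starts at 1
        if state is None:
            return
        yield state
        position += 1


# An unbounded sequence, of which only the first five items are requested.
evens = list(islice(generate(0, lambda state, pos: state + 2), 5))

# A finite sequence: the step function signals the end by returning None.
halving = list(generate(8, lambda state, pos: state // 2 if state > 1 else None))

# The second argument of step is the position of the current state.
positions = list(generate("start", lambda state, pos: pos if pos <= 3 else None))

print(evens)      # [0, 2, 4, 6, 8]
print(halving)    # [8, 4, 2, 1]
print(positions)  # ['start', 1, 2, 3]
```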
Notes

The evaluation technique used to implement pipelined operation is often called lazy evaluation. This specification does not mandate any particular implementation technique; it only requires that the evaluation of pipelined operations is not constrained by available memory. For example, a processor with access to SIMD hardware is free to take advantage of this. This has the practical implication that an application cannot assume there will be zero lookahead in the pipeline, nor that such lookahead will be error-free.

The generate function delivers a sequence of items representing successive states. There is no constraint on how states are represented; in many cases it will be useful to represent states as records with a method conventionally named next that delivers the next state in the sequence (as in the example using fn:random-number-generator).

Although generate is designed to be capable of delivering an unbounded sequence, it can also be a convenient way of generating a finite sequence. For example,

generate(., fn{..})

returns the ancestors of a node, ending at the root of the containing tree.

An implementation is allowed to place limits on the number of items in a sequence. Even though some operations are defined to be incrementally evaluated, there may be constraints such as a limit on the size of the integer returned by fn:position.

Operations that consume a sequence in its entirety should be avoided if the sequence might be unbounded. Examples of such operations include:

  • Aggregation functions such as count, sum, string-join, or distinct-values.

  • Reordering operations, such as sort-by, reverse, the permute method of random-number-generator, or the implicit sort into document order performed by the /, union, intersect, and except operators.

  • The function last.

  • Searching operations, such as predicates, index-of, filter, and some, unless the user has good reason to believe these will always find a match.

  • Binding an unbounded sequence to a variable.

Using such operations on an unbounded sequence may result in catastrophic failure (for example, running out of memory), or in non-termination.

There are a number of ways of reliably processing a bounded subsequence of an unbounded input sequence:

Examples

An infinite sequence of even numbers

The expression

generate(0, fn{.+2})

produces an infinite sequence of even numbers: 0, 2, 4, 6, 8, ....

Operations that attempt to process this entire sequence (for example, count or sum) will inevitably fail (either by running out of resources, or by failing to terminate). However, operations that only need access to the start of the sequence, for example generate(0, fn{.+2})[20], will typically succeed, evaluating the infinite sequence only as far as is needed to deliver a result. Such operations are guaranteed to succeed if the operation in question is guaranteed to perform incremental evaluation of its operand.

Some operations, such as:

index-of(generate(0, fn{.+2}), $n)[1]

may or may not succeed, depending on the actual content of the data (in this case, depending on whether $n is odd or even).
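The difference between positional access and searching can be modelled in Python. This is a sketch under stated assumptions rather than a description of any processor: generate is a Python stand-in for fn:generate in which None marks the end of the sequence, and item_at and first_index_of are hypothetical helpers. The give_up_after cap is an addition for the purpose of the illustration; the XPath expression above has no such cap and simply fails to terminate when $n is odd.

```python
from itertools import islice


def generate(init, step):
    """Python stand-in for fn:generate; None ends the sequence."""
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)
        if state is None:
            return
        yield state
        position += 1


def item_at(items, position):
    """1-based positional access that reads no more than `position` items."""
    return next(islice(items, position - 1, position), None)


def first_index_of(items, target, give_up_after):
    """1-based position of the first match among the first
    `give_up_after` items, or None if there is no match in that prefix."""
    for position, item in enumerate(islice(items, give_up_after), start=1):
        if item == target:
            return position
    return None


def evens():
    return generate(0, lambda state, pos: state + 2)


twentieth = item_at(evens(), 20)
found = first_index_of(evens(), 10, give_up_after=1000)
not_found = first_index_of(evens(), 7, give_up_after=1000)

print(twentieth)  # 38
print(found)      # 6
print(not_found)  # None
```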

An infinite sequence of random doubles

The function call:

generate(random-number-generator($seed), fn{?next()})
   [1 to 1000] ? number

returns a sequence of one thousand random numbers. The call to generate supplies as initial value a random number generator with a given seed. The $step function calls the next method of fn:random-number-generator to produce a new random number generator. The result of the call to generate is thus an unbounded sequence of random number generators. The filter predicate 1 to 1000 then reduces this sequence to the first 1000 items: this works because a filter expression is guaranteed to perform incremental evaluation of its base expression. The final ? number operation then extracts the number delivered by each of the random number generator objects.
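The pattern of a state that carries a value together with a next method can be sketched in Python. Everything in this sketch is an assumption made for illustration: ToyRng is a hypothetical immutable record built on a toy linear congruential generator (it is not the algorithm used by fn:random-number-generator), and generate is a Python stand-in for fn:generate in which None marks the end of the sequence.

```python
from dataclasses import dataclass
from itertools import islice


def generate(init, step):
    """Python stand-in for fn:generate; None ends the sequence."""
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)
        if state is None:
            return
        yield state
        position += 1


@dataclass(frozen=True)
class ToyRng:
    """Hypothetical immutable generator state (toy LCG, illustration only)."""
    seed: int  # assumed to lie in the range 0 to 2**31 - 1

    @property
    def number(self):
        return self.seed / 2**31            # a float in [0, 1)

    def next(self):
        """Return a new state; the current state is left unchanged."""
        return ToyRng((1103515245 * self.seed + 12345) % 2**31)


def random_numbers(seed, how_many):
    # The unbounded sequence of generator states, cut down to a finite prefix.
    states = generate(ToyRng(seed), lambda rng, pos: rng.next())
    return [rng.number for rng in islice(states, how_many)]


numbers = random_numbers(2026, 1000)
```

Because each state is immutable and next returns a new state, the same seed always yields the same thousand numbers, which matches the deterministic character of the XPath example.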

Generating the Fibonacci sequence

The expression:

generate([1,1], fn{[?2, ?1+?2]})?1

delivers an infinite sequence of integers, specifically the Fibonacci sequence:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55...

This example illustrates that generate produces a sequence of states, which in turn can be used to deliver the actual values that are needed. In this case the state is represented by an array containing two consecutive Fibonacci numbers, which is sufficient information to compute the next state. The final ?1 is used to extract the actual wanted number from the array used to represent the state.
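The separation between states and delivered values can be modelled in Python as follows. This is a sketch only: generate is a Python stand-in for fn:generate in which None marks the end of the sequence, and a two-element tuple plays the role of the array that represents the state.

```python
from itertools import islice


def generate(init, step):
    """Python stand-in for fn:generate; None ends the sequence."""
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)
        if state is None:
            return
        yield state
        position += 1


# Each state holds two consecutive Fibonacci numbers.
fib_states = generate((1, 1), lambda s, pos: (s[1], s[0] + s[1]))

# Extract the first member of each state, for the first ten states only.
fibonacci = [s[0] for s in islice(fib_states, 10)]
print(fibonacci)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```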

Markov simulation

The generate function can be useful in simulations of Markov processes. A trivial example of such a process is the “drunkard's walk”: a sequence of integers in which each integer is either one greater or one less than the previous integer, with equal probability.

  generate({'val':0, 'rg':random-number-generator(current-dateTime())},
           fn{ {'val': ?val + head(?rg?permute((-1, +1))),
                'rg': ?rg?next() } } )

In one sample run of the simulation, this produced the sequence 0 1 0 -1 -2 -1 0 1 0 -1 -2 -3 -4 -3 -2 -3 ...
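A comparable simulation can be sketched in Python. The sketch makes several assumptions that differ from the XPath example: generate is a Python stand-in for fn:generate, the function name drunkards_walk is hypothetical, and the random source is a mutable random.Random object captured by the step function rather than a generator carried in the state. The seed is supplied explicitly so that a run can be reproduced.

```python
import random
from itertools import islice


def generate(init, step):
    """Python stand-in for fn:generate; None ends the sequence."""
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)
        if state is None:
            return
        yield state
        position += 1


def drunkards_walk(seed):
    """Unbounded walk starting at 0; each step moves by -1 or +1."""
    rng = random.Random(seed)   # mutable random source (an assumption)
    return generate(0, lambda value, pos: value + rng.choice((-1, 1)))


# Take a bounded prefix of the unbounded walk.
walk = list(islice(drunkards_walk(42), 16))
print(walk)
```

The values printed depend on the seed; the only properties that hold for every run are that the walk starts at 0 and that consecutive values differ by exactly one.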

A finite state automaton

Validating a text against a grammar often involves production of a finite state automaton. Each state in such an automaton has a set of possible transitions, which is a mapping from the next token in the input to the next state in the automaton. For example, the grammar (ab)* has two states: state S1 has a single transition that maps the token "a" to the state S2, while state S2 has a single transition that maps the token "b" to state S1. State S1 is both the initial state and the required final state, so an input string is valid against the grammar if a sequence of transitions ends with state S1.

This finite state automaton might be represented by the data structure:

{
  "S1": { "final":true(),  "transitions": {"a": "S2" } },
  "S2": { "final":false(), "transitions": {"b": "S1" } }
}

If this data structure is bound to the variable $fsa, and the input (as a sequence of string tokens) is bound to the variable $input, then the following expression:

$fsa ? (generate("S1", fn($state, $i) {
    if ($i le count($input)) {
       $fsa ? $state ? transitions ? ($input[$i])
         otherwise error(`Unexpected token "{$input[$i]}"`)
    }
}) [last()]) ? final

returns true if and only if the input matches the grammar corresponding to the automaton.

The way this works is that the sequence of states of the generator (which correspond to states of the finite state automaton) advances by matching successive tokens from the input against the valid transitions for that state. If a valid transition is found, the generator moves to the next state; if no transition is found, an error is reported. When the stream of tokens is exhausted, the step function returns an empty sequence to indicate completion. Finally the expression returns true if the last state is a final state, or false otherwise.
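The same automaton can be modelled in Python to make the control flow explicit. This is a sketch under stated assumptions: generate is a Python stand-in for fn:generate in which None marks the end of the sequence, the function name accepts is hypothetical, and a ValueError plays the role of the dynamic error raised for an unexpected token.

```python
def generate(init, step):
    """Python stand-in for fn:generate; None ends the sequence."""
    state, position = init, 1
    yield state
    while True:
        state = step(state, position)
        if state is None:
            return
        yield state
        position += 1


# The automaton for the grammar (ab)*, as in the data structure above.
FSA = {
    "S1": {"final": True, "transitions": {"a": "S2"}},
    "S2": {"final": False, "transitions": {"b": "S1"}},
}


def accepts(fsa, tokens, start="S1"):
    """Return True if the token sequence ends in a final state."""
    def step(state, i):
        if i <= len(tokens):
            token = tokens[i - 1]                 # positions are 1-based
            try:
                return fsa[state]["transitions"][token]
            except KeyError:
                raise ValueError(f'Unexpected token "{token}"') from None
        return None                               # input exhausted

    last = start
    for last in generate(start, step):            # keep only the last state
        pass
    return fsa[last]["final"]


print(accepts(FSA, list("abab")))  # True
print(accepts(FSA, list("aba")))   # False
```

An input such as "aa" raises ValueError at the second token, because state S2 has no transition for "a".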

E Glossary (Non-Normative)

atomic item

An atomic item is a pair (T, D) where T (the type annotation) is an atomic type, and D (the datum) is a point in the value space of T.

capturing subexpression

A left parenthesis is recognized as a capturing left parenthesis provided it is not immediately followed by ? or * (see below), is not within a character group (square brackets), and is not escaped with a backslash. The sub-expression enclosed by a capturing left parenthesis and its matching right parenthesis is referred to as a capturing subexpression.

character

A character is an instance of the Char production of [Extensible Markup Language (XML) 1.0 (Fifth Edition)].

character position

A string of length N has N+1 character positions: one immediately before each character in the string, and one after the last character. In interfaces where character positions are exposed, they are numbered from 1 to N+1.

codepoint

A codepoint is an integer assigned to a character by the Unicode consortium, or reserved for future assignment to a character.

collation

A collation is an algorithm that determines, for any two given strings S1 and S2, whether S1 is less than, equal to, or greater than S2. In this specification, a collation is identified by an absolute URI.

collation unit

The term collation unit as used in this specification is equivalent to the term collation element used in [UTS #10].

context-dependent

A function definition may have the property of being context-dependent: the result of such a function depends on the values of properties in the static and dynamic evaluation context of the caller as well as on the actual supplied arguments (if any). A function definition may be context-dependent for some arities in its arity range, and context-independent for others: for example fn:name#0 is context-dependent while fn:name#1 is context-independent.

context-independent

A function definition that is not context-dependent is called context-independent.

contextually equal

Two atomic items A and B are said to be contextually equal if the function call fn:compare(A, B) returns zero when evaluated with a specified or context-determined collation and implicit timezone.

CSV

The term comma separated values or CSV refers to a wide variety of plain-text tabular data formats with fields and records separated by standard character delimiters (often, but not invariably, commas).

date formatting function

The three functions fn:format-dateTime, fn:format-date, and fn:format-time are referred to collectively as the date formatting functions.

datum

The datum of an atomic item is a point in the value space of its type, which is also a point in the value space of the primitive type from which that type is derived.

deterministic

A function that is guaranteed to produce identical results from repeated calls within a single execution scope if the explicit and implicit arguments are identical is referred to as deterministic.

digit family

The decimal digit family of a decimal format is the sequence of ten digits with consecutive Unicode codepoints starting with the character that is the value of the zero-digit property.

disjoint matching segments

The disjoint matching segments obtained by applying a regular expression R to a string S in the presence of a set of flags F are the segments of S that match R (using flags F), after elimination of overlapping segments.

end position

The end position of a segment is the start position of the segment plus its length.

execution scope

An execution scope is a sequence of calls to the function library during which certain aspects of the state are required to remain invariant. For example, two calls to fn:current-dateTime within the same execution scope will return the same result. The execution scope is defined by the host language that invokes the function library.

expanded-QName

An expanded-QName is a value in the value space of the xs:QName datatype as defined in the XDM data model (see [XQuery and XPath Data Model (XDM) 4.0]): that is, a triple containing namespace prefix (optional), namespace URI (optional), and local name. Two expanded QNames are equal if the namespace URIs are the same (or both absent) and the local names are the same. The prefix plays no part in the comparison, but is used only if the expanded QName needs to be converted back to a string.

focus-dependent

A function is focus-dependent if its result depends on the focus (that is, the context item, position, or size) of the caller.

focus-independent

A function that is not focus-dependent is called focus-independent.

Gregorian

The eight primitive types xs:dateTime, xs:date, xs:time, xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gMonth, xs:gDay are referred to collectively as the Gregorian types.

higher-order

Functions that accept functions among their arguments, or that return functions in their result, are described in this specification as higher-order functions.

identical

Two values $V1 and $V2 are defined to be identical if they contain the same number of items and the items are pairwise identical. Two items are identical if and only if one of the following conditions applies:

implementation-defined

Where behavior is described as implementation-defined, variations between processors are permitted, but a conformant implementation must document the choices it has made.

implementation-dependent

Where behavior is described as implementation-dependent, variations between processors are permitted, and conformant implementations are not required to document the choices they have made.

incremental evaluation

A function or operation is said to be incrementally evaluated with respect to one of its arguments or operands if the implementation is capable of processing an unbounded value for that argument or operand without exhausting available memory.

map

A map consists of a sequence of entries, also known as key-value pairs. Each entry comprises a key which is an arbitrary atomic item, and an arbitrary sequence called the associated value.

match

The term match is used in the sense of definition DS2 from [UTS #10].

minimal match

The term minimal match is used in the sense of definition DS4 from [UTS #10].

nondeterministic

A function that is not deterministic is referred to as nondeterministic.

nondeterministic with respect to ordering

Some functions (such as fn:in-scope-prefixes, fn:load-xquery-module, and fn:unordered) produce result sequences or result maps in an implementation-defined or implementation-dependent order. In such cases two calls with the same arguments are not guaranteed to produce the results in the same order. These functions are said to be nondeterministic with respect to ordering.

optional digit character

The optional digit character is the character that is the value of the digit property.

option parameter conventions

Functions that take an options parameter adopt common conventions on how the options are used. These are referred to as the option parameter conventions. These rules apply only to functions that explicitly refer to them.

permitted character

A permitted character is one within the repertoire accepted by the implementation.

picture string

The formatting of a number is controlled by a picture string. The picture string is a sequence of characters, in which the characters assigned to the properties decimal-separator, exponent-separator, grouping-separator, digit, and pattern-separator, and the members of the decimal digit family, are classified as active characters, and all other characters (including the values of the properties percent and per-mille) are classified as passive characters.

primitive type

A primitive type is one of the 19 primitive atomic types defined in 3.2 Primitive datatypes of [XML Schema Part 2: Datatypes Second Edition], or the type xs:untypedAtomic defined in [XQuery and XPath Data Model (XDM) 4.0].

same key

Within a map, no two entries have the same key. Two atomic items K1 and K2 are the same key for this purpose if the function call fn:atomic-equal($K1, $K2) returns true.

segment

A segment of a string S is a sequence of zero or more contiguous characters starting at a given character position within S.

single-entry map

A single-entry map is a map containing a single entry.

string

A string is a sequence of zero or more characters, or equivalently, a value in the value space of the xs:string datatype.

type annotation

The type annotation of an atomic item is the most specific atomic type that it is an instance of (it is also an instance of every type from which that type is derived).

Unicode codepoint collation

The collation URI http://www.w3.org/2005/xpath-functions/collation/codepoint identifies a collation which must be recognized by every implementation: it is referred to as the Unicode codepoint collation (not to be confused with the Unicode collation algorithm).

URI

Within this specification, the term URI refers to Uniform Resource Identifiers as defined in [RFC 3986] and extended in [RFC 3987] with a new name IRI. The term URI Reference, unless otherwise stated, refers to a string in the lexical space of the xs:anyURI datatype as defined in [XML Schema Part 2: Datatypes Second Edition].

variadic

The function fn:concat is defined to be variadic: it accepts any number of arguments. No other function has this property.

H Changes since 3.1 (Non-Normative)

H.1 Summary of Changes

  1. If a section of this specification has been updated since version 3.1, an overview of the changes is provided, along with links to navigate to the next or previous change.

    See 1 Introduction

  2. Sections with significant changes are marked with a ✭ symbol in the table of contents. New functions are indicated by ✚.

    See 1 Introduction

  3. PR 1504 2329 

    New in 4.0

    See 2.1.7 fn:insert-separator

  4. New in 4.0

    See 2.1.10 fn:replicate

  5. New in 4.0

    See 2.1.12 fn:slice

  6. PR 1120 1150 

    A callback function can be supplied for comparing individual items.

    See 2.2.4 fn:deep-equal

  7. Changed in 4.0 to use transitive equality comparisons for numeric values.

    See 2.2.5 fn:distinct-values

  8. PR 614 987 

    New in 4.0

    See 2.2.6 fn:duplicate-values

  9. New in 4.0. Originally proposed under the name fn:uniform

    See 2.4.2 fn:all-equal

  10. New in 4.0. Originally proposed under the name fn:unique

    See 2.4.3 fn:all-different

  11. New in 4.0

    See 2.5.3 fn:every

  12. New in 4.0

    See 2.5.9 fn:highest

  13. New in 4.0

    See 2.5.10 fn:index-where

  14. New in 4.0

    See 2.5.11 fn:lowest

  15. New in 4.0

    See 2.5.15 fn:scan-right

  16. New in 4.0

    See 2.5.16 fn:some

  17. PR 795 2228 

    New in 4.0

    See 2.5.19 fn:sort-with

  18. PR 521 761 

    New in 4.0

    See 2.5.22 fn:transitive-closure

  19. New in 4.0

    See 4.4.5 fn:is-NaN

  20. PR 1260 1275 

    A third argument has been added, providing control over the rounding mode.

    See 4.4.6 fn:round

  21. PR 1049 1151 

    Decimal format parameters can now be supplied directly as a map in the third argument, rather than referencing a format defined in the static context.

    See 4.7.2 fn:format-number

  22. PR 1205 1230 

    New in 4.0

    See 4.8.2 math:e

    See 4.8.8 math:cosh

    See 4.8.15 math:sinh

    See 4.8.18 math:tanh

  23. The 3.1 specification suggested that every value in the result range should have the same chance of being chosen. This has been corrected to say that the distribution should be arithmetically uniform (because there are as many xs:double values between 0.01 and 0.1 as there are between 0.1 and 1.0).

    See 4.9.2 fn:random-number-generator

  24. PR 261 306 993 

    New in 4.0

    See 5.4.1 fn:char

  25. New in 4.0

    See 5.4.2 fn:characters

  26. PR 937 995 1190 

    New in 4.0

    See 5.4.13 fn:hash

  27. PR 215 415  

    New in 4.0

    See 7.6.2 fn:parse-uri

  28. PR 1423 1413 

    New in 4.0

    See 7.6.3 fn:build-uri

  29. New in 4.0

    See 12.2.2 fn:in-scope-namespaces

  30. PR 1620 1886 

    Options are added to customize the form of the output.

    See 12.2.9 fn:path

  31. PR 1547 1551 

    New in 4.0

    See 12.2.11 fn:siblings

  32. PR 969 1134 

    New in 4.0

    See 14.4.6 map:filter

  33. PR 478 515 

    New in 4.0

    See 14.4.12 map:keys-where

  34. PR 1575 1906 

    A new function fn:element-to-map is provided for converting XDM trees to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function retained from 3.1, this can handle arbitrary XML as input.

    See 14.5 Converting elements to maps

  35. New in 4.0

    See 15.2.3 array:empty

  36. PR 968 1295 

    New in 4.0

    See 15.2.13 array:index-of

  37. PR 476 1087 

    New in 4.0

    See 15.2.16 array:items

  38. PR 360 476 

    New in 4.0

    See 15.2.18 array:members

    See 15.2.19 array:of-members

  39. Supplying an empty sequence as the value of an optional argument is equivalent to omitting the argument.

    See 15.2.29 array:subarray

  40. PR 1117 1279 

    The $options parameter has been added.

    See 17.1.6 fn:unparsed-text-lines

  41. PR 259 956 

    A new function is available for processing input data in HTML format.

    See 17.3 Functions on HTML Data

    New in 4.0

    See 17.3.2 fn:parse-html

  42. PR 975 1058 1246 

    An option is provided to control how JSON numbers should be formatted.

    See 17.4.4 fn:parse-json

  43. Additional options are available, as defined by fn:parse-json.

    See 17.4.5 fn:json-doc

  44. PR 533 719 834 1066 

    New in 4.0

    See 17.5.4 fn:csv-to-arrays

    See 17.5.7 fn:parse-csv

  45. PR 533 719 834 1066 1605 

    New in 4.0

    See 17.5.10 fn:csv-to-xml

  46. PR 791 1256 1282 1405 

    New in 4.0

    See 17.6.1 fn:invisible-xml

  47. PR 629 803 

    New in 4.0

    See 21.2.2 fn:message

  48. PR 533 719 834 

    New functions are available for processing input data in CSV (comma separated values) format.

    See 17.5 Functions on CSV Data

  49. Comparison of mixed numeric types (for example xs:double and xs:decimal) now generally converts both values to xs:decimal.

    See 4.3 Comparing numeric values

  50. PR 289 1901 

    A third argument is added, allowing user control of how absent keys should be handled.

    See 14.4.9 map:get

    A third argument is added, allowing user control of how index-out-of-bounds conditions should be handled.

    See 15.2.11 array:get

  51. A new collation URI is defined for Unicode case-insensitive comparison and ordering.

    See 5.3.5 The Unicode case-insensitive collation

  52. PR 1727 1740 

    It is no longer guaranteed that the new key replaces the existing key.

    See 14.4.14 map:put

  53. PR 173 

    New in 4.0

    See 18.4 fn:op

  54. PR 203 

    New in 4.0

    See 14.4.1 map:build

  55. PR 207 

    New in 4.0

    See 10.1.2 fn:parse-QName

    See 10.2.5 fn:expanded-QName

  56. PR 222 

    New in 4.0

    See 2.2.3 fn:contains-subsequence

    See 2.2.7 fn:ends-with-subsequence

    See 2.2.9 fn:starts-with-subsequence

  57. PR 250 

    New in 4.0

    See 2.1.3 fn:foot

    See 2.1.15 fn:trunk

    See 15.2.2 array:build

    See 15.2.8 array:foot

    See 15.2.31 array:trunk

  58. PR 258 

    New in 4.0

    See 15.2.14 array:index-where

  59. PR 313 

    The second argument can now be a sequence of integers.

    See 2.1.9 fn:remove
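
As an informal illustration of this change, the following Python sketch models removal at one or several 1-based positions. The helper name `remove` and its list-based signature are assumptions made for this example; they are not part of the specification.

```python
def remove(items, positions):
    """Return items with the entries at the given 1-based positions dropped."""
    # Accept either a single position or a sequence of positions.
    if isinstance(positions, int):
        positions = [positions]
    drop = set(positions)
    # Positions outside the range of the input simply match nothing.
    return [item for index, item in enumerate(items, start=1) if index not in drop]


print(remove(["a", "b", "c", "d"], [1, 3]))  # prints ['b', 'd']
print(remove(["a", "b", "c", "d"], 2))       # prints ['a', 'c', 'd']
```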

  60. PR 319 

    New in 4.0. The function replaces the internal op:same-key function in 3.1.

    See 2.2.1 fn:atomic-equal

  61. PR 326 

    Higher-order functions are no longer an optional feature.

    See 1.2 Conformance

  62. PR 360 

    New in 4.0

    See 14.4.4 map:entries

  63. PR 419 

    New in 4.0

    See 2.1.8 fn:items-at

  64. PR 434 

    New in 4.0

    See 4.5.2 fn:parse-integer

    The function has been extended to allow output in a radix other than 10, for example in hexadecimal.

    See 4.6.1 fn:format-integer

  65. PR 477 

    New in 4.0

    See 15.2.24 array:slice

  66. PR 482 

    Deleted an inaccurate statement concerning the behavior of NaN.

    See 4.3 Comparing numeric values

  67. PR 507 

    New in 4.0

    See 2.5.13 fn:partition

  68. PR 546 

    It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 5.2.1 fn:codepoints-to-string

    It is no longer automatically an error if the resource (after decoding) contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 17.1.5 fn:unparsed-text

    The rules regarding use of non-XML characters in JSON texts have been relaxed.

    See 17.4.3 JSON character repertoire

    See 17.4.4 fn:parse-json

    It is no longer automatically an error if the input contains a codepoint that is not valid in XML. Instead, the codepoint must be a permitted character. The set of permitted characters is implementation-defined, but it is recommended that all Unicode characters should be accepted.

    See 17.4.5 fn:json-doc

  69. PR 609 

    New in 4.0

    See 15.2.28 array:split

  70. PR 631 

    New in 4.0

    See 7.1 fn:decode-from-uri

  71. PR 662 

    Constructor functions now have a zero-arity form; the first argument defaults to the context item.

    See 22 Constructor functions

  72. PR 680 

    The case-insensitive collation is now defined normatively within this specification, rather than by reference to the HTML "living specification", which is subject to change. The collation can now be used for ordering comparisons as well as equality comparisons.

    See 5.3.6 The HTML ASCII Case-Insensitive Collation

  73. PR 702 

    The function can now take any number of arguments (previously it had to be two or more), and the arguments can be sequences of strings rather than single strings.

    See 5.4.4 fn:concat

  74. PR 710 

    Changes the function to return a sequence of key-value pairs rather than a map.

    See 13.5 fn:function-annotations

  75. PR 727 

    It has been clarified that loading a module has no effect on the static or dynamic context of the caller.

    See 18.2 fn:load-xquery-module

  76. PR 828 

    The $predicate callback function accepts an optional position argument.

    See 2.5.4 fn:filter

    The $action callback function accepts an optional position argument.

    See 2.5.7 fn:for-each

    See 2.5.8 fn:for-each-pair

    The $predicate callback function now accepts an optional position argument.

    See 15.2.4 array:filter

    The $action callback function now accepts an optional position argument.

    See 15.2.9 array:for-each

    See 15.2.10 array:for-each-pair

  77. PR 881 

    The way that fn:min and fn:max compare numeric values of different types has changed. The most noticeable effect is that when these functions are applied to a sequence of xs:integer or xs:decimal values, the result is an xs:integer or xs:decimal, rather than the result of converting this to an xs:float or xs:double.

    See 2.4.5 fn:max

    See 2.4.6 fn:min

  78. PR 901 

    The optional third argument can now be supplied as an empty sequence.

    See 2.1.13 fn:subsequence

    The third argument can now be supplied as an empty sequence.

    See 5.4.6 fn:substring

    The second argument can now be an empty sequence.

    See 6.3.3 fn:tokenize

    The optional second argument can now be supplied as an empty sequence.

    See 7.5 fn:resolve-uri

    The 3rd, 4th, and 5th arguments are now optional; previously the function required either 2 or 5 arguments.

    See 9.8.1 fn:format-dateTime

    See 9.8.2 fn:format-date

    See 9.8.3 fn:format-time

    All three arguments are now optional, and each argument can be set to an empty sequence. Previously if $description was supplied, it could not be empty.

    See 21.1.1 fn:error

    The $label argument can now be set to an empty sequence. Previously if $label was supplied, it could not be empty.

    See 21.2.1 fn:trace

  79. PR 905 

    The rule that multiple calls on fn:doc supplying the same absolute URI must return the same document node has been clarified; in particular the rule does not apply if the dynamic context for the two calls requires different processing of the documents (such as schema validation or whitespace stripping).

    See 17.1.1 fn:doc

  80. PR 909 

    The function has been expanded in scope to handle comparison of values other than strings.

    See 2.2.2 fn:compare

  81. PR 924 

    Rules have been added clarifying that users should not be allowed to change the schema for the fn namespace.

    See D Schemas

  82. PR 925 

    The decimal format name can now be supplied as a value of type xs:QName, as an alternative to supplying a lexical QName as an instance of xs:string.

    See 4.7.2 fn:format-number

  83. PR 932 

    The specification now prescribes a minimum precision and range for durations.

    See 8.1.2 Limits and precision

  84. PR 933 

    When comments and processing instructions are ignored, any text nodes either side of the comment or processing instruction are now merged prior to comparison.

    See 2.2.4 fn:deep-equal

  85. PR 940 

    New in 4.0

    See 2.5.20 fn:subsequence-where

  86. PR 953 

    Constructor functions for named record types have been introduced.

    See 22.6 Constructor functions for named record types

  87. PR 962 

    New in 4.0

    See 2.5.2 fn:do-until

    See 2.5.23 fn:while-do

  88. PR 969 

    New in 4.0

    See 14.4.3 map:empty

  89. PR 984 

    New in 4.0

    See 8.4.1 fn:seconds

  90. PR 987 

    The order of results is now prescribed; it was previously implementation-dependent.

    See 2.2.5 fn:distinct-values

  91. PR 1022 

    Regular expressions can include comments (starting and ending with #) if the c flag is set.

    See 6.1 Regular expression syntax

    See 6.2 Flags

  92. PR 1028 

    An option is provided to control how the JSON null value should be handled.

    See 17.4.4 fn:parse-json

  93. PR 1032 

    New in 4.0

    See 2.1.17 fn:void

  94. PR 1046 

    New in 4.0

    See 2.5.21 fn:take-while

  95. PR 1059 

    Use of an option keyword that is not defined in the specification and is not known to the implementation now results in a dynamic error; previously it was ignored.

    See 1.7 Options

  96. PR 1068 

    New in 4.0

    See 5.4.3 fn:graphemes

  97. PR 1072 

    The return type is now specified more precisely.

    See 18.2 fn:load-xquery-module

  98. PR 1090 

    When casting from a string to a duration or time or dateTime, it is now specified that when there are more digits in the fractional seconds than the implementation is able to retain, excess digits are truncated. Rounding upwards (which could affect the number of minutes or hours in the value) is not permitted.

    See 23.2 Casting from xs:string and xs:untypedAtomic
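
The difference between truncating and rounding can be sketched in Python as follows. This is only an illustration: the function name `truncate_fractional_seconds` and the choice of three retained digits are assumptions for the example, and real implementations may retain more digits.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def truncate_fractional_seconds(seconds, digits):
    """Drop excess fractional digits from a lexical seconds value without rounding up."""
    quantum = Decimal(1).scaleb(-digits)  # digits=3 gives Decimal('0.001')
    return Decimal(seconds).quantize(quantum, rounding=ROUND_DOWN)


# Truncation keeps the value within the same minute.
print(truncate_fractional_seconds("59.9999999", 3))  # prints 59.999

# Rounding would produce 60 seconds and so spill into the minutes component.
print(Decimal("59.9999999").quantize(Decimal("0.001"), rounding=ROUND_HALF_UP))  # prints 60.000
```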

  99. PR 1093 

    New in 4.0

    See 5.3.9 fn:collation

  100. PR 1117 

    The $options parameter has been added.

    See 17.1.5 fn:unparsed-text

    See 17.1.7 fn:unparsed-text-available

  101. PR 1182 

    The $predicate callback function may return an empty sequence (meaning false).

    See 2.5.2 fn:do-until

    See 2.5.3 fn:every

    See 2.5.4 fn:filter

    See 2.5.10 fn:index-where

    See 2.5.16 fn:some

    See 2.5.21 fn:take-while

    See 2.5.23 fn:while-do

    See 14.4.6 map:filter

    See 14.4.12 map:keys-where

    See 15.2.4 array:filter

    See 15.2.14 array:index-where

  102. PR 1191 

    The $options parameter has been added, absorbing the $collation parameter.

    See 2.2.4 fn:deep-equal

    New in 4.0

    See 12.3.1 fn:distinct-ordered-nodes

  103. PR 1250 

    For selected properties including percent and exponent-separator, it is now possible to specify a single-character marker to be used in the picture string, together with a multi-character rendition to be used in the formatted output.

    See 4.7.2 fn:format-number

  104. PR 1257 

    The $options parameter has been added.

    See 17.2.1 fn:parse-xml

    See 17.2.2 fn:parse-xml-fragment

  105. PR 1262 

    New in 4.0

    See 5.3.10 fn:collation-available

  106. PR 1265 

    The constraints on the result of the function have been relaxed.

    See 12.1.2 fn:document-uri

  107. PR 1280 

    As a result of changes to the coercion rules, the number of supplied arguments can be greater than the number required: extra arguments are ignored.

    See 2.5.1 fn:apply

  108. PR 1288 

    Additional error conditions have been defined.

    See 17.2.1 fn:parse-xml

  109. PR 1296 

    New in 4.0

    See 2.5.14 fn:scan-left

  110. PR 1333 

    A new option is provided to allow the content of the loaded module to be supplied as a string.

    See 18.2 fn:load-xquery-module

  111. PR 1353 

    An option has been added to suppress the escaping of the solidus (forward slash) character.

    See 17.4.7 fn:xml-to-json

  112. PR 1358 

    New in 4.0

    See 9.3.2 fn:unix-dateTime

  113. PR 1361 

    The term atomic value has been replaced by atomic item.

    See 1.9 Terminology

  114. PR 1393 

    Changes the function to return a sequence of key-value pairs rather than a map.

    See 13.5 fn:function-annotations

  115. PR 1409 

    This section now uses the term primitive type strictly to refer to the 20 atomic types that are not derived by restriction from another atomic type: that is, the 19 primitive atomic types defined in XSD, plus xs:untypedAtomic. The three types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration, which have custom casting rules but are not strictly speaking primitive, are now handled in other subsections.

    See 23.1 Casting from primitive types to primitive types

    The rules for conversion of dates and times to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since these deliver exactly the same result as the XPath 3.1 rules.

    See 23.1.2.2 Casting date/time values to xs:string

    The rules for conversion of durations to strings are now defined entirely in terms of XSD 1.1 canonical mappings, since the XSD 1.1 rules deliver exactly the same result as the XPath 3.1 rules.

    See 23.1.2.3 Casting xs:duration values to xs:string

  116. PR 1455 

    Numbers now retain their original lexical form, except for any changes needed to satisfy JSON syntax rules (for example, stripping leading zero digits).

    See 17.4.7 fn:xml-to-json
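
A rough Python sketch of the idea follows. It covers only the one adjustment named above (stripping leading zero digits) and leaves every other part of the lexical form alone; the helper name `json_number` is hypothetical, and any further adjustments an implementation makes are outside the scope of this example.

```python
import re


def json_number(lexical):
    """Strip redundant leading zeros; keep the rest of the lexical form as written."""
    match = re.fullmatch(r"(-?)0*(\d.*)", lexical)
    if match is None:
        return lexical  # anything else is left untouched in this sketch
    sign, rest = match.groups()
    return sign + rest


print(json_number("007.50"))  # prints 7.50
print(json_number("1.0e2"))   # prints 1.0e2
print(json_number("0.5"))     # prints 0.5
```

Note that `1.0e2` is not rewritten as `100`, and the trailing zero of `7.50` is kept.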

  117. PR 1473 

    New in 4.0

    See 2.1.5 fn:identity

  118. PR 1481 

    The function has been extended to handle other Gregorian types such as xs:gYearMonth.

    See 9.5.1 fn:year-from-dateTime

    See 9.5.2 fn:month-from-dateTime

    The function has been extended to handle other Gregorian types such as xs:gMonthDay.

    See 9.5.3 fn:day-from-dateTime

    The function has been extended to handle other types including xs:time.

    See 9.5.4 fn:hours-from-dateTime

    See 9.5.5 fn:minutes-from-dateTime

    The function has been extended to handle other types such as xs:gYearMonth.

    See 9.5.7 fn:timezone-from-dateTime

  119. PR 1504 

    An optional $separator argument has been added.

    See 15.2.17 array:join

  120. PR 1523 

    New functions are provided to obtain information about built-in types and types defined in an imported schema.

    See 19 Processing types

    New in 4.0

    See 19.1.2 fn:schema-type

    See 19.1.4 fn:atomic-type-annotation

    See 19.1.5 fn:node-type-annotation

  121. PR 1545 

    New in 4.0

    See 9.6.4 fn:civil-timezone

  122. PR 1565 

    The default for the escape option has been changed to false. The 3.1 specification gave the default value as true, but this appears to have been an error, since it was inconsistent with examples given in the specification and with tests in the test suite.

    See 17.4.4 fn:parse-json

  123. PR 1570 

    New in 4.0

    See 19.1.3 fn:type-of

  124. PR 1587 

    New in 4.0

    See 17.1.8 fn:unparsed-binary

  125. PR 1611 

    The spec has been corrected to note that the function depends on the implicit timezone.

    See 2.2.2 fn:compare

  126. PR 1671 

    New in 4.0.

    See 4.4.3 fn:divide-decimals

  127. PR 1687 

    New in 4.0

    See 14.4.10 map:items

  128. PR 1703 

    Ordered maps are introduced.

    See 14.1 Ordering of Maps

    Enhanced to allow for ordered maps.

    See 14.4.6 map:filter

    See 14.4.7 map:find

    See 14.4.8 map:for-each

    See 14.4.14 map:put

    See 14.4.15 map:remove

    The order of entries in maps is retained.

    See 17.4.4 fn:parse-json

  129. PR 1711 

    It is explicitly stated that the limits for $precision are implementation-defined.

    See 4.4.6 fn:round

    See 4.4.7 fn:round-half-to-even

  130. PR 1727 

    For consistency with the new function map:build, the handling of duplicates may now be controlled by supplying a user-defined callback function as an alternative to the fixed values for the earlier duplicates option.

    See 14.4.13 map:merge

  131. PR 1734 

    In 3.1, given a mixed input sequence such as (1, 3, 4.2e0), the specification was unclear whether it was permitted to add the first two integer items using integer arithmetic, rather than converting all items to doubles before performing any arithmetic. The 4.0 specification is clear that this is permitted; but since the items can be reordered before being added, this is not required.

    See 2.4.4 fn:avg

    See 2.4.7 fn:sum

  132. PR 1825 

    New in 4.0

    See 2.5.12 fn:partial-apply

  133. PR 1856 

    Word boundaries can be matched. Lookahead and lookbehind assertions are supported. Assertions (including ^ and $) can no longer be followed by a quantifier.

    See 6.1 Regular expression syntax

    The output of the function is extended to allow the representation of captured groups found within lookahead assertions.

    See 6.3.4 fn:analyze-string

  134. PR 1879 

    Additional options to control DTD and XInclude processing have been added.

    See 17.2.1 fn:parse-xml

  135. PR 1897 

    The $replacement argument can now be a function that computes the replacement strings.

    See 6.3.2 fn:replace

  136. PR 1906 

    New in 4.0

    See 14.5.10 fn:element-to-map-plan

    New in 4.0.

    See 14.5.11 fn:element-to-map

  137. PR 1910 

    An $options parameter is added. Note that the rules for the $options parameter control aspects of processing that were implementation-defined in earlier versions of this specification. An implementation may provide configuration options designed to retain backwards-compatible behavior when no explicit options are supplied.

    See 17.1.1 fn:doc

    See 17.1.2 fn:doc-available

  138. PR 1913 

    It is now permitted for the regular expression to match a zero-length string.

    See 6.3.2 fn:replace

    See 6.3.3 fn:tokenize

    See 6.3.4 fn:analyze-string

  139. PR 1933 

    New in 4.0

    See 17.2.5 fn:xsd-validator

  140. PR 1991 

    Named record types used in the signatures of built-in functions are now available as standard in the static context.

    See C Built-in named record types

  141. PR 2001 

    New in 4.0.

    See 2.5.18 fn:sort-by

    See 15.2.26 array:sort-by

  142. PR 2013 

    Support for binary input has been added.

    See 17.2.1 fn:parse-xml

    See 17.2.2 fn:parse-xml-fragment

    New in 4.0

    See 17.3.3 fn:html-doc

    See 17.5.8 fn:csv-doc

  143. PR 2030 

    This description of the XSD validation process was previously found (with some duplication) in the XQuery and XSLT specifications; those specifications now reference this description. As a side effect, the descriptions of the process in XQuery and XSLT are better aligned.

    See 17.2.4 XSD validation

  144. PR 2031 

    Introduced the concept of JNodes.

    See 16 Processing JNodes

    New in 4.0

    See 16.1.1 fn:jtree

    See 16.1.3 fn:jnode-selector

    See 16.1.4 fn:jnode-position

  145. PR 2149 

    Generalized to work with JNodes as well as XNodes.

    See 12.2.1 fn:has-children

    The function is extended to handle JNodes.

    See 12.2.9 fn:path

    Generalized to work with JNodes as well as XNodes.

    See 12.3.2 fn:innermost

    See 12.3.3 fn:outermost

  146. PR 2168 

    Atomic items of types xs:hexBinary and xs:base64Binary are now mutually comparable. In rare cases, where an application uses both types and assumes they are distinct, this can represent a backwards incompatibility.

    See 2.2.1 fn:atomic-equal

    See 2.2.4 fn:deep-equal

    See 2.2.5 fn:distinct-values

  147. PR 2223 

    An error may now be raised if the base URI is not a valid LEIRI reference.

    See 12.1.1 fn:base-uri

  148. PR 2224 

    The $action callback function now accepts an optional position argument.

    See 14.4.6 map:filter

    See 14.4.8 map:for-each

  149. PR 2228 

    New in 4.0

    See 15.2.27 array:sort-with

  150. PR 2249 

    The specification now describes in more detail how to determine the effective encoding value.

    See 17.1.5 fn:unparsed-text

  151. PR 2256 

    In the interests of consistency, the index-of function now defines equality to mean contextually equal. This has the implication that NaN is now considered equal to NaN.

    See 2.2.8 fn:index-of
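
The NaN consequence can be sketched in Python as below. The sketch models only the NaN aspect of the change, not the full definition of contextual equality, and the helper names `index_of` and `same` are assumptions for the example.

```python
import math


def index_of(items, target):
    """Return the 1-based positions of items equal to target, with NaN equal to NaN."""
    def same(a, b):
        both_floats = isinstance(a, float) and isinstance(b, float)
        if both_floats and math.isnan(a) and math.isnan(b):
            return True
        return a == b

    return [pos for pos, item in enumerate(items, start=1) if same(item, target)]


nan = float("nan")
print(index_of([1.0, nan, 3.0, nan], nan))  # prints [2, 4]
print(index_of([1.0, nan, 3.0, nan], 3.0))  # prints [3]
```

With a plain `==` comparison the first call would return an empty list, because NaN does not compare equal to itself under ordinary floating-point rules.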

  152. PR 2259 

    A new parameter canonical is available to give control over serialization of XML, XHTML, and JSON.

    See 17.2.3 fn:serialize

  153. PR 2286 

    The type of $value has been generalized to xs:anyAtomicType?.

    See 5.4.7 fn:string-length

    See 5.4.8 fn:normalize-space

  154. PR 2350 

    New in 4.0

    See 2.6.3 fn:generate