This draft contains only sections that have differences from the version that it modified.

W3C

XPath and XQuery Functions and Operators 4.0

W3C Editor's Draft 23 February 2026

This version:
https://qt4cg.org/specifications/xpath-functions-40/
Latest version of XPath and XQuery Functions and Operators 4.0:
https://qt4cg.org/specifications/xpath-functions-40/
Most recent Recommendation of XPath and XQuery Functions and Operators:
https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

This document is also available in these non-normative formats: Specification in XML format and XML function catalog.


Abstract

This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 4.0]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 4.0]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.

A summary of changes since version 3.1 is provided at H Changes since 3.1.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.

The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.

The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


5 Processing strings

This section specifies functions and operators on the [XML Schema Part 2: Datatypes Second Edition] xs:string datatype and the datatypes derived from it.

5.4 Functions on string values

The following functions are defined on values of type xs:string and types derived from it.

Function              Meaning
fn:char               Returns a string containing a particular character or glyph.
fn:characters         Splits the supplied string into a sequence of single-character strings.
fn:graphemes          Splits the supplied string into a sequence of single-grapheme strings.
fn:concat             Returns the concatenation of the arguments, treated as sequences of strings.
fn:string-join        Returns a string created by concatenating the items in a sequence, with a defined separator between adjacent items.
fn:substring          Returns the part of $value beginning at the position indicated by $start and continuing for the number of characters indicated by $length.
fn:string-length      Returns the number of characters in a string.
fn:normalize-space    Returns $value with leading and trailing whitespace removed, and sequences of internal whitespace reduced to a single space character.
fn:normalize-unicode  Returns $value after applying Unicode normalization.
fn:upper-case         Converts a string to upper case.
fn:lower-case         Converts a string to lower case.
fn:translate          Returns $value modified by replacing or removing individual characters.
fn:hash               Returns the results of a specified hash, checksum, or cyclic redundancy check function applied to the input.

Notes:

When the above operators and functions are applied to datatypes derived from xs:string, they are guaranteed to return values that are instances of xs:string, but the value might or might not be an instance of the particular subtype of xs:string to which they were applied.

The strings returned by fn:concat and fn:string-join are not guaranteed to be normalized; see, however, the note in fn:concat.

5.4.7 fn:string-length

Changes in 4.0  

  1. The type of $value has been generalized to xs:anyAtomicType?.  [Issue 2279 PR 2286 17 November 2025]

Summary

Returns the number of characters in a string.

Signature
fn:string-length(
  $value as xs:anyAtomicType? := fn:string(.)
) as xs:integer
Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the argument is omitted, it defaults to fn:string(.).

If the value is the empty sequence, the function returns the xs:integer value 0. Otherwise, the value is cast to an xs:string, and an xs:integer is returned that reflects the number of characters in the string.

Error Conditions

If $value is not specified and the context value is absent, a type error is raised: [err:XPDY0002].

As a consequence of the rules given above, a type error is raised [err:XPTY0004] if the context value cannot be atomized, or if the result of atomizing the context value is a sequence containing more than one atomic item.

Notes

Unlike some programming languages, a codepoint greater than 65535 counts as one character, not two.
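The counting rule can be illustrated outside XPath. The following is a minimal sketch in Python, not a definition of the function: the helper name string_length is hypothetical, the empty sequence is modeled as None, and only string input is handled (the cast from other atomic types is not modeled). Python strings, like XPath strings, are counted in codepoints, so a character above 65535 has length 1 even though its UTF-16 encoding occupies two code units.

```python
from typing import Optional


def string_length(value: Optional[str]) -> int:
    """Hypothetical model of fn:string-length for string input only.

    None stands in for the empty sequence, which has length 0.
    """
    if value is None:
        return 0
    # len() on a Python str counts codepoints, not UTF-16 code units.
    return len(value)


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"

# Number of 16-bit code units needed to encode the same character in UTF-16.
utf16_units = len(clef.encode("utf-16-le")) // 2

print(string_length(clef))  # 1
print(utf16_units)          # 2
```

A language whose strings are sequences of UTF-16 code units would report the second figure for the same character.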

There are situations where fn:string-length() has a different effect from fn:string-length(.). These situations all involve nodes with non-trivial type annotations. For example:

  • If the context value is an attribute node typed as an xs:integer with the string value 000001, then fn:string-length() returns 6 (the length of the string value of the node), while fn:string-length(.) returns 1 (the length of the string that results from converting the typed value (1) to a string). In earlier versions of this specification this call would have failed with a type error.

  • If the context value is the element node <e> NN </e>, and has the type annotation xs:NCName, then fn:string-length() returns 4 (the length of the string value of the element node), while fn:string-length(.) returns 2 (the length of the typed value of the element node).

  • If the context value is the attribute node ref="A B C", and has the type annotation xs:IDREFS, then fn:string-length() returns 5 (the length of the string value of the attribute node), while fn:string-length(.) raises an error, because the atomized value of the attribute is a sequence of strings, and calling the fn:string function on a sequence of strings fails.
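The three cases above can be mimicked with a small Python model. This is only a sketch under stated assumptions: the function names are hypothetical, atomization is done by hand for each case rather than by a schema-aware processor, and the type error is represented by Python's TypeError.

```python
from typing import List


def length_of_string_value(string_value: str) -> int:
    """Models fn:string-length(): measures the node's string value."""
    return len(string_value)


def length_of_typed_value(atomized: List[str]) -> int:
    """Models fn:string-length(.): measures the atomized value.

    `atomized` is the typed value, with each atomic item already
    converted to a string (a hand-written stand-in for atomization).
    """
    if len(atomized) > 1:
        # More than one atomic item cannot be supplied as $value.
        raise TypeError("more than one atomic item")
    return len(atomized[0]) if atomized else 0


# Attribute typed as xs:integer, string value "000001": typed value is 1.
integer_case = (length_of_string_value("000001"),
                length_of_typed_value([str(int("000001"))]))

# Element <e> NN </e> typed as xs:NCName: whitespace is collapsed.
ncname_case = (length_of_string_value(" NN "),
               length_of_typed_value([" NN ".strip(" ")]))

# Attribute ref="A B C" typed as xs:IDREFS: typed value has three items.
idrefs_string_length = length_of_string_value("A B C")
try:
    length_of_typed_value("A B C".split(" "))
    idrefs_raises = False
except TypeError:
    idrefs_raises = True
```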

Examples
Expression:
string-length(
  "As long as a piece of string"
)
Result:
28
Expression:
"ᾧ" => string-length()
Result:
1
Expression:
"ᾧ" => normalize-unicode("NFD") => string-length()
Result:
4

(When a string consists of a base character followed by combining characters, each combining character counts as one character.)

Expression:
string-length(())
Result:
0
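The effect of decomposition on the character count, shown in the normalize-unicode example above, can be reproduced with Python's standard unicodedata module. This is an illustrative sketch rather than XPath, and it assumes the character used in that example is U+1FA7 (GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI).

```python
import unicodedata

# Assumption: the precomposed character in the example is U+1FA7.
composed = "\u1FA7"

# Canonical decomposition (NFD) splits it into a base letter
# followed by three combining marks.
decomposed = unicodedata.normalize("NFD", composed)

# List the codepoints of the decomposed form.
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

print(len(composed))    # 1
print(len(decomposed))  # 4
print(code_points)      # ['U+03C9', 'U+0314', 'U+0342', 'U+0345']
```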

5.4.8 fn:normalize-space

Changes in 4.0  

  1. The type of $value has been generalized to xs:anyAtomicType?.  [Issue 2279 PR 2286 17 November 2025]

Summary

Returns $value with leading and trailing whitespace removed, and sequences of internal whitespace reduced to a single space character.

Signature
fn:normalize-space(
  $value as xs:anyAtomicType? := string(.)
) as xs:string
Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the argument is omitted, it defaults to fn:string(.).

If the value is the empty sequence, the function returns a zero-length string. Otherwise, the value is cast to an xs:string, and a new string is constructed by stripping leading and trailing whitespace from the string, and by replacing sequences of one or more adjacent whitespace characters with a single space, U+0020 (SPACE).

The whitespace characters are defined in the metasymbol S (Production 3) of [Extensible Markup Language (XML) 1.0 (Fifth Edition)].
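The rules above can be sketched in Python as follows. This is a minimal model under stated assumptions, not an implementation of the function: the name normalize_space is a hypothetical helper, the empty sequence is modeled as None, and only string input is handled. The character class lists the four XML whitespace characters explicitly, because Python's own notion of whitespace (as used by str.split() and str.strip() without arguments) is broader.

```python
import re
from typing import Optional

# The four characters matched by the XML metasymbol S:
# space, tab, carriage return, line feed.
_XML_WHITESPACE_RUN = re.compile(r"[\x20\x09\x0D\x0A]+")


def normalize_space(value: Optional[str]) -> str:
    """Hypothetical model of fn:normalize-space for string input only."""
    if value is None:
        # The empty sequence yields the zero-length string.
        return ""
    # Replace each run of XML whitespace with one U+0020, then trim
    # the single space that may remain at either end.
    return _XML_WHITESPACE_RUN.sub(" ", value).strip(" ")


print(normalize_space("  a \t\n b  "))  # a b
```

Characters outside the S production, such as U+00A0 (NO-BREAK SPACE), are left untouched by this sketch.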

Error Conditions

If no argument is supplied and the context value is absent, a type error is raised [err:XPDY0002].

As a consequence of the rules given above, a type error is raised [err:XPTY0004] if the context value cannot be atomized, or if the result of atomizing the context value is a sequence containing more than one atomic item.

Notes

The definition of whitespace is unchanged in [Extensible Markup Language (XML) 1.1 Recommendation]. It is repeated here for convenience:

S ::= (#x20 | #x9 | #xD | #xA)+

There are situations where fn:normalize-space() has a different effect from fn:normalize-space(.). These situations all involve nodes with non-trivial type annotations. For example:

  • If the context value is an attribute node typed as an xs:integer with the string value 000001, then fn:normalize-space() returns "000001" (the string value of the node, whitespace-normalized), while fn:normalize-space(.) returns "1" (the typed value of the node, converted to a string and then normalized).

  • If the context value is the attribute node ref=" A B C ", and has the type annotation xs:IDREFS, then fn:normalize-space() returns "A B C" (the string value of the attribute node, whitespace-normalized), while fn:normalize-space(.) raises an error, because the typed value of the attribute is a sequence of strings, and calling the fn:string function on a sequence of strings fails.

The effect of fn:normalize-space is exactly the same as the effect of the whiteSpace=collapse facet in XSD. Since this facet is implicit for most XSD data types (with the notable exception of xs:string itself), nodes that are validated against types other than xs:string will tend to be implicitly in the form that normalize-space generates. Confusingly, the XSD type xs:normalizedString uses the facet whiteSpace=replace, which does not have the same effect as the normalize-space function: it replaces all whitespace characters by U+0020 (SPACE), but does not remove leading or trailing spaces, nor does it merge adjacent spaces.
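The difference between the two XSD facet values can be seen in a short Python sketch. The function names ws_replace and ws_collapse are hypothetical, and the code models only the facet behaviour described in the paragraph above, not schema validation as a whole.

```python
import re


def ws_replace(s: str) -> str:
    """Models the XSD 'replace' facet value: each tab, line feed and
    carriage return becomes one U+0020; nothing is trimmed or merged."""
    return re.sub(r"[\x09\x0A\x0D]", " ", s)


def ws_collapse(s: str) -> str:
    """Models the XSD 'collapse' facet value: as 'replace', then runs
    of spaces are merged and leading and trailing spaces are removed."""
    return re.sub(r" +", " ", ws_replace(s)).strip(" ")


sample = " A\t\tB \n"

replaced = ws_replace(sample)    # spaces preserved one-for-one
collapsed = ws_collapse(sample)  # same result as normalize-space
```

For this sample, the replaced form keeps its length and its leading and trailing spaces, while the collapsed form is reduced to "A B".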

Examples
Expression:
normalize-space(" The    wealthy curled darlings
           of    our    nation. ")
Result:
"The wealthy curled darlings of our nation."
Expression:
normalize-space(())
Result:
""