XPath and XQuery Functions and Operators 4.0

5 Processing strings

This section specifies functions and operators on the [XML Schema Part 2: Datatypes Second Edition] xs:string datatype and the datatypes derived from it.

5.4 Functions on string values

The following functions are defined on values of type xs:string and types derived from it.

Function	Meaning
`fn:char`	Returns a string containing a particular character or glyph.
`fn:characters`	Splits the supplied string into a sequence of single-character strings.
`fn:graphemes`	Splits the supplied string into a sequence of single-grapheme strings.
`fn:concat`	Returns the concatenation of the arguments, treated as sequences of strings.
`fn:string-join`	Returns a string created by concatenating the items in a sequence, with a defined separator between adjacent items.
`fn:substring`	Returns the part of `$value` beginning at the position indicated by `$start` and continuing for the number of characters indicated by `$length`.
`fn:string-length`	Returns the number of characters in a string.
`fn:normalize-space`	Returns `$value` with leading and trailing whitespace removed, and sequences of internal whitespace reduced to a single space character.
`fn:normalize-unicode`	Returns `$value` after applying Unicode normalization.
`fn:upper-case`	Converts a string to upper case.
`fn:lower-case`	Converts a string to lower case.
`fn:translate`	Returns `$value` modified by replacing or removing individual characters.
`fn:hash`	Returns the results of a specified hash, checksum, or cyclic redundancy check function applied to the input.

Notes:

When the above operators and functions are applied to datatypes derived from xs:string, they are guaranteed to return values that are instances of xs:string, but the value might or might not be an instance of the particular subtype of xs:string to which they were applied.

The strings returned by fn:concat and fn:string-join are not guaranteed to be normalized, but see the note in fn:concat.

5.4.7 fn:string-length

Changes in 4.0

The type of $value has been generalized to xs:anyAtomicType?. [Issue 2279 PR 2286 17 November 2025]

Summary

Returns the number of characters in a string.

Signature

`fn:string-length`(
`$value`	`as` `xs:anyAtomicType?`	`:=` `fn:string(.)`
) `as` `xs:integer`

Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the argument is omitted, it defaults to fn:string(.).

If the value is the empty sequence, the function returns the xs:integer value 0. Otherwise, the value is cast to an xs:string, and an xs:integer is returned that reflects the number of characters in the string.
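The rule above can be sketched in Python as a rough model, not as the normative definition. The function name `string_length` is hypothetical, `None` stands in for the empty sequence, and `str()` stands in for the cast to xs:string (an assumption that only holds for the string and integer inputs used here).

```python
from typing import Optional, Union


def string_length(value: Optional[Union[str, int]]) -> int:
    """Hypothetical model of fn:string-length for an already-atomized argument."""
    if value is None:
        # None models the empty sequence, for which the result is 0.
        return 0
    # str() models the cast to xs:string (assumption: str or int input only).
    text = str(value)
    # Python's len() counts Unicode code points, not bytes or UTF-16 units.
    return len(text)
```

Under these assumptions, `string_length(None)` gives 0 and `string_length(1)` gives 1, mirroring the two branches of the rule.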

Error Conditions

If $value is not specified and the context value is absent^DM, a type error is raised: [err:XPDY0002]^XP.

As a consequence of the rules given above, a type error is raised [err:XPTY0004]^XP if the context value cannot be atomized, or if the result of atomizing the context value is a sequence containing more than one atomic item.

Notes

Unlike some programming languages, a codepoint greater than 65535 counts as one character, not two.
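The difference between counting characters and counting UTF-16 code units can be illustrated with a short Python sketch. Python is used here only because its `len()` also counts code points; the sample character is an arbitrary choice.

```python
# U+1D11E MUSICAL SYMBOL G CLEF has code point 119070, which is above 65535.
clef = "\U0001D11E"

# Counting code points: this is the measure fn:string-length uses.
code_points = len(clef)

# Counting UTF-16 code units: the measure used by languages whose strings
# are sequences of 16-bit units. Each unit occupies two bytes.
utf16_units = len(clef.encode("utf-16-le")) // 2

print(code_points, utf16_units)  # prints "1 2"
```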

There are situations where fn:string-length() has a different effect from fn:string-length(.). These situations all involve nodes with non-trivial type annotations. For example:

If the context value is an attribute node typed as an xs:integer with the string value 000001, then fn:string-length() returns 6 (the length of the string value of the node), while fn:string-length(.) returns 1 (the length of the string that results from converting the typed value (1) to a string). In earlier versions of this specification this call would have failed with a type error.
If the context value is the element node <e> NN </e>, and has the type annotation xs:NCName, then fn:string-length() returns 4 (the length of the string value of the element node), while fn:string-length(.) returns 2 (the length of the typed value of the element node).
If the context value is the attribute node ref="A B C", and has the type annotation xs:IDREFS, then fn:string-length() returns 5 (the length of the string value of the attribute node), while fn:string-length(.) raises an error, because the atomized value of the attribute is a sequence of strings, and calling the fn:string function on a sequence of strings fails.
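The three cases above can be modelled with a small Python sketch. This is an illustration under stated assumptions, not how a processor is implemented: the `Node` class and both function names are hypothetical, `typed_value` holds the result of atomization as a list, and a Python `TypeError` stands in for the error raised by the function call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a typed attribute or element node."""
    string_value: str
    typed_value: List[object]  # result of atomizing the node


def length_no_arg(node: Node) -> int:
    # Models fn:string-length(): the default argument is fn:string(.),
    # so the length of the node's string value is returned.
    return len(node.string_value)


def length_of_dot(node: Node) -> int:
    # Models fn:string-length(.): the node is atomized first.
    atoms = node.typed_value
    if len(atoms) > 1:
        raise TypeError("atomized value contains more than one item")
    if not atoms:
        return 0
    # str() models the cast to xs:string (assumption: str or int items only).
    return len(str(atoms[0]))


int_attr = Node("000001", [1])             # typed as xs:integer
ncname_elem = Node(" NN ", ["NN"])         # typed as xs:NCName
idrefs_attr = Node("A B C", ["A", "B", "C"])  # typed as xs:IDREFS
```

With these values, `length_no_arg` gives 6, 4, and 5, while `length_of_dot` gives 1 and 2 for the first two nodes and raises for the third.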

Examples

Expression:	string-length( "As long as a piece of string" )
Result:	28
Expression:	"ᾧ" => string-length()
Result:	1
Expression:	"ᾧ" => normalize-unicode("NFD") => string-length()
Result:	4 (For strings that consist of a base character with combining characters, each combining character is length 1.)
Expression:	`string-length(())`
Result:	`0`
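The effect of decomposition on the length can be reproduced outside XPath. The following Python sketch uses the standard `unicodedata` module as a stand-in for fn:normalize-unicode; it is offered as a cross-check of the example, not as part of the specification.

```python
import unicodedata

# U+1FA7 GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
composed = "\u1FA7"

# NFD splits the precomposed character into a base letter followed by
# three combining marks, each of which counts as one character.
decomposed = unicodedata.normalize("NFD", composed)

# List the code points of the decomposed form for inspection.
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]
print(len(composed), len(decomposed), code_points)
```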

5.4.8 fn:normalize-space

Changes in 4.0

The type of $value has been generalized to xs:anyAtomicType?. [Issue 2279 PR 2286 17 November 2025]

Summary

Returns $value with leading and trailing whitespace removed, and sequences of internal whitespace reduced to a single space character.

Signature

`fn:normalize-space`(
`$value`	`as` `xs:anyAtomicType?`	`:=` `fn:string(.)`
) `as` `xs:string`

Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one-argument form of this function is deterministic, context-independent, and focus-independent.

Rules

If the argument is omitted, it defaults to fn:string(.).

If the value is the empty sequence, the function returns a zero-length string. Otherwise, the value is cast to an xs:string, and a new string is constructed by stripping leading and trailing whitespace from the string, and by replacing sequences of one or more adjacent whitespace characters with a single space, U+0020 (SPACE).

The whitespace characters are defined in the metasymbol S (Production 3) of [Extensible Markup Language (XML) 1.0 (Fifth Edition)].
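The rules above can be sketched in Python as follows. This is a minimal model rather than a reference implementation: the name `normalize_space` is hypothetical, `None` stands in for the empty sequence, and the argument is assumed to be a string already. Only the four characters of the XML S production are treated as whitespace, which is why Python's broader `str.split()` and `str.strip()` defaults are avoided.

```python
import re
from typing import Optional

# The four whitespace characters of the XML S production:
# U+0020 (space), U+0009 (tab), U+000D (carriage return), U+000A (line feed).
_XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(value: Optional[str]) -> str:
    """Hypothetical model of fn:normalize-space for a string argument."""
    if value is None:
        # None models the empty sequence, which yields a zero-length string.
        return ""
    # Replace every run of XML whitespace with a single space...
    collapsed = _XML_WHITESPACE_RUN.sub(" ", value)
    # ...then drop the single space that may remain at either end.
    return collapsed.strip(" ")
```

Other Unicode space characters, such as U+00A0 (NO-BREAK SPACE), are outside the S production and are left untouched by this sketch.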

Error Conditions

If no argument is supplied and the context value is absent^DM, a type error is raised [err:XPDY0002]^XP.

Notes

The definition of whitespace is unchanged in [Extensible Markup Language (XML) 1.1 Recommendation]. It is repeated here for convenience:

S ::= (#x20 | #x9 | #xD | #xA)+

There are situations where fn:normalize-space() has a different effect from fn:normalize-space(.). These situations all involve nodes with non-trivial type annotations. For example:

If the context value is an attribute node typed as an xs:integer with the string value 000001, then fn:normalize-space() returns "000001" (the string value of the node, whitespace-normalized), while fn:normalize-space(.) returns "1" (the typed value of the node, converted to a string and then normalized).
If the context value is the attribute node ref=" A B C ", and has the type annotation xs:IDREFS, then fn:normalize-space() returns "A B C" (the string value of the attribute node, whitespace-normalized), while fn:normalize-space(.) raises an error, because the typed value of the attribute is a sequence of strings, and calling the fn:string function on a sequence of strings fails.

The effect of fn:normalize-space is exactly the same as the effect of the whitespace=collapse facet in XSD. Since this facet is implicit for most XSD data types (with the notable exception of xs:string itself), nodes that are validated against types other than xs:string will tend to be implicitly in the form that fn:normalize-space generates. Confusingly, the XSD type xs:normalizedString uses the facet whitespace=replace, which does not have the same effect as the fn:normalize-space function: it replaces all whitespace characters by U+0020 (SPACE), but does not remove leading or trailing spaces, nor does it merge adjacent spaces.
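The contrast between the two XSD facets can be made concrete with a Python sketch. The function names `ws_replace` and `ws_collapse` are hypothetical, and the code models the facet definitions only; it is not taken from any schema processor.

```python
import re

# Maps tab, line feed, and carriage return to U+0020 (SPACE).
_TO_SPACE = str.maketrans("\t\n\r", "   ")


def ws_replace(text: str) -> str:
    """Models whitespace=replace, as used by xs:normalizedString."""
    # Each whitespace character becomes a space; nothing is removed or merged.
    return text.translate(_TO_SPACE)


def ws_collapse(text: str) -> str:
    """Models whitespace=collapse, which matches fn:normalize-space."""
    # Apply the replace step, merge runs of spaces, then trim both ends.
    return re.sub(r" +", " ", ws_replace(text)).strip(" ")


sample = " a\t\tb \n"
replaced = ws_replace(sample)    # same length as the input
collapsed = ws_collapse(sample)  # trimmed, with single internal spaces
```

For this sample the replaced form keeps all seven characters, while the collapsed form is reduced to `"a b"`.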

Examples

Expression:	normalize-space(" The wealthy curled darlings of our nation. ")
Result:	"The wealthy curled darlings of our nation."
Expression:	`normalize-space(())`
Result:	`""`

W3C Editor's Draft 23 February 2026
