This draft contains only sections that have differences from the version that it modified.

XPath and XQuery Functions and Operators 4.0

W3C Editor's Draft 23 February 2026

This version:
https://qt4cg.org/specifications/xpath-functions-40/
Latest version of XPath and XQuery Functions and Operators 4.0:
https://qt4cg.org/specifications/xpath-functions-40/
Most recent Recommendation of XPath and XQuery Functions and Operators:
https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

Please check the errata for any errors or issues reported since publication.

See also translations.

This document is also available in these non-normative formats: Specification in XML format and XML function catalog.


Abstract

This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 3.1]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.0: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.

A summary of changes since version 3.1 is provided at G Changes since 3.1.

Status of this Document

This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


14 Processing sequences

A sequence is an ordered collection of zero or more items. An item is a node, an atomic item, or a function, such as a map or an array. The terms sequence and item are defined formally in [XQuery 4.0: An XML Query Language] and [XML Path Language (XPath) 4.0].

14.6 Functions giving access to external information

The functions in this section provide access to resources (such as files) in the external environment.

| Function | Meaning |
|---|---|
| fn:doc | Retrieves a document using a URI supplied as an xs:string, and returns the corresponding document node. |
| fn:doc-available | The function returns true if and only if the function call fn:doc($source, $options) would return a document node. |
| fn:collection | Returns a sequence of items identified by a collection URI; or a default collection if no URI is supplied. |
| fn:uri-collection | Returns a sequence of xs:anyURI values representing the URIs in a URI collection. |
| fn:unparsed-text | The fn:unparsed-text function reads an external resource (for example, a file) and returns a string representation of the resource. |
| fn:unparsed-text-lines | The fn:unparsed-text-lines function reads an external resource (for example, a file) and returns its contents as a sequence of strings, one for each line of text in the string representation of the resource. |
| fn:unparsed-text-available | Allows an application to determine whether a call on fn:unparsed-text with particular arguments would succeed. |
| fn:environment-variable | Returns the value of a system environment variable, if it exists. |
| fn:available-environment-variables | Returns a list of environment variable names that are suitable for passing to fn:environment-variable, as a (possibly empty) sequence of strings. |

14.6.1 fn:doc

Changes in 4.0  

  1. The rule that multiple calls on fn:doc supplying the same absolute URI must return the same document node has been clarified; in particular the rule does not apply if the dynamic context for the two calls requires different processing of the documents (such as schema validation or whitespace stripping).  [Issue 898 PR 905 9 January 2024]

  2. An $options parameter is added. Note that the rules for the $options parameter control aspects of processing that were implementation-defined in earlier versions of this specification. An implementation may provide configuration options designed to retain backwards-compatible behavior when no explicit options are supplied.  [Issue 1021 PR 1910 6 April 2025]

Summary

Retrieves a document using a URI supplied as an xs:string, and returns the corresponding document node.

Signature
fn:doc(
  $source as xs:string?,
  $options as map(*)? := {}
) as document-node()?
Properties

This function is deterministic, context-dependent, and focus-independent. It depends on available documents, and static base URI.

Rules

If $source is the empty sequence, the result is an empty sequence.

If $source is a relative URI reference, it is resolved relative to the value of the static base URI property from the static context. The resulting absolute URI is cast to an xs:string.

If the available documents described in Section 2.1.2 Dynamic Context [XP31] provides a mapping from this string to a document node, the function returns that document node.

The URI may include a fragment identifier.
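The resolution and lookup rules above can be pictured with a short Python sketch. It is illustrative only and not part of this specification: the helper `lookup_document`, the `DynamicError` class, the dictionary standing in for the available documents, and the use of Python's `urllib.parse.urljoin` to perform the resolution are all assumptions made for the example.

```python
from urllib.parse import urljoin, urlsplit


class DynamicError(Exception):
    """Hypothetical stand-in for an XPath dynamic error with an error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def lookup_document(source, static_base_uri, available_documents):
    """Resolve `source` and look it up in a URI-to-document mapping."""
    # An empty sequence (modelled as None) yields an empty sequence.
    if source is None:
        return None
    if urlsplit(source).scheme == "":
        # Relative reference: a static base URI is required.
        if static_base_uri is None:
            raise DynamicError("FODC0002", "static base URI is absent")
        absolute = urljoin(static_base_uri, source)
    else:
        absolute = source
    # The mapping is keyed by the absolute URI as a string.
    if absolute not in available_documents:
        raise DynamicError("FODC0002", f"no document available for {absolute}")
    return available_documents[absolute]
```

With a hypothetical static base URI of `http://example.org/app/query.xq`, the relative reference `../data/foo.xml` resolves to `http://example.org/data/foo.xml` before the lookup takes place.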

The $options argument, if present and non-empty, defines the detailed behavior of the function. The option parameter conventions apply. The options available are as follows:

record(
  dtd-validation? as xs:boolean,
  allow-external-entities? as xs:boolean,
  entity-expansion-limit? as xs:integer?,
  stable? as xs:boolean,
  strip-space? as xs:boolean?,
  xinclude? as xs:boolean,
  xsd-validation? as xs:string,
  xsi-schema-location? as xs:boolean
)
Key / Value / Meaning

dtd-validation?

Determines whether DTD validation takes place.
  • Type: xs:boolean

  • Default: false()

true: The input is parsed using a validating XML parser. The input must contain a DOCTYPE declaration to identify the DTD to be used for validation. The DTD may be internal or external.
false: DTD validation does not take place. However, if a DOCTYPE declaration is present, then it is read, for example to perform entity expansion.

allow-external-entities?

Determines whether references to external entities (including a DTD entity) are permitted.
  • Type: xs:boolean

  • Default: true()

true: References to external entities are permitted, and are resolved relative to the base URI.
false: References to external entities (including an external DTD) are not permitted; if any are present, the call on fn:doc fails with a dynamic error.

entity-expansion-limit?

Places a limit on the maximum number of entity references that may be expanded, or on the size of the expanded entities. The limit applies both to internal and external entities, but not to built-in entity references, nor to character references.
  • Type: xs:integer?

  • Default: ()

(): The limit (if any) is implementation-dependent.
integer: The processor should impose a limit on the number of entity references that are expanded, or on the size of the expanded entities, depending on the options available in the underlying XML parser; the limit should be commensurate with the value requested, but the precise effect may be implementation-dependent. If the XML parser does not offer the ability to impose a limit, or if the value is zero, then entity expansion should if possible be disabled entirely, leading to a dynamic error if the input contains any entity references. A negative value should be interpreted as placing no limits on entity expansion.

stable?

Determines whether two calls on the doc function, with the same URI, the same options, and the same context, are guaranteed to return the same document node. The default value is true, but this may be overridden by implementation-defined configuration options.
  • Type: xs:boolean

  • Default: true()

true: Given the same explicit and implicit arguments, multiple calls return the same document node: that is, the function is deterministic.
false: Multiple calls with the same explicit and implicit arguments may return the same document node or different document nodes at the discretion of the implementation.

strip-space?

Determines whether whitespace-only text nodes are removed from the resulting document. The default is defined by the host language or by the implementation. (Note: in XSLT, the xsl:strip-space and xsl:preserve-space declarations provide detailed control based on the parent element name.)
  • Type: xs:boolean?

  • Default: ()

true: All whitespace-only text nodes are stripped, unless either (a) they are within the scope of the attribute xml:space="preserve", or (b) XSD validation identifies that the parent element has a simple type or a complex type with simple content.
false: All whitespace-only text nodes are preserved, unless either (a) DTD validation marks them as ignorable, or (b) XSD validation recognizes the containing element as having element-only or empty content.

xinclude?

Determines whether any xi:include elements in the input are to be processed using an XInclude processor.
  • Type: xs:boolean

  • Default: false()

true: Any xi:include elements are expanded. If there are xi:include elements and no XInclude processor is available then a dynamic error is raised.
false: Any xi:include elements are handled as ordinary elements without expansion.

xsd-validation?

Determines whether XSD validation takes place, using the schema definitions present in the static context. The effect of requesting validation is the same as invoking the doc function without validation, and then applying an XQuery validate expression to the result, with corresponding options.
  • Type: xs:string

  • Default: "skip"

strict: Strict XSD validation takes place.
lax: Lax XSD validation takes place.
skip: No XSD validation takes place.
type Q{uri}local: XSD validation takes place against the schema-defined type, present in the static context, that has the given URI and local name.

xsi-schema-location?

When XSD validation takes place, determines whether schema components referenced using xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes within the source document are to be used. The option is ignored if XSD validation does not take place.
  • Type: xs:boolean

  • Default: false()

true: XSD validation uses the schema components referenced using xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes in addition to the schema components present in the static context; these components must be compatible as described in Section 4.1.2 Schema Consistency [DM].
false: Any xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the document are ignored.
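As an informal illustration of the defaults listed above, the following Python sketch overlays caller-supplied options on the default values and classifies an entity-expansion-limit value. The names `DOC_OPTION_DEFAULTS`, `effective_doc_options`, and `interpret_entity_expansion_limit` are hypothetical. The sketch does not model the option parameter conventions (for example, the handling of unrecognized keys or wrongly typed values), nor any external configuration that overrides a default.

```python
# Defaults taken from the option descriptions; None models the empty sequence ().
DOC_OPTION_DEFAULTS = {
    "dtd-validation": False,
    "allow-external-entities": True,
    "entity-expansion-limit": None,  # (): limit, if any, is implementation-dependent
    "stable": True,
    "strip-space": None,             # (): host language or implementation decides
    "xinclude": False,
    "xsd-validation": "skip",
    "xsi-schema-location": False,
}


def effective_doc_options(options=None):
    """Overlay the caller's options on the documented defaults."""
    effective = dict(DOC_OPTION_DEFAULTS)
    effective.update(options or {})
    return effective


def interpret_entity_expansion_limit(limit):
    """Classify an entity-expansion-limit value as the description suggests."""
    if limit is None:
        return "implementation-dependent"
    if limit < 0:
        return "unlimited"        # negative: no limit on entity expansion
    if limit == 0:
        return "disabled"         # zero: entity expansion disabled if possible
    return f"limited (about {limit})"
```

A call such as `effective_doc_options({"strip-space": True})` leaves every other option at its default, which mirrors a call on fn:doc that supplies only that one key.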

By default, this function is deterministic. Two calls on this function return the same document node if the same URI Reference (after resolution to an absolute URI Reference) is supplied to both calls. Thus, the following expression (if it does not raise an error) will always return true:

doc("foo.xml") is doc("foo.xml")

Note:

This equivalence applies only because the two calls on the fn:doc function have the same options and the same static and dynamic context, to the extent this is relevant. If two calls on fn:doc have different dynamic contexts, then the mapping from URIs to document nodes in the two contexts may differ, which means that different document nodes may be returned for the same URI. This can happen, for example, if the two calls appear in different XSLT packages with different validation options or whitespace-stripping options; one call might produce a schema-validated document, the other an untyped document.

The requirement to deliver a deterministic result has performance implications, and for this reason implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call of the function must either return a deterministic result or must raise a dynamic error [err:FODC0003].
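One way to picture the determinism requirement is a pool that remembers the document node built for each absolute URI and set of options. The Python sketch below is an assumption-laden illustration, not a description of any implementation: `Document`, `DocumentPool`, and the `loader` callable are hypothetical, and the sketch does not model the case where determinism cannot be guaranteed and [err:FODC0003] is raised.

```python
class Document:
    """Hypothetical stand-in for a document node; only identity matters here."""

    def __init__(self, uri):
        self.uri = uri


class DocumentPool:
    """Returns the same node for the same absolute URI and options when stable."""

    def __init__(self, loader):
        self._loader = loader  # callable: absolute URI -> Document
        self._cache = {}       # (absolute URI, options) -> Document

    def doc(self, absolute_uri, options=None, stable=True):
        if not stable:
            # No guarantee requested. This sketch always builds a fresh node,
            # although returning a cached one would also be permitted.
            return self._loader(absolute_uri)
        key = (absolute_uri, tuple(sorted((options or {}).items())))
        if key not in self._cache:
            self._cache[key] = self._loader(absolute_uri)
        return self._cache[key]
```

With `pool = DocumentPool(Document)`, two stable calls for the same URI return the identical object, which corresponds to `doc("foo.xml") is doc("foo.xml")` being true; calls that differ in their options are cached separately.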

Note:

If the $source URI is obtained from a source document, it is generally appropriate to resolve it relative to the base URI property of the relevant node in the source document. This can be achieved by calling the fn:resolve-uri function, and passing the resulting absolute URI as an argument to the fn:doc function.

If two calls to this function supply different absolute URI References as arguments, the same document node may be returned if the implementation can determine that the two arguments refer to the same resource.

By defining the semantics of this function in terms of a string-to-document-node mapping in the dynamic context, the specification is acknowledging that the results of this function are outside the purview of the language specification itself, and depend entirely on the run-time environment in which the expression is evaluated. This run-time environment includes not only an unpredictable collection of resources (“the web”), but configurable machinery for locating resources and turning their contents into document nodes within the XPath data model. Both the set of resources that are reachable, and the mechanisms by which those resources are parsed and validated, are implementation-dependent.

One possible processing model for this function is as follows. The resource identified by the URI Reference is retrieved. If the resource cannot be retrieved, a dynamic error is raised [err:FODC0002]. The data resulting from the retrieval action is then parsed as an XML document and a tree is constructed in accordance with the [XQuery and XPath Data Model (XDM) 3.0]. If the top-level media type is known and is "text", the content is parsed in the same way as if the media type were text/xml; otherwise, it is parsed in the same way as if the media type were application/xml. If the contents cannot be parsed successfully, a dynamic error is raised [err:FODC0002]. Otherwise, the result of the function is the document node at the root of the resulting tree. This tree is then optionally validated against a schema.
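The retrieval and parsing steps of this processing model can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `retrieve_and_parse` and `DynamicError` are hypothetical names, Python's `urllib.request` and `xml.etree.ElementTree` stand in for the retrieval machinery and the XML parser, and media-type handling and schema validation are not modelled.

```python
import urllib.request
import xml.etree.ElementTree as ET


class DynamicError(Exception):
    """Hypothetical stand-in for an XPath dynamic error with an error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def retrieve_and_parse(absolute_uri):
    """Retrieve the resource, parse it as XML, and return the resulting tree."""
    try:
        with urllib.request.urlopen(absolute_uri) as response:
            data = response.read()
    except (OSError, ValueError) as exc:
        # URLError is a subclass of OSError; ValueError covers malformed URLs.
        raise DynamicError("FODC0002", f"cannot retrieve {absolute_uri}") from exc
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        raise DynamicError("FODC0002", f"cannot parse {absolute_uri}") from exc
    # The ElementTree object plays the role of the document node.
    return ET.ElementTree(root)
```

Both failure modes map to the same error code, matching the two uses of [err:FODC0002] in the paragraph above.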

Various aspects of this processing are implementation-defined. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:

  • The set of URI schemes that the implementation recognizes is implementation-defined. Implementations may allow the mapping of URIs to resources to be configured by the user, using mechanisms such as catalogs or user-written URI handlers.

  • The handling of non-XML media types is implementation-defined. Implementations may allow instances of the data model to be constructed from non-XML resources, under user control.

  • It is implementation-defined whether DTD validation and/or schema validation is applied to the source document.

  • Implementations may provide user-defined error handling options that allow processing to continue following an error in retrieving a resource, or in parsing and validating its content. When errors have been handled in this way, the function may return either an empty sequence, or a fallback document provided by the error handler.

  • Implementations may provide user options that relax the requirement for the function to return deterministic results.

  • The effect of a fragment identifier in the supplied URI is implementation-defined. One possible interpretation is to treat the fragment identifier as an ID attribute value, and to return a document node having the element with the selected ID value as its only child.

Error Conditions

A dynamic error may be raised [err:FODC0005] if $source is not a valid URI reference.

A dynamic error is raised [err:FODC0002] if a relative URI reference is supplied, and the base-URI property in the static context is absent.

A dynamic error is raised [err:FODC0002] if the available documents provides no mapping for the absolutized URI.

A dynamic error is raised [err:FODC0002] if the resource cannot be retrieved or cannot be parsed successfully as XML using the selected options.

A dynamic error is raised [err:FODC0003] if the implementation is not able to guarantee that the result of the function will be deterministic, and the user has not indicated that an unstable result is acceptable.

15 Parsing and serializing

These functions convert between the lexical representation and XPath and XQuery data model representation of various file formats.

15.1 Functions on XML Data

These functions convert between the lexical representation of XML and the tree representation.

(The fn:serialize function also handles HTML and JSON output, but is included in this section for editorial convenience.)

| Function | Meaning |
|---|---|
| fn:parse-xml | This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document. |
| fn:parse-xml-fragment | This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment. |
| fn:serialize | This function serializes the supplied input sequence $input as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string. |

15.1.1 fn:parse-xml

Changes in 4.0  

  1. The $options parameter has been added.  [Issue 305 PR 1257 11 June 2024]

  2. Additional error conditions have been defined.  [Issue 1287 PR 1288 25 June 2024]

  3. Additional options to control DTD and XInclude processing have been added.  [Issues 1857 1860]

Summary

This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document.

Signature
fn:parse-xml(
  $value as xs:string?,
  $options as map(*)? := {}
) as document-node(*)?
Properties

This function is nondeterministic, context-dependent, and focus-independent. It depends on static base URI.

Rules

If $value is the empty sequence, the function returns the empty sequence.

Because the input is supplied as a string, not as an octet stream, the encoding specified in the XML declaration (if present) should be ignored. For similar reasons, any initial byte order mark (codepoint U+FEFF) should be ignored.
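The following Python sketch illustrates the rule above. It is not a definitive implementation: `parse_xml_string` is a hypothetical helper, and Python's `xml.etree.ElementTree` stands in for the XML parser. Because the input is a Python `str` (already decoded characters), the parser is given no octets to decode, so the encoding named in the XML declaration plays no part; the byte order mark is removed explicitly before parsing.

```python
import xml.etree.ElementTree as ET


def parse_xml_string(value):
    """Parse XML supplied as a string, ignoring a leading BOM."""
    # An empty sequence (modelled as None) yields an empty sequence.
    if value is None:
        return None
    # Drop a single leading byte order mark (U+FEFF) if one is present.
    if value.startswith("\ufeff"):
        value = value[1:]
    # The input is a str, not bytes, so the declared encoding is not used
    # to decode anything; the characters are taken as they are.
    return ET.fromstring(value)
```

For instance, a string that begins with U+FEFF and declares `encoding="ISO-8859-1"` is parsed from its characters as given, whatever encoding the declaration names.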

The $options argument, if present and non-empty, defines the detailed behavior of the function. The option parameter conventions apply. The options available are as follows:

record(
  base-uri? as xs:anyURI,
  dtd-validation? as xs:boolean,
  allow-external-entities? as xs:boolean,
  entity-expansion-limit? as xs:integer?,
  strip-space? as xs:boolean,
  xinclude? as xs:boolean,
  xsd-validation? as xs:string,
  xsi-schema-location? as xs:boolean
)
Key / Value / Meaning

base-uri?

Determines the base URI. This is used both as the base URI used by the XML parser to resolve relative entity references within the document, and as the base URI of the document node that is returned. It defaults to the static base URI of the function call.
  • Type: xs:anyURI

  • Default: static-base-uri()

dtd-validation?

Determines whether DTD validation takes place.
  • Type: xs:boolean

  • Default: false()

true: The input is parsed using a validating XML parser. The input must contain a DOCTYPE declaration to identify the DTD to be used for validation. The DTD may be internal or external.
false: DTD validation does not take place. However, if a DOCTYPE declaration is present, then it is read, for example to perform entity expansion.

allow-external-entities?

Determines whether references to external entities (including a DTD entity) are permitted.
  • Type: xs:boolean

  • Default: true()

true: References to external entities are permitted, and are resolved relative to the base URI.
false: References to external entities (including an external DTD) are not permitted; if any are present, the call on fn:parse-xml fails with a dynamic error.

entity-expansion-limit?

Places a limit on the maximum number of entity references that may be expanded, or on the size of the expanded entities. The limit applies both to internal and external entities, but not to built-in entity references, nor to character references.
  • Type: xs:integer?

  • Default: ()

(): The limit (if any) is implementation-dependent.
integer: The processor should impose a limit on the number of entity references that are expanded, or on the size of the expanded entities, depending on the options available in the underlying XML parser; the limit should be commensurate with the value requested, but the precise effect may be implementation-dependent. If the XML parser does not offer the ability to impose a limit, or if the value is zero, then entity expansion should if possible be disabled entirely, leading to a dynamic error if the input contains any entity references. A negative value should be interpreted as placing no limits on entity expansion.

strip-space?

Determines whether whitespace-only text nodes are removed from the resulting document. (Note: in XSLT, the xsl:strip-space and xsl:preserve-space declarations are ignored.)
  • Type: xs:boolean

  • Default: false()

true: All whitespace-only text nodes are stripped, unless either (a) they are within the scope of the attribute xml:space="preserve", or (b) XSD validation identifies that the parent element has a simple type or a complex type with simple content.
false: All whitespace-only text nodes are preserved, unless either (a) DTD validation marks them as ignorable, or (b) XSD validation recognizes the containing element as having element-only or empty content.

xinclude?

Determines whether any xi:include elements in the input are to be processed using an XInclude processor.
  • Type: xs:boolean

  • Default: false()

true: Any xi:include elements are expanded. If there are xi:include elements and no XInclude processor is available then a dynamic error is raised.
false: Any xi:include elements are handled as ordinary elements without expansion.

xsd-validation?

Determines whether XSD validation takes place, using the schema definitions present in the static context. The effect of requesting validation is the same as invoking the parse-xml function without validation, and then applying an XQuery validate expression to the result, with corresponding options.
  • Type: xs:string

  • Default: "skip"

strict: Strict XSD validation takes place.
lax: Lax XSD validation takes place.
skip: No XSD validation takes place.
type Q{uri}local: XSD validation takes place against the schema-defined type, present in the static context, that has the given URI and local name.

xsi-schema-location?

When XSD validation takes place, determines whether schema components referenced using xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes within the source document are to be used. The option is ignored if XSD validation does not take place.
  • Type: xs:boolean

  • Default: false()

true: XSD validation uses the schema components referenced using xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes in addition to the schema components present in the static context; these components must be compatible as described in Section 4.1.2 Schema Consistency [DM].
false: Any xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the document are ignored.

Except to the extent defined by these options, the precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.

The document URI of the returned node is absent [DM].

The function is not deterministic: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same node is returned on both occasions.

Options set in $options may be supplemented or modified based on configuration options defined externally using implementation-defined mechanisms.

Error Conditions

A dynamic error is raised [err:FODC0006] if the content of $value is not a well-formed and namespace-well-formed XML document.

A dynamic error is raised [err:FODC0007] if DTD validation is carried out and the content of $value is not valid against the relevant DTD.

A dynamic error is raised [err:FODC0008] if the value of the xsd-validation option is not one of the permitted values (for example, if the string that follows "type" is not a valid EQName, or if it does not identify a type that is present in the static context).

A dynamic error is raised [err:FODC0009] if the value of the xsd-validation option is set to anything other than skip when the processor is not schema-aware. (XSLT 4.0 and XQuery 4.0 define schema-awareness as an optional feature; other host languages may set their own rules.)

A dynamic error is raised [err:FODC0013] if the processor does not have access to an XML parser supporting the requested options, for example the ability to perform DTD validation or XInclude processing or to prevent access to external entities.

A dynamic error is raised [err:FODC0014] if XSD validation is carried out and the content of $value is not valid against the relevant XSD schema.
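The checks on the xsd-validation option described above can be sketched in Python. Several details are assumptions made for the example rather than rules of this specification: the names `parse_xsd_validation` and `DynamicError`, the modelling of in-scope types as a set of (URI, local name) pairs, the simplified pattern used in place of the full EQName grammar, and the order in which the two errors are checked.

```python
import re


class DynamicError(Exception):
    """Hypothetical stand-in for an XPath dynamic error with an error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Simplified form of "type Q{uri}local"; not the full EQName grammar.
_TYPE_PATTERN = re.compile(r"type\s+Q\{([^{}]*)\}([^\s{}:]+)")


def parse_xsd_validation(value, schema_aware=True, known_types=frozenset()):
    """Return (mode, type_name) for an xsd-validation option value."""
    if value in ("strict", "lax", "skip"):
        mode, type_name = value, None
    else:
        match = _TYPE_PATTERN.fullmatch(value)
        if match is None:
            raise DynamicError("FODC0008", f"invalid xsd-validation value: {value!r}")
        type_name = (match.group(1), match.group(2))
        if type_name not in known_types:
            raise DynamicError("FODC0008", f"unknown type: {value!r}")
        mode = "type"
    # Anything other than skip needs a schema-aware processor.
    if mode != "skip" and not schema_aware:
        raise DynamicError("FODC0009", "processor is not schema-aware")
    return mode, type_name
```

A value of `"skip"` is accepted even when the processor is not schema-aware, since no validation is requested in that case.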

Notes

Since the XML document is presented to the parser as a string, rather than as a sequence of octets, the encoding specified within the XML declaration has no meaning. If the XML parser accepts input only in the form of a sequence of octets, then the processor must ensure that the string is encoded as octets in a way that is consistent with rules used by the XML parser to detect the encoding.

A common use case for this function is to handle input documents that contain nested XML documents embedded within CDATA sections. Since the contents of the CDATA section are exposed as text, the receiving query or stylesheet may pass this text to the fn:parse-xml function to create a tree representation of the nested document.

Similarly, nested XML within comments is sometimes encountered, and lexical XML is sometimes returned by extension functions, for example, functions that access web services or read from databases.

A use case arises in XSLT where there is a need to preprocess an input document before parsing. For example, an application might wish to edit the document to remove its DOCTYPE declaration. This can be done by reading the raw text using the fn:unparsed-text function, editing the resulting string, and then passing it to the fn:parse-xml function.

Examples

The expression fn:parse-xml("<alpha>abcd</alpha>") returns a newly created document node, having an alpha element as its only child; the alpha element in turn is the parent of a text node whose string value is "abcd".

The expression fn:parse-xml("<alpha><beta> </beta></alpha>", { "strip-space": true() }) returns a newly created document node, having an alpha element as its only child; the alpha element in turn is the parent of a beta element whose content is empty, as a result of whitespace stripping.
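The whitespace stripping seen in the second example can be modelled informally in Python. The sketch below is an illustration under stated assumptions: `strip_whitespace_only_text` is a hypothetical helper operating on an `xml.etree.ElementTree` tree, where text nodes appear as the `text` and `tail` properties of elements. It honours xml:space="preserve" but does not model the exceptions that depend on DTD or XSD validation.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def _is_whitespace_only(text):
    # XML whitespace is space, tab, carriage return, and line feed.
    return text is not None and text.strip(" \t\r\n") == ""


def strip_whitespace_only_text(elem, preserve=False):
    """Remove whitespace-only text nodes below `elem`, in place."""
    # xml:space on this element overrides the inherited setting.
    setting = elem.get(XML_SPACE)
    if setting == "preserve":
        preserve = True
    elif setting == "default":
        preserve = False
    if not preserve and _is_whitespace_only(elem.text):
        elem.text = None
    for child in elem:
        strip_whitespace_only_text(child, preserve)
        # A child's tail is a text node whose parent is `elem`.
        if not preserve and _is_whitespace_only(child.tail):
            child.tail = None
```

Applied to the tree for `<alpha><beta> </beta></alpha>`, the sketch leaves the beta element with no text, in line with the second example above; with xml:space="preserve" on the alpha element the single space is kept.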