Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: Specification in XML format and XML function catalog.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines constructor functions, operators, and functions on the datatypes defined in [XML Schema Part 2: Datatypes Second Edition] and the datatypes defined in [XQuery and XPath Data Model (XDM) 3.1]. It also defines functions and operators on nodes and node sequences as defined in the [XQuery and XPath Data Model (XDM) 3.1]. These functions and operators are defined for use in [XML Path Language (XPath) 4.0], [XQuery 4.0: An XML Query Language], [XSL Transformations (XSLT) Version 4.0], and other related XML standards. The signatures and summaries of functions defined in this document are available at: http://www.w3.org/2005/xpath-functions/.
A summary of changes since version 3.1 is provided at G Changes since 3.1.
This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
These functions convert between the lexical representation of various file formats and their representation in the XPath and XQuery data model.
These functions convert between the lexical representation of XML and the tree representation.
(The fn:serialize function also handles HTML and JSON output, but is included in this section for editorial convenience.)
| Function | Meaning |
|---|---|
| fn:parse-xml | This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document. |
| fn:parse-xml-fragment | This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment. |
| fn:serialize | This function serializes the supplied input sequence $input as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string. |
This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document.
fn:parse-xml( | ||
$value | as , | |
$options | as | := {} |
) as | ||
This function is nondeterministic, context-dependent, and focus-independent. It depends on the static base URI.
If $value is the empty sequence, the function returns the empty sequence.
Because the input is supplied as a string, not as an octet stream, the encoding specified in the XML declaration (if present) should be ignored. For similar reasons, any initial byte order mark (codepoint U+FEFF) should be ignored.
The $options argument, if present and non-empty, defines the detailed behavior of the function. The option parameter conventions apply. The options available are as follows:
record(
    base-uri? as xs:anyURI,
    dtd-validation? as xs:boolean,
    allow-external-entities? as xs:boolean,
    entity-expansion-limit? as xs:integer?,
    strip-space? as xs:boolean,
    xinclude? as xs:boolean,
    xsd-validation? as xs:string
)
| Key | Value | Meaning |
|---|---|---|
| base-uri | | Determines the base URI. This is used both as the base URI used by the XML parser to resolve relative entity references within the document, and as the base URI of the document node that is returned. It defaults to the static base URI of the function call. |
| dtd-validation | | Determines whether DTD validation takes place. |
| | true | The input is parsed using a validating XML parser. The input must contain a DOCTYPE declaration to identify the DTD to be used for validation. The DTD may be internal or external. |
| | false | DTD validation does not take place. However, if a DOCTYPE declaration is present, then it is read, for example to perform entity expansion. |
| allow-external-entities | | Determines whether references to external entities (including a DTD entity) are permitted. |
| | true | References to external entities are permitted, and are resolved relative to the base URI. |
| | false | References to external entities (including an external DTD) are not permitted, and result in the call on parse-xml failing with a dynamic error if present. |
| entity-expansion-limit | | Places a limit on the maximum number of entity references that may be expanded, or on the size of the expanded entities. The limit applies both to internal and external entities, but not to built-in entity references, nor to character references. |
| | () | The limit (if any) is implementation-dependent. |
| | integer | The processor should impose a limit on the number of entity references that are expanded, or on the size of the expanded entities, depending on the options available in the underlying XML parser; the limit should be commensurate with the value requested, but the precise effect may be implementation-dependent. If the XML parser does not offer the ability to impose a limit, or if the value is zero, then entity expansion should if possible be disabled entirely, leading to a dynamic error if the input contains any entity references. A negative value should be interpreted as placing no limits on entity expansion. |
| strip-space | | Determines whether whitespace-only text nodes are removed from the resulting document. (Note: in XSLT, the xsl:strip-space and xsl:preserve-space declarations are ignored.) |
| | true | All whitespace-only text nodes are stripped, unless either (a) they are within the scope of the attribute xml:space="preserve", or (b) XSD validation identifies that the parent element has a simple type or a complex type with simple content. |
| | false | All whitespace-only text nodes are preserved, unless either (a) DTD validation marks them as ignorable, or (b) XSD validation recognizes the containing element as having element-only or empty content. |
| xinclude | | Determines whether any xi:include elements in the input are to be processed using an XInclude processor. |
| | true | Any xi:include elements are expanded. If there are xi:include elements and no XInclude processor is available then a dynamic error is raised. |
| | false | Any xi:include elements are handled as ordinary elements without expansion. |
| xsd-validation | | Determines whether XSD validation takes place. |
| | strict | Strict XSD validation takes place. |
| | lax | Lax XSD validation takes place. |
| | skip | No XSD validation takes place. |
| | type Q{uri}local | XSD validation takes place against the schema-defined type, present in the static context, that has the given URI and local name. |
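As a rough illustration of what refusing external entities can involve at the parser level, the following sketch uses Python's standard xml.parsers.expat module rather than an XPath processor. The names `reject_external_entities` and `ExternalEntityError` are hypothetical, and the sketch only approximates the effect of setting allow-external-entities to false; it is not how any particular processor implements the option.

```python
from xml.parsers import expat


class ExternalEntityError(ValueError):
    """Raised when the input refers to an external DTD or external entity."""


def reject_external_entities(text):
    """Parse text, failing if it declares an external DTD or external entity.

    Hypothetical helper: approximates allow-external-entities = false.
    """
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # A system or public identifier on DOCTYPE points at an external DTD.
        if system_id is not None or public_id is not None:
            raise ExternalEntityError("external DTD is not permitted")

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Internal entities carry a value; external ones carry identifiers.
        if system_id is not None or public_id is not None:
            raise ExternalEntityError(
                "external entity %r is not permitted" % name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity_decl
    # Exceptions raised inside the handlers propagate out of Parse().
    parser.Parse(text, True)
```

Internal entity declarations pass through this check unchanged; only declarations that carry a system or public identifier cause a failure.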
Except to the extent defined by these options, the precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.
The document URI of the returned node is absent.
The function is nondeterministic: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same node is returned on both occasions.
Options set in $options may be supplemented or modified based on configuration options defined externally using implementation-defined mechanisms.
A dynamic error is raised [err:FODC0006] if the content of $value is not a well-formed and namespace-well-formed XML document.
A dynamic error is raised [err:FODC0007] if DTD validation is carried out and the content of $value is not valid against the relevant DTD.
A dynamic error is raised [err:FODC0008] if the value of the xsd-validation option is not one of the permitted values (for example, if the string that follows "type" is not a valid EQName, or if it does not identify a type that is present in the static context).
A dynamic error is raised [err:FODC0009] if the value of the xsd-validation option is set to anything other than skip when the processor is not schema-aware. (XSLT 4.0 and XQuery 4.0 define schema-awareness as an optional feature; other host languages may set their own rules.)
A dynamic error is raised [err:FODC0013] if the processor does not have access to an XML parser supporting the requested options, for example the ability to perform DTD validation or XInclude processing or to prevent access to external entities.
A dynamic error is raised [err:FODC0014] if XSD validation is carried out and the content of $value is not valid against the relevant XSD schema.
Since the XML document is presented to the parser as a string, rather than as a sequence of octets, the encoding specified within the XML declaration has no meaning. If the XML parser accepts input only in the form of a sequence of octets, then the processor must ensure that the string is encoded as octets in a way that is consistent with rules used by the XML parser to detect the encoding.
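One possible way to meet this requirement with an octet-oriented parser is sketched below in Python, using only the standard library. The helper name `parse_xml_string` is hypothetical, and the approach shown (dropping any leading U+FEFF, removing the encoding pseudo-attribute from the XML declaration, then encoding as UTF-8) is an assumption about one workable strategy, not a procedure prescribed by this specification. Note that ElementTree returns the root element rather than a document node.

```python
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_ENCODING_DECL = re.compile(
    r"""^(<\?xml\b[^>]*?)\s+encoding\s*=\s*(?:"[^"]*"|'[^']*')""")


def parse_xml_string(value):
    """Parse XML supplied as a string, ignoring its declared encoding.

    Hypothetical helper; None stands in for the empty sequence.
    """
    if value is None:
        return None
    # Ignore an initial byte order mark (U+FEFF), if present.
    text = value[1:] if value.startswith("\ufeff") else value
    # Drop the declared encoding so it cannot contradict the octets we produce.
    text = _ENCODING_DECL.sub(r"\1", text, count=1)
    # With no declared encoding and no BOM, an XML parser assumes UTF-8.
    return ET.fromstring(text.encode("utf-8"))
```

For instance, a string whose declaration claims ISO-8859-1 still parses with its characters intact, because the declaration no longer disagrees with the UTF-8 octets handed to the parser.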
A common use case for this function is to handle input documents that contain nested XML documents embedded within CDATA sections. Since the contents of the CDATA section are exposed as text, the receiving query or stylesheet may pass this text to the fn:parse-xml function to create a tree representation of the nested document.
Similarly, nested XML within comments is sometimes encountered, and lexical XML is sometimes returned by extension functions, for example, functions that access web services or read from databases.
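The nested-document pattern can be sketched in Python with the standard library, as an analogue of the two-step parse described above rather than a use of fn:parse-xml itself. The element names (`envelope`, `payload`, `order`, `item`) and the function name `parse_nested` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Outer document carrying a nested XML document inside a CDATA section.
OUTER = (
    "<envelope><payload><![CDATA["
    '<order id="42"><item>widget</item></order>'
    "]]></payload></envelope>"
)


def parse_nested(outer_xml, path):
    """Parse outer_xml, then parse the text found at path as XML again."""
    outer = ET.fromstring(outer_xml)
    # The CDATA content is exposed as ordinary text of the payload element.
    inner_text = outer.findtext(path)
    # A second parse turns that text into a tree of its own.
    return ET.fromstring(inner_text)


inner = parse_nested(OUTER, "payload")
```

After the second parse, `inner` is an element tree in its own right, so the nested `order` element and its `id` attribute can be navigated like any other parsed content.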
A use case arises in XSLT where there is a need to preprocess an input document before parsing. For example, an application might wish to edit the document to remove its DOCTYPE declaration. This can be done by reading the raw text using the fn:unparsed-text function, editing the resulting string, and then passing it to the fn:parse-xml function.
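The preprocessing step can be sketched as follows in Python, as an analogue of reading the raw text, editing it, and parsing the result. The function name `strip_doctype` is hypothetical, and the regular expression is a deliberate simplification: it assumes at most one DOCTYPE declaration and will not cope with an internal subset that itself contains a `]` character.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern: DOCTYPE, optional [internal subset], closing '>'.
_DOCTYPE = re.compile(r"<!DOCTYPE[^>\[]*(?:\[.*?\])?\s*>", re.DOTALL)


def strip_doctype(text):
    """Return text with its first DOCTYPE declaration removed, if any."""
    return _DOCTYPE.sub("", text, count=1)


raw = '<?xml version="1.0"?>\n<!DOCTYPE note SYSTEM "note.dtd">\n<note>hi</note>'
# Edit the raw string first, then hand the result to the parser.
note = ET.fromstring(strip_doctype(raw))
```

Editing the string before parsing means the parser never sees the DOCTYPE declaration, so no DTD is read and no entity declarations from it take effect.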