
This draft contains only sections that have differences from the version that it modified.

W3C

XSLT and XQuery Serialization 4.0

W3C Editor's Draft 23 February 2026

This version:
https://www.w3.org/TR/2000/REC-xslt-xquery-serialization-40-20000101/
Latest version of XSLT and XQuery Serialization 4.0:
https://www.w3.org/TR/xslt-xquery-serialization-40/
Most recent version of XSLT and XQuery Serialization 4:
https://www.w3.org/TR/xslt-xquery-serialization-4/
Most recent version of XSLT and XQuery Serialization:
https://www.w3.org/TR/xslt-xquery-serialization/
Most recent Recommendation of XSLT and XQuery Serialization:
https://www.w3.org/TR/xslt-xquery-serialization-31/
Editors:
Andrew Coleman, IBM Hursley Laboratories <andrew_coleman@uk.ibm.com>
C. M. Sperberg-McQueen, Black Mesa Technologies <http://blackmesatech.com/>

Please check the errata for any errors or issues reported since publication.

See also translations.

This document is also available in these non-normative formats: XML.


Abstract

This document defines serialization of an instance of the data model as defined in [XQuery and XPath Data Model (XDM) 4.0][XDM 4.0] into a sequence of octets. Serialization is designed to be a component that can be used by other specifications such as [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] or [XQuery 4.0: An XML Query Language].

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document is governed by the 1 March 2017 W3C Process Document.

This is a Recommendation of the W3C. It was jointly developed by the W3C XSLT Working Group and the W3C XML Query Working Group, each of which is part of the XML Activity.

This Editor's Draft specifies XSLT and XQuery Serialization version 4.0, a fully compatible extension of Serialization version 3.1.

This specification is designed to be referenced normatively from other specifications defining a host language for it; it is not intended to be implemented outside a host language. The implementability of this specification has been tested in the context of its normative inclusion in host languages defined by the XQuery 3.1 and XSLT 3.0 specifications; see the XQuery 3.1 implementation report (and, in the future, the WGs expect that there will also be an XSLT 3.0 implementation report) for details.

No substantive changes have been made to this specification since its publication as a Proposed Recommendation.

Please report errors in this document using W3C's public Bugzilla system (instructions can be found at https://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[SER40]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at https://lists.w3.org/Archives/Public/public-qt-comments/.

This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (W3C XML Query Working Group) and a public list of any patent disclosures (W3C XSLT Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


1 Introduction

Changes in 4.0 

  1. Use the arrows to browse significant changes since the 3.1 version of this specification.

  2. Sections with significant changes are marked Δ in the table of contents.

This document defines methods of serializing the W3C XQuery and XPath Data Model 4.0 [XDM 4.0], that is, methods of representing instances of the data model as strings or octet sequences. This is the data model used by [XML Path Language (XPath) 4.0][XPath 4.0], [XSL Transformations (XSLT) Version 4.0][XSLT 4.0], and [XQuery 4.0: An XML Query Language], and any other specifications that reference it.

In this document, examples and material labeled as “Note” are provided for explanatory purposes and are not normative.

Serialization is the process of converting an instance of the [XQuery and XPath Data Model (XDM) 4.0][XDM 4.0] into a sequence of octets.

[Definition: The XDM value supplied as input to the serializer is referred to as the input value.] Some serialization methods apply only to certain types of input value.

Note:

Where serialization is used to process the result of an XQuery evaluation or an XSLT transformation, the input value of the serializer corresponds to the output from XQuery or XSLT.

[Definition: In general the output of the serializer will represent the items actually present in the input value, together with other items that are reachable from these, for example (in the case of nodes) their descendants. The complete set of items that are represented in the output of the serializer is referred to (without loss of generality) as the input tree.]

1.1 Terminology

Changes in 4.0  

  1. The term atomic value has been replaced by atomic item.   [Issue 1337  2 August 2024]

In this specification, where they are rendered in small capitals, the words must, must not, should, should not, may, required, and recommended are to be interpreted as described in [RFC2119].

[Definition: As is indicated in 12 Conformance, conformance criteria for serialization are determined by other specifications that refer to this specification. A serializer is software that implements some or all of the requirements of this specification in accordance with such conformance criteria.] A serializer is not required to directly provide a programming interface that permits a user to set serialization parameters or to provide an input sequence for serialization. In this document, material labeled as "Note" and examples are provided for explanatory purposes and are not normative.

Certain aspects of serialization are described in this specification as implementation-defined or implementation-dependent.

[Definition: Implementation-defined indicates an aspect that may differ between serializers, but whose actual behavior must be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.]

[Definition: Implementation-dependent indicates an aspect that may differ between serializers, and whose actual behavior is not required to be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.]

[Definition: In some instances, the input tree cannot be successfully converted into a sequence of octets given the set of serialization parameter (3 Serialization Parameters) values specified. A serialization error is said to occur in such an instance.] In some cases, a serializer is required to raise such an error. What it means to raise a serialization error is determined by the relevant conformance criteria (12 Conformance) to which the serializer conforms. In other cases, there is an implementation-defined choice between raising a serialization error and performing a recovery action. Such a recovery action will allow a serializer to produce a sequence of octets that might not fully reflect the usual requirements of the parameter settings that are in effect.

[Definition: Where this specification indicates that two strings are to be compared without regard to case, the serializer must translate any characters in the range U+0041 (LATIN CAPITAL LETTER A, A) through U+005A (LATIN CAPITAL LETTER Z, Z) inclusive, to the corresponding lower-case letters in the range U+0061 (LATIN SMALL LETTER A, a) through U+007A (LATIN SMALL LETTER Z, z) only for the purposes of making the comparison. The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.]
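The comparison defined above can be illustrated with a minimal Python sketch. The function names `ascii_lower` and `equal_without_regard_to_case` are hypothetical and are not defined by this specification. The point of the sketch is that only the 26 letters A through Z are translated, so it does not behave like a full Unicode case fold.

```python
def ascii_lower(s: str) -> str:
    """Translate only U+0041..U+005A to U+0061..U+007A; leave all other characters alone."""
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch
        for ch in s
    )


def equal_without_regard_to_case(a: str, b: str) -> bool:
    """Compare two strings code point by code point after the ASCII-only translation."""
    return ascii_lower(a) == ascii_lower(b)


print(equal_without_regard_to_case("UTF-8", "utf-8"))  # True
print(equal_without_regard_to_case("É", "é"))          # False: É is outside A..Z
```

A general-purpose routine such as Python's `str.lower()` would also fold letters outside the A to Z range, which is why the sketch translates characters explicitly.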

Many terms used in this document are defined in the XPath specification [XML Path Language (XPath) 4.0][XPath 4.0] or the Data Model specification [XQuery and XPath Data Model (XDM) 4.0][XDM 4.0]. Particular attention is drawn to the following:

Where this specification indicates that an XSLT instruction is evaluated, the behavior is as specified by [XSL Transformations (XSLT) Version 4.0][XSLT 4.0]. Where it indicates that an XQuery expression is evaluated, the behavior is as specified by [XQuery 4.0: An XML Query Language].

2 Sequence Normalization

The input value is a sequence. Prior to serializing a sequence using any of the output methods whose behavior is specified by this document (3 Serialization Parameters), with the exception of the JSON and Adaptive output methods, the serializer must first compute a normalized sequence for serialization; it is the normalized sequence that is actually serialized. [Definition: The purpose of sequence normalization is to create a sequence that can be serialized as a well-formed XML document or external general parsed entity, that also reflects the content of the input sequence to the extent possible.] [Definition: The result of the sequence normalization process is a result tree.]

The normalized sequence for serialization is constructed by applying all of the following rules in order, with the input value being input to the first step, and the sequence that results from any step being used as input to the subsequent step. For any implementation-defined output method, it is implementation-defined whether this sequence normalization process takes place. For the JSON and Adaptive output methods, sequence normalization must not take place.

Where the process of converting the input sequence to a normalized sequence indicates that a value must be cast to xs:string, that operation is defined in Section 23.1.2 Casting to xs:string of [XQuery and XPath Functions and Operators 4.0]. Where a step in the sequence normalization process indicates that a node should be copied, the copy is performed in the same way as an XSLT xsl:copy-of instruction that has a validation attribute whose value is preserve and has a select attribute whose effective value is the node, as described in Section 11.9.2 Deep Copy of [XSL Transformations (XSLT) Version 4.0][XSLT 4.0], or equivalently in the same way as an XQuery content expression as described in Step 1e of Section 4.12.1.3 Content of [XQuery 4.0: An XML Query Language], where the construction mode is preserve. Let S0 be the sequence that is input to serialization. The steps in computing the normalized sequence are:

  1. Create a new sequence S1 from S0 as follows. For each item in S0, if the item is a JNode, copy the ·content· property of the item; otherwise, copy the item itself.

  2. Create a new sequence S2 from S1 as follows. For each item in S1, if the item is an array, copy the results of passing the item into the function array:flatten(); otherwise, copy the item itself. If S1 is empty, let S2 consist of a zero-length string.

  3. Create a new sequence S3 from S2 as follows. For each item in S2, if the item is atomic, copy to S3 only the lexical representation resulting from casting the item to an xs:string; otherwise, copy the item to S3.

  4. Create a new sequence S4 from S3 as follows. If the item-separator serialization parameter is present, then copy each item in S3 to S4, inserting between each pair of items a string whose value is equal to the value of the item-separator parameter. If the item-separator serialization parameter is not present, then first maximally group the items in S3 into subsequences of xs:string items and non-xs:string items. For each group of items, if the group is a subsequence of non-xs:string items, copy the subsequence to S4; if the group is a subsequence of xs:string items, copy to S4 the results of passing to fn:string-join() the subsequence and a single space (U+0020) as the function’s two parameters.

  5. Create a new sequence S5 from S4 as follows. For each item in S4, if the item is a string, copy to S5 a text node whose string value is equal to the string; otherwise, copy the item to S5.

  6. Create a new sequence S6 from S5 as follows. For each item in S5, if the item is a document node, copy its children to S6; otherwise, copy the item to S6.

  7. Create a new sequence S7 from S6 as follows. First, remove any text nodes with values of zero length from S6, then maximally group the results into groups of text nodes and non-text nodes. For each group of items, if the group is a subsequence of text nodes, copy to S7 a single text node whose value is equal to the concatenated values of the subsequence; if the group is a subsequence of non-text nodes, copy the subsequence of items to S7. It is a serialization error [err:SENR0001] if any item in S7 is an attribute node, a namespace node, or a function.

  8. Create a new sequence S8 from S7 as follows. Let S8 be a single document node. Copy sequence S7 to the document node as its children.

S8 is the normalized sequence.

The result tree rooted at the document node that is created by the final step of this sequence normalization process is the value to which the rules of the appropriate output method are applied. If the sequence normalization process results in a serialization error, the serializer must raise the error.
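The normalization steps can be traced with a small, non-normative Python sketch over a toy data model. Everything in it is an assumption made for illustration: the classes `Text`, `Element`, `Attribute` and `Document`, the use of Python lists for arrays and Python callables for functions, and the restriction of atomic items to `str`, `bool` and `int`. JNodes, namespace nodes, maps, type annotations and node copying are not modeled. The sketch also assumes that, when item-separator is absent, adjacent strings are joined with a single space, which is the behavior of the xsl:copy-of and document { } equivalents given in the Note that follows.

```python
from dataclasses import dataclass, field
from itertools import groupby
from typing import Any, List, Optional


@dataclass
class Text:
    value: str


@dataclass
class Element:
    name: str
    children: List[Any] = field(default_factory=list)


@dataclass
class Attribute:
    name: str
    value: str


@dataclass
class Document:
    children: List[Any] = field(default_factory=list)


class SerializationError(Exception):
    """Stand-in for err:SENR0001."""


def flatten(item: Any) -> List[Any]:
    # Toy array:flatten(): Python lists model arrays, nested to any depth.
    if isinstance(item, list):
        return [leaf for member in item for leaf in flatten(member)]
    return [item]


def to_string(atom: Any) -> str:
    # Toy cast to xs:string covering only str, bool and int.
    if isinstance(atom, bool):
        return "true" if atom else "false"
    return str(atom)


def normalize(s0: List[Any], item_separator: Optional[str] = None) -> Document:
    # Step 1 (JNodes) is omitted because the toy model has no JNode type.
    s2 = [leaf for item in s0 for leaf in flatten(item)] if s0 else [""]   # step 2
    s3 = [to_string(i) if isinstance(i, (str, bool, int)) else i for i in s2]  # step 3
    s4: List[Any] = []                                                     # step 4
    if item_separator is not None:
        for pos, item in enumerate(s3):
            if pos > 0:
                s4.append(item_separator)
            s4.append(item)
    else:
        for is_string, group in groupby(s3, key=lambda i: isinstance(i, str)):
            members = list(group)
            s4.extend([" ".join(members)] if is_string else members)
    s5 = [Text(i) if isinstance(i, str) else i for i in s4]                # step 5
    s6: List[Any] = []                                                     # step 6
    for item in s5:
        s6.extend(item.children if isinstance(item, Document) else [item])
    s7: List[Any] = []                                                     # step 7
    for item in s6:
        if isinstance(item, Text):
            if item.value == "":
                continue  # zero-length text nodes are dropped
            if s7 and isinstance(s7[-1], Text):
                s7[-1] = Text(s7[-1].value + item.value)  # merge adjacent text
                continue
        elif isinstance(item, Attribute) or callable(item):
            raise SerializationError("SENR0001")
        s7.append(item)
    return Document(s7)                                                    # step 8
```

Under these assumptions, `normalize([1, [2, ["a"]], Element("e"), "b"])` yields a document node whose children are a text node `1 2 a`, the element `e`, and a text node `b`, while `normalize(["a", "b"], item_separator=",")` yields a single text node `a,b`.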

Note:

If the item-separator serialization parameter is absent, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:

<xsl:document>
  <xsl:copy-of select="$seq" validation="preserve"/>
</xsl:document>

or the XQuery expression:

declare construction preserve;

document { $seq }

If the item-separator serialization parameter is present, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:

<xsl:document>
  <xsl:for-each select="$seq">
    <xsl:sequence select="if (position() gt 1) 
                          then $sep 
                          else ()"/>

    <xsl:choose>
      <xsl:when test=". instance of node()">
        <xsl:sequence select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:document>

or the XQuery expression:

declare construction preserve; 

document {
  for $item at $pos in $seq
  let $node := 
    if ($item instance of node()) then 
      $item 
    else 
      text { $item }
  return
    if ($pos eq 1) then
      $node
    else
      ($sep, $node)  
}

where the value of the sep variable is a string whose value is equal to the value of the item-separator serialization parameter.

This process results in a serialization error [err:SENR0001] if $seq contains functions, attribute nodes or namespace nodes.

3 Serialization Parameters

Changes in 4.0  

  1. Added the escape-solidus parameter for JSON serialization.   [Issue 530 PR 534 6 June 2023]

  2. Added the json-lines parameter for JSON serialization.   [Issue 1471  15 October 2024]

There are a number of parameters that influence how serialization is performed. Host languages may allow users to specify any or all of these parameters, but they are not required to be able to do so. However, the host language specification must specify how the values of all applicable parameters are to be determined.

Host languages may also define alternative representations of the values of serialization parameters. For example, both XSLT and XQuery allow the boolean values true and false to be written as 1/0 or yes/no. The $options map passed to the fn:serialize function, by contrast, requires an xs:boolean value.
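As an illustration of such an alternative representation, the following Python sketch maps the lexical forms mentioned above to a boolean. The function name `parse_boolean_parameter` is hypothetical, trimming of surrounding whitespace is an assumption, and raising `ValueError` merely stands in for whatever error handling the host language defines for invalid values.

```python
def parse_boolean_parameter(raw: str) -> bool:
    """Map true/false, 1/0 and yes/no to a Python bool (hypothetical helper)."""
    token = raw.strip()  # assumption: surrounding whitespace is ignored
    if token in ("true", "yes", "1"):
        return True
    if token in ("false", "no", "0"):
        return False
    # Placeholder for the host language's handling of an invalid value.
    raise ValueError(f"not a boolean parameter value: {raw!r}")


print(parse_boolean_parameter("yes"))  # True
print(parse_boolean_parameter("0"))    # False
```

A caller that already holds an xs:boolean value, as with the $options map of fn:serialize, would not need this conversion.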

It is a serialization error [err:SEPM0016] if a parameter value is invalid for the given parameter. It is the responsibility of the host language to specify how invalid values should be handled at the level of that language.

The following serialization parameters are defined:

Serialization parameters

| Serialization parameter name | Permitted values for parameter |
| --- | --- |
| allow-duplicate-names | A boolean value, true or false. This parameter indicates whether a map item serialized as a JSON object using the JSON output method is allowed to contain duplicate member names. If the value false is specified, a serialization error [err:SERE0022] may be raised under certain conditions. |
| byte-order-mark | A boolean value, true or false. This parameter indicates whether the serialized sequence of octets is to be preceded by a Byte Order Mark (See Section 5.1 of [Unicode Encoding]). The actual octet order used is implementation-dependent. If the encoding defines no Byte Order Mark, or if the Byte Order Mark is prohibited for the specific Unicode encoding or implementation environment, then this parameter is ignored. |
| cdata-section-elements | A list of expanded QNames, possibly empty. |
| doctype-public | A string of PubidChar (as defined in XML) characters. This parameter may be absent. |
| doctype-system | A string of Unicode characters that does not include both the characters U+0027 (APOSTROPHE, ') and U+0022 (QUOTATION MARK, "). This parameter may be absent. |
| encoding | A string of Unicode characters in the range U+0021 (EXCLAMATION MARK, !) through U+007E (TILDE, ~) (that is, printable ASCII characters); the value should be a charset registered with the Internet Assigned Numbers Authority [IANA], [RFC2978] or begin with the characters x- or X-. |
| escape-solidus | A boolean value, true or false. |
| escape-uri-attributes | A boolean value, true or false. |
| html-version | A decimal value. This parameter may be absent. |
| include-content-type | A boolean value, true or false. |
| indent | A boolean value, true or false. |
| item-separator | A string of Unicode characters. This parameter may be absent. |
| json-lines | A boolean value, true or false. |
| json-node-output-method | An expanded QName with a non-null namespace URI, or with a null namespace URI and a local name equal to one of xml, xhtml, html or text. If the namespace URI is non-null, the parameter specifies an implementation-defined output method. |
| media-type | A string of Unicode characters specifying the media type (MIME content type) [RFC2046]; the charset parameter of the media type must not be specified explicitly in the value of the media-type parameter. If the destination of the serialized output is annotated with a media type, this parameter may be used to provide such an annotation. For example, it may be used to set the media type in an HTTP header. |
| method | An expanded QName with a non-null namespace URI, or with a null namespace URI and a local name that must be equal to one of xml, xhtml, html, text, json, or adaptive, in which case, the output method specified must be used for serializing. If the namespace URI is non-null, the parameter specifies an implementation-defined output method; its behavior is not specified by this document. |
| normalization-form | One of the enumerated values NFC, NFD, NFKC, NFKD, fully-normalized or none, or an implementation-defined value of type NMTOKEN. |
| omit-xml-declaration | A boolean value, true or false. |
| standalone | Either a boolean value, true or false, or the value omit. |
| suppress-indentation | A list of expanded QNames, possibly empty. |
| undeclare-prefixes | A boolean value, true or false. |
| use-character-maps | A list of pairs, possibly empty, with each pair consisting of a single Unicode character and a string of Unicode characters. |
| version | A string of Unicode characters. This parameter may be absent. |
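A few of the constraints in the table above are simple enough to check mechanically. The following Python sketch shows one way the permitted values of doctype-system, encoding and standalone might be tested before err:SEPM0016 is raised. The function names are hypothetical, and the sketch does not check whether an encoding name is actually registered with IANA.

```python
def doctype_system_is_valid(value: str) -> bool:
    """The value may contain an apostrophe or a quotation mark, but not both."""
    return not ("'" in value and '"' in value)


def encoding_is_valid(value: str) -> bool:
    """Every character must lie in the printable ASCII range U+0021..U+007E."""
    return all(0x21 <= ord(ch) <= 0x7E for ch in value)


def standalone_is_valid(value: object) -> bool:
    """Permitted values are the booleans true and false, or the value omit."""
    return isinstance(value, bool) or value == "omit"


print(doctype_system_is_valid("it's.dtd"))  # True
print(encoding_is_valid("UTF 8"))           # False: U+0020 is outside the range
print(standalone_is_valid("omit"))          # True
```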

In those cases where they have no important effect on the content of the serialized result, details of the output methods defined by this specification are left unspecified and are regarded as implementation-dependent. Whether a serializer uses apostrophes or quotation marks to delimit attribute values in the XML output method is an example of such a detail.

The detailed semantics of each parameter will be described separately for each output method for which it is applicable. If the semantics of a parameter are not described for an output method, then it is not applicable to that output method.

Implementations may define additional serialization parameters, and may allow users to do so. For this purpose, the name of a serialization parameter is considered to be a QName; the parameters listed above are QNames whose expanded-QName has a null namespace URI, while any additional serialization parameters that are either implementation-defined or defined by the host language must have names that are namespace-qualified. Any such additional serialization parameters must not be in the namespace http://www.w3.org/2010/xslt-xquery-serialization. A host language may specify the means by which an implementation can define such an additional serialization parameter, and implementations may provide mechanisms by which users can define such an additional serialization parameter. If the serialization method is one of the six methods xml, html, xhtml, text, json, or adaptive, then the additional serialization parameters may affect the output of the serializer to the extent (but only to the extent) that this specification leaves the output implementation-defined or implementation-dependent. For example, such parameters might control whether namespace declarations on an element are written before or after the attributes of the element, or they might define the number of space or tab characters to be inserted when the indent parameter is set to true; but they could not instruct the serializer to suppress the error that occurs when the HTML output method encounters characters that are not permitted (see error [err:SERE0014]).

3.1 Setting Serialization Parameters by Means of a Parameter Document

A host language may provide, by reference to this section, a mechanism by which the settings of serialization parameters are supplied in the form of an output:serialization-parameters element node.

[Definition: An output:serialization-parameters element node used to hold the settings of serialization parameters is referred to as a parameter document].

Note:

The use of the word document does not imply that the output:serialization-parameters element must be the outermost element of an XDM document, although this will often be the case.

The parameter document must be processed as if by the procedure described below.

With the exception of the use-character-maps parameter, the setting of each serialization parameter defined in this specification is equal to the result of evaluating the XQuery expression

document { . } 
   /output:serialization-parameters
   /(validate lax { 
      output:*[local-name() eq $param-name] 
   })
   /data(@value)

or equivalently the XSLT instructions

<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="
        self::output:serialization-parameters
        /output:*
       [local-name() eq $param-name]"/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance
                        /data(@value)"/>
</xsl:sequence>

with the parameter document as the context value, the param-name variable bound to a value of type xs:string equal to the local part of the name of the particular serialization parameter, and the other components of the dynamic context and static context as specified in the subsequent tables. If in any case evaluating this expression would yield an error, serialization error [err:SEPM0017] results.

If the result of evaluating this expression for a particular serialization parameter is the empty sequence, then

  1. If the parameter is either cdata-section-elements or suppress-indentation, and the result of evaluating the XQuery expression

    document { . }
    /output:serialization-parameters
    /(validate lax {  
       output:*[local-name() eq $param-name]
    })

    or equivalently the XSLT instructions

    <xsl:sequence>
      <xsl:variable name="validated-instance">
        <xsl:document validation="lax">
          <xsl:sequence select="
            self::output:serialization-parameters
            /output:*
           [local-name() eq $param-name]"/>
        </xsl:document>
      </xsl:variable>
      <xsl:sequence select="$validated-instance"/>
    </xsl:sequence>

    with the same settings of the static context and dynamic context is not an empty sequence, the setting of the parameter is the empty list;

  2. otherwise, the setting of the parameter is absent.
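The extraction rules above can be approximated without an XQuery processor. The following Python sketch uses the standard `xml.etree.ElementTree` module to read one parameter from a parameter document supplied as a string. It is a simplification under stated assumptions: no schema validation is performed, values are returned in their raw lexical form rather than as typed values, error [err:SEPM0017] is not modeled, and the function name `parameter_setting` is hypothetical.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"
# Parameters whose setting is the empty list, not absent, when the
# element is present but carries no values.
LIST_VALUED = {"cdata-section-elements", "suppress-indentation"}


def parameter_setting(parameter_document: str, param_name: str):
    """Return the raw setting of one parameter, or None when it is absent."""
    root = ET.fromstring(parameter_document)
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        return None
    element = root.find(f"{{{OUTPUT_NS}}}{param_name}")
    if element is None:
        return None  # no such element: the setting is absent
    value = element.get("value")
    if param_name in LIST_VALUED:
        # Whitespace-separated lexical QNames; an empty value gives [].
        return (value or "").split()
    return value


EXAMPLE = """
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:cdata-section-elements value=""/>
</output:serialization-parameters>
"""

print(parameter_setting(EXAMPLE, "method"))                  # xml
print(parameter_setting(EXAMPLE, "cdata-section-elements"))  # []
print(parameter_setting(EXAMPLE, "indent"))                  # None
```

The distinction between `[]` and `None` in the sketch mirrors the distinction drawn above between a setting that is the empty list and a setting that is absent.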

The components of the static context used in evaluating the XQuery expressions or XSLT instructions are as defined in the following table.

Settings of static context components used in extracting serialization parameter settings from a parameter document

| Static Context Component | XQuery or XSLT | Setting |
| --- | --- | --- |
| XPath 1.0 compatibility mode | Both | false |
| Statically known namespaces | XQuery | The pair (output,http://www.w3.org/2010/xslt-xquery-serialization) |
| Statically known namespaces | XSLT | The pairs (output,http://www.w3.org/2010/xslt-xquery-serialization), (xsl,http://www.w3.org/1999/XSL/Transform) |
| Default element/type namespace | Both | "none" |
| Default function namespace | Both | http://www.w3.org/2005/xpath-functions |
| In-scope schema types, In-scope element declarations, Substitution groups, In-scope attribute declarations | Both | As defined by the schema for serialization parameters (B Schema for Serialization Parameters) and any additional implementation-defined in-scope schema components |
| In-scope variables | Both | {param-name} |
| Context value static type | Both | node() |
| Statically known function signatures | Both | {fn:data($arg as item()*) as xs:anyAtomicType*}, {fn:local-name($arg as node()?) as xs:string} |
| Statically known collations | Both | { (http://www.w3.org/2005/xpath-functions/collation/codepoint, The Unicode codepoint collation) } |
| Default collation | Both | The Unicode codepoint collation |
| Construction mode | XQuery | strip |
| Ordering mode | XQuery | ordered |
| Default order for empty sequences | XQuery | least |
| Boundary space policy | XQuery | strip |
| Copy-namespaces mode | XQuery | (preserve,inherit) |
| Base URI | Both | Absent |
| Statically known documents | Both | None |
| Statically known collections | Both | None |
| Statically known default collection type | Both | node()* |
| Statically known decimal formats | Both | None |
| Set of named keys | XSLT | {} |
| Values of system properties | XSLT | None |
| Set of available instructions | XSLT | The set of all instructions defined by [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] |

The remaining components of the dynamic context used in evaluating the XQuery expressions or XSLT instructions in the preceding table are as defined in the following table.

Settings of dynamic context components used in extracting serialization parameter settings from a parameter document

| Dynamic Context Component | XQuery or XSLT | Setting |
| --- | --- | --- |
| Context position | Both | 1 |
| Context size | Both | 1 |
| Variable values | Both | The param-name variable has a value of type xs:string equal to the local part of the name of the serialization parameter under consideration |
| Function implementations | Both | The implementation of fn:data |
| Current dateTime | Both | Absent |
| Implicit timezone | Both | Absent |
| Available documents | Both | None |
| Available collections | Both | None |
| Default collection | Both | None |
| Current template rule | XSLT | Absent |
| Current mode | XSLT | The default mode |
| Current group | XSLT | Absent |
| Current grouping key | XSLT | Absent |
| Current captured substrings | XSLT | The empty sequence |
| Output state | XSLT | Temporary output state |

In the case of the use-character-maps parameter, the XQuery expression

document { . }
/output:serialization-parameters
/ ( validate lax { output:use-character-maps } )
/output:character-map[@character eq $char]
/string(@map-string)

or equivalently the XSLT instructions

<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="
        self::output:serialization-parameters
         /output:use-character-maps"/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance                          
                        /output:character-map
                        [@character eq $char]
                        /string(@map-string)"/>
</xsl:sequence>

is evaluated for each Unicode character that is permitted in an XML document. The dynamic context and static context used to evaluate the expression are as defined above, except that the component In-scope variables is the set {char} and the value of the variable "char" is a value of type xs:string of length one whose value is the Unicode character under consideration. If the result of evaluating the expression is not an empty sequence, the pair consisting of the Unicode character and the result of evaluating the expression is part of the list of pairs in the value of the use-character-maps parameter. It is a serialization error [err:SEPM0018] if the result of evaluating this expression for any character is a sequence of length greater than one.
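Evaluating the expression once per Unicode character is a way of specifying the result, not a required implementation strategy. The following Python sketch reaches the same list of pairs by walking the output:character-map elements once, and reports the condition of error [err:SEPM0018] when a character is mapped more than once. It performs no schema validation, and the names `character_maps` and `ParameterError` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class ParameterError(Exception):
    """Stand-in for a serialization error such as err:SEPM0018."""


def character_maps(parameter_document: str) -> Dict[str, str]:
    """Collect (character, map-string) pairs from output:use-character-maps."""
    root = ET.fromstring(parameter_document)
    pairs: Dict[str, str] = {}
    for maps in root.findall(f"{{{OUTPUT_NS}}}use-character-maps"):
        for entry in maps.findall(f"{{{OUTPUT_NS}}}character-map"):
            character = entry.get("character")
            if character in pairs:
                # More than one mapping for the same character.
                raise ParameterError("SEPM0018")
            pairs[character] = entry.get("map-string")
    return pairs


EXAMPLE = """
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:use-character-maps>
    <output:character-map character="«" map-string="&lt;%"/>
    <output:character-map character="»" map-string="%&gt;"/>
  </output:use-character-maps>
</output:serialization-parameters>
"""

print(character_maps(EXAMPLE))  # {'«': '<%', '»': '%>'}
```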

Using the same settings of the components of the dynamic context and static context, serialization error [err:SEPM0019] results if the result of evaluating the following XQuery expression is not true

(document { . })/output:serialization-parameters
   /(count(distinct-values(*/node-name(.))) eq (count(*)))

or equivalently if the result of evaluating the following XSLT instructions is not true.

<xsl:sequence>
  <xsl:variable name="doc">
    <xsl:document>
      <xsl:sequence select="."/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence
    select="$doc/output:serialization-parameters
                /(count(distinct-values(
                    */node-name(.))) 
                eq (count(*)))"/>
</xsl:sequence>

The result of evaluating either will be false if the parameter document supplies a value for any particular serialization parameter more than once, or will be the empty sequence if the parameter document is not an element node whose local name is serialization-parameters and whose namespace URI is http://www.w3.org/2010/xslt-xquery-serialization.
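The same check can be sketched in Python with `xml.etree.ElementTree`. The sketch compares the number of distinct child element names with the number of child elements, and treats a wrong outermost element as a failure, mirroring the empty-sequence case described above. The names `check_parameter_document` and `ParameterError` are hypothetical.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class ParameterError(Exception):
    """Stand-in for serialization error err:SEPM0019."""


def check_parameter_document(parameter_document: str) -> None:
    """Raise ParameterError unless every child element name occurs once."""
    root = ET.fromstring(parameter_document)
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        # The expression would yield the empty sequence, which is not true.
        raise ParameterError("SEPM0019")
    # ElementTree tags carry the namespace URI, so names in different
    # namespaces with the same local part are treated as distinct.
    names = [child.tag for child in root]
    if len(set(names)) != len(names):
        raise ParameterError("SEPM0019")


GOOD = """
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:indent value="yes"/>
</output:serialization-parameters>
"""

check_parameter_document(GOOD)  # passes silently
```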

Note:

A serializer or implementation of a host language does not need to be accompanied by an XQuery processor nor by a general-purpose schema validator in order to meet the requirements of this section. It merely needs to be capable of extracting values from an XDM instance that conforms to the schema for serialization parameters, while checking that the constraints implied by the schema and additional constraints implied by the XQuery validate expression or explicitly stated in this section are satisfied.

The host language may provide additional mechanisms for overriding the values of any serialization parameters specified through the mechanism defined in this section, as well as additional mechanisms for specifying the values of any serialization parameters whose values are absent after applying the mechanism defined in this section.

If the parameter document contains elements or attributes that are in a namespace other than http://www.w3.org/2010/xslt-xquery-serialization, the implementation may interpret them to specify the values of implementation-defined serialization parameters in an implementation-defined manner.

The following XML document, if parsed as a parameter document and processed using the mechanism described in this section, would specify the settings of the method, version and indent serialization parameters with the values xml, 1.0 and true, respectively.

<output:serialization-parameters 
    xmlns:output 
    = "http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:indent value="yes"/>
</output:serialization-parameters>
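
The extraction of values from the parameter document above might look roughly like the following Python sketch. It is not a conforming implementation: it performs no schema validation, the function name read_parameters and the BOOLEAN_PARAMS subset are hypothetical, and the set of lexical forms accepted as true (yes, true, 1) is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

SER_NS = "http://www.w3.org/2010/xslt-xquery-serialization"
BOOLEAN_PARAMS = {"indent"}            # illustrative subset, not the full list
TRUE_SPELLINGS = {"yes", "true", "1"}  # assumed lexical forms for true

EXAMPLE = """<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:indent value="yes"/>
</output:serialization-parameters>"""


def read_parameters(xml_text):
    """Collect the value attribute of each parameter element into a dict."""
    root = ET.fromstring(xml_text)
    prefix = "{%s}" % SER_NS
    params = {}
    for child in root:
        if not child.tag.startswith(prefix):
            continue  # other namespaces are implementation-defined
        name = child.tag[len(prefix):]
        value = child.get("value")
        if name in BOOLEAN_PARAMS and value is not None:
            value = value.strip() in TRUE_SPELLINGS
        params[name] = value
    return params
```

Applied to EXAMPLE, the sketch yields the method, version and indent settings as a dictionary, with indent converted to a boolean.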

The following document would specify the value of the cdata-section-elements serialization parameter with value equal to the pair of expanded QNames (http://example.org/book/chapter, heading) and (http://example.org/book, footnote).

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:book="http://example.org/book"
    xmlns="http://example.org/book/chapter">
  <output:cdata-section-elements value="heading book:footnote"/>
</output:serialization-parameters>

The following document would specify the value of the method serialization parameter with the value html.

Notice that in this example, the default namespace declaration in scope has no effect on the interpretation of the setting of the method parameter.

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns="http://example.org/ext">
  <output:method value="html"/>
</output:serialization-parameters>

The following document would specify the value of the method serialization parameter with value equal to the expanded QName (http://example.org/ext, jsp), and the use-character-maps parameter with value equal to the list of pairs, («, <%), (», %>).

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:ext="http://example.org/ext">
  <output:method value="ext:jsp"/>
  <output:use-character-maps>
    <output:character-map character="&#xAB;" map-string="&lt;%"/>
    <output:character-map character="&#xBB;" map-string="%&gt;"/>
  </output:use-character-maps>
</output:serialization-parameters>

4 Phases of Serialization

For the XML, HTML, XHTML and Text output methods, serialization comprises five phases of processing (preceded by the sequence normalization process described in 2 Sequence Normalization). For the JSON and Adaptive output methods, serialization is described in 9 JSON Output Method and 10 Adaptive Output Method respectively.

For an implementation-defined output method, any of these phases may be skipped or may be performed in a different order than is specified here. For the output methods defined in this specification, these phases are carried out sequentially as follows:

  1. A meta element is added to the sequence, possibly replacing existing meta elements, as controlled by the include-content-type parameter for the XHTML and HTML output methods. This step is skipped for the other output methods defined by this specification.

  2. Markup generation produces the character representation of those parts of the serialized result that describe the structure of the sequence. In the cases of the XML, HTML and XHTML output methods, this phase produces the character representations of the following:

    • the document type declaration;

    • start tags and end tags (except for attribute values, whose representation is produced by the character expansion phase);

    • processing instructions; and

    • comments.

    In the cases of the XML and XHTML output methods, this phase also produces the following:

    • the XML or text declaration; and

    • empty element tags (except for the attribute values);

    In the case of the text output method, this phase replaces the single document node produced by sequence normalization with a new document node that has exactly one child, which is a text node. The string value of the new text node is the string value of the document node that was produced by sequence normalization.

  3. Character expansion is concerned with the representation of characters appearing in text and attribute nodes in the sequence. For each text and attribute node, the following rules are applied in sequence.

    1. If the node is an attribute that is a URI attribute value and the escape-uri-attributes parameter is set to require escaping of URI attributes, apply URI escaping as defined below, and skip rules b-e. Otherwise, continue with rule b.

      [Definition: URI escaping consists of the following three steps applied in sequence to the content of URI attribute values:]

      1. normalize to NFC using the method defined in [Functions and Operators 4.0] Section 5.4.9 fn:normalize-unicode

      2. percent-encode any special characters in the URI using the method defined in [Functions and Operators 4.0] Section 7.5 fn:escape-html-uri

      3. escape according to the rules of the XML or HTML output method, whichever is applicable, any characters that require escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt; (See also section 7.3 Writing Character Data).

      [Definition: The values of attributes listed in D List of URI Attributes are URI attribute values. Attributes are not considered to be URI attributes simply because they are namespace declaration attributes or have the type annotation xs:anyURI.]

    2. If the node is a text node whose parent element is selected by the rules of the cdata-section-elements parameter for the applicable output method, create CDATA sections as described below, and skip rules c-e. Otherwise, continue with rule c.

      Apply the following two processes in sequence to create CDATA sections:

      1. Unicode Normalization if requested by the normalization-form parameter.

      2. The application of changes as detailed in the description of the cdata-section-elements parameter for the applicable output method.

    3. Apply character mapping as determined by the use-character-maps parameter for the applicable output method. For characters that were substituted by this process, skip rules d and e. For the remaining characters that were not modified by character mapping, continue with rule d.

    4. Apply Unicode Normalization if requested by the normalization-form parameter.

      [Definition: Unicode Normalization is the process of removing alternative representations of equivalent sequences from textual data, to convert the data into a form that can be binary-compared for equivalence, as specified in [UAX #15: Unicode Normalization Forms]. For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization].]

      The meanings associated with the possible values of the normalization-form parameter are defined in section 5.1.8 XML Output Method: the normalization-form Parameter.

      Continue with step e.

    5. Escape according to the rules of the XML or HTML output method, whichever is applicable, any characters (such as < and &) where XML or HTML requires escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt;. (See also section 7.3 Writing Character Data). For characters such as > where XML defines a built-in entity but does not require its use in all circumstances, it is implementation-dependent whether the character is escaped.

  4. Indentation, as controlled by the indent parameter and the suppress-indentation parameter, may add or remove whitespace according to the rules defined by the applicable output method.

  5. Encoding, as controlled by the encoding parameter, converts the character sequence produced by the previous phases into an octet stream.

    Note:

    Serialization is defined only in terms of encoding the result as a stream of octets. However, a serializer may provide an option that allows the encoding phase to be skipped, so that the result of serialization can be encoded in a way required by a particular destination (e.g., a Java StringBuffer). The effect of any such option is implementation-defined, and a serializer is not required to support such an option.
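
The ordering of the last three character expansion rules (character mapping, then Unicode Normalization, then escaping) can be illustrated for a single text node with the following Python sketch. It is a simplification made under stated assumptions: the function name expand_text is hypothetical, only < and & are escaped, characters that cannot be represented in the selected encoding are not considered, and normalization is applied separately to each run of characters that lies between mapped characters.

```python
import unicodedata


def expand_text(text, character_map, normalization_form=None):
    """Apply character mapping, normalization and escaping to one text node."""
    out = []
    run = []  # consecutive characters that were not substituted by the map

    def flush():
        if run:
            s = "".join(run)
            if normalization_form:
                # Unicode Normalization, only for unmapped characters.
                s = unicodedata.normalize(normalization_form, s)
            # Escaping, again only for unmapped characters.
            s = s.replace("&", "&amp;").replace("<", "&lt;")
            out.append(s)
            run.clear()

    for ch in text:
        if ch in character_map:
            flush()
            # Substituted strings are output as is and skip the later rules.
            out.append(character_map[ch])
        else:
            run.append(ch)
    flush()
    return "".join(out)
```

With the character map {"«": "<%", "»": "%>"}, the text a<b«x» becomes a&lt;b<%x%>: the literal < is escaped, while the < introduced by the map is not.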

5 XML Output Method

The XML output method serializes the normalized sequence as an XML entity that must satisfy the rules for either a well-formed XML document entity, a well-formed XML external general parsed entity, or both. A serialization error [err:SERE0003] results if the serializer is unable to satisfy those rules, except for content modified by the character expansion phase of serialization, as described in 4 Phases of Serialization. The effects of the character expansion phase could result in the serialized output being not well-formed, but will not result in a serialization error. If a serialization error results, the serializer must raise the error.

If the document node of the normalized sequence has a single element node child and no text node children, then the serialized output is a well-formed XML document entity, and the serialized output must conform to the appropriate version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1]. If the normalized sequence does not take this form, then the serialized output is a well-formed XML external general parsed entity, which, when referenced within a trivial XML document wrapper like this:

<?xml version="version"?>
<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>

where entity-URI is a URI for the entity, and the value of the version pseudo-attribute is the value of the version parameter, produces a document which must itself be a well-formed XML document conforming to the corresponding version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1].

[Definition: A reconstructed tree may be constructed by parsing the XML document and converting it into a document node as specified in [XQuery and XPath Data Model (XDM) 4.0][XDM 4.0].] The result of serialization must be such that the reconstructed tree is the same as the result tree except for the following permitted differences:

A consequence of this rule is that certain characters must be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically:

For example, an attribute with the value "x" followed by "y" separated by a newline will result in the output "x&#xA;y" (or with any equivalent character reference). The XML output cannot be "x" followed by a literal newline followed by a "y" because after parsing, the attribute value would be "x y" as a consequence of the XML attribute normalization rules.

Note:

XML 1.0 did not permit an XML processor to normalize U+0085 (NEXT LINE, NEL) or U+2028 (LINE SEPARATOR) characters to a U+000A (NEWLINE) character. However, if a document entity that specifies version 1.1 invokes an external general parsed entity with no text declaration or a text declaration that specifies version 1.0, the external parsed entity is processed according to the rules of XML 1.1. For this reason, U+0085 (NEXT LINE, NEL) and U+2028 (LINE SEPARATOR) characters in text and attribute nodes must always be escaped using character references, regardless of the value of the version parameter.

XML 1.0 permitted control characters in the range U+007F (DELETE) through U+009F (APC) to appear as literal characters in an XML document, but XML 1.1 requires such characters, other than U+0085 (NEXT LINE, NEL), to be escaped as character references. An external general parsed entity with no text declaration or a text declaration that specifies a version pseudo-attribute with value 1.0 that is invoked by an XML 1.1 document entity must follow the rules of XML 1.1. Therefore, the non-whitespace control characters in the ranges U+0001 (SOH) through U+001F (IS1) and U+007F (DELETE) through U+009F (APC) must always be escaped, regardless of the value of the version parameter.
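
The two notes above can be summarized as a character test. The following Python sketch is illustrative only: the function names are hypothetical, it covers only the characters named in these notes (not the additional whitespace rules for attribute values), and it does not consider whether a given character reference is legal in the XML version being output.

```python
def must_use_character_reference(ch):
    """True for characters that the notes above say must always be escaped."""
    cp = ord(ch)
    if cp in (0x85, 0x2028):       # NEXT LINE and LINE SEPARATOR
        return True
    if 0x7F <= cp <= 0x9F:         # DELETE through APC
        return True
    # Non-whitespace control characters in U+0001 through U+001F.
    return 0x01 <= cp <= 0x1F and ch not in "\t\n\r"


def escape_required_characters(text):
    """Replace each such character with a hexadecimal character reference."""
    return "".join(
        "&#x%X;" % ord(ch) if must_use_character_reference(ch) else ch
        for ch in text
    )
```

The result is the same whatever the value of the version parameter, which is the point made by both notes.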

It is a serialization error [err:SEPM0004] to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the input tree contains text nodes or multiple element nodes as children of the root node. The serializer must either raise the error, or recover by ignoring the request to output a document type declaration or standalone parameter.

5.1 The Influence of Serialization Parameters upon the XML Output Method

The serialization parameters that affect the XML output method are listed in the following subsections.

Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the XML output method, or whether the parameter is ignored, or whether it is validated and then ignored.

5.1.3 XML Output Method: the indent and suppress-indentation Parameters

The indent and suppress-indentation parameters control whether the serializer may adjust the whitespace in the serialized result so that a person will find it easier to read. If the indent parameter has the value true, the serializer may output whitespace characters in addition to the whitespace characters in the input tree. It may also elide from the output whitespace characters that occurred in the input tree or replace such whitespace characters with other whitespace characters.

[Definition: The term content has the same meaning as the term content defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags of [XML10].] [Definition: The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.]

If the indent parameter has the value false, the serializer must not add, elide or replace whitespace characters in the output. If the indent parameter has the value true, the serializer must use an algorithm for dealing with whitespace characters that satisfies all of the following constraints. If more than one constraint applies, the serializer must apply the most restrictive constraint. That is, if any applicable constraint indicates that whitespace must not be added, elided or replaced, that constraint prevails; if an applicable constraint indicates that whitespace should not be added, elided or replaced, while all other applicable constraints indicate that whitespace may be added, elided or replaced, whitespace should not be added, elided or replaced.

  • Whitespace characters may be added adjacent to a text node only if the text node contains only whitespace characters. Whitespace characters in such a text node may also be elided or replaced. For example, a tab may be inserted as a replacement for existing spaces.

  • Whitespace characters may be added, elided or replaced in the immediate content of an element whose type annotation is xs:untyped or xs:anyType and that has element node children, in the immediate content of an element whose content model is element only, or outside the content of any element.

  • Whitespace characters must not be added, elided or replaced in the immediate content of an element whose content model is known to be simple or empty.

  • Whitespace characters should not be added, elided or replaced in places where the characters would constitute significant whitespace, for example, in the immediate content of an element that is annotated with a type other than xs:untyped or xs:anyType, and whose content model is known to be mixed.

  • Whitespace characters must not be added, elided or replaced in the content of an element whose expanded QName is a member of the list of expanded QNames in the value of the suppress-indentation parameter.

  • Whitespace characters must not be added, elided or replaced in a part of the result document that is controlled by an xml:space attribute with value preserve. (See [XML10] for more information about the xml:space attribute.)

Note:

The effect of these rules is to ensure that whitespace is added in only those places where (a) XSLT’s <xsl:strip-space> declaration could cause it to be removed, and (b) it does not affect the string value of any element node with simple content. It is usually not safe to indent document types that include elements with mixed content.

Note:

The whitespace added may possibly be based on whitespace stripped from either the source document or the stylesheet (in the case of XSLT), or guided by other means that might depend on the host language, in the case of an input tree created using some other process.
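
One deliberately conservative algorithm that stays within the constraints listed above is sketched below in Python, using xml.etree.ElementTree. It is only one of many possible approaches. It assumes an untyped tree (ElementTree carries no type annotations), the function name indent and its parameters are hypothetical, and the suppress argument is taken to be a set of element names in the '{uri}local' form used by ElementTree.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def _blank(s):
    """True if there is no text here or only whitespace characters."""
    return s is None or s.strip() == ""


def indent(elem, suppress=frozenset(), level=0, step="  "):
    """Indent elem in place where the constraints clearly permit it."""
    if elem.get(XML_SPACE) == "preserve" or elem.tag in suppress:
        return  # leave the whole content of this element untouched
    children = list(elem)
    if not children:
        return  # no element children: content is simple or empty
    if not _blank(elem.text) or any(not _blank(c.tail) for c in children):
        return  # non-whitespace text present: treat as mixed content
    inner = "\n" + step * (level + 1)
    elem.text = inner
    for child in children:
        indent(child, suppress, level + 1, step)
        child.tail = inner
    children[-1].tail = "\n" + step * level
```

The sketch never descends into an element that it declines to indent, which is stricter than the rules require but never adds whitespace where it is prohibited.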

6 XHTML Output Method

Changes in 4.0  

  1. In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">.   [Issue 318 PR 342 14 February 2023]

  2. The default HTML version is now 5. This may result in changes to the serialized output in cases where no explicit HTML version is requested.  [Issue 1889 PR 1977 2 May 2025]

The XHTML output method serializes the input tree as XML, using the HTML compatibility guidelines defined in the XHTML specification ([XHTML 1.0]) or the XHTML syntax of HTML5 (see [HTML5]).

The default value of the html-version serialization parameter for this method is 5.0, and all references to the value of this parameter assume this default when the parameter is absent. The value of the parameter is a decimal, so the values 5 and 5.0 are equivalent.

[Definition: The term with HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is 5.0.]

[Definition: The term prior to HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is less than 5.0.]

[Definition: An element node is recognized as an HTML element by the XHTML output method if either of the following conditions is true:

]

It is entirely the responsibility of the supplier of the input tree to ensure that it conforms to the relevant specification, this being:

It is not an error if the input tree is invalid XHTML. Equally, it is entirely under the control of the supplier of the input tree whether the output conforms to XHTML 1.0 Strict, XHTML 1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5]), [POLYGLOT] or any other specific definition of XHTML.

The serialization of the input tree follows the same rules as for the XML output method, with the general exceptions noted below and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. These differences are based on the HTML compatibility guidelines published in Appendix C of [XHTML 1.0] and on [POLYGLOT], both of which are designed to ensure that as far as possible, XHTML is rendered correctly on user agents designed originally to handle HTML.

With HTML5 the input tree is first subjected to prefix normalization.

[Definition: During prefix normalization, any element node in the input tree that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element’s namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the input tree is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.]

The process of prefix normalization is equivalent to replacing the input tree with the result of the transformation described by this XSLT stylesheet, with the root of the input tree as the initial context value.

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="4.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:mathml="http://www.w3.org/1998/Math/MathML">
  <xsl:template match="xhtml:*|svg:*|mathml:*">
    <xsl:element name="{local-name()}" 
                 namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*|namespace::*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template
    match="namespace::*[. eq 'http://www.w3.org/1999/xhtml']|
           namespace::*[. eq 'http://www.w3.org/2000/svg']|
           namespace::*[. eq 'http://www.w3.org/1998/Math/MathML']"/>
</xsl:stylesheet>

Note:

[POLYGLOT] and Appendix C of [XHTML 1.0] describe a number of compatibility guidelines for users of XHTML who wish to render their XHTML documents with HTML user agents. In some cases, such as the guideline on the form empty elements take, only the serialization process itself has the ability to follow the guideline. In such cases, those guidelines are reflected in the requirements on the serializer described above.

In all other cases, the guidelines can be adhered to by the input tree. The guideline on the use of whitespace characters in attribute values is one such example. Another example is that xml:lang="..." does not serialize to both xml:lang="..." and lang="..." as required by some legacy user agents. It is the responsibility of the person or process that creates the instance of the data model that is input to the serialization process to ensure it is created in a way that is consistent with the guidelines. No serialization error results if the input tree does not adhere to the guidelines.

6.1 The Influence of Serialization Parameters upon the XHTML Output Method

The serialization parameters that affect the XHTML output method are listed in the following subsections.

Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the XHTML output method, or whether the parameter is ignored, or whether it is validated and then ignored.

6.1.13 XHTML Output Method: the escape-uri-attributes Parameter

If the escape-uri-attributes parameter has the value true, the XHTML output method must apply URI escaping to URI attribute values, except that relative URIs must not be absolutized.

Note:

This escaping is deliberately confined to non-ASCII characters, because escaping of ASCII characters is not always appropriate, for example when URIs or URI fragments are interpreted locally by the HTML user agent. Even in the case of non-ASCII characters, escaping can sometimes cause problems. More precise control of URI escaping is therefore available by setting escape-uri-attributes to false, and controlling the escaping of URIs by using methods defined in [Functions and Operators 4.0] Section 7.2 fn:encode-for-uri and [Functions and Operators 4.0] Section 7.4 fn:iri-to-uri.
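
The first two steps of URI escaping (normalization to NFC followed by percent-encoding) can be approximated by the following Python sketch. It assumes that the percent-encoding step leaves the printable ASCII characters U+0020 through U+007E unchanged and replaces every other character with the percent-encoded octets of its UTF-8 representation; the function name escape_html_uri is hypothetical. The third step, escaping according to the rules of the output method, is omitted.

```python
import unicodedata


def escape_html_uri(value):
    """Normalize to NFC, then percent-encode characters outside 32..126."""
    nfc = unicodedata.normalize("NFC", value)
    out = []
    for ch in nfc:
        if 32 <= ord(ch) <= 126:
            out.append(ch)  # printable ASCII is left untouched
        else:
            # Each UTF-8 octet becomes %HH with upper-case hexadecimal digits.
            out.extend("%%%02X" % octet for octet in ch.encode("utf-8"))
    return "".join(out)
```

Note that the value is never resolved against a base URI, so relative URIs remain relative.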

7 HTML Output Method

Changes in 4.0  

  1. In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">.  [Issue 318 PR 342 14 February 2023]

  2. The default HTML version is now 5. This may result in changes to the serialized output in cases where no explicit HTML version is requested.  [Issue 1889 PR 1977 2 May 2025]

The HTML output method serializes the input tree as HTML.

For example, the following XSL stylesheet generates HTML output:

<xsl:stylesheet version="2.0" 
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="4.0"/>
<xsl:template match="/">
  <html>
    <xsl:apply-templates/>
  </html>
</xsl:template>
...
</xsl:stylesheet>

In the example, the version attribute of the xsl:output element indicates the version of the HTML Recommendation [HTML] to which the serialized result is to conform.

[Definition: The requested HTML version is the value of the html-version serialization parameter if present; otherwise the value of the version serialization parameter if present; otherwise 5.0.]

This document provides the normative definition of serialization for the HTML output method if the requested HTML version has the lexical form of a value of type decimal whose value is 1.0 or greater, but no greater than 5.0. For any other requested HTML version, the behavior is implementation-defined. In that case the implementation-defined behavior may supersede all other requirements of this recommendation.

An implementation is required to behave as specified in this document when the requested version is 5.0. If the requested version is greater than or equal to 1 but less than 5.0, then the processor may behave as if the requested version were 5.0.
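
The selection of the requested HTML version, and the test for whether this document defines the behavior normatively, can be sketched in Python as follows. The function names are hypothetical, and params is assumed to be a dictionary holding only those parameters that are present. Python's Decimal constructor accepts a wider lexical space than xs:decimal (for example, exponent notation), so the test shown is an approximation.

```python
from decimal import Decimal, InvalidOperation


def requested_html_version(params):
    """html-version if present, otherwise version if present, otherwise 5.0."""
    for name in ("html-version", "version"):
        if name in params:
            return params[name]
    return "5.0"


def is_normatively_defined(requested):
    """True if the value is a decimal in the range 1.0 to 5.0 inclusive."""
    try:
        value = Decimal(requested)
    except InvalidOperation:
        return False  # not a decimal at all: behavior is implementation-defined
    return value.is_finite() and Decimal("1.0") <= value <= Decimal("5.0")
```

Because the comparison is numeric, the requested versions 5 and 5.0 are treated identically.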

It is entirely the responsibility of the supplier of the input tree to ensure that it conforms to the relevant HTML specification. It is not an error if the input tree is invalid HTML. Equally, it is entirely under the control of the supplier of the input tree whether the output conforms to HTML. If the result tree is valid HTML, the serializer must serialize the result in a way that conforms with the requested HTML version.

7.4 The Influence of Serialization Parameters upon the HTML Output Method

The serialization parameters that affect the HTML output method are listed in the following subsections.

Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the HTML output method, or whether the parameter is ignored, or whether it is validated and then ignored.

7.4.10 HTML Output Method: the escape-uri-attributes Parameter

If the escape-uri-attributes parameter has the value true, the HTML output method must apply URI escaping to URI attribute values, except that relative URIs must not be absolutized.

Note:

This escaping is deliberately confined to non-ASCII characters, because escaping of ASCII characters is not always appropriate, for example when URIs or URI fragments are interpreted locally by the HTML user agent. Even in the case of non-ASCII characters, escaping can sometimes cause problems. More precise control of URI escaping is therefore available by setting escape-uri-attributes to false, and controlling the escaping of URIs by using methods defined in [Functions and Operators 4.0] Section 7.2 fn:encode-for-uri and [Functions and Operators 4.0] Section 7.4 fn:iri-to-uri.

10 Adaptive Output Method

Changes in 4.0  

  1. The serialization of maps retains the order of entries.   [Issue 1651 PR 1703 14 January 2025]

  2. The output of QNames reflects the new syntax for QName literals.   [Issue 2059 PR TODO 23 June 2025]

  3. A JNode is represented as jnode(X) where X is its ·content· property.   [Issue 2087 PR 2114 22 June 2025]

The Adaptive output method serializes the input tree into a human readable form for the purposes of debugging query results. The intention of this is to allow any input value to be serialized without raising a serialization error. Sequence normalization is not performed for this output method.

Each item in the supplied sequence is serialized individually as follows, with an occurrence of the chosen item-separator between successive items.

Character maps are applied (a) when nodes are serialized using the XML output method, and (b) to any value represented as a string enclosed in quotation marks.

Optionally, in all the above constructs, characters whose visual representation is ambiguous (for example tab or non-breaking-space) may be represented in the form of an XML numeric character reference (for example &#x9; or &#xa0;).
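
A minimal Python sketch of this optional substitution follows. The set of characters treated as ambiguous is a choice left to the implementation; the set used here, and the function name show_ambiguous, are assumptions made purely for illustration.

```python
AMBIGUOUS = frozenset({"\t", "\u00a0"})  # example set: tab, non-breaking space


def show_ambiguous(text, ambiguous=AMBIGUOUS):
    """Replace visually ambiguous characters with numeric character references."""
    return "".join(
        "&#x%x;" % ord(ch) if ch in ambiguous else ch
        for ch in text
    )
```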

Note:

In many cases the serialization of an item conforms to the syntax of an XQuery expression whose result is that item. There are exceptions, however. For example, the syntax will not be valid XQuery in the case of free-standing attribute or namespace nodes, or QName values, or anonymous functions; and where it is valid XQuery, the result of evaluating the expression will not necessarily be identical to the original: for example, the distinction between strings and untypedAtomic items is lost.

If any value cannot be output because doing so would cause a serialization error, the behavior is implementation-defined.

If the output is sent to a destination that allows hyperlinks to be included in the generated text, then the serializer may include implementation-dependent hyperlinks to provide additional information for example:

11 Character Maps

The use-character-maps parameter is a list of characters and corresponding string substitutions.

Character maps allow a specific character appearing in a text or attribute node or a string in the input tree to be replaced with a specified string of characters during serialization. The string that is substituted is output "as is," and the serializer performs no checks that the resulting document is well-formed. This mechanism can therefore be used to introduce arbitrary markup in the serialized output. See Section 26.3 Character Maps of [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] for examples of using character mapping in XSLT.

Character mapping is applied to the characters that actually appear in a text or attribute node or a string in the input tree, before any other serialization operations such as escaping or Unicode Normalization are applied. If a character is mapped, then it is not subjected to XML or HTML escaping, nor to Unicode Normalization. The string that is substituted for a character is not validated or processed in any way by the serializer, except for translation into the target encoding. In particular, it is not subjected to XML or HTML escaping, it is not subjected to Unicode Normalization, and it is not subjected to further character mapping.

Character mapping is not applied to characters in text nodes whose parent elements are listed in the cdata-section-elements parameter, nor to characters for which output escaping has been disabled (disabling output escaping is an [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] feature), nor to characters in attribute values that are subject to URI escaping defined for the HTML and XHTML output methods, unless URI escaping has been disabled using the escape-uri-attributes parameter in the output definition.

On serialization, occurrences of a character specified in the use-character-maps parameter in text nodes, attribute values and strings are replaced by the corresponding string from the use-character-maps parameter.

Note:

Using a character map can result in non-well-formed documents if the string contains XML-significant characters. For example, it is possible to create documents containing unmatched start and end tags, references to entities that are not declared, or attributes that contain tags or unescaped quotation marks.

If a character is mapped, then it is not subjected to XML or HTML escaping.

A serialization error [err:SERE0008] occurs if character mapping causes the output of a string containing a character that cannot be represented in the encoding that the serializer is using for output. The serializer must raise the error.
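
The substitution and the associated encoding check can be sketched in Python as follows. The class SerializationError and the function apply_character_map are hypothetical names introduced for illustration, and the encoding argument is assumed to be a codec name known to Python. Characters that are not mapped are passed through unchanged here; in a full serializer they would go on to normalization and escaping.

```python
class SerializationError(Exception):
    """Carries the local part of a serialization error code."""

    def __init__(self, code, message):
        super().__init__("%s: %s" % (code, message))
        self.code = code


def apply_character_map(text, character_map, encoding):
    """Substitute mapped characters in a single pass over the input."""
    out = []
    for ch in text:
        replacement = character_map.get(ch)
        if replacement is None:
            out.append(ch)
            continue
        try:
            # The substituted string is only translated into the target
            # encoding; it is not escaped, normalized or mapped again.
            replacement.encode(encoding)
        except UnicodeEncodeError:
            raise SerializationError(
                "SERE0008",
                "mapped string cannot be represented in " + encoding,
            )
        out.append(replacement)
    return "".join(out)
```

Because the input is traversed once, a map of {"a": "b", "b": "c"} turns ab into bc rather than cc, reflecting the rule that substituted strings are not subjected to further character mapping.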

12 Conformance

Serialization is intended primarily as a component of a host language. [Definition: A host language is another specification that includes, by reference, this specification and all of its requirements. A host language might be a programming language such as [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] or [XQuery 4.0: An XML Query Language], or it might be an application programming interface (API) intended to be used by programs written in some other high-level programming language. The use of the term language is not intended to preclude the possibility that this specification might be referenced outside the context of a programming language specification.] This document relies on specifications that use it to specify conformance criteria for Serialization in their respective environments. Specifications that set conformance criteria for their use of Serialization must not change the semantic definitions of Serialization as given in this specification, except by subsetting and/or compatible extensions. It is the responsibility of the host language to specify how serialization errors are to be handled.

Certain facilities in this specification are described as producing implementation-defined results. A claim that asserts conformance with this specification must be accompanied by documentation stating the effect of each implementation-defined feature. For convenience, a non-normative checklist of implementation-defined features is provided at F.1 Checklist of Implementation-Defined Features.

A References

A.1 Normative References

Character Model for the World Wide Web 1.0: Normalization
Character Model for the World Wide Web 1.0: Normalization, François Yergeau, Martin Dürst, Richard Ishida, et al., Editors. World Wide Web Consortium, 01 May 2012. This version is http://www.w3.org/TR/2012/WD-charmod-norm-20120501/. The latest version is available at http://www.w3.org/TR/charmod-norm/.
XQuery and XPath Data Model (XDM) 4.0
XDM 4.0
XQuery and XPath Data Model (XDM) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery and XPath Functions and Operators 4.0
XQuery and XPath Functions and Operators 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
HTML5
HTML5, Robin Berjon, Steve Faulkner, Travis Leithead, et al., Editors. World Wide Web Consortium, 04 Feb 2014. This version is http://www.w3.org/TR/2014/CR-html5-20140204/. The latest version is available at http://www.w3.org/TR/html5/.
HTML
HTML 4.01 Specification, Dave Raggett, Arnaud Le Hors, and Ian Jacobs, Editors. World Wide Web Consortium, 24 Dec 1999. This version is http://www.w3.org/TR/1999/REC-html401-19991224/. The latest version is available at http://www.w3.org/TR/html401/.
POLYGLOT
Polyglot Markup: A robust profile of the HTML5 vocabulary, Eliot Graff and Leif Halvard Silli, Editors. World Wide Web Consortium, 04 Feb 2014. This version is http://www.w3.org/TR/2014/WD-html-polyglot-20140204/. The latest version is available at http://www.w3.org/TR/html-polyglot/.
IANA
Character Sets. Internet Assigned Numbers Authority. Oct 2012.
RFC2046
Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, N. Freed, N. Borenstein. Network Working Group, IETF, Nov 1996.
RFC2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
RFC2978
IANA Charset Registration Procedures, N. Freed and J. Postel. Network Working Group, IETF, Oct 2000.
RFC2854
The 'text/html' Media Type, D. Connolly, L. Masinter. Network Working Group, IETF, Jun 2000.
RFC3236
The 'application/xhtml+xml' Media Type, M. Baker and P. Stark. Network Working Group, IETF, Jan 2002.
Unicode Encoding
Unicode Character Encoding Model, Unicode Consortium. Unicode Standard Annex #17.
UAX #15: Unicode Normalization Forms
Unicode Normalization Forms, Unicode Consortium. Unicode Standard Annex #15.
JSON Lines
JSON Lines. Maintained by Ian Ward.
XHTML 1.0
XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), Steven Pemberton, Editor. World Wide Web Consortium, 01 Aug 2002. This version is http://www.w3.org/TR/2002/REC-xhtml1-20020801. The latest version is available at http://www.w3.org/TR/xhtml1/.
XHTML 1.1
XHTML™ 1.1 - Module-based XHTML - Second Edition, Shane McCarron and Masayasu Ishikawa, Editors. World Wide Web Consortium, 23 Nov 2010. This version is http://www.w3.org/TR/2010/REC-xhtml11-20101123. The latest version is available at http://www.w3.org/TR/xhtml11/.
XML10
Extensible Markup Language (XML) 1.0 (Fifth Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 26 Nov 2008. This version is http://www.w3.org/TR/2008/REC-xml-20081126/. The latest version is available at http://www.w3.org/TR/xml.
XML11
Extensible Markup Language (XML) 1.1 (Second Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml11-20060816. The latest version is available at http://www.w3.org/TR/xml11/.
XML Names
Namespaces in XML 1.0 (Third Edition), Tim Bray, Dave Hollander, Andrew Layman, et al., Editors. World Wide Web Consortium, 08 Dec 2009. This version is http://www.w3.org/TR/2009/REC-xml-names-20091208/. The latest version is available at http://www.w3.org/TR/xml-names/.
XML Names 1.1
Namespaces in XML 1.1 (Second Edition), Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available at http://www.w3.org/TR/xml-names11/.
XML Schema
XML Schema Part 1: Structures Second Edition, Henry Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
XML Path Language (XPath) 4.0
XPath 4.0
XML Path Language (XPath) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery 4.0: An XML Query Language
XQuery 4.0: An XML Query Language, XSLT Extensions Community Group, World Wide Web Consortium.
XSL Transformations (XSLT) Version 4.0
XSLT 4.0
XSL Transformations (XSLT) Version 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
RFC 7159
IETF. RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format, T. Bray, Editor. Internet Engineering Task Force, March 2014. Available at: http://www.rfc-editor.org/rfc/rfc7159.txt

A.2 Informative References

The JSON Data Interchange Format
The JSON Data Interchange Format, ECMA International.
XHTML Modularization
XHTML™ Modularization 1.1 - Second Edition, Shane McCarron, Editor. World Wide Web Consortium, 29 Jul 2010. This version is http://www.w3.org/TR/2010/REC-xhtml-modularization-20100729. The latest version is available at http://www.w3.org/TR/xhtml-modularization/.
XQuery 1.0 and XPath 2.0 Data Model
XDM 4.0
XQuery 1.0 and XPath 2.0 Data Model (XDM) (Second Edition), Norman Walsh, Mary Fernández, Ashok Malhotra, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/. The latest version is available at https://www.w3.org/TR/xpath-datamodel/.
XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)
XSLT 2.0 and XQuery 1.0 Serialization (Second Edition), W3C Recommendation, Henry Zongaro, Norman Walsh, Joanne Tong, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is http://www.w3.org/TR/2010/REC-xslt-xquery-serialization-20101214/
XSLT and XQuery Serialization 4.0 (First Public Working Draft)
XSLT and XQuery Serialization, W3C First Public Working Draft, Andrew Coleman, C. M. Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 24 April 2014.

C Summary of Error Conditions

This document uses the err prefix which represents the same namespace URI (http://www.w3.org/2005/xqt-errors) as defined in [XML Path Language (XPath) 4.0][XPath 4.0]. Use of this namespace prefix binding in this document is not normative.

err:SENR0001

It is an error if an item in S6 in sequence normalization is an attribute node, a namespace node, or a function.

err:SERE0003

It is an error if the serializer is unable to satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both, except for content modified by the character expansion phase of serialization.

err:SEPM0004

It is an error to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the instance of the data model contains text nodes or multiple element nodes as children of the root node.

err:SERE0005

It is an error if the serialized result would contain an NCName that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter.

err:SERE0006

It is an error if the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter.

err:SESU0007

It is an error if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding.

err:SERE0008

It is an error if a character that cannot be represented in the encoding that the serializer is using for output appears in a context where character references are not allowed (for example if the character occurs in the name of an element).

err:SEPM0009

It is an error if the omit-xml-declaration parameter has the value yes, true or 1, and the standalone parameter has a value other than omit; or the version parameter has a value other than 1.0 and the doctype-system parameter is specified.

err:SEPM0010

It is an error if the output method is xml or xhtml, the value of the undeclare-prefixes parameter is one of yes, true or 1, and the value of the version parameter is 1.0.

err:SESU0011

It is an error if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer.

err:SERE0012

It is an error if the value of the normalization-form parameter is fully-normalized and any relevant construct of the result begins with a combining character.

err:SESU0013

It is an error if the serializer does not support the version of XML or HTML specified by the version parameter.

err:SERE0014

It is an error to use the HTML output method if characters which are permitted in XML but not in HTML appear in the instance of the data model.

err:SERE0015

It is an error to use the HTML output method when > appears within a processing instruction in the data model instance being serialized.

err:SEPM0016

It is an error if a parameter value is invalid for the defined domain.

err:SEPM0017

It is an error if evaluating an expression in order to extract the setting of a serialization parameter from a data model instance would yield an error.

err:SEPM0018

It is an error if evaluating an expression in order to extract the setting of the use-character-maps serialization parameter from a data model instance would yield a sequence of length greater than one.

err:SEPM0019

It is an error if an instance of the data model used to specify the settings of serialization parameters specifies the value of the same parameter more than once.

err:SERE0020

It is an error if a numeric value being serialized using the JSON output method cannot be represented in the JSON grammar (e.g. +INF, -INF, NaN).

err:SERE0021

It is an error if a sequence being serialized using the JSON output method includes items for which no rules are provided in the appropriate section of the serialization rules.

err:SERE0022

It is an error if a map being serialized using the JSON output method has two keys with the same string value, unless the allow-duplicate-names parameter has the value yes, true or 1.

err:SERE0023

It is an error if a sequence being serialized using the JSON output method is of length greater than one.
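
The JSON-related conditions err:SERE0020, err:SERE0022 and err:SERE0023 can be pictured as simple pre-checks, as in the non-normative Python sketch below. All function names and the SerializationError class are hypothetical. Python's str() is used as a stand-in for the XDM string value of a map key, which is an assumption of the sketch rather than a rule of this specification.

```python
import math


class SerializationError(Exception):
    """Hypothetical exception type carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_json_number(value):
    """Reject numeric values with no JSON representation (+INF, -INF, NaN)."""
    if not math.isfinite(value):
        raise SerializationError("err:SERE0020", f"{value!r} is not valid in JSON")


def check_json_map_keys(keys, allow_duplicate_names=False):
    """Reject two keys with the same string value unless duplicates are allowed."""
    seen = set()
    for key in keys:
        name = str(key)  # stand-in for the XDM string value of the key
        if name in seen and not allow_duplicate_names:
            raise SerializationError("err:SERE0022", f"duplicate key {name!r}")
        seen.add(name)


def check_json_sequence(items):
    """Reject a sequence of length greater than one."""
    if len(items) > 1:
        raise SerializationError("err:SERE0023", "sequence has more than one item")
```

In this sketch the integer key 1 and the string key "1" collide, because both have the string value 1.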

E Glossary (Non-Normative)

array item

The term array item is defined in [XDM 4.0], Section 8.3 Array Items.

atomize

The term atomization is defined in [XPath 4.0], Section 2.5.3 Atomization.

character

The term character is defined in [XDM 4.0], Section 4.1.5 XML and XSD Versions.

codepoint

The term codepoint is defined in [XDM 4.0], Section 4.1.5 XML and XSD Versions.

content

The term content has the same meaning as the term content defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags of [XML10].

EMPTY

The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.

expanded QName

The term expanded QName is defined in [XPath 4.0], Section 2 Basics. An expanded QName consists of an optional namespace URI and a local name. An expanded QName also retains its original namespace prefix (if any), to facilitate casting the expanded QName into a string.

expected-empty

An element node is expected to be empty if it is recognized as an HTML element and:

function item

The term function item is defined in [XDM 4.0], Section 8.1 Function Items.

host language

A host language is another specification that includes, by reference, this specification and all of its requirements. A host language might be a programming language such as [XSL Transformations (XSLT) Version 4.0][XSLT 4.0] or [XQuery 4.0: An XML Query Language], or it might be an application programming interface (API) intended to be used by programs written in some other high-level programming language. The use of the term language is not intended to preclude the possibility that this specification might be referenced outside the context of a programming language specification.

immediate content

The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.

implementation-defined

Implementation-defined indicates an aspect that may differ between serializers, but whose actual behavior must be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.

implementation-dependent

Implementation-dependent indicates an aspect that may differ between serializers, and whose actual behavior is not required to be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.

input tree

In general the output of the serializer will represent the items actually present in the input value, together with other items that are reachable from these, for example (in the case of nodes) their descendants. The complete set of items that are represented in the output of the serializer is referred to (without loss of generality) as the input tree.

input value

The XDM value supplied as input to the serializer is referred to as the input value.

map item

The term map item is defined in [XDM 4.0], Section 8.2 Map Items.

MathML namespace

the MathML namespace, http://www.w3.org/1998/Math/MathML.

node

The term node is defined in [XPath 4.0], Section 2 Basics. There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.

non-null namespace URI

An element or attribute that does not have a null namespace URI is referred to as having a non-null namespace URI.

null namespace URI

An expanded-QName whose namespace part is an empty sequence, or an element or attribute whose name expands to such an expanded-QName, is referred to as having a null namespace URI.

Output declaration namespace

the Output declaration namespace, http://www.w3.org/2010/xslt-xquery-serialization

parameter document

An output:serialization-parameters element node used to hold the settings of serialization parameters is referred to as a parameter document.

prefix normalization

During prefix normalization, any element node in the input tree that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element’s namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the input tree is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.

prior to HTML5

The term prior to HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is less than 5.0.

recognized as an HTML element

An element node is recognized as an HTML element by the XHTML output method if either of the following conditions is true:

reconstructed tree

A reconstructed tree may be constructed by parsing the XML document and converting it into a document node as specified in [XQuery and XPath Data Model (XDM) 4.0][XDM 4.0].

requested HTML version

The requested HTML version is the value of the html-version serialization parameter if present; otherwise the value of the version serialization parameter if present; otherwise 5.0.
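
The fallback order in this definition can be sketched as follows (non-normative). The function name requested_html_version is hypothetical, and the sketch assumes that the supplied parameters are held in a dictionary keyed by parameter name, with absent parameters simply missing, and that both values can be parsed as decimals.

```python
from decimal import Decimal


def requested_html_version(params):
    """Return the requested HTML version for a dict of supplied parameters."""
    if "html-version" in params:
        return Decimal(params["html-version"])
    if "version" in params:
        return Decimal(params["version"])
    # Neither parameter was supplied.
    return Decimal("5.0")
```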

result tree

The result of the sequence normalization process is a result tree.

sequence

The term sequence is defined in [XPath 4.0], Section 2 Basics. A sequence is an ordered collection of zero or more items.

sequence normalization

The purpose of sequence normalization is to create a sequence that can be serialized as a well-formed XML document or external general parsed entity, that also reflects the content of the input sequence to the extent possible.

serialization error

In some instances, the input tree cannot be successfully converted into a sequence of octets given the set of serialization parameter (3 Serialization Parameters) values specified. A serialization error is said to occur in such an instance.

serialized as an HTML element

An element node is serialized as an HTML element if

serializer

As is indicated in 12 Conformance, conformance criteria for serialization are determined by other specifications that refer to this specification. A serializer is software that implements some or all of the requirements of this specification in accordance with such conformance criteria.

string

The term string is defined in [XDM 4.0], Section 4.1.5 XML and XSD Versions.

string value

The term string value is defined in [XDM 4.0], Section 7.5.12 string-value Accessor. Every node has a string value. For example, the string value of an element is the concatenation of the string values of all its descendant text nodes.
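
As a non-normative illustration, the string value of an element can be approximated with Python's standard xml.etree.ElementTree module. ElementTree is not an XDM implementation, so the sketch only mirrors the concatenation rule stated above; the helper name element_string_value is hypothetical.

```python
import xml.etree.ElementTree as ET


def element_string_value(element):
    """Concatenate the text of all descendant text nodes in document order."""
    return "".join(element.itertext())


para = ET.fromstring("<p>Hello <b>big</b> world</p>")
print(element_string_value(para))  # prints "Hello big world"
```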

SVG namespace

the SVG namespace, http://www.w3.org/2000/svg

to a JSON string

Whenever a value is serialized to a JSON string, the following procedure is applied to the supplied string:

  1. Any character in the string for which character mapping is defined (see 11 Character Maps) is substituted by the replacement string defined in the character map.

  2. Any other character in the input string (but not a character produced by character mapping) is a candidate for Unicode Normalization if requested by the normalization-form parameter, and JSON escaping. JSON escaping replaces the characters quotation mark, backspace, form-feed, newline, carriage return, tab, or reverse solidus by the corresponding JSON escape sequences \", \b, \f, \n, \r, \t, or \\ respectively, and any other codepoint in the range 1-31 or 127-159 by an escape in the form \uHHHH where HHHH is the hexadecimal representation of the codepoint value. Escaping further replaces the solidus character (/) by the escape sequence \/ if the escape-solidus parameter is set to true, but not if it is set to false. Escaping is also applied to any characters that cannot be represented in the selected encoding.

  3. The resulting string is enclosed in double quotation marks.
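
The three steps above can be sketched in Python as follows (non-normative). The names to_json_string, JSON_ESCAPES, _encodable and _u_escape are hypothetical. The sketch omits Unicode Normalization, takes the character map as a dictionary from single characters to replacement strings, and assumes that characters outside the Basic Multilingual Plane are written as a pair of \uHHHH escapes for their UTF-16 surrogates, as JSON itself prescribes; that last point is not stated in the definition above.

```python
JSON_ESCAPES = {
    '"': '\\"', "\b": "\\b", "\f": "\\f", "\n": "\\n",
    "\r": "\\r", "\t": "\\t", "\\": "\\\\",
}


def _encodable(ch, encoding):
    """Return True if ch can be represented in the selected encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def _u_escape(ch):
    """Render ch as one \\uHHHH escape per UTF-16 code unit."""
    data = ch.encode("utf-16-be", "surrogatepass")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04X}" for unit in units)


def to_json_string(value, escape_solidus, char_map=None, encoding="utf-8"):
    """Serialize a string to a JSON string following steps 1 to 3."""
    char_map = char_map or {}
    out = []
    for ch in value:
        cp = ord(ch)
        if ch in char_map:
            # Step 1: mapped characters are replaced and not escaped further.
            out.append(char_map[ch])
        elif ch in JSON_ESCAPES:
            # Step 2: named JSON escape sequences.
            out.append(JSON_ESCAPES[ch])
        elif ch == "/" and escape_solidus:
            out.append("\\/")
        elif 1 <= cp <= 31 or 127 <= cp <= 159 or not _encodable(ch, encoding):
            out.append(_u_escape(ch))
        else:
            out.append(ch)
    # Step 3: enclose the result in double quotation marks.
    return '"' + "".join(out) + '"'
```

Checking the character map before any escaping reflects the rule that a character produced by character mapping is not a candidate for JSON escaping.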

Unicode Normalization

Unicode Normalization is the process of removing alternative representations of equivalent sequences from textual data, to convert the data into a form that can be binary-compared for equivalence, as specified in [UAX #15: Unicode Normalization Forms]. For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization].

URI attribute values

The values of attributes listed in D List of URI Attributes are URI attribute values. Attributes are not considered to be URI attributes simply because they are namespace declaration attributes or have the type annotation xs:anyURI.

URI Escaping

URI escaping consists of the following three steps applied in sequence to the content of URI attribute values:

void

The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.

whitespace character

A space character, TAB character, CR character or NL character is referred to as a whitespace character.

with HTML5

The term with HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is 5.0.

without regard to case

Where this specification indicates that two strings are to be compared without regard to case, the serializer must translate any characters in the range U+0041 (LATIN CAPITAL LETTER A, A) through U+005A (LATIN CAPITAL LETTER Z, Z) inclusive, to the corresponding lower-case letters in the range U+0061 (LATIN SMALL LETTER A, a) through U+007A (LATIN SMALL LETTER Z, z) only for the purposes of making the comparison. The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.

XHTML namespace

the XHTML namespace, http://www.w3.org/1999/xhtml

XML Island

The portion of the serialized document representing the result of serializing an element that is not to be serialized as an HTML element is known as an XML Island.

XML namespace

the XML namespace, http://www.w3.org/XML/1998/namespace