XSLT and XQuery Serialization 4.0

1 Introduction

This document defines serialization of the W3C XQuery and XPath Data Model 4.0 (XDM), which is the data model of at least [XML Path Language (XPath) 4.0], [XSL Transformations (XSLT) Version 4.0], and [XQuery 4.0: An XML Query Language], and any other specifications that reference it.

In this document, examples and material labeled as “Note” are provided for explanatory purposes and are not normative.

Serialization is the process of converting an instance of the [XQuery and XPath Data Model (XDM) 4.0] into a sequence of octets.

[Definition: The XDM value supplied as input to the serializer is referred to as the input value.] Some serialization methods apply only to certain types of input value.

Note:

Where serialization is used to process the result of an XQuery evaluation or an XSLT transformation, the input value of the serializer corresponds to the output from XQuery or XSLT.

[Definition: In general the output of the serializer will represent the items actually present in the input value, together with other items that are reachable from these, for example (in the case of nodes) their descendants. The complete set of items that are represented in the output of the serializer is referred to (without loss of generality) as the input tree.]

1.1 Terminology

Changes in 4.0

The term atomic value has been replaced by atomic item. [Issue 1337 2 August 2024]

In this specification, where they are rendered in small capitals, the words must, must not, should, should not, may, required, and recommended are to be interpreted as described in [RFC2119].

[Definition: As is indicated in 12 Conformance, conformance criteria for serialization are determined by other specifications that refer to this specification. A serializer is software that implements some or all of the requirements of this specification in accordance with such conformance criteria.] A serializer is not required to directly provide a programming interface that permits a user to set serialization parameters or to provide an input sequence for serialization.

Certain aspects of serialization are described in this specification as implementation-defined or implementation-dependent.

[Definition: Implementation-defined indicates an aspect that may differ between serializers, but whose actual behavior must be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.]

[Definition: Implementation-dependent indicates an aspect that may differ between serializers, and whose actual behavior is not required to be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.]

[Definition: In some instances, the input tree cannot be successfully converted into a sequence of octets given the set of serialization parameter (3 Serialization Parameters) values specified. A serialization error is said to occur in such an instance.] In some cases, a serializer is required to raise such an error. What it means to raise a serialization error is determined by the relevant conformance criteria (12 Conformance) to which the serializer conforms. In other cases, there is an implementation-defined choice between raising a serialization error and performing a recovery action. Such a recovery action will allow a serializer to produce a sequence of octets that might not fully reflect the usual requirements of the parameter settings that are in effect.

[Definition: Where this specification indicates that two strings are to be compared without regard to case, the serializer must translate any characters in the range U+0041 (LATIN CAPITAL LETTER A, A) through U+005A (LATIN CAPITAL LETTER Z, Z) inclusive, to the corresponding lower-case letters in the range U+0061 (LATIN SMALL LETTER A, a) through U+007A (LATIN SMALL LETTER Z, z) only for the purposes of making the comparison. The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.]
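The comparison defined above can be illustrated with a minimal Python sketch. The function names `fold_ascii_upper` and `equal_without_regard_to_case` are hypothetical and are not defined by this specification. The sketch folds only the 26 ASCII capital letters, which is why it does not rely on a general-purpose lower-casing routine.

```python
def fold_ascii_upper(s: str) -> str:
    """Map only U+0041..U+005A to U+0061..U+007A; leave all other characters alone."""
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch
        for ch in s
    )


def equal_without_regard_to_case(a: str, b: str) -> bool:
    """Compare two strings code point by code point after ASCII-only folding."""
    fa, fb = fold_ascii_upper(a), fold_ascii_upper(b)
    return len(fa) == len(fb) and all(ord(x) == ord(y) for x, y in zip(fa, fb))
```

Under this definition `UTF-8` and `utf-8` compare equal, whereas `É` and `é` do not, because characters outside the range A to Z are never translated.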

Many terms used in this document are defined in the XPath specification [XML Path Language (XPath) 4.0] or the Data Model specification [XQuery and XPath Data Model (XDM) 4.0]. Particular attention is drawn to the following:

[Definition: The term atomization is defined in Section 2.5.3 Atomization^XP40.]
[Definition: The term node is defined as part of Section 5 Nodes^DM40. There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.]
[Definition: The term sequence is defined in Section 2 Basics^XP40. A sequence is an ordered collection of zero or more items.]
[Definition: The term function item is defined in Section 2.9.4 Function Items^DM40.]
[Definition: The term map item is defined in Section 2.9.5 Map Items^DM40.]
[Definition: The term array item is defined in Section 2.9.6 Array Items^DM40.]
[Definition: The term string is defined in Section 2.8.4 XML and XSD Versions^DM40.]
[Definition: The term character is defined in Section 2.8.4 XML and XSD Versions^DM40.]
[Definition: The term codepoint is defined in Section 2.8.4 XML and XSD Versions^DM40.]
[Definition: The term string value is defined in Section 4.12 string-value Accessor^DM40. Every node has a string value. For example, the string value of an element is the concatenation of the string values of all its descendant text nodes.]
[Definition: The term expanded QName is defined in Section 2 Basics^XP40. An expanded QName consists of an optional namespace URI and a local name. An expanded QName also retains its original namespace prefix (if any), to facilitate casting the expanded QName into a string.]
[Definition: An expanded-QName whose namespace part is an empty sequence, or an element or attribute whose name expands to such an expanded-QName, is referred to as having a null namespace URI.]
[Definition: An element or attribute that does not have a null namespace URI is referred to as having a non-null namespace URI.]
[Definition: A space character, TAB character, CR character or NL character is referred to as a whitespace character.]

Where this specification indicates that an XSLT instruction is evaluated, the behavior is as specified by [XSL Transformations (XSLT) Version 4.0]. Where it indicates that an XQuery expression is evaluated, the behavior is as specified by [XQuery 4.0: An XML Query Language].

1.2 Namespaces

This specification refers to several namespaces that affect the process of serialization. These are:

[Definition: the Output declaration namespace, http://www.w3.org/2010/xslt-xquery-serialization];
[Definition: the XML namespace, http://www.w3.org/XML/1998/namespace];
[Definition: the XHTML namespace, http://www.w3.org/1999/xhtml];
[Definition: the SVG namespace, http://www.w3.org/2000/svg]; and
[Definition: the MathML namespace, http://www.w3.org/1998/Math/MathML.]

Wherever an element node or attribute node is said to be in a particular namespace, it is understood that the namespace URI of the node is equal to the namespace URI corresponding to that namespace. Wherever a namespace node is said to be a namespace node for a particular namespace, it is understood that the string value of the node is equal to the namespace URI corresponding to that namespace.

2 Sequence Normalization

The input value is a sequence. Prior to serializing a sequence using any of the output methods whose behavior is specified by this document (3 Serialization Parameters), with the exception of the JSON and Adaptive output methods, the serializer must first compute a normalized sequence for serialization; it is the normalized sequence that is actually serialized. [Definition: The purpose of sequence normalization is to create a sequence that can be serialized as a well-formed XML document or external general parsed entity, that also reflects the content of the input sequence to the extent possible.] [Definition: The result of the sequence normalization process is a result tree.]

The normalized sequence for serialization is constructed by applying all of the following rules in order, with the input value being input to the first step, and the sequence that results from any step being used as input to the subsequent step. For any implementation-defined output method, it is implementation-defined whether this sequence normalization process takes place. For the JSON and Adaptive output methods, sequence normalization must not take place.

Where the process of converting the input sequence to a normalized sequence indicates that a value must be cast to xs:string, that operation is defined in Section 20.1.2 Casting to xs:string^FO40 of [XQuery and XPath Functions and Operators 4.0]. Where a step in the sequence normalization process indicates that a node should be copied, the copy is performed in the same way as an XSLT xsl:copy-of instruction that has a validation attribute whose value is preserve and has a select attribute whose effective value is the node, as described in Section 11.9.2 Deep Copy^XT40 of [XSL Transformations (XSLT) Version 4.0], or equivalently in the same way as an XQuery content expression as described in Step 1e of Section 4.12.1.3 Content^XQ40 of [XQuery 4.0: An XML Query Language], where the construction mode is preserve. Let S₀ be the sequence that is input to serialization. The steps in computing the normalized sequence are:

Create a new sequence S₁ from S₀ as follows. For each item in S₀, if the item is an array, copy the results of passing the item into the function array:flatten(); otherwise, copy the item itself. If S₀ is empty, let S₁ consist of a zero-length string.
Create a new sequence S₂ from S₁ as follows. For each item in S₁, if the item is atomic, copy to S₂ only the lexical representation resulting from casting the item to an xs:string, otherwise, copy the item to S₂.
Create a new sequence S₃ from S₂ as follows. If the item-separator serialization parameter is present, then copy each item in S₂ to S₃, inserting between each pair of items a string whose value is equal to the value of the item-separator parameter. If the item-separator serialization parameter is absent, then first maximally group the items in S₂ into subsequences of xs:string items and non-xs:string items. For each group of items, if the group is a subsequence of non-xs:string items, copy the subsequence to S₃; if the group is a subsequence of xs:string items, copy to S₃ the results of passing to fn:string-join() the subsequence and a string consisting of a single space as the function’s two parameters.
Create a new sequence S₄ from S₃ as follows. For each item in S₃, if the item is a string, copy to S₄ a text node whose string value is equal to the string; otherwise, copy the item to S₄.
Create a new sequence S₅ from S₄ as follows. For each item in S₄, if the item is a document node, copy its children to S₅; otherwise, copy the item to S₅.
Create a new sequence S₆ from S₅ as follows. First, remove any text nodes with values of zero length from S₅, then maximally group the results into groups of text nodes and non-text nodes. For each group of items, if the group is a subsequence of text nodes, copy to S₆ a single text node whose value is equal to the concatenated values of the subsequence; if the group is a subsequence of non-text nodes, copy the subsequence of items to S₆. It is a serialization error [err:SENR0001] if any item in S₆ is an attribute node, a namespace node, or a function.
Create a new sequence S₇ from S₆ as follows. Let S₇ be a single document node. Copy sequence S₆ to the document node as its children.

S₇ is the normalized sequence.

The result tree rooted at the document node that is created by the final step of this sequence normalization process is the value to which the rules of the appropriate output method are applied. If the sequence normalization process results in a serialization error, the serializer must raise the error.
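The steps of sequence normalization can be sketched in Python as follows. This is an illustrative model only, not part of this specification: the `Node`, `Array` and `SerializationError` classes and the functions `flatten`, `to_xs_string` and `normalize` are hypothetical stand-ins, Python callables stand in for function items, node copying is omitted, and the cast to xs:string is only approximated. When item_separator is absent (`None`), adjacent strings are joined with a single space, matching the behavior of a document node constructor.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "document", "element", "text", "attribute", ...
    value: str = ""                # string content, used here for text nodes
    children: list = field(default_factory=list)


class Array(list):
    """Stand-in for an XDM array; members may themselves be arrays."""


class SerializationError(Exception):
    pass


def flatten(item):
    """Rough analogue of array:flatten() for the Array stand-in."""
    if isinstance(item, Array):
        for member in item:
            yield from flatten(member)
    else:
        yield item


def to_xs_string(atom) -> str:
    """Approximate cast to xs:string; only booleans get special treatment."""
    if isinstance(atom, bool):
        return "true" if atom else "false"
    return str(atom)


def normalize(s0, item_separator=None) -> Node:
    # Step 1: flatten arrays; an empty input becomes one zero-length string.
    s1 = [x for item in s0 for x in flatten(item)] if s0 else [""]
    # Step 2: atomic items become strings; nodes and functions pass through.
    s2 = [x if isinstance(x, Node) or callable(x) else to_xs_string(x) for x in s1]
    # Step 3: insert the separator, or join runs of adjacent strings with a space.
    s3 = []
    if item_separator is not None:
        for i, x in enumerate(s2):
            if i > 0:
                s3.append(item_separator)
            s3.append(x)
    else:
        for x in s2:
            if isinstance(x, str) and s3 and isinstance(s3[-1], str):
                s3[-1] = s3[-1] + " " + x
            else:
                s3.append(x)
    # Step 4: strings become text nodes.
    s4 = [Node("text", x) if isinstance(x, str) else x for x in s3]
    # Step 5: document nodes are replaced by their children.
    s5 = []
    for x in s4:
        if isinstance(x, Node) and x.kind == "document":
            s5.extend(x.children)
        else:
            s5.append(x)
    # Step 6: drop empty text nodes, merge adjacent text nodes, reject bad items.
    s6 = []
    for x in s5:
        if callable(x) or x.kind in ("attribute", "namespace"):
            raise SerializationError("err:SENR0001")
        if x.kind == "text":
            if x.value == "":
                continue
            if s6 and s6[-1].kind == "text":
                s6[-1] = Node("text", s6[-1].value + x.value)
                continue
        s6.append(x)
    # Step 7: wrap everything in a single document node.
    return Node("document", children=s6)
```

In this model, `normalize([1, 2], item_separator=",")` yields a document node with one text node child whose value is `1,2`, and `normalize([])` yields a document node with no children.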

Note:

If the item-separator serialization parameter is absent, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:

<xsl:document>
  <xsl:copy-of select="$seq" validation="preserve"/>
</xsl:document>

or the XQuery expression:

declare construction preserve;

document { $seq }

If the item-separator serialization parameter is present, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:

<xsl:document>
  <xsl:for-each select="$seq">
    <xsl:sequence select="if (position() gt 1) 
                          then $sep 
                          else ()"/>

    <xsl:choose>
      <xsl:when test=". instance of node()">
        <xsl:sequence select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:document>

or the XQuery expression:

declare construction preserve; 

document {
  for $item at $pos in $seq
  let $node := 
    if ($item instance of node()) then 
      $item 
    else 
      text { $item }
  return
    if ($pos eq 1) then
      $node
    else
      ($sep, $node)  
}

where the value of the sep variable is a string whose value is equal to the value of the item-separator serialization parameter.

This process results in a serialization error [err:SENR0001] if $seq contains functions, attribute nodes or namespace nodes.

3 Serialization Parameters

Changes in 4.0

Added the escape-solidus parameter for JSON serialization. [Issue 530 PR 534 6 June 2023]

There are a number of parameters that influence how serialization is performed. Host languages may allow users to specify any or all of these parameters, but they are not required to be able to do so. However, the host language specification must specify how the values of all applicable parameters are to be determined.

It is a serialization error [err:SEPM0016] if a parameter value is invalid for the given parameter. It is the responsibility of the host language to specify how invalid values should be handled at the level of that language.

The following serialization parameters are defined:

Serialization parameters
Serialization parameter name	Permitted values for parameter
`allow-duplicate-names`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`. This parameter indicates whether a map item serialized as a JSON object using the JSON output method is allowed to contain duplicate member names. If the value `no`, `false` or `0` is specified, a serialization error [err:SERE0022] may be raised under certain conditions.
`byte-order-mark`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`. This parameter indicates whether the serialized sequence of octets is to be preceded by a Byte Order Mark (See Section 5.1 of [Unicode Encoding]). The actual octet order used is implementation-dependent. If the encoding defines no Byte Order Mark, or if the Byte Order Mark is prohibited for the specific Unicode encoding or implementation environment, then this parameter is ignored.
`cdata-section-elements`	A list of expanded QNames, possibly empty.
`doctype-public`	A string of PubidChar^XML characters. This parameter may be absent.
`doctype-system`	A string of Unicode characters that does not include both the characters U+0027 (APOSTROPHE, `'`) and U+0022 (QUOTATION MARK, `"`). This parameter may be absent.
`encoding`	A string of Unicode characters in the range U+0021 (EXCLAMATION MARK, `!`) through U+007E (TILDE, `~`) (that is, printable ASCII characters); the value should be a charset registered with the Internet Assigned Numbers Authority [IANA], [RFC2978] or begin with the characters `x-` or `X-`.
`escape-solidus`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`escape-uri-attributes`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`html-version`	A decimal value. This parameter may be absent.
`include-content-type`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`indent`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`item-separator`	A string of Unicode characters. This parameter may be absent.
`json-node-output-method`	An expanded QName with a non-null namespace URI, or with a null namespace URI and a local name equal to one of `xml`, `xhtml`, `html` or `text`. If the namespace URI is non-null, the parameter specifies an implementation-defined output method.
`media-type`	A string of Unicode characters specifying the media type (MIME content type) [RFC2046]; the charset parameter of the media type must not be specified explicitly in the value of the `media-type` parameter. If the destination of the serialized output is annotated with a media type, this parameter may be used to provide such an annotation. For example, it may be used to set the media type in an HTTP header.
`method`	An expanded QName with a non-null namespace URI, or with a null namespace URI and a local name that must be equal to one of `xml`, `xhtml`, `html`, `text`, `json`, or `adaptive`, in which case, the output method specified must be used for serializing. If the namespace URI is non-null, the parameter specifies an implementation-defined output method; its behavior is not specified by this document.
`normalization-form`	One of the enumerated values `NFC`, `NFD`, `NFKC`, `NFKD`, `fully-normalized` or `none`, or an implementation-defined value of type `NMTOKEN`.
`omit-xml-declaration`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`standalone`	One of the enumerated values `yes`, `no`, `true`, `false`, `1`, `0` or `omit`.
`suppress-indentation`	A list of expanded QNames, possibly empty.
`undeclare-prefixes`	One of the enumerated values `yes`, `no`, `true`, `false`, `1` or `0`.
`use-character-maps`	A list of pairs, possibly empty, with each pair consisting of a single Unicode character and a string of Unicode characters.
`version`	A string of Unicode characters.

In those cases where they have no important effect on the content of the serialized result, details of the output methods defined by this specification are left unspecified and are regarded as implementation-dependent. Whether a serializer uses apostrophes or quotation marks to delimit attribute values in the XML output method is an example of such a detail.

The detailed semantics of each parameter will be described separately for each output method for which it is applicable. If the semantics of a parameter are not described for an output method, then it is not applicable to that output method.
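As an illustration of the enumerated values in the table above, the following Python sketch maps the six boolean lexical forms to a boolean and reports any other value as [err:SEPM0016]. The names `SerializationError`, `BOOLEAN_LEXICAL`, `parse_boolean_parameter` and `parse_standalone` are hypothetical, and the exact-match, case-sensitive comparison is an assumption of the sketch. How such an error is surfaced to users remains the responsibility of the host language.

```python
class SerializationError(Exception):
    """Raised with an error code such as 'err:SEPM0016' in its message."""


# The six lexical forms accepted by parameters such as indent or byte-order-mark.
BOOLEAN_LEXICAL = {
    "yes": True, "true": True, "1": True,
    "no": False, "false": False, "0": False,
}


def parse_boolean_parameter(name: str, raw: str) -> bool:
    """Return the boolean meaning of raw, or raise err:SEPM0016."""
    try:
        return BOOLEAN_LEXICAL[raw]
    except KeyError:
        raise SerializationError(
            f"err:SEPM0016: {raw!r} is not a valid value for {name}"
        ) from None


def parse_standalone(raw: str):
    """standalone additionally permits 'omit', modelled here as None."""
    if raw == "omit":
        return None
    return parse_boolean_parameter("standalone", raw)
```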

Implementations may define additional serialization parameters, and may allow users to do so. For this purpose, the name of a serialization parameter is considered to be a QName; the parameters listed above are QNames whose expanded-QName has a null namespace URI, while any additional serialization parameters that are either implementation-defined or defined by the host language must have names that are namespace-qualified. Any such additional serialization parameters must not be in the namespace http://www.w3.org/2010/xslt-xquery-serialization. A host language may specify the means by which an implementation can define such an additional serialization parameter, and implementations may provide mechanisms by which users can define such an additional serialization parameter. If the serialization method is one of the six methods xml, html, xhtml, text, json, or adaptive then the additional serialization parameters may affect the output of the serializer to the extent (but only to the extent) that this specification leaves the output implementation-defined or implementation-dependent. For example, such parameters might control whether namespace declarations on an element are written before or after the attributes of the element, or they might define the number of space or tab characters to be inserted when the indent parameter is set to yes, true or 1; but they could not instruct the serializer to suppress the error that occurs when the HTML output method encounters characters that are not permitted (see error [err:SERE0014]).

3.1 Setting Serialization Parameters by Means of a Parameter Document

A host language may provide, by reference to this section, a mechanism by which the settings of serialization parameters are supplied in the form of an output:serialization-parameters element node.

[Definition: An output:serialization-parameters element node used to hold the settings of serialization parameters is referred to as a parameter document].

Note:

The use of the word document does not imply that the output:serialization-parameters element must be the outermost element of an XDM document, although this will often be the case.

The parameter document must be processed as if by the procedure described below.

With the exception of the use-character-maps parameter, the setting of each serialization parameter defined in this specification is equal to the result of evaluating the XQuery expression

document { . } 
   /output:serialization-parameters
   /(validate lax { 
      output:*[local-name() eq $param-name] 
   })
   /data(@value)

or equivalently the XSLT instructions

<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="
        self::output:serialization-parameters
        /output:*
       [local-name() eq $param-name]"/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance
                        /data(@value)"/>
</xsl:sequence>

with the parameter document as the context value, the param-name variable bound to a value of type xs:string equal to the local part of the name of the particular serialization parameter, and the other components of the dynamic context and static context as specified in the subsequent tables. If in any case evaluating this expression would yield an error, serialization error [err:SEPM0017] results.

If the result of evaluating this expression for a particular serialization parameter is the empty sequence, then

If the parameter is either cdata-section-elements or suppress-indentation, and the result of evaluating the XQuery expression

document { . }
/output:serialization-parameters
/(validate lax {  
   output:*[local-name() eq $param-name]
})

or equivalently the XSLT instructions

<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="
        self::output:serialization-parameters
        /output:*
       [local-name() eq $param-name]"/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance"/>
</xsl:sequence>

with the same settings of the static context and dynamic context is not an empty sequence, the setting of the parameter is the empty list;

otherwise, the setting of the parameter is absent.
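The extraction rules above can be approximated without an XQuery processor. The following Python sketch uses the standard `xml.etree.ElementTree` module. The function name `parameter_setting` and the constants are hypothetical, no schema validation is performed (so conditions that would raise [err:SEPM0017] are not detected), and QNames in list values are returned as unresolved lexical tokens. `None` stands for an absent setting.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"
LIST_VALUED = {"cdata-section-elements", "suppress-indentation"}


def parameter_setting(doc_text: str, param_name: str):
    """Return the setting of one parameter, or None if it is absent."""
    root = ET.fromstring(doc_text)
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        return None
    # Look for a child in the output namespace with the requested local name.
    element = root.find(f"{{{OUTPUT_NS}}}{param_name}")
    if element is None:
        return None
    raw = element.get("value")
    if param_name in LIST_VALUED:
        # An element that is present but supplies no tokens means the empty list.
        return raw.split() if raw is not None else []
    # For other parameters a missing value attribute leaves the setting absent.
    return raw
```

The distinction between an empty list and an absent setting matters to a host language that merges several sources of parameter values, since an empty list is an explicit setting while an absent one can still be supplied elsewhere.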

The components of the static context used in evaluating the XQuery expressions or XSLT instructions are as defined in the following table.

Settings of static context components used in extracting serialization parameter settings from a parameter document
Static Context Component	XQuery or XSLT	Setting
XPath 1.0 compatibility mode	Both	false
Statically known namespaces	XQuery	The pair (output,http://www.w3.org/2010/xslt-xquery-serialization)
Statically known namespaces	XSLT	The pairs (output,http://www.w3.org/2010/xslt-xquery-serialization), (xsl,http://www.w3.org/1999/XSL/Transform)
Default element/type namespace	Both	"none"
Default function namespace	Both	http://www.w3.org/2005/xpath-functions
In-scope schema types, In-scope element declarations, Substitution groups, In-scope attribute declarations	Both	As defined by the schema for serialization parameters (B Schema for Serialization Parameters) and any additional implementation-defined in-scope schema components
In-scope variables	Both	{param-name}
Context value static type	Both	`node()`
Statically known function signatures	Both	{`fn:data($arg as item()) as xs:anyAtomicType`}, {`fn:local-name($arg as node()?) as xs:string`}
Statically known collations	Both	{ (http://www.w3.org/2005/xpath-functions/collation/codepoint, The Unicode codepoint collation ) }
Default collation	Both	The Unicode codepoint collation
Construction mode	XQuery	strip
Ordering mode	XQuery	ordered
Default order for empty sequences	XQuery	least
Boundary space policy	XQuery	strip
Copy-namespaces mode	XQuery	(preserve,inherit)
Base URI	Both	Absent
Statically known documents	Both	None
Statically known collections	Both	None
Statically known default collection type	Both	`node()*`
Statically known decimal formats	Both	None
Set of named keys	XSLT	{}
Values of system properties	XSLT	None
Set of available instructions	XSLT	The set of all instructions defined by [XSL Transformations (XSLT) Version 4.0]

The remaining components of the dynamic context used in evaluating the XQuery expressions or XSLT instructions are as defined in the following table.

Settings of dynamic context components used in extracting serialization parameter settings from a parameter document
Dynamic Context Component	XQuery or XSLT	Setting
Context position	Both	1
Context size	Both	1
Variable values	Both	The `param-name` variable has a value of type `xs:string` equal to the local part of the name of the serialization parameter under consideration
Function implementations	Both	The implementation of `fn:data`
Current dateTime	Both	Absent
Implicit timezone	Both	Absent
Available documents	Both	None
Available collections	Both	None
Default collection	Both	None
Current template rule	XSLT	Absent
Current mode	XSLT	The default mode
Current group	XSLT	Absent
Current grouping key	XSLT	Absent
Current captured substrings	XSLT	The empty sequence
Output state	XSLT	Temporary output state

In the case of the use-character-maps parameter, the XQuery expression

document { . }
/output:serialization-parameters
/ ( validate lax { output:use-character-maps } )
/output:character-map[@character eq $char]
/string(@map-string)

or equivalently the XSLT instructions

<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="
        self::output:serialization-parameters
         /output:use-character-maps"/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance                          
                        /output:character-map
                        [@character eq $char]
                        /string(@map-string)"/>
</xsl:sequence>

is evaluated for each Unicode character that is permitted in an XML document. The dynamic context and static context used to evaluate the expression are as defined above, except that the component In-scope variables is the set {char} and the value of the variable "char" is a value of type xs:string of length one whose value is the Unicode character under consideration. If the result of evaluating the expression is not an empty sequence, the pair consisting of the Unicode character and the result of evaluating the expression is part of the list of pairs in the value of the use-character-maps parameter. It is a serialization error [err:SEPM0018] if the result of evaluating this expression for any character is a sequence of length greater than one.
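A sketch of the same extraction in Python follows. Rather than evaluating an expression for every Unicode character, it visits each output:character-map element once, which produces the same list of pairs. The names `character_map_pairs` and `SerializationError` are hypothetical. Treating a missing attribute or a character attribute that is not exactly one character long as [err:SEPM0017] is an assumption standing in for schema validation.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class SerializationError(Exception):
    pass


def character_map_pairs(doc_text: str) -> dict:
    """Build the use-character-maps setting as a dict of character -> string."""
    root = ET.fromstring(doc_text)
    pairs = {}
    for maps in root.findall(f"{{{OUTPUT_NS}}}use-character-maps"):
        for entry in maps.findall(f"{{{OUTPUT_NS}}}character-map"):
            char = entry.get("character")
            replacement = entry.get("map-string")
            # Assumption: stands in for the checks a schema validator would make.
            if char is None or replacement is None or len(char) != 1:
                raise SerializationError("err:SEPM0017")
            # Two entries for one character would give a result of length > 1.
            if char in pairs:
                raise SerializationError("err:SEPM0018")
            pairs[char] = replacement
    return pairs
```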

Using the same settings of the components of the dynamic context and static context, serialization error [err:SEPM0019] results if the result of evaluating the following XQuery expression is not true

(document { . })/output:serialization-parameters
   /(count(distinct-values(*/node-name(.))) eq (count(*)))

or equivalently if the result of evaluating the following XSLT instructions is not true.

<xsl:sequence>
  <xsl:variable name="doc">
    <xsl:document>
      <xsl:sequence select="."/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence
    select="$doc/output:serialization-parameters
                /(count(distinct-values(
                    */node-name(.))) 
                eq (count(*)))"/>
</xsl:sequence>

The result of evaluating either will be false if the parameter document supplies a value for any particular serialization parameter more than once, or will be the empty sequence if the parameter document is not an element node whose local name is serialization-parameters and whose namespace URI is http://www.w3.org/2010/xslt-xquery-serialization.
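The uniqueness check can be sketched in Python as shown below. The function name `check_parameter_document` and the `SerializationError` class are hypothetical. The sketch compares the number of distinct child element names with the number of child elements, and treats a root element with the wrong name as a failure, since an empty result is not true.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class SerializationError(Exception):
    pass


def check_parameter_document(doc_text: str) -> None:
    """Raise err:SEPM0019 unless every child element name occurs exactly once."""
    root = ET.fromstring(doc_text)
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        # The expression would yield the empty sequence, which is not true.
        raise SerializationError("err:SEPM0019")
    # ElementTree tags use the form {namespace}local, so comparing tags
    # compares expanded names, as distinct-values(*/node-name(.)) does.
    names = [child.tag for child in root]
    if len(set(names)) != len(names):
        raise SerializationError("err:SEPM0019")
```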

Note:

A serializer or implementation of a host language does not need to be accompanied by an XQuery processor nor by a general-purpose schema validator in order to meet the requirements of this section. It merely needs to be capable of extracting values from an XDM instance that conforms to the schema for serialization parameters, while checking that the constraints implied by the schema and additional constraints implied by the XQuery validate expression or explicitly stated in this section are satisfied.

The host language may provide additional mechanisms for overriding the values of any serialization parameters specified through the mechanism defined in this section, as well as additional mechanisms for specifying the values of any serialization parameters whose values are absent after applying the mechanism defined in this section.

If the parameter document contains elements or attributes that are in a namespace other than http://www.w3.org/2010/xslt-xquery-serialization, the implementation may interpret them to specify the values of implementation-defined serialization parameters in an implementation-defined manner.

The following XML document, if parsed as a parameter document and processed using the mechanism described in this section, would specify the settings of the method, version and indent serialization parameters with the values xml, 1.0 and yes, respectively.

<output:serialization-parameters 
    xmlns:output 
    = "http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:indent value="yes"/>
</output:serialization-parameters>

The following document would specify the value of the cdata-section-elements serialization parameter with value equal to the pair of expanded QNames (http://example.org/book/chapter,heading) and (http://example.org/book,footnote).

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:book="http://example.org/book"
    xmlns="http://example.org/book/chapter">
  <output:cdata-section-elements value="heading book:footnote"/>
</output:serialization-parameters>

The following document would specify the value of the method serialization parameter with the value html.

Notice that in this example, the default namespace declaration in scope has no effect on the interpretation of the setting of the method parameter.

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns="http://example.org/ext">
  <output:method value="html"/>
</output:serialization-parameters>

The following document would specify the value of the method serialization parameter with value equal to the expanded QName (http://example.org/ext, jsp), and the use-character-maps parameter with value equal to the list of pairs, («, <%), (», %>).

<output:serialization-parameters
    xmlns:output
    = "http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:ext="http://example.org/ext">
  <output:method value="ext:jsp"/>
  <output:use-character-maps>
    <output:character-map character="&#xAB;" map-string="&lt;%"/>
    <output:character-map character="&#xBB;" map-string="%&gt;"/>
  </output:use-character-maps>
</output:serialization-parameters>
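
As a rough illustration of how a parameter document like the ones above might be read, the following Python sketch collects the simple parameters that carry a value attribute. The function name read_simple_parameters and the constant names are hypothetical. The sketch deliberately skips elements in other namespaces and elements without a value attribute (such as use-character-maps), and it does not resolve QName-valued settings such as ext:jsp against the in-scope namespaces.

```python
import xml.etree.ElementTree as ET

# Namespace of serialization parameter documents, as given in this section.
SER_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def read_simple_parameters(document_text):
    """Return {local-name: value} for children in the serialization namespace.

    Hypothetical helper: only elements that carry a value attribute are
    reported; everything else is ignored by this sketch.
    """
    root = ET.fromstring(document_text)
    if root.tag != "{%s}serialization-parameters" % SER_NS:
        raise ValueError("not a serialization parameter document")
    prefix = "{%s}" % SER_NS
    params = {}
    for child in root:
        if child.tag.startswith(prefix) and "value" in child.attrib:
            params[child.tag[len(prefix):]] = child.get("value")
    return params


EXAMPLE = """<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:indent value="yes"/>
</output:serialization-parameters>"""

print(read_simple_parameters(EXAMPLE))
```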

4 Phases of Serialization

For the XML, HTML, XHTML and Text output methods, serialization comprises five phases of processing (preceded by the sequence normalization process described in 2 Sequence Normalization). For the JSON and Adaptive output methods, serialization is described in 9 JSON Output Method and 10 Adaptive Output Method respectively.

For an implementation-defined output method, any of these phases may be skipped or may be performed in a different order than is specified here. For the output methods defined in this specification, these phases are carried out sequentially as follows:

A meta element is added to the sequence, possibly replacing existing meta elements, as controlled by the include-content-type parameter for the XHTML and HTML output methods. This step is skipped for the other output methods defined by this specification.
Markup generation produces the character representation of those parts of the serialized result that describe the structure of the sequence. In the cases of the XML, HTML and XHTML output methods, this phase produces the character representations of the following:
- the document type declaration;
- start tags and end tags (except for attribute values, whose representation is produced by the character expansion phase);
- processing instructions; and
- comments.
In the cases of the XML and XHTML output methods, this phase also produces the following:
- the XML or text declaration; and
- empty element tags (except for the attribute values).
In the case of the text output method, this phase replaces the single document node produced by sequence normalization with a new document node that has exactly one child, which is a text node. The string value of the new text node is the string value of the document node that was produced by sequence normalization.
Character expansion is concerned with the representation of characters appearing in text and attribute nodes in the sequence. For each text and attribute node, the following rules are applied in sequence.
1. If the node is an attribute that is a URI attribute value and the escape-uri-attributes parameter is set to require escaping of URI attributes, apply URI escaping as defined below, and skip rules 2-5. Otherwise, continue with rule 2.
 
 [Definition: URI escaping consists of the following three steps applied in sequence to the content of URI attribute values:]
 1. normalize to NFC using the method defined in Section 5.4.9 fn:normalize-unicode^FO40
 2. percent-encode any special characters in the URI using the method defined in Section 6.5 fn:escape-html-uri^FO40
 3. escape according to the rules of the XML or HTML output method, whichever is applicable, any characters that require escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt;. (See also section 7.3 Writing Character Data.)
 [Definition: The values of attributes listed in D List of URI Attributes are URI attribute values. Attributes are not considered to be URI attributes simply because they are namespace declaration attributes or have the type annotation xs:anyURI.]
2. If the node is a text node whose parent element is selected by the rules of the cdata-section-elements parameter for the applicable output method, create CDATA sections as described below, and skip rules 3-5. Otherwise, continue with rule 3.
 
 Apply the following two processes in sequence to create CDATA sections:
 1. Unicode Normalization if requested by the normalization-form parameter.
 2. The application of changes as detailed in the description of the cdata-section-elements parameter for the applicable output method.
3. Apply character mapping as determined by the use-character-maps parameter for the applicable output method. For characters that were substituted by this process, skip rules 4 and 5. For the remaining characters that were not modified by character mapping, continue with rule 4.
4. Apply Unicode Normalization if requested by the normalization-form parameter.
 
 [Definition: Unicode Normalization is the process of removing alternative representations of equivalent sequences from textual data, to convert the data into a form that can be binary-compared for equivalence, as specified in [UAX #15: Unicode Normalization Forms]. For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization].]
 
 The meanings associated with the possible values of the normalization-form parameter are defined in section 5.1.9 XML Output Method: the normalization-form Parameter.
 
 Continue with rule 5.
5. Escape according to the rules of the XML or HTML output method, whichever is applicable, any characters (such as < and &) where XML or HTML requires escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt;. (See also section 7.3 Writing Character Data.) For characters such as > where XML defines a built-in entity but does not require its use in all circumstances, it is implementation-dependent whether the character is escaped.
Indentation, as controlled by the indent parameter and the suppress-indentation parameter, may add or remove whitespace according to the rules defined by the applicable output method.
Encoding, as controlled by the encoding parameter, converts the character sequence produced by the previous phases into an octet stream.
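
The last three rules of the character expansion phase (character mapping, Unicode Normalization, escaping) can be sketched for a single text node as follows. This is a minimal illustration in Python, not a definitive implementation: the function name expand_text and its parameters are hypothetical, URI escaping and CDATA handling are left out, the character > is never escaped (which the text above leaves implementation-dependent), and normalizing each run of unmapped characters separately is an assumption of this sketch.

```python
import unicodedata


def expand_text(text, char_map=None, normalization_form=None, encoding="UTF-8"):
    """Apply character mapping, then normalization and escaping to the rest."""
    char_map = char_map or {}
    out = []
    run = []  # consecutive characters that were NOT substituted by the map

    def flush():
        # Characters untouched by character mapping are normalized, then escaped.
        if not run:
            return
        s = "".join(run)
        run.clear()
        if normalization_form is not None:
            s = unicodedata.normalize(normalization_form, s)
        for ch in s:
            if ch == "<":
                out.append("&lt;")
            elif ch == "&":
                out.append("&amp;")
            else:
                try:
                    ch.encode(encoding)
                    out.append(ch)
                except UnicodeEncodeError:
                    # Not representable in the encoding: use a character reference.
                    out.append("&#x%X;" % ord(ch))

    for ch in text:
        if ch in char_map:
            flush()
            out.append(char_map[ch])  # mapped output skips the later rules
        else:
            run.append(ch)
    flush()
    return "".join(out)


print(expand_text("a<b \u00abx\u00bb", {"\u00ab": "<%", "\u00bb": "%>"}))
```

Note that the strings substituted by the character map are emitted verbatim, which is why character maps can produce output that is not well-formed.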

Note:

Serialization is defined only in terms of encoding the result as a stream of octets. However, a serializer may provide an option that allows the encoding phase to be skipped, so that the result of serialization can be encoded in a way required by a particular destination (e.g., a Java StringBuffer). The effect of any such option is implementation-defined, and a serializer is not required to support such an option.

5 XML Output Method

The XML output method serializes the normalized sequence as an XML entity that must satisfy the rules for either a well-formed XML document entity, a well-formed XML external general parsed entity, or both. A serialization error [err:SERE0003] results if the serializer is unable to satisfy those rules, except for content modified by the character expansion phase of serialization, as described in 4 Phases of Serialization. The effects of the character expansion phase could result in the serialized output being not well-formed, but will not result in a serialization error. If a serialization error results, the serializer must raise the error.

If the document node of the normalized sequence has a single element node child and no text node children, then the serialized output is a well-formed XML document entity, and the serialized output must conform to the appropriate version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1]. If the normalized sequence does not take this form, then the serialized output is a well-formed XML external general parsed entity, which, when referenced within a trivial XML document wrapper like this:

<?xml version="version"?>
<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>

where entity-URI is a URI for the entity, and the value of the version pseudo-attribute is the value of the version parameter, produces a document which must itself be a well-formed XML document conforming to the corresponding version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1].
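
The wrapper test can be approximated in code. The Python sketch below builds the wrapper document shown above and, separately, checks a candidate entity body by inlining it textually inside a doc element, since xml.etree.ElementTree does not fetch external entities. The function names are hypothetical, and the inlining check assumes the body carries no XML or text declaration and that entity-URI contains no double quote.

```python
import xml.etree.ElementTree as ET


def build_wrapper(version, entity_uri):
    """Return the trivial wrapper document that references the entity."""
    return (
        '<?xml version="%s"?>\n'
        '<!DOCTYPE doc [\n'
        '<!ENTITY e SYSTEM "%s">\n'
        ']>\n'
        '<doc>&e;</doc>' % (version, entity_uri)
    )


def is_well_formed_when_inlined(body):
    """Approximate the wrapper test by substituting the entity text inline."""
    try:
        ET.fromstring("<doc>" + body + "</doc>")
        return True
    except ET.ParseError:
        return False


print(build_wrapper("1.0", "chunk.xml"))
print(is_well_formed_when_inlined("text<a/><b/>more"))
```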

[Definition: A reconstructed tree may be constructed by parsing the XML document and converting it into a document node as specified in [XQuery and XPath Data Model (XDM) 4.0].] The result of serialization must be such that the reconstructed tree is the same as the result tree except for the following permitted differences:

If the document was produced by adding a document wrapper, as described above, then it will contain an extra doc element as the document element.
The order of attribute and namespace nodes in the two trees may be different.
The following properties of corresponding nodes in the two trees may be different:
- the base-uri property of document nodes and element nodes;
- the document-uri and unparsed-entities properties of document nodes;
- the type-name and typed-value properties of element and attribute nodes;
- the nilled property of element nodes;
- the content property of text nodes, due to the effect of the indent and use-character-maps parameters.
The reconstructed tree may contain additional attributes and text nodes resulting from the expansion of default and fixed values in its DTD or schema; also, in the presence of a DTD, non-CDATA attributes may lose whitespace characters as a result of attribute value normalization.
The type annotations of the nodes in the two trees may be different. Type annotations in a result tree are discarded when the tree is serialized. Any new type annotations obtained by parsing the document will depend on whether the serialized XML document is assessed against a schema, and this may result in type annotations that are different from those in the original result tree.

Note:

In order to influence the type annotations in the tree that would result from processing a serialized XML document, the author of the XSLT stylesheet, XQuery expression or other process might wish to create the input tree so that it makes use of mechanisms provided by [XML Schema], such as xsi:type and xsi:schemaLocation attributes. The serialization process will not automatically create such attributes in the serialized document if those attributes were not part of the result tree that is to be serialized.

Similarly, it is possible that an element node in the input tree has the nilled property with the value true, but no xsi:nil attribute. The serialization process will not create such an attribute in the serialized document simply to reflect the value of the property. The value of the nilled property has no direct effect on the serialized result.
Additional namespace nodes may be present in the reconstructed tree if the serialization process did not undeclare one or more namespaces, as described in 5.1.8 XML Output Method: the undeclare-prefixes Parameter, and the input tree contained an element node with a namespace node that declared some prefix, but a child element of that node did not have any namespace node that declared the same prefix.

The result tree may contain namespace nodes that are not present in the reconstructed tree, as the process of creating an instance of the data model may ignore namespace declarations in some circumstances. See Section 5.2.3 Construction from an Infoset^DM40 and Section 5.2.4 Construction from a PSVI^DM40 of [XQuery and XPath Data Model (XDM) 4.0] for additional information.
If the indent parameter has one of the values yes, true or 1,
- additional text nodes consisting of whitespace characters may be present in the reconstructed tree; and
- text nodes in the result tree that contained only whitespace characters may correspond to text nodes in the reconstructed tree that contain additional whitespace characters that were not present in the result tree
See 5.1.4 XML Output Method: the indent and suppress-indentation Parameters for more information on the indent parameter.
Additional nodes may be present in the reconstructed tree due to the effect of character mapping in the character expansion phase, and the values of attribute nodes and text nodes in the reconstructed tree may be different from those in the result tree, due to the effects of URI expansion, character mapping and Unicode Normalization in the character expansion phase of serialization.

Note:

The use-character-maps parameter can cause arbitrary characters to be inserted into the serialized XML document in an unescaped form, including characters that would be considered to be part of XML markup. Such characters could result in arbitrary new element nodes, attribute nodes, and so on, in the reconstructed tree that results from processing the serialized XML document.

A consequence of this rule is that certain characters must be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically:

In text nodes, the characters U+000D (CARRIAGE RETURN), U+0085 (NEXT LINE, NEL), and U+2028 (LINE SEPARATOR) must be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents.
In attribute nodes, the characters U+000D (CARRIAGE RETURN), U+000A (NEWLINE), U+0009 (TAB), U+0085 (NEXT LINE, NEL), and U+2028 (LINE SEPARATOR) must be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents.
In both text nodes and attribute nodes, control characters U+0001 (SOH) through U+001F (IS1) and U+007F (DELETE) through U+009F (APC) (except U+0009 (TAB), U+000A (NEWLINE), U+000D (CARRIAGE RETURN), and U+0085 (NEXT LINE, NEL)) must be output as character references.

For example, an attribute with the value "x" followed by "y" separated by a newline will result in the output "x&#xA;y" (or with any equivalent character reference). The XML output cannot be "x" followed by a literal newline followed by a "y" because after parsing, the attribute value would be "x y" as a consequence of the XML attribute normalization rules.
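
The round-trip rules above can be sketched as a small escaping routine. The Python code below is an illustration only: the function name escape_for_round_trip is hypothetical, hexadecimal character references are just one of the permitted equivalent forms, and the sketch does not check whether a given control character is permitted at all by the XML version in use.

```python
# Characters that must always be written as character references.
TEXT_ALWAYS = {0x0D, 0x85, 0x2028}
ATTR_ALWAYS = TEXT_ALWAYS | {0x0A, 0x09}


def _is_other_control(cp):
    """Control characters other than TAB, NEWLINE, CARRIAGE RETURN and NEL."""
    in_ranges = 0x01 <= cp <= 0x1F or 0x7F <= cp <= 0x9F
    return in_ranges and cp not in (0x09, 0x0A, 0x0D, 0x85)


def escape_for_round_trip(value, in_attribute):
    """Escape a text node or attribute value so that it survives re-parsing."""
    always = ATTR_ALWAYS if in_attribute else TEXT_ALWAYS
    out = []
    for ch in value:
        cp = ord(ch)
        if cp in always or _is_other_control(cp):
            out.append("&#x%X;" % cp)
        elif ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == '"' and in_attribute:
            out.append("&quot;")
        else:
            out.append(ch)
    return "".join(out)


print(escape_for_round_trip("x\ny", in_attribute=True))
```

In a text node the same newline is left as a literal character, because an XML parser does not apply attribute value normalization to element content.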

Note:

XML 1.0 did not permit an XML processor to normalize U+0085 (NEXT LINE, NEL) or U+2028 (LINE SEPARATOR) characters to a U+000A (NEWLINE) character. However, if a document entity that specifies version 1.1 invokes an external general parsed entity with no text declaration or a text declaration that specifies version 1.0, the external parsed entity is processed according to the rules of XML 1.1. For this reason, U+0085 (NEXT LINE, NEL) and U+2028 (LINE SEPARATOR) characters in text and attribute nodes must always be escaped using character references, regardless of the value of the version parameter.

XML 1.0 permitted control characters in the range U+007F (DELETE) through U+009F (APC) to appear as literal characters in an XML document, but XML 1.1 requires such characters, other than U+0085 (NEXT LINE, NEL), to be escaped as character references. An external general parsed entity with no text declaration or a text declaration that specifies a version pseudo-attribute with value 1.0 that is invoked by an XML 1.1 document entity must follow the rules of XML 1.1. Therefore, the non-whitespace control characters in the ranges U+0001 (SOH) through U+001F (IS1) and U+007F (DELETE) through U+009F (APC) must always be escaped, regardless of the value of the version parameter.

It is a serialization error [err:SEPM0004] to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the input tree contains text nodes or multiple element nodes as children of the root node. The serializer must either raise the error, or recover by ignoring the request to output a document type declaration or standalone parameter.

5.1 The Influence of Serialization Parameters upon the XML Output Method

5.1.1 XML Output Method: the `version` Parameter

The version parameter specifies the version of XML and the version of Namespaces in XML to be used in the serialized output. The version output in the XML declaration (if an XML declaration is not omitted) must correspond to the version of XML used by the serializer. The value of the version parameter must match the VersionNum^XML production of the XML Recommendation [XML10] or [XML11]. A serialization error [err:SESU0013] results if the value of the version parameter specifies a version of XML that is not supported by the serializer; the serializer must raise the error.

This document provides the normative definition of serialization for the XML output method if the version parameter has either the value 1.0 or 1.1. For any other value of version parameter, the behavior is implementation-defined. In that case the implementation-defined behavior may supersede all other requirements of this recommendation.

If the serialized result would contain an NCName^Names that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter, a serialization error [err:SERE0005] results. The serializer must raise the error.

If the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter, a serialization error [err:SERE0006] results. The serializer must raise the error.

For example, if the version parameter has the value 1.0, and the input tree contains a non-whitespace control character in the range U+0001 (SOH) through U+001F (IS1), a serialization error [err:SERE0006] results. If the version parameter has the value 1.1 and a comment node in the input tree contains a non-whitespace control character in the range U+0001 (SOH) through U+001F (IS1) or a control character other than U+0085 (NEXT LINE, NEL) in the range U+007F (DELETE) through U+009F (APC), a serialization error [err:SERE0006] results.
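
The character checks in this example can be sketched in Python using the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1. The function names are hypothetical, the sketch assumes the version is either 1.0 or 1.1, and it covers only comment nodes, where a character reference cannot be used as an escape.

```python
def is_xml_char(cp, version):
    """True if the code point matches the Char production of the XML version."""
    common = 0x20 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF
    if version == "1.0":
        return cp in (0x9, 0xA, 0xD) or common
    return 0x1 <= cp <= 0x1F or common  # XML 1.1


def is_restricted_char(cp):
    """XML 1.1 RestrictedChar: allowed only as a character reference."""
    return (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)


def check_comment(text, version):
    """Raise ValueError (standing in for err:SERE0006) for a bad character."""
    for ch in text:
        cp = ord(ch)
        if not is_xml_char(cp, version) or (version == "1.1" and is_restricted_char(cp)):
            raise ValueError(
                "SERE0006: U+%04X not permitted in a comment in XML %s" % (cp, version)
            )


check_comment("plain comment with NEL \u0085", "1.1")  # accepted
```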

5.1.2 XML Output Method: the `html-version` Parameter

The html-version parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.

5.1.3 XML Output Method: the `encoding` Parameter

The encoding parameter specifies the encoding to be used in the serialized output. Serializers are required to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding. The serializer must raise the error, or recover by using UTF-8 or UTF-16 instead. The serializer must not use an encoding whose name does not match the EncName^XML production of the XML Recommendation [XML10].

When outputting a newline character in the input tree, the serializer is free to represent it using any character sequence that will be normalized to a newline character by an XML parser, unless a specific mapping for the newline character is provided in a character map (see 11 Character Maps).

When outputting any other character that is defined in the selected encoding, the character must be output using the correct representation of that character in the selected encoding.

It is possible that the input tree will contain a character that cannot be represented in the encoding that the serializer is using for output. In this case, if the character occurs in a context where XML recognizes character references (that is, in the value of an attribute node or text node), then the character must be output as a character reference. A serialization error [err:SERE0008] occurs if such a character appears in a context where character references are not allowed (for example, if the character occurs in the name of an element). The serializer must raise the error.

For example, if a text node contains the character U+00E9 (LATIN SMALL LETTER E WITH ACUTE, é), and the value of the encoding parameter is US-ASCII, the character must be serialized as a character reference. If a comment node contains the same character, a serialization error [err:SERE0008] results.
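
A short Python sketch of this behavior follows. It relies on Python's built-in xmlcharrefreplace error handler, which emits decimal character references; the function names are hypothetical, and the input to encode_text_node is assumed to have already passed through the character expansion phase.

```python
def encode_text_node(escaped_text, encoding):
    """Encode text, falling back to character references where needed."""
    return escaped_text.encode(encoding, errors="xmlcharrefreplace")


def encode_comment(comment_text, encoding):
    """Encode a comment; character references are not recognized here."""
    try:
        return ("<!--" + comment_text + "-->").encode(encoding)
    except UnicodeEncodeError as exc:
        # Stands in for err:SERE0008 in this sketch.
        raise ValueError("SERE0008: character not representable in " + encoding) from exc


print(encode_text_node("caf\u00e9", "US-ASCII"))
```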

5.1.4 XML Output Method: the `indent` and `suppress-indentation` Parameters

The indent and suppress-indentation parameters control whether the serializer may adjust the whitespace in the serialized result so that a person will find it easier to read. If the indent parameter has one of the values yes, true or 1, the serializer may output whitespace characters in addition to the whitespace characters in the input tree. It may also elide from the output whitespace characters that occurred in the input tree or replace such whitespace characters with other whitespace characters.

[Definition: The term content has the same meaning as the term Content^XML defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags^XML of [XML10].] [Definition: The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.]

If the indent parameter has the value no, false or 0, the serializer must not add, elide or replace whitespace characters in the output. If the indent parameter has one of the values yes, true or 1, the serializer must use an algorithm for dealing with whitespace characters that satisfies all of the following constraints. If more than one constraint applies, the serializer must apply the most restrictive constraint. That is, if any applicable constraint indicates that whitespace must not be added, elided or replaced, that constraint prevails; if an applicable constraint indicates that whitespace should not be added, elided or replaced, while all other applicable constraints indicate that whitespace may be added, elided or replaced, whitespace should not be added, elided or replaced.

Whitespace characters may be added adjacent to a text node only if the text node contains only whitespace characters. Whitespace characters in such a text node may also be elided or replaced. For example, a tab may be inserted as a replacement for existing spaces.
Whitespace characters may be added, elided or replaced in the immediate content of an element whose type annotation is xs:untyped or xs:anyType and that has element node children, in the immediate content of an element whose content model is element only, or outside the content of any element.
Whitespace characters must not be added, elided or replaced in the immediate content of an element whose content model is known to be simple or empty.
Whitespace characters should not be added, elided or replaced in places where the characters would constitute significant whitespace, for example, in the immediate content of an element that is annotated with a type other than xs:untyped or xs:anyType, and whose content model is known to be mixed.
Whitespace characters must not be added, elided or replaced in the content of an element whose expanded QName is a member of the list of expanded QNames in the value of the suppress-indentation parameter.
Whitespace characters must not be added, elided or replaced in a part of the result document that is controlled by an xml:space attribute with value preserve. (See [XML10] for more information about the xml:space attribute.)
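
One conservative way to satisfy these constraints for an untyped tree is sketched below in Python. It reports the elements whose immediate content could safely receive indentation: elements that have element children, contain only whitespace text, are not within the scope of xml:space="preserve", and are not inside an element named in suppress-indentation. The function names are hypothetical, type annotations are not modeled, and the rule is stricter than the constraints require, which is acceptable because adding whitespace is optional.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WHITESPACE = " \t\r\n"


def has_only_whitespace_text(elem):
    """True if every text node in the immediate content is whitespace-only."""
    texts = [elem.text] + [child.tail for child in elem]
    return all(t is None or t.strip(XML_WHITESPACE) == "" for t in texts)


def indentable_elements(root, suppress=()):
    """Return the names of elements whose immediate content may be indented."""
    result = []

    def visit(elem, preserve, suppressed):
        space = elem.get(XML_SPACE)
        if space == "preserve":
            preserve = True
        elif space == "default":
            preserve = False
        suppressed = suppressed or elem.tag in suppress
        if len(elem) > 0 and not preserve and not suppressed and has_only_whitespace_text(elem):
            result.append(elem.tag)
        for child in elem:
            visit(child, preserve, suppressed)

    visit(root, False, False)
    return result


doc = ET.fromstring(
    '<a><b><c/></b><p>text <i>x</i></p>'
    '<pre xml:space="preserve"><d/></pre><s><t><u/></t></s></a>'
)
print(indentable_elements(doc, suppress={"s"}))
```

Names in the suppress collection are compared against ElementTree tags, so a namespaced element would be written in the form {uri}local.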

Note:

The effect of these rules is to ensure that whitespace is added in only those places where (a) XSLT’s <xsl:strip-space> declaration could cause it to be removed, and (b) it does not affect the string value of any element node with simple content. It is usually not safe to indent document types that include elements with mixed content.

Note:

The whitespace added may possibly be based on whitespace stripped from either the source document or the stylesheet (in the case of XSLT), or guided by other means that might depend on the host language, in the case of an input tree created using some other process.

5.1.5 XML Output Method: the `cdata-section-elements` Parameter

The cdata-section-elements parameter contains a list of expanded QNames. If the expanded QName of the parent of a text node is a member of the list, then the text node must be output as a CDATA section, except in those circumstances described below.

If the text node contains the sequence of characters ]]>, then the currently open CDATA section must be closed following the ]] and a new CDATA section opened before the >.

If the text node contains characters that are not representable in the character encoding being used in the serialized output, then the currently open CDATA section must be closed before such characters, the characters must be output using character references or entity references, and a new CDATA section must be opened for any further characters in the text node.
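
The two splitting rules above can be sketched in Python as follows. The function name cdata_sections is hypothetical, and the sketch uses hexadecimal character references for unrepresentable characters; it returns the serialized form of one text node whose parent is listed in cdata-section-elements.

```python
def cdata_sections(text, encoding="UTF-8"):
    """Serialize text as CDATA sections, splitting where the rules require."""
    out = []
    buf = []  # content of the currently open CDATA section

    def close():
        if buf:
            out.append("<![CDATA[" + "".join(buf) + "]]>")
            buf.clear()

    i = 0
    while i < len(text):
        if text.startswith("]]>", i):
            # Close after "]]" and reopen before ">".
            buf.append("]]")
            close()
            buf.append(">")
            i += 3
            continue
        ch = text[i]
        try:
            ch.encode(encoding)
            buf.append(ch)
        except UnicodeEncodeError:
            # Unrepresentable: leave the CDATA section and use a reference.
            close()
            out.append("&#x%X;" % ord(ch))
        i += 1
    close()
    return "".join(out)


print(cdata_sections("a]]>b"))
```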

CDATA sections must not be used except where they have been explicitly requested by the user, either by using the cdata-section-elements parameter, or by using some other implementation-defined mechanism.

Note:

This is phrased to permit an implementor to provide an option that attempts to preserve CDATA sections present in the source document.

5.1.6 XML Output Method: the `omit-xml-declaration` and `standalone` Parameters

The XML output method must output an XML declaration if the omit-xml-declaration parameter has the value no, false or 0. The XML declaration must include both version information and an encoding declaration. If the standalone parameter has one of the values yes, true, 1, no, false or 0, the XML declaration must include a standalone document declaration with the same value as the value of the standalone parameter. If the standalone parameter has the value omit, the XML declaration must not include a standalone document declaration; this ensures that it is both an XML declaration (allowed at the beginning of a document entity) and a text declaration (allowed at the beginning of an external general parsed entity).

A serialization error [err:SEPM0009] results if the omit-xml-declaration parameter has one of the values yes, true or 1, and

the standalone parameter has a value other than omit; or
the version parameter has a value other than 1.0 and the doctype-system parameter is specified.

The serializer must raise the error.

Otherwise, if the omit-xml-declaration parameter has one of the values yes, true or 1, the XML output method must not output an XML declaration.
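
The interaction of these parameters can be summarized in a small Python sketch. The function name xml_declaration and its signature are hypothetical. The sketch writes the standalone pseudo-attribute as yes or no, on the assumption that the parameter values true and 1 are equivalent to yes and that false and 0 are equivalent to no, since XML itself permits only yes and no.

```python
TRUE_VALUES = {"yes", "true", "1"}
FALSE_VALUES = {"no", "false", "0"}


def xml_declaration(version, encoding, omit_xml_declaration="no",
                    standalone="omit", doctype_system=None):
    """Return the XML declaration to emit, or '' when it is omitted."""
    if omit_xml_declaration in TRUE_VALUES:
        if standalone != "omit" or (version != "1.0" and doctype_system is not None):
            # Stands in for err:SEPM0009 in this sketch.
            raise ValueError("SEPM0009: the XML declaration cannot be omitted")
        return ""
    parts = ['<?xml version="%s" encoding="%s"' % (version, encoding)]
    if standalone in TRUE_VALUES:
        parts.append(' standalone="yes"')
    elif standalone in FALSE_VALUES:
        parts.append(' standalone="no"')
    parts.append("?>")
    return "".join(parts)


print(xml_declaration("1.0", "UTF-8", standalone="true"))
```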

5.1.7 XML Output Method: the `doctype-system` and `doctype-public` Parameters

If the doctype-system parameter is specified, the XML output method must output a document type declaration immediately before the first element. The name following <!DOCTYPE must be the name of the first element, if any. If the doctype-public parameter is also specified, then the XML output method must output PUBLIC followed by the public identifier and then the system identifier; otherwise, it must output SYSTEM followed by the system identifier. The internal subset must be empty. The doctype-public parameter must be ignored unless the doctype-system parameter is specified.
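
A minimal Python sketch of the resulting document type declaration follows. The function names are hypothetical, and the quoting helper reflects an assumption of this sketch: it switches to single quotes when the identifier contains a double quote, and it does not validate the identifiers.

```python
def _quote(literal):
    """Quote an identifier, preferring double quotes."""
    return "'%s'" % literal if '"' in literal else '"%s"' % literal


def doctype_declaration(root_name, doctype_system=None, doctype_public=None):
    """Return the DOCTYPE declaration, or '' if doctype-system is absent."""
    if doctype_system is None:
        return ""  # doctype-public on its own is ignored
    if doctype_public is not None:
        return "<!DOCTYPE %s PUBLIC %s %s>" % (
            root_name, _quote(doctype_public), _quote(doctype_system))
    return "<!DOCTYPE %s SYSTEM %s>" % (root_name, _quote(doctype_system))


print(doctype_declaration("book", "book.dtd"))
```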

5.1.8 XML Output Method: the `undeclare-prefixes` Parameter

The Data Model allows an element node that binds a non-empty prefix to have a child element node that does not bind that same prefix. In Namespaces in XML 1.1 ([XML Names 1.1]), this can be represented accurately by undeclaring prefixes. For each such prefix that the child element node does not bind, if the undeclare-prefixes parameter has one of the values yes, true or 1, the output method is XML or XHTML, and the version parameter value is greater than 1.0, the serializer must undeclare its namespace. If the undeclare-prefixes parameter has the value no, false or 0 and the output method is XML or XHTML, then the undeclaration of prefixes must not occur.

Consider an element x:foo with four in-scope namespaces that associate prefixes with URIs as follows:

x is associated with http://example.org/x
y is associated with http://example.org/y
z is associated with http://example.org/z
xml is associated with http://www.w3.org/XML/1998/namespace

Suppose that it has a child element x:bar with three in-scope namespaces:

x is associated with http://example.org/x
y is associated with http://example.org/y
xml is associated with http://www.w3.org/XML/1998/namespace

If namespace undeclaration is in effect, it will be serialized this way:

<x:foo xmlns:x="http://example.org/x"
       xmlns:y="http://example.org/y"
       xmlns:z="http://example.org/z">
       
       <x:bar xmlns:z="">...</x:bar>
       
</x:foo>

In Namespaces in XML 1.0 ([XML Names]), prefix undeclaration is not possible. If the output method is XML or XHTML, the value of the undeclare-prefixes parameter is one of yes, true or 1, and the value of the version parameter is 1.0, a serialization error [err:SEPM0010] results; the serializer must raise the error.
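
The x:foo and x:bar example can be reproduced with a small Python sketch that compares the in-scope namespaces of a parent and a child. The function name prefix_undeclarations is hypothetical, in-scope namespaces are modeled as plain dictionaries from prefix to URI, and any version other than 1.0 is assumed to be greater than 1.0.

```python
TRUE_VALUES = {"yes", "true", "1"}


def prefix_undeclarations(parent_ns, child_ns, undeclare_prefixes, version):
    """Return the xmlns:p="" attributes to write on the child element."""
    if undeclare_prefixes not in TRUE_VALUES:
        return []
    if version == "1.0":
        # Stands in for err:SEPM0010 in this sketch.
        raise ValueError("SEPM0010: prefixes cannot be undeclared in XML 1.0")
    dropped = sorted(p for p in parent_ns if p and p not in child_ns)
    return ['xmlns:%s=""' % p for p in dropped]


foo = {
    "x": "http://example.org/x",
    "y": "http://example.org/y",
    "z": "http://example.org/z",
    "xml": "http://www.w3.org/XML/1998/namespace",
}
bar = {k: v for k, v in foo.items() if k != "z"}

print(prefix_undeclarations(foo, bar, "yes", "1.1"))
```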

5.1.9 XML Output Method: the `normalization-form` Parameter

The normalization-form parameter is applicable to the XML output method. The values NFC and none must be supported by the serializer. A serialization error [err:SESU0011] results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer must raise the error.

The meanings associated with the possible values of the normalization-form parameter are as follows:

NFC specifies the serialized result will be in Normalization Form C, using the rules specified in [Character Model for the World Wide Web 1.0: Normalization].
NFD specifies the serialized result will be in Normalization Form D, as specified in [UAX #15: Unicode Normalization Forms].
NFKC specifies the serialized result will be in Normalization Form KC, as specified in [UAX #15: Unicode Normalization Forms].
NFKD specifies the serialized result will be in Normalization Form KD, as specified in [UAX #15: Unicode Normalization Forms].
fully-normalized specifies the serialized result will be in fully normalized text, as specified in [Character Model for the World Wide Web 1.0: Normalization].
none specifies that no Unicode Normalization will be applied.
An implementation-defined value has an implementation-defined effect.

If the value of the parameter is fully-normalized, then no relevant construct of the parsed entity created by the serializer may start with a composing character. The term relevant construct has the meaning defined in section 2.13 of [XML11]. If this condition is not satisfied, a serialization error [err:SERE0012] must be raised.
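
As an illustration, the four normalization forms defined by UAX #15 map directly onto Python's unicodedata.normalize. The sketch below is hypothetical in its naming and treats fully-normalized, like any other value it does not recognize, as unsupported, which is permitted because only NFC and none are required.

```python
import unicodedata

UAX15_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def apply_normalization_form(text, form):
    """Apply the normalization-form parameter to a string."""
    if form == "none":
        return text
    if form in UAX15_FORMS:
        return unicodedata.normalize(form, text)
    # Stands in for err:SESU0011 in this sketch.
    raise ValueError("SESU0011: unsupported normalization form " + form)


# "e" followed by COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
print(apply_normalization_form("e\u0301", "NFC") == "\u00e9")
```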

Note:

Specifying fully-normalized as the value of this parameter does not guarantee that the XML document output by the serializer will in fact be fully normalized as defined in [XML11]. This is because the serializer does not check that the text is include normalized, which would involve checking all external entities that it refers to (such as an external DTD). Furthermore, the serializer does not check whether any character escape generated using character maps represents a composing character.

5.1.10 XML Output Method: the `media-type` Parameter

The media-type parameter is applicable to the XML output method. See 3 Serialization Parameters for more information.

5.1.11 XML Output Method: the `use-character-maps` Parameter

The use-character-maps parameter is applicable to the XML output method. The result of serialization using the XML output method is not guaranteed to be well-formed XML if character maps have been specified. See 11 Character Maps for more information.

5.1.12 XML Output Method: the `byte-order-mark` Parameter

The byte-order-mark parameter is applicable to the XML output method. See 3 Serialization Parameters for more information.

Note:

The byte order mark may be undesirable under certain circumstances, for example, to concatenate resulting XML fragments without additional processing to remove the byte order mark. Therefore this specification does not mandate the byte-order-mark parameter to have one of the values yes, true or 1 when the encoding is UTF-16, even though the XML 1.0 and XML 1.1 specifications state that entities encoded in UTF-16 must begin with a byte order mark. Consequently, this specification does not guarantee that the resulting XML fragment, without a byte order mark, will not cause an error when processed by a conforming XML processor.

5.1.13 XML Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.

5.1.14 XML Output Method: the `escape-uri-attributes` Parameter

The escape-uri-attributes parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.

5.1.15 XML Output Method: the `include-content-type` Parameter

The include-content-type parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.

5.1.16 XML Output Method: the `item-separator` Parameter

The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.

5.1.17 XML Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names serialization parameter is not applicable to the XML output method.

5.1.18 XML Output Method: the `json-node-output-method` Parameter

The json-node-output-method serialization parameter is not applicable to the XML output method.

6 XHTML Output Method

Changes in 4.0 ⬇ ⬆

In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">. [Issue 318 PR 342 14 February 2023]

The XHTML output method serializes the input tree as XML, using the HTML compatibility guidelines defined in the XHTML specification ([XHTML 1.0]) or the XHTML syntax of HTML5 (see [HTML5]).

[Definition: An element node is recognized as an HTML element by the XHTML output method if

- the element node is in the XHTML namespace, regardless of the value of the html-version serialization parameter or if the html-version serialization parameter is absent; or
- the value of the html-version serialization parameter is 5.0, the element has a null namespace URI, and the local part of the name is equal to the name of an element defined by HTML5 [HTML5], making the comparison without regard to case.

]
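
As a non-normative illustration, the two conditions of this definition might be sketched in Python as follows. The function name, the use of `None` for an absent parameter or a null namespace URI, and the deliberately incomplete set of HTML5 element names are all assumptions made for the example.

```python
from decimal import Decimal
from typing import Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, deliberately incomplete sample of HTML5 element names.
HTML5_ELEMENT_NAMES = {"html", "head", "body", "p", "br", "div", "span"}


def is_recognized_html_element(namespace_uri: Optional[str],
                               local_name: str,
                               html_version: Optional[Decimal]) -> bool:
    """Apply the two conditions of the definition above."""
    if namespace_uri == XHTML_NS:
        # Holds whatever the value of html-version, and also when it is absent.
        return True
    return (html_version == Decimal("5.0")
            and not namespace_uri
            and local_name.lower() in HTML5_ELEMENT_NAMES)
```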

It is entirely the responsibility of the person or process that creates the input tree to ensure that it conforms to the [XHTML 1.0] or [XHTML 1.1] specification if the html-version serialization parameter is absent or has a value less than 5.0, or to the XHTML syntax of HTML5 if the value of the html-version serialization parameter is 5.0. It is not an error if the instance of the data model is invalid XHTML. Equally, it is entirely under the control of the person or process that creates the input tree whether the output conforms to XHTML 1.0 Strict, XHTML 1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5]), [POLYGLOT] or any other specific definition of XHTML.

The serialization of the input tree follows the same rules as for the XML output method, with the general exceptions noted below and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. These differences are based on the HTML compatibility guidelines published in Appendix C of [XHTML 1.0] and on [POLYGLOT], both of which are designed to ensure that as far as possible, XHTML is rendered correctly on user agents designed originally to handle HTML.

If the value of the html-version serialization parameter is 5.0, the input tree is first subjected to prefix normalization.

[Definition: During prefix normalization, any element node in the input tree that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element’s namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the input tree is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.]

The process of prefix normalization is equivalent to replacing the input tree with the result of the transformation described by this XSLT stylesheet, with the root of the input tree as the initial context value.

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="4.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:mathml="http://www.w3.org/1998/Math/MathML">
  <xsl:template match="xhtml:*|svg:*|mathml:*">
    <xsl:element name="{local-name()}" 
                 namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*|namespace::*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template
    match="namespace::*[. eq 'http://www.w3.org/1999/xhtml']|
           namespace::*[. eq 'http://www.w3.org/2000/svg']|
           namespace::*[. eq 'http://www.w3.org/1998/Math/MathML']"/>
</xsl:stylesheet>

[Definition: The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.]

[Definition: The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.]

[Definition: An element node is expected to be empty if it is recognized as an HTML element and if either
- the html-version serialization parameter is absent or has a value less than 5.0 and the content model is EMPTY, or
- the html-version serialization parameter has the value 5.0 and the element is a void element.
]
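
The following Python sketch shows one way the "expected to be empty" test could be expressed. It is illustrative only: the function and constant names are hypothetical, `None` stands for an absent html-version parameter, and the caller is assumed to have already decided whether the element is recognized as an HTML element and to have normalized the case of the name where a case-insensitive comparison applies.

```python
from decimal import Decimal
from typing import Optional

# Element lists copied from the definitions above.
XHTML_EMPTY_CONTENT_MODEL = {
    "area", "base", "br", "col", "embed", "hr", "img", "input", "link",
    "meta", "basefont", "frame", "isindex", "param",
}
HTML5_VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "meta", "param", "source", "track", "wbr",
}


def is_expected_to_be_empty(local_name: str,
                            recognized_as_html: bool,
                            html_version: Optional[Decimal]) -> bool:
    if not recognized_as_html:
        return False
    if html_version is None or html_version < Decimal("5.0"):
        return local_name in XHTML_EMPTY_CONTENT_MODEL
    if html_version == Decimal("5.0"):
        return local_name in HTML5_VOID_ELEMENTS
    return False
```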

If an element node that has no child nodes is not expected to be empty, and:
- the html-version serialization parameter is absent or has a value less than 5.0, and the content model of the HTML element is not EMPTY (for example, an empty title or paragraph); or
- the value of the html-version serialization parameter is 5.0, and the HTML element is not a void element,
the serializer must not use the minimized form. That is, it must output <p></p> and not <p />.
If an element that has no children is expected to be empty, the serializer must use the minimized tag syntax, for example <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many legacy user agents. If the html-version serialization parameter is absent or has a value less than 5.0, the serializer must include a space before the trailing />, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />.
If the html-version serialization parameter is absent or has a value less than 5.0, the serializer must not use the entity reference &apos; which, although defined in XML and therefore in XHTML, is not defined in versions of HTML prior to HTML5, and is not recognized by all HTML user agents.
If the html-version serialization parameter is absent or has a value less than 5.0, the serializer should output namespace declarations in a way that is consistent with the requirements of the XHTML DTD if this is possible. If the value of the html-version serialization parameter is 5.0, the serializer should output namespace declarations in a way that is consistent with the requirements of [POLYGLOT]. The XHTML 1.0 DTDs require the declaration xmlns="http://www.w3.org/1999/xhtml" to appear on the html element, and only on the html element. The [POLYGLOT] specification permits namespace declarations to appear in a conforming document, but restricts the elements on which they can appear. The serializer must output namespace declarations that are consistent with the namespace nodes present in the result tree, but it must avoid outputting redundant namespace declarations on elements where the DTD would make them invalid, for versions prior to HTML5, or where they are not permitted by [POLYGLOT], for serialization according to the syntax of HTML5.
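
The rules above for elements without children can be sketched, non-normatively, in Python. The function name is hypothetical, attributes are omitted for brevity, and the choice to omit the space before `/>` when html-version is 5.0 is an assumption of the sketch rather than a requirement stated here.

```python
from decimal import Decimal
from typing import Optional


def serialize_childless_element(name: str,
                                expected_empty: bool,
                                html_version: Optional[Decimal]) -> str:
    """Serialize an attribute-free element that has no child nodes."""
    if not expected_empty:
        # Never the minimized form: a start tag followed by an end tag.
        return f"<{name}></{name}>"
    if html_version is None or html_version < Decimal("5.0"):
        # A space before the trailing "/>" is required in this case.
        return f"<{name} />"
    # Sketch's own choice for html-version 5.0.
    return f"<{name}/>"
```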

Note:

If the html element is generated by an XSLT literal result element of the form <html xmlns="http://www.w3.org/1999/xhtml"> ... </html>, or by an XQuery direct element constructor of the same form, then the html element in the result document will have a node name whose prefix is "", which will satisfy the requirements of the DTD. In other cases the prefix assigned to the element is implementation-dependent.

Note:

[POLYGLOT] and Appendix C of [XHTML 1.0] describe a number of compatibility guidelines for users of XHTML who wish to render their XHTML documents with HTML user agents. In some cases, such as the guideline on the form empty elements take, only the serialization process itself has the ability to follow the guideline. In such cases, those guidelines are reflected in the requirements on the serializer described above.

In all other cases, the guidelines can be adhered to by the input tree. The guideline on the use of whitespace characters in attribute values is one such example. Another example is that xml:lang="..." does not serialize to both xml:lang="..." and lang="..." as required by some legacy user agents. It is the responsibility of the person or process that creates the instance of the data model that is input to the serialization process to ensure it is created in a way that is consistent with the guidelines. No serialization error results if the input tree does not adhere to the guidelines.

6.1 The Influence of Serialization Parameters upon the XHTML Output Method

6.1.1 XHTML Output Method: the `version` Parameter

The behavior for the version parameter for the XHTML output method is described in 5.1.1 XML Output Method: the version Parameter.

6.1.2 XHTML Output Method: the `html-version` Parameter

The html-version parameter specifies whether the XHTML output method will produce a serialized document following rules that are tailored to the requirements of the XHTML syntax of [HTML5] or the requirements of [XHTML 1.0] and [XHTML 1.1].

The differences are described in detail throughout 6 XHTML Output Method.

6.1.3 XHTML Output Method: the `encoding` Parameter

The behavior for the encoding parameter for the XHTML output method is described in 5.1.3 XML Output Method: the encoding Parameter.

6.1.4 XHTML Output Method: the `indent` and `suppress-indentation` Parameters

If the indent parameter has one of the values yes, true or 1, the serializer may add or remove whitespace as it serializes the result tree, if it observes the following constraints.

Whitespace must not be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace must not be added or removed adjacent to an inline element. The inline elements are those elements recognized as HTML elements that are in the %inline category of any of the XHTML 1.0 DTDs, in the %inline.class category of the XHTML 1.1 DTD, those elements defined to be phrasing elements in HTML5 and elements recognized as HTML elements with local names ins and del if they are used as inline elements (i.e., if they do not contain element children).
Whitespace must not be added or removed inside a formatted element, the formatted elements being those recognized as HTML elements with local names pre, script, style, title, and textarea.
Whitespace characters must not be added in the content of an element whose expanded QName matches a member of the list of expanded QNames in the value of the suppress-indentation parameter. The expanded QName of an element node is considered to match a member of the list of expanded QNames if:
- the two expanded QNames are equal;
- the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
- the value of the html-version serialization parameter is 5.0, the local parts of the two QNames are equal without regard to case and one QName has a null namespace URI and the namespace URI of the other is equal to the XHTML namespace URI.
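
The three matching conditions for suppress-indentation can be sketched as a small Python predicate. This is an illustration under stated assumptions: the `QName` tuple and function name are hypothetical, `None` stands for a null namespace URI or an absent html-version parameter, and "without regard to case" is approximated with `str.lower()`.

```python
from decimal import Decimal
from typing import NamedTuple, Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


class QName(NamedTuple):
    namespace_uri: Optional[str]  # None stands for a null namespace URI
    local_name: str


def matches_suppress_indentation(element: QName,
                                 listed: QName,
                                 html_version: Optional[Decimal]) -> bool:
    # First condition: the two expanded QNames are equal.
    if element == listed:
        return True
    # The remaining conditions both need case-insensitively equal local parts.
    if element.local_name.lower() != listed.local_name.lower():
        return False
    uris = {element.namespace_uri, listed.namespace_uri}
    # Second condition: both namespace URIs are null.
    if uris == {None}:
        return True
    # Third condition: html-version is 5.0, one URI is null, the other is XHTML.
    return html_version == Decimal("5.0") and uris == {None, XHTML_NS}
```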

Note:

The effect of the above constraints is to ensure any insertion or deletion of whitespace would not affect how an HTML user agent that conforms to the specified version of HTML would render the output, assuming the serialized document does not refer to any HTML style sheets.

The HTML definition of whitespace is different from the XML definition: see section 9.1 of the [HTML] 4.01 specification.

6.1.5 XHTML Output Method: the `cdata-section-elements` Parameter

The behavior for the cdata-section-elements parameter for the XHTML output method is described in 5.1.5 XML Output Method: the cdata-section-elements Parameter.

6.1.6 XHTML Output Method: the `omit-xml-declaration` and `standalone` Parameters

The behavior for the omit-xml-declaration and standalone parameters for the XHTML output method is described in 5.1.6 XML Output Method: the omit-xml-declaration and standalone Parameters.

Note:

As with the XML output method, the XHTML output method specifies that an XML declaration will be output unless it is suppressed using the omit-xml-declaration parameter. Appendix C.1 of [XHTML 1.0] provides advice on the consequences of including, or omitting, the XML declaration.

6.1.7 XHTML Output Method: the `doctype-system` and `doctype-public` Parameters

If the value of the html-version serialization parameter is 5.0, the doctype-system serialization parameter is absent, the first element node child of the document node that is to be serialized is recognized as an HTML element, the local part of the QName of which is equal to the string HTML, without regard to case, and any text node preceding that element in document order contains only whitespace characters, then the XHTML output method must output a document type declaration immediately before the first element, with no public or system identifier. The name following <!DOCTYPE must be the same as the local part of the name of the element.

Otherwise, the behavior for the doctype-system and doctype-public parameters for the XHTML output method is described in 5.1.7 XML Output Method: the doctype-system and doctype-public Parameters.
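
The HTML5-specific rule in this section might be sketched in Python as shown below. The function and parameter names are hypothetical; `None` represents an absent parameter, and the caller is assumed to supply the concatenated text that precedes the first element child together with the result of the "recognized as an HTML element" test.

```python
from decimal import Decimal
from typing import Optional


def html5_doctype(root_local_name: str,
                  root_recognized_as_html: bool,
                  html_version: Optional[Decimal],
                  doctype_system: Optional[str],
                  preceding_text: str = "") -> Optional[str]:
    """Return the declaration to emit, or None if the rule does not apply."""
    applies = (html_version == Decimal("5.0")
               and doctype_system is None
               and root_recognized_as_html
               and root_local_name.lower() == "html"
               and preceding_text.strip(" \t\r\n") == "")
    # The name after <!DOCTYPE repeats the element's local name as written.
    return f"<!DOCTYPE {root_local_name}>" if applies else None
```

When the function returns `None`, the sketch leaves the decision to the XML output method rules referenced above.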

6.1.8 XHTML Output Method: the `undeclare-prefixes` Parameter

The behavior for the undeclare-prefixes parameter for the XHTML output method is described in 5.1.8 XML Output Method: the undeclare-prefixes Parameter.

6.1.9 XHTML Output Method: the `normalization-form` Parameter

The behavior for the normalization-form parameter for the XHTML output method is described in 5.1.9 XML Output Method: the normalization-form Parameter.

6.1.10 XHTML Output Method: the `media-type` Parameter

The behavior for the media-type parameter for the XHTML output method is described in 5.1.10 XML Output Method: the media-type Parameter.

6.1.11 XHTML Output Method: the `use-character-maps` Parameter

The behavior for the use-character-maps parameter for the XHTML output method is described in 5.1.11 XML Output Method: the use-character-maps Parameter.

6.1.12 XHTML Output Method: the `byte-order-mark` Parameter

The behavior for the byte-order-mark parameter for the XHTML output method is described in 5.1.12 XML Output Method: the byte-order-mark Parameter.

6.1.13 XHTML Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is not applicable to the XHTML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XHTML output method, or if the parameter is simply dropped.

6.1.14 XHTML Output Method: the `escape-uri-attributes` Parameter

If the escape-uri-attributes parameter has one of the values yes, true or 1, the XHTML output method must apply URI escaping to URI attribute values, except that relative URIs must not be absolutized.

Note:

This escaping is deliberately confined to non-ASCII characters, because escaping of ASCII characters is not always appropriate, for example when URIs or URI fragments are interpreted locally by the HTML user agent. Even in the case of non-ASCII characters, escaping can sometimes cause problems. More precise control of URI escaping is therefore available by setting escape-uri-attributes to no, and controlling the escaping of URIs by using methods defined in Section 6.2 fn:encode-for-uri^FO40 and Section 6.4 fn:iri-to-uri^FO40.
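
As a rough illustration of escaping that is confined to non-ASCII characters, the following Python sketch percent-encodes the UTF-8 octets of every character outside the printable ASCII range and leaves everything else untouched. It assumes behavior modelled on fn:escape-html-uri, omits any Unicode normalization step, and uses a hypothetical function name.

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-encode each character outside printable ASCII (32 to 126)."""
    out = []
    for ch in uri:
        if 32 <= ord(ch) <= 126:
            out.append(ch)  # ASCII is left alone, including "#", "?", "&" and " "
        else:
            # One %HH escape per UTF-8 octet of the character.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

A relative URI stays relative, since the sketch never resolves its argument against a base URI.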

6.1.15 XHTML Output Method: the `include-content-type` Parameter

If the input tree includes a head element recognized as an HTML element, and the include-content-type parameter has one of the values yes, true or 1, the XHTML output method must add a meta element as the first child element of the head element, specifying the character encoding actually used. The meta element should be in no namespace if the head element is in no namespace, and in the XHTML namespace if the head element is in the XHTML namespace.

For example,

<head>
<meta http-equiv="Content-Type" 
      content="text/html; charset=EUC-JP" />
...

For HTML5, the alternative form <meta charset="EUC-JP"/> may be used.

The content type, if included, should be set to the value given for the media-type parameter.

Note:

It is recommended that the host language use as default value for this parameter one of the MIME types ([RFC2046]) registered for XHTML. Currently, these are text/html (registered by [RFC2854]) and application/xhtml+xml (registered by [RFC3236]). Note that some user agents fail to recognize the charset parameter if the content type is not text/html.

If a meta element has been added to the head element as described above, then any existing meta element child of the head element having either a charset attribute, or an http-equiv attribute with the value "Content-Type", making the comparison without regard to case after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison, must be discarded.

Note:

This process removes possible parameters in the attribute value. For example,

<meta http-equiv="Content-Type" 
      content="text/html;version='4.0'" />

in the input tree might be replaced by

<meta http-equiv="Content-Type" 
      content="text/html;charset=utf-8" />

or by

<meta charset="utf-8"/>
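
The add-and-discard behavior for meta elements can be sketched in Python. The representation is hypothetical: each meta element child of head is modelled as a dictionary of its attributes, other children of head are not modelled, and the sketch always emits the http-equiv form rather than the shorter charset form.

```python
from typing import Dict, List


def replace_content_type_meta(metas: List[Dict[str, str]],
                              media_type: str,
                              encoding: str) -> List[Dict[str, str]]:
    """Return the meta elements of head after the new one has been added."""

    def is_superseded(attrs: Dict[str, str]) -> bool:
        if "charset" in attrs:
            return True
        http_equiv = attrs.get("http-equiv")
        # Compare without regard to case, ignoring leading and trailing spaces.
        return (http_equiv is not None
                and http_equiv.strip(" ").lower() == "content-type")

    kept = [attrs for attrs in metas if not is_superseded(attrs)]
    added = {"http-equiv": "Content-Type",
             "content": f"{media_type}; charset={encoding}"}
    return [added] + kept  # the new element becomes the first child
```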

6.1.16 XHTML Output Method: the `item-separator` Parameter

The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.

6.1.17 XHTML Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names serialization parameter is not applicable to the XHTML output method.

6.1.18 XHTML Output Method: the `json-node-output-method` Parameter

The json-node-output-method serialization parameter is not applicable to the XHTML output method.

7 HTML Output Method

Changes in 4.0 ⬇ ⬆

In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">. [Issue 318 PR 342 14 February 2023]

The HTML output method serializes the input tree as HTML.

For example, the following XSLT stylesheet generates HTML output:

<xsl:stylesheet version="2.0" 
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="4.0"/>
<xsl:template match="/">
  <html>
    <xsl:apply-templates/>
  </html>
</xsl:template>
...
</xsl:stylesheet>

In the example, the version attribute of the xsl:output element indicates the version of the HTML Recommendation [HTML] to which the serialized result is to conform.

It is entirely the responsibility of the person or process that creates the input tree to ensure that it conforms to the HTML Recommendation [HTML]. It is not an error if the input tree is invalid HTML. Equally, it is entirely under the control of the person or process that creates the input tree whether the output conforms to HTML. If the result tree is valid HTML, the serializer must serialize the result in a way that conforms with the version of HTML specified by the requested HTML version.

7.1 Markup for Elements

As is described in detail below, the HTML output method will not output an element differently from the XML output method unless the element is to be serialized as an HTML element. [Definition: The portion of the serialized document representing the result of serializing an element that is not to be serialized as an HTML element is known as an XML Island.] [Definition: An element node is serialized as an HTML element if

- the expanded QName of the element has a null namespace URI, regardless of the value of the requested HTML version, or
- the value of the requested HTML version is 5.0 or greater, and the element node is in the XHTML namespace.

]
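
A non-normative Python sketch of this definition follows. The function name is hypothetical and `None` (or an empty string) stands for a null namespace URI; any element for which the function returns `False` would be serialized as an XML Island.

```python
from decimal import Decimal
from typing import Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_serialized_as_html_element(namespace_uri: Optional[str],
                                  requested_html_version: Decimal) -> bool:
    if not namespace_uri:
        # Null namespace URI: true for every requested HTML version.
        return True
    return (requested_html_version >= Decimal("5.0")
            and namespace_uri == XHTML_NS)
```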

If the element is to be serialized as an HTML element, but the local part of the expanded QName is not recognized as the name of an HTML element, the element must be output in the same way as a non-empty, inline element such as span. In particular:

Any namespace node in the result tree for the XML namespace is ignored by the HTML output method. In addition, if the requested HTML version is 5.0, any element node that has a prefix and is in the XHTML namespace, MathML namespace, or SVG namespace must be serialized with an unprefixed element name. The serializer must serialize an attribute with the name xmlns whose value is equal to the namespace URI of the element node, unless an ancestor element in the serialized result already has an attribute named xmlns with the same value, and no intervening element has an attribute named xmlns with a different value. If the element node has a namespace node for the default namespace whose value is not equal to the namespace URI of the element node, the namespace node is ignored. The serializer must not serialize a namespace declaration for the namespace node declaring the element node’s prefix, unless an attribute of the element node has the same prefix. For namespace nodes in the result tree that are not ignored, the HTML output method must represent these namespaces using attributes named xmlns or xmlns:prefix in the same way as the XML output method would represent them when the version parameter is set to 1.0.
If the result tree contains elements or attributes whose names have a non-null namespace URI, the HTML output method must generate namespace-prefixed QNames for these nodes in the same way as the XML output method would do when the version parameter is set to 1.0.
Where special rules are defined later in this section for serializing specific HTML elements and attributes, these rules must not be applied to an element that is not to be serialized as an HTML element or an attribute whose name has a non-null namespace URI. However, the generic rules for the HTML output method that apply to all elements and attributes, for example the rules for escaping special characters in the text and the rules for indentation, must be used also for namespaced elements and attributes.
When serializing an element whose name is not defined in the HTML specification, but that is to be serialized as an HTML element, the HTML output method must apply the same rules (for example, indentation rules) as when serializing a span element. The descendants of such an element must be serialized as if they were descendants of a span element.
When serializing an element whose name is in a non-null namespace, the HTML output method must apply the same rules (for example, indentation rules) as when serializing a div element. The descendants of such an element must be serialized as if they were descendants of a div element, except for the influence of the cdata-section-elements serialization parameter on any text node children of the element.

The HTML output method must not output an end-tag for an empty element if the element type has an empty content model, and the value of the requested HTML version is less than 5.0, or the element is a void element and the value of the requested HTML version is 5.0. For example, an element written as <br/> or <br></br> in an XSLT stylesheet must be output as <br>.

For HTML 4.0, the element types that have an empty content model are area, base, basefont, br, col, embed, frame, hr, img, input, isindex, link, meta and param. For HTML5, the void elements are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr. It is implementation-defined whether the basefont, frame and isindex elements, which are not part of HTML5, are considered to be void elements when the requested HTML version has the value 5.0.

Note:

The markup generation step of the phases of serialization only creates start tags and end tags for the HTML output method, never XML-style empty element tags. As such, a serializer must serialize an HTML element that has no children, but whose content model is not empty, using a pair of adjacent start and end element tags, or as a solitary start tag if permitted by the context.

For any element node that is to be serialized as an HTML element, the HTML output method must compare the local part of the name of the element node with the names of HTML elements, making the comparison without regard to case. If the local part of the name of the element node compares equal to that of any HTML element, the element node must be recognized as being that kind of HTML element. For example, elements named br, BR or Br must all be recognized as the HTML br element and output without an end tag.
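
The end-tag and case-insensitive recognition rules above might be sketched in Python as follows. The names are hypothetical, attributes are omitted, the requested HTML version is assumed to be no greater than 5.0, and the sketch chooses not to treat basefont, frame and isindex as void for version 5.0, which is only one of the implementation-defined options.

```python
from decimal import Decimal

# Element lists copied from the text above.
HTML4_EMPTY_CONTENT_MODEL = {
    "area", "base", "basefont", "br", "col", "embed", "frame", "hr", "img",
    "input", "isindex", "link", "meta", "param",
}
HTML5_VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "meta", "param", "source", "track", "wbr",
}


def serialize_childless_html_element(name: str,
                                     requested_html_version: Decimal) -> str:
    """Serialize an attribute-free HTML element that has no children."""
    if requested_html_version == Decimal("5.0"):
        no_end_tag = HTML5_VOID_ELEMENTS
    else:
        no_end_tag = HTML4_EMPTY_CONTENT_MODEL
    # Names are compared without regard to case: br, BR and Br are all "br".
    if name.lower() in no_end_tag:
        return f"<{name}>"
    # Never an XML-style empty element tag: adjacent start and end tags.
    return f"<{name}></{name}>"
```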

The HTML output method must not perform escaping for any text node descendant, nor for any attribute of an element node descendant, of a script or style element.

For example, a script element created by an XQuery direct element constructor or an XSLT literal result element, such as:

<script>if (a &lt; b) foo()</script>

or

<script><![CDATA[if (a < b) foo()]]></script>

must be output as

<script>if (a < b) foo()</script>

A common requirement is to output a script element as shown in the example below:

<script type="application/ecmascript">
      document.write ("<em>This won't work</em>")
</script>

This is invalid HTML, for the reasons explained in section B.3.2 of the [HTML] 4.01 specification. Nevertheless, it is possible to output this fragment using either of the following constructs:

Firstly, by use of a script element created by an XQuery direct element constructor or an XSLT literal result element:

<script type="application/ecmascript">
      document.write ("<em>This won't work</em>")
</script>

Secondly, by constructing the markup from ordinary text characters:

<script type="application/ecmascript">
      document.write ("&lt;em&gt;This won't work&lt;/em&gt;")
</script>

As the [HTML] specification points out, the correct way to write this is to use the escape conventions for the specific scripting language. For JavaScript, it can be written as:

<script type="application/ecmascript">
      document.write ("&lt;em&gt;This will work&lt;\/em&gt;")
</script>

The [HTML] 4.01 specification also shows examples of how to write this in various other scripting languages. The escaping must be done explicitly; it will not be done by the serializer.

7.2 Writing Attributes

The HTML output method must not escape "<" characters occurring in attribute values.

A boolean attribute is an attribute with only a single allowed value in any of the HTML DTDs or that is specified to be a boolean attribute by HTML5 (see [HTML5]), where the allowed value is equal without regard to case to the name of the attribute. The HTML output method must output any boolean attribute in minimized form if and only if the value of the attribute node is equal to the name of the attribute, making the comparison without regard to case.

For example, a start-tag created using the following XQuery direct element constructor or XSLT literal result element

<OPTION selected="selected">

must be output as

<OPTION selected>
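
The minimization rule for boolean attributes can be sketched in Python. The set of attribute names is a hypothetical, incomplete sample, the function name is an assumption, and escaping of the attribute value is left out to keep the example short.

```python
# Hypothetical, deliberately incomplete sample of boolean attribute names.
BOOLEAN_ATTRIBUTES = {"checked", "disabled", "multiple", "readonly", "selected"}


def serialize_attribute(name: str, value: str) -> str:
    """Serialize one attribute, minimizing it only when the rule allows."""
    is_boolean = name.lower() in BOOLEAN_ATTRIBUTES
    if is_boolean and value.lower() == name.lower():
        return name  # minimized form
    return f'{name}="{value}"'
```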

The HTML output method must not escape a & character occurring in an attribute value immediately followed by a { character (see Section B.7.1 of the HTML Recommendation [HTML]).

For example, a start-tag created using the following XQuery direct element constructor or XSLT literal result element

<BODY bgcolor='&amp;{{randomrbg}};'>

must be output as

<BODY bgcolor='&{randomrbg};'>
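
The two attribute-escaping exceptions in this section can be sketched with a regular expression in Python. The function name is hypothetical, and the treatment of the double quote is an assumption of the sketch for values delimited by double quotes; only the handling of `<` and of `&` before `{` is taken from the text above.

```python
import re

# An ampersand that is not immediately followed by "{".
_AMP_NOT_BEFORE_BRACE = re.compile(r"&(?!\{)")


def escape_html_attribute_value(value: str) -> str:
    # "&{" is left alone; every other "&" is escaped.
    escaped = _AMP_NOT_BEFORE_BRACE.sub("&amp;", value)
    # "<" is deliberately not escaped in attribute values.
    return escaped.replace('"', "&quot;")
```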

See 7.4 The Influence of Serialization Parameters upon the HTML Output Method for additional directives on how attributes may be written.

7.3 Writing Character Data

The HTML output method may output a character using a character entity reference in preference to using a numeric character reference, if an entity is defined for the character in the version of HTML that the output method is using. Entity references and character references should be used only where the character is not present in the selected encoding, or where the visual representation of the character is unclear (as with &nbsp;, for example).

When outputting a sequence of whitespace characters in the input tree, within an element where whitespace characters are treated normally (but not in elements such as pre and textarea), the HTML output method may represent it using any sequence of whitespace characters that will be treated in the same way by an HTML user agent. See section 3.5 of [XHTML Modularization] for some additional information on handling of whitespace by an HTML user agent for versions of HTML prior to HTML5, and see [HTML5] for information on the handling of whitespace characters by an HTML5 user agent.

Note:

The terms space character and whitespace character defined in HTML5 do not match the definition of whitespace character in this specification.

Certain characters are permitted in XML, but not in HTML prior to HTML5. For example, the control characters U+007F (DELETE) through U+009F (APC) are permitted in both XML 1.0 and XML 1.1, and the control characters U+0001 (SOH) through U+0008 (BACKSPACE), U+000B (VERTICAL TAB), U+000C (FORM FEED) and U+000E (SHIFT OUT) through U+001F (IS1) are permitted in XML 1.1, but none of these is permitted in HTML prior to HTML5. It is a serialization error [err:SERE0014] to use the HTML output method if such characters appear in the input tree and the value of the requested HTML version is less than 5.0. The serializer must raise the error.
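
A check for the control characters listed above might look like the following Python sketch. The exception class and function names are hypothetical, and the sketch covers only the ranges named in the example, not every character that a given version of HTML disallows.

```python
from decimal import Decimal


class SerializationError(Exception):
    """Hypothetical error type carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def _is_listed_control_character(ch: str) -> bool:
    cp = ord(ch)
    return (0x7F <= cp <= 0x9F        # DELETE through APC
            or 0x01 <= cp <= 0x08     # SOH through BACKSPACE
            or cp in (0x0B, 0x0C)     # VERTICAL TAB, FORM FEED
            or 0x0E <= cp <= 0x1F)    # SHIFT OUT through IS1


def check_html_characters(text: str, requested_html_version: Decimal) -> None:
    """Raise SERE0014 if text cannot be serialized as pre-HTML5 HTML."""
    if requested_html_version >= Decimal("5.0"):
        return
    for ch in text:
        if _is_listed_control_character(ch):
            raise SerializationError(
                "SERE0014",
                f"U+{ord(ch):04X} is not permitted in HTML prior to HTML5")
```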

The HTML output method must terminate processing instructions with > rather than ?>. It is a serialization error [err:SERE0015] to use the HTML output method when > appears within a processing instruction in the input tree.
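
The processing instruction rule can be sketched in a few lines of Python. The exception class and function names are hypothetical, and the handling of an instruction with no data is the sketch's own choice.

```python
class SerializationError(Exception):
    """Hypothetical error type carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def serialize_processing_instruction(target: str, data: str) -> str:
    if ">" in data:
        raise SerializationError(
            "SERE0015", "'>' appears within a processing instruction")
    # Terminated with ">" rather than "?>".
    return f"<?{target} {data}>" if data else f"<?{target}>"
```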

7.4 The Influence of Serialization Parameters upon the HTML Output Method

7.4.1 HTML Output Method: the `version` and `html-version` Parameters

The html-version or the version serialization parameter indicates the version of the HTML Recommendation [HTML] or [HTML5] to which the serialized result is to conform. [Definition: If the html-version serialization parameter is not absent, the requested HTML version is the value of the html-version serialization parameter; otherwise, it is the value of the version serialization parameter.] If the serializer does not support the version of HTML specified by the requested HTML version, it must raise a serialization error [err:SESU0013].

This document provides the normative definition of serialization for the HTML output method if the requested HTML version has the lexical form of a value of type decimal whose value is 1.0 or greater, but no greater than 5.0. For any other value of the version parameter, the behavior is implementation-defined. In that case the implementation-defined behavior may supersede all other requirements of this recommendation.
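
As an illustration, the selection of the requested HTML version and the range check described above might be sketched in Python. The function names and the dictionary representation of the serialization parameters are hypothetical, and the regular expression is the sketch's approximation of the lexical form of a decimal value.

```python
import re
from decimal import Decimal
from typing import Mapping, Optional

# Approximation of the decimal lexical form: optional sign, digits, optional fraction.
_DECIMAL_LEXICAL_FORM = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def requested_html_version(params: Mapping[str, str]) -> Optional[str]:
    """html-version wins when present; otherwise fall back to version."""
    if "html-version" in params:
        return params["html-version"]
    return params.get("version")


def is_normatively_defined(version: str) -> bool:
    """True if this document defines the behavior for the given version."""
    if _DECIMAL_LEXICAL_FORM.fullmatch(version) is None:
        return False
    return Decimal("1.0") <= Decimal(version) <= Decimal("5.0")
```

For any version where `is_normatively_defined` returns `False`, a serializer would fall back to its own implementation-defined behavior or raise [err:SESU0013] if the version is unsupported.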

7.4.2 HTML Output Method: the `encoding` Parameter

The encoding parameter specifies the encoding to be used. Serializers are required to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding. The serializer must raise the error.

It is possible that the input tree will contain a character that cannot be represented in the encoding that the serializer is using for output. In this case, if the character occurs in a context where HTML recognizes character references, then the character must be output as a character entity reference or decimal numeric character reference; otherwise (for example, in a script or style element or in a comment), the serializer must raise a serialization error [err:SERE0008].
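
The two outcomes for an unrepresentable character can be sketched in Python using the standard `xmlcharrefreplace` error handler, which substitutes decimal numeric character references. The exception class, the function name and the boolean flag describing the context are hypothetical, and the sketch does not attempt to choose a character entity reference in preference to a numeric one.

```python
class SerializationError(Exception):
    """Hypothetical error type carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def encode_html_text(text: str, encoding: str,
                     allows_character_references: bool) -> bytes:
    if allows_character_references:
        # Unrepresentable characters become decimal references such as &#233;.
        return text.encode(encoding, errors="xmlcharrefreplace")
    try:
        # For example inside script or style elements, or in a comment.
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        raise SerializationError(
            "SERE0008", "character cannot be represented in the encoding"
        ) from exc
```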

See 7.4.14 HTML Output Method: the include-content-type Parameter regarding how this parameter is used with the include-content-type parameter.

7.4.3 HTML Output Method: the `indent` and `suppress-indentation` Parameters

If the indent parameter has one of the values yes, true or 1, then the HTML output method may add or remove whitespace as it serializes the result tree, if it observes the following constraints.

Whitespace must not be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace must not be added or removed adjacent to an inline element. The inline elements are those included in the %inline category of any of the HTML 4.01 DTDs or those elements defined to be phrasing elements in HTML5, as well as the ins and del elements if they are used as inline elements (i.e., if they do not contain element children).
Whitespace must not be added or removed inside a formatted element, the formatted elements being pre, script, style, title, and textarea.
Whitespace characters must not be added in the content of an element whose expanded QName matches a member of the list of expanded QNames in the value of the suppress-indentation parameter. The expanded QName of an element node is considered to match a member of the list of expanded QNames if:
- the two expanded QNames are equal;
- the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
- the value of the requested HTML version is 5.0, the local parts of the two QNames are equal without regard to case and one QName has a null namespace URI and the namespace URI of the other is equal to the XHTML namespace URI.
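The three matching conditions can be sketched as follows. This is a non-normative illustration: the `QName` type and `matches` function are hypothetical, a null namespace URI is modeled as `None`, comparison "without regard to case" is approximated with `str.lower`, and the caller is assumed to have already determined whether the requested HTML version is 5.0.

```python
from typing import NamedTuple, Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


class QName(NamedTuple):
    namespace: Optional[str]  # None models a null namespace URI
    local: str


def matches(element: QName, listed: QName, version_is_5: bool) -> bool:
    """Does an element name match a member of suppress-indentation?"""
    # Condition 1: the two expanded QNames are equal.
    if element == listed:
        return True
    same_local = element.local.lower() == listed.local.lower()
    # Condition 2: both namespaces null, local parts equal ignoring case.
    if element.namespace is None and listed.namespace is None:
        return same_local
    # Condition 3: version 5.0, one namespace null and the other XHTML.
    if version_is_5 and same_local:
        return {element.namespace, listed.namespace} == {None, XHTML_NS}
    return False
```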

Note:

The effect of the above constraints is to ensure that any insertion or deletion of whitespace would not affect how a conforming HTML user agent would render the output, assuming the serialized document does not refer to any HTML style sheets.

Note that the HTML definition of whitespace is different from the XML definition (see section 9.1 of the [HTML] specification).

7.4.4 HTML Output Method: the `cdata-section-elements` Parameter

The cdata-section-elements parameter is not applicable to the HTML output method, except in the case of XML Islands.

7.4.5 HTML Output Method: the `omit-xml-declaration` and `standalone` Parameters

The omit-xml-declaration and standalone parameters are not applicable to the HTML output method.

7.4.6 HTML Output Method: the `doctype-system` and `doctype-public` Parameters

If the doctype-public or doctype-system parameters are specified, then the HTML output method must output a document type declaration. If the doctype-public parameter is specified, then the output method must output PUBLIC followed by the specified public identifier; if the doctype-system parameter is also specified, it must also output the specified system identifier following the public identifier. If the doctype-system parameter is specified but the doctype-public parameter is not specified, then the output method must output SYSTEM followed by the specified system identifier.

If the value of the requested HTML version is 5.0, the doctype-public and doctype-system serialization parameters are both absent, the first element node child of the document node that is to be serialized is to be serialized as an HTML element, the local part of the QName of which is equal to the string HTML, without regard to case, and any text node that precedes that element node in the document contains only whitespace characters, then the HTML output method must output a document type declaration, with no public or system identifier.

If the HTML output method must output a document type declaration, it must be serialized immediately before the first element, if any, and the name following <!DOCTYPE must be HTML or html.
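A non-normative sketch of these rules is shown below. The function name and the `html5_default_applies` flag are hypothetical: the flag stands for the caller having already established that the conditions for the HTML5 default declaration hold. The sketch assumes that the identifiers contain no double quotation marks.

```python
from typing import Optional


def html_doctype(doctype_public: Optional[str] = None,
                 doctype_system: Optional[str] = None,
                 html5_default_applies: bool = False) -> Optional[str]:
    """Return the document type declaration to output, or None for none."""
    if doctype_public is not None:
        decl = f'<!DOCTYPE html PUBLIC "{doctype_public}"'
        if doctype_system is not None:
            # The system identifier follows the public identifier.
            decl += f' "{doctype_system}"'
        return decl + ">"
    if doctype_system is not None:
        return f'<!DOCTYPE html SYSTEM "{doctype_system}">'
    if html5_default_applies:
        # Requested version 5.0, both parameters absent, html root element.
        return "<!DOCTYPE html>"
    return None
```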

7.4.7 HTML Output Method: the `undeclare-prefixes` Parameter

The undeclare-prefixes parameter is not applicable to the HTML output method.

7.4.8 HTML Output Method: the `normalization-form` Parameter

The normalization-form parameter is applicable to the HTML output method. The values NFC and none must be supported by the serializer. A serialization error [err:SESU0011] results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer must raise the error.

7.4.9 HTML Output Method: the `media-type` Parameter

The media-type parameter is applicable to the HTML output method. See 3 Serialization Parameters for more information. See 7.4.14 HTML Output Method: the include-content-type Parameter regarding how this parameter is used with the include-content-type parameter.

7.4.10 HTML Output Method: the `use-character-maps` Parameter

The use-character-maps parameter is applicable to the HTML output method. See 11 Character Maps for more information.

7.4.11 HTML Output Method: the `byte-order-mark` Parameter

The byte-order-mark parameter is applicable to the HTML output method. See 3 Serialization Parameters for more information.

7.4.12 HTML Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is not applicable to the HTML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the HTML output method, or if the parameter is simply dropped.

7.4.13 HTML Output Method: the `escape-uri-attributes` Parameter

If the escape-uri-attributes parameter has one of the values yes, true or 1, the HTML output method must apply URI escaping to URI attribute values, except that relative URIs must not be absolutized.

Note:

7.4.14 HTML Output Method: the `include-content-type` Parameter

If there is a head element, and the include-content-type parameter has one of the values yes, true or 1, the HTML output method must add a meta element as the first child element of the head element specifying the character encoding actually used. The meta element may take either the traditional form using an http-equiv attribute or the newer HTML5 form using a charset attribute.

For example,

<head>
<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">
...

<head>
<meta charset="EUC-JP">
...

The content type, if included, must be set to the value given for the media-type parameter.

If a meta element has been added to the head element as described above, then any existing meta element child of the head element having a charset attribute or an http-equiv attribute with the value "Content-Type", making the comparison without regard to case after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison, must be discarded.
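The addition of the new meta element and the test for discarding existing ones can be sketched as follows. This is a non-normative illustration with hypothetical function names. Attributes are modeled as a dictionary with lowercase names, and "without regard to case" is approximated with `str.lower`.

```python
from typing import Mapping


def content_type_meta(encoding: str, media_type: str, html5_form: bool) -> str:
    """Build the meta element added as the first child of head."""
    if html5_form:
        return f'<meta charset="{encoding}">'
    return (f'<meta http-equiv="Content-Type" '
            f'content="{media_type}; charset={encoding}">')


def should_discard_meta(attributes: Mapping[str, str]) -> bool:
    """Decide whether an existing meta child of head must be discarded."""
    if "charset" in attributes:
        return True
    http_equiv = attributes.get("http-equiv")
    if http_equiv is None:
        return False
    # Strip leading and trailing spaces for the comparison only.
    return http_equiv.strip(" ").lower() == "content-type"
```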

Note:

This process removes possible parameters in the attribute value. For example,

<meta http-equiv="Content-Type" 
      content="text/html;version='4.0'">

in the input tree might be replaced by

<meta http-equiv="Content-Type" 
      content="text/html;charset=utf-8">

or by

<meta charset="utf-8">

7.4.15 HTML Output Method: the `item-separator` Parameter

The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.

7.4.16 HTML Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names serialization parameter is not applicable to the HTML output method.

7.4.17 HTML Output Method: the `json-node-output-method` Parameter

The json-node-output-method serialization parameter is not applicable to the HTML output method.

8 Text Output Method

The Text output method serializes the input tree by outputting the string value of the document node created by the markup generation step of the phases of serialization without any escaping.

A newline character in the input tree may be output using any character sequence that is conventionally used to represent a line ending in the chosen system environment.

Note:

The rule just stated applies to the character U+000A (NEWLINE); it does not apply to occurrences in the input tree of U+000D (CARRIAGE RETURN), U+0085, or U+2028: these should be output literally, regardless of the conventions for line endings in the system environment.

To illustrate, the following table shows the expected output for various character sequences in environments which conventionally use U+000A (NEWLINE) (LF, as in Linux systems), U+000D (CARRIAGE RETURN) followed by U+000A (NEWLINE) (CR+LF, Windows), U+000D (CARRIAGE RETURN) (CR only, older versions of Mac OS), U+0085 (some IBM operating systems), or U+2028 to separate lines:

Expected output for various character sequences
Input	#xA systems	#xD#xA systems	#xD systems	#x85 systems	#x2028 systems
character #xD	character #xD	character #xD	character #xD	character #xD	character #xD
character #xA	character #xA	string #xD + #xA	character #xD	character #x85	character #x2028
string #xD + #xA	string #xD + #xA	string #xD + #xD + #xA	string #xD + #xD	string #xD + #x85	string #xD + #x2028
string #xD + #xD + #xA	string #xD + #xD + #xA	string #xD + #xD + #xD + #xA	string #xD + #xD + #xD	string #xD + #xD + #x85	string #xD + #xD + #x2028
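The behavior shown in the table amounts to replacing each U+000A with the line ending of the environment and leaving every other character untouched. A non-normative sketch follows; the function name is hypothetical and the line ending is passed in explicitly rather than detected.

```python
def apply_line_ending(text: str, line_ending: str) -> str:
    """Replace each U+000A with the environment's line ending.

    U+000D, U+0085 and U+2028 in the input are output literally.
    """
    return text.replace("\n", line_ending)
```

For example, on a system that uses CR+LF, the input string #xD + #xA becomes #xD + #xD + #xA, because only the final newline is rewritten.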

8.1 The Influence of Serialization Parameters upon the Text Output Method

8.1.1 Text Output Method: the `version` Parameter

The version parameter is not applicable to the Text output method.

8.1.2 Text Output Method: the `html-version` Parameter

The html-version parameter is not applicable to the Text output method.

8.1.3 Text Output Method: the `encoding` Parameter

The encoding parameter identifies the encoding that the Text output method must use to convert sequences of characters to sequences of bytes. Serializers are required to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if the serializer does not support the encoding specified by the encoding parameter. The serializer must raise the error. If the input tree contains a character that cannot be represented in the encoding that the serializer is using for output, the serializer must raise a serialization error [err:SERE0008].

8.1.4 Text Output Method: the `indent` and `suppress-indentation` Parameters

The indent and suppress-indentation parameters are not applicable to the Text output method.

8.1.5 Text Output Method: the `cdata-section-elements` Parameter

The cdata-section-elements parameter is not applicable to the Text output method.

8.1.6 Text Output Method: the `omit-xml-declaration` and `standalone` Parameters

The omit-xml-declaration and standalone parameters are not applicable to the Text output method.

8.1.7 Text Output Method: the `doctype-system` and `doctype-public` Parameters

The doctype-system and doctype-public parameters are not applicable to the Text output method.

8.1.8 Text Output Method: the `undeclare-prefixes` Parameter

The undeclare-prefixes parameter is not applicable to the Text output method.

8.1.9 Text Output Method: the `normalization-form` Parameter

The normalization-form parameter is applicable to the Text output method. The values NFC and none must be supported by the serializer. A serialization error [err:SESU0011] results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer must raise the error.

8.1.10 Text Output Method: the `media-type` Parameter

The media-type parameter is applicable to the Text output method. See 3 Serialization Parameters for more information.

8.1.11 Text Output Method: the `use-character-maps` Parameter

The use-character-maps parameter is applicable to the Text output method. See 11 Character Maps for more information.

8.1.12 Text Output Method: the `byte-order-mark` Parameter

The byte-order-mark parameter is applicable to the Text output method. See 3 Serialization Parameters for more information.

8.1.13 Text Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is not applicable to the Text output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the Text output method, or if the parameter is simply dropped.

8.1.14 Text Output Method: the `escape-uri-attributes` Parameter

The escape-uri-attributes parameter is not applicable to the Text output method.

8.1.15 Text Output Method: the `include-content-type` Parameter

The include-content-type parameter is not applicable to the Text output method.

8.1.16 Text Output Method: the `item-separator` Parameter

The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.

8.1.17 Text Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names serialization parameter is not applicable to the Text output method.

8.1.18 Text Output Method: the `json-node-output-method` Parameter

The json-node-output-method serialization parameter is not applicable to the Text output method.

9 JSON Output Method

Changes in 4.0 ⬇ ⬆

Added the escape-solidus parameter for JSON serialization. [Issue 530 PR 534 6 June 2023]

The JSON output method serializes the input tree as a JSON value using the JSON syntax defined in [RFC 7159]. Sequence normalization is not performed for this output method.

An empty sequence in the input tree is serialized to the JSON token null.
Sequences of length greater than one in the input tree are processed as follows:
- If the sequence is at the root level, it is serialized item by item by applying the rules in this section and separating the items by the item-separator value (see 9.1.16 JSON Output Method: the item-separator Parameter).
- Otherwise, [err:SERE0023] is raised.
An array item in the input tree is serialized to a JSON array by outputting the serialized JSON value of each member within the array separated by delimiters according to the JSON array syntax, i.e. [member, member, ...]. Each member in the array is to be serialized by recursively applying the rules in this section.
A map item in the input tree is serialized to a JSON object by outputting, for each key/value pair, the string value of the key serialized as a JSON string, followed by the serialized JSON value of the entry, separated by delimiters according to the JSON object syntax, i.e. {key:value, key:value, ...}. The order in which the key/value pairs appear in the serialized output is implementation-dependent.

If any two keys of the map item have the same string value, serialization error [err:SERE0022] is raised, unless the allow-duplicate-names parameter has one of the values yes, true or 1.
A node in the input tree is serialized to a JSON string by outputting the result of serializing the node using the method specified by the json-node-output-method parameter. The node is serialized with the serialization parameter omit-xml-declaration set to yes and with no other serialization parameters set.
An atomic item^XP40 in the input tree whose type is, or is derived from, one of the numeric types xs:float, xs:double or xs:decimal is serialized to a JSON number. Implementations may serialize the numeric value using any lexical representation of a JSON number defined in [RFC 7159]. If the numeric value cannot be represented in the JSON grammar (such as Infinity or NaN), then the serializer must raise a serialization error [err:SERE0020].
An atomic item^XP40 in the input tree of type xs:boolean and value true is serialized to the JSON token true.
An atomic item^XP40 in the input tree of type xs:boolean and value false is serialized to the JSON token false.
An atomic item^XP40 of type xs:QName in the input tree whose namespace part is "http://www.w3.org/2005/xpath-functions" and whose local part is "null" is serialized to the JSON token null.

Note:

This rule is introduced in 4.0, along with an option in the fn:parse-json function to allow a user-defined representation of the JSON value null. While the default representation of null as an empty sequence is usable in many circumstances, an explicit representation of null as a recognizable item can make some operations on JSON-derived values easier.
Any other atomic item^XP40 in the input tree is serialized to a JSON string by outputting the result of applying the fn:string function to the item.
Any item in the input tree of a type not specified in the above list will result in a serialization error [err:SERE0021].

[Definition: Whenever a value is serialized to a JSON string, the following procedure is applied to the supplied string:

Any character in the string for which character mapping is defined (see 11 Character Maps) is substituted by the replacement string defined in the character map.
Any other character in the input string (but not a character produced by character mapping) is a candidate for Unicode Normalization if requested by the normalization-form parameter, and JSON escaping. JSON escaping replaces the characters quotation mark, backspace, form-feed, newline, carriage return, tab, or reverse solidus by the corresponding JSON escape sequences \", \b, \f, \n, \r, \t, or \\ respectively, and any other codepoint in the range 1-31 or 127-159 by an escape in the form \uHHHH where HHHH is the hexadecimal representation of the codepoint value. Escaping further replaces the solidus character (/) by the escape sequence \/ if the escape-solidus parameter is set to true, yes, or 1, but not if it is set to false, no, or 0. Escaping is also applied to any characters that cannot be represented in the selected encoding.
The resulting string is enclosed in double quotation marks.

]
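The procedure for producing a JSON string can be sketched as follows. This is a non-normative illustration with hypothetical names. It omits Unicode normalization, uses uppercase hexadecimal digits, and escapes unencodable characters outside the Basic Multilingual Plane as surrogate pairs; the last two choices are assumptions of the sketch rather than requirements stated above.

```python
from typing import Mapping, Optional

_SHORT_ESCAPES = {
    '"': '\\"', "\b": "\\b", "\f": "\\f", "\n": "\\n",
    "\r": "\\r", "\t": "\\t", "\\": "\\\\",
}


def _escape_char(ch: str, escape_solidus: bool, encoding: str) -> str:
    if ch in _SHORT_ESCAPES:
        return _SHORT_ESCAPES[ch]
    cp = ord(ch)
    if 1 <= cp <= 31 or 127 <= cp <= 159:
        return f"\\u{cp:04X}"
    if ch == "/" and escape_solidus:
        return "\\/"
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        # Not representable in the chosen encoding: emit \uHHHH per
        # UTF-16 code unit (a surrogate pair above U+FFFF).
        units = ch.encode("utf-16-be")
        return "".join(
            f"\\u{int.from_bytes(units[i:i + 2], 'big'):04X}"
            for i in range(0, len(units), 2)
        )
    return ch


def to_json_string(value: str,
                   character_map: Optional[Mapping[str, str]] = None,
                   escape_solidus: bool = False,
                   encoding: str = "utf-8") -> str:
    """Serialize a string to a JSON string, quotation marks included."""
    character_map = character_map or {}
    parts = []
    for ch in value:
        if ch in character_map:
            # Output of character mapping is not escaped any further.
            parts.append(character_map[ch])
        else:
            parts.append(_escape_char(ch, escape_solidus, encoding))
    return '"' + "".join(parts) + '"'
```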

Finally, encoding, as controlled by the encoding parameter, converts the character stream produced by the preceding rules into an octet stream.

9.1 The Influence of Serialization Parameters upon the JSON Output Method

When nodes are serialized using the JSON output method, serialization is delegated to the output method specified by the json-node-output-method serialization parameter. The omit-xml-declaration parameter is set to yes, and no other serialization parameters are passed down to the serialization method responsible for serializing the node.

9.1.1 JSON Output Method: the `version` Parameter

The version parameter is not applicable to the JSON output method.

9.1.2 JSON Output Method: the `html-version` Parameter

The html-version parameter is not applicable to the JSON output method.

9.1.3 JSON Output Method: the `encoding` Parameter

The encoding parameter identifies the encoding that the JSON output method must use to convert sequences of characters to sequences of bytes. Serializers are required to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if the serializer does not support the encoding specified by the encoding parameter. The serializer must raise the error. If the input tree contains a character that cannot be represented in the encoding that the serializer is using for output, the serializer must raise a serialization error [err:SERE0008].

Note:

If an encoding other than UTF-8, UTF-16, UTF-32, US-ASCII, or an equivalent is specified for the encoding parameter, the output will (except in unusual circumstances) fail to conform to the definition of JSON in [RFC 7159].

9.1.4 JSON Output Method: the `indent` and `suppress-indentation` Parameters

The indent parameter controls whether the serializer adjusts the whitespace in the serialized result so that a person will find it easier to read. If the indent parameter has one of the values yes, true or 1, the serializer may output additional whitespace characters adjacent to the JSON structural tokens. If the indent parameter has the value no, false or 0, the serializer must output no whitespace characters adjacent to the JSON structural tokens.
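The difference between the two settings can be illustrated with Python's standard `json` module. This is only an analogy: `json.dumps` is not a conforming serializer for this specification, and its `indent` and `separators` arguments merely stand in for the indent parameter.

```python
import json

value = {"a": [1, 2], "b": None}

# Comparable to indent=no: no whitespace adjacent to structural tokens.
compact = json.dumps(value, separators=(",", ":"))

# Comparable to indent=yes: whitespace may be added for readability.
pretty = json.dumps(value, indent=2)
```

Both results parse to the same JSON value; only the whitespace around structural tokens differs.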

The suppress-indentation parameter is not applicable to the JSON output method.

9.1.5 JSON Output Method: the `cdata-section-elements` Parameter

The cdata-section-elements parameter is not applicable to the JSON output method.

9.1.6 JSON Output Method: the `omit-xml-declaration` and `standalone` Parameters

The omit-xml-declaration and standalone parameters are not applicable to the JSON output method.

9.1.7 JSON Output Method: the `doctype-system` and `doctype-public` Parameters

The doctype-system and doctype-public parameters are not applicable to the JSON output method.

9.1.8 JSON Output Method: the `undeclare-prefixes` Parameter

The undeclare-prefixes parameter is not applicable to the JSON output method.

9.1.9 JSON Output Method: the `normalization-form` Parameter

The normalization-form parameter is applicable to the JSON output method. The values NFC and none must be supported by the serializer. A serialization error [err:SESU0011] results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer must raise the error.

9.1.10 JSON Output Method: the `media-type` Parameter

The media-type parameter is applicable to the JSON output method. See 3 Serialization Parameters for more information.

9.1.11 JSON Output Method: the `use-character-maps` Parameter

The use-character-maps parameter is applicable to the JSON output method. See 11 Character Maps for more information.

9.1.12 JSON Output Method: the `byte-order-mark` Parameter

The byte-order-mark parameter is applicable to the JSON output method. See 3 Serialization Parameters for more information.

Note:

Serialized output containing a byte-order mark does not conform to the definition of JSON in [RFC 7159] (although conforming JSON parsers are allowed to tolerate the byte-order mark).

9.1.13 JSON Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is applicable to the JSON output method. If the value is yes, true, or 1, then the solidus character ("/") appearing in a string is escaped with a backslash (as "\/"). If the value is no, false, or 0, then it is not escaped.

Note:

In previous versions of this specification, the solidus was always escaped. Although JSON does not require this character to be escaped, doing so provides extra safety when the generated JSON is embedded in an HTML script element, since an unintended </script> end tag might otherwise cause the script to be prematurely terminated. In other situations, however, the escaping creates visual clutter and makes the output less readable.

9.1.14 JSON Output Method: the `escape-uri-attributes` Parameter

The escape-uri-attributes parameter is not applicable to the JSON output method.

9.1.15 JSON Output Method: the `include-content-type` Parameter

The include-content-type parameter is not applicable to the JSON output method.

9.1.16 JSON Output Method: the `item-separator` Parameter

The item-separator specifies the string to be inserted between adjacent serialized items. If the item-separator parameter is absent, the character U+000A (NEWLINE) is used as the item-separator value.
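A non-normative sketch of the treatment of a root-level sequence is given below. The function name is hypothetical, a Python list stands in for the sequence, and Python's `json.dumps` stands in for the per-item rules of the JSON output method.

```python
import json
from typing import Any, Sequence


def serialize_top_level(items: Sequence[Any], item_separator: str = "\n") -> str:
    """Serialize a root-level sequence item by item.

    The default separator models an absent item-separator parameter.
    """
    if not items:
        # An empty sequence is serialized to the JSON token null.
        return "null"
    return item_separator.join(
        json.dumps(item, separators=(",", ":"), ensure_ascii=False)
        for item in items
    )
```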

9.1.17 JSON Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names serialization parameter determines whether the presence of multiple keys in a map item with the same string value (e.g. the date 2014-10-01 and the string "2014-10-01") will or will not raise serialization error [err:SERE0022]. If the value is one of yes, true or 1, such duplicate keys will result in duplicate object-member names in the JSON output and no error will be raised because of the duplicate names. If the value is no, false or 0, such duplicate keys are an error ([err:SERE0022]).
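The check can be sketched as follows. This is a non-normative illustration: the names are hypothetical, and Python's `str` stands in for obtaining the string value of a key, which happens to give the same result for the date example above.

```python
import datetime
from collections import Counter
from typing import Any, Iterable, List


class SerializationError(Exception):
    """Hypothetical exception type carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def object_member_names(keys: Iterable[Any],
                        allow_duplicate_names: bool = False) -> List[str]:
    """Return the member names for a map, checking for duplicates."""
    names = [str(key) for key in keys]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates and not allow_duplicate_names:
        raise SerializationError("SERE0022", f"duplicate names: {duplicates}")
    return names


# A date key and a string key with the same string value.
example_keys = [datetime.date(2014, 10, 1), "2014-10-01"]
```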

9.1.18 JSON Output Method: the `json-node-output-method` Parameter

The json-node-output-method serialization parameter determines how a node in the input tree gets converted to a JSON value. If the value is one of xml, xhtml, html or text, then the node is converted to a JSON string by serializing the node using the output method specified by this parameter. If the value is xml or xhtml then the node is serialized with the additional serialization parameter omit-xml-declaration set to yes.

10 Adaptive Output Method

The Adaptive output method serializes the input tree into a human-readable form for the purposes of debugging query results. The intention of this is to allow any input value to be serialized without raising a serialization error. Sequence normalization is not performed for this output method.

Each item in the supplied sequence is serialized individually as follows, with an occurrence of the chosen item-separator between successive items.

A document, element, text, comment, or processing instruction node is serialized using the XML output method described in 5 XML Output Method.
An attribute or namespace node is serialized as if it had a containing element node. For example an attribute node might be serialized as the string xsi:type="xs:integer"; a namespace node might be serialized as xmlns:sns="http://example.com/sample-namespace".

Note:

This may result in output of QNames containing prefixes whose binding is not displayed.

An atomic item^XP40 is serialized as follows:

An instance of xs:boolean is serialized as true() or false().
An instance of xs:string, xs:untypedAtomic or xs:anyURI is serialized by enclosing the value in double quotation marks and doubling any quotes within the value; or optionally by enclosing the value in apostrophes and doubling any apostrophes within the value. The resulting value is then serialized using the Text output method described in 8 Text Output Method.

Note:

The Text output method will apply character expansion and encoding rules to this string as specified by the serialization parameters.
An instance of xs:integer or xs:decimal is serialized by converting the value to a string using the fn:string function.

An instance of xs:double is serialized by applying the function format-number(?, '0.0##########################e0') using the following default decimal format properties:

Decimal format
Property name	Property value
`decimal-separator`	U+002E (FULL STOP, PERIOD, `.`)
`exponent-separator`	U+0065 (LATIN SMALL LETTER E, `e`)
`grouping-separator`	U+002C (COMMA, `,`)
`zero-digit`	U+0030 (DIGIT ZERO, `0`)
`digit`	U+0023 (NUMBER SIGN, `#`)
`infinity`	The string "INF"
`NaN`	The string "NaN"
`minus-sign`	U+002D (HYPHEN-MINUS, `-`)

An instance of xs:QName or xs:NOTATION is serialized as a URI-qualified name (that is, in the form Q{uri}local).
An atomic item of any other type is serialized using the syntax of a constructor function: xs:TYPE("VAL") where TYPE is the name of the primitive type, and VAL is the result of applying the fn:string() function. For example, xs:date("2015-07-17"). The resulting string is then serialized using the Text output method described in 8 Text Output Method.
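Several of the rules for atomic items can be sketched as simple string builders. This is a non-normative illustration: the function names are hypothetical, the callers are assumed to supply the result of fn:string as `lexical`, and the subsequent pass through the Text output method is not modeled.

```python
def adaptive_boolean(value: bool) -> str:
    """xs:boolean is serialized as true() or false()."""
    return "true()" if value else "false()"


def adaptive_string(value: str) -> str:
    """Enclose in double quotation marks, doubling any within the value."""
    return '"' + value.replace('"', '""') + '"'


def adaptive_qname(uri: str, local: str) -> str:
    """xs:QName and xs:NOTATION use the URI-qualified form Q{uri}local."""
    return f"Q{{{uri}}}{local}"


def adaptive_constructor(primitive_type: str, lexical: str) -> str:
    """Other atomic items use constructor syntax xs:TYPE("VAL")."""
    return f'xs:{primitive_type}("{lexical}")'
```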

An array item is serialized using the syntax of a SquareArrayConstructor^XP40, that is as [member,member, ... ]. The members, which in general are sequences, are serialized in the form (item,item, ...) where the items are serialized by applying these rules recursively. The items are separated by commas (not by the item-separator character). The enclosing parentheses are optional if the sequence has length one.

Note:

The serializer should avoid outputting the parentheses if it is able to determine the length of the sequence before serializing the first item; but it is allowed to output parentheses around a singleton if this avoids buffering data in memory.
A map item is serialized using the syntax of a MapConstructor^XP40 without the optional map keyword, that is in the format {key:value, key:value, ...}. The order of entries is implementation-dependent. The key is serialized by applying the rules for serializing an atomic item. The values are serialized in the same way as the members of an array (see above).
A function item is serialized to the representation name#A where name is a representation of the function name and A is the arity. If the function name is in one of the namespaces http://www.w3.org/2005/xpath-functions, http://www.w3.org/2005/xpath-functions/math, http://www.w3.org/2005/xpath-functions/map, http://www.w3.org/2005/xpath-functions/array or http://www.w3.org/2001/XMLSchema, then the name is output as a lexical QName using the conventional prefix fn, math, map, array, or xs as appropriate; if it is in any other namespace or in no namespace, then the name is output as a URI-qualified name (that is, Q{uri}local). If the function is anonymous, name is replaced by the string (anonymous-function).
Note:

The following examples illustrate this rule:
- fn:exists#1 is serialized as fn:exists#1
- Q{http://www.w3.org/2005/xpath-functions}exists#1 is serialized as fn:exists#1
- function($a) { $a } is serialized as (anonymous-function)#1
- math:pi#0 is serialized as math:pi#0
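The naming rule for function items can be sketched as follows. This is a non-normative illustration; the function name is hypothetical, and an anonymous function is modeled by passing `None` as the local name.

```python
from typing import Optional

CONVENTIONAL_PREFIXES = {
    "http://www.w3.org/2005/xpath-functions": "fn",
    "http://www.w3.org/2005/xpath-functions/math": "math",
    "http://www.w3.org/2005/xpath-functions/map": "map",
    "http://www.w3.org/2005/xpath-functions/array": "array",
    "http://www.w3.org/2001/XMLSchema": "xs",
}


def adaptive_function_item(namespace: Optional[str],
                           local: Optional[str],
                           arity: int) -> str:
    """Serialize a function item as name#A."""
    if local is None:
        name = "(anonymous-function)"
    elif namespace in CONVENTIONAL_PREFIXES:
        # Well-known namespaces use a lexical QName with the usual prefix.
        name = f"{CONVENTIONAL_PREFIXES[namespace]}:{local}"
    else:
        # Any other namespace, or no namespace, uses Q{uri}local.
        name = f"Q{{{namespace or ''}}}{local}"
    return f"{name}#{arity}"
```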

Character maps are applied (a) when nodes are serialized using the XML output method, and (b) to any value represented as a string enclosed in quotation marks.

Optionally, in all the above constructs, characters whose visual representation is ambiguous (for example tab or non-breaking-space) may be represented in the form of an XML numeric character reference (for example &#9; or &#xA0;).

Note:

In many cases the serialization of an item conforms to the syntax of an XQuery expression whose result is that item. There are exceptions, however. For example, the syntax will not be valid XQuery in the case of free-standing attribute or namespace nodes, or QName values, or anonymous functions; and where it is valid XQuery, the result of evaluating the expression will not necessarily be identical to the original: for example, the distinction between strings and untypedAtomic items is lost.

If any value cannot be output because doing so would cause a serialization error, the behavior is implementation-defined.

If the output is sent to a destination that allows hyperlinks to be included in the generated text, then the serializer may include implementation-dependent hyperlinks to provide additional information, for example:

to allow the type of atomic items^XP40 to be ascertained.
to allow the namespace binding of prefixes to be ascertained.
to provide further information about the cause of error indicators.

10.1 The Influence of Serialization Parameters upon the Adaptive Output Method

Changes in 4.0 ⬆

Added the escape-solidus parameter for JSON serialization. [Issue 530 PR 534 6 June 2023]

For some item types the Adaptive output method will delegate serialization to other output methods. With the exception of the byte-order-mark serialization parameter, all serialization parameters, if set, will be passed down to the serialization method that is applied to each item in the supplied sequence. Only the item-separator and byte-order-mark parameters are directly applicable to the Adaptive output method.

10.1.1 Adaptive Output Method: the `version` Parameter

The version parameter is not directly applicable to the Adaptive output method.

10.1.2 Adaptive Output Method: the `html-version` Parameter

The html-version parameter is not directly applicable to the Adaptive output method.

10.1.3 Adaptive Output Method: the `encoding` Parameter

The encoding parameter is not directly applicable to the Adaptive output method.

10.1.4 Adaptive Output Method: the `indent` and `suppress-indentation` Parameters

The indent and suppress-indentation parameters are not directly applicable to the Adaptive output method.

10.1.5 Adaptive Output Method: the `cdata-section-elements` Parameter

The cdata-section-elements parameter is not directly applicable to the Adaptive output method.

10.1.6 Adaptive Output Method: the `omit-xml-declaration` and `standalone` Parameters

The omit-xml-declaration and standalone parameters are not directly applicable to the Adaptive output method.

Note:

If these parameters call for an XML declaration to be serialized, then an XML declaration is to be output each time the Adaptive output method delegates the serialization of a node to the XML output method. If several node items appear in the sequence to be serialized or as values in maps or arrays to be serialized, then the output will contain several XML declarations.

10.1.7 Adaptive Output Method: the `doctype-system` and `doctype-public` Parameters

The doctype-system and doctype-public parameters are not directly applicable to the Adaptive output method.

10.1.8 Adaptive Output Method: the `undeclare-prefixes` Parameter

The undeclare-prefixes parameter is not directly applicable to the Adaptive output method.

10.1.9 Adaptive Output Method: the `normalization-form` Parameter

The normalization-form parameter is not directly applicable to the Adaptive output method.

10.1.10 Adaptive Output Method: the `media-type` Parameter

The media-type parameter is not directly applicable to the Adaptive output method.

10.1.11 Adaptive Output Method: the `use-character-maps` Parameter

The use-character-maps parameter is applicable to the Adaptive output method only as elsewhere specified.

10.1.12 Adaptive Output Method: the `byte-order-mark` Parameter

The byte-order-mark parameter is applicable to the Adaptive output method. See 3 Serialization Parameters for more information.

Note:

A byte order mark can appear only once in the serialized output. Therefore, this parameter does not get passed down to any delegated output method.

10.1.13 Adaptive Output Method: the `escape-solidus` Parameter

The escape-solidus parameter is applicable to the Adaptive output method only as elsewhere specified.

10.1.14 Adaptive Output Method: the `escape-uri-attributes` Parameter

The escape-uri-attributes parameter is not directly applicable to the Adaptive output method.

10.1.15 Adaptive Output Method: the `include-content-type` Parameter

The include-content-type parameter is not directly applicable to the Adaptive output method.

10.1.16 Adaptive Output Method: the `item-separator` Parameter

The item-separator serialization parameter is directly applicable to the Adaptive output method. It specifies the string to be inserted between adjacent serialized items. If the item-separator parameter is absent, the character U+000A (NEWLINE) is used by the Adaptive output method as the item-separator value.
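Note:

The effect of the default can be sketched as follows. The function `join_adaptive` is hypothetical, and the sketch assumes that each item has already been serialized to a string.

```python
from typing import List, Optional


def join_adaptive(serialized_items: List[str],
                  item_separator: Optional[str] = None) -> str:
    """Join already-serialized items using the item-separator value.

    An absent parameter (None) falls back to U+000A; an explicitly
    supplied empty string is used as given.
    """
    separator = "\n" if item_separator is None else item_separator
    return separator.join(serialized_items)


default_output = join_adaptive(["1", '"a"', "<e/>"])
custom_output = join_adaptive(["1", "2"], item_separator=", ")
```

The sketch distinguishes an absent parameter from a zero-length value, because only an absent parameter triggers the U+000A default.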

10.1.17 Adaptive Output Method: the `allow-duplicate-names` Parameter

The allow-duplicate-names parameter is not directly applicable to the Adaptive output method.

10.1.18 Adaptive Output Method: the `json-node-output-method` Parameter

The json-node-output-method parameter is not directly applicable to the Adaptive output method.

11 Character Maps

The use-character-maps parameter is a list of characters and corresponding string substitutions.

Character maps allow a specific character appearing in a text or attribute node or a string in the input tree to be replaced with a specified string of characters during serialization. The string that is substituted is output "as is," and the serializer performs no checks that the resulting document is well-formed. This mechanism can therefore be used to introduce arbitrary markup in the serialized output. See Section 27.1 Character Maps^XT40 of [XSL Transformations (XSLT) Version 4.0] for examples of using character mapping in XSLT.

Character mapping is applied to the characters that actually appear in a text or attribute node or a string in the input tree, before any other serialization operations such as escaping or Unicode Normalization are applied. If a character is mapped, then it is not subjected to XML or HTML escaping, nor to Unicode Normalization. The string that is substituted for a character is not validated or processed in any way by the serializer, except for translation into the target encoding. In particular, it is not subjected to XML or HTML escaping, it is not subjected to Unicode Normalization, and it is not subjected to further character mapping.

Character mapping is not applied to characters in text nodes whose parent elements are listed in the cdata-section-elements parameter, nor to characters for which output escaping has been disabled (disabling output escaping is an [XSL Transformations (XSLT) Version 4.0] feature), nor to characters in attribute values that are subject to URI escaping defined for the HTML and XHTML output methods, unless URI escaping has been disabled using the escape-uri-attributes parameter in the output definition.

On serialization, occurrences of a character specified in the use-character-maps in text nodes, attribute values and strings are replaced by the corresponding string from the use-character-maps parameter.
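Note:

The ordering of character mapping and escaping can be illustrated with a minimal sketch. The function `serialize_text` and its `char_map` argument are hypothetical. The sketch covers only XML escaping of text content; it ignores Unicode Normalization and the exceptions described above (cdata-section-elements, disabled output escaping, and URI escaping).

```python
from xml.sax.saxutils import escape


def serialize_text(text: str, char_map: dict) -> str:
    """Apply a character map to text content, escaping only unmapped characters."""
    pieces = []
    for ch in text:
        if ch in char_map:
            # The substituted string is emitted as is: it is neither escaped
            # nor subjected to further character mapping.
            pieces.append(char_map[ch])
        else:
            # Unmapped characters receive ordinary XML escaping (&, <, >).
            pieces.append(escape(ch))
    return "".join(pieces)


result = serialize_text("a < b \u00a0 c & d", {"\u00a0": "&nbsp;"})
```

In this sketch the no-break space is replaced by the string `&nbsp;` without further escaping, while the unmapped `<` and `&` characters are escaped in the usual way.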

Note:

Using a character map can result in non-well-formed documents if the substituted string contains XML-significant characters. For example, it is possible to create documents containing unmatched start and end tags, references to entities that are not declared, or attributes that contain tags or unescaped quotation marks.

If a character is mapped, then it is not subjected to XML or HTML escaping.

A serialization error [err:SERE0008] occurs if character mapping causes the output of a string containing a character that cannot be represented in the encoding that the serializer is using for output. The serializer must raise the error.
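Note:

The check can be sketched as follows. The class `SerializationError` and the function `encode_mapped_string` are hypothetical, and the sketch assumes that the output encoding is identified by a name known to the Python codec registry.

```python
class SerializationError(Exception):
    """Illustrative error type carrying a serialization error code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def encode_mapped_string(replacement: str, encoding: str) -> bytes:
    """Encode a substituted string, raising SERE0008 if a character cannot be represented."""
    try:
        return replacement.encode(encoding)
    except UnicodeEncodeError as exc:
        # A substituted string is output as is, so no character reference
        # can be used in place of the unrepresentable character.
        raise SerializationError(
            "err:SERE0008",
            f"character {exc.object[exc.start]!r} cannot be represented in {encoding}",
        ) from exc


encoded = encode_mapped_string("&nbsp;", "ascii")
```

Calling `encode_mapped_string("\u20ac", "ascii")` raises the error in this sketch, because the euro sign has no representation in that encoding.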

12 Conformance

Serialization is intended primarily as a component of a host language. [Definition: A host language is another specification that includes, by reference, this specification and all of its requirements. A host language might be a programming language such as [XSL Transformations (XSLT) Version 4.0] or [XQuery 4.0: An XML Query Language], or it might be an application programming interface (API) intended to be used by programs written in some other high-level programming language. The use of the term language is not intended to preclude the possibility that this specification might be referenced outside the context of a programming language specification.] This document relies on specifications that use it to specify conformance criteria for Serialization in their respective environments. Specifications that set conformance criteria for their use of Serialization must not change the semantic definitions of Serialization as given in this specification, except by subsetting and/or compatible extensions. It is the responsibility of the host language to specify how serialization errors are to be handled.

Certain facilities in this specification are described as producing implementation-defined results. A claim that asserts conformance with this specification must be accompanied by documentation stating the effect of each implementation-defined feature. For convenience, a non-normative checklist of implementation-defined features is provided at F.1 Checklist of Implementation-Defined Features.

A References

A.1 Normative References

Character Model for the World Wide Web 1.0: Normalization: Character Model for the World Wide Web 1.0: Normalization, François Yergeau, Martin Dürst, Richard Ishida, et al., Editors. World Wide Web Consortium, 01 May 2012. This version is http://www.w3.org/TR/2012/WD-charmod-norm-20120501/. The latest version is available at http://www.w3.org/TR/charmod-norm/.
XQuery and XPath Data Model (XDM) 4.0: XQuery and XPath Data Model (XDM) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery and XPath Functions and Operators 4.0: XQuery and XPath Functions and Operators 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
HTML5: HTML5, Robin Berjon, Steve Faulkner, Travis Leithead, et al., Editors. World Wide Web Consortium, 04 Feb 2014. This version is http://www.w3.org/TR/2014/CR-html5-20140204/. The latest version is available at http://www.w3.org/TR/html5/.
HTML: HTML 4.01 Specification, Dave Raggett, Arnaud Le Hors, and Ian Jacobs, Editors. World Wide Web Consortium, 24 Dec 1999. This version is http://www.w3.org/TR/1999/REC-html401-19991224/. The latest version is available at http://www.w3.org/TR/html401/.
POLYGLOT: Polyglot Markup: A robust profile of the HTML5 vocabulary, Eliot Graff and Leif Halvard Silli, Editors. World Wide Web Consortium, 04 Feb 2014. This version is http://www.w3.org/TR/2014/WD-html-polyglot-20140204/. The latest version is available at http://www.w3.org/TR/html-polyglot/.
IANA: Character Sets. Internet Assigned Numbers Authority. Oct 2012.
RFC2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, N. Freed, N. Borenstein. Network Working Group, IETF, Nov 1996.
RFC2119: Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
RFC2978: IANA Charset Registration Procedures, N. Freed and J. Postel. Network Working Group, IETF, Oct 2000.
RFC2854: The 'text/html' Media Type, D. Connolly, L. Masinter. Network Working Group, IETF, Jun 2000.
RFC3236: The 'application/xhtml+xml' Media Type, M. Baker and P. Stark. Network Working Group, IETF, Jan 2002.
Unicode Encoding: Unicode Character Encoding Model, Unicode Consortium. Unicode Standard Annex #17.
UAX #15: Unicode Normalization Forms: Unicode Normalization Forms, Unicode Consortium. Unicode Standard Annex #15.
XHTML 1.0: XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), Steven Pemberton, Editor. World Wide Web Consortium, 01 Aug 2002. This version is http://www.w3.org/TR/2002/REC-xhtml1-20020801. The latest version is available at http://www.w3.org/TR/xhtml1/.
XHTML 1.1: XHTML™ 1.1 - Module-based XHTML - Second Edition, Shane McCarron and Masayasu Ishikawa, Editors. World Wide Web Consortium, 23 Nov 2010. This version is http://www.w3.org/TR/2010/REC-xhtml11-20101123. The latest version is available at http://www.w3.org/TR/xhtml11/.
XML10: Extensible Markup Language (XML) 1.0 (Fifth Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 26 Nov 2008. This version is http://www.w3.org/TR/2008/REC-xml-20081126/. The latest version is available at http://www.w3.org/TR/xml.
XML11: Extensible Markup Language (XML) 1.1 (Second Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml11-20060816. The latest version is available at http://www.w3.org/TR/xml11/.
XML Names: Namespaces in XML 1.0 (Third Edition), Tim Bray, Dave Hollander, Andrew Layman, et al., Editors. World Wide Web Consortium, 08 Dec 2009. This version is http://www.w3.org/TR/2009/REC-xml-names-20091208/. The latest version is available at http://www.w3.org/TR/xml-names/.
XML Names 1.1: Namespaces in XML 1.1 (Second Edition), Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available at http://www.w3.org/TR/xml-names11/.
XML Schema: XML Schema Part 1: Structures Second Edition, Henry Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
XML Path Language (XPath) 4.0: XML Path Language (XPath) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
XQuery 4.0: An XML Query Language: XQuery 4.0: An XML Query Language, XSLT Extensions Community Group, World Wide Web Consortium.
XSL Transformations (XSLT) Version 4.0: XSL Transformations (XSLT) Version 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
RFC 7159: IETF. RFC 7159: The JavaScript Object Notation (JSON) Data Interchange Format, T. Bray, Editor. Internet Engineering Task Force, March 2014. Available at: http://www.rfc-editor.org/rfc/rfc7159.txt

A.2 Informative References

The JSON Data Interchange Format: The JSON Data Interchange Format, ECMA International.
XHTML Modularization: XHTML™ Modularization 1.1 - Second Edition, Shane McCarron, Editor. World Wide Web Consortium, 29 Jul 2010. This version is http://www.w3.org/TR/2010/REC-xhtml-modularization-20100729. The latest version is available at http://www.w3.org/TR/xhtml-modularization/.
XQuery 1.0 and XPath 2.0 Data Model: XQuery 1.0 and XPath 2.0 Data Model (XDM) (Second Edition), Norman Walsh, Mary Fernández, Ashok Malhotra, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/. The latest version is available at https://www.w3.org/TR/xpath-datamodel/.
XSLT 2.0 and XQuery 1.0 Serialization (Second Edition): XSLT 2.0 and XQuery 1.0 Serialization (Second Edition), W3C Recommendation, Henry Zongaro, Norman Walsh, Joanne Tong, et al., Editors. World Wide Web Consortium, 14 December 2010. This version is http://www.w3.org/TR/2010/REC-xslt-xquery-serialization-20101214/.
XSLT and XQuery Serialization 4.0 (First Public Working Draft): XSLT and XQuery Serialization, W3C First Public Working Draft, Andrew Coleman, C. M. Sperberg-McQueen, et al., Editors. World Wide Web Consortium, 24 April 2014.

B Schema for Serialization Parameters

The following schema describes the structure of a parameter document that can be used to specify the settings of serialization parameters using the mechanism described in 3.1 Setting Serialization Parameters by Means of a Parameter Document.

A copy of this schema is available at http://www.w3.org/2017/01/xslt-xquery-serialization/schema-for-serialization-parameters.xsd.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
    elementFormDefault="qualified">

  <xs:annotation>
    <xs:documentation>
      This is a schema for serialization parameters for XSLT and
      XQuery Serialization 3.1.

      This schema is available for use under the conditions of the
      W3C Software License published at
      http://www.w3.org/Consortium/Legal/copyright-software-19980720
      
      It defines a schema for XML Infoset instances with which a
      user of a host language MAY specify serialization parameters
      for use in serializing an instance of the XQuery and XPath
      Data Model.  It also provides hooks that allow the inclusion
      of implementation-defined serialization parameters and
      implementation-defined modifiers to serialization parameters.
    </xs:documentation>
  </xs:annotation>

  <xs:simpleType name="EQName">
    <!--* In principle, this could be declared as a single-step
        * restriction of xs:token with the pattern
        * "Q\{(.*)\}[\i-[:]][\c-[:]]*".  We derive it in two
        * steps because a bug causes some widely used processors 
        * not to support character-class subtraction.
        *-->
    <xs:restriction>
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:pattern value="Q\{[^{}]*\}[\i][\c]*"/>
          <xs:whiteSpace value="collapse"/>
        </xs:restriction>
      </xs:simpleType>
      <xs:pattern value="Q\{.*\}[^:]*"/>
    </xs:restriction>
  </xs:simpleType>
  
  <xs:simpleType name="Prefixed-QName">
    <xs:annotation>
      <xs:documentation>
        Prefixed-QName matches only QNames with a non-null prefix:  
        that is, only QNames with a colon.
      </xs:documentation>
    </xs:annotation>
    <xs:restriction base="xs:QName">
      <xs:pattern value=".*:.*"/>
    </xs:restriction>
  </xs:simpleType>
  
  <xs:simpleType name="Qualified-EQName">
    <xs:annotation>
      <xs:documentation>
        Qualified-EQName matches only EQNames with a non-null namespace name.
      </xs:documentation>
    </xs:annotation>
    <xs:restriction base="output:EQName">
      <xs:pattern value=".*\{(.*\S.*)\}.*"/>
    </xs:restriction>    
  </xs:simpleType>
  
  <xs:simpleType name="QName-or-EQName">
    <xs:union memberTypes="xs:QName output:EQName"/> 
  </xs:simpleType>   
  
  <xs:simpleType name="QNames-type">
    <xs:list itemType="output:QName-or-EQName"/>
  </xs:simpleType>

  <xs:simpleType name="yes-no-type">
    <xs:restriction base="xs:token">
      <xs:enumeration value="no"/>
      <xs:enumeration value="yes"/>
      <xs:enumeration value="false"/>
      <xs:enumeration value="true"/>
      <xs:enumeration value="0"/>
      <xs:enumeration value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="yes-no-omit-type">
    <xs:restriction base="xs:token">
      <xs:enumeration value="no"/>
      <xs:enumeration value="omit"/>
      <xs:enumeration value="yes"/>
      <xs:enumeration value="false"/>
      <xs:enumeration value="true"/>
      <xs:enumeration value="0"/>
      <xs:enumeration value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="char-type">
    <xs:restriction base="xs:string">
      <xs:maxLength value="1"/>
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="encoding-string-type">
    <xs:restriction base="xs:token">
      <xs:pattern value="[A-Za-z][A-Za-z0-9._\-]*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="json-node-output-method-type">
    <xs:union>
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="html"/>
          <xs:enumeration value="text"/>
          <xs:enumeration value="xml"/>
          <xs:enumeration value="xhtml"/>
        </xs:restriction>
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="output:EQName">
          <xs:pattern value="Q\{\s*\}(html|text|xml|xhtml)"/>
        </xs:restriction>
      </xs:simpleType>
      <!--* other values must have non-null namespace URI *-->
      <xs:simpleType>
        <xs:restriction base="output:Prefixed-QName"/>        
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="output:Qualified-EQName"/>        
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

  <xs:simpleType name="method-type">
    <xs:union>
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="html"/>
          <xs:enumeration value="text"/>
          <xs:enumeration value="xml"/>
          <xs:enumeration value="xhtml"/>
          <xs:enumeration value="json"/>
          <xs:enumeration value="adaptive"/>
        </xs:restriction>
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="output:EQName">
          <xs:pattern value="Q\{\s*\}(html|text|xml|xhtml|json|adaptive)"/>
        </xs:restriction>
      </xs:simpleType>
      <!--* other values must have non-null namespace URI *-->
      <xs:simpleType>
        <xs:restriction base="output:Prefixed-QName"/>        
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="output:Qualified-EQName"/>        
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

  <xs:simpleType name="pubid-char-string-type">
    <xs:restriction base="xs:token">
      <xs:pattern value="([- \r\n\ta-zA-Z0-9'()+,./:=?;!*#@$_%])*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="system-id-string-type">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^']*|[^&quot;]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!--
     - Base type of all serialization parameter types
    -->
  <xs:complexType name="base-param-type">
    <xs:complexContent>
      <xs:restriction base="xs:anyType">
        <xs:anyAttribute namespace="##other" 
                         processContents="lax"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Generic string serialization parameters
    -->
  <xs:complexType name="string-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="xs:string" 
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  
  <!--
     - Generic tokenized string serialization parameters
    -->
  <xs:complexType name="tokenized-string-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
          type="xs:token" 
          use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Generic decimal serialization parameters
    -->
  <xs:complexType name="decimal-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="xs:decimal" 
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter type for "yes", "no", "true", "false", "0" or "1"
     - serialization parameters
    -->
  <xs:complexType name="yes-no-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:yes-no-type" 
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter type for list of xs:QName
     - serialization parameters
    -->
  <xs:complexType name="QNames-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:QNames-type" 
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter type for "yes", "no", "true", "false", "0", "1" or "omit"
     - serialization parameters
    -->
  <xs:complexType name="yes-no-omit-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:yes-no-omit-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter type for NMTOKEN serialization
       parameters
    -->
  <xs:complexType name="NMTOKEN-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="xs:NMTOKEN" 
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Base element declaration for all serialization parameter
       elements
    -->
  <xs:element name="serialization-parameter-element"
              abstract="true"
              type="output:base-param-type"/>

  <!--
     - Serialization parameter element for allow-duplicate-names
       parameter
    -->
  <xs:element id="allow-duplicate-names" 
              name="allow-duplicate-names" 
              type="output:yes-no-param-type"
              substitutionGroup 
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for byte-order-mark
       parameter
    -->
  <xs:element id="byte-order-mark" 
              name="byte-order-mark" 
              type="output:yes-no-param-type"
              substitutionGroup 
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for cdata-section-elements
       parameter
    -->
  <xs:element id="cdata-section-elements" 
              name="cdata-section-elements" 
              type="output:QNames-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for doctype-public parameter
    -->
  <xs:complexType name="doctype-public-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:pubid-char-string-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for doctype-public parameter
    -->
  <xs:element id="doctype-public" 
              name="doctype-public" 
              type="output:doctype-public-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for doctype-system parameter
    -->
  <xs:complexType name="doctype-system-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:system-id-string-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for doctype-system parameter
    -->
  <xs:element id="doctype-system" 
              name="doctype-system" 
              type="output:doctype-system-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for encoding parameter
    -->
  <xs:complexType name="encoding-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:encoding-string-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for encoding parameter
    -->
  <xs:element id="encoding" 
              name="encoding" 
              type="output:encoding-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>
  
  <!--
     - Serialization parameter element for escape-solidus
       parameter
    -->
  <xs:element id="escape-solidus"
    name="escape-solidus"
    type="output:yes-no-param-type"
    substitutionGroup
    = "output:serialization-parameter-element"/>
  

  <!--
     - Serialization parameter element for escape-uri-attributes
       parameter
    -->
  <xs:element id="escape-uri-attributes"
              name="escape-uri-attributes"
              type="output:yes-no-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for html-version parameter
    -->
  <xs:element id="html-version" 
              name="html-version"
              type="output:decimal-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for include-content-type
       parameter
    -->
  <xs:element id="include-content-type" 
              name="include-content-type"
              type="output:yes-no-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for indent parameter
    -->
  <xs:element id="indent" 
              name="indent"
              type="output:yes-no-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for item-separator 
       parameter
    -->
  <xs:element id="item-separator" 
              name="item-separator"
              type="output:string-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for json-node-output-method parameter
    -->
  <xs:complexType name="json-node-output-method-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:json-node-output-method-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for json-node-output-method parameter
    -->
  <xs:element id="json-node-output-method" 
              name="json-node-output-method" 
              type="output:json-node-output-method-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for media-type parameter
    -->
  <xs:element id="media-type" 
              name="media-type" 
              type="output:tokenized-string-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for method parameter
    -->
  <xs:complexType name="method-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" 
                      type="output:method-type"
                      use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for method parameter
    -->
  <xs:element id="method" 
              name="method" 
              type="output:method-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>
  <!--
     - Serialization parameter element for normalization-form
       parameter
    -->
  <xs:element id="normalization-form" 
              name="normalization-form" 
              type="output:NMTOKEN-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for omit-xml-declaration
       parameter
    -->
  <xs:element id="omit-xml-declaration" 
              name="omit-xml-declaration" 
              type="output:yes-no-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for standalone parameter
    -->
  <xs:element id="standalone" 
              name="standalone" 
              type="output:yes-no-omit-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for suppress-indentation
       parameter
    -->
  <xs:element id="suppress-indentation" 
              name="suppress-indentation" 
              type="output:QNames-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for undeclare-prefixes
       parameter
    -->
  <xs:element id="undeclare-prefixes" 
              name="undeclare-prefixes" 
              type="output:yes-no-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter type for use-character-maps
       parameter
    -->
  <xs:complexType name="use-character-maps-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:sequence>
          <xs:element name="character-map" 
                      minOccurs="0"
                      maxOccurs="unbounded">
            <xs:complexType>
              <xs:attribute name="character" 
                            type="output:char-type"/>
              <xs:attribute name="map-string" 
                            type="xs:string"/>
              <xs:anyAttribute namespace="##other"
                               processContents="lax"/>
            </xs:complexType>
          </xs:element>
          <xs:any minOccurs="0" 
                  namespace="##other"
                  processContents="lax"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
     - Serialization parameter element for use-character-maps
       parameter
    -->
  <xs:element id="use-character-maps" 
              name="use-character-maps"
              type="output:use-character-maps-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <!--
     - Serialization parameter element for version parameter
    -->
  <xs:element id="version" 
              name="version"
              type="output:tokenized-string-param-type"
              substitutionGroup
              = "output:serialization-parameter-element"/>

  <xs:element name="serialization-parameters">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="output:serialization-parameter-element"/>
        <xs:any namespace="##other"
                processContents="lax"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

C Summary of Error Conditions

This document uses the err prefix which represents the same namespace URI (http://www.w3.org/2005/xqt-errors) as defined in [XML Path Language (XPath) 4.0]. Use of this namespace prefix binding in this document is not normative.

err:SENR0001: It is an error if an item in S₆ in sequence normalization is an attribute node, a namespace node, or a function.
err:SERE0003: It is an error if the serializer is unable to satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both, except for content modified by the character expansion phase of serialization.
err:SEPM0004: It is an error to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the instance of the data model contains text nodes or multiple element nodes as children of the root node.
err:SERE0005: It is an error if the serialized result would contain an NCName^Names that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter.
err:SERE0006: It is an error if the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter.
err:SESU0007: It is an error if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding.
err:SERE0008: It is an error if a character that cannot be represented in the encoding that the serializer is using for output appears in a context where character references are not allowed (for example if the character occurs in the name of an element).
err:SEPM0009: It is an error if the omit-xml-declaration parameter has the value yes, true or 1, and the standalone attribute has a value other than omit; or the version parameter has a value other than 1.0 and the doctype-system parameter is specified.
err:SEPM0010: It is an error if the output method is xml or xhtml, the value of the undeclare-prefixes parameter is one of yes, true or 1, and the value of the version parameter is 1.0.
err:SESU0011: It is an error if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer.
err:SERE0012: It is an error if the value of the normalization-form parameter is fully-normalized and any relevant construct of the result begins with a combining character.
err:SESU0013: It is an error if the serializer does not support the version of XML or HTML specified by the version parameter.
err:SERE0014: It is an error to use the HTML output method if characters which are permitted in XML but not in HTML appear in the instance of the data model.
err:SERE0015: It is an error to use the HTML output method when > appears within a processing instruction in the data model instance being serialized.
err:SEPM0016: It is an error if a parameter value is invalid for the defined domain.
err:SEPM0017: It is an error if evaluating an expression in order to extract the setting of a serialization parameter from a data model instance would yield an error.
err:SEPM0018: It is an error if evaluating an expression in order to extract the setting of the use-character-maps serialization parameter from a data model instance would yield a sequence of length greater than one.
err:SEPM0019: It is an error if an instance of the data model used to specify the settings of serialization parameters specifies the value of the same parameter more than once.
err:SERE0020: It is an error if a numeric value being serialized using the JSON output method cannot be represented in the JSON grammar (e.g. +INF, -INF, NaN).
err:SERE0021: It is an error if a sequence being serialized using the JSON output method includes items for which no rules are provided in the appropriate section of the serialization rules.
err:SERE0022: It is an error if a map being serialized using the JSON output method has two keys with the same string value, unless the allow-duplicate-names parameter has the value yes, true or 1.
err:SERE0023: It is an error if a non-root sequence being serialized using the JSON output method is of length greater than one.
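Note:

The following non-normative sketch illustrates how a serializer might detect two of the conditions listed above, SERE0020 and SERE0022. The class SerializationError and both function names are hypothetical and are not defined by this specification; the use of Python's str() as a stand-in for the XDM string value of a key is likewise an assumption made for brevity.

```python
import math


class SerializationError(Exception):
    """Hypothetical exception type that carries an error code from this appendix."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_json_number(value):
    """Raise SERE0020 for a number that the JSON grammar cannot represent."""
    if isinstance(value, float) and (math.isinf(value) or math.isnan(value)):
        raise SerializationError("SERE0020", f"{value!r} is not a JSON number")
    return value


def check_map_keys(keys, allow_duplicate_names=False):
    """Raise SERE0022 when two keys share a string value.

    Assumption: str(key) approximates the XDM string value of the key.
    Returns the string values in their original order.
    """
    seen = set()
    names = []
    for key in keys:
        name = str(key)
        if name in seen and not allow_duplicate_names:
            raise SerializationError("SERE0022", f"duplicate key {name!r}")
        seen.add(name)
        names.append(name)
    return names
```

In this sketch the integer key 1 and the string key "1" collide, because both have the string value "1"; the collision is tolerated only when allow_duplicate_names is set.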

D List of URI Attributes

The following attributes are declared as type %URI or %UriList for a given HTML or XHTML element, with the exception of the name attribute for element A, which is not a URI type. The name attribute for element A should be escaped as is recommended by the HTML Recommendation [HTML] in Appendix B.2.1.

Attributes of type URI
Attributes	Elements
action	FORM
archive	OBJECT
background	BODY
cite	BLOCKQUOTE, DEL, INS, Q
classid	OBJECT
codebase	APPLET, OBJECT
data	OBJECT
datasrc	BUTTON, DIV, INPUT, OBJECT, SELECT, SPAN, TABLE, TEXTAREA
for	SCRIPT
formaction	BUTTON, INPUT
href	A, AREA, BASE, LINK
icon	COMMAND
longdesc	FRAME, IFRAME, IMG
manifest	HTML
name	A
poster	VIDEO
profile	HEAD
src	AUDIO, EMBED, FRAME, IFRAME, IMG, INPUT, SCRIPT, SOURCE, TRACK, VIDEO
usemap	IMG, INPUT, OBJECT
value	INPUT

E Glossary (Non-Normative)

array item

The term array item is defined in Section 2.9.6 Array Items^DM40.

atomize

The term atomization is defined in Section 2.5.3 Atomization^XP40.

character

The term character is defined in Section 2.8.4 XML and XSD Versions^DM40.

codepoint

The term codepoint is defined in Section 2.8.4 XML and XSD Versions^DM40.

content

The term content has the same meaning as the term Content^XML defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags^XML of [XML10].

EMPTY

The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.

expanded QName

The term expanded QName is defined in Section 2 Basics^XP40. An expanded QName consists of an optional namespace URI and a local name. An expanded QName also retains its original namespace prefix (if any), to facilitate casting the expanded QName into a string.

expected-empty

An element node is expected to be empty if it is recognized as an HTML element and if either

the html-version serialization parameter is absent or has a value less than 5.0 and the content model is EMPTY, or
the html-version serialization parameter has the value 5.0 and the element is a void element.
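Note:

The following non-normative sketch expresses the expected-empty test as a predicate. The element name sets are copied from the EMPTY and void entries of this glossary. The function name, its parameters, and the convention that None stands for an absent html-version parameter are assumptions of the sketch; the caller is assumed to have already decided whether the element is recognized as an HTML element.

```python
from decimal import Decimal

# Names taken from the "EMPTY" glossary entry.
EMPTY_CONTENT_MODEL = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "basefont", "frame", "isindex", "param",
}

# Names taken from the "void" glossary entry.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "keygen", "link", "meta", "param", "source", "track", "wbr",
}


def is_expected_empty(local_name, recognized_as_html, html_version=None):
    """Return True if the element is expected to be empty.

    html_version is the lexical value of the html-version parameter,
    or None when the parameter is absent (an assumption of this sketch).
    Names are compared exactly; any case handling is left to the caller.
    """
    if not recognized_as_html:
        return False
    if html_version is None or Decimal(html_version) < Decimal("5.0"):
        return local_name in EMPTY_CONTENT_MODEL
    if Decimal(html_version) == Decimal("5.0"):
        return local_name in VOID_ELEMENTS
    return False
```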

function item

The term function item is defined in Section 2.9.4 Function Items^DM40.

host language

A host language is another specification that includes, by reference, this specification and all of its requirements. A host language might be a programming language such as [XSL Transformations (XSLT) Version 4.0] or [XQuery 4.0: An XML Query Language], or it might be an application programming interface (API) intended to be used by programs written in some other high-level programming language. The use of the term language is not intended to preclude the possibility that this specification might be referenced outside the context of a programming language specification.

immediate content

The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.

implementation-defined

Implementation-defined indicates an aspect that may differ between serializers, but whose actual behavior must be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.

implementation-dependent

Implementation-dependent indicates an aspect that may differ between serializers, and whose actual behavior is not required to be specified either by another specification that sets conformance criteria for serialization (see 12 Conformance) or in documentation that accompanies the serializer.

input tree

In general the output of the serializer will represent the items actually present in the input value, together with other items that are reachable from these, for example (in the case of nodes) their descendants. The complete set of items that are represented in the output of the serializer is referred to (without loss of generality) as the input tree.

input value

The XDM value supplied as input to the serializer is referred to as the input value.

map item

The term map item is defined in Section 2.9.5 Map Items^DM40.

MathML namespace

the MathML namespace, http://www.w3.org/1998/Math/MathML

node

The term node is defined as part of Section 5 Nodes^DM40. There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.

non-null namespace URI

An element or attribute that does not have a null namespace URI is referred to as having a non-null namespace URI.

null namespace URI

An expanded-QName whose namespace part is an empty sequence, or an element or attribute whose name expands to such an expanded-QName, is referred to as having a null namespace URI.

Output declaration namespace

the Output declaration namespace, http://www.w3.org/2010/xslt-xquery-serialization

parameter document

An output:serialization-parameters element node used to hold the settings of serialization parameters is referred to as a parameter document.
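Note:

The following non-normative sketch reads a parameter document into a dictionary and reports the condition described by SEPM0019. It assumes that each parameter element carries its setting in a value attribute, as declared by parts of the schema in Appendix B not repeated here, and that character-map children are namespace-qualified. The function name read_parameters and the use of ValueError are hypothetical choices. The namespace URI is the one quoted in F.1 Checklist of Implementation-Defined Features.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def read_parameters(xml_text):
    """Return a dict of parameter settings found in a parameter document."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        raise ValueError("not a parameter document")
    settings = {}
    for child in root:
        if not child.tag.startswith(f"{{{OUTPUT_NS}}}"):
            # Elements in other namespaces are implementation-defined;
            # this sketch simply ignores them.
            continue
        name = child.tag.split("}", 1)[1]
        if name in settings:
            raise ValueError(f"SEPM0019: {name!r} specified more than once")
        if name == "use-character-maps":
            # Assumption: character-map children are in the output namespace.
            maps = child.findall(f"{{{OUTPUT_NS}}}character-map")
            settings[name] = {m.get("character"): m.get("map-string") for m in maps}
        else:
            # Assumption: the setting is held in a "value" attribute.
            settings[name] = child.get("value")
    return settings


SAMPLE = """\
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:undeclare-prefixes value="no"/>
  <output:use-character-maps>
    <output:character-map character="&#160;" map-string="&amp;nbsp;"/>
  </output:use-character-maps>
</output:serialization-parameters>
"""
```

Calling read_parameters(SAMPLE) yields the method, undeclare-prefixes and use-character-maps settings; a document that repeats any of these elements is rejected.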

prefix normalization

During prefix normalization, any element node in the input tree that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element’s namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the input tree is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.

recognized as an HTML element

An element node is recognized as an HTML element by the XHTML output method if

the element node is in the XHTML namespace, regardless of the value of the html-version serialization parameter or if the html-version serialization parameter is absent; or
the value of the html-version serialization parameter is 5.0, the element has a null namespace URI, and the local part of the name is equal to the name of an element defined by HTML5 [HTML5], making the comparison without regard to case.

reconstructed tree

A reconstructed tree may be constructed by parsing the XML document and converting it into a document node as specified in [XQuery and XPath Data Model (XDM) 4.0].

requested HTML version

If the html-version serialization parameter is not absent, the requested HTML version is the value of the html-version serialization parameter; otherwise, it is the value of the version serialization parameter.

result tree

The result of the sequence normalization process is a result tree.

sequence

The term sequence is defined in Section 2 Basics^XP40. A sequence is an ordered collection of zero or more items.

sequence normalization

The purpose of sequence normalization is to create a sequence that can be serialized as a well-formed XML document or external general parsed entity, that also reflects the content of the input sequence to the extent possible.

serialization error

In some instances, the input tree cannot be successfully converted into a sequence of octets given the set of serialization parameter (3 Serialization Parameters) values specified. A serialization error is said to occur in such an instance.

serialized as an HTML element

An element node is serialized as an HTML element if

the expanded QName of the element has a null namespace URI, regardless of the value of the requested HTML version, or
the value of the requested HTML version is 5.0 or greater, and the element node is in the XHTML namespace.
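Note:

The following non-normative sketch combines the definitions of requested HTML version and serialized as an HTML element. Both function names are hypothetical. The sketch assumes that None represents either an absent parameter or a null namespace URI, and that parameter values are supplied in their lexical decimal form.

```python
from decimal import Decimal

XHTML_NS = "http://www.w3.org/1999/xhtml"


def requested_html_version(html_version=None, version=None):
    """Return html-version if it is not absent, otherwise version."""
    return html_version if html_version is not None else version


def is_serialized_as_html(namespace_uri, html_version=None, version=None):
    """Return True if the element is serialized as an HTML element.

    namespace_uri is None for an element with a null namespace URI
    (an assumption of this sketch).
    """
    if namespace_uri is None:
        # First condition: applies regardless of the requested HTML version.
        return True
    requested = requested_html_version(html_version, version)
    return (
        requested is not None
        and Decimal(requested) >= Decimal("5.0")
        and namespace_uri == XHTML_NS
    )
```

Under these assumptions, an element in the XHTML namespace is serialized as an HTML element when html-version is 5.0, even if version carries a lower value, because html-version takes precedence.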

serializer

As is indicated in 12 Conformance, conformance criteria for serialization are determined by other specifications that refer to this specification. A serializer is software that implements some or all of the requirements of this specification in accordance with such conformance criteria.

string

The term string is defined in Section 2.8.4 XML and XSD Versions^DM40.

string value

The term string value is defined in Section 4.12 string-value Accessor^DM40. Every node has a string value. For example, the string value of an element is the concatenation of the string values of all its descendant text nodes.

SVG namespace

the SVG namespace, http://www.w3.org/2000/svg

to a JSON string

Whenever a value is serialized to a JSON string, the following procedure is applied to the supplied string:

Any character in the string for which character mapping is defined (see 11 Character Maps) is substituted by the replacement string defined in the character map.
Any other character in the input string (but not a character produced by character mapping) is a candidate for Unicode Normalization if requested by the normalization-form parameter, and JSON escaping. JSON escaping replaces the characters quotation mark, backspace, form-feed, newline, carriage return, tab, or reverse solidus by the corresponding JSON escape sequences \", \b, \f, \n, \r, \t, or \\ respectively, and any other codepoint in the range 1-31 or 127-159 by an escape in the form \uHHHH where HHHH is the hexadecimal representation of the codepoint value. Escaping further replaces the solidus character (/) by the escape sequence \/ if the escape-solidus parameter is set to true, yes, or 1, but not if it is set to false, no, or 0. Escaping is also applied to any characters that cannot be represented in the selected encoding.
The resulting string is enclosed in double quotation marks.
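Note:

The following non-normative sketch follows the three steps above. The function to_json_string and its parameter names are hypothetical. Several details are assumptions rather than requirements of this specification: upper-case hexadecimal digits are used in \uHHHH escapes, characters that cannot be encoded are written as UTF-16 code units (a surrogate pair outside the Basic Multilingual Plane), normalization is applied to each run of unmapped characters, and only the forms accepted by Python's unicodedata module are handled.

```python
import unicodedata

# Characters with a dedicated JSON escape sequence.
_SHORT_ESCAPES = {
    '"': '\\"', "\b": "\\b", "\f": "\\f", "\n": "\\n",
    "\r": "\\r", "\t": "\\t", "\\": "\\\\",
}


def _escape_char(ch, escape_solidus, encoding):
    if ch in _SHORT_ESCAPES:
        return _SHORT_ESCAPES[ch]
    if ch == "/" and escape_solidus:
        return "\\/"
    cp = ord(ch)
    if 1 <= cp <= 31 or 127 <= cp <= 159:
        return f"\\u{cp:04X}"
    try:
        ch.encode(encoding)
        return ch
    except UnicodeEncodeError:
        # Assumption: escape via UTF-16 code units, two bytes at a time.
        units = ch.encode("utf-16-be")
        return "".join(
            f"\\u{int.from_bytes(units[i:i + 2], 'big'):04X}"
            for i in range(0, len(units), 2)
        )


def to_json_string(text, character_map=None, normalization_form=None,
                   escape_solidus=True, encoding="utf-8"):
    """Serialize text to a JSON string (the defaults are arbitrary sketch choices)."""
    character_map = character_map or {}
    out = []
    run = []  # consecutive characters that have no character mapping

    def flush():
        if run:
            chunk = "".join(run)
            if normalization_form:  # e.g. "NFC", "NFD", "NFKC", "NFKD"
                chunk = unicodedata.normalize(normalization_form, chunk)
            out.extend(_escape_char(c, escape_solidus, encoding) for c in chunk)
            run.clear()

    for ch in text:
        if ch in character_map:
            flush()
            # Step 1: mapped replacement text is emitted as is, unescaped.
            out.append(character_map[ch])
        else:
            run.append(ch)
    flush()
    # Step 3: enclose the result in double quotation marks.
    return '"' + "".join(out) + '"'
```

Because replacement strings produced by character mapping bypass escaping, a character map can produce output that is not well-formed JSON; that is a property of the procedure as described, which the sketch reproduces.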

Unicode Normalization

Unicode Normalization is the process of removing alternative representations of equivalent sequences from textual data, to convert the data into a form that can be binary-compared for equivalence, as specified in [UAX #15: Unicode Normalization Forms]. For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization].

URI attribute values

The values of attributes listed in D List of URI Attributes are URI attribute values. Attributes are not considered to be URI attributes simply because they are namespace declaration attributes or have the type annotation xs:anyURI.

URI Escaping

URI escaping consists of the following three steps applied in sequence to the content of URI attribute values:

void

The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.

whitespace character

A space character, TAB character, CR character or NL character is referred to as a whitespace character.

without regard to case

Where this specification indicates that two strings are to be compared without regard to case, the serializer must translate any characters in the range U+0041 (LATIN CAPITAL LETTER A, A) through U+005A (LATIN CAPITAL LETTER Z, Z) inclusive, to the corresponding lower-case letters in the range U+0061 (LATIN SMALL LETTER A, a) through U+007A (LATIN SMALL LETTER Z, z) only for the purposes of making the comparison. The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.
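Note:

The following non-normative sketch implements the comparison described above. Only the 26 letters A to Z are folded, so it deliberately differs from a general Unicode case-insensitive comparison. The function names are hypothetical.

```python
def _fold_ascii(text):
    """Map U+0041..U+005A to U+0061..U+007A; leave all other characters alone."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in text)


def equal_without_regard_to_case(first, second):
    """Compare two strings codepoint by codepoint after ASCII-only folding."""
    return _fold_ascii(first) == _fold_ascii(second)
```

For example, "BR" and "br" compare equal, whereas "Ä" and "ä" do not, because characters outside the range A to Z are never translated.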

XHTML namespace

the XHTML namespace, http://www.w3.org/1999/xhtml

XML Island

The portion of the serialized document representing the result of serializing an element that is not to be serialized as an HTML element is known as an XML Island.

XML namespace

the XML namespace, http://www.w3.org/XML/1998/namespace

F Checklist of Implementation-Defined and Implementation-Dependent Features (Non-Normative)

This appendix provides a summary of Serialization features whose effect is explicitly implementation-defined or implementation-dependent.

F.1 Checklist of Implementation-Defined Features

The following list describes Serialization features whose effect is explicitly implementation-defined. The conformance rules (see 12 Conformance) require vendors to provide documentation that explains how these choices have been exercised.

For any implementation-defined output method, it is implementation-defined whether the sequence normalization process takes place. (See 2 Sequence Normalization)
If the namespace URI is non-null for the method serialization parameter, then the parameter specifies an implementation-defined output method. (See 3 Serialization Parameters)
The effect of additional serialization parameters on the output of the serializer, where the name of such a parameter must be namespace-qualified, is implementation-defined or implementation-dependent. The extent of this effect on the output must not override the provisions of this specification. (See 3 Serialization Parameters)
Implementation-defined schema components may be included in the set of schema components that are used in evaluating an XQuery expression or XSLT instruction in the process of using a parameter document to determine the serialization parameters. (See 3.1 Setting Serialization Parameters by Means of a Parameter Document)
If a parameter document contains elements or attributes that are in a namespace other than http://www.w3.org/2010/xslt-xquery-serialization, the implementation may interpret them to specify the values of implementation-defined serialization parameters in an implementation-defined manner. (See 3.1 Setting Serialization Parameters by Means of a Parameter Document)
The effect of providing an option that allows the encoding phase to be skipped, so that the result of serialization can be encoded in a way required by a particular destination (e.g., a Java StringBuffer), is implementation-defined. The serializer is not required to support such an option. (See 4 Phases of Serialization)
If an implementation supports a value of the version parameter for the XML or XHTML output method for which this document does not provide a normative definition, the behavior is implementation-defined. (See 5.1.1 XML Output Method: the version Parameter)
A serializer may provide an implementation-defined mechanism to place CDATA sections in the result tree. (See 5.1.5 XML Output Method: the cdata-section-elements Parameter)
If the value of the normalization-form parameter is not NFC, NFD, NFKC, NFKD, fully-normalized, or none, then the meaning of the value and its effect is implementation-defined. (See 5.1.9 XML Output Method: the normalization-form Parameter)
In the XHTML output method, for HTML5, when meta elements are added to the output according to the rules of the include-content-type parameter, it is implementation-defined whether the traditional form using an http-equiv attribute or the newer HTML5 form using a charset attribute is produced. (See 6.1.15 XHTML Output Method: the include-content-type Parameter)
For the HTML output method, it is implementation-defined whether the basefont, frame and isindex elements, which are not part of HTML5, are considered to be void elements when the requested HTML version has the value 5.0. (See 7.1 Markup for Elements)
If an implementation supports a value of the version parameter for the HTML output method for which this document does not provide a normative definition, the behavior is implementation-defined. (See 7.4.1 HTML Output Method: the version and html-version Parameters)
In the HTML output method, for HTML5, when meta elements are added to the output according to the rules of the include-content-type parameter, it is implementation-defined whether the traditional form using an http-equiv attribute or the newer HTML5 form using a charset attribute is produced. (See 7.4.14 HTML Output Method: the include-content-type Parameter)
It is implementation-defined whether the serialization process recovers from serialization errors when the Adaptive output method is used. If it does, it is implementation-defined what error indicator is used. (See 10 Adaptive Output Method)
It is implementation-defined whether, when the Adaptive output method is used, a serializer includes hyperlinks in its output to record the types of atomic items, the bindings of namespace prefixes, the causes of error indicators, and other information. (See 10 Adaptive Output Method)

F.2 Checklist of Implementation-Dependent Features

The following list describes Serialization features whose effect is explicitly implementation-dependent. The conformance rules (see 12 Conformance) do not require vendors or specifications which define conformance criteria for serialization to provide documentation that explains how these choices have been exercised.

The octet order of the serialized result sequence of octets is implementation-dependent. (See 3 Serialization Parameters)
In those cases where they have no important effect on the content of the serialized result, details of the output methods defined by this specification are left unspecified and are regarded as implementation-dependent. (See 3 Serialization Parameters)
When map items are serialized using the JSON output method, the order in which key/value pairs appear in the serialized output is implementation-dependent. (See 9 JSON Output Method)
If, when the Adaptive output method is used, a serializer includes hyperlinks in its output to record the types of atomic items, the bindings of namespace prefixes, the causes of error indicators, and other information, then it is implementation-dependent what hyperlinks are used and how they convey the information. (See 10 Adaptive Output Method)

G Change Log (Non-Normative)

This appendix lists changes made in version 4.0 of this specification.

Use the arrows to browse significant changes since the 3.1 version of this specification.

See 1 Introduction
Sections with significant changes are marked Δ in the table of contents.

See 1 Introduction
The term atomic value has been replaced by atomic item.

See 1.1 Terminology
PR 342

In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">.

See 6 XHTML Output Method

See 7 HTML Output Method
PR 534

Added the escape-solidus parameter for JSON serialization.

See 3 Serialization Parameters

See 9 JSON Output Method

See 10.1 The Influence of Serialization Parameters upon the Adaptive Output Method

XSLT and XQuery Serialization 4.0

W3C Editor's Draft 1 October 2024

Abstract

Status of this Document

1 Introduction

1.1 Terminology

1.2 Namespaces

2 Sequence Normalization

3 Serialization Parameters

3.1 Setting Serialization Parameters by Means of a Parameter Document

4 Phases of Serialization

5 XML Output Method

5.1 The Influence of Serialization Parameters upon the XML Output Method

5.1.1 XML Output Method: the version Parameter

5.1.2 XML Output Method: the html-version Parameter

5.1.3 XML Output Method: the encoding Parameter

5.1.4 XML Output Method: the indent and suppress-indentation Parameters

5.1.5 XML Output Method: the cdata-section-elements Parameter

5.1.6 XML Output Method: the omit-xml-declaration and standalone Parameters

5.1.7 XML Output Method: the doctype-system and doctype-public Parameters

5.1.8 XML Output Method: the undeclare-prefixes Parameter

5.1.9 XML Output Method: the normalization-form Parameter

5.1.10 XML Output Method: the media-type Parameter

5.1.11 XML Output Method: the use-character-maps Parameter

5.1.12 XML Output Method: the byte-order-mark Parameter

5.1.13 XML Output Method: the escape-solidus Parameter

5.1.14 XML Output Method: the escape-uri-attributes Parameter

5.1.15 XML Output Method: the include-content-type Parameter

5.1.16 XML Output Method: the item-separator Parameter

5.1.17 XML Output Method: the allow-duplicate-names Parameter

5.1.18 XML Output Method: the json-node-output-method Parameter

6 XHTML Output Method

6.1 The Influence of Serialization Parameters upon the XHTML Output Method

6.1.1 XHTML Output Method: the version Parameter

6.1.2 XHTML Output Method: the html-version Parameter

6.1.3 XHTML Output Method: the encoding Parameter

6.1.4 XHTML Output Method: the indent and suppress-indentation Parameters

6.1.5 XHTML Output Method: the cdata-section-elements Parameter

6.1.6 XHTML Output Method: the omit-xml-declaration and standalone Parameters

6.1.7 XHTML Output Method: the doctype-system and doctype-public Parameters

6.1.8 XHTML Output Method: the undeclare-prefixes Parameter

6.1.9 XHTML Output Method: the normalization-form Parameter

6.1.10 XHTML Output Method: the media-type Parameter

6.1.11 XHTML Output Method: the use-character-maps Parameter

6.1.12 XHTML Output Method: the byte-order-mark Parameter

6.1.13 XHTML Output Method: the escape-solidus Parameter

6.1.14 XHTML Output Method: the escape-uri-attributes Parameter

6.1.15 XHTML Output Method: the include-content-type Parameter

6.1.16 XHTML Output Method: the item-separator Parameter

6.1.17 XHTML Output Method: the allow-duplicate-names Parameter

6.1.18 XHTML Output Method: the json-node-output-method Parameter

7 HTML Output Method

7.1 Markup for Elements

7.2 Writing Attributes

7.3 Writing Character Data

7.4 The Influence of Serialization Parameters upon the HTML Output Method

7.4.1 HTML Output Method: the version and html-version Parameters

7.4.2 HTML Output Method: the encoding Parameter

7.4.3 HTML Output Method: the indent and suppress-indentation Parameters

7.4.4 HTML Output Method: the cdata-section-elements Parameter

7.4.5 HTML Output Method: the omit-xml-declaration and standalone Parameters

7.4.6 HTML Output Method: the doctype-system and doctype-public Parameters

7.4.7 HTML Output Method: the undeclare-prefixes Parameter

7.4.8 HTML Output Method: the normalization-form Parameter

7.4.9 HTML Output Method: the media-type Parameter

7.4.10 HTML Output Method: the use-character-maps Parameter

7.4.11 HTML Output Method: the byte-order-mark Parameter

7.4.12 HTML Output Method: the escape-solidus Parameter

7.4.13 HTML Output Method: the escape-uri-attributes Parameter

7.4.14 HTML Output Method: the include-content-type Parameter

7.4.15 HTML Output Method: the item-separator Parameter

7.4.16 HTML Output Method: the allow-duplicate-names Parameter

7.4.17 HTML Output Method: the json-node-output-method Parameter

8 Text Output Method

8.1 The Influence of Serialization Parameters upon the Text Output Method

8.1.1 Text Output Method: the version Parameter

8.1.2 Text Output Method: the html-version Parameter

8.1.3 Text Output Method: the encoding Parameter

8.1.4 Text Output Method: the indent and suppress-indentation Parameters

8.1.5 Text Output Method: the cdata-section-elements Parameter

8.1.6 Text Output Method: the omit-xml-declaration and standalone Parameters

8.1.7 Text Output Method: the doctype-system and doctype-public Parameters

8.1.8 Text Output Method: the undeclare-prefixes Parameter

8.1.9 Text Output Method: the normalization-form Parameter

8.1.10 Text Output Method: the media-type Parameter

8.1.11 Text Output Method: the use-character-maps Parameter

8.1.12 Text Output Method: the byte-order-mark Parameter

8.1.13 Text Output Method: the escape-solidus Parameter

8.1.14 Text Output Method: the escape-uri-attributes Parameter

8.1.15 Text Output Method: the include-content-type Parameter

8.1.16 Text Output Method: the item-separator Parameter

8.1.17 Text Output Method: the allow-duplicate-names Parameter

8.1.18 Text Output Method: the json-node-output-method Parameter

9.1.1 JSON Output Method: the version Parameter

9.1.2 JSON Output Method: the html-version Parameter

9.1.3 JSON Output Method: the encoding Parameter

9.1.4 JSON Output Method: the indent and suppress-indentation Parameters

9.1.5 JSON Output Method: the cdata-section-elements Parameter

9.1.6 JSON Output Method: the omit-xml-declaration and standalone Parameters

9.1.7 JSON Output Method: the doctype-system and doctype-public Parameters

9.1.8 JSON Output Method: the undeclare-prefixes Parameter

9.1.9 JSON Output Method: the normalization-form Parameter