Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: XML.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines serialization of an instance of the data model as defined in [XDM 4.0] into a sequence of octets. Serialization is designed to be a component that can be used by other specifications such as [XSLT 4.0] or [XQuery 4.0].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is governed by the 1 March 2017 W3C Process Document.
This is a Recommendation of the W3C. It was jointly developed by the W3C XSLT Working Group and the W3C XML Query Working Group, each of which is part of the XML Activity.
This Editor's Draft specifies XSLT and XQuery Serialization version 4.0, a fully compatible extension of Serialization version 3.1.
This specification is designed to be referenced normatively from other specifications defining a host language for it; it is not intended to be implemented outside a host language. The implementability of this specification has been tested in the context of its normative inclusion in host languages defined by the XQuery 3.1 and XSLT 3.0 specifications; see the XQuery 3.1 implementation report (and, in the future, the WGs expect that there will also be an XSLT 3.0 implementation report) for details.
No substantive changes have been made to this specification since its publication as a Proposed Recommendation.
Please report errors in this document using W3C's public Bugzilla system (instructions can be found at https://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[SER40]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at https://lists.w3.org/Archives/Public/public-qt-comments/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (W3C XML Query Working Group) and a public list of any patent disclosures (W3C XSLT Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">. [Issue 318 PR 342 14 February 2023]
The default HTML version is now 5. This may result in changes to the serialized output in cases where no explicit HTML version is requested. [Issue 1889 PR 1977 2 May 2025]
The XHTML output method serializes the input tree as XML, using the HTML compatibility guidelines defined in the XHTML specification ([XHTML 1.0]) or the XHTML syntax of HTML5 (see [HTML5]).
The default value of the html-version serialization parameter for this method is 5.0, and all references to the value of this parameter assume this default when the parameter is absent. The value of the parameter is a decimal, so the values 5 and 5.0 are equivalent.
[Definition: The term with HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is 5.0.]
[Definition: The term prior to HTML5 is used in this specification to qualify rules that apply only when the effective version of the html-version serialization parameter is less than 5.0.]
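The two qualifiers defined above can be sketched as simple predicates. This is a minimal illustration, not part of the specification: the function names are hypothetical, and the parameter is assumed to arrive as an optional string that is converted to a decimal.

```python
from decimal import Decimal
from typing import Optional

# Default stated in the text for the XHTML output method.
DEFAULT_HTML_VERSION = Decimal("5.0")


def effective_html_version(html_version: Optional[str]) -> Decimal:
    """Return html-version as a decimal, applying the default when absent."""
    if html_version is None:
        return DEFAULT_HTML_VERSION
    return Decimal(html_version)


def is_with_html5(html_version: Optional[str]) -> bool:
    """True when rules qualified by 'with HTML5' apply."""
    return effective_html_version(html_version) == Decimal("5.0")


def is_prior_to_html5(html_version: Optional[str]) -> bool:
    """True when rules qualified by 'prior to HTML5' apply."""
    return effective_html_version(html_version) < Decimal("5.0")
```

Because the comparison is made on decimal values rather than strings, "5" and "5.0" are treated identically, as the text requires.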
[Definition: An element node is recognized as an HTML element by the XHTML output method if either of the following conditions is true:
the element node is in the XHTML namespace; or
With HTML5: the element has a null namespace URI and the local part of the name is equal to the name of an element defined by HTML5 [HTML5], making the comparison without regard to case.
]
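The recognition rule might be sketched as follows. The function name and the HTML5_ELEMENT_NAMES set are assumptions made for illustration; the set is a deliberately small subset, and a real serializer would need the complete list of element names from [HTML5].

```python
from decimal import Decimal

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative subset only (an assumption of this sketch), not the full HTML5 list.
HTML5_ELEMENT_NAMES = frozenset({"html", "head", "body", "meta", "p", "br", "title"})


def is_recognized_html_element(namespace_uri, local_name, html_version=Decimal("5.0")):
    """Apply the two conditions of the definition to one element node."""
    # Condition 1: the element is in the XHTML namespace (any html-version).
    if namespace_uri == XHTML_NS:
        return True
    # Condition 2 applies with HTML5 only: no namespace, and the local name
    # matches an HTML5 element name without regard to case.
    with_html5 = html_version == Decimal("5.0")
    return with_html5 and not namespace_uri and local_name.lower() in HTML5_ELEMENT_NAMES
```

For example, a no-namespace element named BR is recognized with HTML5 but not when html-version is 4.01.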
It is entirely the responsibility of the supplier of the input tree to ensure that it conforms to the relevant specification, this being:
With HTML5, the XHTML syntax of HTML5;
Prior to HTML5, the [XHTML 1.0] or [XHTML 1.1] specification.
It is not an error if the input tree is invalid XHTML. Equally, it is entirely under the control of the supplier of the input tree whether the output conforms to XHTML 1.0 Strict, XHTML 1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5]), [POLYGLOT] or any other specific definition of XHTML.
The serialization of the input tree follows the same rules as for the XML output method, with the general exceptions noted below and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. These differences are based on the HTML compatibility guidelines published in Appendix C of [XHTML 1.0] and on [POLYGLOT], both of which are designed to ensure that as far as possible, XHTML is rendered correctly on user agents designed originally to handle HTML.
With HTML5 the input tree is first subjected to prefix normalization.
[Definition: During prefix normalization, any element node in the input tree that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element’s namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the input tree is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.]
The process of prefix normalization is equivalent to replacing the input tree with the result of the transformation described by this XSLT stylesheet, with the root of the input tree as the initial context value.
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="4.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:mathml="http://www.w3.org/1998/Math/MathML">

  <xsl:template match="xhtml:*|svg:*|mathml:*">
    <xsl:element name="{local-name()}"
                 namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*|namespace::*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template
      match="namespace::*[. eq 'http://www.w3.org/1999/xhtml']|
             namespace::*[. eq 'http://www.w3.org/2000/svg']|
             namespace::*[. eq 'http://www.w3.org/1998/Math/MathML']"/>

</xsl:stylesheet>
[Definition: The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.]
[Definition: The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.]
[Definition: An element node is expected to be empty if it is recognized as an HTML element and:
With HTML5, the element is a void element.
Prior to HTML5, the content model is EMPTY.
]
If an element node that has no child nodes is not expected to be empty, and:
With HTML5, the HTML element is not a void element, or
Prior to HTML5, the content model of the HTML element is not EMPTY (for example, an empty title or paragraph)
then the serializer must not use the minimized form. That is, it must output <p></p> and not <p />.
If an element that has no children is expected to be empty, the serializer must use the minimized tag syntax, for example <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many legacy user agents. If the html-version serialization parameter has a value less than 5.0, the serializer must include a space before the trailing />, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />.
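The rules for childless elements can be illustrated with a short sketch. It assumes the element has already been recognized as an HTML element; the function names are hypothetical, the case-insensitive name lookup is a simplification, and omitting the space before /> with HTML5 is one permitted choice rather than a requirement.

```python
from decimal import Decimal

# Element lists taken from the two definitions in the text.
EMPTY_CONTENT_MODEL = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "basefont", "frame", "isindex", "param",
})
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "meta", "param", "source", "track", "wbr",
})


def is_expected_to_be_empty(local_name, html_version=Decimal("5.0")):
    """Assumes the element is already recognized as an HTML element."""
    names = VOID_ELEMENTS if html_version == Decimal("5.0") else EMPTY_CONTENT_MODEL
    return local_name.lower() in names


def serialize_childless_element(local_name, attributes="", html_version=Decimal("5.0")):
    """Serialize an element with no children; attributes is pre-serialized text."""
    start = f"<{local_name}{attributes}"
    if is_expected_to_be_empty(local_name, html_version):
        # Prior to HTML5 the space before "/>" is mandatory.
        return start + (" />" if html_version < Decimal("5.0") else "/>")
    # Anything else must not use the minimized form.
    return f"{start}></{local_name}>"
```

Note that wbr is a void element of HTML5 but has no EMPTY content model prior to HTML5, so the same input serializes differently depending on html-version.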
Prior to HTML5, the serializer must not use the entity reference &apos; which, although defined in XML and therefore in XHTML, is not defined in versions of HTML prior to HTML5, and is not recognized by all HTML user agents.
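One way to honor this restriction is to fall back to a numeric character reference. The sketch below is an assumption about how an implementation might do this, not a rule taken from the text; the function name is hypothetical, and escaping is only needed at all where an apostrophe would otherwise be ambiguous, such as inside an attribute value delimited by apostrophes.

```python
def escape_apostrophes(text: str, prior_to_html5: bool) -> str:
    """Replace each apostrophe with a reference that is safe for the target version."""
    # &#39; is a numeric character reference, understood by all HTML versions;
    # &apos; is only used here when serializing with HTML5.
    reference = "&#39;" if prior_to_html5 else "&apos;"
    return text.replace("'", reference)
```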
With HTML5, the serializer should output namespace declarations in a way that is consistent with the requirements of [POLYGLOT].
Prior to HTML5, the serializer should output namespace declarations in a way that is consistent with the requirements of the XHTML DTD if this is possible.
The XHTML 1.0 DTDs require the declaration xmlns="http://www.w3.org/1999/xhtml" to appear on the html element, and only on the html element. The [POLYGLOT] specification permits namespace declarations to appear in a conforming document, but restricts the elements on which they can appear. The serializer must output namespace declarations that are consistent with the namespace nodes present in the result tree, but it must avoid outputting redundant namespace declarations on elements where the DTD would make them invalid, for versions prior to HTML5, or where they are not permitted by [POLYGLOT], for serialization according to the syntax of HTML5.
Note:
If the html element is generated by an XSLT literal result element of the form <html xmlns="http://www.w3.org/1999/xhtml"> ... </html>, or by an XQuery direct element constructor of the same form, then the html element in the result document will have a node name whose prefix is "", which will satisfy the requirements of the DTD. In other cases the prefix assigned to the element is implementation-dependent.
Note:
[POLYGLOT] and Appendix C of [XHTML 1.0] describe a number of compatibility guidelines for users of XHTML who wish to render their XHTML documents with HTML user agents. In some cases, such as the guideline on the form empty elements take, only the serialization process itself has the ability to follow the guideline. In such cases, those guidelines are reflected in the requirements on the serializer described above.
In all other cases, the guidelines can be adhered to by the input tree. The guideline on the use of whitespace characters in attribute values is one such example. Another example is that xml:lang="..." does not serialize to both xml:lang="..." and lang="..." as required by some legacy user agents. It is the responsibility of the person or process that creates the instance of the data model that is input to the serialization process to ensure it is created in a way that is consistent with the guidelines. No serialization error results if the input tree does not adhere to the guidelines.
The serialization parameters that affect the XHTML output method are listed in the following subsections.
Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the XHTML output method, or whether the parameter is ignored, or whether it is validated and then ignored.
include-content-type Parameter
If the input tree includes a head element recognized as an HTML element, and the include-content-type parameter has the value true, the XHTML output method must add a meta element as the first child element of the head element, specifying the character encoding actually used. The meta element should be in no namespace if the head element is in no namespace, and in the XHTML namespace if the head element is in the XHTML namespace.
Prior to HTML5, the generated meta element must take the form shown below (assuming encoding EUC-JP):
<head>
<meta http-equiv="Content-Type"
      content="text/html; charset=EUC-JP" />
...
With HTML5, the generated meta element must take the form shown below (again assuming encoding EUC-JP):
<head>
<meta charset="EUC-JP"/>
...
The content type, when included, should be set to the value given for the media-type parameter.
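The two forms of the generated meta element can be sketched as a single function. The function name and its parameters are hypothetical; the sketch assumes the encoding name and media type contain no characters that need escaping in an attribute value, and it defaults the media type to text/html only to match the example above.

```python
def xhtml_content_type_meta(encoding, media_type="text/html", with_html5=True):
    """Return the serialized meta element that the XHTML method adds to head."""
    if with_html5:
        # HTML5 form: only the charset attribute is emitted.
        return f'<meta charset="{encoding}"/>'
    # Pre-HTML5 form: the content type comes from the media-type parameter.
    return (
        '<meta http-equiv="Content-Type" '
        f'content="{media_type}; charset={encoding}" />'
    )
```

Calling xhtml_content_type_meta("EUC-JP", with_html5=False) yields the first form shown above on a single line; the line break in the example is not significant.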
Note:
It is recommended that the host language use as default value for this parameter one of the MIME types ([RFC2046]) registered for XHTML. Currently, these are text/html (registered by [RFC2854]) and application/xhtml+xml (registered by [RFC3236]). Note that some user agents fail to recognize the charset parameter if the content type is not text/html.
If a meta element has been added to the head element as described above, then any existing meta element child of the head element having either a charset attribute, or an http-equiv attribute with the value "Content-Type", making the comparison without regard to case after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison, must be discarded.
Note:
This process removes possible parameters in the attribute value. For example,
<meta http-equiv="Content-Type"
      content="text/html;version='4.0'" />
in the input tree might be replaced by
<meta charset="utf-8"/>
The HTML output method serializes the input tree as HTML.
For example, the following XSLT stylesheet generates HTML output:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" version="4.0"/>
  <xsl:template match="/">
    <html>
      <xsl:apply-templates/>
    </html>
  </xsl:template>
  ...
</xsl:stylesheet>
In the example, the version attribute of the xsl:output element indicates the version of the HTML Recommendation [HTML] to which the serialized result is to conform.
[Definition: The requested HTML version is the value of the html-version serialization parameter if present; otherwise the value of the version serialization parameter if present; otherwise 5.0.]
This document provides the normative definition of serialization for the HTML output method if the requested HTML version has the lexical form of a value of type decimal whose value is 1.0 or greater, but no greater than 5.0. For any other requested HTML version, the behavior is implementation-defined. In that case the implementation-defined behavior may supersede all other requirements of this recommendation.
An implementation is required to behave as specified in this document when the requested version is 5.0. If the requested version is greater than or equal to 1 but less than 5.0, then the processor may behave as if the requested version were 5.0.
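The selection of the requested HTML version and the check on its normative range might look like the following sketch. The function names are hypothetical, parameters are assumed to be passed as optional strings, and the regular expression is a simplified approximation of the lexical form of a decimal.

```python
import re
from decimal import Decimal

# Simplified approximation of the decimal lexical form (an assumption of this sketch).
_DECIMAL_LEXICAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def requested_html_version(html_version=None, version=None):
    """html-version wins if present, then version, otherwise the default 5.0."""
    if html_version is not None:
        return html_version
    if version is not None:
        return version
    return "5.0"


def is_normatively_defined(requested):
    """True when the requested version is a decimal in the range 1.0 to 5.0."""
    text = requested.strip()
    if not _DECIMAL_LEXICAL.fullmatch(text):
        return False  # behavior is implementation-defined
    return Decimal("1.0") <= Decimal(text) <= Decimal("5.0")
```

A processor that receives a version for which is_normatively_defined returns False falls under the implementation-defined clause; for values from 1 up to but not including 5.0 it may also choose to behave as if 5.0 had been requested.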
It is entirely the responsibility of the supplier of the input tree to ensure that it conforms to the relevant HTML specification. It is not an error if the input tree is invalid HTML. Equally, it is entirely under the control of the supplier of the input tree whether the output conforms to HTML. If the result tree is valid HTML, the serializer must serialize the result in a way that conforms with the requested HTML version.
The serialization parameters that affect the HTML output method are listed in the following subsections.
Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the HTML output method, or whether the parameter is ignored, or whether it is validated and then ignored.
include-content-type Parameter
If there is a head element, and the include-content-type parameter has the value true, the HTML output method must add a meta element as the first child element of the head element specifying the character encoding actually used.
Prior to HTML5, the generated meta element must take the form shown below (assuming encoding EUC-JP):
<head>
<meta http-equiv="Content-Type"
      content="text/html; charset=EUC-JP" >
...
With HTML5, the generated meta element must take the form shown below (again assuming encoding EUC-JP):
<head>
<meta charset="EUC-JP">
...
The content type, when included, must be set to the value given for the media-type parameter.
If a meta element has been added to the head element as described above, then any existing meta element child of the head element having a charset attribute or an http-equiv attribute with the value "Content-Type", making the comparison without regard to case after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison, must be discarded.
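The combined effect of adding the new meta element and discarding superseded ones can be sketched on a simplified model of the head element. The representation of children as (name, attributes) tuples and both function names are assumptions made for illustration only; strip() removes all leading and trailing whitespace, which is a slightly broader reading of "spaces".

```python
def _is_superseded_meta(child):
    """True for an existing meta child that the rule says must be discarded."""
    name, attrs = child
    if name.lower() != "meta":
        return False
    if "charset" in attrs:
        return True
    # Compare http-equiv without regard to case, ignoring surrounding whitespace.
    return attrs.get("http-equiv", "").strip().lower() == "content-type"


def add_content_type_meta(head_children, encoding, with_html5=True, media_type="text/html"):
    """Return the head children with the generated meta first and old ones removed."""
    if with_html5:
        new_meta = ("meta", {"charset": encoding})
    else:
        new_meta = ("meta", {
            "http-equiv": "Content-Type",
            "content": f"{media_type}; charset={encoding}",
        })
    kept = [child for child in head_children if not _is_superseded_meta(child)]
    return [new_meta] + kept
```

Other meta elements, such as one carrying only name and content attributes, are left in place and keep their relative order.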
Note:
This process removes possible parameters in the attribute value. For example,
<meta http-equiv="Content-Type"
      content="text/html;version='4.0'">
in the input tree might be replaced by
<meta http-equiv="Content-Type"
      content="text/html;charset=utf-8">
or by
<meta charset="utf-8">