This document is also available in these non-normative formats: XML.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines methods of serializing an instance of the data model defined in [XDM 4.0] into a sequence of octets, conforming to a variety of formats including XML, HTML, and JSON. Serialization is designed to be a component that can be used either on its own, or invoked from languages such as [XSLT 4.0], [XPath 4.0] or [XQuery 4.0].
This section describes the status of this document at the time of its publication. Other documents may supersede this document.
This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group, unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.
The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.
The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.
The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).
Changes in 4.0
In the HTML and XHTML output methods, the rules for adding and replacing meta elements have been revised to take account of the new HTML5 syntax, for example <meta charset="utf-8">. [Issue 318 PR 342 14 February 2023]
The default HTML version is now 5. This may result in changes to the serialized output in cases where no explicit HTML version is requested. [Issue 1889 PR 1977 2 May 2025]
The HTML output method serializes the input tree as HTML.
For example, the following XSL stylesheet generates HTML output:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" version="4.0"/>
  <xsl:template match="/">
    <html>
      <xsl:apply-templates/>
    </html>
  </xsl:template>
  ...
</xsl:stylesheet>
In the example, the version attribute of the xsl:output element indicates the version of the HTML Recommendation [HTML] to which the serialized result is to conform.
[Definition: The requested HTML version is the value of the html-version serialization parameter if present; otherwise the value of the version serialization parameter if present; otherwise 5.0.]
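The fallback order in this definition can be illustrated with a minimal sketch. It assumes, purely for illustration, that the serialization parameters are held in a mapping from parameter name to string value and that an absent parameter is either missing from the mapping or mapped to None; the function name requested_html_version is hypothetical and is not defined by this specification.

```python
from typing import Mapping, Optional

# Default used when neither parameter is present, per the definition above.
DEFAULT_HTML_VERSION = "5.0"


def requested_html_version(params: Mapping[str, Optional[str]]) -> str:
    """Return the requested HTML version for a set of serialization parameters.

    Hypothetical helper: `params` maps parameter names to their string
    values, with absent parameters either missing or set to None.
    """
    # html-version takes precedence over version.
    for name in ("html-version", "version"):
        value = params.get(name)
        if value is not None:
            return value
    return DEFAULT_HTML_VERSION
```

Under these assumptions, a parameter set containing both html-version and version yields the value of html-version, and an empty parameter set yields 5.0.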
This document provides the normative definition of serialization for the HTML output method if the requested HTML version has the lexical form of a value of type decimal whose value is 1.0 or greater, but no greater than 5.0. For any other requested HTML version, the behavior is implementation-defined. In that case the implementation-defined behavior may supersede all other requirements of this recommendation.
An implementation is required to behave as specified in this document when the requested version is 5.0. If the requested version is greater than or equal to 1 but less than 5.0, then the processor may behave as if the requested version were 5.0.
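As a rough illustration of the two preceding paragraphs, the following sketch classifies a requested HTML version into the three cases they describe. The function names and the returned labels are hypothetical, and the regular expression is an assumed rendering of the lexical form of a decimal value that does not allow surrounding whitespace; none of these are defined by this specification.

```python
import re
from decimal import Decimal

# Assumed lexical form of a decimal value: optional sign, then digits with an
# optional fractional part, or a leading decimal point followed by digits.
_DECIMAL_LEXICAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_normatively_defined(version: str) -> bool:
    """True if the version is a decimal in the range 1.0 to 5.0 inclusive."""
    if _DECIMAL_LEXICAL.fullmatch(version) is None:
        return False
    return Decimal("1.0") <= Decimal(version) <= Decimal("5.0")


def classify(version: str) -> str:
    """Return a hypothetical label for how a serializer treats the version."""
    if not is_normatively_defined(version):
        # Outside the normative range: behavior is implementation-defined.
        return "implementation-defined"
    if Decimal(version) == Decimal("5.0"):
        # The serializer is required to follow this document.
        return "required"
    # From 1.0 up to (but excluding) 5.0: may be treated as if it were 5.0.
    return "may-treat-as-5.0"
```

The comparison is made on decimal values rather than on strings, so under this sketch "5" and "5.0" are treated alike.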
It is entirely the responsibility of the supplier of the input tree to ensure that it conforms to the relevant HTML specification. It is not an error if the input tree is invalid HTML. Equally, it is entirely under the control of the supplier of the input tree whether the output conforms to HTML. If the result tree is valid HTML, the serializer must serialize the result in a way that conforms with the requested HTML version.
The serialization parameters that affect the HTML output method are listed in the following subsections.
Serialization parameters other than those listed are not applicable to this output method. It is the responsibility of the host language to specify whether an error occurs if such a parameter is specified in combination with the HTML output method, or whether the parameter is ignored, or whether it is validated and then ignored.
indent and suppress-indentation Parameters
If the indent parameter has the value true, then the HTML output method may add or remove whitespace as it serializes the result tree, if it observes the following constraints.
Whitespace must not be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace must not be added or removed adjacent to an inline element. The inline elements are:
Prior to HTML5: elements included in the %inline category of any of the HTML 4.01 DTDs
With HTML5: elements defined to be phrasing elements in HTML5
as well as the ins and del elements if they are used as inline elements (i.e., if they do not contain element children).
Whitespace must not be added or removed inside a formatted element, the formatted elements being pre, script, style, title, and textarea.
Whitespace characters must not be added in the content of an element whose expanded QName matches a member of the list of expanded QNames in the value of the suppress-indentation parameter. The expanded QName of an element node is considered to match a member of the list of expanded QNames if:
the two expanded QNames are equal;
the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
the value of the requested HTML version is 5.0, the local parts of the two QNames are equal without regard to case and one QName has a null namespace URI and the namespace URI of the other is equal to the XHTML namespace URI.
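The three matching rules for suppress-indentation can be sketched as follows. This is an illustration only: the QName type, the function names and the html5 flag (standing for "the requested HTML version is 5.0") are hypothetical, a null namespace URI is modeled as None, and "without regard to case" is assumed to mean comparison after lowercasing.

```python
from typing import Iterable, NamedTuple, Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


class QName(NamedTuple):
    """Hypothetical expanded QName; namespace None models a null namespace URI."""
    namespace: Optional[str]
    local: str


def _same_local_ignoring_case(a: QName, b: QName) -> bool:
    # Assumption: case-insensitive comparison is done by lowercasing.
    return a.local.lower() == b.local.lower()


def qname_matches(element: QName, listed: QName, html5: bool) -> bool:
    """Apply the three matching rules to one element name and one list member."""
    # Rule 1: the two expanded QNames are equal.
    if element == listed:
        return True
    # Rule 2: both namespaces are null and the local parts match ignoring case.
    if element.namespace is None and listed.namespace is None:
        return _same_local_ignoring_case(element, listed)
    # Rule 3: HTML 5.0 only; one namespace is null, the other is XHTML.
    if html5 and {element.namespace, listed.namespace} == {None, XHTML_NS}:
        return _same_local_ignoring_case(element, listed)
    return False


def suppresses_indentation(element: QName, listed: Iterable[QName], html5: bool) -> bool:
    """True if the element name matches any member of the suppress-indentation list."""
    return any(qname_matches(element, member, html5) for member in listed)
```

Note that under this reading two names that are both in the XHTML namespace match only when they are exactly equal, because neither the second nor the third rule applies to them.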
Note:
The effect of the above constraints is to ensure that any insertion or deletion of whitespace would not affect how a conforming HTML user agent would render the output, assuming the serialized document does not refer to any HTML style sheets.
Note that the HTML definition of whitespace is different from the XML definition (see section 9.1 of the [HTML] specification).