W3C

XSLT Streaming Version 4.0

W3C Editor's Draft 5 Sept 2025

This version:
https://qt4cg.org/specifications/xslt-streaming-40/
Latest version:
https://qt4cg.org/specifications/xslt-streaming-40/
Most recent Recommendation of XSL Transformations (XSLT):
https://www.w3.org/TR/xslt-30/
Editor:
Michael Kay, Saxonica <http://www.saxonica.com/>

Please check the errata for any errors or issues reported since publication.

See also translations.

This document is also available in these non-normative formats: Specification in XML format.


Abstract

This document defines the streaming feature of XSLT 4.0.

The earlier XSLT 3.0 specification integrated the definition of streaming into the main language specification, although streaming was always an optional feature. In this version, the specification has been modularised so that streaming features are described separately. This has been done in order to make the set of specification documents more manageable both for editors and for readers.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This document is a working draft developed and maintained by a W3C Community Group, the XQuery and XSLT Extensions Community Group, unofficially known as QT4CG (where "QT" denotes Query and Transformation). This draft is work in progress and should not be considered either stable or complete. Standard W3C copyright and patent conditions apply.

The community group welcomes comments on the specification. Comments are best submitted as issues on the group's GitHub repository.

The community group maintains two extensive test suites, one oriented to XQuery and XPath, the other to XSLT. These can be found at qt4tests and xslt40-test respectively. New tests, or suggestions for correcting existing tests, are welcome. The test suites include extensive metadata describing the conditions for applicability of each test case as well as the expected results. They do not include any test drivers for executing the tests: each implementation is expected to provide its own test driver.

At the time of writing, this specification does not fully describe the streamability of all new constructs introduced in XSLT 4.0 and XPath 4.0. This remains a work in progress.

Dedication

The publications of this community group are dedicated to our co-chair, Michael Sperberg-McQueen (1954–2024).


1 Introduction

This document defines the streaming feature of the XSLT 4.0 language: see [XSLT 4.0].

Streaming is an optional feature of the XSLT language that enables the transformation of documents that are too large to be held in memory. Stylesheets that aim to achieve streamed processing must adhere to constraints (called streamability rules) that are defined in this specification.

1.1 Terminology

In this specification the phrases must, must not, should, should not, may, required, and recommended, when used in normative text and rendered in small capitals, are to be interpreted as described in [RFC2119].

Where the phrase must, must not, or required relates to the behavior of the XSLT processor, then an implementation is not conformant unless it behaves as specified, subject to the more detailed rules in 13 Conformance.

Where the phrase must, must not, or required relates to a stylesheet then the processor must enforce this constraint on stylesheets by raising an error if the constraint is not satisfied.

Where the phrase should, should not, or recommended relates to a stylesheet then a processor may produce warning messages if the constraint is not satisfied, but must not treat this as an error.

[Definition: In this specification, the term implementation-defined refers to a feature where the implementation is allowed some flexibility, and where the choices made by the implementation must be described in documentation that accompanies any conformance claim.]

[Definition: The term implementation-dependent refers to a feature where the behavior may vary from one implementation to another, and where the vendor is not expected to provide a full specification of the behavior.] (This might apply, for example, to limits on the size of source documents that can be transformed.)

In all cases where this specification leaves the behavior implementation-defined or implementation-dependent, the implementation has the option of providing mechanisms that allow the user to influence the behavior.

A paragraph labeled as a Note or described as an example is non-normative.

2 Concepts

2.1 Streaming

[Definition: The term streaming refers to a manner of processing in which XML documents (such as source and result documents) are not represented by a complete tree of nodes occupying memory proportional to document size, but instead are processed “on the fly” as a sequence of events, similar in concept to the stream of events notified by an XML parser to represent markup in lexical XML.]

[Definition: A streamed document is a source treeXT that is processed using streaming, that is, without constructing a complete tree of nodes in memory.]

[Definition: A streamed node is a node in a streamed document.]

Many processors implementing earlier versions of this specification adopted an architecture that allowed streaming of the result treeXT directly to a serializer, without first materializing the complete result tree in memory. Streaming of the source treeXT, however, has proved to be more difficult without subsetting the language. This has created a situation where documents exceeding the capacity of virtual memory could not be transformed. XSLT 3.0 therefore introduced facilities allowing stylesheets to be written in a way that makes streaming of source documents possible, without excessive reliance on processor-specific optimization techniques.

Streaming achieves two important objectives: it allows large documents to be transformed without requiring correspondingly large amounts of memory; and it allows the processor to start producing output before it has finished receiving its input, thus reducing latency.

This specification does not attempt to legislate precisely which implementation techniques fall under the definition of streaming, and which do not. A number of techniques are available that reduce memory requirements, while still requiring a degree of buffering, or allocation of memory to partial results. A stylesheet that requests streaming of a source document is indicating that the processor should avoid assuming that the entire source document will fit in memory; in return, the stylesheet must be written in a way that makes streaming possible. This specification does not attempt to describe the algorithms that the processor should actually use, or to impose quantitative constraints on the resources that these algorithms should consume.

Nothing in the XSLT specification prevents a processor using streaming whenever it sees an opportunity to do so. However, experience has shown that in order to achieve streaming, it is often necessary to write stylesheet code in such a way as to make this possible. Therefore, XSLT provides explicit constructs allowing the stylesheet author to request streaming, and defines explicit static constraints on the structure of the code which are designed to make streaming possible.

A processor that claims conformance with the streaming option offers a guarantee that when streaming is requested for a source document, and when the stylesheet conforms to the rules that make the processing guaranteed-streamable, then an algorithm will be adopted in which memory consumption is either completely independent of document size, or increases only very slowly as document size increases, allowing documents to be processed that are orders of magnitude larger than the physical memory available. A processor that does not claim conformance with the streaming option must still process a stylesheet and deliver the correct results, but is not required to use streaming algorithms, and may therefore fail with out-of-memory errors when presented with large source documents.

Apart from the fact that there are constructs to request streaming, and rules that must be followed to guarantee that streaming is possible, the language has been designed so there are as few differences as possible between streaming and non-streaming evaluation. The semantics of the language continue to be expressed in terms of the XDM data model, which is substantively unchanged; but readers must take care to observe that when terms like “node” and “axis” are used, the concepts are completely abstract and may have no direct representation in the run-time execution environment.

Streamed processing of a document can be initiated in one of three ways:

  • The initial modeXT can be declared as a streamable modeXT. In this case the initial match selectionXT will generally be a document node (or sequence of document nodes), supplied by the calling application in a form that allows streaming (that is, in some form other than a tree in memory; for example, as a reference to a push or pull XML parser primed to deliver a stream of events). The type of these nodes can be constrained by using the attribute on-no-match="fail" on the initial modeXT, and using this mode only for processing the top-level nodes.

  • Streamed processing of any document can be initiated using the xsl:source-document instruction. This has an attribute href whose value is the URI of a document to be processed, and an attribute streamable that indicates whether it is to be processed using streaming; the actual processing to be applied is defined by the instructions written as children of the xsl:source-document instruction.

  • Streamed merging of a set of input documents can be initiated using the xsl:merge instruction.
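As a non-normative illustration of the second of these mechanisms, the following minimal sketch uses xsl:source-document to count the transactions in a file without building a tree in memory. The file name transactions.xml and the element names transactions and transaction are assumptions made for the example. The sketch uses only constructs that were already present in XSLT 3.0.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: run the stylesheet with the initial named template. -->
  <xsl:template name="xsl:initial-template">
    <!-- transactions.xml is a hypothetical input document. -->
    <xsl:source-document href="transactions.xml" streamable="yes">
      <!-- A single downward selection from the streamed document node. -->
      <count>
        <xsl:value-of select="count(transactions/transaction)"/>
      </count>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```

The body of xsl:source-document is evaluated with the streamed document node as the context item, so the path expression is the only place where the input is read.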

The rules for streamability, which are defined in detail in 3 Streamability Analysis Principles, impose two main constraints:

  • The only nodes reachable from the node that is currently being processed are its attributes and namespaces, its ancestors and their attributes and namespaces, and its descendants and their attributes and namespaces. The siblings of the node, and the siblings of its ancestors, are not reachable in the tree, and any attempt to use their values is a static errorXT.

  • When processing a given node in the tree, each descendant node can only be visited once. Essentially this allows two styles of processing: either visit each of the children once, and then process that child with the same restrictions applied; or process all the descendants in a single pass, in which case it is not possible while processing a descendant to make any further downward selection.

The second restriction, that only one visit to the children is allowed, means that XSLT code that was not designed with streaming in mind will often need to be rewritten to make it streamable. In many cases it is possible to do this using a technique sometimes called windowing or burst-mode streaming (note this is not quite the same meaning as windowing in XQuery 3.0). Many XML documents consist of a large number of elements, each of manageable size, representing transactions or business objects where each such element can be processed independently: in such cases, an effective design pattern is to write a streaming transformation that takes a snapshot of each element in turn, processing the snapshot using the full power of the XSLT language. Each snapshot is a tree built in memory and is therefore fully navigable. For details see the snapshot and copy-of functions.
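The burst-mode pattern described above might be sketched as follows (non-normative). The input file orders.xml, and the names orders, order, line, id and price, are assumptions made for the example. Each order element is copied into memory in turn using copy-of, and the copy can then be navigated freely.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <summaries>
      <!-- orders.xml is a hypothetical input document. -->
      <xsl:source-document href="orders.xml" streamable="yes">
        <!-- copy-of(.) delivers an in-memory copy of each order in turn. -->
        <xsl:for-each select="orders/order/copy-of(.)">
          <!-- The context item is now an ordinary (unstreamed) element,
               so several downward selections are allowed. -->
          <summary id="{@id}"
                   lines="{count(line)}"
                   total="{sum(line/@price)}"/>
        </xsl:for-each>
      </xsl:source-document>
    </summaries>
  </xsl:template>

</xsl:stylesheet>
```

Memory use in this style depends on the size of the largest order element rather than on the size of the whole document, provided the processor discards each copy once it has been processed.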

The new facility of accumulators allows applications complete control over how much information is retained (and by implication, how much memory is required) in the course of a pass over a streamed document. An accumulator computes a value for every node in a streamed document: or more accurately, two values, one for the first visit to a node (before visiting its descendants), and a second value for the second visit to the node (after visiting the descendants). The computation is structured in such a way that the value for a given node can depend only on the value for the previous node in document order together with the data available when positioned at the current node (for example, the attribute values). Based on the well-established fold operation of functional programming languages, accumulators provide the convenience and economy of mutable variables while remaining within the constraints of a purely declarative processing model.

When streaming is initiated, for example using the xsl:source-document instruction, it is necessary to declare which accumulators are applicable to the streamed document.
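The following non-normative sketch shows a streamable accumulator and the way it is made applicable to a streamed document. The accumulator name order-count, the file name orders.xml, and the element and attribute names are assumptions made for the example.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

  <!-- Counts order elements as their start tags are reached. -->
  <xsl:accumulator name="order-count" as="xs:integer"
                   initial-value="0" streamable="yes">
    <xsl:accumulator-rule match="order" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template name="xsl:initial-template">
    <orders>
      <!-- use-accumulators declares which accumulators apply
           to this streamed document. -->
      <xsl:source-document href="orders.xml" streamable="yes"
                           use-accumulators="order-count">
        <xsl:for-each select="orders/order">
          <!-- Both attribute values are available without
               reading the content of the order element. -->
          <order seq="{accumulator-before('order-count')}" id="{@id}"/>
        </xsl:for-each>
      </xsl:source-document>
    </orders>
  </xsl:template>

</xsl:stylesheet>
```

Here the accumulator retains only a single integer, which illustrates how the stylesheet author controls the amount of state kept during the pass.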

Streaming applications often fall into one of the following categories:

  • Aggregation applications, where a single aggregation operation (perhaps count, sum, exists, or distinct-values) is applied to a set of elements selected from the streamed source document by means of a path expression.

  • Record-at-a-time applications, where the source document consists of a long sequence of elements with similar structure (“records”), and each “record” is processed using the same logic, independently of any other “records”. This kind of processing is facilitated using the snapshot and copy-of functions mentioned earlier.

  • Grouping applications, where the output follows the structure of the input, except that an extra layer of hierarchy is added. For example, the input might be a flat series of banking transactions in date/time order, and the output might contain the same transactions grouped by date.

  • Accumulator applications, which are the same as record-at-a-time applications, except that the processing of one “record” might depend on data encountered earlier in the document. A classic example is processing a sequence of banking transactions in which the input transaction contains a debit or credit amount, and the output adds a running total (the account balance). The xsl:iterate instruction has been introduced to facilitate this style of processing.

  • Isomorphic transformations, in which there is an ordered (often largely one-to-one) relationship between the nodes of the source tree and the nodes of the result tree: for example, transformations that involve only the renaming or selective deletion of nodes, or scalar manipulations of the values held in the leaf nodes. Such transformations are most conveniently expressed using recursive application of template rules. This is possible with a streamed input document only if all the template rules adhere to the constraints required for streamability. To enforce these rules, while still allowing unrestricted processing of other documents within the same transformation, all streaming evaluation must be carried out using a specific modeXT, which is declared to be a streaming mode by means of an xsl:mode declaration in the stylesheet.
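The last of these categories might be sketched as follows (non-normative). The element names internal-note, para and p are assumptions made for the example. The unnamed mode is declared streamable, unmatched nodes are copied, and two template rules perform selective deletion and renaming.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Declare the unnamed mode as streamable; nodes with no
       matching rule are shallow-copied and their children processed. -->
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>

  <!-- Selective deletion: drop these elements and their content. -->
  <xsl:template match="internal-note"/>

  <!-- Renaming: para becomes p, keeping attributes and content. -->
  <xsl:template match="para">
    <p>
      <xsl:copy-of select="@*"/>
      <!-- The only downward selection in this rule. -->
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Each rule makes at most one downward selection from the matched node, which is the property that allows the rules to be applied recursively to a streamed input.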

There are important classes of application in which streaming is possible only if multiple streams can be processed in parallel. This specification therefore provides facilities:

  1. allowing multiple sorted input sequences to be merged into one sorted output sequence (the xsl:merge instruction)

  2. allowing multiple output sequences to be generated during a single pass of an input sequence (the xsl:fork instruction).

These facilities have been designed in such a way that they can readily be implemented using streaming, that is, without materializing the input or output sequences in memory.
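As a non-normative sketch of the second facility, the following stylesheet uses xsl:fork to compute two aggregates in a single pass over the input. The file name transactions.xml and the element names transactions, transaction and amount are assumptions made for the example.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <!-- transactions.xml is a hypothetical input document. -->
    <xsl:source-document href="transactions.xml" streamable="yes">
      <summary>
        <xsl:fork>
          <!-- Each xsl:sequence is one prong of the fork; each prong
               makes its own downward selection from the same input. -->
          <xsl:sequence>
            <count>
              <xsl:value-of select="count(transactions/transaction)"/>
            </count>
          </xsl:sequence>
          <xsl:sequence>
            <total>
              <xsl:value-of select="sum(transactions/transaction/amount)"/>
            </total>
          </xsl:sequence>
        </xsl:fork>
      </summary>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```

Without xsl:fork, the two path expressions would count as two downward selections from the same node, and the construct would not be guaranteed-streamable.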

2.2 Streamed Validation

Streaming can be combined with schema-aware processing: that is, the streamed input to a transformation can be subjected to on-the-fly validation, a process which typically accepts an input stream from the XML parser and delivers an output stream (of type-annotated nodes) to the transformation processor. The XSD specification is designed so that validation is, with one or two exceptions, a streamable process. The exceptions include:

  • There may be a need to allocate memory to hold keys, in order to enforce uniqueness and referential integrity constraints (xs:unique, xs:key, xs:keyref).

  • In XSD 1.1, assertions can be defined by means of XPath expressions. These are not constrained to be streamable; in the general case, any subtree of the document that is validated using an assertion may need to be buffered in memory while the assertion is processed.

Applications that need to run in finite memory may therefore need to avoid these XSD features, or to use them with care.

XSD is designed so that the intended type of an element (the “governing type”) can be determined as soon as the start tag of the element is encountered: the process of validation checks whether the content of the element actually conforms to this type, and by the time the end tag is encountered, the process will have established either that the element is valid against the governing type, or that it is invalid.

By default, dynamic errors occurring during streamed processing are fatal: they typically cause the transformation to fail immediately. XSLT 3.0 introduced the ability to catch dynamic errors and recover from them. Schema invalidity, however, is treated as a dynamic error of the instruction that processes the entire input stream, so after a validation failure, no further processing of that input stream is possible.

In consequence, a streamed validator that is running in tandem with a streamed transformation can present the transformer with element nodes that carry a provisional type annotation representing the type that the element will have if it turns out to be valid. As soon as a node is encountered that violates this assumption, the validator should stop the flow of data to the transformer, so that the transformer never sees invalid data. This allows the stylesheet code to be compiled with the assumption of type-safety: at run-time, all nodes seen by the transformation will conform to their XSLT-declared types (for example, a type declared implicitly using match="schema-element(invoice)" on an xsl:template element).

A streamed transformation that only accesses part of the input document (for example, a header at the start of a document) is not required to continue reading once the data it needs has been read. This means that XML well-formedness or validity errors occurring in the unread part of the input stream may go undetected.

Note:

The analysis of guaranteed streamability (see 3 Streamability Analysis Principles) takes no account of information that might be obtained from a schema-aware static analysis of the stylesheet. Implementations may, however, be able to use streaming strategies for stylesheets that are not guaranteed-streamable, by taking advantage of such information. For example, an implementation might be able to treat the expression .//title as striding rather than crawling if it can establish from knowledge of the schema that two title elements will never be nested one inside the other.

2.3 Initiating a Streamed Transformation

With a streamable processor, the initial match selectionXT can consist of streamed nodes, but the global context itemXT is always grounded, because it is available to all global variables and there is no control over the sequence of processing.

If the initial mode is declared-streamable, then a streaming processor should allow some or all of the items in the initial match selectionXT to be nodes supplied in streamable form, and any nodes that are supplied in this form must then be processed using streaming.

Since the global context itemXT cannot be a streamed node, in cases where the transformation is to proceed by applying streamable templates to a streamed input document, the global context itemXT must either be absent, or must be something that differs from the initial match selectionXT.

2.4 Streaming of non-XML data

The facilities in this specification designed to enable large data sets to be processed in a streaming manner are oriented almost entirely to XML data. This does not mean that there is never a requirement to stream non-XML data, or that the Working Group has ignored this requirement; rather, the Working Group has concluded that for the most part, streaming of non-XML data can be achieved by implementations without the need for specific language features in XSLT.

To make streamed processing of unparsed text files easier, the function unparsed-text-lines has been introduced. This is not only more convenient for stylesheet authors than reading the entire input using the unparsed-text function and then tokenizing the result, it is also easier for implementations to optimize, allowing each line of text to be discarded from memory after it has been processed.
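A minimal, non-normative sketch of this usage follows. The file name server.log and the marker string ERROR are assumptions made for the example. Whether lines are actually discarded after use depends on the implementation.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <errors>
      <!-- server.log is a hypothetical text file, resolved relative
           to the static base URI of the stylesheet. -->
      <xsl:for-each
          select="unparsed-text-lines('server.log')[contains(., 'ERROR')]">
        <!-- The context item is one line of text (an xs:string). -->
        <error>
          <xsl:value-of select="."/>
        </error>
      </xsl:for-each>
    </errors>
  </xsl:template>

</xsl:stylesheet>
```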

For all functions that access external data, including document, doc, collection, unparsed-text, unparsed-text-lines, and json-doc, the requirements on determinism can now be relaxed using implementation-defined configuration options. This is significant because it means that when a transformation reads the same external resource more than once, it becomes legitimate for the contents of the resource to be different on different invocations, and this eliminates the need for the processor to cache the contents of the resource in memory.

In the XDM data model, every value is a sequence, and (as with most functional programming languages), processing of sequences of items is pervasive throughout the XSLT and XPath languages and their function library. Good performance of a functional programming language often depends on sequence-based operations being pipelined, and being evaluated in a lazy fashion (that is, many operations process items in a sequence one at a time, in order; and many operations can deliver a result without processing the entire sequence). The semantics of XSLT and XPath permit pipelined and lazy evaluation (for example, the error handling semantics are carefully written to ensure this), but they do not require it: the details are left to implementations. Pipelined processing of a sequence is not the same thing as streamed processing of a tree, and where the XSLT specification talks of operations being “guaranteed streamable”, this is always referring to processing of trees, not of sequences.

The facilities for streaming of XML trees include operations such as copy-of and snapshot which are able to take a sequence of streamed nodes as input, and produce a sequence of in-memory (unstreamed) nodes as output. It is also possible to generate a sequence of strings or other atomic items through the process of atomizationXT. The actual memory usage of a streamed XSLT application may depend significantly on whether the processing of the resulting sequence of in-memory nodes or atomic items is pipelined or not. The specification, however, has nothing to say on this matter: it is considered an area where implementers can exercise their discretion and ingenuity.

Streaming of JSON input receives little attention in this specification. One can envisage an implementation of the json-to-xml function in which the XML delivered by the function consists of streamed nodes; but the Working Group has not researched the feasibility of such an implementation in any detail.

2.5 Data Model for Streaming

2.5.1 Streamed Documents

The data model for nodes in a document that is being streamed is no different from the standard XDM data model, in that it contains the same objects (nodes) with the same properties and relationships. The facilities for streaming do not change the data model; instead they impose rules that limit the ability of stylesheets to navigate the data model.

A useful way to visualize streaming is to suppose that at any point in time, there is a current position in the streamed input document which may be the start or end of the document, the start or end tag of an element, or a text, comment, or processing instruction node. From this position, the stylesheet has access to the following information:

  • Properties intrinsic to the node, such as its name, its base URI, its type annotation, and its is-id and is-idref properties.

  • The ancestors of the node (but navigation downwards from the ancestors is not permitted).

  • The attributes of the node, and the attributes of its ancestors. For each such attribute, all the properties of the node including its string value and typed value are available, but there are limitations that restrict navigation from the attribute node to other nodes in the document.

  • The in-scope namespace bindings of the node.

  • In the case of attributes, text nodes, comments, and processing instructions, the string value and typed value of the node.

  • In the case of element nodes, whether or not the element has children. This information is obtained by calling the has-children function. This implies that the processor performs look-ahead (limited to a single token) to determine whether the start tag is immediately followed by a matching end tag.

  • In the case of document nodes, details of unparsed entities in the document. This information is obtained by calling the unparsed-entity-uri and unparsed-entity-public-id functions. A processor might enable this by reading the DTD as soon as the document is opened. Since comments and processing instructions that precede the DOCTYPE declaration are available as children of the document node, this also implies that a streaming processor needs sufficient memory to hold these comments and processing instructions until the start tag of the first element is encountered. Information about unparsed entities remains available for the duration of processing, in the same way as attributes of ancestor elements.

The children and other descendants of a node are not accessible except as a by-product of changing the current position in the document. The same applies to properties of an element or document node that require examination of the node’s descendants, that is, the string value and typed value. This is enforced by means of a rule that only one expression requiring downward navigation from a node is permitted.
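The effect of this rule can be illustrated with a non-normative sketch. The element names item, price, quantity and line are assumptions made for the example. An expression that reads two different children of a streamed element is rewritten so that the element is first copied into memory with a single downward operation.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Unmatched nodes are skipped, but their children are processed. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <xsl:template match="item">
    <!-- Writing select="price * quantity" directly here would need two
         downward selections from the streamed item element, so the
         rule would not be guaranteed-streamable. -->

    <!-- One downward operation: copy the whole item into memory. -->
    <xsl:variable name="item" select="copy-of(.)"/>

    <!-- The copy is an ordinary tree and can be navigated freely. -->
    <line total="{$item/price * $item/quantity}"/>
  </xsl:template>

</xsl:stylesheet>
```

This approach is appropriate when each item element is small; the cost is that one item at a time is held in memory.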

Information about the type of a node is in general considered a property intrinsic to the node, and is available without advancing the input stream. There is an exception for an expression of the form (/) instance of document-node(element(invoice)). This is not guaranteed streamable, because it requires reading ahead to check that the document node has only one element child. However, a processor that knows that the parser delivering the document stream is only capable of delivering well-formed documents may use this knowledge (along with the limited look-ahead needed to get the name of the outermost element) to make this expression streamable.

A streaming processor is not required to read any more of the source document than is needed to generate correct stylesheet output. It is not required to read the full source document merely in order to satisfy the requirement imposed by the XML Recommendation that an XML Processor must report violations of well-formedness in the input.

More detailed rules are defined in 3 Streamability Analysis Principles.

2.5.2 Maps and Arrays

Maps and arrays were first introduced in XPath 3.1.

Streaming facilities in this specification are, for the most part, relevant only to streamed processing of XML trees, and not to other structures such as sequences, maps and arrays, which will typically be held in memory unless the processor is capable of avoiding this.

Maps, however, play an important role in enabling streamed applications to be written. For example, a map can be used as the data structure maintained by an accumulator (see [XSLT 4.0] section 19 Accumulators) to remember information that has been retrieved from a streamed document, given that it is not possible to revisit the same nodes later. There is also a special streamability rule for map constructor expressions (see 12.1 Maps and Streaming) that allows such an expression to make multiple downward selections in the streamed input document: for example one can write { 'authors': data(author), 'editors': data(editor) }, which gathers the values of these two elements, or sets of elements, from the input stream, regardless of the order in which they appear, even if they are interleaved.
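A non-normative sketch of a template rule using such a map constructor follows. The element names book, author, editor and credits are assumptions made for the example. The constructor is written with the map keyword and the entries are read with function-call syntax, on the assumption that this makes the sketch acceptable to XPath 3.1 processors as well.

```xslt
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Unmatched nodes are skipped, but their children are processed. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <xsl:template match="book">
    <!-- Two downward selections inside one map constructor; the
         entries hold atomic values, not streamed nodes. -->
    <xsl:variable name="people"
        select="map { 'authors': data(author), 'editors': data(editor) }"/>

    <!-- The map is held in memory and can be read repeatedly. -->
    <credits authors="{$people('authors')}"
             editors="{$people('editors')}"/>
  </xsl:template>

</xsl:stylesheet>
```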

The rules for creating maps and arrays are designed to ensure that the entries in a map, and the members of an array, cannot contain nodes from a streamed document. This is achieved by the way in which the streamability properties of the relevant expressions and functions are defined.

By contrast, sequences can and often do contain nodes from streamed documents, and a major purpose of the rules for streamability is to make this possible.

3 Streamability Analysis Principles

This section describes the principles used to determine properties of constructs in the stylesheetXT that are used in the analysis of streamability. Specifically, it introduces the concepts of the posture and sweep of a construct, which enable the streamability of the stylesheet to be assessed.

These properties are used, for example, to determine the streamability of template rules and of the xsl:source-document instruction.

In each case, the conditions for constructs to be guaranteed-streamable are defined in terms of these properties. The result of this analysis in turn (see 3.1 Streamability Guarantees) imposes rules on how the constructs are handled by processors that implement the streaming feature. The analysis has no effect on the behavior of processors that do not implement this feature.

The analysis is relevant to constructs such as streamable template rules and the xsl:source-document instruction that process a single streamed input document. The xsl:merge instruction, which processes multiple streamed inputs, has its own rules.

The rules in this section operate on the expression tree (more properly, construct tree) that is typically output by the XSLT and XPath parser. For the most part, the rules depend only on identifying the syntactic constructs that are present.

The rules in this section generally consider each componentXT in the stylesheet (and in the case of template rulesXT, each template rule) in isolation. The exception is that where a component contains references to other components (such as global variables, functions, or named templates), then information from the signature of the referenced component is sometimes used. This is invariably information that cannot be changed if a component is overridden in a different packageXT. The analysis thus requires as a pre-condition that function calls and calls on named templates have been resolved to the extent that the corresponding function/template signature is known.

The detailed way in which the construct tree is derived from the lexical form of the stylesheet is not described in this specification. There are many ways in which the tree can be optimized without affecting the result of the rules in this section: for example, a sequence constructor containing a single instruction can be replaced by that instruction, and a parenthesized expression can be replaced by its content.

[Definition: The term construct refers to the union of the following: a sequence constructorXT, an instructionXT, an attribute setXT, a value templateXT, an expressionXT, or a patternXT.]

These constructs are classified into construct kinds: in particular, instructionsXT are classified according to the name of the XSLT instruction, and expressionsXT are classified according to the most specific production in the XPath grammar that the expression satisfies. (This means, for example, that 2+2 is classified as an AdditiveExpr, rather than say as a UnionExpr; although it also satisfies the production rule for UnionExpr, AdditiveExpr is more specific.)

3.1 Streamability Guarantees

Certain constructs allow a stylesheet author to declare that a construct is streamable. Specifically:

  • Specifying streamable="yes" on xsl:mode declares that all template rules in that mode (and all template rules that specify mode="#all") are streamable;

  • Specifying streamable="yes" on xsl:source-document declares that its contained sequence constructor is streamable;

  • Specifying streamable="yes" on xsl:function declares that the stylesheet functionXT in question is streamable;

  • Specifying streamable="yes" on xsl:attribute-set declares that the attribute set in question is streamable;

  • Specifying streamable="yes" (explicitly or implicitly) on xsl:merge-source declares that the merging process is streamable with respect to that particular input;

  • Specifying streamable="yes" on xsl:accumulator declares that the accumulator can be evaluated on a streamed document.

[Definition: The above constructs (template rules belonging to a mode declared with streamable="yes"; and xsl:source-document, xsl:attribute-set, xsl:function, xsl:merge-source, and xsl:accumulator elements specifying streamable="yes") are said to be declared-streamable.]

In each case the construct in question is said to be guaranteed-streamable if it satisfies two conditions:

  1. The construct is declared-streamable.

  2. Streamability analysis following the rules defined in this specification determines that streamed processing is possible (the detailed conditions vary from one construct to another).

[Definition: A guaranteed-streamable construct is a construct that is declared to be streamable and that follows the particular rules for that construct to make streaming possible, as defined by the analysis in this specification.]

For a streaming processor, that is, a processor that claims conformance with the streaming feature:

  1. If a construct is guaranteed-streamable and the input is provided in streamable form, then the input must be processed using streaming.

    Note:

    The requirement to process the input using streaming does not apply if the processor is able to determine that this would convey no benefit: for example, if the input is supplied as a tree in memory. However, this does not remove the requirement to verify that the relevant stylesheet constructs are guaranteed-streamable.

  2. If a construct is declared as streamable but is not guaranteed-streamable (that is, if it fails to satisfy the conditions for streamability defined in this specification), then the processor must be prepared to do any one of the following at user option:

    1. Raise a static error [see ERR XTSE3430]

    2. Process the stylesheet as if it were a non-streaming processor (see below)

    3. Process the stylesheet with streaming if it is able to do so, or raise a static error [see ERR XTSE3430] if it is not able to do so.

[ERR XTSE3430] It is a static errorXT if a packageXT contains a construct that is declared to be streamable but which is not guaranteed-streamable, unless the user has indicated that the processor is to handle this situation by processing the stylesheet without streaming or by making use of processor extensions to the streamability rules where available.
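The choices open to a streaming processor can be summarised as a small decision function. The following Python sketch is illustrative only: the names UserOption, handle_construct, and processor_can_stream_anyway are hypothetical and do not appear in this specification, and the sketch assumes the input is supplied in streamable form.

```python
from enum import Enum


class UserOption(Enum):
    """Hypothetical labels for the three user-selectable behaviours."""
    STATIC_ERROR = 1      # option 1: raise XTSE3430
    NO_STREAMING = 2      # option 2: behave like a non-streaming processor
    TRY_EXTENSIONS = 3    # option 3: stream if the processor can, else XTSE3430


def handle_construct(declared_streamable, analysis_succeeds, option,
                     processor_can_stream_anyway=False):
    """Return a label describing what a streaming processor does.

    declared_streamable: the construct is declared-streamable.
    analysis_succeeds:   the analysis in this specification finds it streamable.
    processor_can_stream_anyway: assumption standing in for processor-specific
        extensions to the streamability rules.
    """
    if not declared_streamable:
        return "no-guarantee"          # nothing was declared, nothing to enforce
    if analysis_succeeds:
        return "stream"                # guaranteed-streamable
    if option is UserOption.STATIC_ERROR:
        return "XTSE3430"
    if option is UserOption.NO_STREAMING:
        return "non-streaming"
    return "stream" if processor_can_stream_anyway else "XTSE3430"


print(handle_construct(True, False, UserOption.NO_STREAMING))  # non-streaming
```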

For a non-streaming processor, the processor must evaluate the construct delivering the same results as if execution used streaming, but with no constraints on the evaluation strategy. (Processing may, of course, fail due to insufficient memory being available, or for other reasons.) A non-streaming processor is not required to assess whether constructs are guaranteed-streamable, or to apply restrictions such as the rules for where calls on the functions accumulator-before and accumulator-after may appear. However, a non-streaming processor must enforce the constraint implied by a use-accumulators attribute restricting which accumulators can be used with a particular document.

Note:

This specification does not attempt to legislate precisely what constitutes evaluation “using streaming”. The most important test is that the amount of memory needed should be for practical purposes independent of the size of the source document, and in particular that the finite size of memory available should not impose a limit on the size of source document that can be processed.

The rules are designed to ensure that streaming processors can analyze streamability using rules different from those in this specification, provided that all constructs that are guaranteed-streamable according to this specification are actually streamable by the implementation. Furthermore, non-streaming processors are not required to analyze streamability at all.

3.2 Operand Roles

[Definition: For every construct kind, there is a set of zero or more operand roles.] For example, an AdditiveExpr has two operand roles, referred to as the left-hand operand and the right-hand operand, while an IfExpr has three, referred to as the condition, the then-clause, and the else-clause. A function call with three arguments has three operand roles, called the first, second, and third arguments. The names of the operand roles for each construct kind are not formally listed, but should be clear from the context.

[Definition: In an actual instance of a construct, there will be a number of operands. Each operand is itself a construct; the construct tree can be defined as the transitive relation between constructs and their operands.] Each operand is associated with exactly one of the operand roles for the construct kind. There may be operand roles where the operand is optional (for example, the separator attribute of the xsl:value-of instruction), and there may be operand roles that can be occupied by multiple operands (for example, the xsl:when/@test condition in xsl:choose, or the arguments of the concat function).

Operand roles have a number of properties used in the analysis:

  • The required typeXT of the operand. This is explicit in the case of function calls (the required type is defined in the function signature of the corresponding function). In other cases it is implicit in the detailed rules for the construct in question. In practice streamability analysis makes only modest use of the required type; the main case where it is relevant is for a function or template call, where knowing that the required type is atomic enables the inference that the operand usage for a supplied node is absorption.

  • [Definition: The operand usage. This gives information, in the case where the operand value contains nodes, about how those nodes are used. The operand usage takes one of the values absorption, inspection, transmission, or navigation.] The meanings of these terms are explained in 3.3 Operand Usage. If the required type of the operand does not permit nodes to be supplied (for example because the required type is a function item or a map), then the operand usage is inspection, because the only run-time operation on a supplied node will be to inspect it, discover it is a node, and raise a type error.

    In the particular case where the required type is atomic, and any supplied nodes are atomized, the operand usage will be absorption, because atomizeXT is a special case of absorption.

  • [Definition: Whether or not the operand is higher-order. For this purpose an operand O of a construct C is higher-order if the semantics of C potentially require O to be evaluated more than once during a single evaluation of C.] More specifically, O is a higher-order operand of C if any of the following conditions is true:

    • The context itemXT for evaluation of O is different from the context item for evaluation of C.

    • C is an instructionXT and O is a patternXT (as with the from and count attributes of xsl:number, and the group-starting-with and group-ending-with attributes of xsl:for-each-group).

    • C is an XPath for, some, or every expression and O is the expression in its return or satisfies clause.

    • C is an inline function declaration and O is the expression in its body.

Note:

There is one known case where this definition makes an operand higher-order even though it is only evaluated once: specifically, the sequence constructor contained in the body of an xsl:copy instruction that has a select attribute. See 12.5.12 Streamability of xsl:copy for further details.

3.3 Operand Usage

An operand role gives information about the operands of a particular kind of construct. The two important properties of an operand role are the required type and the operand usage.

The usage of an operand role is relevant only when the value of an operand supplied in that role is a node, or a sequence that contains nodes. It is one of the following:

  • [Definition: An operand usage of absorption indicates that the construct reads the subtree(s) rooted at a supplied node(s).] Examples are constructs that atomize their operands, or that obtain the string value of a supplied node, or that copy the supplied node to a new tree. Another example is the deep-equal function, which compares the subtrees rooted at the nodes supplied in its first two arguments.

  • [Definition: An operand usage of inspection indicates that the construct accesses properties of a supplied node that are available without reading its subtree.] Examples are functions such as name and base-uri, and the instance of expression which tests the type of a node (or other item), or functions such as count, exists, and boolean which are only interested in the existence of the node, and not in its properties.

  • [Definition: An operand usage of transmission indicates that the construct will (potentially) return a supplied node as part of its result to the calling construct (that is, to its parent in the construct tree).] It also indicates that document order is preserved: if the input is in document order, then the result must be in document order. An example is a filter expression, where nodes in the base expression (the expression being filtered) will typically appear in the result of the filter expression, in their original order.

  • [Definition: An operand usage of navigation indicates that the construct may navigate freely from the supplied node to other nodes in the same tree, in a way that is not constrained by the streamability rules.] This covers several cases: constructs that are known to perform impermissible navigation (for example, the xsl:number instruction), reordering (the reverse function), or look-ahead (the innermost function); and cases where the analysis is unable to determine what use is made of the node, for example because it is passed as an argument to a user-defined function, or retained in a variable.

The concept of operand usage is not used for all constructs (for example, it is not used in the analysis of path expressions). Where it is used, the assignment of operand usages to each operand role of a construct is defined in the relevant subsection of 12 Streamability of Specific Constructs.

3.3.1 Type-determined Usage

[Definition: The type-determined usage of an operand is as follows: if the required type (ignoring occurrence indicator) is fn(*) or a subtype thereof, then inspection; if the required type (ignoring occurrence indicator) is an atomic or union type, then absorption; otherwise navigation.]

[Definition: The type-adjusted posture and sweep of a construct C, with respect to a type T, are the posture and sweep established by applying the general streamability rules to a construct D whose single operand is the construct C, where the operand usage of C in D is the type-determined usage based on the required type T.]

Note:

In effect, the type-adjusted posture and sweep are the posture and sweep of the implicit expression formed to apply the coercion rulesXT to the argument of a function or template call, or to the result of a function or template, given knowledge of the required type. For example, an expression such as discount in the function call abs(discount), which would otherwise be striding and consuming, becomes grounded and consuming because of the implicit atomization triggered by the coercion rules.
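The definition of type-determined usage can be sketched as a three-way classification. In the Python sketch below, the CATEGORY table is a hypothetical stand-in for a processor's type system and covers only a handful of item types; the function name is likewise an assumption made for illustration.

```python
# Hypothetical lookup from an item type (without occurrence indicator) to a
# coarse category; a real processor would consult its type system instead.
CATEGORY = {
    "fn(*)": "function",
    "xs:decimal": "atomic",
    "xs:string": "atomic",
    "xs:numeric": "union",
    "node()": "node",
    "element()": "node",
    "item()": "any",
}


def type_determined_usage(required_type):
    """Map a required type such as 'xs:numeric?' to an operand usage."""
    item_type = required_type.rstrip("?*+")   # ignore the occurrence indicator
    category = CATEGORY[item_type]
    if category == "function":
        return "inspection"
    if category in ("atomic", "union"):
        return "absorption"
    return "navigation"


# The argument of abs() has required type xs:numeric?, a union type, so a
# node supplied as 'discount' is absorbed (atomized).
print(type_determined_usage("xs:numeric?"))   # absorption
```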

3.4 Streamability Properties

The process of determining whether a construct is streamable reduces to determining properties of the constructs in the construct tree. The properties in question (which are described in greater detail in subsequent sections) are:

  1. The static type of the construct. When the construct is evaluated, its value will always be an instance of this type. The value is a U-type; although type inferencing is capable of determining information about the cardinality as well as the item type, the streamability analysis makes no use of this.

  2. The context item type: that is, the static type of the context itemXT potentially used as input to the construct. When the construct is evaluated, the context item used to evaluate the construct (if it is used at all) will be an instance of this type.

  3. [Definition: The posture of the expression. This captures information about the way in which the streamed input document is positioned on return from evaluating the construct. The posture takes one of the values climbing, striding, crawling, roaming, or grounded.] The meanings of these terms are explained in 3.8 Posture.

  4. [Definition: The context posture. This captures information about how the context itemXT used as input to the construct is positioned relative to the streamed input. The context posture of a construct C is the posture of the expression whose value sets the focus for the evaluation of C.] Rules for determining the context posture of any construct are given in 3.8.2 Determining the Context Posture.

  5. The sweep of the construct. The sweep of a construct gives information about whether and how the evaluation of the construct changes the current position in a streamed input document. The possible values are motionless, consuming, and free-ranging. These terms are explained in 3.9 Sweep.

The values of these properties for a top-level construct such as the body of a template rule determine whether the construct is streamable.

The values of these properties are not independent. For example, if the static type is atomic, then the posture will always be grounded; if the sweep is free-ranging, then the posture will always be roaming.

The posture and sweep of a construct, as defined above, are calculated in relation to a particular streamed input document. If there is more than one streamed input document, then a construct that is motionless with respect to one streamed input might be consuming with respect to another. In practice, though, the streamability analysis is only ever concerned with one particular streamed input at a time; constructs are analyzed in relation to the innermost containing xsl:template, xsl:source-document, xsl:accumulator, or xsl:merge-source element, and this container implicitly defines the streamed input document that is relevant. The streamed input document affecting a construct is always the document that contains the context item for evaluation of that construct.

3.5 Determining the Static Type of a Construct

Changes in 4.0  

  1. The static typing rules have been updated to take account of new constructs in XPath 4.0.   [Issue 675 PR 2011 18 May 2025]

[Definition: The static type of a construct is such that all values produced by evaluating the construct will conform to that type. The static type of a construct comprises a U-type and a cardinality. A cardinality is a range of integers (from min to max).]

[Definition: A U-type is a set of fundamental item types.]

[Definition: There are 29 fundamental item types: the 7 node kinds defined in [XDM 4.0] (element, attribute, etc.), the 19 primitive atomic types defined in [XML Schema Part 2], plus the types fn(*), xs:untypedAtomic, and JNode. The fundamental item types are disjoint, and every item is an instance of exactly one of them.]

More specifically, the fundamental item types are:

  • document-node(), element(), attribute(), text(), comment(), processing-instruction(), namespace-node();

  • xs:boolean, xs:double, xs:decimal, xs:float, xs:string, xs:dateTime, xs:date, xs:time, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, xs:gDay, xs:anyURI, xs:QName, xs:NOTATION, xs:base64Binary, xs:hexBinary, xs:duration

  • fn(*)

  • xs:untypedAtomic

  • JNode

TODO: extend the analysis to include JNodes.

A value V (in general, a sequence) is an instance of a U-type U if every item in V is an instance of one of the fundamental item types in U. For example, the sequence (23, "Paris") is an instance of the U-type U{xs:string, xs:decimal, xs:date} because both items in the sequence belong to item types in this U-type.

Note:

It is a consequence of this rule that the empty sequence, (), is an instance of every U-type.

A U-type is represented in this specification using the notation U{t1, t2, t3, ...} where t1, t2, t3, ... are the names of the fundamental item types making up the U-type. The item types are represented using the syntax of the ItemTypeXP production in XPath, for example comment() or xs:date.

Note:

This means that the order of t1, t2, t3, ... has no significance: U{A, B} is the same U-type as U{B, A}.

The smallest U-type is denoted U{}. This is not an empty type; like every other U-type, it has the empty sequence () as an instance. For convenience, the universal U-type is represented as U{*}; the U-type corresponding to the set of 7 node kinds is written U{N}, and the U-type corresponding to all atomic items (that is, the 19 primitive atomic types plus xs:untypedAtomic) is written U{A}.

Because a U-type is a set, the operations of union, intersection, and difference are defined over U-types, and the result is always a U-type. If one U-type U is a subset of another U-type V, then U is said to be a subtype of V, and V is said to be a supertype of U.
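Because a U-type is simply a set of fundamental item types, it can be modelled directly with a set data type. The following Python sketch is one possible model, not a prescribed representation: a value is described by the list of fundamental item types of its items, and the names U_N, U_A, U_ANY, is_instance, and is_subtype are chosen here for illustration.

```python
U_N = frozenset(
    "document-node() element() attribute() text() comment() "
    "processing-instruction() namespace-node()".split()
)
U_A = frozenset(
    "xs:boolean xs:double xs:decimal xs:float xs:string xs:dateTime xs:date "
    "xs:time xs:gYear xs:gYearMonth xs:gMonth xs:gMonthDay xs:gDay xs:anyURI "
    "xs:QName xs:NOTATION xs:base64Binary xs:hexBinary xs:duration "
    "xs:untypedAtomic".split()
)
U_ANY = U_N | U_A | {"fn(*)", "JNode"}   # the universal U-type U{*}
U_EMPTY = frozenset()                    # U{}


def is_instance(item_types, u_type):
    """True if every item (given by its fundamental item type) is in u_type."""
    return all(t in u_type for t in item_types)


def is_subtype(u, v):
    """U is a subtype of V when U is a subset of V."""
    return u <= v


# (23, "Paris"): 23 is an xs:integer, which is derived from xs:decimal.
value = ["xs:decimal", "xs:string"]
print(is_instance(value, frozenset({"xs:string", "xs:decimal", "xs:date"})))  # True
print(is_instance([], U_EMPTY))   # True: () is an instance of every U-type
print(len(U_ANY))                 # 29
```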

In some cases the inference of a static type depends on the declared types of variables or functions. Because declared types use the SequenceTypeXT syntax, a mapping is defined from SequenceTypes to U-types. The mapping is as follows:

  • The SequenceTypeXT empty-sequence() maps to U{}

  • For every other SequenceTypeXT, the mapping depends only on the item type and ignores the occurrence indicator. The mapping from item types is as follows:

    • item() maps to U{*}

    • A choice item type (A | B | C) maps to the union of the U-types corresponding to A, B, and C.

    • AnyKindTest (node()) maps to U{N}

    • DocumentTest maps to U{document-node()}

    • ElementTest and SchemaElementTest map to U{element()}

    • AttributeTest and SchemaAttributeTest map to U{attribute()}

    • TextTest maps to U{text()}

    • CommentTest maps to U{comment()}

    • PITest maps to U{processing-instruction()}

    • NamespaceNodeTest maps to U{namespace-node()}

    • FunctionType, MapType, ArrayType, and RecordType map to U{fn(*)}

    • The QName xs:error maps to U{}

    • A QName Q representing an atomic type that is a fundamental item type maps to U{Q}

    • A QName Q representing an atomic type derived from a fundamental item type F maps to U{F}

    • A QName Q representing a pure union type maps to a U-type containing the fundamental item types present in the transitive membership of the union, or from which the transitive members of the union are derived.
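The mapping above can be sketched as a function from a SequenceType string to a set. The Python below is a simplified illustration under several assumptions: it uses naive string matching rather than a real parser, the DERIVED_FROM and UNION_MEMBERS tables are hypothetical stand-ins for schema knowledge, and choice item types are split on "|" without handling nesting.

```python
U_N = frozenset(
    "document-node() element() attribute() text() comment() "
    "processing-instruction() namespace-node()".split()
)
U_A = frozenset(
    "xs:boolean xs:double xs:decimal xs:float xs:string xs:dateTime xs:date "
    "xs:time xs:gYear xs:gYearMonth xs:gMonth xs:gMonthDay xs:gDay xs:anyURI "
    "xs:QName xs:NOTATION xs:base64Binary xs:hexBinary xs:duration "
    "xs:untypedAtomic".split()
)
U_ANY = U_N | U_A | {"fn(*)", "JNode"}

# Hypothetical, deliberately tiny stand-ins for schema knowledge.
DERIVED_FROM = {"xs:integer": "xs:decimal", "xs:token": "xs:string",
                "xs:dayTimeDuration": "xs:duration"}
UNION_MEMBERS = {"xs:numeric": ["xs:double", "xs:float", "xs:decimal"]}

KIND_TESTS = [
    ("document-node(", "document-node()"),
    ("schema-element(", "element()"), ("element(", "element()"),
    ("schema-attribute(", "attribute()"), ("attribute(", "attribute()"),
    ("text(", "text()"), ("comment(", "comment()"),
    ("processing-instruction(", "processing-instruction()"),
    ("namespace-node(", "namespace-node()"),
]
FUNCTION_PREFIXES = ("fn(", "function(", "map(", "array(", "record(")


def u_type(sequence_type):
    """Map a SequenceType string to a U-type, ignoring the occurrence indicator."""
    st = sequence_type.strip()
    if st == "empty-sequence()":
        return frozenset()
    return item_u_type(st.rstrip("?*+").strip())


def item_u_type(item_type):
    if item_type == "item()":
        return U_ANY
    if item_type.startswith("(") and item_type.endswith(")"):
        result = frozenset()                      # choice item type (A | B | C)
        for alternative in item_type[1:-1].split("|"):
            result |= item_u_type(alternative.strip())
        return result
    if item_type == "node()":
        return U_N
    for prefix, kind in KIND_TESTS:
        if item_type.startswith(prefix):
            return frozenset({kind})
    if item_type.startswith(FUNCTION_PREFIXES):
        return frozenset({"fn(*)"})
    if item_type == "xs:error":
        return frozenset()
    if item_type in U_A:                          # fundamental atomic type
        return frozenset({item_type})
    if item_type in DERIVED_FROM:                 # derived atomic type
        return frozenset({DERIVED_FROM[item_type]})
    if item_type in UNION_MEMBERS:                # pure union type
        result = frozenset()
        for member in UNION_MEMBERS[item_type]:
            result |= item_u_type(member)
        return result
    raise ValueError(f"item type not covered by this sketch: {item_type}")


print(sorted(u_type("xs:integer*")))              # ['xs:decimal']
print(sorted(u_type("(xs:string | xs:date)?")))   # ['xs:date', 'xs:string']
```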

3.5.1 Static Type Inference for XPath Expressions

Although all constructs have a static type, the streamability analysis only needs to know the static type of XPath expressions, so the rules here are largely confined to that case. For patternsXT, the static type is deemed to be U{xs:boolean}, reflecting the fact that a pattern is essentially a function that can be applied to items to deliver a true or false (matching or non-matching) result. For constructs other than expressionsXT and patternsXT, the static type for the purpose of streamability analysis is taken as U{*}.

The rules given here are deliberately simple. Implementations may well be able to compute a more precise static type, but this will rarely be useful for streamability analysis. The item type for each kind of XPath expression is determined by the rules below. The columns are interpreted as follows:

  • The name in the first column is the name of a production in the XPath grammar.

  • The second column, the Proforma, uses an informal notation that serves both as a reminder of the syntax of the construct in question and as a way to attach labels to its operand roles, so that they can be referred to in the later columns.

  • The third column gives the static type of the expression, either as a U-type, or as a formula for computing the U-type. In these formulae, T(E) means the U-type of expression E, U1|U2 means the union of U-types U1 and U2, and U1 intersect U2 means the intersection of U-types U1 and U2.

  • The fourth column gives the cardinality of the result, either as an explicit range, or as a formula. For example (0, 1) indicates a cardinality range from zero to one inclusive. N represents unbounded cardinality. The notation C(E) represents the cardinality of expression E.

    The sum of two cardinalities C + D is the range (Cmin+Dmin, Cmax+Dmax).

    The product of two cardinalities C * D is the range (Cmin*Dmin, Cmax*Dmax).

    The maximum of two cardinalities max(C, D) is the range (min(Cmin, Dmin), max(Cmax, Dmax)).
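The three cardinality operations can be sketched with pairs of numbers. In the Python below, a cardinality is a (min, max) tuple and N is modelled as math.inf. Treating 0 × N as 0 is an assumption made for this sketch (so that iterating over an empty sequence yields an empty result); the text above does not address that case.

```python
import math

N = math.inf   # unbounded cardinality


def _times(a, b):
    # Assumption of this sketch: 0 * N is taken to be 0, not undefined.
    return 0 if a == 0 or b == 0 else a * b


def card_sum(c, d):
    """C + D = (Cmin + Dmin, Cmax + Dmax)."""
    return (c[0] + d[0], c[1] + d[1])


def card_product(c, d):
    """C * D = (Cmin * Dmin, Cmax * Dmax)."""
    return (_times(c[0], d[0]), _times(c[1], d[1]))


def card_max(c, d):
    """max(C, D) = (min(Cmin, Dmin), max(Cmax, Dmax))."""
    return (min(c[0], d[0]), max(c[1], d[1]))


print(card_sum((0, 1), (1, 1)))        # (1, 2)
print(card_max((1, 1), (0, 0)))        # (0, 1)
print(card_product((0, N), (1, 1)))    # (0, inf)
```

For example, card_max((1, 1), (0, 0)) corresponds to a conditional with no else branch whose then-clause returns exactly one item: the result has between zero and one items.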

Inferring a Static Type for XPath 4.0 Expressions
| Construct | Proforma | Static Item Type | Cardinality |
| --- | --- | --- | --- |
| Expr | E,F | T(E) \| T(F) | C(E) + C(F) |
| ForExpr | for $x in S return E | T(E) | C(S) * C(E) |
| LetExpr | let $x := S return E | T(E) | C(E) |
| QuantifiedExpr | some\|every $x in S satisfies C | U{xs:boolean} | (1,1) |
| IfExpr | if (C) then A else B | T(A) \| T(B) | max(C(A), C(B)) |
| | if (C) { A } | T(A) | max(C(A), (0,0)) |
| OtherwiseExpr | A otherwise B | T(A) \| T(B) | max(C(A), C(B)) |
| OrExpr | E or F | U{xs:boolean} | (1,1) |
| AndExpr | E and F | U{xs:boolean} | (1,1) |
| ComparisonExpr | E = F; E eq F; E is F | U{xs:boolean} | (0,1) |
| StringConcatExpr | E \|\| F | U{xs:string} | (1,1) |
| RangeExpr | E to F | U{xs:decimal} | (0,N) |
| AdditiveExpr | E + F | U{A}. But if the expression is a predicate (that is, if it appears between square brackets in a filter expression or axis step), then U{xs:decimal, xs:double, xs:float} | (0,1) |
| MultiplicativeExpr | E * F | U{A}. But if the expression is a predicate (that is, if it appears between square brackets in a filter expression or axis step), then U{xs:decimal, xs:double, xs:float} | (0,1) |
| UnionExpr | A \| B | T(A) \| T(B) | C(A) + C(B) |
| IntersectExceptExpr | A intersect B | T(A) intersect T(B) | C(A) * (0,1) |
| | A except B | T(A) | C(A) * (0,1) |
| InstanceOfExpr | E instance of T | U{xs:boolean} | (1,1) |
| TreatExpr | E treat as T | The U-type corresponding to the SequenceType T | The cardinality of the SequenceType T |
| CastableExpr | E castable as T | U{xs:boolean} | (1,1) |
| CastExpr | E cast as T | If T is an atomic or pure union type, the corresponding U-type. Otherwise, for example if T is a list type, U{A}. | If T is an atomic or pure union type, (0,1). Otherwise, for example if T is a list type, (0,N). |
| UnaryExpr | -N | U{xs:decimal, xs:double, xs:float} | C(N) |
| SimpleMapExpr | A ! B | T(B) | C(A) * C(B) |
| PathExpr | / | U{document-node()} | (1,1) |
| | /P | T(P) | (0,N) |
| | //P | T(P) | (0,N) |
| RelativePathExpr | P/Q; P//Q | T(Q) | (0,N) |
| AxisStep | E[P] | T(E): see 3.5.2 Static Type of an Axis Step | (0,N) |
| ForwardStep, ReverseStep | Axis::NodeTest | See 3.5.2 Static Type of an Axis Step | (0,N) |
| PostfixExpr FilterExpr | E[P] | The static type of E | If P is numeric with max cardinality 1, then (0,1). Otherwise C(E) * (0,1) |
| FilterExprAM | E?[P] | T(E) | C(E) * (0,1) |
| Dynamic Function Call | F(X, Y) | U{*}, unless ancillary information is available about the function signature of F: see below. | The cardinality of the return type of F. |
| Literal | "pH", 93.7, #xml:space | U{xs:string}, U{xs:decimal}, U{xs:double}, or U{xs:QName} depending on the form of the literal | (1,1) |
| StringTemplate | `{$x}{$y}` | U{xs:string} | (1,1) |
| VarRef | $V | The declared type of the variable if declared, otherwise the item type of the expression to which the variable is bound. | The declared cardinality of the variable if declared, otherwise the cardinality of the expression to which the variable is bound, or (1,1) in the case of a variable declared in a ForExpr or QuantifiedExpr. |
| ParenthesizedExpr | (E) | T(E) | C(E) |
| | () | U{} (a type whose only instance is the empty sequence) | (0,0) |
| ContextValueRef | . | The context item type: see below | The context cardinality: see below |
| FunctionCall | F(X, Y) | In general: the U-type corresponding to the declared result type of function F. But: (1) if one or more of the arguments to the function have operand usage transmission, then the intersection of the U-type corresponding to the declared result type with the union of the static types of the arguments having usage transmission (for example, the static type of the function call head(//text()) is U{text()}); (2) special rules apply to the current function: see 3.5.3 Static Type of a Call to current. | TODO |
| Partial Function Application | F(X, ?), $F(X, ?) | U{fn(*)} | TODO |
| NamedFunctionRef | F#n | U{fn(*)} | (1,1) |
| InlineFunctionExpr | fn(P) {E} | U{fn(*)} | (1,1) |
| MapConstructor | { "A": E, "B": F } | U{fn(*)} | (1,1) |
| Postfix Lookup (Shallow) | E ? K | If the type of E is a map type map(K, V) or an array type array(V), then the U-type corresponding to the item type of V; otherwise U{*}. An implementation may be able to determine a more precise type when the type of E is a record type. | (0,N) |
| Unary Lookup (Shallow) | ? K | If the context item type is a map type map(K, V) or an array type array(V), then the U-type corresponding to the item type of V; otherwise U{*}. An implementation may be able to determine a more precise type when the context item type is a record type. | (0,N) |
| Deep Lookup | ?? K, E ?? K | U{*} | (0,N) |
| PipelineExpr | A -> B | T(B) | C(B) |
| ArrowExpr | X => F(Y, Z), X =!> F(Y, Z) | The static type of the equivalent static or dynamic function call F(X, Y, Z) | The cardinality of the equivalent static or dynamic function call F(X, Y, Z) |
| SquareArrayConstructor | [ X, Y, ... ] | U{fn(*)} | (1,1) |
| CurlyArrayConstructor | array {X, Y, ... } | U{fn(*)} | (1,1) |
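The FunctionCall rule for arguments with usage transmission can be illustrated as a set intersection. The Python sketch below models U-types as sets of fundamental item type names; the function name function_call_type and its parameters are hypothetical, and the declared result type of head is taken here to map to U{*}.

```python
U_ANY = frozenset(
    "document-node() element() attribute() text() comment() "
    "processing-instruction() namespace-node() "
    "xs:boolean xs:double xs:decimal xs:float xs:string xs:dateTime xs:date "
    "xs:time xs:gYear xs:gYearMonth xs:gMonth xs:gMonthDay xs:gDay xs:anyURI "
    "xs:QName xs:NOTATION xs:base64Binary xs:hexBinary xs:duration "
    "xs:untypedAtomic fn(*) JNode".split()
)


def function_call_type(declared_result, argument_types, argument_usages):
    """Static item type of a function call.

    declared_result: U-type of the declared result type.
    argument_types:  U-types of the supplied arguments.
    argument_usages: operand usage of each argument, in the same order.
    """
    transmitted = [t for t, usage in zip(argument_types, argument_usages)
                   if usage == "transmission"]
    if not transmitted:
        return declared_result
    # Intersect the declared result type with the union of the static types
    # of all arguments whose usage is transmission.
    return declared_result & frozenset().union(*transmitted)


text_nodes = frozenset({"text()"})

# head(//text()): declared result U{*}, one transmitted argument of type U{text()}
print(sorted(function_call_type(U_ANY, [text_nodes], ["transmission"])))
# ['text()']
```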

Where the static type of an expression is U{fn(*)}, it is useful to retain additional information: specifically, the signature of the function. This may be regarded as information ancillary to the U-type of the expression; it does not play any role in operations such as testing whether one U-type is a subtype of another, or forming the union of two U-types. This ancillary information is available for a NamedFunctionRef, for an InlineFunctionExpr, for a MapConstructor, for a FunctionCall whose static type is U{fn(*)}, and for a VarRef if the variable is bound to any of the foregoing, or if it has a declared type corresponding to U{fn(*)}.

Note:

The special case type inference used for an AdditiveExpr or MultiplicativeExpr appearing as a predicate is possible because if an arithmetic operation within a predicate produces any other result, for example an xs:duration or xs:dateTime, this would cause a type error (on the grounds that an xs:duration or xs:dateTime has no effective boolean value), and static type inference only needs to consider the type of non-error results. The benefit of this special rule is that filter expressions such as /descendant::section[$i + 1] can be recognized as returning a singleton, and therefore as being striding, even if the type of $i is unknown.

3.5.2 Static Type of an Axis Step

An AxisStep consists of either a ForwardStep or ReverseStep followed by zero or more predicates. The predicates have no effect on the inferred type of the AxisStep.

The static type of an abbreviated step is the static type of its expansion, for example the static type of @* is the same as the static type of attribute::*.

Both ForwardStep and ReverseStep, in their unabbreviated form, are written as Axis::NodeTest. The static type depends on both the Axis and the NodeTest, and also on the context item type, determined as described in 3.6 Determining the Context Item Type.

If the context item type has an empty intersection with U{N} (that is, if the context item type cannot be a node), then evaluation of the AxisStep will always fail; it is permissible to raise a type error statically in this case, but for the sake of the analysis, the static type of the AxisStep can be taken as U{}. In other cases, let CIT be the intersection of the context item type with U{N}.

Let K(A, CIT) be the set of reachable node kinds (a U-type) for an axis A and context item type CIT, as defined by the following table:

| Axis | Reachable Node Kinds |
| --- | --- |
| self | CIT |
| attribute | if CIT includes U{element()} then U{attribute()} else U{} |
| namespace | if CIT includes U{element()} then U{namespace-node()} else U{} |
| child, descendant | if CIT includes U{element()} or U{document-node()} then U{element(), text(), comment(), processing-instruction()} else U{} |
| following-sibling, preceding-sibling, following, preceding | if CIT is U{document-node()} then U{} else U{element(), text(), comment(), processing-instruction()} |
| parent, ancestor | if CIT is U{document-node()} then U{} else U{element(), document-node()} |
| ancestor-or-self | the union of K(ancestor, CIT) and CIT |
| descendant-or-self | the union of K(descendant, CIT) and CIT |

Let T(NT) be the set of node kinds that are capable of satisfying a NodeTest NT, defined by the following table:

| NodeTest | Possible Node Kinds |
| --- | --- |
| AnyKindTest (that is, node()) | U{N} (that is, any node) |
| Any other KindTest | The corresponding U-type (for example, U{text()} for the KindTest text()) |
| NameTest | The U-type corresponding to the principal node kind of the specified axis |

The static type of an AxisStep with axis A and node test NT, given a context item type CIT, is then defined to be the intersection of K(A, CIT) with T(NT).
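The two tables and the final intersection can be sketched as follows. This Python model is illustrative only: U-types are sets of node kind names, the function names are hypothetical, and the node test handling is simplified (only KindTests without arguments are recognised; anything else is treated as a NameTest).

```python
U_N = frozenset(
    "document-node() element() attribute() text() comment() "
    "processing-instruction() namespace-node()".split()
)
CHILD_KINDS = frozenset(
    {"element()", "text()", "comment()", "processing-instruction()"})
DOC_ONLY = frozenset({"document-node()"})
EMPTY = frozenset()


def reachable(axis, cit):
    """K(A, CIT): node kinds reachable along an axis from the context kinds."""
    if axis == "self":
        return cit
    if axis == "attribute":
        return frozenset({"attribute()"}) if "element()" in cit else EMPTY
    if axis == "namespace":
        return frozenset({"namespace-node()"}) if "element()" in cit else EMPTY
    if axis in ("child", "descendant"):
        has_children = "element()" in cit or "document-node()" in cit
        return CHILD_KINDS if has_children else EMPTY
    if axis in ("following-sibling", "preceding-sibling",
                "following", "preceding"):
        return EMPTY if cit == DOC_ONLY else CHILD_KINDS
    if axis in ("parent", "ancestor"):
        if cit == DOC_ONLY:
            return EMPTY
        return frozenset({"element()", "document-node()"})
    if axis == "ancestor-or-self":
        return reachable("ancestor", cit) | cit
    if axis == "descendant-or-self":
        return reachable("descendant", cit) | cit
    raise ValueError(f"unknown axis: {axis}")


def node_test_kinds(node_test, axis):
    """T(NT): node kinds capable of satisfying the node test."""
    if node_test == "node()":
        return U_N
    if node_test in U_N:                 # a KindTest without arguments
        return frozenset({node_test})
    # NameTest: the principal node kind of the axis
    principal = {"attribute": "attribute()",
                 "namespace": "namespace-node()"}.get(axis, "element()")
    return frozenset({principal})


def axis_step_type(axis, node_test, context_item_type):
    cit = frozenset(context_item_type) & U_N
    if not cit:
        return EMPTY                     # the context item cannot be a node
    return reachable(axis, cit) & node_test_kinds(node_test, axis)


print(sorted(axis_step_type("child", "title", {"element()"})))   # ['element()']
print(sorted(axis_step_type("attribute", "*", {"element()"})))   # ['attribute()']
print(sorted(axis_step_type("child", "text()", {"attribute()"})))  # []
```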

3.5.3 Static Type of a Call to current

The rules in this section define the static type of a call to the current function.

  1. If the call is within a patternXT, the static type of the function call is the match type of the pattern.

    Note:

    There is no circularity in this definition: a call to current in a pattern can only appear within a predicate, and the match type of a pattern never depends on anything appearing in a predicate.

  2. Otherwise (the function call is within an XPath expression), the static type of the function call is the context item type that applies to the outermost containing XPath expression, determined by the rules in 3.6 Determining the Context Item Type.

3.5.4 Schema-Aware Streamability Analysis

Note:

The streamability analysis in this chapter is not schema-aware. There are cases where use of schema type information might enable a processor to determine that a construct is streamable when it would be unable to make this determination otherwise. Two examples:

  • A processor might decide that a construct such as price + salesTax is streamable if both the child elements have a simple type such as xs:decimal, or if the order in which they appear in the input document is known.

  • A processor might decide that a step using the descendant axis, such as .//title, has striding rather than crawling posture if it can establish that two title elements will never be nested (that is, a title cannot contain another title). This would allow the instruction <xsl:apply-templates select=".//title"/> to be used in a streaming template rule.

Although such constructs are not guaranteed-streamable according to this specification, nothing prevents a processor from providing a streamed implementation if it is able to do so.

3.6 Determining the Context Item Type

[Definition: For every expression, static analysis can establish information about the item type of the context item for evaluation of that expression. This is called the context item type of the expression.]

The context item type of an expression is a U-type.

The semantics of every construct, defined in this specification or in the XPath specification, describe how the focusXT for evaluating each operand of the construct is determined. In most cases the focus is the same as that of the parent construct. In some cases the focus is determined by evaluating some other expression, for example in the expressions A/B, A!B, A[B], A?[B], or A -> B the focus for evaluating B is A. More generally:

The context item type of a construct C is the first of the following that applies:

  1. If the focus-setting container of C is an xsl:function element, an inline function declaration, or an xsl:on-completion element, then the context item type is U{}.

    Note:

    This is essentially an error case; expressions that depend on the focus should not normally appear within a construct that sets the focus to absentXT.

  2. If the focus-setting container of C is an xsl:source-document instruction, then the context item type is U{document-node()}.

  3. If the focus-setting container of C is a template ruleXT, then the context item type is the match type of the match pattern of the template rule, defined below.

  4. If the focus-setting container of C is a PredicatePattern, then the context item type is U{*}.

  5. If the focus-setting container is a global variableXT declaration, the context item type is determined by the type attribute of the xsl:global-context-item declaration, defaulting to U{*}, or U{} if the xsl:global-context-item declaration specifies use="absent".

  6. If the focus-setting container is any other declarationXT, for example xsl:key or xsl:accumulator, the context item type is U{*}.

  7. Otherwise, the context item type is the static type (see 3.5 Determining the Static Type of a Construct) of the controlling operand of the focus-setting container of C.

[Definition: The match type of a patternXT is the most specific U-type that is known to match all items that the pattern can match.] The match type of a pattern is the inferred static type of the pattern’s equivalent expression, determined according to the rules in 3.5 Determining the Static Type of a Construct. For example, the match type of the pattern para[1] is U{element()}, while that of the pattern @code[.='x'] is U{attribute()}.

3.7 The Effect of Operand Usage

The effect of operand usage on streamability analysis is illustrated in the following examples:

Example: The Effect of Operand Usage on the Streamability of a Context Item Expression

Consider the following construct:

<xsl:source-document streamable="yes" href="emps.xml">
  <xsl:for-each select="*/emp">
    <xsl:value-of select="."/>
  </xsl:for-each>
</xsl:source-document>

To assess the streamability, we apply the following reasoning:

  1. The top-level construct is a sequence constructorXT. It is evaluated with a document node as the context item, and with a striding posture.

  2. The sequence constructor has one child instructionXT, which has an operand usage of transmission.

  3. The xsl:for-each instruction evaluates its select expression, with the context item and posture unchanged.

  4. The step child::* is evaluated with this context item and posture. The posture transition rules permit this; we now have a sequence of child elements, and still a striding posture.

  5. The same applies to the next step, child::emp.

  6. The content of the xsl:for-each instruction is a sequence constructorXT which itself has a single operand, the xsl:value-of instruction.

  7. The xsl:value-of instruction is evaluated once for each emp child, with that child as context item and in a striding posture. This instruction uses the general streamability rules. The operand usage of the select expression is absorption. This means that the result of the xsl:value-of instruction is grounded and consuming.

  8. The result of the trivial sequence constructor contained in the xsl:for-each instruction is therefore grounded and consuming.

  9. The result of the xsl:for-each instruction (see 12.5.18 Streamability of xsl:for-each) is therefore grounded and consuming.

  10. The result of the trivial sequence constructor contained in the xsl:source-document instruction is therefore grounded and consuming.

  11. The xsl:source-document instruction is therefore guaranteed-streamable.

Now consider a slightly different construct:

<xsl:source-document streamable="yes" href="emps.xml">
  <xsl:for-each select="*/emp">
    <xsl:sequence select="."/>
  </xsl:for-each>
</xsl:source-document>

To assess the streamability, we apply the following reasoning:

  1. The top-level construct is a sequence constructorXT. It is evaluated with a document node as the context item, and with a striding posture.

  2. The sequence constructor has one child instructionXT, which has an operand usage of transmission.

  3. The xsl:for-each instruction evaluates its select expression, with the context item and posture unchanged.

  4. The step child::* is evaluated with this context item and posture. The posture transition rules permit this; we now have a sequence of child elements, and still a striding posture.

  5. The same applies to the next step, child::emp.

  6. The content of the xsl:for-each instruction is a sequence constructorXT which itself has a single operand, the xsl:sequence instruction.

  7. The xsl:sequence instruction is evaluated once for each emp child, with that child as context item and in a striding posture. This instruction uses the general streamability rules. The operand usage of the select expression is transmission. This means that the result of the xsl:sequence instruction is striding and motionless.

  8. The result of the trivial sequence constructor contained in the xsl:for-each instruction is therefore also striding and motionless.

  9. The result of the xsl:for-each instruction (see 12.5.18 Streamability of xsl:for-each) is therefore striding and consuming (the wider of the sweeps of the select expression and the sequence constructor).

  10. The result of the trivial sequence constructor contained in the xsl:source-document instruction is therefore striding and consuming.

  11. Since the result is not grounded, the xsl:source-document instruction is therefore not guaranteed-streamable.

Expressed informally, the result of a declared-streamable xsl:source-document instruction (or of a declared-streamable template rule) must not contain streamed nodes. The reason for this is that once streamed nodes are returned to constructs that are not declared streamable and therefore have no streamability constraints, there is no way to analyze what happens to them, and thus to guarantee streamability.

 

Example: The Effect of Operand Roles on the Streamability of Path Expressions

Consider the expression .//chapter.

When this appears as an argument to the function count or exists, it can be streamed (it is a consuming expression, meaning that the subtree rooted at the context item needs to be read in order to evaluate the expression). A possible strategy for performing a streamed evaluation is to read all descendants of the context item in document order, checking each one to see whether its name is chapter. The sweep of the expression will be consuming, and its posture will be crawling.

The operand usage (the usage of the argument to count or exists) is defined as inspection. The general streamability rules show that when the posture of an operand is crawling and the operand usage is inspection, the resulting expression is grounded and consuming. This means that (in the absence of other consuming expressions) the containing template or function will generally be streamable.

In the expression tail(.//chapter), the operand usage is classified as transmission, meaning that the nodes are simply passed up the tree to the next containing expression. In general, when a crawling expression is passed as an argument and the operand usage is transmission, the containing expression will also be crawling. However, there is an exception where the expression is known to deliver a singleton (for example, head(.//chapter)). In this case the returned sequence cannot contain any nested nodes, so it is striding.

When the same expression appears as an argument to an atomizing function such as string-join, the processor knows that it will need to access the subtree of each selected chapter element in order to compute the result of the function (the argument to string-join is classified as having operand usage absorption). The processor does not know whether these subtrees will be nested (one chapter might contain another). In most cases they will not be nested, because atomizing a sequence that contains nested nodes is not generally a useful thing to do. The streamability analysis therefore makes an optimistic assumption, by treating atomization of a crawling expression as a streamable operation. In the worst case, where it turns out that the selected nodes are indeed nested, the processor must handle this, typically by buffering the content of inner nodes until the end tag of the outer nodes is reached.

This treatment of nodes in a crawling expression applies to all cases in which the content of the nodes is handled in a way defined entirely by the rules of this specification: for example, operations such as atomization, obtaining the string value of nodes, deep copy of nodes, and the deep-equal function. It does not extend to cases where the processing applied to the nodes is user-defined: for example, operations such as xsl:apply-templates, xsl:for-each, or xsl:for-each-group. In these cases, the nodes selected for processing must not be nested (a crawling posture is not permitted in these contexts).

When a crawling expression appears as an argument to a call on a user-defined function, the effect depends on the streamability category of the function, as described in 8.1 Classifying Stylesheet Functions.

3.8 Posture

The posture of a construct indicates the relationship of the nodes selected by the construct to a streamed input document. The value is one of the following:

  • [Definition: Grounded: indicates that the value returned by the construct does not contain nodes from the streamed input document]. Atomic items and function items are always grounded; nodes are grounded if it is known that they are in a non-streamed document. For example the expressions doc('x') and copy-of(.) both return grounded nodes.

  • [Definition: Climbing: indicates that streamed nodes returned by the construct are reached by navigating the parent, ancestor[-or-self], attribute, and/or namespace axes from the node at the current streaming position.] When the context posture is climbing, use of certain axes such as parent and ancestor is permitted, but use of other axes such as child or descendant violates the streamability rules.

  • [Definition: Crawling: typically indicates that streamed nodes returned by a construct are reached by navigating the descendant[-or-self] axis.] Nodes reached in this way are potentially nested (one might be an ancestor of another), so further downward navigation is not permitted. Expressions that can be statically determined to return a singleton node (for example head(.//title)) generate a result with no such nesting, so they are striding rather than crawling.

  • [Definition: Striding: indicates that the result of a construct contains a sequence of streamed nodes, in document order, that are peers in the sense that none of them is an ancestor or descendant of any other.] This is typically achieved by using one or more steps involving the child or attribute axes only. Use of the outermost function can also result in a striding posture, as can functions such as head or zero-or-one that ensure the result will be a singleton node.

  • [Definition: Roaming: indicates that the nodes returned by an expression could be anywhere in the tree, which inevitably means that the construct cannot be evaluated using streaming.] For example, the posture of an axis step using the following or preceding axis will typically be roaming, which leads the analysis to conclude that the construct is not streamable.

Note:

One way to think about the posture values is as labels for states in a finite state automaton, where the alphabet of symbols accepted by the automaton is the set of 13 XPath axes, and the sentence being parsed is a path expression containing a sequence of axis steps. For example, use of the descendant axis when the current state is striding moves the new state to crawling, and use of the parent axis then takes it to climbing.
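The automaton view can be sketched in Python as follows. This is a minimal illustration only: the table contains just the transitions mentioned in the prose of this section, not the normative table in 12.7.9 Streamability of Axis Steps, and the names TRANSITIONS and posture_after are hypothetical. Treating a disallowed step from a climbing posture as a transition to roaming is an assumption made for the sketch.

```python
# Partial transition table: (current posture, axis) -> new posture.
# Only transitions mentioned in the prose of this section are listed;
# the normative table is in 12.7.9 Streamability of Axis Steps.
TRANSITIONS = {
    ("striding", "child"): "striding",
    ("striding", "descendant"): "crawling",
    ("striding", "ancestor"): "climbing",
    ("striding", "following"): "roaming",
    ("striding", "preceding"): "roaming",
    ("crawling", "parent"): "climbing",
    ("climbing", "parent"): "climbing",
    ("climbing", "ancestor"): "climbing",
    # Assumption: a step that violates the streamability rules is
    # modelled here as a transition to roaming.
    ("climbing", "child"): "roaming",
    ("climbing", "descendant"): "roaming",
}


def posture_after(axes, start="striding"):
    """Run a sequence of axis names through the partial automaton."""
    posture = start
    for axis in axes:
        key = (posture, axis)
        if key not in TRANSITIONS:
            raise ValueError(f"transition {key} is not covered by this sketch")
        posture = TRANSITIONS[key]
    return posture


# descendant from striding gives crawling; parent then gives climbing.
print(posture_after(["descendant", "parent"]))
```

Transitions that the sketch does not cover raise an error rather than guessing a result.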

The posture of a construct is determined in one of several ways:

  • For axis steps, the posture of the expression is determined by the context posture and the choice of axis. For example, an axis step using the ancestor axis always has a posture of climbing, while an axis step using the child axis, if the context posture is striding, will itself have a posture of striding. The rules for the posture transitions produced by axis steps are given in 12.7.9 Streamability of Axis Steps.

  • For many other constructs, the posture is determined by the general streamability rules. These determine the result posture in terms of the operands of the construct and the way in which each operand is used. For example, a construct that accepts a streamed node as the value of an operand, and atomizes that node, will generally have a posture of grounded.

  • Other constructs have their own special rules, which are listed in 12 Streamability of Specific Constructs. For example, a call on the root function behaves analogously to an axis step, and is described in 12.8.21 Streamability of the root Function. Special rules are needed for:

    • Constructs that evaluate an operand more than once, such as an XPath for expression;

    • Constructs that have alternatives among their operands, such as an XPath if expression;

    • Constructs that navigate relative to the context item, such as axis steps;

    • Constructs with implicit inputs, such as the context item expression . (dot);

    • Constructs that change the focus, such as a filter expression;

    • Constructs that invoke functions or templates.

The characterization of an expression as striding, crawling, climbing, or roaming applies only to the streamed nodes in the result of the expression. The result of the expression may also contain non-streamed (grounded) nodes or atomic items. For example if /x/y is a striding expression, then (/x/y | $doc//x) is also striding, given that $doc contains non-streamed nodes. The assertion that the nodes in the result of a striding expression are in document order and are peers thus applies only to the subset of the nodes that are streamed.

Note:

A consequence of this is that when striding expressions are used in a context that requires sorting into document order, for example (/x/y | $doc//x) / @price, the fact that the expression is striding does not eliminate the need for the sequence to be re-ordered. However, there will never be a need for the relative order of the streamed nodes in the value to change.

Since the data model leaves the relative order of nodes in different trees implementation-defined, and since streamed and unstreamed nodes will necessarily be in different trees, a useful implementation strategy might be to arrange that streamed nodes always precede unstreamed nodes in document order (or vice versa). An operation that needs to process the result of a striding expression in document order can then first deliver all the streamed nodes (by consuming the input stream) in the order they arrive, and then deliver the unstreamed nodes, suitably sorted.

3.8.1 Choice Operand Groups

[Definition: For some construct kinds, one or more operand roles may be defined to form a choice operand group. This concept is used where it is known that operands are mutually exclusive (for example the then and else clauses in a conditional expression).]

[Definition: The combined posture of a choice operand group is determined by the postures of the operands in the group (the operand postures), and is the first of the following that applies:

  1. If any of the operand postures is roaming, then the combined posture is roaming.

  2. If all of the operand postures are grounded, then the combined posture is grounded.

  3. If one or more of the operand postures is climbing and the remainder (if any) are grounded, then the combined posture is climbing.

  4. If one or more of the operand postures is striding and the remainder (if any) are grounded, then the combined posture is striding.

  5. If one or more of the operand postures is crawling and each of the remainder (if any) is either striding or grounded, then the combined posture is crawling.

  6. Otherwise (for example, if the group includes both an operand with climbing posture and one with crawling posture), the combined posture is roaming.

]
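The rules for the combined posture can be expressed as a short first-match function. The following Python sketch is illustrative only; the function name combined_posture and the use of plain strings for posture values are assumptions of the sketch, not part of this specification.

```python
def combined_posture(postures):
    """Return the combined posture of a choice operand group.

    `postures` is an iterable of the operand postures, each one of
    "grounded", "climbing", "striding", "crawling", or "roaming".
    """
    seen = set(postures)
    # Rule 1: any roaming operand makes the group roaming.
    if "roaming" in seen:
        return "roaming"
    # Rules 2 to 5, tried in order: the first permitted set that
    # contains every operand posture decides the result.
    for permitted, result in (
        ({"grounded"}, "grounded"),
        ({"climbing", "grounded"}, "climbing"),
        ({"striding", "grounded"}, "striding"),
        ({"crawling", "striding", "grounded"}, "crawling"),
    ):
        if seen <= permitted:
            return result
    # Rule 6: any other mixture, such as climbing with crawling.
    return "roaming"


print(combined_posture(["striding", "grounded"]))   # striding
print(combined_posture(["climbing", "crawling"]))   # roaming
```

Because the rules are tried in order, each later rule can assume that the earlier ones did not apply. For example, by the time the crawling rule is tested, at least one operand is known to be crawling.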

3.8.2 Determining the Context Posture

Just as the type of the context item can be determined for any construct C by reference to the type of the construct that establishes the context for the evaluation of C, so the posture of the context item for C can be determined by reference to the posture of the construct that establishes the context.

The context posture of a construct C is the first of the following that applies:

  1. If the focus-setting container of C is an xsl:function declaration, an inline function declaration, or an xsl:on-completion element, then the context posture is roaming.

    Note:

    This is essentially an error case; expressions that depend on the context item should not normally appear within these constructs.

  2. If the focus-setting container of C is an xsl:source-document instruction, then the context posture is striding if the instruction is declared-streamable, or grounded otherwise.

  3. If the focus-setting container of C is a template ruleXT whose mode is declared with streamable="yes", then the context posture is striding.

  4. If the focus-setting container of C is a patternXT, then the context posture is striding.

  5. If the focus-setting container of C is an xsl:attribute-set declaration with the attribute streamable="yes", then the context posture is striding.

  6. If the focus-setting container is any other declarationXT, for example a global variable declaration, a named templateXT, or a template rule or attribute set that does not specify streamable="yes", then the context posture is roaming.

  7. Otherwise, the context posture is the posture of the controlling operand of the focus-setting container of C.
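The first-match structure of these rules can be sketched in Python. The string labels used for the kinds of focus-setting container (for example "template-rule" and "named-template") and the function name context_posture are hypothetical, chosen only for this illustration.

```python
# Hypothetical labels for declarations covered by rule 6.
OTHER_DECLARATIONS = {
    "global-variable",
    "named-template",
    "template-rule",       # reached only when not streamable
    "xsl:attribute-set",   # reached only when not streamable
}


def context_posture(container, streamable=False, controlling_posture=None):
    """Return the context posture for a construct whose focus-setting
    container is of the given kind.

    `streamable` stands for streamable="yes" on the container (or on the
    mode, for a template rule); `controlling_posture` is the posture of
    the controlling operand, used only by the final rule.
    """
    if container in ("xsl:function", "inline-function", "xsl:on-completion"):
        return "roaming"                                   # rule 1
    if container == "xsl:source-document":
        return "striding" if streamable else "grounded"    # rule 2
    if container == "template-rule" and streamable:
        return "striding"                                  # rule 3
    if container == "pattern":
        return "striding"                                  # rule 4
    if container == "xsl:attribute-set" and streamable:
        return "striding"                                  # rule 5
    if container in OTHER_DECLARATIONS:
        return "roaming"                                   # rule 6
    return controlling_posture                             # rule 7


print(context_posture("xsl:source-document", streamable=True))      # striding
print(context_posture("xsl:for-each", controlling_posture="crawling"))  # crawling
```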

3.9 Sweep

[Definition: Every construct has a sweep, which is a measure of the extent to which the current position in the input stream moves during the evaluation of the expression. The sweep is one of: motionless, consuming, or free-ranging.] This list of values is ordered: a free-ranging expression has wider sweep than a consuming expression, which has wider sweep than a motionless expression.

[Definition: A motionless construct is any construct deemed motionless by the rules for that construct (typically given in 12 Streamability of Specific Constructs).] Informally, a motionless construct is one that can be evaluated without changing the current position in the input stream.

Note:

The context item expression . is classified as motionless; however a construct that uses . as an operand (for example, string(.)) might be consuming. The streamability rules effectively consider expressions such as . within the context of the containing construct.

[Definition: A consuming construct is any construct deemed consuming by the rules for that construct (typically given in 12 Streamability of Specific Constructs).] Informally, a consuming construct is one whose evaluation requires repositioning of the input stream from the start of the current node to the end of the current node.

[Definition: A free-ranging construct is any construct deemed free-ranging by the rules for that construct (typically given in 12 Streamability of Specific Constructs).] Informally, a free-ranging construct is one whose evaluation may require access to information that is not available from the subtree rooted at the current node, together with information about ancestors of the current node and their attributes.
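Because the three sweep values are ordered, rules that refer to "the wider" or "the widest" sweep amount to taking a maximum. A minimal Python sketch follows, with the names SWEEP_ORDER and widest_sweep chosen only for illustration.

```python
# Sweep values in increasing order of width.
SWEEP_ORDER = ("motionless", "consuming", "free-ranging")


def widest_sweep(sweeps):
    """Return the widest of a non-empty collection of sweep values."""
    return max(sweeps, key=SWEEP_ORDER.index)


print(widest_sweep(["motionless", "consuming"]))  # consuming
```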

The table below shows some examples of expressions having different combinations of posture and sweep.

Combinations of Sweep and Posture

Posture    Motionless                      Consuming               Free-Ranging
Grounded   name()                          string(title)           See Note
Climbing   parent::*                       child::x/ancestor::y    See Note
Striding   @status                         child::*                See Note
Crawling   The subexpression . in //a/.    descendant::*           //x[child::y]
Roaming    See Note                        See Note                preceding::*

Note:

In all cases where either the posture is roaming, or the sweep is free-ranging, or both, the effect is to make an expression non-streamable. For convenience, therefore, evaluation of the streamability rules in most cases returns the values roaming and free-ranging only in combination with each other. In cases where the rules return a posture of roaming combined with some other sweep, or a sweep of free-ranging with some other posture, the final result of the analysis is always the same as if the expression were both roaming and free-ranging.

For an example of a case where an expression is roaming but not free-ranging, consider the right-hand operand of the relative path expression (preceding::x/.). The rules for the streamability of a context item expression (see 12.7.13 Streamability of the Context Item Expression) give . in this context a roaming posture, combined with motionless sweep. But the relative path expression as a whole is roaming and free-ranging (see 12.7.8 Streamability of Path Expressions), so the apparent inconsistency is transient.

3.10 General Streamability Rules

[Definition: Many constructs share the same streamability rules. These rules, referred to as the general streamability rules, are defined here.]

Examples of constructs that use these rules are: an arithmetic expression, an attribute value templateXT, a sequence constructorXT, the xsl:text and xsl:value-of instructions, and a call to the doc function.

The rules determine both the posture and sweep of a construct. To determine the posture and sweep of a construct C, assuming these general rules are applicable to that kind of construct:

  1. For each operand of C:

    1. Establish:

      1. The static type T of the operand (see 3.5 Determining the Static Type of a Construct).

        Note:

        The static type is a U-type. For example, the static type of the expression (@*, *) is U{element(), attribute()}.

      2. The sweep S and posture P of the operand (by applying the rules appropriate to that operand).

      3. The operand usage U corresponding to the role of the operand within C.

    2. Compute the adjusted sweep S′ of the operand by taking the first of the following that applies:

      1. If S is free-ranging or P is roaming, then S′ is free-ranging. (In this case the posture and sweep of C are roaming and free-ranging, regardless of any other operands.)

      2. If P is grounded, then S′ is S.

      3. Otherwise (P is not grounded, which implies that the operand is capable of returning streamed nodes), compute S′ as follows:

        1. Compute the adjusted usage U′ as follows:

          1. If U is absorption and the intersection of T with U{element(), document-node()} is U{} (that is, if T is a type that does not allow nodes with children), then U′ is inspection.

            Note:

            This is because the entire subtree of nodes such as text nodes is available without reading further data from the input stream.

          2. Otherwise, U′ is U.

        2. Compute the adjusted sweep S′ from the table below:

          Computing the Adjusted Sweep of an Expression

          Posture (P)   Adjusted Usage (U′)
                        Absorption     Inspection   Transmission   Navigation
          Climbing      Free-ranging   S            S              Free-ranging
          Striding      Consuming      S            S              Free-ranging
          Crawling      Consuming      S            S              Free-ranging

    3. [Definition: An operand is potentially consuming if at least one of the following conditions applies:

      1. The operand’s adjusted sweep S′ is consuming.

      2. The operand usage is transmission and the operand is not grounded.

      ]

  2. Having computed the adjusted sweep S′(o) of each operand o, the posture and sweep of C are the first of the following that applies:

    1. If C has no operands, then grounded and motionless.

    2. If any operand o has an adjusted sweep S′(o) of free-ranging, then roaming and free-ranging.

    3. If more than one operand is potentially consuming, then:

      1. If all these operands form part of a choice operand group, then the posture of C is the combined posture of the operands in this group, and the sweep of C is the widest sweep of the operands in this group.

      2. If all these operands have S′ = motionless (which necessarily means they have U′ = U = transmission) and if they all have the same posture P0, then motionless with posture P0.

        Note:

        For example, the expression (@a, @b) is motionless and striding.

      3. Otherwise, roaming and free-ranging.

    4. If exactly one operand o is potentially consuming, then:

      1. If o is a higher-order operand of C, then roaming and free-ranging.

      2. If the operand usage of o is absorption or inspection, then grounded and consuming.

      3. If the posture of o is crawling and C is a function call of a built-in function whose signature indicates a return type with a maximum cardinality of one, then striding and the adjusted sweep of o.

        Note:

        Although this rule is written in general terms, the only functions that it applies to (at the time of publication) are head, exactly-one, and zero-or-one. This rule only applies if the argument usage is transmission (other cases having been handled by earlier rules); of the built-in functions, the three functions listed are the only ones having an argument with usage transmission and a return type with maximum cardinality one.

      4. Otherwise (the operand usage of o is transmission), the posture and adjusted sweep of o.

    5. Otherwise (all operands are motionless) grounded and motionless.

Note:

The rules ensure that if more than one operand is consuming, that is, if more than one operand reads the subtree of the context node in a way that would cause the current position of the input stream to change, then the construct is not streamable.

The rules also prevent multiple streamed nodes being returned in the result of an expression if they are delivered by different operands. For example, the expression count((.., *)) is not guaranteed streamable. This is to make static analysis possible: the posture needs to be statically determined to ensure that streaming does not fail at execution time. It is permitted, however, for streamed nodes to be mixed in a sequence with non-streamed nodes or with atomic items; in this case the posture of the result will be that of the streamed nodes. It is also permitted to have multiple operands delivering streamed nodes in different branches of a conditional, provided the sweep and posture are compatible: for example if (X) then @name else name is guaranteed streamable.

Expressions that have more than one operand with usage transmission, for example (A, B), or (A | B), or insert-before(A, n, B), generally allow only one of these operands to select streamed nodes. The result of the expression will contain a mixture of streamed and grounded nodes, but its posture and sweep will be that of the streamed operand. The nodes in the result will not necessarily be in document order, but the subset of the nodes that are streamed will always be in document order.
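The general streamability rules can be modelled as a small function over operand descriptions. The Python sketch below is an informal illustration under stated assumptions, not a normative algorithm. The Operand class and all its field names are hypothetical. The static type is reduced to a single flag saying whether it excludes element and document nodes. A construct is assumed to have at most one choice operand group. The sweep of such a group is taken from the adjusted sweeps of its members.

```python
from dataclasses import dataclass

ROAMING = ("roaming", "free-ranging")
SWEEP_ORDER = ("motionless", "consuming", "free-ranging")


@dataclass
class Operand:
    posture: str                   # grounded, climbing, striding, crawling, roaming
    sweep: str                     # motionless, consuming, free-ranging
    usage: str                     # absorption, inspection, transmission, navigation
    childless: bool = False        # static type excludes element and document nodes
    higher_order: bool = False
    in_choice_group: bool = False  # assumption: at most one group per construct


def combined_posture(postures):
    seen = set(postures)
    if "roaming" in seen:
        return "roaming"
    for permitted, result in (
        ({"grounded"}, "grounded"),
        ({"climbing", "grounded"}, "climbing"),
        ({"striding", "grounded"}, "striding"),
        ({"crawling", "striding", "grounded"}, "crawling"),
    ):
        if seen <= permitted:
            return result
    return "roaming"


def adjusted_sweep(op):
    # Rule 1(b): compute S' for one operand.
    if op.sweep == "free-ranging" or op.posture == "roaming":
        return "free-ranging"
    if op.posture == "grounded":
        return op.sweep
    usage = "inspection" if op.usage == "absorption" and op.childless else op.usage
    if usage == "navigation":
        return "free-ranging"
    if usage == "absorption":
        return "free-ranging" if op.posture == "climbing" else "consuming"
    return op.sweep  # inspection or transmission


def general_rules(operands, singleton_builtin=False):
    """Return (posture, sweep) of a construct. `singleton_builtin` marks a
    built-in function call whose return type has maximum cardinality one."""
    if not operands:
        return ("grounded", "motionless")
    adjusted = [adjusted_sweep(o) for o in operands]
    if "free-ranging" in adjusted:
        return ROAMING
    # Rule 1(c): the potentially consuming operands.
    consuming = [
        (o, s) for o, s in zip(operands, adjusted)
        if s == "consuming" or (o.usage == "transmission" and o.posture != "grounded")
    ]
    if len(consuming) > 1:
        if all(o.in_choice_group for o, _ in consuming):
            group = [(o, s) for o, s in zip(operands, adjusted) if o.in_choice_group]
            posture = combined_posture(o.posture for o, _ in group)
            return (posture, max((s for _, s in group), key=SWEEP_ORDER.index))
        postures = {o.posture for o, _ in consuming}
        if all(s == "motionless" for _, s in consuming) and len(postures) == 1:
            return (postures.pop(), "motionless")
        return ROAMING
    if len(consuming) == 1:
        o, s = consuming[0]
        if o.higher_order:
            return ROAMING
        if o.usage in ("absorption", "inspection"):
            return ("grounded", "consuming")
        if o.posture == "crawling" and singleton_builtin:
            return ("striding", s)
        return (o.posture, s)
    return ("grounded", "motionless")


# price * 2: one consuming operand with usage absorption.
price = Operand("striding", "consuming", "absorption")
two = Operand("grounded", "motionless", "absorption")
print(general_rules([price, two]))  # ('grounded', 'consuming')
```

In the sketch, a roaming combined posture returned for a choice operand group is not widened to free-ranging. As noted in 3.9 Sweep, the final result of the analysis is the same either way.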

3.10.1 Examples of the General Streamability Rules

This section provides some examples of how the general streamability rules operate. In each example, the emphasis is on the outermost construct shown; explanations for how the sweep and posture of its operands are derived are not given, though in many cases they are explained in earlier examples.

The examples assume that the context item type for evaluation of the expression shown is an element node, and that its posture is striding.

  • 2 + 2 is grounded and motionless, because both the operands are grounded and motionless.

  • price * 2 is grounded and consuming, because one of the operands is consuming and the relevant operand usage is absorption.

  • price - discount is roaming and free-ranging, because both the operands are consuming (and they are not members of a choice operand group).

  • price * @discount is grounded and consuming. The left-hand operand is consuming and the corresponding operand usage is absorption, while the right-hand operand is motionless, again with an operand usage of absorption, and its item type is attribute() which changes the effective usage to inspection.

  • a/b/c is striding and consuming. This is determined not by the general streamability rules, but by the rules for path expressions in 12.7.8 Streamability of Path Expressions.

  • a//c is crawling and consuming. This is similarly determined by the rules for path expressions in 12.7.8 Streamability of Path Expressions.

  • count(a/b/c) is grounded and consuming, because the operand (the argument to the count function) is striding and consuming (see earlier example) and the operand usage is inspection.

  • sum(a/b/c) is grounded and consuming, because the operand (the argument to the sum function) is striding and consuming (see earlier example) and the operand usage is absorption.

  • count(descendant::c) is grounded and consuming, because the operand (the argument to the count function) is crawling and consuming (see earlier example) and the operand usage is inspection.

  • tail(descendant::c) is crawling and consuming. The operand is crawling, the operand usage is transmission, so the posture and sweep of the result are the same as the posture and sweep of the consuming operand.

  • unordered(a|b) is crawling and consuming. The operand (the argument to the unordered function) is crawling (see 12.7.4 Streamability of union, intersect, and except Expressions), and the operand usage is transmission, so the posture and sweep of the result are the same as the posture and sweep of the consuming operand.

  • zero-or-one(descendant::c) is striding and consuming. Although the operand is crawling, the operand usage is transmission and the cardinality of the expression is zero or one, so the posture of the result is striding. The same analysis applies to exactly-one(descendant::c) and to head(descendant::c).

  • sum(descendant::c) is grounded and consuming, because the operand (the argument to the sum function) is crawling and consuming (see earlier example) and the operand usage is absorption. In theory (although it is unlikely in practice) the selected c elements might be nested one inside another. The processor is expected to handle this situation, which may require some buffering. For example, given the untyped source document <a><c><c>1</c><c>2</c><c>3</c></c></a>, the result of the expression is 129 (123 + 1 + 2 + 3), and to evaluate this, a streaming processor will typically maintain a stack of buffers to accumulate the typed values of each of the four c elements during a single pass of the source document.

  • "Q{" || namespace-uri(.) || "}" || local-name(.) is grounded and motionless. The two literal operands are grounded and motionless because they have no operands; the two function calls are grounded and motionless because they have a single operand that is striding and motionless, with an operand usage of inspection.

  • copy-of(.)/head/following-sibling::* is grounded and consuming. The left-hand operand copy-of(.)/head is grounded and consuming because, under the rules in 12.7.8 Streamability of Path Expressions, its left-hand operand copy-of(.) is grounded and consuming. This in turn is because . is striding and motionless, and the operand usage is absorption.

  • if ($discounted) then price else discounted-price is striding and consuming, because the two branches of the conditional are both striding and consuming, and they form a choice operand group with usage transmission.

  • if ($gratis) then 0 else price is striding and consuming because there is only one consuming operand (the fact that it is part of a choice operand group does not affect the reasoning).

  • count((author, editor)) is roaming and free-ranging. The argument to the count function is an expression with two operands, both having usage=transmission, and neither being grounded.

  • count((author | editor)) is grounded and consuming. A union expression is not subject to the general streamability rules; it has its own rules, defined in 12.7.4 Streamability of union, intersect, and except Expressions, which establish in this case that the argument to the count is crawling and consuming. The count function does follow the general streamability rules, with an operand usage of inspection: under rule 1(b)(iii)(B) the adjusted sweep is consuming, and rule 2(d)(iii) then applies.

  • ('{', author, '}') is striding and consuming. Exactly one operand is consuming; it has usage transmission, so the result has the posture and sweep of that operand. (The formal analysis treats comma as a binary operator, but the same result can be obtained by treating the content of the parenthesized expression as an expression with three operands.)
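The buffering described above for sum(descendant::c) can be illustrated outside XSLT. The following non-normative Python sketch uses SAX events as a stand-in for a streamed document; the names NestedSum and streamed_sum are illustrative and are not defined by this specification. It keeps one text buffer per open c element, so that nested c elements are summed in a single pass, and it assumes that every c element has a numeric string value.

```python
import xml.sax


class NestedSum(xml.sax.ContentHandler):
    """Single-pass sum of the values of all c elements, nested or not."""

    def __init__(self):
        super().__init__()
        self.buffers = []   # stack: one text buffer per currently open c element
        self.total = 0.0

    def startElement(self, name, attrs):
        if name == "c":
            self.buffers.append([])

    def characters(self, content):
        # Text contributes to the string value of every open c element.
        for buf in self.buffers:
            buf.append(content)

    def endElement(self, name):
        if name == "c":
            # The value of this c element is complete: convert it and add it.
            self.total += float("".join(self.buffers.pop()))


def streamed_sum(xml_text):
    handler = NestedSum()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total


print(streamed_sum("<a><c><c>1</c><c>2</c><c>3</c></c></a>"))  # 129.0
```

The stack never holds more buffers than the nesting depth of the c elements, which is why the memory needed depends on the depth of nesting rather than on the size of the document.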

4 Streaming Source Documents

The xsl:source-document instruction reads a source document whose URI is supplied, and processes the content of the document by evaluating the contained sequence constructorXT. The streamable attribute (default "no") allows streamed processing to be requested.

For example, if a document represents a book holding a sequence of chapters, then the following code can be used to split the book into multiple XML files, one per chapter, without holding the entire book in memory at one time:

<xsl:source-document streamable="yes" href="book.xml">
  <xsl:for-each select="book">             
    <xsl:for-each select="chapter">
      <xsl:result-document href="chapter{position()}.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:for-each>  
</xsl:source-document>

The stream-available function can be used to determine whether a particular document is available for streamed processing: see [XSLT 4.0] section 18.2 fn:stream-available.

4.1 Streamability of xsl:source-document

The xsl:source-document instruction is guaranteed-streamable if both the following conditions are satisfied:

  1. The instruction is declared-streamable, by specifying streamable="yes".

  2. The contained sequence constructorXT is grounded, as assessed using the streamability analysis in 3 Streamability Analysis Principles. The consequences of being or not being guaranteed streamable depend on the processor conformance level, and are explained in 3.1 Streamability Guarantees.

Note:

The rules for guaranteed streamability ensure that the sequence constructor (and therefore the xsl:source-document instruction) cannot return any nodes from a streamed document. For example, it cannot contain the instruction <xsl:sequence select="//chapter"/>. If nodes from this document are to be returned, they must first be copied, for example by using the xsl:copy-of instruction or by calling the copy-of or snapshot functions.

Because the xsl:source-document instruction cannot (if it satisfies the rules for guaranteed streamability) return nodes from the streamed document, any nodes it does return will be conventional (unstreamed) nodes that can be processed without restriction. For example, if xsl:source-document is invoked within a stylesheet functionXT f:firstChapter, and the sequence constructor consists of the instruction <xsl:copy-of select="//chapter"/>, then the calling code can manipulate the resulting chapter elements as ordinary trees rooted at parentless element nodes.

If the sequence constructor in an xsl:source-document instruction were to return nodes from the document for which streaming has been requested, the instruction would not be guaranteed streamable. Processors that support the streaming feature would then not be required to process it in a streaming manner, and this specification imposes no restrictions on the processing of the nodes returned. (The ability of a streaming processor to handle such stylesheets in a streaming manner might, of course, depend on how the nodes returned are processed, but those details are out of scope for this specification.)

4.2 Examples of xsl:source-document

The xsl:source-document instruction can be used to initiate processing of a document using streaming with a variety of coding styles, illustrated in the examples below.

Example: Using xsl:source-document with Aggregate Functions

The following example computes the number of transactions in a transaction file:

Input:

<transactions>
  <transaction value="12.51"/>
  <transaction value="3.99"/>
</transactions>

Stylesheet code:

<xsl:source-document streamable="yes" href="transactions.xml">
  <count>
    <xsl:value-of select="count(transactions/transaction)"/>
  </count>
</xsl:source-document>

Result:

<count>2</count>

Analysis:

  1. The literal result element count has the same sweep as the xsl:value-of instruction.

  2. The xsl:value-of instruction has the same sweep as its select expression.

  3. The call to count has the same sweep as its argument.

  4. The argument to count is a RelativePathExpr. Under the rules in 12.7.8 Streamability of Path Expressions, this expression is striding and consuming. The call on count is therefore grounded and consuming.

  5. The entire body of the xsl:source-document instruction is therefore grounded and consuming.

The following example computes the highest transaction value in the same input file:

<xsl:source-document streamable="yes" href="transactions.xml">
  <maxValue>
    <xsl:value-of select="max(transactions/transaction/@value)"/>
  </maxValue>
</xsl:source-document>

Result:

<maxValue>12.51</maxValue>

Analysis:

  1. The literal result element maxValue has the same sweep as the xsl:value-of instruction.

  2. The xsl:value-of instruction has the same sweep as its select expression.

  3. The call to max has the same sweep as its argument.

  4. The argument to max is a RelativePathExpr whose two operands are the RelativePathExpr transactions/transaction and the AxisStep @value. The left-hand operand transactions/transaction has striding posture. The right-hand operand @value, given that the context posture is striding, is motionless. The RelativePathExpr argument to max is therefore consuming.

  5. The entire body of the xsl:source-document instruction is therefore consuming.

To compute both the count and the maximum value in a single pass over the input, several approaches are possible. The simplest is to use maps (map constructors are exempt from the usual rule that multiple downward selections are not allowed):

<xsl:source-document streamable="yes" href="transactions.xml">
  <xsl:variable name="tally" select="{ 'count': count(transactions/transaction), 
                                          'max': max(transactions/transaction/@value) }"/>
  <value count="{ $tally('count') }" max="{ $tally('max') }"/>
</xsl:source-document>

Other options include the use of xsl:fork, or multiple xsl:accumulator declarations, one for each value to be computed.
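As an analogy for what a single pass computing both values involves, the following non-normative Python sketch reads the transaction file incrementally and updates both results as each transaction element is completed. It is not a description of how any XSLT processor evaluates the map constructor; the function name count_and_max is illustrative, and the input is assumed to have the structure shown earlier in this example.

```python
import io
import xml.etree.ElementTree as ET


def count_and_max(xml_bytes):
    """Return the number of transactions and the largest @value in one pass."""
    count = 0
    highest = None
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "transaction":
            count += 1
            value = float(elem.get("value"))
            highest = value if highest is None else max(highest, value)
        elem.clear()  # discard content that is no longer needed
    return {"count": count, "max": highest}


doc = b"""<transactions>
  <transaction value="12.51"/>
  <transaction value="3.99"/>
</transactions>"""
print(count_and_max(doc))  # {'count': 2, 'max': 12.51}
```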

 

Example: Using xsl:source-document with xsl:for-each to Process a Collection of Input Documents

This example displays a list of the chapter titles extracted from each book in a collection of books.

Each input document is assumed to have a structure such as:

<book>
  <chapter number-of-pages="18">
    <title>The first chapter of book A</title>
    ...
  </chapter>
  <chapter number-of-pages="15">
    <title>The second chapter of book A</title>
    ...
  </chapter>
  <chapter number-of-pages="12">
    <title>The third chapter of book A</title>
    ...
  </chapter>
</book>

Stylesheet code:

<chapter-titles>
  <xsl:for-each select="uri-collection('books')">
    <xsl:source-document streamable="yes" href="{.}">
      <xsl:for-each select="book">
        <xsl:for-each select="chapter">
           <title><xsl:value-of select="title"/></title>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:source-document>
  </xsl:for-each>
</chapter-titles>

Output:

<chapter-titles>
  <title>The first chapter of book A</title>
  <title>The second chapter of book A</title>
  ...
  <title>The first chapter of book B</title>
  ...
</chapter-titles>

Note:

This example uses the function uri-collection to obtain the document URIs of all the documents in a collection, so that each one can be processed in turn using xsl:source-document.

 

Example: Using xsl:source-document with xsl:iterate

This example assumes that the input is a book with multiple chapters, as shown in the previous example, with the page count for each chapter given as an attribute of the chapter. The transformation determines the starting page number for each chapter by accumulating the page counts for previous chapters, and rounding up to an odd number if necessary.

<chapter-start-page>
   <xsl:source-document streamable="yes" href="book.xml">
      <xsl:iterate select="book/chapter">
         <xsl:param name="start-page" select="1"/>
         <chapter title="{title}" start-page="{ $start-page }"/>
         <xsl:next-iteration>
            <xsl:with-param name="start-page" 
                            select="$start-page + @number-of-pages + 
                                      (@number-of-pages mod 2)"/>
         </xsl:next-iteration>
      </xsl:iterate>
   </xsl:source-document>
</chapter-start-page>

Output:

<chapter-start-page>
  <chapter title="The first chapter of book A" start-page="1"/>
  <chapter title="The second chapter of book A" start-page="19"/>
  <chapter title="The third chapter of book A" start-page="35"/>
  ...
</chapter-start-page>
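The arithmetic carried by the start-page parameter can be checked with the following non-normative Python sketch, in which the function name start_pages is illustrative. It applies the same expression as the xsl:with-param element to the page counts 18, 15, and 12 from the sample input.

```python
def start_pages(page_counts):
    """Yield the starting page of each chapter, given each chapter's page count."""
    start = 1  # initial value of the start-page parameter
    for pages in page_counts:
        yield start
        # Same arithmetic as the xsl:with-param expression. Adding
        # (pages mod 2) makes the increment even, so the start page stays odd.
        start = start + pages + pages % 2


print(list(start_pages([18, 15, 12])))  # [1, 19, 35]
```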

 

Example: Using xsl:source-document with xsl:for-each-group

This example assumes that the input is a book with multiple chapters, and that each chapter belongs to a part, which is present as an attribute of the chapter (for example, chapters 1-4 might constitute Part 1, the next three chapters forming Part 2, and so on):

<book>
  <chapter part="1">
    <title>The first chapter of book A</title>
    ...
  </chapter>
  <chapter part="1">
    <title>The second chapter of book A</title>
    ...
  </chapter>
  ...
  <chapter part="2">
    <title>The fifth chapter of book A</title>
    ...
  </chapter>
</book>

The transformation copies the full text of the chapters, creating an extra level of hierarchy for the parts.

<book>
   <xsl:source-document streamable="yes" href="book.xml">
      <xsl:for-each select="book">
         <xsl:for-each-group select="chapter" group-adjacent="data(@part)">
            <part number="{current-grouping-key()}">
               <xsl:copy-of select="current-group()"/>
            </part>
         </xsl:for-each-group>
      </xsl:for-each>
   </xsl:source-document>
</book>

Output:

<book>
  <part number="1">
    <chapter part="1">
      <title>The first chapter of book A</title>
      ...
    </chapter>
    <chapter part="1">
      <title>The second chapter of book A</title>
      ...
    </chapter>
    ...
  </part>
  <part number="2">
    <chapter part="2">
      <title>The fifth chapter of book A</title>
      ...
    </chapter>
    ...
  </part>
</book>
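Grouping by adjacency suits streamed input because a group is complete as soon as an item with a different key arrives, so only the current group needs to be tracked. The following non-normative Python sketch illustrates the same grouping behaviour using itertools.groupby, which also groups adjacent items only; the function name group_into_parts and the (part, title) tuples are illustrative.

```python
from itertools import groupby


def group_into_parts(chapters):
    """chapters: iterable of (part, title) pairs in document order."""
    return [
        (part, [title for _part, title in run])
        for part, run in groupby(chapters, key=lambda chapter: chapter[0])
    ]


chapters = [("1", "first"), ("1", "second"), ("2", "fifth")]
print(group_into_parts(chapters))
# [('1', ['first', 'second']), ('2', ['fifth'])]
```

As with group-adjacent, chapters that share a part number but are not adjacent fall into separate groups.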

 

Example: Using xsl:source-document with xsl:apply-templates

This example copies an XML document while deleting all the ednote elements at any level of the tree, together with their descendants. This example is a complete stylesheet, which is intended to be evaluated by nominating main as the initial named templateXT. The use of on-no-match="shallow-copy" in the xsl:mode declaration means that the built-in template rule copies nodes unchanged, except where overridden by a user-defined template rule.

<xsl:transform version="3.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:mode name="delete-ednotes" streamable="yes" 
                                on-no-match="shallow-copy"/>

<xsl:template name="main">
   <xsl:source-document streamable="yes" href="book.xml">
      <xsl:apply-templates mode="delete-ednotes"/>
   </xsl:source-document>
</xsl:template>

<xsl:template match="ednote" mode="delete-ednotes"/>

</xsl:transform>

Additional template rules could be added to process other elements and attributes in the same pass through the data: for example, to modify the value of a last-updated attribute (wherever it appears) to the current date and time, the following rule suffices:

<xsl:template match="@last-updated" mode="delete-ednotes">
  <xsl:attribute name="last-updated" select="current-dateTime()"/>
</xsl:template>

5 Streamable Templates

A template rule that is applicableXT to a mode M is guaranteed-streamable if and only if all the following conditions are satisfied:

  1. Mode M is declared in an xsl:mode declaration that specifies streamable="yes".

  2. The patternXT defined in the match attribute of the xsl:template element is a motionless pattern as defined in 12.9 Classifying Patterns.

  3. The sweep of the sequence constructorXT forming the body of the xsl:template element is either motionless or consuming.

  4. The type-adjusted posture of the sequence constructorXT forming the body of the xsl:template element, with respect to the U-type that corresponds to the declared return type of the template (defaulting to item()*), is grounded.

    Note:

    This means that either (a) the sequence constructor is grounded as written (that is, it does not return streamed nodes), or (b) it effectively becomes grounded because the declared result type of the template is atomic, leading to implicit atomization of the result.

  5. Every expressionXT and contained sequence constructorXT in a contained xsl:param element (the construct that provides the default value of the parameter) is motionless.

Specifying streamable="yes" on an xsl:mode declaration declares an intent that every template rule to which that mode is applicableXT (explicitly or implicitly, including by specifying #all), should be streamable, either because it is guaranteed-streamable, or because it takes advantage of streamability extensions offered by a particular processor. The consequences of declaring the mode to be streamable when there is such a template rule that is not guaranteed streamable depend on the conformance level of the processor, and are explained in 3.1 Streamability Guarantees.

Processing of a document using streamable templates may be initiated using code such as the following, where S is a mode declared with streamable="yes":

<xsl:source-document streamable="yes" href="bigdoc.xml">
  <xsl:apply-templates mode="S"/>
</xsl:source-document>

Alternatively, streamed processing may be initiated by invoking the transformation with an initial modeXT declared as streamable, while supplying the initial match selectionXT (in an implementation-defined way) as a streamed document.

Note:

Invoking a streamable template using the construct <xsl:apply-templates select="doc('bigdoc.xml')"/> does not ensure streamed processing. As always, processors may use streamed processing if they are able to do so, but when the doc or document functions are used, processors are obliged to ensure that the results are deterministic, which may be difficult to reconcile with streaming (if the same document is read twice, the results must be identical). The use of xsl:source-document with streamable="yes" does not offer the same guarantees of determinism.

For an example of processing a collection of documents by use of the function uri-collection in conjunction with xsl:source-document, see 4.2 Examples of xsl:source-document.

6 Streaming with xsl:iterate

The xsl:iterate instruction plays an important role in streamed applications, because it allows an application (by means of the xsl:iterate parameters) to remember selected information as elements from a streamed source document are processed.

Note:

An alternative way of achieving this is with streamed accumulators: see 11 Streamable Accumulators.

The examples below use the xsl:iterate instruction in conjunction with the xsl:source-document instruction. This is not the only way of using xsl:iterate, but it illustrates the way in which the two features can be combined to achieve streaming of a large input document.

Example: Using xsl:iterate to Compute Cumulative Totals

Suppose that the input XML document has this structure:

<transactions>
  <transaction date="2008-09-01" value="12.00"/>
  <transaction date="2008-09-01" value="8.00"/>
  <transaction date="2008-09-02" value="-2.00"/>
  <transaction date="2008-09-02" value="5.00"/>
</transactions>

and that the requirement is to transform this to:

<account>
  <balance date="2008-09-01" value="12.00"/>
  <balance date="2008-09-01" value="20.00"/>
  <balance date="2008-09-02" value="18.00"/>
  <balance date="2008-09-02" value="23.00"/>
</account>

This can be achieved using the following code, which is designed to process the transaction file using streaming:

<account>
  <xsl:source-document streamable="yes" href="transactions.xml">
    <xsl:iterate select="transactions/transaction">
      <xsl:param name="balance" select="0.00" as="xs:decimal"/>
      <xsl:variable name="newBalance" 
                    select="$balance + xs:decimal(@value)"/>
      <balance date="{@date}" value="{format-number($newBalance, '0.00')}"/>
      <xsl:next-iteration>
        <xsl:with-param name="balance" select="$newBalance"/>
      </xsl:next-iteration>
    </xsl:iterate>
  </xsl:source-document>
</account>

The following example modifies this by only outputting the information for the first day’s transactions:

<account>
  <xsl:source-document streamable="yes" href="transactions.xml">
    <xsl:iterate select="transactions/transaction">
      <xsl:param name="balance" select="0.00" as="xs:decimal"/>
      <xsl:param name="prevDate" select="()" as="xs:date?"/>
      <xsl:variable name="newBalance" 
                    select="$balance + xs:decimal(@value)"/>
      <xsl:variable name="thisDate" 
                    select="xs:date(@date)"/>
      <xsl:choose>
        <xsl:when test="empty($prevDate) or $thisDate eq $prevDate">
          <balance date="{ $thisDate }" 
                   value="{ format-number($newBalance, '0.00') }"/>
          <xsl:next-iteration>
            <xsl:with-param name="balance" select="$newBalance"/>
            <xsl:with-param name="prevDate" select="$thisDate"/>
          </xsl:next-iteration>
        </xsl:when>
        <xsl:otherwise>
          <xsl:break/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:source-document>
</account>

The following code outputs the balance only at the end of each day, together with the final balance:

<account>
  <xsl:source-document streamable="yes" href="transactions.xml">
    <xsl:iterate select="transactions/transaction">
      <xsl:param name="balance" select="0.00" as="xs:decimal"/>
      <xsl:param name="prevDate" select="()" as="xs:date?"/>
      <xsl:on-completion>
        <balance date="{ $prevDate }" 
                 value="{ format-number($balance, '0.00') }"/>
      </xsl:on-completion>     
      <xsl:variable name="newBalance" 
                    select="$balance + xs:decimal(@value)"/>
      <xsl:variable name="thisDate" select="xs:date(@date)"/>
      <xsl:if test="exists($prevDate) and $thisDate ne $prevDate">
        <balance date="{ $prevDate }" 
                 value="{ format-number($balance, '0.00') }"/>
      </xsl:if>
      <xsl:next-iteration>
        <xsl:with-param name="balance" select="$newBalance"/>
        <xsl:with-param name="prevDate" select="$thisDate"/>
      </xsl:next-iteration>     
    </xsl:iterate>
  </xsl:source-document>
</account>

If the sequence of transactions is empty, this code outputs a single element: <balance date="" value="0.00"/>.
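The control flow of the end-of-day example, including the behaviour for empty input, can be traced with the following non-normative Python sketch. The function name end_of_day_balances and the (date, value) tuples are illustrative; the two local variables play the role of the xsl:iterate parameters, and the final append corresponds to xsl:on-completion.

```python
from decimal import Decimal


def end_of_day_balances(transactions):
    """transactions: iterable of (date, value) string pairs in document order."""
    balance = Decimal("0.00")   # the balance parameter
    prev_date = None            # the prevDate parameter, initially empty
    out = []
    for date, value in transactions:
        # Emit the balance carried forward when the date changes.
        if prev_date is not None and date != prev_date:
            out.append((prev_date, f"{balance:.2f}"))
        balance += Decimal(value)
        prev_date = date
    # Corresponds to xsl:on-completion: always emit the final balance.
    out.append((prev_date or "", f"{balance:.2f}"))
    return out


sample = [
    ("2008-09-01", "12.00"),
    ("2008-09-01", "8.00"),
    ("2008-09-02", "-2.00"),
    ("2008-09-02", "5.00"),
]
print(end_of_day_balances(sample))
# [('2008-09-01', '20.00'), ('2008-09-02', '23.00')]
print(end_of_day_balances([]))
# [('', '0.00')]
```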

 

Example: Collecting Multiple Values in a Single Pass

Problem: Given a sequence of employee elements, find the employees having the highest and lowest salary, while processing each employee only once.

Solution:

<xsl:source-document streamable="yes" href="employees.xml">
    <xsl:iterate select="employees/employee">
        <xsl:param name="highest" as="element(employee)*"/>
        <xsl:param name="lowest" as="element(employee)*"/>
        <xsl:on-completion>
            <highest-paid-employees>
                <xsl:value-of select="$highest/name"/>
            </highest-paid-employees>
            <lowest-paid-employees>
                <xsl:value-of select="$lowest/name"/>
            </lowest-paid-employees>  
        </xsl:on-completion>
        <xsl:variable name="this" select="copy-of()"/>
        <xsl:variable name="is-new-highest" as="xs:boolean"
            select="empty($highest[@salary ge current()/@salary])"/>
        <xsl:variable name="is-equal-highest" as="xs:boolean" 
            select="exists($highest[@salary eq current()/@salary])"/> 
        <xsl:variable name="is-new-lowest" as="xs:boolean" 
            select="empty($lowest[@salary le current()/@salary])"/>
        <xsl:variable name="is-equal-lowest" as="xs:boolean" 
            select="exists($lowest[@salary eq current()/@salary])"/> 
        <xsl:variable name="new-highest-set" as="element(employee)*"
            select="if ($is-new-highest) then $this
            else if ($is-equal-highest) then ($highest, $this)
            else $highest"/>
        <xsl:variable name="new-lowest-set" as="element(employee)*"
            select="if ($is-new-lowest) then $this
            else if ($is-equal-lowest) then ($lowest, $this)
            else $lowest"/>
        <xsl:next-iteration>
            <xsl:with-param name="highest" select="$new-highest-set"/>
            <xsl:with-param name="lowest" select="$new-lowest-set"/>
        </xsl:next-iteration>
    </xsl:iterate>
</xsl:source-document>

If the input sequence is empty, this code outputs an empty highest-paid-employees element and an empty lowest-paid-employees element.
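The selection logic can be sketched in Python as follows. This is a non-normative illustration: the function name highest_and_lowest and the (name, salary) tuples are illustrative, and salaries are assumed to be numbers that are compared numerically. The two lists correspond to the highest and lowest parameters, and each employee is visited exactly once.

```python
def highest_and_lowest(employees):
    """employees: iterable of (name, salary) pairs; salary is numeric."""
    highest, lowest = [], []
    for employee in employees:
        salary = employee[1]
        # All retained employees share one salary, so comparing with the
        # first entry is equivalent to comparing with every entry.
        if not highest or salary > highest[0][1]:
            highest = [employee]          # new highest: restart the set
        elif salary == highest[0][1]:
            highest.append(employee)      # equal highest: extend the set
        if not lowest or salary < lowest[0][1]:
            lowest = [employee]
        elif salary == lowest[0][1]:
            lowest.append(employee)
    return [name for name, _ in highest], [name for name, _ in lowest]


staff = [("Ann", 50), ("Bob", 70), ("Cy", 70), ("Di", 30)]
print(highest_and_lowest(staff))  # (['Bob', 'Cy'], ['Di'])
```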

 

Example: Processing the Last Item in a Sequence Specially

When streaming, it is not possible to determine whether the item being processed is the last in a sequence without reading ahead. The last function therefore cannot be used in guaranteed-streamableSG code. The xsl:iterate instruction provides a solution to this problem.

Problem: render the last paragraph in a section in some special way, for example by using bold face. (The actual rendition is achieved by processing the paragraph with mode last-para.)

The solution uses xsl:iterate together with the copy-of function to maintain a one-element look-ahead by explicit coding:

<xsl:template match="section" mode="streaming">
   <xsl:iterate select="para">
     <xsl:param name="prev" select="()" as="element(para)?"/>
     <xsl:on-completion>
       <xsl:apply-templates select="$prev" mode="last-para"/>      
     </xsl:on-completion>
     <xsl:if test="$prev">
       <xsl:apply-templates select="$prev"/>
     </xsl:if>
     <xsl:next-iteration>
       <xsl:with-param name="prev" select="copy-of(.)"/>
     </xsl:next-iteration>
   </xsl:iterate>
</xsl:template>
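The one-element look-ahead used in this template is a general technique. The following non-normative Python sketch shows it in isolation: each item is held back until the next one arrives, so that the final item can be flagged without knowing the length of the input in advance. The function name with_last_flag is illustrative.

```python
def with_last_flag(items):
    """Yield (item, is_last) pairs using a one-item look-ahead buffer."""
    have_prev = False
    prev = None
    for item in items:
        if have_prev:
            yield prev, False      # a later item exists, so prev is not last
        prev, have_prev = item, True
    if have_prev:
        yield prev, True           # input exhausted: prev was the last item


print(list(with_last_flag(iter(["p1", "p2", "p3"]))))
# [('p1', False), ('p2', False), ('p3', True)]
```

The held-back item plays the role of the prev parameter, and the final yield corresponds to the xsl:on-completion instruction.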

7 Handling Empty Input

The three instructions xsl:where-populated, xsl:on-empty, and xsl:on-non-empty were introduced to the language explicitly to make it easier to generate result trees conditionally depending on what is found in the input, without violating the rules for streamability. These facilities are available whether or not streaming is in use. This section provides examples of how they can be used to process streamed input.

The specific problem tackled by these instructions arises when empty input is to be processed differently from non-empty input: for example, outputting the message “There are no orders today” rather than an empty list of orders, or outputting a subtotal only when the list of items to be totaled is non-empty. The conventional approach to this involves writing something like:

<xsl:choose>
   <xsl:when test="empty(item)">There are no orders today!</xsl:when>
   <xsl:otherwise><xsl:apply-templates select="item"/></xsl:otherwise>
</xsl:choose>

but this is not streamable, because there are two instructions that process the child item elements.

The alternative formulation using xsl:on-empty is fully streamable:

<xsl:sequence>
   <xsl:apply-templates select="item"/>   
   <xsl:on-empty>There are no orders today!</xsl:on-empty>
</xsl:sequence>

The examples that follow illustrate how to use these instructions when writing streamable stylesheet code.

Example: Generating a Wrapper Element for a non-Empty Sequence

The following example generates an events element if and only if there are one or more event elements. The code could be written like this:

<xsl:if test="exists(event)">
  <events>
    <xsl:copy-of select="event"/>
  </events>
</xsl:if>

However, the above code would not be guaranteed-streamable, because it processes the child event elements more than once. To make it streamable, it can be rewritten as:

<xsl:where-populated>
  <events>
    <xsl:copy-of select="event"/>
  </events>
</xsl:where-populated>

The effect of the xsl:where-populated instruction is to avoid outputting the events element if it would have no children. A streaming implementation will typically hold the start tag of the events element in a buffer, to be sent to the output destination only if and when a child node is generated.
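The buffering strategy just described can be sketched as follows. This non-normative Python illustration is deliberately simplified: it models only element children, it uses a list as a stand-in for the output destination, and the names DeferredElement and emit_events are illustrative. The start tag is written only when the first child arrives, and the end tag only if the start tag was written.

```python
class DeferredElement:
    """Wrapper element whose start tag is held back until a child is written."""

    def __init__(self, name, out):
        self.name = name
        self.out = out          # list standing in for the output destination
        self.opened = False

    def write_child(self, markup):
        if not self.opened:
            self.out.append(f"<{self.name}>")   # release the buffered start tag
            self.opened = True
        self.out.append(markup)

    def close(self):
        if self.opened:
            self.out.append(f"</{self.name}>")


def emit_events(events):
    out = []
    wrapper = DeferredElement("events", out)
    for event in events:
        wrapper.write_child(f"<event>{event}</event>")
    wrapper.close()
    return "".join(out)


print(emit_events(["a", "b"]))  # <events><event>a</event><event>b</event></events>
print(repr(emit_events([])))    # ''
```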

 

Example: Generating a Header and Footer only if there is Content

The following example generates an h1 element and a summary paragraph only if a list of items is non-empty. The code could be written like this:

<xsl:if test="exists(item-for-sale)">
  <h1>Items for Sale</h1>
</xsl:if>  
<xsl:apply-templates select="item-for-sale"/>
<xsl:if test="exists(item-for-sale)">
  <p>Total value: {accumulator-before('total-value')}</p>
</xsl:if>

However, the above code would not be guaranteed-streamable, because it processes the child item-for-sale elements more than once. To make it streamable, it can be rewritten as:

<xsl:sequence>
  <xsl:on-non-empty>
    <h1>Items for Sale</h1>
  </xsl:on-non-empty>  
  <xsl:apply-templates select="item-for-sale"/>
  <xsl:on-non-empty>
    <p>Total value: {accumulator-before('total-value')}</p>
  </xsl:on-non-empty>  
</xsl:sequence>

The effect of the xsl:on-non-empty instruction is to output the enclosed content only if the containing sequence constructor also generates “ordinary” content, that is, if there is content generated by instructions other than xsl:on-empty and xsl:on-non-empty instructions.

 

Example: Generating Substitute Text when there is no Content

The following example generates a summary paragraph only if a list of items is empty. The code could be written like this:

<xsl:apply-templates select="item-for-sale"/>
<xsl:if test="empty(item-for-sale)">
  <p>There are no items for sale.</p>
</xsl:if>

However, the above code would not be guaranteed-streamable, because it processes the child item-for-sale elements more than once (the fact that the list is empty is irrelevant, because streamability is determined statically). To make the code streamable, it can be rewritten as:

<xsl:sequence>
  <xsl:apply-templates select="item-for-sale"/>
  <xsl:on-empty>
    <p>There are no items for sale.</p>
  </xsl:on-empty>
</xsl:sequence>

The effect of the xsl:on-empty instruction is to output the enclosed content only if the containing sequence constructor generates no “ordinary” content, that is, if there is no content generated by instructions other than xsl:on-empty and xsl:on-non-empty instructions.

Note:

In some cases, similar effects can be achieved by using the has-children function, which tests whether an element has child nodes without consuming the children. However, use of has-children has the drawback that the function is unselective: it cannot be used to test whether there are any children of relevance to the application. In particular, it returns true if an element contains comments or whitespace text nodes that the application might consider to be insignificant.

Note:

There are no special streamability rules for the three instructions xsl:where-populated, xsl:on-empty, or xsl:on-non-empty. The general streamability rules apply. In many cases the xsl:on-empty and xsl:on-non-empty instructions will generate content that does not depend on the source document, and they will therefore be motionless, but this is not required.

Example: Generating an HTML list

The following example generates an HTML unnumbered list, if and only if the list is non-empty. Note that the presence of the class attribute does not make the list non-empty. The code is written to be streamable.

<xsl:where-populated expand-text="yes">
  <ul class="my-list">
    <xsl:for-each select="source-item">
       <li>{.}</li>
    </xsl:for-each>
  </ul>
</xsl:where-populated>

 

Example: A More Complex Example

This example shows how the three instructions xsl:where-populated, xsl:on-empty, and xsl:on-non-empty may be combined.

The example generates a table containing the names and ages of a set of students; if there are no students, it substitutes a paragraph explaining this.

<div id="students" xsl:expand-text="yes">
<xsl:where-populated>
   <table>
      <xsl:on-non-empty>
         <thead>
            <tr><th>Name</th><th>Age</th></tr>
         </thead>
      </xsl:on-non-empty>
      <xsl:where-populated>
         <tbody>
            <xsl:for-each select="student/copy-of()">
               <tr>
                  <td>{name}</td>
                  <td>{age}</td>
               </tr>
            </xsl:for-each>
         </tbody>
      </xsl:where-populated>
   </table>
</xsl:where-populated>
<xsl:on-empty>
   <p>There are no students</p>
</xsl:on-empty>
</div>

Explanation:

  • The xsl:where-populated around the table element ensures that if there is no thead and no tbody, then there will be no table.

  • The xsl:on-non-empty surrounding the thead element ensures that the thead element is not output unless the tbody element is output.

  • The xsl:where-populated around the tbody element ensures that the tbody element is not output unless there is at least one table row (tr).

  • The xsl:on-empty around the p element ensures that if no table is output, then the paragraph There are no students is output instead.

7.1 Streamed Evaluation of xsl:on-empty and xsl:on-non-empty

The following non-normative algorithm explains one possible strategy for streamed evaluation of a sequence constructorXT containing xsl:on-empty and/or xsl:on-non-empty instructions.

The algorithm makes use of the following mutable variables:

  • L : a list of instructions awaiting evaluation. Initially empty.

  • R : a list of items to act as the result of the evaluation. Initially empty.

  • F : a boolean flag, initially false, to indicate whether any non-vacuousXT items have been written to R by ordinary instructions. The term ordinary instruction means any node in the sequence constructor other than an xsl:on-empty or xsl:on-non-empty instruction.

The algorithm is as follows:

  1. The nodes in the sequence constructor are evaluated in document order.

  2. When an xsl:on-non-empty instruction is encountered, then:

    1. If F is true, the instruction is evaluated and the result is appended to R.

    2. Otherwise, the instruction is appended to L.

  3. When an ordinary instruction is evaluated:

    1. The results of the evaluation are appended to R, in order.

    2. When a non-vacuousXT item is about to be appended to R, and F is false, then before appending the item to R, the following actions are taken:

      1. Any xsl:on-non-empty instructions in L are evaluated, in order, and their results are appended to R.

      2. F is set to true.

  4. When an xsl:on-empty instruction is encountered, then:

    1. If F is true, the instruction is ignored.

    2. Otherwise, the existing contents of R are discarded, the instruction is evaluated, and its results are appended to R.

      Note:

      The need to discard items from R arises only when all the items in R are vacuousXT. Streaming implementations may therefore need a limited amount of buffering to retain insignificant items until it is known whether they will be needed. However, in many common cases an optimized implementation will be able to discard vacuousXT items such as empty text nodes immediately, because when a node is being constructed using the rules in [XSLT 4.0] section 5.7.1 Constructing Complex Content or [XSLT 4.0] section 5.7.2 Constructing Simple Content, such items have no effect on the final outcome.

  5. The result of the sequence constructor is the list of items in R.
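
Example: Tracing the algorithm for a simple sequence constructor

As an informal illustration, consider the following sequence constructor. The element names list, heading, item, and empty-list are hypothetical, and the xsl prefix is assumed to be bound to the XSLT namespace. The comments sketch how L, R, and F evolve; they do not add any rule.

```xml
<list>
  <!-- Step 2: F is false at this point, so this instruction is appended to L -->
  <xsl:on-non-empty>
    <heading>Items</heading>
  </xsl:on-non-empty>
  <!-- Step 3: before the first copied item is appended to R, the deferred
       instruction in L is evaluated and F is set to true -->
  <xsl:copy-of select="item"/>
  <!-- Step 4: ignored if F is true; otherwise R is discarded and replaced
       by the result of this instruction -->
  <xsl:on-empty>
    <empty-list/>
  </xsl:on-empty>
</list>
```

If the context node has at least one item child, the content of the list element is the heading followed by the copied items. If it has none, xsl:copy-of writes nothing to R, F remains false, and the content is the single empty-list element.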

8 Streamable Stylesheet Functions

The streamability attribute of xsl:function is used to assign the function to one of a number of streamability categories. The various categories, and their effect on the streamability of function calls, are described in 8.1 Classifying Stylesheet Functions.

The streamability category of a function characterizes the way in which the function processes any streamed nodes supplied in the first argument to the function. (In general, streamed nodes cannot be supplied in other arguments, unless they are atomized by the coercion rulesXT.) The streamability attribute is therefore not applicable unless the function takes at least one argument.

[ERR XTSE3155] It is a static error if an xsl:function element with no xsl:param children has a streamability attribute with any value other than unclassified.

8.1 Classifying Stylesheet Functions

Under specific conditions, described in this section, a stylesheet function can be used to process nodes from a streamed input document.

[Definition: Stylesheet functions belong to one of a number of streamability categories: the choice of category characterizes the way in which the function handles streamed input.]

The category to which a function belongs is declared in the streamability attribute of the xsl:function declaration, and defaults to unclassified.

The streamability categories defined in this specification are: unclassified, absorbing, inspection, filter, shallow-descent, deep-descent, and ascent. It is also possible to specify the streamability category as a QName in an implementation-defined namespace, in which case the streamability rules are implementation-defined; a processor that does not recognize a category defined in this way must analyze the function as if streamability="unclassified" were specified.

A stylesheet function is declared-streamable if the xsl:function declaration has a streamability attribute with a value other than unclassified.

The only category permitted for a zero-arity function (one with no arguments) is unclassified. All function calls to zero-arity stylesheet functions are grounded and motionless.
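
Example: A zero-arity stylesheet function

The following sketch uses a hypothetical function f:default-limit, with the prefixes f and xs assumed to be bound as in the other examples in this section. The first declaration is permitted; the second, shown commented out, would be expected to raise XTSE3155 because there is no first argument for the category to describe.

```xml
<!-- Permitted: no streamability attribute, so the category is unclassified -->
<xsl:function name="f:default-limit" as="xs:integer">
  <xsl:sequence select="1000"/>
</xsl:function>

<!-- Not permitted (XTSE3155): a category other than unclassified
     on a function with no xsl:param children
<xsl:function name="f:default-limit" as="xs:integer" streamability="absorbing">
  <xsl:sequence select="1000"/>
</xsl:function>
-->
```

A call such as f:default-limit() is grounded and motionless wherever it appears.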

In general (subject to more detailed rules below), a node belonging to a streamed document can be present within the value of an argument of a call on a stylesheet functionXT only if one of the following conditions is true:

  1. The stylesheet function is declared-streamable, and the argument in question is the first argument of the function call.

  2. The corresponding function parameterXT is declared with a required typeXT that triggers atomizationXT of any supplied node.

[Definition: The first parameterXT of a declared-streamable stylesheet functionXT is referred to as a streaming parameter.]

Note:

If a stylesheet function returns streamed nodes, then these nodes can only derive from streamed nodes passed in an argument to the function. This is because streamed nodes cannot be bound to global variables, and they cannot be returned by an xsl:source-document instruction within the function body (the result of xsl:source-document is always grounded).

The choice of category places constraints on the function body, and also on calls to the function. These constraints are defined below, separately for each category. A function is guaranteed-streamable only if the constraints are satisfied, and a static function call is guaranteed-streamable only if the function is guaranteed-streamable and the function call itself satisfies the constraints for the chosen category.

Dynamic function calls are guaranteed-streamable only in trivial cases, for example where the function signature indicates that an argument is required to be a text node or an attribute node. For details, see 12.7.11 Streamability of Dynamic Function Calls.

The constraints on the function body are expressed in terms of the posture and sweep of the function result. The posture and sweep of the function result are the type-adjusted posture and sweep of the sequence constructorXT contained within the xsl:function element, given the declared return type of the function, which defaults to item()*.

Note:

Determining the posture and sweep of the function result requires first determining the posture and sweep of the contained sequence constructorXT, which is done according to the rules in 12.4 Classifying Sequence Constructors. This in turn will usually involve examination of variable references that are bound to the function’s parameters. The analysis of these variable references is described in 12.7.12 Streamability of Variable References.

If the function is declared-streamable but does not satisfy the constraints that make it guaranteed-streamable, the consequences are explained in 3.1 Streamability Guarantees.

If a stylesheet function is overridden in another package (using xsl:override), then the overriding stylesheet function must belong to the same streamability category as the function that it overrides. This ensures that overriding a function cannot affect the streamability of calls to that function.
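
Example: Overriding a declared-streamable function

The following sketch assumes a hypothetical library package named http://example.com/regions that declares a public function f:large-regions with the signature shown and with streamability="filter". The package name, the function, and the threshold are illustrative only.

```xml
<xsl:use-package name="http://example.com/regions">
  <xsl:override>
    <!-- The overriding function keeps the category of the function it
         overrides, so calls analyzed against the original remain valid -->
    <xsl:function name="f:large-regions" as="element(region)?"
                                         streamability="filter">
      <xsl:param name="input" as="element(region)"/>
      <xsl:sequence select="$input[@size gt 5000]"/>
    </xsl:function>
  </xsl:override>
</xsl:use-package>
```

Declaring the overriding function with a different category, for example streamability="absorbing", would not satisfy the rule above.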

The rules for each streamability category are given in the following sections.

8.1.1 Streamability Category: unclassified

Informal description: Functions in this category cannot be called with streamed nodes supplied in an argument, unless the function signature causes such nodes to be atomized.

Rules for the function signature: there are no constraints.

Rules for the function body: there are no constraints.

Rules for references to the streaming parameter: not applicable, because there is no streaming parameter.

Rules for function calls: the general streamability rules apply. The operands are the expressions appearing in the argument list of the function call, with the operand usage of each operand being the type-determined usage based on the declared type of the corresponding parameter in the function signature.

Example: An unclassified stylesheet function that accepts nodes

The streamability category is unclassified.

<xsl:function name="f:exclude-first" as="node()*">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:sequence select="$nodes[not(node-name() = preceding-sibling::*/node-name())]"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable if and only if the sequence supplied as the value of the $nodes argument is grounded (that is, it contains no streamed nodes).
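
One way to satisfy this condition, sketched below under the assumption of a hypothetical source document records.xml with record elements beneath a records root, is to ground each record using copy-of before calling the function.

```xml
<xsl:source-document href="records.xml" streamable="yes">
  <!-- copy-of(.) delivers an in-memory copy of each record, so the
       context item inside the xsl:for-each is grounded -->
  <xsl:for-each select="records/record/copy-of(.)">
    <!-- The argument selects children of a grounded node, so it is grounded -->
    <xsl:sequence select="f:exclude-first(*)"/>
  </xsl:for-each>
</xsl:source-document>
```

By contrast, a call such as f:exclude-first(records/record/*) made directly on the streamed document would supply streamed nodes to an operand whose usage is navigation, and would not be guaranteed streamable.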

 

Example: An unclassified stylesheet function that accepts atomic items

The streamability category is unclassified.

<xsl:function name="f:min" as="xs:integer">
  <xsl:param name="arg0" as="xs:integer"/>
  <xsl:param name="arg1" as="xs:integer"/>
  <xsl:sequence select="min(($arg0, $arg1))"/>
</xsl:function>

The effect of the rules is that a call to this function is streamable under similar circumstances to those that apply to a binary operator such as +. For example, a call is streamable if two atomic items are supplied, or if two attribute nodes are supplied, whether from streamed or unstreamed documents. The main constraint is that it is not permitted for both arguments to be consuming; for example, if the context node is a node in a streamed document, then the function call f:min(price, discount) would not be guaranteed streamable.

8.1.2 Streamability Category: absorbing

Informal description: Functions in this category typically read the subtrees rooted at the node or nodes supplied in the first argument. These subtrees must not overlap each other. The function must not return any streamed nodes.

Rules for the function signature: there are no constraints.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be grounded, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless or consuming.

Rules for references to the streaming parameter: If the declared type of the streaming parameter permits more than one node, then a variable reference referring to the streaming parameter is striding and consuming. Otherwise such a variable reference is striding and motionless.

Rules for function calls: If the first argument is crawling then the function call is roaming and free-ranging; otherwise the general streamability rules apply. The operands are the expressions appearing in the argument list of the function call. The operand usage of the first argument is absorption; the operand usage of other arguments is the type-determined usage based on the declared type of the corresponding parameterXT in the function signature.

Note:

Absorbing functions perform an operation analogous to atomization on their supplied arguments, in that they typically use information from the subtree rooted at a node to compute atomic items. Atomization can be seen as a special case of absorption. Calls on absorbing functions are therefore, from a streamability point of view, equivalent to calls on functions that implicitly atomize the supplied nodes.

An important difference, however, is that whereas atomization can be applied to any argument of a function call, absorption applies only to the first argument.

Another difference is that atomization is allowed on a sequence of nodes in crawling posture, whereas generalized absorption is not. Within a sequence, there may be nodes whose subtrees overlap, and the code for atomization is expected to handle this, but more general absorption operations are not. To write a function that accepts streamed nodes and atomizes them, it is better to use the streamability category unclassified, and to declare the first argument with an atomic type, rather than using the category absorbing which allows more general processing, but restricts what can be supplied in the argument to the function call.

Example: An absorbing stylesheet function

The following function is declared as absorbing, and the function body meets the rules for this category because it makes downward selections only, and returns an atomic item.

<xsl:function name="f:count-descendants" as="xs:integer" streamability="absorbing">
  <xsl:param name="input" as="node()*"/>
  <xsl:sequence select="count($input//*)"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed-streamable provided that the sequence supplied as the value of the $input argument is motionless or consuming, and is either grounded or striding.
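
As a sketch of a call that meets this condition, assume a hypothetical source document book.xml whose root element book has chapter children.

```xml
<xsl:source-document href="book.xml" streamable="yes">
  <!-- book/chapter is striding and consuming, so the call is
       grounded and consuming -->
  <chapter-elements>
    <xsl:value-of select="f:count-descendants(book/chapter)"/>
  </chapter-elements>
</xsl:source-document>
```

A call such as f:count-descendants(//section) would not be guaranteed-streamable, because the argument is crawling: the selected section elements could be nested within one another, so their subtrees might overlap.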

 

Example: An absorbing stylesheet function with two arguments

The following function is declared as absorbing, and the function body meets the rules for this category because it makes downward selections only from the node supplied as the first argument, and returns an atomic item.

<xsl:function name="f:compare-size" as="xs:integer" streamability="absorbing">
  <xsl:param name="input0" as="node()"/>
  <xsl:param name="input1" as="node()"/>
  <xsl:sequence select="count($input0//*) - count($input1//*)"/>
</xsl:function>

This function takes two nodes as its arguments. Some examples of function calls include:

  • Streamable: f:compare-size(a, b) where a is an element in a streamed document and b is an element in an unstreamed document

  • Streamable: f:compare-size(a, b) where a and b are both elements in unstreamed documents

  • Not streamable: f:compare-size(a, b) where a is an element in an unstreamed document and b is an element in a streamed document

The reason for the asymmetry is that for the first argument the operand usage is absorption, while for the second argument it is navigation. It is a consequence of the general streamability rules that when streamed nodes are supplied to an operand with usage navigation, the resulting expression is roaming and free-ranging.

 

Example: A recursive absorbing stylesheet function

The following function is declared as absorbing, and the function body meets the rules for this category. Analysis of the function body reveals that it is grounded and consuming; to establish this, it is necessary to analyze the recursive call f:outline(*), and this is possible because it is known to be a call on an absorbing stylesheet function.

<xsl:function name="f:outline" as="xs:string" streamability="absorbing">
  <xsl:param name="input" as="element()*"/>
  <xsl:value-of select="$input ! (name() || '(' || f:outline(*) || ')')" 
                separator=", "/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable in the typical case where the sequence supplied as the value of the $input argument is striding and consuming.

8.1.3 Streamability Category: inspection

Informal description: Functions in this category typically return properties of the node supplied in the first argument, where these properties can be determined without advancing the input stream. This allows access to properties such as the name and type of each node, and also to its ancestors, attributes, and namespaces.

Rules for the function signature: If the declared type of the streaming parameter permits more than one node, the function is not guaranteed-streamable.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be grounded, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless.

Rules for references to the streaming parameter: Such a variable reference is striding and motionless.

Rules for function calls: the general streamability rules apply. The operands are the expressions appearing in the argument list of the function call. The operand usage of the first argument is inspection; the operand usage of other arguments is the type-determined usage based on the declared type of the corresponding parameter in the function signature.

Note:

The streaming parameter is restricted to be a single node because if $input were a sequence of nodes, then an expression such as ($input/name(), $input/@id) would not be streamable.

Example: Example of an inspection stylesheet function

The following function is declared with category inspection, and the function body meets the rules for this category because all references to the supplied node are motionless.

<xsl:function name="f:depth" as="xs:integer" streamability="inspection">
  <xsl:param name="input" as="node()"/>
  <xsl:sequence select="count($input/ancestor-or-self::*)"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable provided that the expression supplied as the value of the $input argument is motionless or consuming.
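
A typical use, sketched here with a hypothetical mode named outline and hypothetical section elements, is to call the function on the context node of a streamable template rule before processing its children.

```xml
<xsl:mode name="outline" streamable="yes" on-no-match="shallow-skip"/>

<xsl:template match="section" mode="outline">
  <!-- f:depth(.) is grounded and motionless, so the only consuming
       operand in this template is xsl:apply-templates -->
  <section depth="{f:depth(.)}">
    <xsl:apply-templates mode="outline"/>
  </section>
</xsl:template>
```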

 

Example: Example of an inspection stylesheet function with two arguments

The following function is declared with category inspection, and the function body meets the rules for this category because the function signature ensures that the second argument cannot be a node.

<xsl:function name="f:get-attribute-value" as="xs:string" streamability="inspection">
  <xsl:param name="element" as="node()"/>
  <xsl:param name="attribute-name" as="xs:string"/>
  <xsl:sequence select="string($element/@*[local-name() = $attribute-name])"/>
</xsl:function>

Although the normal usage of this function might be to supply an element from a streamed document as the first argument, and a literal string as the second, it is also permissible (and guaranteed streamable) to supply an unstreamed element as the first argument, and an element node from a streamed document as the second. When applying the general streamability rules in this case, the first operand is grounded and motionless, while the second is grounded and consuming (by virtue of the rules for type-determined usage), and this makes the function call grounded and consuming.

8.1.4 Streamability Category: filter

Informal description: Functions in this category typically return either the node supplied in the first argument or nothing, depending on the values of properties that can be determined without advancing the input stream. This allows access to properties such as the name and type of each node, and also to its ancestors, attributes, and namespaces.

Rules for the function signature: If the declared type of the streaming parameter permits more than one node, the function is not guaranteed-streamable.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be striding, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless.

Rules for references to the streaming parameter: Such a variable reference is striding and motionless.

Rules for function calls: The posture and sweep of a call to a function in this category are determined by applying the general streamability rules. The operands are the expressions supplied as arguments to the function call. The first argument has operand usage transmission; any further arguments have type-determined usage based on the declared type of the corresponding parameter in the function signature.

Example: Example of a filtering stylesheet function

The following function is declared as filtering, and the function body meets the rules for this category because it selects nodes from the input based on motionless properties (namely, the existence of attributes).

<xsl:function name="f:large-regions" as="element(region)?" streamability="filter">
  <xsl:param name="input" as="element(region)"/>
  <xsl:sequence select="$input[@size gt 1000]"/>
</xsl:function>

The effect of the rules is that the posture and sweep of a function call f:large-regions(EXPR) are the same as the posture and sweep of EXPR.

Although the name filter suggests that the result must always be a subset of the input, this is not strictly required by the rules. The function can also return atomic items, as well as attribute and namespace nodes.
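
As a sketch of a call on f:large-regions, assume a hypothetical source document regions.xml whose root element regions has region children.

```xml
<xsl:source-document href="regions.xml" streamable="yes">
  <xsl:for-each select="regions/region">
    <!-- "." is striding and motionless, and so is the function call;
         xsl:copy-of then absorbs the returned node, if any -->
    <xsl:copy-of select="f:large-regions(.)"/>
  </xsl:for-each>
</xsl:source-document>
```

Because the call has the same posture and sweep as its argument, it can be used wherever the context item expression itself could be used.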

8.1.5 Streamability Category: shallow-descent

Informal description: Functions in this category typically return children of the nodes supplied in the first argument. They may also select deeper in the subtrees of these nodes, provided that no node in the result can possibly be an ancestor of any other node in the result.

Rules for the function signature: If the declared type of the streaming parameter permits more than one node, the function is not guaranteed-streamable.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be striding, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless or consuming.

Rules for references to the streaming parameter: Such a variable reference is striding and motionless.

Rules for function calls: The rules are as follows, in order:

  1. Let T0 be the U-type corresponding to the declared type of the streaming parameter in the function signature (defaulting to U{*}).

  2. Let P0 and S0 be the type-adjusted posture and sweep of the first argument expression, based on type T0.

  3. If P0 is not striding or grounded, the function call is roaming and free-ranging.

  4. Consider a construct C whose operands are the argument expressions other than the first argument, with type-determined operand usage based on the declared type of the corresponding parameter in the function signature. Let P1 and S1 be the posture and sweep of C, assessed using the general streamability rules.

    Note:

    If there is only one argument, then P1 is grounded and S1 is motionless.

  5. If P1 is not grounded, the function call is roaming and free-ranging.

  6. If S0 and S1 are both consuming, or if either is free-ranging, then the function call is roaming and free-ranging.

  7. If P0 is grounded, then the posture of the function call is grounded, and the sweep of the function call is the wider of S0 and S1.

  8. Otherwise, the posture of the function call is P0, and the sweep of the function call is as follows:

    1. If the intersection of T0 with U{document-node(), element()} is empty (that is, the declared type of the first argument does not permit document or element nodes) then S0.

    2. Let A be the static type of the expression supplied as the first argument. If the intersection of A with U{document-node(), element()} is empty (that is, the inferred type of the expression supplied as the first argument does not permit document or element nodes) then S0.

    3. Otherwise, consuming.

Example: A shallow-descent stylesheet function

The following function is declared as shallow-descent, and the function body meets the rules for this category because it selects children of the supplied input node.

<xsl:function name="f:alternate-children" as="node()*" 
                                          streamability="shallow-descent">
  <xsl:param name="input" as="element()"/>
  <xsl:sequence select="$input/node()[position() mod 2 = 1]"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable in the typical case where the node supplied as the value of the $input argument is striding and consuming.
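
The following sketch shows such a call in a streamable template rule; the element name list is hypothetical.

```xml
<xsl:mode streamable="yes"/>

<xsl:template match="list">
  <list>
    <!-- P0 is striding and S0 is motionless; the declared type and the
         static type of "." both permit element nodes, so by rule 8 the
         call is striding and consuming -->
    <xsl:apply-templates select="f:alternate-children(.)"/>
  </list>
</xsl:template>
```

The call can therefore be used in the same way as a child-axis step such as node().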

8.1.6 Streamability Category: deep-descent

Informal description: Functions in this category typically return descendants of the nodes supplied in the first argument.

Rules for the function signature: If the declared type of the streaming parameter permits more than one node, the function is not guaranteed-streamable.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be crawling, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless or consuming.

Rules for references to the streaming parameter: Such a variable reference is striding and motionless.

Rules for function calls: The rules are as follows, in order:

  1. Let T0 be the U-type corresponding to the declared type of the streaming parameter in the function signature (defaulting to U{*}).

  2. Let P0 and S0 be the type-adjusted posture and sweep of the first argument expression, based on type T0.

  3. If P0 is not striding or grounded, the function call is roaming and free-ranging.

  4. Consider a construct C whose operands are the argument expressions other than the first argument, with type-determined operand usage based on the declared type of the corresponding parameter in the function signature. Let P1 and S1 be the posture and sweep of C, assessed using the general streamability rules.

    Note:

    If there is only one argument, then P1 is grounded and S1 is motionless.

  5. If P1 is not grounded, the function call is roaming and free-ranging.

  6. If S0 and S1 are both consuming, or if either is free-ranging, the function call is roaming and free-ranging.

  7. If P0 is grounded, then the posture of the function call is grounded, and the sweep of the function call is the wider of S0 and S1.

  8. Otherwise, the posture of the function call is crawling, and the sweep of the function call is as follows:

    1. If the intersection of T0 with U{document-node(), element()} is empty (that is, the declared type of the first argument does not permit document or element nodes) then S0.

    2. Let A be the static type of the expression supplied as the first argument. If the intersection of A with U{document-node(), element()} is empty (that is, the inferred type of the expression supplied as the first argument does not permit document or element nodes) then S0.

    3. Otherwise, consuming.

Example: A deep-descent stylesheet function

The following function is declared as deep-descent, and the function body meets the rules for this category because it selects descendants of the supplied input node.

<xsl:function name="f:all-comments" as="comment()*" 
                                    streamability="deep-descent">
  <xsl:param name="input" as="element()"/>
  <xsl:sequence select="$input//comment()"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable in the typical case where the node supplied as the value of the $input argument is striding and consuming.
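
The following sketch shows such a call in a template rule that is assumed to belong to a streamable mode; the element names chapter and comments are hypothetical.

```xml
<xsl:template match="chapter">
  <comments>
    <!-- The call is crawling and consuming. xsl:value-of atomizes the
         selected comment nodes, and atomization is described in 8.1.2 as
         permitted for a sequence in crawling posture -->
    <xsl:value-of select="f:all-comments(.)" separator=" | "/>
  </comments>
</xsl:template>
```

Passing the result to an absorbing stylesheet function instead would make the expression roaming and free-ranging, since such functions do not accept a crawling first argument.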

8.1.7 Streamability Category: ascent

Informal description: Functions in this category typically return ancestors of the nodes supplied in the first argument.

Rules for the function signature: If the declared type of the streaming parameter permits more than one node, the function is not guaranteed-streamable.

Rules for the function body: For the function to be guaranteed-streamable, the type-adjusted posture of the function body with respect to the declared return type must be either climbing or grounded, and the type-adjusted sweep of the function body with respect to the declared return type must be motionless.

Rules for references to the streaming parameter: Such a variable reference is climbing and motionless.

Rules for function calls: The posture and sweep of a call to a function in this category are determined as follows:

  1. Let P0 and S0 be the posture and sweep obtained by assessing the function call using the general streamability rules, where the operands are the arguments to the function call, with an operand usage for the first argument of transmission, and an operand usage for arguments after the first being the type-determined usage based on the declared type of the corresponding function parameterXT.

  2. If P0 is roaming or S0 is free-ranging, then the function call is roaming and free-ranging.

  3. If S0 is not motionless, then the function call is roaming and free-ranging.

  4. If P0 is roaming, then the function call is roaming and free-ranging.

  5. If P0 is grounded, then the function call is grounded and motionless.

  6. If the declared return type of the function does not permit nodes, then the function call is grounded and motionless.

  7. Otherwise, the function call is climbing and motionless.

Example: An ascending stylesheet function

The following function is declared with category ascent, and the function body meets the rules for this category because it selects ancestors of the supplied node.

<xsl:function name="f:containing-section" as="element(section)" 
                                          streamability="ascent">
  <xsl:param name="input" as="element(para)"/>
  <xsl:sequence select="$input/ancestor::section[last()]"/>
</xsl:function>

The effect of the rules is that a call to this function is guaranteed streamable provided that the node supplied as the value of the $input argument is not roaming or free-ranging. There are no other constraints on the supplied node.
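
The following sketch shows such a call in a template rule that is assumed to belong to a streamable mode. It also assumes that every para element has an enclosing section element carrying a hypothetical id attribute.

```xml
<xsl:template match="para">
  <!-- The call is climbing and motionless; selecting an attribute of the
       returned ancestor and atomizing it is also motionless -->
  <para section-id="{f:containing-section(.)/@id}">
    <xsl:apply-templates/>
  </para>
</xsl:template>
```

The only consuming operand in the template is the xsl:apply-templates instruction.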

9 Streamable Attribute Sets

An attribute setXT may be designated as streamable by including the attribute streamable="yes" on each xsl:attribute-set declaration making up the attribute set. If any xsl:attribute-set declaration for an attribute set has the attribute streamable="yes", then every xsl:attribute-set declaration for that attribute set must have the attribute streamable="yes".

An attribute setXT is guaranteed-streamable if all the following conditions are satisfied:

  1. Every xsl:attribute-set declaration for the attribute set has the attribute streamable="yes".

  2. Every xsl:attribute-set declaration for the attribute set is grounded and motionless according to the analysis in 9.1 Classifying Attribute Sets.

Specifying streamable="yes" on an xsl:attribute-set element declares an intent that the attribute set should be streamable, either because it is guaranteed-streamable, or because it takes advantage of streamability extensions offered by a particular processor. The consequences of declaring the attribute set to be streamable when it is not in fact guaranteed streamable depend on the conformance level of the processor, and are explained in 3.1 Streamability Guarantees.

[ERR XTSE0730] If an xsl:attribute-set element specifies streamable="yes" then every attribute set referenced in its use-attribute-sets attribute (if present) must also specify streamable="yes".

Note:

It is common for attribute sets to create attributes with constant values, and such attribute sets will always be grounded and motionless and therefore streamable. Although such cases are fairly simple for a processor to detect, references to attribute sets are not guaranteed streamable unless the attribute set is declared with the attribute streamable="yes", which should therefore be used if interoperable streaming is required.
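
Example: Streamable attribute sets with constant values

The following sketch uses hypothetical attribute set names. Both declarations create attributes with constant values, and both specify streamable="yes", as required because the second references the first.

```xml
<xsl:attribute-set name="table-defaults" streamable="yes">
  <xsl:attribute name="border">1</xsl:attribute>
  <xsl:attribute name="class" select="'data'"/>
</xsl:attribute-set>

<!-- Omitting streamable="yes" from table-defaults would violate XTSE0730 here -->
<xsl:attribute-set name="wide-table" use-attribute-sets="table-defaults"
                   streamable="yes">
  <xsl:attribute name="width">100%</xsl:attribute>
</xsl:attribute-set>
```

Neither declaration refers to the context item, so each xsl:attribute instruction is motionless.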

9.1 Classifying Attribute Sets

The posture of an attribute setXT is always grounded (its result can never return streamed nodes).

The sweep of an attribute setXT is motionless if all the following conditions hold:

  1. Every xsl:attribute instruction within the declarations comprising the attribute set is motionless when assessed as described in 9 Streamable Attribute Sets, using a context posture of striding.

  2. Every attribute set referenced in the use-attribute-sets attribute of an xsl:attribute-set declaration of the attribute set has the attribute streamable="yes".

If the sweep of an attribute setXT is not motionless then it is free-ranging.

Note:

Attribute sets will always be grounded, because they return newly constructed attribute nodes.

Attribute sets will very often be motionless, but if they access the context item, they may be free-ranging. Although some attribute sets could theoretically be classified as consuming, this option has been excluded because it is unlikely to be useful; given the requirement to create attributes whose values are obtained by reading a streamed input document, use of a streamable template ruleXT is a more versatile approach.

Because attribute sets can be overridden in another packageXT, the streamability of a construct such as an xsl:element instruction containing a use-attribute-sets attribute is based on the declared streamability of the named attribute sets, as defined by the streamable attribute of the xsl:attribute-set element. If streamable="yes" is specified, then there is a requirement that any overriding attribute set should also specify streamable="yes", and a streaming processor is required to check that an attribute set containing such a declaration does in fact satisfy the streamability rules.

10 Streamable Merging

Any input to a merging operation, provided it is selected by means of the xsl:merge-source element with a for-each-source attribute, may be designated as streamable by including the attribute streamable="yes" on the xsl:merge-source element.

When streamable="yes" is specified on an xsl:merge-source element, the expression appearing in the select attribute is implicitly used as the argument of a call on the snapshot function. This applies whether or not streamed processing is actually used, and whether or not the processor supports streaming. As a result, merge keys for each selected node are computed with reference to this snapshot, and the current-merge-group function, when used within the xsl:merge-action sequence constructor, delivers snapshots of the selected nodes.

Note:

There are therefore no constraints on the navigation that may be performed in computing the merge key, or in the course of evaluating the xsl:merge-action body. An attempt to navigate outside the portion of the source document delivered by the snapshot function will typically not cause an error, but will return empty results.

There is no rule to prevent the select expression returning atomic items, or grounded nodes from a different source document, or newly constructed nodes, but they are still processed using the snapshot function.

Because the snapshot copies accumulator values as described in [XSLT 4.0] section 19.10 Copying Accumulator Values, the functions accumulator-before and accumulator-after may be used to gain access to information that is not directly available in the nodes that are present within each snapshot (for example, information in a header section of the merge input document).

An xsl:merge-source element is guaranteed-streamable if it satisfies all the following conditions:

  1. The xsl:merge-source element has the attribute value streamable="yes";

  2. The for-each-source attribute is present on that xsl:merge-source element;

  3. The expression in the select attribute of that xsl:merge-source element, assessed with a context posture of striding and a context item type of U{document-node()}, has striding or grounded posture and motionless or consuming sweep;

  4. The sort-before-merge attribute of that xsl:merge-source element is either absent or takes its default value of no.

Specifying streamable="yes" on an xsl:merge-source element declares an intent that the xsl:merge instruction should be streamable with respect to that particular source, either because it is guaranteed-streamable, or because it takes advantage of streamability extensions offered by a particular processor. The consequences of declaring the instruction to be streamable when it is not in fact guaranteed-streamable depend on the conformance level of the processor, and are explained in 3.1 Streamability Guarantees.
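
As an informal sketch of how the four conditions apply, consider the two merge sources below. The file name orders.xml and the element and attribute names are hypothetical. The first source satisfies all four conditions. The second also declares streamable="yes" but requests sorting before the merge, so it fails condition 4 and is not guaranteed-streamable.

<!-- Satisfies conditions 1 to 4: select is striding and consuming -->
<xsl:merge-source for-each-source="'orders.xml'"
                  select="orders/order"
                  streamable="yes">
   <xsl:merge-key select="@date"/>
</xsl:merge-source>

<!-- Fails condition 4: sort-before-merge is not "no" -->
<xsl:merge-source for-each-source="'orders.xml'"
                  select="orders/order"
                  streamable="yes"
                  sort-before-merge="yes">
   <xsl:merge-key select="@date"/>
</xsl:merge-source>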

Example: Streamed Merging

The following example merges two log files, processing each of them using streaming.

<events>
   <xsl:merge>
      <xsl:merge-source for-each-source="'log-file-1.xml'" 
                        select="/events/event" 
                        streamable="yes">
         <xsl:merge-key select="@timestamp"/>
      </xsl:merge-source>
      <xsl:merge-source for-each-source="'log-files-2.xml'" 
                        select="/log/day/record" 
                        streamable="yes">
         <xsl:merge-key select="dateTime(../@date, time)"/>
      </xsl:merge-source>
      <xsl:merge-action>
         <events time="{current-merge-key()}">
            <xsl:copy-of select="current-merge-group()"/>
         </events>   
      </xsl:merge-action>
   </xsl:merge>
</events>

Note that the merge key for the second merge source includes data from a child element of the selected element and also from an attribute of the parent element. This works because the merge key is evaluated on the result of implicitly applying the snapshot function.

Example: Merging XML and non-XML Data

The following example merges two log files, one in text format and one in XML format.

<events>
   <xsl:merge>
      <xsl:merge-source name="fax" 
                        select="unparsed-text-lines('fax-log.txt')">
         <xsl:merge-key select="xs:dateTime(substring-before(., ' '))"/>
      </xsl:merge-source>
      <xsl:merge-source name="mail"
                        for-each-source="'mail-log.xml'" 
                        select="/log/day/message" 
                        streamable="yes">
         <xsl:merge-key select="dateTime(../@date, @time)"/>
      </xsl:merge-source>
      <xsl:merge-action>
         <messages at="{current-merge-key()}">
            <xsl:where-populated>
               <fax>
                  <xsl:for-each select="current-merge-group('fax')">
                     <message xsl:expand-text="true">{
                        substring-after(., ' ')
                     }</message>
                  </xsl:for-each>   
               </fax>
               <mail>
                  <xsl:sequence select="current-merge-group('mail')/*"/>
               </mail>
            </xsl:where-populated>   
         </messages>   
      </xsl:merge-action>
   </xsl:merge>
</events>

11 Streamable Accumulators

Accumulators were introduced to the XSLT language specifically with the needs of streaming applications in mind. They allow information from a streamed source document (for example, the contents of a header element) to be retained for use when subsequent elements in the document are processed.

The capture attribute introduced in XSLT 4.0 (see [XSLT 4.0] section 19.9 Capturing Accumulators) further increases the power of this approach.

An accumulator is guaranteed-streamable if it satisfies all the following conditions:

  1. The xsl:accumulator declaration has the attribute streamable="yes" (that is, it is declared-streamable).

  2. In every contained xsl:accumulator-rule, the pattern in the match attribute is a motionless pattern (see 12.9 Classifying Patterns).

  3. The expression in the initial-value attribute is grounded and motionless.

  4. In an xsl:accumulator-rule with phase="start" (the default value), the type-adjusted posture and sweep of the expression in the select attribute or the contained sequence constructor, with respect to the declared type of the accumulator, is grounded and motionless.

  5. In an xsl:accumulator-rule with phase="end", one of the following conditions holds:

    1. The rule has capture="no" (the default value), and the type-adjusted posture and sweep of the expression in the select attribute or the contained sequence constructor, with respect to the declared type of the accumulator, is grounded and motionless.

    2. The rule has capture="yes".
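
A minimal sketch of an accumulator intended to meet these conditions follows. The accumulator name item-count and the element name item are hypothetical. The match pattern is a simple element name with no predicates, the initial value is a literal, and the select expression refers only to the variable $value, so neither expression reads the streamed input.

<xsl:accumulator name="item-count" as="xs:integer"
                 initial-value="0" streamable="yes">
   <!-- phase="start" is the default; $value is the current accumulator value -->
   <xsl:accumulator-rule match="item" select="$value + 1"/>
</xsl:accumulator>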

Specifying streamable="yes" on an xsl:accumulator element declares an intent that the accumulator should be streamable, either because it is guaranteed-streamable, or because it takes advantage of streamability extensions offered by a particular processor. The consequences of declaring the accumulator to be streamable when it is not in fact guaranteed-streamable depend on the conformance level of the processor, and are explained in 3.1 Streamability Guarantees.

When an accumulator is declared to be streamable, the stylesheet author must ensure that the accumulator function accumulator-after is only called at appropriate points in the processing, as explained in 12.8.1 Streamability of the accumulator-after Function.

For constructs that use accumulators to be guaranteed-streamable:

12 Streamability of Specific Constructs

12.1 Maps and Streaming

Maps have many uses, but their introduction to XSLT 3.0 was strongly motivated by streaming use cases. In essence, when a source document is processed in streaming mode, data that is encountered in the course of processing may need to be retained in variables for subsequent use, because the nodes cannot be revisited. This creates a need for a flexible data structure to accommodate such temporary data, and maps were designed to fulfil this need.

The entries in a map are not allowed to contain references to streamed nodes. This is achieved by ensuring that for all constructs that supply content to be included in a map (for example the third argument of map:put, and the select attribute of xsl:map-entry), the relevant operand is defined to have operand usage navigation. Because maps cannot contain references to streamed nodes, they are effectively grounded, and can therefore be used freely in contexts (such as parameters to functions or templates) where only grounded operands are permitted.

The xsl:map instruction, and the XPath MapConstructor construct, are exceptions to the general rule that during streaming, only one downward selection (one consuming subexpression) is permitted. They share this characteristic with xsl:fork. As with xsl:fork, a streaming processor is expected to be able to construct the map during a single pass of the streamed input document, which may require multiple expressions to be evaluated in parallel.

In the case of the xsl:map instruction, this exemption applies only in the case where the instruction consists exclusively of xsl:map-entry (and xsl:fallback) children, and not in more complex cases where the map entries are constructed dynamically (for example using a control flow implemented using xsl:choose, xsl:for-each, or xsl:call-template). Such cases may, of course, be streamable if they only have a single consuming subexpression.

For example, the following XPath expression is streamable, despite making two downward selections:

let $m := { 'price': xs:decimal(price), 'discount': xs:decimal(discount) } 
return ($m?price - $m?discount)

Analysis:

  1. Because the return clause is motionless, the sweep of the let expression is the sweep of the map expression (the expression in curly brackets).

  2. The sweep of a map expression is the maximum sweep of its key/value pairs.

  3. For both key/value pairs, the key is motionless and the value is consuming.

  4. The expression carefully atomizes both values, because retaining references to streamed nodes in a map is not permitted.

  5. Therefore the map expression, and hence the expression as a whole, is grounded and consuming.
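
The same computation can be sketched using the xsl:map instruction. This is an illustration only: the variable name m is hypothetical, and price and discount are assumed to be children of the streamed context node, as in the XPath expression above. Because the xsl:map instruction has only xsl:map-entry children, the exemption is expected to apply even though two operands are consuming.

<xsl:variable name="m" as="map(*)">
   <xsl:map>
      <!-- Both values are atomized, so the map holds no streamed nodes -->
      <xsl:map-entry key="'price'" select="xs:decimal(price)"/>
      <xsl:map-entry key="'discount'" select="xs:decimal(discount)"/>
   </xsl:map>
</xsl:variable>
<xsl:sequence select="$m?price - $m?discount"/>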

The streamability of the xsl:record instruction is determined by analysing the streamability of the equivalent xsl:map instruction as described in the definition of xsl:record above.

See also: 12.7.17 Streamability of Map Constructors, 12.5.23 Streamability of xsl:map, 12.5.24 Streamability of xsl:map-entry, 12.5.35 Streamability of xsl:record

12.2 Keys and Streaming

Keys are not applicable to streamed documents.

This is ensured by the rules for the streamability of the key function (see 12.8 Classifying Calls to Built-In Functions). These rules make the operand usage of the third argument navigation, which has the consequence that when the key function is applied to a streamed input document, the call is roaming and free-ranging, which effectively makes the containing construct non-streamable.

12.3 Grounded Consuming Constructs

A construct is grounded if the items it delivers do not include nodes from a streamed document; it is consuming if evaluation of the construct reads nodes from a streamed input in a way that requires advancing the current position in the input.

Grounded consuming constructs play an important role in streaming, and this section discusses some of their characteristics.

Examples of grounded consuming constructs (assuming the context item is a streamed node) include:

  • sum(.//transaction/@value)

  • copy-of(./account/history/event)

  • distinct-values(./account/@account-nr)

  • <xsl:for-each select="transaction"><t><xsl:value-of select="@value"/></t></xsl:for-each>

XSLT 3.0 provides the two functions copy-of and snapshot with the explicit purpose of creating a sequence of grounded nodes that can be processed one-by-one without the usual restrictions that apply to streamed processing, such as the rule permitting at most one downward selection. The processing style that exploits these functions is often called “windowed streaming”.
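
The following sketch shows the windowed streaming style. The file name transactions.xml and the element and attribute names are hypothetical. Each transaction element is copied into memory in turn, so the body of the xsl:for-each instruction can make two downward selections (amount and fee) that would not be permitted on the streamed node itself.

<xsl:source-document streamable="yes" href="transactions.xml">
   <xsl:for-each select="account/transaction/copy-of(.)">
      <!-- The context item is a grounded copy of one transaction -->
      <net id="{@id}">
         <xsl:value-of select="amount - fee"/>
      </net>
   </xsl:for-each>
</xsl:source-document>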

In general the result of a grounded consuming construct is a sequence. Depending on how this sequence is used, it may or may not be necessary for the processor to allocate sufficient memory to hold the entire sequence. The streamability rules in this specification place few constraints on how a grounded sequence is used. This is deliberate, because it gives users control: by creating a grounded sequence (for example, by use of the copy-of function) stylesheet authors gain the ability to process data in arbitrary ways (for example, by sorting the sequence), and accept the possibility that this may consume memory.

Pipelined evaluation of a sequence is analogous to streamed processing of a source document. Pipelined evaluation occurs when the items in a sequence can be processed one-by-one, without materializing the entire sequence in memory. Pipelining is a common optimization technique in all functional programming languages. Operations for which pipelined evaluation is commonly performed include filtering ($transactions[@value gt 1000]), mapping ($transactions!(@value - @processing-fee)), and aggregation (sum($transactions)). Operations that cannot be pipelined (because, for example, the first item in the result sequence cannot be computed without knowing the last item in the input sequence) include those that change the order of items (reverse(), sort()). Other operations such as distinct-values() allow the input to be processed one item at a time, but require memory that potentially increases as the sequence length increases. Saving a grounded sequence in a variable is also likely in many cases to require allocation of memory to hold the entire sequence.

When the input to an operation is a grounded consuming sequence (more accurately, a sequence resulting from the evaluation of a grounded consuming construct), this specification does not attempt to dictate whether the operation is pipelined or not. The goal of interoperable streaming in finite memory can therefore only be achieved if stylesheet authors take care to avoid constructing grounded sequences that occupy large amounts of memory. In practice, however, users can expect that many grounded consuming constructs will be pipelined where the semantics permit this.

Note:

Some processors may recognize an opportunity for pipelining only if the expression is written in a particular way. For example the constructs copy-of(/a/b/c) and /a/b/c/copy-of(.) are to all intents and purposes equivalent, but some processors might recognize the second form more easily as suitable for pipelining.

(There is one minor difference between these expressions: the order of nodes in copy-of(/a/b/c) is required to reflect the document order of the nodes in /a/b/c, while the result of /a/b/c/copy-of(.) can be in any order, in consequence of the rule that document order for nodes in different trees is implementation-dependent.)

The use of the last function requires particular care because of its effect on pipelining. The streamability rules prevent the use of last() in conjunction with an expression that returns streamed nodes (because it would require look-ahead in the stream), but there is no similar constraint for grounded sequences. So for example it is not permitted (in a context that requires streaming) to write

<xsl:for-each select="transaction">
  <xsl:value-of select="position(), ' of ', last()"/>
</xsl:for-each>

but it is quite permissible to write

<xsl:for-each select="transaction/copy-of()">
  <xsl:value-of select="position(), ' of ', last()"/>
</xsl:for-each>

because the call on copy-of makes the sequence grounded. This construct cannot be pipelined because computing the first item in the result sequence depends on knowing the length of the input sequence; in consequence, a processor might be obliged to buffer all the transactions (or their copies) in memory. In this simple example the impact of the call on last is easily detected both by the human reader and by the XSLT processor, but there are other cases where the effect is less obvious. For example if the stylesheet executes the instruction

<xsl:apply-templates select="transaction/copy-of(.)"/>

then the presence of a call on last in one of the template rules that gets invoked might not be easily spotted; yet the effect is exactly the same in preventing the result being computed by processing input items strictly one at a time. Avoiding such effects is entirely the responsibility of the stylesheet author.

By contrast, there is no intrinsic reason why use of the position function should prevent pipelined processing: all it requires is for the processor to count how many items have been processed so far. Processors may also be able to handle the construct position() = last() without storing the entire sequence in memory; rather than actually evaluating the numeric values of position() and last(), this can be done by testing whether the context item is the last item in the sequence, which only requires a one-item lookahead.
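
For instance, a comma-separated list might be written as sketched below (the id attribute is hypothetical). The only use of last() is in a comparison with position(), which is the form a processor may be able to evaluate with one-item lookahead. This specification does not guarantee that any particular processor does so.

<xsl:for-each select="transaction/copy-of(.)">
   <xsl:value-of select="@id"/>
   <!-- Emit a separator after every item except the last -->
   <xsl:if test="not(position() = last())">, </xsl:if>
</xsl:for-each>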

12.4 Classifying Sequence Constructors

The posture and sweep of a sequence constructor are determined by the general streamability rules.

The operand roles and their usages are:

  1. The immediately contained instructions and literal result elements, including any xsl:on-empty or xsl:on-non-empty instructions. The operand usage for these operands is transmission.

  2. Any text value templates appearing in text nodes within the sequence constructor, if text value templates are enabled. The operand usage for these operands is absorption.

Note:

Some consequences of these rules are:

  1. An empty sequence constructor is motionless, and its posture is grounded.

  2. A sequence constructor containing a single instruction has the same sweep and posture as that instruction. (This means that sequence constructors containing a single instruction can usefully be dropped from the construct tree.)

  3. Informally, a sequence constructor is not streamable if it contains more than one instruction that moves the position of the input stream.

  4. xsl:on-empty or xsl:on-non-empty instructions are not treated specially. For example, there is no attempt to take into account that they are mutually exclusive: if one is evaluated, the other will not be evaluated. In most use cases for these instructions, they will be motionless, so the additional complexity of doing more advanced analysis would rarely be justified.
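
As an illustration of the third consequence, the first template below is not guaranteed-streamable, because its sequence constructor contains two consuming literal result elements. The alternative wraps each one in an xsl:sequence child of xsl:fork, which (as noted in 12.1 Maps and Streaming) is exempt from the rule permitting only one downward selection. The mode name s and the element and attribute names are hypothetical, and the two templates are alternatives, not parts of one stylesheet.

<!-- Not guaranteed-streamable: two consuming instructions -->
<xsl:template match="order" mode="s">
   <total><xsl:value-of select="sum(item/@price)"/></total>
   <count><xsl:value-of select="count(item)"/></count>
</xsl:template>

<!-- Alternative: each consuming branch is a child of xsl:fork -->
<xsl:template match="order" mode="s">
   <xsl:fork>
      <xsl:sequence>
         <total><xsl:value-of select="sum(item/@price)"/></total>
      </xsl:sequence>
      <xsl:sequence>
         <count><xsl:value-of select="count(item)"/></count>
      </xsl:sequence>
   </xsl:fork>
</xsl:template>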

12.5 Classifying Instructions

This section describes how instructions are classified with respect to their streamability. The criteria are given first for literal result elements and extension instructions, then for each XSLT instruction, listed alphabetically.

12.5.1 Streamability of Literal Result Elements

The posture and sweep of a literal result element follow the general streamability rules. The operand roles and their usages are:

  1. The contained sequence constructor (usage absorption)

  2. Any expressions contained in attribute value templates among the literal result element’s attributes (usage absorption)

  3. Any attribute sets named in the xsl:use-attribute-sets attribute (usage irrelevant, but can be taken as inspection).

    Note:

    In practice, a reference to an attribute set that is declared-streamable does not affect the analysis, while a reference to any other attribute set makes the literal result element roaming and free-ranging.
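
The following sketch shows a literal result element that refers to a declared-streamable attribute set. The attribute set name provenance, the mode name s, and the element and attribute names are hypothetical. The attribute value template {@id} is motionless and the contained sequence constructor is the only consuming operand, so the literal result element is expected to be consuming.

<xsl:attribute-set name="provenance" streamable="yes">
   <!-- The content does not read the streamed input -->
   <xsl:attribute name="source">batch</xsl:attribute>
</xsl:attribute-set>

<xsl:template match="record" mode="s">
   <entry xsl:use-attribute-sets="provenance" id="{@id}">
      <xsl:value-of select="."/>
   </entry>
</xsl:template>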

12.5.2 Streamability of extension instructions

For a processor that recognizes an extension instruction, the posture and sweep of the instruction are implementation-defined.

For a processor that does not recognize an extension instruction, the posture and sweep of the instruction are determined by applying the general streamability rules. The operand roles and their usages are:

  1. The sequence constructors contained in any xsl:fallback children (usage transmission)

Instructions in the XSLT namespace that are present under the provisions for forwards compatible behavior are treated in the same way as unrecognized extension instructions.

Note:

These rules mean that if there is no xsl:fallback child instruction, the containing construct will be classified as streamable. However, any attempt to execute the instruction will lead to a dynamic error, so in fact, neither streamed nor unstreamed evaluation is possible.

12.5.3 Streamability of xsl:analyze-string

The posture and sweep of xsl:analyze-string follow the general streamability rules. The operand roles and their usages are:

  1. the select expression (usage absorption);

  2. the regex attribute value template (usage absorption);

  3. the sequence constructors contained in the xsl:matching-substring and xsl:non-matching-substring elements. These have usage navigation, because they can be evaluated more than once. The context posture for the two sequence constructors is grounded, reflecting the fact that their context item type is xs:string.

Note:

In practice, the sweep of the instruction will usually be the same as the sweep of the select expression, and its posture will be grounded. Exceptions occur for example if the regex attribute is not motionless, or if the contained sequence constructors refer to a grouping variable bound in a contained xsl:for-each-group instruction.

12.5.4 Streamability of xsl:apply-imports

The rules in this section apply also to xsl:next-match.

The posture and sweep of these two instructions follow the general streamability rules. The operand roles and their usages are:

  1. An implicit operand: a context item expression (.), with usage absorption;

  2. The select attribute or contained sequence constructor of each xsl:with-param child element, with type-determined usage based on the type declared in the xsl:with-param/@as attribute, or item()* if absent.

Note:

The instruction will normally be grounded and consuming, provided that nodes in a streamed document are not passed as parameters to the called template rule.

12.5.5 Streamability of xsl:apply-templates

If there is no select attribute, the following analysis assumes the presence of an implicit operand select="child::node()".

The posture and sweep of the xsl:apply-templates instruction are the first of the following that applies:

  1. If the select expression is grounded, then the posture and sweep of the xsl:apply-templates instruction follow the general streamability rules, with the operand roles and their usages as follows:

    1. The select expression (the operand usage is irrelevant, but can be taken as absorption)

    2. The select expressions and contained sequence constructors of any child xsl:with-param elements (usage type-determined, based on the type in the xsl:with-param/@as attribute, defaulting to item()*)

    3. Any attribute value templates appearing in attributes of a child xsl:sort instruction (usage absorption)

    4. The select expression or contained sequence constructor of any xsl:sort children, assessed with a context posture of grounded (usage absorption).

    For example, <xsl:apply-templates select="copy-of(.)"/> is grounded and consuming.

  2. If there is an xsl:sort child element, then roaming and free-ranging.

  3. If the implicit or explicit mode attribute identifies a mode that is not declared with streamable="yes", then roaming and free-ranging.

    Note:

    When mode="#current" is specified, this is treated as equivalent to specifying a streamable mode; although it is not known statically what the mode will be, it is always the case that if the template is invoked with a streamed node as the context item, then the current mode must be a streamable mode.

  4. If the select expression is climbing or crawling, then roaming and free-ranging.

  5. Otherwise, the posture and sweep of the xsl:apply-templates instruction follow the general streamability rules. The operand roles and their usages are as follows:

    1. The (explicit or implicit) select expression, with usage absorption;

    2. The select attribute or contained sequence constructor of each xsl:with-param child element, with type-determined usage based on the type declared in the xsl:with-param/@as attribute, or item()* if absent.
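
The instructions below sketch how the rules are applied in order. Each instruction is to be read separately, with a streamed element as the context item. The mode name s (assumed to be declared with streamable="yes") and the element and attribute names are hypothetical.

<xsl:mode name="s" streamable="yes"/>

<!-- Rule 1: the select expression is grounded, so sorting is permitted -->
<xsl:apply-templates select="copy-of(section)" mode="s">
   <xsl:sort select="@number" data-type="number"/>
</xsl:apply-templates>

<!-- Rule 2: streamed nodes with xsl:sort: roaming and free-ranging -->
<xsl:apply-templates select="section" mode="s">
   <xsl:sort select="@number" data-type="number"/>
</xsl:apply-templates>

<!-- Rule 4: climbing select expression: roaming and free-ranging -->
<xsl:apply-templates select="ancestor::chapter" mode="s"/>

<!-- Rule 5: striding select in a streamable mode: general rules apply -->
<xsl:apply-templates select="section" mode="s"/>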

12.5.6 Streamability of xsl:assert

The posture and sweep of xsl:assert follow the general streamability rules. The operand roles and their usages are as follows:

  1. The test expression (usage inspection)

  2. The select expression (usage absorption)

  3. The error-code attribute value template (usage absorption)

  4. The contained sequence constructor (usage absorption).

12.5.7 Streamability of xsl:attribute

The posture and sweep of xsl:attribute follow the general streamability rules. The operand roles and their usages are as follows:

  1. The name attribute value template (usage absorption)

  2. The namespace attribute value template (usage absorption)

  3. The select expression (usage absorption)

  4. The separator attribute value template (usage absorption)

  5. The contained sequence constructor (usage absorption).

12.5.8 Streamability of xsl:break

The posture and sweep of xsl:break follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage transmission)

  2. The contained sequence constructor (usage transmission).

12.5.9 Streamability of xsl:call-template

The posture and sweep of xsl:call-template follow the general streamability rules. The operand roles and their usages are as follows:

  1. Unless the referenced template has a child xsl:context-item element with the attribute use="prohibited", there is an implicit operand, a context item expression (.): its operand usage is the type-determined usage based on the type declared in the xsl:context-item/@as attribute of the target named template, defaulting to item()* if absent.

  2. The select expression or sequence constructor content of any contained xsl:with-param child element: its operand usage is the type-determined usage based on the type declared in the xsl:with-param/@as attribute, or the xsl:param/@as attribute of the corresponding parameter on the target named template, whichever is more restrictive, defaulting to item()* if both are absent.

Note:

Calling xsl:call-template will usually make stylesheet code unstreamable if a streamed node is passed explicitly or implicitly to the called template, unless it is atomized by declaring the expected type to be atomic.

12.5.10 Streamability of xsl:choose

The posture and sweep of xsl:choose follow the general streamability rules. The operand roles and their usages are as follows:

  1. The test attribute of contained xsl:when elements (usage inspection).

  2. The sequence constructors and select expressions contained within xsl:when and xsl:otherwise child elements (usage transmission). These operands form a choice operand group.

Note:

The effect is to allow either of the following:

  1. Any or all of the sequence constructors and select expressions in xsl:when and xsl:otherwise branches may be consuming, in which case the test expressions must all be motionless.

  2. Any one of the test expressions may be consuming, in which case all the other test expressions, and all the sequence constructors and select expressions, must be motionless.
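
The two permitted forms can be sketched as follows, assuming a streamed element as the context item. The element and attribute names are hypothetical. In the first form the test reads only an attribute of the context item, leaving the branches free to consume. In the second form the single test consumes child elements, so the branches produce only literal text.

<!-- Form 1: motionless test, consuming branches -->
<xsl:choose>
   <xsl:when test="@status = 'open'">
      <xsl:copy-of select="items"/>
   </xsl:when>
   <xsl:otherwise>
      <xsl:value-of select="summary"/>
   </xsl:otherwise>
</xsl:choose>

<!-- Form 2: one consuming test, motionless branches -->
<xsl:choose>
   <xsl:when test="exists(item[@urgent = 'yes'])">
      <xsl:text>urgent</xsl:text>
   </xsl:when>
   <xsl:otherwise>
      <xsl:text>routine</xsl:text>
   </xsl:otherwise>
</xsl:choose>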

12.5.11 Streamability of xsl:comment

The posture and sweep of xsl:comment follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage absorption)

  2. The contained sequence constructor (usage absorption).

12.5.12 Streamability of xsl:copy

The posture and sweep of xsl:copy follow the general streamability rules. The operand roles and their usages are as follows:

  1. The expression in the select attribute, defaulting to a context item expression (.) (usage inspection)

  2. The contained sequence constructor (usage absorption), assessed with context posture and context item type based on the select expression if present, or the outer focus otherwise.

  3. Any attribute sets named in the use-attribute-sets attribute (usage irrelevant, but can be taken as inspection).

    Note:

    In practice, a reference to an attribute set that is declared-streamable does not affect the analysis, while a reference to any other attribute set makes the xsl:copy instruction roaming and free-ranging.

Note:

The effect of these rules is that when a select attribute is present, the sequence constructor contained by the xsl:copy instruction is deemed to be a higher-order operand of the instruction, even though it can only be evaluated once.

This has the practical consequence that the following example is not guaranteed-streamable, even though it is possible to imagine a strategy for streamed evaluation:

 <xsl:for-each-group select="product" group-adjacent="@category">
     <xsl:copy select="..">
         <xsl:copy-of select="current-group()"/>
     </xsl:copy>
 </xsl:for-each-group>

A workaround in this case might be to rewrite the code as follows:

 <xsl:for-each-group select="product" group-adjacent="@category">
     <xsl:element name="{name(..)}" namespace="{namespace-uri(..)}">
         <xsl:copy-of select="current-group()"/>
     </xsl:element>
 </xsl:for-each-group>

12.5.13 Streamability of xsl:copy-of

The posture and sweep of xsl:copy-of follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage absorption).

12.5.14 Streamability of xsl:document

The posture and sweep of xsl:document follow the general streamability rules. The operand roles and their usages are as follows:

  1. The contained sequence constructor (usage absorption).

12.5.15 Streamability of xsl:element

The posture and sweep of xsl:element follow the general streamability rules. The operand roles and their usages are as follows:

  1. The name attribute value template (usage absorption)

  2. The namespace attribute value template (usage absorption)

  3. Any attribute sets named in the use-attribute-sets attribute (usage irrelevant, but can be taken as inspection).

    Note:

    In practice, a reference to an attribute set that is declared-streamable does not affect the analysis, while a reference to any other attribute set makes the xsl:element instruction roaming and free-ranging.

  4. The contained sequence constructor (usage absorption).

12.5.16 Streamability of xsl:evaluate

The posture and sweep of xsl:evaluate follow the general streamability rules. The operand roles and their usages are as follows:

  1. The xpath expression (usage absorption)

  2. The context-item expression (usage navigation)

  3. The with-params expression (usage navigation)

  4. The base-uri attribute value template (usage absorption)

  5. The namespace-context expression (usage inspection)

  6. The schema-aware attribute value template (usage absorption)

  7. The select attributes and contained sequence constructors of any xsl:with-param child elements (usage type-determined, based on the type in the xsl:with-param/@as attribute, defaulting to item()*)

Note:

In practice, code containing an xsl:evaluate instruction will usually be streamable provided that streamed nodes are not passed to the dynamic expression either as the context item or as the value of a parameter.
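
One way to follow this advice is sketched below: the dynamic expression receives a grounded copy of the streamed node instead of the node itself. The parameter name rule, its default value, the mode name s, and the element names are hypothetical. Because the context-item operand has usage navigation, supplying copy-of(.) keeps the operand grounded.

<xsl:param name="rule" as="xs:string" select="'amount * quantity'"/>

<xsl:template match="line" mode="s">
   <!-- The dynamic expression navigates within the in-memory copy -->
   <xsl:evaluate xpath="$rule" context-item="copy-of(.)"/>
</xsl:template>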

12.5.17 Streamability of xsl:fallback

The posture and sweep of the xsl:fallback instruction depend on whether the processor is performing fallback (which is known statically).

If the processor is performing fallback, then the posture and sweep of the xsl:fallback instruction are the posture and sweep of the contained sequence constructor.

If the processor is not performing fallback, then the instruction is grounded and motionless.

12.5.18 Streamability of xsl:for-each

The posture and sweep of the xsl:for-each instruction are the first of the following that applies:

  1. If the select expression is grounded, then the posture and sweep of the xsl:for-each instruction follow the general streamability rules, with the operand roles and their usages as follows:

    1. The select expression (the operand usage is irrelevant, but can be taken as inspection)

    2. The contained sequence constructor (usage transmission). This is a higher-order operand; its context posture is grounded.

    3. Any attribute value templates appearing in attributes of a child xsl:sort instruction (usage absorption)

    4. The select expression or contained sequence constructor of any xsl:sort children (usage absorption). These are higher-order operands; their context posture is grounded.

  2. If there is an xsl:sort child element, then roaming and free-ranging.

  3. If the posture of the select expression is crawling and the sweep of the contained sequence constructor is consuming, then roaming and free-ranging.

  4. Otherwise:

    1. The posture of the instruction is the posture of the contained sequence constructor, assessed with the context posture and context item type set to the posture and type of the select expression.

    2. The sweep of the instruction is the wider of the sweep of the select expression and the sweep of the contained sequence constructor.

      Note:

      In increasing order of width, the sweep values are motionless, consuming, and free-ranging.

    Note:

    Because the body of the xsl:for-each instruction is a higher-order operand of the instruction, any variable reference within the body that is bound to a streaming parameter of a containing stylesheet function will not be singular, which in many cases will make the entire function non-streamable.
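
The interaction between a crawling select expression and the body can be sketched as follows, with a streamed element as the context item and each instruction read separately. The element and attribute names are hypothetical. The third form relies on the assumption, not stated in this section, that the outermost function delivers a striding posture when applied to a crawling operand.

<!-- Crawling select, motionless body: the instruction is consuming -->
<xsl:for-each select=".//section">
   <id><xsl:value-of select="@id"/></id>
</xsl:for-each>

<!-- Crawling select, consuming body: roaming and free-ranging (rule 3) -->
<xsl:for-each select=".//section">
   <title><xsl:value-of select="title"/></title>
</xsl:for-each>

<!-- Assumed striding select, so a consuming body falls under rule 4 -->
<xsl:for-each select="outermost(.//section)">
   <title><xsl:value-of select="title"/></title>
</xsl:for-each>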

12.5.19 Streamability of xsl:for-each-group

The posture and sweep of the xsl:for-each-group instruction are the first of the following that applies:

  1. If the select expression is grounded, then the posture and sweep of the xsl:for-each-group instruction follow the general streamability rules, with the operand roles and their usages as follows:

    1. The select expression (the operand usage is irrelevant, but can be taken as inspection)

    2. The collation attribute value template (usage absorption)

    3. Any attribute value templates appearing in attributes of a child xsl:sort instruction (usage absorption)

    4. The group-by or group-adjacent expression, assessed with a context posture of grounded (usage absorption).

    5. The select expression or contained sequence constructor of any xsl:sort children, assessed with a context posture of grounded (usage absorption).

    6. The group-starting-with or group-ending-with patterns if present; these are higher-order operands with usage inspection.

  2. If there is a group-by attribute and the instruction is not a child of xsl:fork, then roaming and free-ranging.

  3. If there is a group-by or group-adjacent attribute that is not motionless, then roaming and free-ranging.

  4. If there is an xsl:sort child element and the instruction is not a child of xsl:fork, then roaming and free-ranging.

  5. If the posture of the select expression is crawling and the sweep of the contained sequence constructorXT is consuming, then roaming and free-ranging.

  6. Otherwise:

    1. The posture of the instruction is the posture of the contained sequence constructorXT, assessed with the context posture and context item type set to the posture and type of the select expression.

    2. The sweep of the instruction is the wider of the sweeps of the select expression and the contained sequence constructorXT, where the ordering of increasing width is motionless, consuming, free-ranging.

    Note:

    Because the body of the xsl:for-each-group instruction is a higher-order operand of the instruction, any variable reference within the body that is bound to a streaming parameter of a containing stylesheet functionXT will not be singular, which in many cases will make the entire function non-streamable.

Note:

The above rules do not explicitly mention any constraints on the presence or absence of a call on the current-group function. In practice, however, this plays an important role. In the most common case, the select expression of xsl:for-each-group is likely to be striding, for example an expression such as select="*". Any call on current-group associated with this xsl:for-each-group instruction will ordinarily be striding and consuming, which is consistent with streaming provided that there is only one such call and that it appears in a suitable context (for example, not within a predicate). If there is more than one call, or if the call appears in an unsuitable context (for example, within a predicate), then this will have the same effect as multiple appearances of other consuming expressions: the construct as a whole will be free-ranging. These rules are not spelled out explicitly, but rather emerge as a consequence of the general streamability rules.
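
The common case described in this note can be sketched as follows, assuming a striding context item and a hypothetical @category attribute. The group-adjacent expression is motionless and there is a single call on current-group that is not within a predicate, so the instruction is expected to be consuming:

<xsl:for-each-group select="*" group-adjacent="@category">
   <group category="{current-grouping-key()}">
      <xsl:copy-of select="current-group()"/>
   </group>
</xsl:for-each-group>

Adding a second call, for example count(current-group()) in an attribute of the group element, would be expected to make the construct free-ranging, since there would then be two consuming operands.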

12.5.20 Streamability of xsl:fork

The posture and sweep of xsl:fork are the first of the following that applies:

  1. If there is a child xsl:for-each-group instruction, then the posture and the sweep of that instruction.

  2. If there are no child xsl:sequence instructions (other than xsl:fallback), then grounded and motionless.

  3. If there is a child xsl:sequence instruction whose posture is not grounded, then roaming and free-ranging.

  4. Otherwise, the posture is grounded, and the sweep is the widest sweep of the xsl:sequence child instructions.

Note:

None of the branches of xsl:fork can return streamed nodes. The reason for this is that xsl:fork has to assemble its results in the correct order, and streamed nodes cannot be re-ordered.

The effect of the rules is that each of the child xsl:sequence instructions can independently consume the streamed input document, provided that the result of each child instruction is grounded.

Thus the following example is streamable:

<xsl:fork>
   <xsl:sequence select="copy-of(author)"/>
   <xsl:sequence select="copy-of(editor)"/>
</xsl:fork>

By contrast, the following is not streamable, because it returns streamed nodes in an order that might not be document order:

<xsl:fork>
   <xsl:sequence select="author"/>
   <xsl:sequence select="editor"/>
</xsl:fork>
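
Rule 1 can be illustrated by the following sketch, in which xsl:fork has a single xsl:for-each-group child that uses group-by. The names employee and @department are hypothetical. Because the instruction is a child of xsl:fork, the group-by attribute does not by itself make it free-ranging (see 12.5.19 Streamability of xsl:for-each-group), and the posture and sweep of xsl:fork are those of the xsl:for-each-group instruction:

<xsl:fork>
   <xsl:for-each-group select="employee" group-by="@department">
      <department name="{current-grouping-key()}"
                  headcount="{count(current-group())}"/>
   </xsl:for-each-group>
</xsl:fork>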

12.5.21 Streamability of xsl:if

The posture and sweep of xsl:if follow the general streamability rules. The operand roles and their usages are as follows:

  1. The test expression (usage inspection)

  2. The then and else expressions and the contained sequence constructorXT (usage transmission). These operands form a choice operand group.

Note:

The effect is to allow either of the following:

  1. The test expression may be motionless, in which case any or all of the then and else expressions, and the contained sequence constructor, may be consuming.

  2. The test expression may be consuming, in which case the then and else expressions, and the contained sequence constructor, must all be motionless.
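
The two permitted combinations might look as follows. This is a sketch that assumes a striding context item; the names @status, @id, details, and price are hypothetical:

<!-- Motionless test (attribute access only), consuming body -->
<xsl:if test="@status = 'active'">
   <xsl:copy-of select="details"/>
</xsl:if>

<!-- Consuming test (reads child elements), motionless body -->
<xsl:if test="details/price > 100">
   <expensive id="{@id}"/>
</xsl:if>

Combining the consuming test of the second example with the consuming body of the first would give two consuming operands that are not in the same choice operand group, so the instruction would be free-ranging.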

12.5.22 Streamability of xsl:iterate

The posture and sweep of the xsl:iterate instruction are the first of the following that applies:

  1. If the select expression is grounded, then the posture and sweep of the xsl:iterate instruction follow the general streamability rules, with the operand roles and their usages as follows:

    1. The select expression (the operand usage is irrelevant, but can be taken as inspection)

    2. The select expression or contained sequence constructor of any xsl:param children (usage navigation)

    3. The sequence constructor contained within the xsl:iterate instruction itself, assessed with its context item type and context posture based on the select expression (usage transmission)

    4. The select expression or contained sequence constructor of any child xsl:on-completion element, assessed with a context item type of xs:error and a context posture of roaming to reflect the fact that any attempt to reference the context item within the xsl:on-completion element is an error (usage transmission)

      Note:

      The xsl:on-completion element can cause the instruction to become non-streamable if, for example, it contains a call on current-group or a variable reference bound to a streaming parameter.

  2. If there is an xsl:param child whose initializing select expression or sequence constructorXT is not grounded and motionless, then roaming and free-ranging.

  3. If there is an xsl:on-completion child whose select expression or sequence constructorXT is not grounded and motionless, then roaming and free-ranging.

  4. If the posture of the select expression is crawling and the sweep of the contained sequence constructorXT is consuming, then roaming and free-ranging.

  5. Otherwise:

    1. The posture of the instruction is the posture of the contained sequence constructorXT, assessed with the context posture and context item type set to the posture and type of the select expression.

    2. The sweep of the instruction is the wider of the sweeps of the select expression and the contained sequence constructorXT, where the ordering of increasing width is motionless, consuming, free-ranging.

Note:

If any xsl:break or xsl:next-iteration instructions appear within the sequence constructor, their posture and sweep will be assessed in the course of evaluating the posture and sweep of the sequence constructor, by reference to the rules in 12.5.8 Streamability of xsl:break and 12.5.28 Streamability of xsl:next-iteration respectively.

Note:

Because the body of the xsl:iterate instruction is a higher-order operand of the instruction, any variable reference within the body that is bound to a streaming parameter of a containing stylesheet functionXT will not be singular, which in many cases will make the entire function non-streamable.
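
A minimal sketch of an xsl:iterate instruction that is expected to satisfy these rules is shown below. It assumes a striding context item, hypothetical transaction elements with @date and @value attributes, and an xs prefix bound to the XML Schema namespace. The xsl:param initializer is grounded and motionless, the select expression is striding, and the body only reads attributes of the context item, so under rule 5 the instruction should be grounded and consuming:

<xsl:iterate select="transaction">
   <xsl:param name="balance" as="xs:decimal" select="0.00"/>
   <xsl:variable name="newBalance" as="xs:decimal"
                 select="$balance + xs:decimal(@value)"/>
   <balance date="{@date}" value="{$newBalance}"/>
   <xsl:next-iteration>
      <xsl:with-param name="balance" select="$newBalance"/>
   </xsl:next-iteration>
</xsl:iterate>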

12.5.23 Streamability of xsl:map

Changes in 4.0 

  1. The special rule allowing xsl:map to have multiple consumable operands does not apply if duplicate keys are permitted.  [Issue 2036 ]

The posture and sweep of the xsl:map instruction are determined by the first of the following that applies:

  1. If the sequence constructor within the instruction consists exclusively of xsl:map-entry instructions (and xsl:fallback instructions, which are ignored), and the duplicates attribute is absent, then:

    1. If any of these xsl:map-entry children is roaming or free-ranging, then roaming and free-ranging;

    2. Otherwise, grounded and the widest sweep of the xsl:map-entry children.

  2. Otherwise, the posture and sweep of the xsl:map instruction are the posture and sweep of the contained sequence constructorXT.

Note:

See discussion in 12.1 Maps and Streaming.

The effect of the rules is that it is possible to compute multiple map entries in a single pass of the streamed input document. For example, the following is streamable:

<xsl:map>
  <xsl:map-entry key="'authors'" select="copy-of(author)"/>
  <xsl:map-entry key="'editors'" select="copy-of(editor)"/>
</xsl:map>

The call on copy-of is necessary to ensure that the content of the map entry is grounded; it is not possible to create a map whose entries contain references to streamed nodes.

This rule does not apply when duplicate keys are permitted, because in that situation the child xsl:map-entry instructions generally need to be evaluated in the order they are written, rather than in the order their operands are encountered.

12.5.24 Streamability of xsl:map-entry

The posture and sweep of xsl:map-entry follow the general streamability rules. The operand roles and their usages are as follows:

  1. The key expression (usage absorption)

  2. The select expression (usage navigation)

    Note:

    This effectively means that the select expression must not return nodes from a streamed input document.

  3. The contained sequence constructorXT (usage navigation).

12.5.25 Streamability of xsl:merge

Note:

This section is concerned with the (not very interesting) impact of the xsl:merge instruction on the streamability of its containing template rule or xsl:source-document instruction.

For the (more important) rules concerning the way in which xsl:merge performs streamed processing of its own inputs, see 10 Streamable Merging.

The posture and sweep of xsl:merge are as follows:

  1. If every xsl:merge-source child element satisfies all the following conditions:

    1. The expression in the for-each-item attribute is either absent, or grounded and motionless;

    2. The expression in the for-each-source attribute is either absent, or grounded and motionless;

    3. Either at least one of the attributes for-each-item and for-each-source is present, or the expression in the select attribute is grounded and motionless

    then the xsl:merge instruction is grounded and motionless.

  2. Otherwise, the xsl:merge instruction is roaming and free-ranging.
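
For example, the following xsl:merge instruction is expected to be grounded and motionless with respect to its containing construct, because its only xsl:merge-source child has a for-each-source expression that is grounded and motionless (a sequence of string literals). The file names and the events/event structure are hypothetical:

<xsl:merge>
   <xsl:merge-source for-each-source="('log-a.xml', 'log-b.xml')"
                     select="events/event" streamable="yes">
      <xsl:merge-key select="@timestamp"/>
   </xsl:merge-source>
   <xsl:merge-action>
      <xsl:copy-of select="current-merge-group()"/>
   </xsl:merge-action>
</xsl:merge>

The select expression here navigates within each merge input, not within the streamed document of the containing construct, which is why condition 3 does not require it to be examined when for-each-source is present.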

12.5.26 Streamability of xsl:message

The posture and sweep of xsl:message follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage absorption)

  2. The terminate attribute value template (usage absorption)

  3. The error-code attribute value template (usage absorption)

  4. The contained sequence constructorXT (usage absorption).

12.5.27 Streamability of xsl:namespace

The posture and sweep of xsl:namespace follow the general streamability rules. The operand roles and their usages are as follows:

  1. The name attribute value template (usage absorption)

  2. The select expression (usage absorption)

  3. The contained sequence constructorXT (usage absorption).

12.5.28 Streamability of xsl:next-iteration

The posture and sweep of xsl:next-iteration follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression or sequence constructorXT content of any contained xsl:with-param child element: its operand usage is the type-determined usage based on the type declared in the xsl:with-param/@as attribute, or the xsl:param/@as attribute of the corresponding parameter on the containing xsl:iterate instruction, whichever is more restrictive, defaulting to item()* if both are absent.

12.5.30 Streamability of xsl:number

The posture and sweep of xsl:number follow the general streamability rules. The operand roles and their usages are as follows:

  1. The value attribute if present: usage absorption

  2. The select attribute if there is no value attribute, defaulting to the context item expression (.) if the select attribute is also absent: usage navigation

  3. The attribute value templates in the format, lang, letter-value, ordinal, start-at, grouping-separator, and grouping-size attributes (usage absorption)

  4. The from and count patterns if present. These can be treated as higher-order operands with usage inspection, though neither of these properties affects the outcome.

Note:

The effect of these rules is that xsl:number can be used for formatting of numbers supplied directly using the value attribute, and also for numbering of nodes in a non-streamed document, but it cannot be used for numbering streamed nodes.

In practice the rules depend very little on the from and count patterns. This is because when the instruction is applied to a streamed node, the instruction will be free-ranging regardless of these patterns; while if it is applied to a grounded node or atomic item, the instruction will normally be motionless regardless of the values of these patterns. The pattern does matter, however, if it contains a variable reference bound to a streaming parameter; because such a reference occurs within a higher-order operand of the xsl:number instruction, its presence automatically makes the variable reference free-ranging, which in turn ensures that the containing stylesheet function is not guaranteed-streamable.
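
The distinction can be sketched as follows, assuming that the context item is a streamed element and that item is a hypothetical child element name:

<!-- Formatting a supplied number: the value operand has usage absorption,
     so this is expected to be grounded and consuming -->
<xsl:number value="count(item)" format="1"/>

<!-- Numbering the context node: select defaults to "." with usage navigation,
     so on a streamed node this is free-ranging -->
<xsl:number level="single" count="item"/>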

12.5.31 Streamability of xsl:on-empty

The streamability rules for the xsl:on-empty instruction are the same as the rules for xsl:sequence: see 12.5.37 Streamability of xsl:sequence.

Note:

The streamability rules for a sequence constructor containing an xsl:on-empty instruction are given in 12.4 Classifying Sequence Constructors.

12.5.32 Streamability of xsl:on-non-empty

The streamability rules for the xsl:on-non-empty instruction are the same as the rules for xsl:sequence: see 12.5.37 Streamability of xsl:sequence.

Note:

The streamability rules for a sequence constructor containing an xsl:on-non-empty instruction are given in 12.4 Classifying Sequence Constructors.

12.5.33 Streamability of xsl:perform-sort

The posture and sweep of xsl:perform-sort follow the general streamability rules. The operand roles and their usages are as follows:

  1. The expression in the select attribute: usage navigation (because order is not preserved)

  2. The expressions in the attribute value templates of xsl:sort child elements: usage absorption

  3. The expression in the select attribute or contained sequence constructor of child xsl:sort elements, with usage absorption, assessed with context posture based on the expression in the xsl:perform-sort/@select attribute.

Note:

In practice, the xsl:perform-sort instruction cannot be used to sort nodes from the streamed input document, but it can be used to sort atomic items or grounded nodes, for example a copy of nodes from the streamed document made using the copy-of function.
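
The following sketch shows the copy-of approach, assuming a striding context item and hypothetical employee elements with a @salary attribute. The select operand is grounded (though consuming), so usage navigation is harmless, and the sort key is evaluated against the grounded copies:

<xsl:perform-sort select="copy-of(employee)">
   <xsl:sort select="number(@salary)"/>
</xsl:perform-sort>

Note that this buffers a copy of every selected employee element in memory before sorting.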

12.5.34 Streamability of xsl:processing-instruction

The posture and sweep of xsl:processing-instruction follow the general streamability rules. The operand roles and their usages are as follows:

  1. The name attribute value template (usage absorption)

  2. The select expression (usage absorption)

  3. The contained sequence constructorXT (usage absorption).

12.5.35 Streamability of xsl:record

The posture and sweep of the xsl:record instruction are determined by the posture and sweep of the equivalent xsl:map instruction as described in the definition of xsl:record.

12.5.36 Streamability of xsl:result-document

The posture and sweep of xsl:result-document follow the general streamability rules. The operand roles and their usages are as follows:

  1. The href attribute value template (usage absorption)

  2. The attribute value templates containing serialization properties (usage absorption)

  3. The contained sequence constructorXT (usage absorption).

12.5.37 Streamability of xsl:sequence

The posture and sweep of xsl:sequence follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage transmission)

  2. The contained sequence constructorXT (usage transmission).

12.5.38 Streamability of xsl:source-document

Note:

The concern here is with the impact of xsl:source-document on any streaming template, or ancestor xsl:source-document instruction, and not with the streamed processing of the document accessed using the xsl:source-document/@href attribute.

The streamability of the document opened by the xsl:source-document instruction is not assessed using the rules in this section; it depends only on the streamability properties of the contained sequence constructor, as described in 4.1 Streamability of xsl:source-document.

The posture and sweep of xsl:source-document are the first of the following that applies:

  1. If the contained sequence constructor contains, at any depth, a call on the current-group function whose nearest containing xsl:for-each-group instruction exists and is an ancestor of the xsl:source-document instruction, then roaming and free-ranging.

  2. If the contained sequence constructor contains, at any depth, a call on the current-merge-group function whose nearest containing xsl:merge instruction exists and is an ancestor of the xsl:source-document instruction, then roaming and free-ranging.

  3. Otherwise, the posture is grounded and the sweep is the sweep of the href attribute value template.

12.5.39 Streamability of xsl:switch

The posture and sweep of xsl:switch follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select attribute of the xsl:switch element (usage absorption).

  2. The test attribute of contained xsl:when elements (usage absorption).

  3. The sequence constructors and select expressions contained within xsl:when and xsl:otherwise child elements (usage transmission). These operands form a choice operand group.

Note:

The effect is to allow any of the following:

  1. The select expression of the xsl:switch instruction may be consuming, in which case all the other operands must be motionless.

  2. Any one of the test expressions may be consuming, in which case all the other operands must be motionless.

  3. Any or all of the sequence constructors and select expressions in xsl:when and xsl:otherwise branches may be consuming, in which case the test expressions and the select of the xsl:switch instruction must all be motionless.

12.5.40 Streamability of xsl:text

The posture and sweep of xsl:text follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage absorption)

  2. The separator attribute value template (usage absorption)

  3. The contained sequence constructorXT (usage absorption).

12.5.41 Streamability of xsl:try

The posture and sweep of the xsl:try instruction follow the general streamability rules. The operand roles and usages are as follows:

  1. The select expression or contained sequence constructorXT of the xsl:try element. This has operand usage transmission. (Note that the xsl:catch children of xsl:try are not part of the sequence constructor and therefore not part of this operand.)

  2. The select expressions and/or contained sequence constructorXT of the xsl:catch child elements. These form a choice operand group with operand usage transmission.

Note:

The overall effect of these rules is that either the xsl:try branch or the xsl:catch branch may consume the streamed input, but not both. If there is more than one xsl:catch branch then they may all consume the input, since only one of these branches can be evaluated.
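
As a sketch of the first case, the xsl:try body below consumes the input while the xsl:catch branch is motionless. The names order, failed-order, and @id are hypothetical, and the context item is assumed to be striding:

<xsl:try>
   <xsl:copy-of select="order"/>
   <xsl:catch>
      <failed-order id="{@id}"/>
   </xsl:catch>
</xsl:try>

Replacing the content of xsl:catch with another consuming instruction, such as a second xsl:copy-of selecting child elements, would make the xsl:try instruction free-ranging.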

12.5.42 Streamability of xsl:value-of

The posture and sweep of xsl:value-of follow the general streamability rules. The operand roles and their usages are as follows:

  1. The select expression (usage absorption)

  2. The separator attribute value template (usage absorption)

  3. The contained sequence constructorXT (usage absorption).

12.5.43 Streamability of xsl:variable

The posture and sweep of xsl:variable follow the general streamability rules. The operand roles and their usages depend on the as attribute, as follows:

  1. If there is an as attribute, then:

    1. The select expression (with type-determined usage based on the as attribute).

    2. The contained sequence constructorXT (with type-determined usage based on the as attribute).

  2. If there is no as attribute, then:

    1. The select expression (usage navigation).

    2. The contained sequence constructorXT (usage absorption).

Note:

The effect of the initialization expression having usage navigation is that it is not possible in streamable constructs to bind a variable to a node in a streamed document.
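
The following alternatives illustrate the difference, assuming a striding context item, a hypothetical title child element, and an xs prefix bound to the XML Schema namespace. The classifications in the comments are a reading of the rules above:

<!-- No as attribute: usage navigation applied to a striding operand, so not streamable -->
<xsl:variable name="t" select="title"/>

<!-- as="xs:string": the type-determined usage is absorption, so grounded and consuming -->
<xsl:variable name="t" as="xs:string" select="title"/>

<!-- No as attribute, but the operand is already grounded by the call on string() -->
<xsl:variable name="t" select="string(title)"/>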

12.5.44 Streamability of xsl:where-populated

The posture and sweep of an xsl:where-populated instruction are the posture and sweep of the contained sequence constructorXT.

12.6 Classifying Value Templates

A value templateXT (that is, an attribute value templateXT or text value templateXT) is a construct whose operands are the expressions contained within curly brackets. The required type for this operand role is xs:string and the usage is absorption.

The sweep and posture of a value template are determined using the general rules in 3.10 General Streamability Rules.

If there are no expressions contained within curly brackets, the value template is motionless.
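
For example, assuming a striding context item and hypothetical names throughout, the attribute value templates below would be classified as indicated in the comments:

<!-- id: one motionless expression; total: one consuming expression -->
<summary id="{@id}" total="{sum(line/@amount)}"/>

<!-- label: two consuming expressions in a single value template,
     which the general rules classify as free-ranging -->
<person label="{first} {last}"/>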

12.7 Classifying Expressions

XPath expressions are classified using the rules in this section.

In the analysis that follows, expressionsXT are classified according to the most specific production rule that they match for which there is an entry in this section. A production P is considered more specific than a production Q (Q ≠ P) if every expression that matches P also matches Q. For example:

  • The expression 3 satisfies the productions NumericLiteral, Literal, and ArithmeticExpression; the most specific of these for which there is an entry in this section is Literal.

  • The expression text() (appearing as an expression) is a TextTest, and therefore a KindTest, which is itself a NodeTest, and therefore an AxisStep with a defaulted ForwardAxis. The most specific of these for which there is an entry in this section is AxisStep. Although the expression is also a RelativePathExpr, that production is less specific than AxisStep so its rules do not apply.

  • The expression section/title is a RelativePathExpr, for which there is an entry in this section. Although the expression is also a PathExpr, that production is less specific than RelativePathExpr so its rules do not apply.

The production rules for different kinds of expression are listed (with their names and numbers) in the order in which they appear in Appendix A.1 of the XPath 3.0 specification; rules are also given for new constructs introduced by XPath 3.1. Where two numbers are given, they are the production rule numbers in XPath 3.0 and XPath 3.1 respectively; where there is a single number, it is the production rule number in XPath 3.1.

Many expressions can be analyzed using the general streamability rules. These are indicated in the table below by means of a simple proforma in which the usage of each operand is represented by a short code (A = absorption, I = inspection, T = transmission, N = navigation). For example the proforma A + A indicates that for an arithmetic expression, both operands have operand usage absorption, while I or I indicates that for an or expression, both operands have operand usage inspection. For expressions where further explanation is needed, the table contains a link to the relevant section.

Operand Roles for XPath Expressions
Construct                           Proforma or Reference to Detailed Rules  Further Information
Expr                                T, T
ForExpr                             See 12.7.1 Streamability of for Expressions
LetExpr                             let $var := N return T                   Binding of variables to streamed nodes is not allowed.
QuantifiedExpr                      See 12.7.2 Streamability of Quantified Expressions
IfExpr                              if (I) then T else T                     The then-clause and else-clause form a choice operand group with usage transmission
OrExpr                              I or I
AndExpr                             I and I
StringConcatExpr                    A || A
RangeExpr                           A to A
AdditiveExpr                        A + A, A - A
MultiplicativeExpr                  A * A, A div A, etc.
UnionExpr                           See 12.7.4 Streamability of union, intersect, and except Expressions
IntersectExceptExpr                 See 12.7.4 Streamability of union, intersect, and except Expressions
InstanceOfExpr                      See 12.7.5 Streamability of instance of Expressions
TreatExpr                           See 12.7.6 Streamability of treat as Expressions
CastableExpr                        A castable as TYPE
CastExpr                            A cast as TYPE
UnaryExpr                           +A, -A
GeneralComp                         A = A, A < A, A != A, etc.
ValueComp                           A eq A, A lt A, A ne A, etc.
NodeComp                            I is I, I << I, I >> I                   See Note 1 below
SimpleMapExpr                       See 12.7.7 Streamability of Simple Mapping Expressions
PathExpr                            See 12.7.8 Streamability of Path Expressions
RelativePathExpr                    See 12.7.8 Streamability of Path Expressions
AxisStep                            See 12.7.9 Streamability of Axis Steps
ForwardStep, ReverseStep            See 12.7.9 Streamability of Axis Steps
PostfixExpr: Filter Expression      See 12.7.10 Streamability of Filter Expressions
PostfixExpr: Dynamic Function Call  See 12.7.11 Streamability of Dynamic Function Calls
Literal                             There are no operands, so the construct is grounded and motionless
VarRef                              See 12.7.12 Streamability of Variable References
ParenthesizedExpr                   (T)
()                                  There are no operands, so the construct is grounded and motionless
ContextItemExpr                     See 12.7.13 Streamability of the Context Item Expression
FunctionCall                        See 12.7.14 Streamability of Static Function Calls
NamedFunctionRef                    See 12.7.15 Streamability of Named Function References
InlineFunctionExpr                  See 12.7.16 Streamability of Inline Function Declarations
MapConstructor                      See 12.7.17 Streamability of Map Constructors
Lookup (Postfix and Unary)          See 12.7.18 Streamability of Lookup Expressions
ArrowExpr                           See 12.7.14 Streamability of Static Function Calls and 12.7.11 Streamability of Dynamic Function Calls: the rules for X => F(Y, Z) are the same as the rules for F(X, Y, Z)
SquareArrayConstructor              [ N, N, ... ]
CurlyArrayConstructor               array { N, N, ... }

Note:

  1. The operators is, <<, and >> apply to streamed nodes just as to any other nodes, though there are few practical situations where they will be useful. A streamed document conforms to the rules of the XDM data model, and its nodes are therefore distinct and ordered. They follow the usual rules, for example that a parent node precedes its children in document order. Expressions such as .. is parent::X or ancestor::x[1] << ancestor::y[1] are therefore perfectly meaningful. The usefulness of the operators is limited by the fact that variables cannot be bound to nodes in a streamed document. It is permitted, though perhaps not useful, for one of the operands to be consuming: one can write . << child::x, and the resulting expression is (by applying the general rules) consuming and grounded.

    The restriction that variables cannot be bound to streamed nodes prevents writing of expressions such as let $x := . return descendant::x[ancestor::y[1] is $x]. As a workaround, the intended effect can be achieved by comparing node identity using the generate-id function: let $x := generate-id(.) return descendant::x[generate-id(ancestor::y[1]) = $x]

12.7.1 Streamability of for Expressions

Writing the expression as for $v in S return R, the two operand roles are S and R.

The posture and sweep are determined by the first of the following that applies:

  1. If S is not grounded, then roaming and free-ranging.

  2. Otherwise, the general streamability rules apply. The operand roles are:

    1. The in expression (S). This has usage navigation.

    2. The return expression (R). This is a higher-order operand with usage transmission.

Note:

Expressions of the form for $i in 1 to 3 return $i*2, where there is no reference to a streamed node, are clearly streamable.

The in expression can also be consuming, for example for $e in copy-of(emp) return $e/salary.

The rule that S must be grounded prevents the variable being bound to a node in a streamed document. This disallows expressions of the form for $x in child::section return $x/para, because this requires data flow analysis (tracing from the binding of a variable to its usages), rather than purely syntactic analysis. Some implementations may be able to stream such constructs.

The fact that the return clause is a higher-order operand prevents it from being a consuming expression, for example for $i in 1 to 3 return salary. Use of a motionless expression that accesses streamed nodes is however allowed, for example for $i in 1 to 3 return name(ancestor::x[$i]).
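
A for expression that is disallowed only because it binds the variable to streamed nodes can sometimes be rewritten as a path expression. The following sketch assumes a striding context item and hypothetical section and para elements; the path expression selects the same nodes in the same order and is classified under 12.7.8 Streamability of Path Expressions rather than under the rules of this section:

<!-- Not guaranteed-streamable: S (child::section) is not grounded -->
<xsl:copy-of select="for $x in child::section return $x/para"/>

<!-- Possible rewrite as a path expression -->
<xsl:copy-of select="child::section/para"/>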

12.7.2 Streamability of Quantified Expressions

An expression with multiple in-clauses is first rewritten using nested quantified expressions: for example some $i in X, $j in Y satisfies $i eq $j can be rewritten as some $i in X satisfies (some $j in Y satisfies $i eq $j). The analysis therefore only needs to consider expressions with a single in-clause.

Writing such an expression as some|every $v in S satisfies C, the two operand roles are S and C.

The general streamability rules apply. The operand roles are:

  1. The in expression (S). This has usage navigation.

  2. The satisfies expression (C). This is a higher-order operand with usage inspection.

Note:

Expressions of the form some $i in 1 to 3 satisfies $i lt 2, where there is no reference to a streamed node, are clearly streamable.

The expression S can be consuming, so long as it is grounded: for example some $e in emp/salary/number(.) satisfies $e gt 10000.

The rule that S has usage navigation prevents the variable being bound to a node in a streamed document. This disallows expressions of the form some $x in child::section satisfies has-children($x), because this requires data flow analysis (tracing from the binding of a variable to its usages), rather than purely syntactic analysis. Some implementations may be able to stream such constructs.

The fact that C is a higher-order operand prevents it from being a consuming expression: for example some $i in 1 to 3 satisfies author[$i] eq "Kay" is not streamable. Use of a motionless expression that accesses streamed nodes is however allowed, for example some $i in 1 to 3 satisfies @grade = $i.

Quantified expressions that fail the streamability rules can often be rewritten as filter expressions. For example, the expression some $x in child::section satisfies has-children($x) can be rewritten as exists(child::section[has-children(.)]), which is grounded and consuming.

12.7.3 Streamability of if expressions

Writing the expression as if (C) then T else E, there are three operand roles: C, T, and E. The usage of C is inspection, while the usage of T and E is transmission. Operands T and E form a choice operand group, meaning that they can both consume the input stream, provided they have consistent posture. The general streamability rules apply.

12.7.4 Streamability of union, intersect, and except Expressions

The posture and sweep are the first of the following that applies:

  1. If either of the two operands is free-ranging, then roaming and free-ranging (Example: . | following-sibling::*).

  2. If either of the two operands is grounded and motionless, then the posture and sweep of the other operand (Example: . | doc('abc.com')//x).

  3. If both operands are climbing, then climbing and the wider of the sweeps of the two operands (Example: parent::A | */ancestor::B).

  4. If the left-hand operand is striding or crawling and the right-hand operand is also striding or crawling, then crawling and the wider of the sweeps of the two operands (Example: * | */*).

  5. Otherwise, roaming and free-ranging (Example: child::div | parent::div).

Note:

Essentially the principle is that if both operands are streamable, then the result is streamable (this assumes an evaluation strategy where both operands are evaluated during the same pass of the streamed input document, and the results merged). But there are caveats because of the need for static streamability analysis of the result. This prevents constructs such as .. | * that have heterogeneous posture.

Where the two operands are both striding, there are cases where an implementation could determine that the result is also striding: for example (author | editor). In general, however, the combination of two striding operands may produce a sequence of nodes that have nested subtrees (consider author | author/name), so the result is classified as crawling.

The expression (author | editor), although it is not classified as striding under these rules, can be rewritten in the form *[self::author or self::editor], which is striding.
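
The five rules above can be read as a small decision procedure. The following Python sketch is a non-normative illustration only: it represents each operand as a (posture, sweep) pair, the function names wider and union_streamability are hypothetical, and it assumes the ordering of sweeps (motionless, consuming, free-ranging) given in 12.7.8 Streamability of Path Expressions.

```python
# Sweeps in order of increasing width (assumed from 12.7.8).
SWEEP_ORDER = ["motionless", "consuming", "free-ranging"]


def wider(a, b):
    """Return the wider of two sweeps."""
    return max(a, b, key=SWEEP_ORDER.index)


def union_streamability(left, right):
    """Posture and sweep of `left | right` (also intersect, except).

    Each operand is a (posture, sweep) tuple; the first matching rule wins.
    """
    (lp, ls), (rp, rs) = left, right
    # Rule 1: a free-ranging operand makes the result roaming and free-ranging.
    if "free-ranging" in (ls, rs):
        return ("roaming", "free-ranging")
    # Rule 2: a grounded, motionless operand defers to the other operand.
    if left == ("grounded", "motionless"):
        return right
    if right == ("grounded", "motionless"):
        return left
    # Rule 3: both climbing.
    if lp == "climbing" and rp == "climbing":
        return ("climbing", wider(ls, rs))
    # Rule 4: each operand is striding or crawling.
    if lp in ("striding", "crawling") and rp in ("striding", "crawling"):
        return ("crawling", wider(ls, rs))
    # Rule 5: anything else, for example heterogeneous postures.
    return ("roaming", "free-ranging")


# Example: * | */* with a striding context posture (both operands striding, consuming).
print(union_streamability(("striding", "consuming"), ("striding", "consuming")))
```

Under this sketch the pair for child::div | parent::div (striding and consuming against climbing and motionless) falls through to the last rule, matching the example given for rule 5.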

12.7.5 Streamability of instance of Expressions

For an expression of the form X instance of ST (where X is an expression and ST is a SequenceTypeXT), the posture and sweep are determined by the general streamability rules. There is a single operand X, whose operand usage is as follows:

  1. If the ItemType of ST is a DocumentTest, optionally parenthesized, that contains an ElementTest or SchemaElementTest, then absorption.

  2. Otherwise, inspection.

Note:

In general, it is possible to determine whether a node matches an ItemType without consuming the node. For example it can be established whether an element matches the test element(para) when positioned at the start tag.

An ItemType of the form document-node(element(X)) is an exception to this rule because it matches a document node only if it has exactly one element node child, and this cannot be determined without consuming the document.

A processor may have knowledge that the document node cannot contain multiple element nodes, for example because it knows that the source of the streamed document is an XML parser that is not capable of generating such a stream. In such cases the processor may make a different assessment of the streamability of this construct. This comes under the general provision that a processor is always at liberty to use streaming even when the stylesheet is not guaranteed streamable.

Note:

As with other constructs that are evaluated with inspection usage, for example the name function or access to an attribute node, evaluation of a construct such as $X instance of schema-element(E) as true or false may be invalidated if reading of the input stream subsequently fails. Dynamic errors during streamed processing of an input document invalidate all output generated prior to the failure, and this case is no different.

Note:

Given an expression such as child::* instance of element(E)*, the expression as a whole is consuming and grounded. By contrast, the expression . instance of element(E)* is motionless and grounded. This can be verified by applying the general streamability rules to these cases.

12.7.6 Streamability of treat as Expressions

For an expression of the form X treat as ST (where X is an expression and ST is a SequenceTypeXT), the posture and sweep are determined as follows:

  1. If the ItemType of ST is a DocumentTest, optionally parenthesized, that contains an ElementTest or SchemaElementTest then roaming and free-ranging.

  2. Otherwise, the general streamability rules apply. There is a single operand X, whose operand usage is transmission.

Note:

See the notes in 12.7.5 Streamability of instance of Expressions for a discussion of the streamability difficulties associated with document-node() tests.

12.7.7 Streamability of Simple Mapping Expressions

The mapping operator ! is treated as a left-associative binary operator, so the expression a!b!c is processed as (a!b)!c.

The posture of the expression is the posture of the right-hand operand, assessed with a context posture and type set to the posture and type of the left-hand operand.

The sweep of the expression is the wider of the sweeps of the two operands.

12.7.8 Streamability of Path Expressions

The streamability analysis applies after the expansion of the // pseudo-operator to /descendant-or-self::node()/, and after expanding .. to parent::node(), @X to attribute::X, and an omitted axis to the default axis for the node kind.

Following the rules in XPath, a leading "/" is converted to (root(self::node()) treat as document-node())/ (with the final "/" omitted for the expression "/" on its own). This is followed by a rewrite of the call on root, as described in 12.8.21 Streamability of the root Function.

Note:

Taken together, these rewrites have the effect that a path expression such as //a is streamable only if the statically determined context item type is document-node(), which will be the case for example immediately within xsl:source-document, or in a template rule with match="/".

A RelativePathExpr with more than two operands (such as a/b/c) is taken as a tree of binary expressions (that is, (a/b)/c).

The sweep of a relative path expression is the wider sweep of the two operands, where the ordering of increasing width is motionless, consuming, free-ranging.

The posture of a relative path expression is assessed in two phases, as follows:

  1. First, the provisional posture is the posture of the right-hand operand, assessed with a context posture and type set to the posture and type of the left-hand operand; the provisional sweep is the wider of the sweeps of the two operands.

  2. If the provisional posture is roaming, then it is reassessed as follows:

    1. [Definition: A RelativePathExpr is a scanning expression if and only if it is syntactically equivalent to some motionless patternXT.]

      Note:

      This means that a RelativePathExpr is a scanning expression if it conforms to the grammar for a RelativePathExprP in the grammar for patterns (see [XSLT 4.0] section 6.3.2 Syntax of Patterns), and if, when considered as a pattern, the pattern is motionless according to the rules in 12.9 Classifying Patterns.

      In practice, the test as to whether the construct is equivalent to a pattern is likely to be made by examining the structure of the expression tree, rather than by re-parsing the lexical form of the expression against the grammar for patterns; but the outcome is the same.

    2. If the expression is a scanning expression then:

      1. If the static type of the expression contains U{element} then its posture is crawling.

      2. Otherwise, its posture is striding.

  3. Otherwise (if the provisional posture is not roaming, or the expression is not a scanning expression), the posture of the expression is the provisional posture.

Note:

The special rules for scanning expressions are designed to ensure that expressions such as //section/head are streamable. The problem with such an expression is that it is possible to have two nested sections A and B, where A is the parent of B and thus precedes B in document order, but where there are children of A that come after children of B in document order. This means that a nested-loop strategy for the evaluation of /descendant::section/child::head is not guaranteed to deliver nodes in document order without a sort, and is therefore not a viable strategy for streaming.

However, there is a different strategy for evaluating such an expression, which is in effect to rewrite the expression as /descendant::head[parent::section]; specifically, it is possible to scan all descendants in document order, looking for a head element that has a section parent. Hence the term scanning expressions.

The expressions that qualify as scanning expressions are paths that can be evaluated by scanning all descendants and testing each one (independently) to see whether the elements on its ancestor axis match the specified path. The subset of expressions that qualify as scanning expressions is therefore the same as the subset that qualify as motionless patterns.

Scanning expressions cannot use positional predicates: for example //section/head[1] is not recognized as a scanning expression because this would require information about a streamed node (specifically, about its preceding siblings) that is not retained during streaming.

Note:

Perhaps surprisingly, the expression .//section/head is not a scanning expression and is therefore not guaranteed streamable. This is because it does not take the syntactic form of a patternXT. To make it streamable, it can be rewritten as descendant::section/head or as self::node()//section/head.

Similarly, within a streamable stylesheet function whose streaming parameter is $node, the expression $node//section/head is not a scanning expression. In this case the expression does have the syntactic form of a pattern, but the pattern is not classified as motionless. (See 12.9 Classifying Patterns: a motionless pattern cannot contain a RootedPath.) A workaround in this case is to rewrite the expression as $node/(descendant::section/head). Assuming that the function in question declares streamability="absorbing", the analysis here is that the left-hand operand ($node) is striding and consuming, while the right-hand operand (descendant::section/head) is crawling and consuming (because it is a scanning expression). The expression as a whole is therefore crawling and consuming.

These are cases where an implementation might reasonably choose to relax the rules, insofar as this is permitted by 3.1 Streamability Guarantees.

Note:

Examples:

In each of the following cases, assume that the context posture is striding.

  • The posture of the expression a/b/c is striding, because (under the rules for AxisStep [38]) a child axis step evaluated with striding context posture creates a new striding posture.

  • The posture of the expression a/descendant::c is crawling, because a descendant axis step evaluated with striding context posture creates a new crawling posture.

  • The posture of the expression ../@status is striding, because a parent axis step evaluated with striding context posture creates a new climbing posture, and an attribute axis step evaluated with climbing context posture creates a new striding posture.

  • The posture of the expression copy-of(.)//a/following-sibling::* is grounded, because the copy-of evaluated with striding posture creates a grounded posture, and all subsequent axis steps leave this posture unchanged.

  • The expression section//head expands to (section/descendant-or-self::node())/child::head. The posture of the left-hand operand section/descendant-or-self::node() is crawling, because a descendant-or-self axis step evaluated with striding context posture creates a new crawling posture. The provisional posture of the expression as a whole is therefore roaming, because a child axis step evaluated with crawling context posture gives a resulting roaming posture. However, the expression is a scanning expression (both section//head and its expansion are motionless patterns), so the expression as a whole has crawling posture.

  • The expression section//head[1] is free-ranging: unlike the previous example, it contains a positional predicate, which means that the operands do not satisfy the rules for scanning expressions.

12.7.9 Streamability of Axis Steps

The sweep and posture of an AxisStep S are determined by the first of the following rules that applies:

  1. If the context posture is grounded, then the sweep is motionless and the posture is grounded;

  2. If the context posture is roaming, then the sweep is free-ranging and the posture is roaming;

  3. If the statically inferred context item type is such that the axis will always be empty (for example, applying the child axis to a text node or the parent axis to a document node), or if the NodeTest is one that can never select nodes on the chosen axis (for example, selecting attribute nodes on the child axis), then the sweep is motionless and the posture is grounded (because the expression is statically known to return an empty sequence);

  4. If all the following conditions are satisfied:

    1. The context posture is striding

    2. The axis is descendant or descendant-or-self

    3. There is a predicate P in the list of predicates that satisfies all the following conditions:

      1. The static type of P is a subtype of U{xs:decimal, xs:double, xs:float}

      2. The maximum cardinality of P is 1

      3. Neither P, nor any operand of P, at any depth provided it has the AxisStep S as its focus-setting container, is a context item expression, an axis expression, or a call on a focus-dependent function;

    then striding and consuming

    Note:

    Examples are descendant::section[1], descendant::section[$i+1], descendant::section[count($x)]. The significance of this rule is that it detects cases where the descendant axis selects a singleton, and where the posture of the result can therefore be striding rather than crawling.

  5. If the list of predicates contains a Predicate that is not motionless, then the sweep is free-ranging and the posture is roaming;

  6. Otherwise, the sweep and posture of the expression are as determined by the table below, based on the context posture, the choice of axis, and the node test. The condition “Selects elements?” is true if the U-type of S has a non-empty intersection with U{element()}.

    Streamability of Axis Steps Based on Context Posture

    Context posture       | Axis                                     | Selects elements? | Result posture | Sweep
    ----------------------+------------------------------------------+-------------------+----------------+-------------
    Grounded              | any                                      | -                 | Grounded       | Motionless
    Climbing              | self, parent, ancestor-or-self, ancestor | -                 | Climbing       | Motionless
    Climbing              | attribute, namespace                     | -                 | Striding       | Motionless
    Striding              | parent, ancestor-or-self, ancestor       | -                 | Climbing       | Motionless
    Striding              | self, attribute, namespace               | -                 | Striding       | Motionless
    Striding              | child                                    | -                 | Striding       | Consuming
    Striding              | descendant, descendant-or-self           | Yes               | Crawling       | Consuming
    Striding              | descendant, descendant-or-self           | No                | Striding       | Consuming
    Crawling              | parent, ancestor-or-self, ancestor       | -                 | Climbing       | Motionless
    Crawling              | attribute, namespace                     | -                 | Striding       | Motionless
    Crawling              | self                                     | Yes               | Crawling       | Motionless
    Crawling              | self                                     | No                | Striding       | Motionless
    Any other combination | -                                        | -                 | Roaming        | Free-ranging

Note:

This analysis does not attempt to classify para[title] as a consuming expression; an implementation might choose to do so.
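
As a non-normative illustration, the table above can be encoded as a lookup and folded left to right over the steps of a path to obtain the provisional posture described in 12.7.8 Streamability of Path Expressions. The Python sketch below is a simplification under stated assumptions: it ignores predicates and rules 3 to 5 of this section, it does not model the reassessment of scanning expressions, and the names axis_step and provisional_path are hypothetical.

```python
SWEEP_ORDER = ["motionless", "consuming", "free-ranging"]
UPWARD = ("parent", "ancestor", "ancestor-or-self")
DOWNWARD = ("descendant", "descendant-or-self")


def axis_step(context, axis, selects_elements=True):
    """Posture and sweep of a predicate-free axis step (table lookup only)."""
    if context == "grounded":
        return ("grounded", "motionless")
    if context == "climbing":
        if axis == "self" or axis in UPWARD:
            return ("climbing", "motionless")
        if axis in ("attribute", "namespace"):
            return ("striding", "motionless")
    if context == "striding":
        if axis in UPWARD:
            return ("climbing", "motionless")
        if axis in ("self", "attribute", "namespace"):
            return ("striding", "motionless")
        if axis == "child":
            return ("striding", "consuming")
        if axis in DOWNWARD:
            return ("crawling" if selects_elements else "striding", "consuming")
    if context == "crawling":
        if axis in UPWARD:
            return ("climbing", "motionless")
        if axis in ("attribute", "namespace"):
            return ("striding", "motionless")
        if axis == "self":
            return ("crawling" if selects_elements else "striding", "motionless")
    # Any other combination, including a roaming context posture.
    return ("roaming", "free-ranging")


def provisional_path(context, steps):
    """Fold axis steps left to right; each step is (axis, selects_elements)."""
    posture, sweep = context, "motionless"
    for axis, selects_elements in steps:
        posture, step_sweep = axis_step(posture, axis, selects_elements)
        sweep = max(sweep, step_sweep, key=SWEEP_ORDER.index)
    return posture, sweep


# a/descendant::c evaluated with a striding context posture.
print(provisional_path("striding", [("child", True), ("descendant", True)]))
```

For section/descendant-or-self::node()/child::head the sketch yields a roaming provisional posture, which is the point at which the scanning expression rule of 12.7.8 would take over.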

12.7.10 Streamability of Filter Expressions

For a filter expression F of the form B[P] (where B might itself be a filter expression), the posture and sweep are the first of the following that applies:

  1. If all the following conditions are satisfied:

    1. B is crawling;

    2. The static type of P is a subtype of U{xs:decimal, xs:double, xs:float};

    3. The maximum cardinality of P is 1;

    4. Neither P, nor any operand of P, at any depth provided it has F as its focus-setting container, is a context item expression, an axis expression, or a call on a focus-dependent function

    then the posture is striding and the sweep is the sweep of B.

    Note:

    This rule captures cases where it can be statically determined that the predicate is a single numeric item and is independent of the focus. In such cases, the filter expression selects at most one node, and the posture can therefore be changed from crawling to striding (if there is only one node, there can be no overlapping trees). Examples of filter expressions that satisfy this test are (//x)[3], (//x)[$i+1], (//x)[index-of($a, $b)[last()]]. The expression (//x)[1 to 5] does not satisfy the test, because the value of the predicate is not a singleton.

  2. If P is motionless, then the posture and sweep of B;

    Note:

    This includes the case where B is grounded. The predicate P is assessed with the posture of B as its context posture, and if this is grounded, then P will almost invariably be motionless, making the filter expression as a whole grounded and motionless. For example if $s is grounded, then $s[child::*] is also grounded. A counter-example is the expression $s[$n = 2] where $n is a reference to the first argument of a stylesheet function that is declared-streamable: here the predicate is not motionless, so the filter expression is roaming and free-ranging.

  3. Otherwise, roaming and free-ranging.

Note:

The first rule allows a construct such as <xsl:apply-templates select="(//title)[1]"/>, where a crawling operand would not be guaranteed streamable.

Note:

This section is not applicable to predicates forming part of an axis step, such as //title[1], as these are not technically filter expressions. See 12.7.9 Streamability of Axis Steps.
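
The three rules for B[P] can be summarised in a few lines. The following Python sketch is illustrative and non-normative: it assumes the caller has already established whether P is motionless and whether P is a focus-independent singleton numeric value (conditions 2 to 4 of the first rule), and the function and parameter names are hypothetical.

```python
def filter_streamability(base, predicate_motionless, predicate_single_numeric=False):
    """Posture and sweep of a filter expression B[P].

    base                     -- (posture, sweep) of B
    predicate_motionless     -- True if P is motionless
    predicate_single_numeric -- True if P is statically a single number that
                                does not depend on the focus
    """
    posture, sweep = base
    # Rule 1: a crawling base filtered to at most one node becomes striding.
    if posture == "crawling" and predicate_single_numeric:
        return ("striding", sweep)
    # Rule 2: a motionless predicate leaves the base unchanged.
    if predicate_motionless:
        return base
    # Rule 3: otherwise roaming and free-ranging.
    return ("roaming", "free-ranging")


# (//title)[1]: crawling and consuming base, singleton numeric predicate.
print(filter_streamability(("crawling", "consuming"), True, True))
```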

12.7.11 Streamability of Dynamic Function Calls

Note:

This section applies to dynamic function calls written using the traditional syntax $F(X, Y, Z) and equally to those using the syntax X => $F(Y, Z).

The posture and sweep of a dynamic function call such as $F(X, Y) are determined by the 3.10 General Streamability Rules. The operands and their usages are as follows:

  1. The base expression that computes the function value itself (here $F). This has usage inspection.

  2. The argument expressions excluding any ? placeholders (here X and Y). These have type-determined usage dependent on ancillary information associated with the static type of the base expression, where available (see 3.5 Determining the Static Type of a Construct). If this information indicates that the base expression is a function with signature fn(A, B, ...) as R, then the first argument X has type-determined usage based on the first argument type A, the second argument Y has type-determined usage based on the second argument type B, and so on. If no function signature is available, then the usage of each of the argument expressions is navigation.

Note:

As explained in [XSLT 4.0] section 10.3.5 Dynamic Access to Functions, use of a dynamic function call where the function value is bound to a focus-dependent function such as name#0, lang#1, or last#0 is likely to lead to a dynamic error if the context item is a node in a streamed document, but this does not affect the static streamability analysis.

Note:

Maps and arrays are functions, and it is possible to look up a value in a map or array using a dynamic function call of the form $map($key) or $array($index). If it is statically known that the function in question is a map or array, then it is also known that the argument type is xs:anyAtomicType, and that the operand usage is therefore absorption. A call that passes a streamed node will therefore be grounded and consuming. However, if it is not known statically that the function is a map or array, then the expression will generally be roaming and free-ranging.

This means it is desirable to declare the type of any variable holding a map or array. If streamed nodes are used to look up a value in a map or array, then it may be advisable to use the map:get or array:get functions explicitly, or to use the lookup operator (?).

12.7.12 Streamability of Variable References

For variable references that are bound to the streaming parameter of a declared-streamable stylesheet functionXT, see the rules for the streamability category of the containing function, under 8.1 Classifying Stylesheet Functions.

In all other cases, variable references are grounded and motionless.

12.7.13 Streamability of the Context Item Expression

The posture of the expression is the context posture, and the sweep is motionless.

Note:

Although . is intrinsically motionless, when used in certain contexts (such as data(.)) the containing expression will be consuming. This arises because of the operand usage: the argument to data has usage absorption, and the combination of a motionless operand with usage absorption leads to the containing expression being consuming.

Similarly, if . is used where the operand usage is navigation, the containing expression will be free-ranging.

12.7.14 Streamability of Static Function Calls

Note:

This section applies to static function calls written using the traditional syntax F(X, Y, Z) and equally to those using the syntax X => F(Y, Z).

For calls to built-in functions, see 12.8 Classifying Calls to Built-In Functions.

For calls to stylesheet functionsXT, see 8.1 Classifying Stylesheet Functions.

For partial function applications (where one or more of the arguments is supplied as a ? placeholder), see the rules at the end of this section.

For a call to a constructor function, the 3.10 General Streamability Rules apply. There is a single operand role (the argument to the function), with operand usage absorption.

For a call to an extension functionXT, the posture and sweep are implementation-defined.

If the function call is a partial function application (that is, if one or more of the arguments is given as a ? placeholder), then:

  1. If the function is focus-dependent and the context posture is not grounded, then the function call is roaming and free-ranging.

  2. If the target of the function call is a stylesheet functionXT that is declared-streamable, and if the first argument is actually supplied (that is, this argument is not supplied as a ? placeholder), and if the expression that is supplied as the first argument is not grounded, then the function call is roaming and free-ranging.

  3. If the target is an extension functionXT, the posture and sweep are implementation-defined.

  4. Otherwise, the general streamability rules apply. The operands of a partial function application are the expressions actually supplied as arguments to the function, ignoring ? placeholders; the corresponding operand usage is the type-determined usage based on the declared type of that argument.

12.7.15 Streamability of Named Function References

Let F be the function to which the NamedFunctionRef refers.

If F is focus-dependent and the context posture is not grounded, then the NamedFunctionRef is roaming and free-ranging.

If F is an extension functionXT, the posture and sweep are implementation-defined.

Otherwise, the NamedFunctionRef is grounded and motionless.

Note:

The main intent behind these rules is to ensure that the function item returned by a named function reference does not encapsulate a reference to a streamed node.

In the case of an expression such as local-name#0, implementations might be able to do better by pre-evaluating the function at the point where the named function reference occurs.

In the case of extension functions, implementations may be able to distinguish whether the function is focus-dependent, and decide the streamability of the named function reference accordingly.
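
The rules of this section can be sketched as follows. This Python fragment is a non-normative illustration: it assumes the rules are tested in the order in which they are written, it returns None to stand for an implementation-defined result, and the function and parameter names are hypothetical.

```python
def named_function_ref_streamability(focus_dependent, context_posture, is_extension=False):
    """Posture and sweep of a NamedFunctionRef such as name#0 or count#1."""
    # A focus-dependent function would capture a streamed context node.
    if focus_dependent and context_posture != "grounded":
        return ("roaming", "free-ranging")
    # Extension functions: implementation-defined, modelled here as None.
    if is_extension:
        return None
    return ("grounded", "motionless")


# local-name#0 evaluated with a striding context posture.
print(named_function_ref_streamability(True, "striding"))
```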

12.7.16 Streamability of Inline Function Declarations

An inline function declaration that textually contains a variable reference bound to a streaming parameter (of some containing stylesheet function) is roaming and free-ranging.

All other inline function declarations are grounded and motionless.

Note:

It is not possible to pass a streamed node as an argument to a call to an inline function unless the declared type of the corresponding function parameter causes the node to be atomized: see 12.7.11 Streamability of Dynamic Function Calls. The only other way an inline function could access a streamed node is by having the streamed node in its closure, and this is prevented by the rule above.

12.7.17 Streamability of Map Constructors

The posture and sweep of a map constructor (see section 4.14.1.1 Map Constructors) are the same as the posture and sweep of the equivalent xsl:map instruction. The equivalent xsl:map instruction is formed by creating a sequence of xsl:map-entry instructions, one for each key/value pair in the map expression, where the key expression becomes the value of xsl:map-entry/@key, and the value expression becomes the value of xsl:map-entry/@select; this sequence of xsl:map-entry instructions is then wrapped in an xsl:map parent instruction.

For example, the map constructor { 'red': false(), 'green': true() } translates to the instruction:

<xsl:map>
  <xsl:map-entry key="'red'" select="false()"/>
  <xsl:map-entry key="'green'" select="true()"/>
</xsl:map>

The rules for the streamability of xsl:map appear in 12.5.23 Streamability of xsl:map.

See also 12.1 Maps and Streaming.
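
The translation from a map constructor to its equivalent xsl:map instruction is mechanical. The following Python sketch is a non-normative illustration that assumes the key and value expressions are already available as strings of XPath text; the function name map_constructor_to_xsl_map is hypothetical.

```python
from xml.sax.saxutils import quoteattr


def map_constructor_to_xsl_map(entries):
    """Build the equivalent xsl:map instruction as text.

    entries -- list of (key_expression, value_expression) strings,
               one pair per key/value pair of the map constructor.
    """
    lines = ["<xsl:map>"]
    for key_expr, value_expr in entries:
        # quoteattr escapes the expression and chooses suitable quotes.
        lines.append(
            f"  <xsl:map-entry key={quoteattr(key_expr)} select={quoteattr(value_expr)}/>"
        )
    lines.append("</xsl:map>")
    return "\n".join(lines)


# The constructor { 'red': false(), 'green': true() } from the example above.
print(map_constructor_to_xsl_map([("'red'", "false()"), ("'green'", "true()")]))
```

The streamability of the generated instruction is then assessed by the rules for xsl:map; the sketch only performs the rewrite.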

12.7.18 Streamability of Lookup Expressions

For the unary lookup operator, the posture and sweep of the expression ?X are defined to be the same as the posture and sweep of the postfix lookup expression .?X.

For the postfix lookup expression E?K, the general streamability rules apply as follows:

  1. In the wildcard form of the expression, E?*, there is only one operand, E. This has operand usage inspection.

  2. Where the construct K is an NCName, the expression E?NAME is treated as equivalent to E?("NAME").

  3. Where the construct K is an integer, the expression E?N is treated as equivalent to E?(N).

  4. In the general case where K is a parenthesized expression, the lookup expression E?(K) has two operands. The first operand E has operand usage inspection, while the second operand K has operand usage absorption.
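
The normalisation steps above can be sketched as follows. This Python fragment is a non-normative illustration that works on the lexical form of the key specifier (the text after the ? operator); the function name lookup_operands is hypothetical, and a unary lookup ?X would be handled by passing "." as the base expression.

```python
import re


def lookup_operands(base, key):
    """Return the operands of E?K as (expression, usage) pairs."""
    # Wildcard form: the base expression is the only operand.
    if key == "*":
        return [(base, "inspection")]
    if re.fullmatch(r"[0-9]+", key):
        # E?N is treated as E?(N).
        key_expr = key
    elif key.startswith("(") and key.endswith(")"):
        # General case: parenthesized key expression.
        key_expr = key[1:-1]
    else:
        # E?NAME is treated as E?("NAME").
        key_expr = f'"{key}"'
    return [(base, "inspection"), (key_expr, "absorption")]


print(lookup_operands("$m", "price"))
```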

12.8 Classifying Calls to Built-In Functions

This section describes the rules that determine the streamability of calls to built-in functions. These differ from user-written functions because it is known (defined in the specification) how nodes supplied as operands are used. Knowledge of the usage of each operand, together with the posture of the actual operands, is in most cases enough to determine the posture and sweep of the function result.

All the built-in functions are listed below. For most functions, a simple proforma is shown that indicates the operand usage of each argument, using the code (A = absorption, I = inspection, T = transmission, N = navigation). So, for example, the entry fn:remove(T, A) means that for the function fn:remove#2, the operand usage of the first argument is transmission, and the operand usage of the second argument is absorption. By reference to the general rules in 3.10 General Streamability Rules, this demonstrates that if the context posture is striding, the posture and sweep of the expression sum(remove(*,1)) will be grounded and consuming respectively.

For functions that default one of their arguments (typically to the context item), the relevant entry shows the equivalence, and the posture and sweep can in these cases be computed by filling in the default value for the relevant argument.

Some functions do not follow the general rules, and these are listed with a link to the section where the particular rules for that function are described.
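
For readers who wish to process the proforma entries mechanically, the following Python sketch expands an entry into a function name and a list of operand usages. It is a non-normative illustration: the function name parse_proforma is hypothetical, and entries that refer to a separate section (those containing "See") are outside its scope.

```python
import re

# Usage codes as defined in the paragraph introducing the proforma.
USAGE = {
    "A": "absorption",
    "I": "inspection",
    "T": "transmission",
    "N": "navigation",
}


def parse_proforma(entry):
    """Split an entry such as 'fn:remove(T, A)' into (name, usages)."""
    match = re.fullmatch(r"\s*([\w:-]+)\((.*)\)\s*", entry)
    if match is None:
        raise ValueError(f"not a proforma entry: {entry!r}")
    name, arguments = match.groups()
    codes = [code.strip() for code in arguments.split(",") if code.strip()]
    return name, [USAGE[code] for code in codes]


print(parse_proforma("fn:remove(T, A)"))
```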

  • array:append(I, N)

  • array:build(N, I)

  • array:empty(I)

  • array:filter(I, I)

  • array:flatten(T)

  • array:fold-left(I, N, I)

  • array:fold-right(I, N, I)

  • array:foot – See

  • array:for-each(I, I)

  • array:for-each-pair(I, I, I)

  • array:get(I, A)

  • array:get(I, A, A)

  • array:head(I)

  • array:index-of(A, A, A)

  • array:index-where(I, I)

  • array:insert-before(I, A, N)

  • array:items(I)

  • array:join(I, N)

  • array:members(I)

  • array:of-members(I)

  • array:put(I, I, N)

  • array:remove(I, A)

  • array:reverse(I)

  • array:size(I)

  • array:slice(I, A, A, A)

  • array:sort(N, A, I)

  • array:sort-by(N, A)

  • array:sort-with(A, A)

  • array:split(I)

  • array:subarray(I, A, A)

  • array:tail(I)

  • array:trunk – See

  • fn:abs(A)

  • fn:accumulator-after – See 12.8.1 Streamability of the accumulator-after Function

  • fn:accumulator-before – See 12.8.2 Streamability of the accumulator-before Function

  • fn:adjust-date-to-timezone(A, A)

  • fn:adjust-dateTime-to-timezone(A, A)

  • fn:adjust-time-to-timezone(A, A)

  • fn:all-different(A, A)

  • fn:all-equal(A, A)

  • fn:analyze-string(A, A, A)

  • fn:apply(I, N)

  • fn:apply-templates(A, A)

  • fn:atomic-equal(A, A)

  • fn:atomic-type-annotation(A)

  • fn:available-environment-variables()

  • fn:available-system-properties()

  • fn:avg(A)

  • fn:base-uri(I)

  • fn:boolean(I)

  • fn:build-uri(A, A)

  • fn:ceiling(A)

  • fn:char(A)

  • fn:character-map(A)

  • fn:characters(A)

  • fn:civil-timezone(A, A)

  • fn:codepoint-equal(A, A)

  • fn:codepoints-to-string(A)

  • fn:collation(A)

  • fn:collation-available(A, A)

  • fn:collation-key(A, A)

  • fn:collection(A)

  • fn:compare(A, A, A)

  • fn:concat(A)

  • fn:contains(A, A, A)

  • fn:contains-subsequence(T, T, A)

  • fn:contains-token(A, A, A)

  • fn:copy-of(A)

  • fn:count(I)

  • fn:csv-doc(A, A)

  • fn:csv-to-arrays(A, I)

  • fn:csv-to-xml(A, I)

  • fn:current – See 12.8.3 Streamability of the current Function

  • fn:current-date()

  • fn:current-dateTime()

  • fn:current-group – See 12.8.4 Streamability of the current-group Function

  • fn:current-grouping-key – See 12.8.5 Streamability of the current-grouping-key Function

  • fn:current-merge-group – See 12.8.6 Streamability of the current-merge-group Function

  • fn:current-merge-key – See 12.8.7 Streamability of the current-merge-key Function

  • fn:current-merge-key-array – See 12.8.8 Streamability of the current-merge-key-array Function

  • fn:current-output-uri()

  • fn:current-time()

  • fn:data(A)

  • fn:dateTime(A, A)

  • fn:day-from-date(A)

  • fn:day-from-dateTime(A)

  • fn:days-from-duration(A)

  • fn:decode-from-uri(A)

  • fn:deep-equal(A, A, A)

  • fn:default-collation()

  • fn:default-language()

  • fn:distinct-ordered-nodes – See 12.8.9 Streamability of the distinct-ordered-nodes Function

  • fn:distinct-values(A, A)

  • fn:divide-decimals(A, A, A)

  • fn:do-until(N, I, I)

  • fn:doc(A, A)

  • fn:doc-available(A, A)

  • fn:document(A, I)

  • fn:document-uri(I)

  • fn:duplicate-values(A, A)

  • fn:element-available(A)

  • fn:element-to-map(A, I)

  • fn:element-to-map-plan(A)

  • fn:element-with-id(A, N)

  • fn:empty(I)

  • fn:encode-for-uri(A)

  • fn:ends-with(A, A, A)

  • fn:ends-with-subsequence(T, T, A)

  • fn:environment-variable(A)

  • fn:error(A, A, N)

  • fn:escape-html-uri(A)

  • fn:every(N, I)

  • fn:exactly-one(I)

  • fn:exists(I)

  • fn:expanded-QName(A)

  • fn:false()

  • fn:filter(N, I)

  • fn:floor(A)

  • fn:fold-left(N, A, I)

  • fn:fold-right – See 12.8.11 Streamability of the fold-right Function

  • fn:foot – See 12.8.12 Streamability of the foot Function

  • fn:for-each(N, I)

  • fn:for-each-pair(N, N, I)

  • fn:format-date(A, A, A, A, A)

  • fn:format-dateTime(A, A, A, A, A)

  • fn:format-integer(A, A, A)

  • fn:format-number(A, A, A)

  • fn:format-time(A, A, A, A, A)

  • fn:function-annotations(A)

  • fn:function-arity(A)

  • fn:function-available(A, A)

  • fn:function-identity(A)

  • fn:function-lookup – See 12.8.15 Streamability of the function-lookup Function

  • fn:function-name(A)

  • fn:generate-id(I)

  • fn:graphemes(A)

  • fn:has-children(I)

  • fn:hash(A, A, A)

  • fn:head(T)

  • fn:highest(N, A, I)

  • fn:hours-from-dateTime(A)

  • fn:hours-from-duration(A)

  • fn:hours-from-time(A)

  • fn:html-doc(A, A)

  • fn:id(A, N)

  • fn:identity(T)

  • fn:idref(A, N)

  • fn:implicit-timezone()

  • fn:in-scope-namespaces(I)

  • fn:in-scope-prefixes(I)

  • fn:index-of(A, A, A)

  • fn:index-where(N, I)

  • fn:innermost – See 12.8.16 Streamability of the innermost Function

  • fn:insert-before(T, A, T)

  • fn:invisible-xml(N, A)

  • fn:iri-to-uri(A)

  • fn:is-NaN(A)

  • fn:items-at(T, A)

  • fn:jnode-content(A)

  • fn:jnode-position(I)

  • fn:jnode-selector(I)

  • fn:json-doc(A, I)

  • fn:json-to-xml(A, I)

  • fn:jtree(A)

  • fn:key(A, A, N)

  • fn:lang(A, I)

  • fn:last – See 12.8.17 Streamability of the last Function

  • fn:load-xquery-module(A, I)

  • fn:local-name(I)

  • fn:local-name-from-QName(A)

  • fn:lower-case(A)

  • fn:lowest(N, A, I)

  • fn:map-for-key(A, N)

  • fn:matches(A, A, A)

  • fn:max(A, A)

  • fn:message(T, A)

  • fn:min(A, A)

  • fn:minutes-from-dateTime(A)

  • fn:minutes-from-duration(A)

  • fn:minutes-from-time(A)

  • fn:month-from-date(A)

  • fn:month-from-dateTime(A)

  • fn:months-from-duration(A)

  • fn:name(I)

  • fn:namespace-uri(I)

  • fn:namespace-uri-for-prefix(A, I)

  • fn:namespace-uri-from-QName(A)

  • fn:nilled(I)

  • fn:node-name(I)

  • fn:node-type-annotation(I)

  • fn:normalize-space(A)

  • fn:normalize-unicode(A, A)

  • fn:not(I)

  • fn:number(A)

  • fn:one-or-more(I)

  • fn:op(A)

  • fn:outermost – See 12.8.18 Streamability of the outermost Function

  • fn:parse-csv(A, I)

  • fn:parse-html(A, A)

  • fn:parse-ietf-date(A)

  • fn:parse-integer(A, A)

  • fn:parse-json(A, I)

  • fn:parse-QName(A)

  • fn:parse-uri(A, A)

  • fn:parse-xml(A, A)

  • fn:parse-xml-fragment(A, A)

  • fn:partial-apply(I, I)

  • fn:partition(N, I)

  • fn:path(N, A)

  • fn:position – See 12.8.19 Streamability of the position Function

  • fn:prefix-from-QName(A)

  • fn:QName(A, A)

  • fn:random-number-generator(A)

  • fn:regex-group(A)

  • fn:remove(T, A)

  • fn:replace(A, A, A, A)

  • fn:replicate(N, A)

  • fn:resolve-QName(A, I)

  • fn:resolve-uri(A, A)

  • fn:reverse – See 12.8.20 Streamability of the reverse Function

  • fn:root – See 12.8.21 Streamability of the root Function

  • fn:round(A, A, A)

  • fn:round-half-to-even(A, A)

  • fn:scan-left(N, N, I)

  • fn:scan-right(N, N, I)

  • fn:schema-type(A)

  • fn:seconds(A)

  • fn:seconds-from-dateTime(A)

  • fn:seconds-from-duration(A)

  • fn:seconds-from-time(A)

  • fn:sequence-join(N, N)

  • fn:serialize(A, A)

  • fn:siblings(N)

  • fn:slice(T, A, A, A)

  • fn:snapshot(A)

  • fn:some(N, I)

  • fn:sort(N, A, I)

  • fn:sort-by(N, A)

  • fn:sort-with(A, A)

  • fn:starts-with(A, A, A)

  • fn:starts-with-subsequence(T, T, A)

  • fn:static-base-uri()

  • fn:stream-available(A)

  • fn:string(A)

  • fn:string-join(A, A)

  • fn:string-length(A)

  • fn:string-to-codepoints(A)

  • fn:subsequence(T, A, A)

  • fn:subsequence-where(T, A, A)

  • fn:substring(A, A, A)

  • fn:substring-after(A, A, A)

  • fn:substring-before(A, A, A)

  • fn:sum(A, A)

  • fn:system-property(A)

  • fn:tail(T)

  • fn:take-while(N, I)

  • fn:timezone-from-date(A)

  • fn:timezone-from-dateTime(A)

  • fn:timezone-from-time(A)

  • fn:tokenize(A, A, A)

  • fn:trace(T, A)

  • fn:transform(I)

  • fn:transitive-closure(N, I)

  • fn:translate(A, A, A)

  • fn:true()

  • fn:trunk – See 12.8.22 Streamability of the trunk Function

  • fn:type-available(A)

  • fn:type-of(I)

  • fn:unix-dateTime(A)

  • fn:unordered(T)

  • fn:unparsed-binary(A)

  • fn:unparsed-entity-public-id(A, I)

  • fn:unparsed-entity-uri(A, I)

  • fn:unparsed-text(A, A)

  • fn:unparsed-text-available(A, A)

  • fn:unparsed-text-lines(A, A)

  • fn:upper-case(A)

  • fn:uri-collection(A)

  • fn:void(I)

  • fn:while-do(N, I, I)

  • fn:xml-to-json(A, I)

  • fn:xsd-validator(A)

  • fn:year-from-date(A)

  • fn:year-from-dateTime(A)

  • fn:years-from-duration(A)

  • fn:zero-or-one(I)

  • map:build(N, I, I, A)

  • map:contains(I, A)

  • map:empty(I)

  • map:entries(I)

  • map:entry(A, N)

  • map:filter(I, I)

  • map:find(I, A)

  • map:for-each(I, I)

  • map:get(I, A, T)

  • map:items(I)

  • map:keys(I)

  • map:keys-where(I, I)

  • map:merge(I, I)

  • map:put(I, A, N)

  • map:remove(I, A)

  • map:size(I)

  • math:acos(A)

  • math:asin(A)

  • math:atan(A)

  • math:atan2(A, A)

  • math:cos(A)

  • math:cosh(A)

  • math:e()

  • math:exp(A)

  • math:exp10(A)

  • math:log(A)

  • math:log10(A)

  • math:pi()

  • math:pow(A, A)

  • math:sin(A)

  • math:sinh(A)

  • math:sqrt(A)

  • math:tan(A)

  • math:tanh(A)

12.8.1 Streamability of the accumulator-after Function

See also 11 Streamable Accumulators.

The posture of the function call is in all cases grounded.

The sweep is determined by applying the following rules, in order:

  1. If the first argument (the accumulator name) is not motionless, the function is free-ranging.

  2. If the context posture is grounded, the function is motionless.

  3. If the context item type has an empty intersection with U{document-node(), element()} (that is, if the context item cannot have children), the function is motionless.

  4. If the function call is contained in the select expression or contained sequence constructor of an xsl:accumulator-rule specifying phase="start", then it is free-ranging.

  5. If the function call is contained in the select expression or contained sequence constructor of an xsl:accumulator-rule specifying phase="end", then it is motionless.

  6. If no enclosing node of the function call is part of a sequence constructorXT, then it is free-ranging. For this purpose, the enclosing nodes of a function call are the attribute or text node that immediately contains the XPath expression in which the function call appears, and its ancestors.

  7. If the focus-setting container of the function call is different from the focus-setting container of the innermost containing instructionXT, then the function is free-ranging.

  8. If no enclosing node N of the function call has a preceding sibling node P such that (a) N and P are part of the same sequence constructorXT, and (b) the sweep of P is consuming, then the function call is consuming. (The term enclosing node is defined above.)

  9. Otherwise, the function call is motionless.

Note:

The following notes apply to the above rules with matching numbers:

  1. This rule prevents the accumulator name being computed by reading the streamed source document. This is disallowed primarily because there is no conceivable use case for doing it.

  2. If the context posture is grounded, then the target of the accumulator is not a streamed node, so no streaming restrictions apply.

  3. If the context item is a childless node (such as a text node), then both the pre-descent and post-descent values of the accumulator can be computed before evaluating any user-written constructs that access this node; there are therefore no constraints on where a call to accumulator-after can appear.

  4. This rule ensures that when computing the pre-descent value of an accumulator for a particular streamed node, the post-descent values of accumulators for that node are not available.

  5. This rule states that the post-descent value of an accumulator is allowed to depend on the post-descent values of other accumulators for the same node. There is a rule preventing cycles [ERR XTDE3400] XT40.

  6. This rule prevents the use of the function (when applied to a streamed node) in contexts like the use attribute of xsl:key. It allows its use in the attributes of an instructionXT or literal result elementXT, or in a text value templateXT. It does not allow use in an xsl:sort or xsl:param element, as these elements do not form part of a sequence constructor (see [XSLT 4.0] section 5.7 Sequence Constructors).

  7. This rule prevents the use of the function (when applied to a streamed node) in contexts such as predicates, or the right-hand side of the / operator. The focus for evaluation of the function must be the same as the focus for a containing sequence constructor. Sequence constructors are treated differently from all other constructs for this purpose in that their operands (the contained instructions) are treated as ordered: in conjunction with the next rule, this rule is assuming that instructions in a sequence constructor that follow a consuming instruction are evaluated after the consuming instruction and therefore have access to the post-descent accumulator value.

  8. This rule is subtle, and has a number of consequences. In these notes, the term instruction should be read as including all nodes making up a sequence constructor, including XSLT instructions, extension instructions, literal result elements, and text nodes containing text value templates.

    • In a sequence constructor that contains a consuming instruction such as <xsl:apply-templates/>, it allows any number of calls on accumulator-after to appear in instructions that follow the call on <xsl:apply-templates/>.

    • In such a sequence constructor it prevents a call on accumulator-after from appearing in an instruction that precedes the <xsl:apply-templates/>, because there would then be two consuming instructions.

    • In a sequence constructor that contains calls on accumulator-after, and contains no other consuming construct, the first instruction that contains a call on accumulator-after is consuming (unless it contains more than one such call, in which case it is free-ranging), and subsequent instructions containing such a call are motionless. So it is possible to have two or more calls on accumulator-after provided they appear in different instructions, which allows the analysis to assume an order of execution.

    • It prevents a call on accumulator-after from appearing in the same instruction as another consuming construct: for example it disallows concat(child::p, accumulator-after('a')). This rule preserves the ability to evaluate the arguments of the concat function in any order.

    • It disallows a call on accumulator-after from appearing in a sequence constructor that is required to be motionless, for example within xsl:sort.

    • The reference to a “preceding sibling node within the same sequence constructor” is carefully worded to ensure that preceding siblings among the children of xsl:fork are not taken into account; the children of xsl:fork are sibling instructions, but do not constitute a sequence constructor. The term also excludes elements such as xsl:param and xsl:sort that may precede a sequence constructor but are not part of it.

  9. The final rule states that if none of the previous rules apply, the function is considered motionless. This applies when the call to accumulator-after appears after a consuming instruction within the same sequence constructor.

    Note also that a call to accumulator-after can safely appear within a construct such as a named template or (non-streamable) stylesheet function; this is safe because the rules ensure that in such situations, the context item cannot be a streamed node.

Dynamic invocation of accumulator-after is covered by the rules in [XSLT 4.0] section 10.3.5 Dynamic Access to Functions. These rules ensure that a function item cannot include a streamed node in its closure; circumventing the streamability rules for accumulator-after by making a dynamic call is therefore not possible.
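The interaction of rules 8 and 9 can be illustrated with a minimal sketch. The mode name s, the accumulator name figure-count, and the element names chapter and figure are assumptions made for this illustration only; chapters are assumed not to be nested.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical names: mode "s", accumulator "figure-count" -->
  <xsl:mode name="s" streamable="yes" use-accumulators="figure-count"/>

  <!-- Counts figure elements, restarting at each chapter -->
  <xsl:accumulator name="figure-count" as="xs:integer"
                   initial-value="0" streamable="yes">
    <xsl:accumulator-rule match="chapter" select="0"/>
    <xsl:accumulator-rule match="figure" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template match="chapter" mode="s">
    <chapter>
      <!-- The only consuming instruction in this sequence constructor -->
      <xsl:apply-templates mode="s"/>
      <!-- This literal result element follows a consuming sibling in the
           same sequence constructor, so rule 8 does not apply and the
           call is motionless under rule 9 -->
      <figures>
        <xsl:value-of select="accumulator-after('figure-count')"/>
      </figures>
    </chapter>
  </xsl:template>

</xsl:stylesheet>
```

If the figures element were moved in front of xsl:apply-templates, the call would become consuming under rule 8, and the sequence constructor would then contain two consuming instructions.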

12.8.2 Streamability of the accumulator-before Function

See also 11 Streamable Accumulators.

The posture and sweep of the function call are assessed as follows:

  1. If the argument to accumulator-before is motionless, the function call is grounded and motionless.

  2. Otherwise, the function call is roaming and free-ranging.
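As a minimal sketch of rule 1, the following stylesheet calls accumulator-before with a string literal as its argument, which is motionless, so the call is grounded and motionless and may precede a consuming instruction. The mode name s, the accumulator name figure-count, and the element names are assumptions made for this illustration.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical names: mode "s", accumulator "figure-count" -->
  <xsl:mode name="s" streamable="yes" use-accumulators="figure-count"/>

  <xsl:accumulator name="figure-count" as="xs:integer"
                   initial-value="0" streamable="yes">
    <xsl:accumulator-rule match="figure" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template match="figure" mode="s">
    <!-- The argument is a literal, hence motionless: the call is
         grounded and motionless, and can appear before the
         consuming xsl:apply-templates -->
    <figure number="{accumulator-before('figure-count')}">
      <xsl:apply-templates mode="s"/>
    </figure>
  </xsl:template>

</xsl:stylesheet>
```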

12.8.3 Streamability of the current Function

The sweep and posture of a call to the current function are determined as follows:

  1. If the call appears within a pattern, then climbing and motionless.

    Note:

    The call to current will always be within a predicate of the pattern. The use of climbing posture here allows predicates such as [@class = current()/@class], while disallowing downwards navigation from the node returned by the function.

  2. Otherwise, let E be the outermost containing XPath expression of the call to the current function.

  3. If the context posture of E is grounded, then motionless and grounded.

  4. If the path in the expression tree that connects the call on current to E (excluding E itself) contains an expression that is a higher-order operand of its parent expression, then motionless and climbing.

    Note:

    Many common uses of the current function, such as //p[@class=current()/@class], fall into this category: a predicate is a higher-order operand of its containing filter expression.

    The use of climbing posture here might seem unrelated to its usual connection with the ancestor axis. The explanation (apart from the fact that it happens to produce the right results) lies in the fact that at the point where the current call is evaluated, the node it returns will always be an ancestor-or-self of the context node, as a consequence of the fact that the containing XPath expression is required to be either motionless or consuming.

    The effect of the rule is to allow expressions such as //*[name() = name(current())] or //*[@ref = current()/@id].

  5. Otherwise, the posture is the context posture, and the sweep is motionless.

12.8.4 Streamability of the current-group Function

The sweep and posture of a call C to the current-group function are as follows:

  1. If all the following conditions are true:

    1. C has a containing xsl:for-each-group instruction (call it F)

    2. The path in the construct tree that connects C to the sequence constructor forming the body of F is such that no child construct is a higher-order operand of its parent

    3. The focus-setting container of C is F

    then the sweep and posture of C are the sweep and posture of the select expression of F.

  2. Otherwise, roaming and free-ranging.

Note:

Informally, for streamed evaluation to be possible, a call to current-group must not appear in a construct that is evaluated repeatedly. For example, the expression for $i in 1 to 10 return current-group() would not be streamable.
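A minimal sketch of a call that satisfies all three conditions of rule 1 is shown below. The mode name s and the element and attribute names (account, transaction, date) are assumptions made for this illustration.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name "s" -->
  <xsl:mode name="s" streamable="yes"/>

  <xsl:template match="account" mode="s">
    <account>
      <!-- The grouping key reads only an attribute, so it is motionless -->
      <xsl:for-each-group select="transaction" group-adjacent="@date">
        <day date="{current-grouping-key()}">
          <!-- current-group() is evaluated once per group, its
               focus-setting container is the xsl:for-each-group, and no
               construct on the path to the body is a higher-order
               operand; it therefore takes the posture and sweep of
               select="transaction" -->
          <xsl:copy-of select="current-group()"/>
        </day>
      </xsl:for-each-group>
    </account>
  </xsl:template>

</xsl:stylesheet>
```

Replacing the xsl:copy-of with an expression such as for $i in 1 to 10 return current-group() would place the call in a higher-order operand, so rule 2 would apply instead.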

12.8.5 Streamability of the current-grouping-key Function

A call to the current-grouping-key function is grounded and motionless.

12.8.6 Streamability of the current-merge-group Function

A call to the current-merge-group function is grounded and motionless.

Note:

This is because the nodes to be merged are always snapshots, and therefore grounded: see 10 Streamable Merging.

12.8.9 Streamability of the distinct-ordered-nodes Function

The posture and sweep of a call to the distinct-ordered-nodes function are the same as the posture and sweep of the first argument.

12.8.10 Streamability of the fold-left Function

The function call fold-left($seq, $zero, $f) follows the general streamability rules, with the first argument $seq having type-determined usage based on the type of the second argument of the function supplied as $f.

For example, given the call fold-left(/*/transaction, 0, fn($x as xs:decimal, $y as xs:decimal) as xs:decimal { $x + $y }), the operand usage of the argument /*/transaction is determined by the declared type of $y, namely xs:decimal. Since this is an atomic type, the type-determined usage is absorption. Applying this to the general streamability rules, the function call is grounded and consuming.

12.8.11 Streamability of the fold-right Function

The function follows the general streamability rules, with the first argument having operand usage navigation to reflect the fact that the supplied sequence is processed in reverse order.

Note:

The same considerations apply as for the reverse function: see 12.8.20 Streamability of the reverse Function.

12.8.12 Streamability of the foot Function

The posture and sweep of the expression foot($x) are defined to be the same as the posture and sweep of the equivalent expression $x[position()=last()]. See 12.8.17 Streamability of the last Function.

12.8.13 Streamability of the for-each Function

The function call for-each($seq, $f) follows the general streamability rules, with the first argument $seq having type-determined usage based on the type of the (single) argument of the function supplied as $f.

For example, given the call for-each(/*/transaction, fn($x as xs:decimal) as xs:decimal {abs($x)}), the operand usage of the argument /*/transaction is determined by the declared type of $x, namely xs:decimal. Since this is an atomic type, the type-determined usage is absorption. Applying this to the general streamability rules, the function call is grounded and consuming.

Note:

In practice, the for-each function is streamable if either (a) the supplied sequence is grounded, or (b) the supplied function is statically known to atomize its argument.

12.8.14 Streamability of the for-each-pair Function

The function call for-each-pair($seq1, $seq2, $f) follows the general streamability rules, where:

  1. The first argument $seq1 has type-determined usage based on the type of the first argument of the function supplied as $f.

  2. The second argument $seq2 has type-determined usage based on the type of the second argument of the function supplied as $f.

Note:

In practice, the for-each-pair function is streamable provided (a) at most one of the input sequences is consuming, and (b) either (i) that input sequence is grounded, or (ii) the supplied function is statically known to atomize the relevant argument.

If it is necessary to combine two sequences that are both streamed, consider using xsl:merge.

12.8.15 Streamability of the function-lookup Function

See [XSLT 4.0] section 10.3.5 Dynamic Access to Functions for special rules that relate to streamability of calls to the function-lookup function.

With the caveats given there, the function follows the general streamability rules, for a function with two arguments that both have operand usage absorption.

12.8.16 Streamability of the innermost Function

The function follows the general streamability rules, with the first argument having operand usage navigation. This is to reflect the fact that the processing is not strictly sequential: it cannot be determined that a node is part of the result sequence of innermost until all its descendants have been read.

12.8.17 Streamability of the last Function

If the context posture for a call on the last function is striding, crawling, or roaming, then the posture of the function is roaming, and the sweep is free-ranging.

In all other cases the function is grounded and motionless.

Note:

The cases where last can be used without affecting streamability are where the context item is either grounded or climbing. The latter condition makes expressions like ancestor::*[@xml:space][last()] streamable.

There are special rules restricting the use of last in the predicate of a pattern: see 12.9 Classifying Patterns.

Note that there are no restrictions preventing the use of last() when the context posture is grounded. The implications of this are discussed in 12.3 Grounded Consuming Constructs. In the case where the sequence being processed is delivered by a consuming expression, using last() may result in this sequence being buffered in memory.
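The contrast between a striding and a grounded context posture can be sketched as follows. The mode name s and the element names are assumptions made for this illustration, and the classification of the second expression reflects one reading of the rules for filter expressions, which are defined elsewhere in this specification.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name "s" -->
  <xsl:mode name="s" streamable="yes"/>

  <xsl:template match="transactions" mode="s">
    <last-transaction>
      <!-- Not guaranteed streamable: in transaction[last()] the
           predicate is evaluated with a striding context posture,
           so the call on last() is roaming and free-ranging -->

      <!-- Here the filtered sequence consists of copies, so the
           context posture for last() is grounded and the call is
           grounded and motionless. The copies may all be buffered
           in memory before the last one can be identified -->
      <xsl:copy-of select="(transaction/copy-of())[last()]"/>
    </last-transaction>
  </xsl:template>

</xsl:stylesheet>
```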

12.8.18 Streamability of the outermost Function

The single argument to this function has operand usage transmission.

The streamability of the function call follows the general streamability rules with one exception: if the posture of the argument is crawling, then the posture of the result is striding.

Note:

There are cases where the streaming rules allow the construct outermost(//para) but do not allow //para; the function can therefore be useful in cases where it is known that para elements will not be nested, as well as cases where the application actually wishes to process all para elements except those that are nested within another.

By contrast, the innermost function offers no streaming benefits. Although it delivers a subset of the input nodes as its result, in the correct order, it is classed as navigational because it needs to look ahead in the input stream before deciding whether a node can be included in the result.
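A minimal sketch of the use of outermost is shown below. The mode name s and the element names are assumptions made for this illustration. Whether the same instruction written with select="//para" would be rejected depends on the rules for crawling operands, which are defined elsewhere in this specification.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name "s" -->
  <xsl:mode name="s" streamable="yes"/>

  <xsl:template match="/" mode="s">
    <paras>
      <!-- //para alone is crawling; wrapping it in outermost() changes
           the posture of the result to striding, so the selected
           nodes are known not to overlap -->
      <xsl:for-each select="outermost(//para)">
        <!-- Each selected para is consumed exactly once -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </paras>
  </xsl:template>

</xsl:stylesheet>
```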

12.8.19 Streamability of the position Function

The position function follows the general streamability rules. Since it has no operands, this means it is grounded and motionless.

Note:

Within an expression, there are no special difficulties in evaluating the position function.

It does have special treatment within a predicate of a patternXT, however: a pattern is not motionless if it contains a call to position, as explained in 12.9 Classifying Patterns.

12.8.20 Streamability of the reverse Function

The reverse function follows the general streamability rules, with its operand classified as having operand usage navigation.

Note:

This means in effect that a call on reverse is not streamable unless the operand is grounded. This may cause a few surprises:

  • The expression reverse(/*/emp/copy-of()) is considered streamable, although all the emp elements will typically need to be in memory at the same time. The explanation here is that the streamability rules do not attempt to restrict the amount of memory used for data that is explicitly copied by use of a function such as copy-of.

  • The expression reverse(ancestor::*)/name() is considered non-streamable, because the operand is not grounded. This problem can be circumvented by rewriting the expression as reverse(ancestor::*/name()).

12.8.21 Streamability of the root Function

The zero-argument function root() is equivalent to root(.).

Given the expression root(X), if the static type of X is U{document-node()}, and if its posture is striding, then root(X) is rewritten as X. Otherwise, it is rewritten as head((X)/ancestor-or-self::node()). Streamability analysis is then applied to the rewritten expression.

Note:

Because path expressions starting with / are rewritten to use the root function, this ensures that a leading slash is ignored if the context item is a document node, for example within a template rule with match="/". This improves streamability, because upwards navigation followed by downward navigation is disallowed.
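The effect of the rewrite can be sketched for a template rule with match="/". The mode name s and the element names catalog and book are assumptions made for this illustration.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name "s" -->
  <xsl:mode name="s" streamable="yes"/>

  <xsl:template match="/" mode="s">
    <books>
      <!-- /catalog/book is first treated as root(.)/catalog/book.
           The context item is a document node with striding posture,
           so root(.) is rewritten as "." and the expression is
           analysed as ./catalog/book, a purely downward selection -->
      <xsl:apply-templates select="/catalog/book" mode="s"/>
    </books>
  </xsl:template>

  <xsl:template match="book" mode="s">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

In a template rule matching an element, the same leading slash would instead be rewritten using ancestor-or-self::node(), giving upward navigation followed by downward navigation.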

12.8.22 Streamability of the trunk Function

The posture and sweep of the expression trunk($x) are defined to be the same as the posture and sweep of the equivalent expression $x[position()!=last()]. See 12.8.17 Streamability of the last Function.

12.9 Classifying Patterns

Note:

Patterns differ from other kinds of construct in that they are not composable in the same way. It is best to think of a pattern as specialized syntax for a function that takes an item as its argument and returns a boolean: true if the pattern matches the item, otherwise false. The static type of a pattern is therefore taken as U{xs:boolean} (this is not to be confused with the type of the items that the pattern is capable of matching).

The sweep of a patternXT is either motionless or free-ranging. (Although there are patterns that could in principle be evaluated by consuming the element node that they match, these are of no interest in the analysis, so they are classified as free-ranging.)

The posture of a patternXT is grounded if the pattern is motionless, or roaming otherwise. (This reflects the fact that a pattern always returns a boolean result; it never returns a node in a streamed document.)

Informally, a motionless pattern is one that can be evaluated by a streaming processor when the input stream is positioned at the start of the node being matched, without advancing the input stream.

A pattern is motionless if and only if it satisfies all the following conditions:

  1. The pattern does not contain a RootedPathXT.

  2. If the pattern contains predicates, then every top-level Predicate in the pattern satisfies all the following conditions:

    1. The expression immediately contained in the predicate is motionless, when assessed with a context posture of striding, and a context item type set to the static type of the expression to which the predicate applies, determined using the rules in 3.5 Determining the Static Type of a Construct.

    2. The predicate is a non-positional predicate.

    The use of the term top-level in this rule means that predicates that are nested within other predicates do not themselves have to be non-positional, though they may play a role in the analysis of top-level predicates.

  3. The pattern does not contain (at any depth) a variable reference that is bound to a streaming parameter. (See 12.7.14 Streamability of Static Function Calls).

[Definition: A predicate is a non-positional predicate if it satisfies both of the following conditions:

  1. The predicate does not contain a function call or named function reference to any of the following functions, unless that call or reference occurs within a nested predicate:

    1. position

    2. last

    3. function-lookup.

    Note:

    The exception for nested predicates is there to ensure that patterns such as match="p[@code = $status[last()]]" are not disqualified.

  2. The expression immediately contained in the predicate is a non-numeric expression. An expression is non-numeric if the intersection of its static type (see 3.5 Determining the Static Type of a Construct) with U{xs:decimal, xs:double, xs:float} is U{}.

]

Note:

A non-positional predicate can be evaluated by considering each item in the filtered sequence independently; the result never depends on the position of other items in the sequence or the length of the sequence.

A pattern that is not motionless is classified as free-ranging.

The following list shows examples of motionless patterns:

  • /

  • *

  • /*

  • p

  • p|q

  • p/q

  • p[@status='red']

  • p[base-uri()]

  • p[@class or @style]

  • p[@status]

  • p[@status = $status-codes[1]]

  • p[@class | @style]

  • p[contains(@class, ':')]

  • p[substring-after(@class, ':')]

  • p[ancestor::*[@xml:lang]]

  • text()[starts-with(., '$')]

  • @price

  • @price[starts-with(., '$')]

  • //p/text()[. = 'Introduction']

  • document-node(element(html)) (Note: this is classified as motionless even though testing a document node against the pattern might require a small amount of look-ahead.)

The following list shows examples of patterns that are not motionless, explaining why not:

  • id('abc') (contains a RootedPath)

  • $doc//p (contains a RootedPath)

  • p[b] (the predicate is not motionless)

  • p[. = 'Introduction'] (the predicate is not motionless)

  • p[starts-with(., '$')] (the predicate is not motionless)

  • p[preceding-sibling::p[1] = ''] (the predicate is not motionless)

  • p[1] (contains a positional predicate: return type is numeric)

  • p[$pnum + 1] (contains a positional predicate: return type is numeric)

  • p[data(@status)] (contains a positional predicate: return type is potentially numeric)

  • p[position() gt 2] (contains a positional predicate: calls position())

  • p[last()] (contains a positional predicate: calls last())
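As a minimal sketch, the following stylesheet uses two of the motionless patterns listed above as match patterns of template rules in a streamable mode. The mode name s and the result element names alert and price are assumptions made for this illustration.

```xml
<xsl:stylesheet version="4.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name "s" -->
  <xsl:mode name="s" streamable="yes" on-no-match="shallow-copy"/>

  <!-- Motionless: the predicate reads only an attribute, which is
       available when the stream is positioned at the start tag -->
  <xsl:template match="p[@status='red']" mode="s">
    <alert>
      <xsl:apply-templates mode="s"/>
    </alert>
  </xsl:template>

  <!-- Motionless: a text node has no children, so its string value
       can be tested without advancing the input stream -->
  <xsl:template match="text()[starts-with(., '$')]" mode="s">
    <price>
      <xsl:value-of select="."/>
    </price>
  </xsl:template>

  <!-- Patterns such as p[b] or p[1] are not motionless and could not
       be used as match patterns in this mode -->

</xsl:stylesheet>
```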

12.10 Examples of Streamability Analysis

The examples in this section are intended to illustrate how the streamability rules are applied “top down” to establish whether template rules are guaranteed streamable.

Example: A recursive-descent template rule

Consider the following template rule, where mode s is defined with streamable="yes":

<xsl:template match="para" mode="s">
  <div class="para">
    <xsl:apply-templates mode="s"/>
  </div>
</xsl:template>

The processor is required to establish that this template meets the streamability rules. Specifically, as stated in 5 Streamable Templates, it must satisfy three conditions:

  1. The match pattern must be motionless.

  2. The body of the template rule must be grounded.

  3. The initializers of any template parameters must be motionless.

The third condition is satisfied trivially because there are no parameters.

The first condition depends on the rules for assessing patterns, which are given in 12.9 Classifying Patterns. This pattern is motionless because (a) it does not contain a RootedPath, and (b) it contains no predicates.

So it remains to determine that the body of the template is grounded. The proof of this is as follows:

  1. The sequence constructor forming the body of the template is assessed according to the rules in 12.4 Classifying Sequence Constructors, which tell us that there is a single operand (the <div> literal result elementXT) which has operand usage U = transmission.

  2. The assessment of the sequence constructor uses the general streamability rules. These rules require us to determine the type T, sweep S, posture P, and usage U of each operand. We have already established that there is a single operand, with U = transmission. Section 3.5 Determining the Static Type of a Construct tells us that for all instructions, we can take T = U{*}. The posture P and sweep S of the literal result element are established as follows:

    1. The rules for literal result elements (specifically the <div> element) are given in 12.5.1 Streamability of Literal Result Elements. This particular literal result element has only one operand (its contained sequence constructor), with operand usage U = absorption.

    2. The general streamability rules again apply. Again the static type T of the operand is U{*}, and we need to determine the posture P and sweep S.

    3. To determine the posture and sweep of this sequence constructor (the one that contains the xsl:apply-templates instruction) we refer again to the general streamability rules.

      1. The sequence constructor has a single operand (the xsl:apply-templates instruction); again U = transmission, T = U{*}.

      2. The posture P and sweep S of the xsl:apply-templates instruction are established as follows:

        1. The rules that apply are in 12.5.5 Streamability of xsl:apply-templates.

        2. Rule 1 does not apply because the select expression (which defaults to child::node()) is not grounded. This is a consequence of the rules in 12.7.9 Streamability of Axis Steps, specifically:

          1. The context posture of the axis step is established by the template rule as a whole, as striding.

          2. Therefore rules 1 and 2 do not apply.

          3. The statically inferred context item type is derived from the match pattern (match="para"). This gives a type of U{element()}. The child axis for element nodes is not necessarily empty, so rule 3 does not apply.

          4. Rule 4 does not apply because there are no predicates.

          5. So the posture and sweep of the axis step child::node() are given by the table in rule 5. The entry for (context posture = striding, axis = child) gives a posture of striding and a sweep of consuming.

          6. So the select expression is not grounded. (The same result can be reached intuitively: an expression that selects streamed nodes will never be grounded.)

        3. Rule 2 does not apply because there is no xsl:sort element.

        4. Rule 3 does not apply because the mode is declared with streamable="yes".

        5. So the posture P and sweep S of the xsl:apply-templates instruction are established by the general streamability rules, as follows:

          1. There is a single operand, the implicit select="child::node()" expression, with usage U = absorption.

          2. We have already established that for this operand, the posture P = striding and the sweep S = consuming.

          3. By the rules in 3.5 Determining the Static Type of a Construct, the type T of the select expression is node().

          4. In the general streamability rules, the adjusted sweep S′ for an operand with (P = striding, U = absorption) is consuming.

          5. Rule 2(d) then applies, so the xsl:apply-templates instruction is consuming and grounded.

      3. So the sequence constructor that contains the xsl:apply-templates instruction has one operand with U = transmission, T = item(), P = grounded, S = consuming. Rule 2(d) of the general streamability rules applies, so the sequence constructor itself has P = grounded, S = consuming.

    4. So the literal result element has one operand with U = absorption, T = item(), P = grounded, S = consuming. Rule 2(d) of the general streamability rules applies, so the literal result element has P = grounded, S = consuming.

  3. So the sequence constructor containing the literal result element has one operand with U = transmission, T = item(), P = grounded, S = consuming. Rule 2(d) of the general streamability rules applies, so this sequence constructor itself has P = grounded, S = consuming.

  4. So we have established that the sequence constructor forming the body of the template rule is grounded.

Therefore, since the other conditions are also satisfied, the template is guaranteed-streamable.

The analysis presented above could have been simplified by taking into account the fact that the streamability properties of a sequence constructor containing a single instruction are identical to the properties of that instruction. This simplification will be exploited in the next example.

 

Example: An aggregating template rule

Consider the following template rule, where mode s is defined with streamable="yes":

<xsl:template match="transactions[@currency='USD']" mode="s">
  <total><xsl:value-of select="sum(transaction/@value)"/></total>
</xsl:template>

Again, as stated in 5 Streamable Templates, it must satisfy three conditions:

  1. The match pattern must be motionless.

  2. The body of the template rule must be grounded.

  3. The initializers of any template parameters must be motionless.

The third condition is satisfied trivially because there are no parameters.

The first condition depends on the rules for assessing patterns, which are given in 12.9 Classifying Patterns. This pattern is motionless because (a) it is not a RootedPath, and (b) every predicate is motionless and non-positional. The analysis that proves the predicate is motionless and non-positional proceeds as follows:

  1. First establish that the expression @currency='USD' is motionless, as follows:

    1. The predicate is a general comparison (GeneralComp) which follows the general streamability rules.

    2. There are two operands: an AxisStep with a defaulted ForwardAxis, and a Literal. Both operand roles are absorption.

    3. The left-hand operand has type T = attribute(). Its posture and sweep are determined by the rules in 12.7.9 Streamability of Axis Steps. The context posture is striding, so the posture and sweep are determined by the entry in the table (rule 5) with context posture = striding, axis = attribute: that is, the result posture is climbing and the sweep is motionless.

    4. The right-hand operand, being a literal, is grounded and motionless.

    5. In the general streamability rules, rule 2(e) applies, so the predicate is grounded and motionless.

  2. Now establish that the expression @currency='USD' is non-positional, as follows:

    1. Rule 1 is satisfied: the predicate does not call position, last, or function-lookup.

    2. Rule 2 is satisfied: the expression @currency='USD' is non-numeric. The static type of the expression is determined using the rules in 3.5 Determining the Static Type of a Construct as U{xs:boolean}, and this has no intersection with U{xs:decimal, xs:double, xs:float}.

So both conditions in 12.9 Classifying Patterns are satisfied, and the pattern is therefore motionless.

It remains to show that the body of the template rule is grounded. The proof of this is as follows. Unlike the previous example, the analysis is shown in simplified form; in particular the two sequence constructors which each contain a single instruction are ignored, and replaced in the construct tree by their contained instruction.

  1. We need to show that the <total> literal result elementXT is grounded.

  2. The rules that apply are in 12.5.1 Streamability of Literal Result Elements.

  3. These rules refer to the general streamability rules. There is one operand, the xsl:value-of child element, which has operand usage U = absorption, and type T = item().

  4. So we need to determine the posture and sweep of the xsl:value-of instruction.

    1. The rules are given in 12.5.42 Streamability of xsl:value-of.

    2. The general streamability rules apply. There is one operand, the expression sum(transaction/@value), which has operand usage U = absorption.

    3. The type T of this operand is the return type defined in the signature of the sum function, that is, xs:anyAtomicType.

    4. The posture P and sweep S are established as follows:

      1. The rules that apply to the call on sum are given in 12.8 Classifying Calls to Built-In Functions.

      2. The relevant proforma is fn:sum(A), indicating that the general streamability rules apply, and that there is a single operand with usage U = absorption.

      3. The type T of the operand transaction/@value is determined (by the rules in 3.5 Determining the Static Type of a Construct) as attribute().

      4. The posture P and sweep S of the operand transaction/@value are determined by the rules in 12.7.8 Streamability of Path Expressions, as follows:

        1. The expression is expanded to child::transaction/attribute::value.

        2. The posture and sweep of the left-hand operand child::transaction are determined by the rules in 12.7.9 Streamability of Axis Steps, as follows:

          1. The context posture is striding, because the focus-setting container is the template rule itself.

          2. The context item type is element(), based on the match type of the pattern match="transactions[@currency='USD']".

          3. Rules 1 and 2 do not apply because the context posture is striding.

          4. Rule 3 does not apply because the child axis applied to an element node is not necessarily empty.

          5. Rule 4 does not apply because there are no predicates.

          6. Rule 5 applies, and the table entry with context posture = striding, axis = child gives a result posture of striding and a sweep of consuming.

        3. The posture of the relative path expression child::transaction/attribute::value is therefore the posture of its right-hand operand attribute::value, assessed with a context posture of striding. This is determined by the rules in 12.7.9 Streamability of Axis Steps, as follows:

          1. The context posture, as we have seen, is striding.

          2. The context item type is element(), based on the type of the left-hand operand child::transaction.

          3. Rules 1 and 2 do not apply because the context posture is striding.

          4. Rule 3 does not apply because the attribute axis applied to an element node is not necessarily empty.

          5. Rule 4 does not apply because there are no predicates.

          6. Rule 5 applies, and the table entry with context posture = striding, axis = attribute gives a result posture of climbing and a sweep of motionless.

        4. The posture of the relative path expression child::transaction/attribute::value is therefore climbing.

        5. The sweep of the relative path expression child::transaction/attribute::value is the wider of the sweeps of its two operands, namely consuming and motionless. That is, it is consuming.

      5. So the first and only operand to the call on sum() has U = absorption, T = attribute(), P = climbing, and S = consuming.

      6. Rule 1(b) of the general streamability rules computes the adjusted sweep S′. Rule 1(b)(iii)(A) applies, so the effective operand usage U′ is inspection. Rule 1(b)(iii)(B) then computes the adjusted sweep from the table entry for P = climbing, U′ = inspection; this shows S′ = S, that is, consuming.

      7. Rule 2(d) now applies, so the call on sum() is grounded and consuming.

    5. Since the xsl:value-of instruction has one operand with U = absorption, T = xs:anyAtomicType, P = grounded, and S = consuming, rule 2(d) again applies, and the xsl:value-of instruction is grounded and consuming.

  5. Since the literal result element has one operand with U = absorption, T = item(), P = grounded, and S = consuming, rule 2(d) again applies, and the literal result element is grounded and consuming.

  6. Therefore the body of the template rule is grounded, and since the other conditions are also satisfied, the template rule is guaranteed-streamable.
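
For orientation, the following sketch shows how the template rule analysed above might sit inside a complete stylesheet. Only the template rule itself comes from the example; the version attribute, the on-no-match setting, and the surrounding stylesheet element are assumptions made for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Mode s is declared streamable, as the example states.
       Assumption: on-no-match="shallow-skip" is chosen only so that
       unmatched nodes produce no output in this small sketch. -->
  <xsl:mode name="s" streamable="yes" on-no-match="shallow-skip"/>

  <!-- The rule analysed above: a motionless pattern and a body whose
       only consuming operand is sum(transaction/@value). -->
  <xsl:template match="transactions[@currency='USD']" mode="s">
    <total><xsl:value-of select="sum(transaction/@value)"/></total>
  </xsl:template>

</xsl:stylesheet>
```

By way of contrast, a body such as sum(transaction/@value) div count(transaction) has two operands that each consume the transaction children, so it would not be expected to pass the same analysis.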

 

Example: Streamed Grouping

Consider the following code, which is designed to process a transaction file containing transactions in chronological order, and output the total value of the transactions for each day.

<xsl:template name="go">
  <out>
    <xsl:source-document streamable="yes" href="transactions.xml">
      <xsl:for-each-group select="/account/transaction" 
                          group-adjacent="xs:date(@timestamp)">
         <total date="{current-grouping-key()}" value="{sum(current-group()/@value)}"/>
      </xsl:for-each-group>
    </xsl:source-document>
  </out>
</xsl:template>

The rules for xsl:source-document say that the instruction is guaranteed-streamable if the contained sequence constructorXT is grounded, and the task of streamability analysis is to prove that this is the case. As in the previous example, we will take a short-cut by making the assumption that a sequence constructor containing a single instruction can be replaced by that instruction in the construct tree.

So the task is to show that the xsl:for-each-group instruction is grounded, which we can do as follows:

  1. The relevant rules are to be found in 12.5.19 Streamability of xsl:for-each-group.

    Note:

    Rule numbers may be different in a version of the specification with change markings.

  2. Rule 1 applies only if the select expression is grounded. It is easy to see informally that this is not the case (an expression that returns streamed nodes is never grounded). More formally:

    1. The select expression is a path expression; the rules in 12.7.8 Streamability of Path Expressions apply.

    2. The expression is rewritten as ((root(.) treat as document-node())/child::account)/child::transaction

    3. The left-hand operand (root(.) treat as document-node())/child::account is also a path expression, so the rules in 12.7.8 Streamability of Path Expressions apply recursively:

      1. The left-hand operand root(.) treat as document-node() follows the rules for a TreatExpr in 12.7 Classifying Expressions; the proforma T treat as TYPE indicates that the general streamability rules apply with a single operand having usage transmission.

      2. This single operand root(.) follows the rules in 12.8.21 Streamability of the root Function. The item type of the operand . is the context item type, which is the type established by the xsl:source-document instruction, namely document-node(). Under these conditions root(.) is rewritten as ., so the posture is the context posture established by the xsl:source-document instruction, namely striding. The sweep is motionless.

      3. The posture and sweep of the expression root(.) treat as document-node() are the same as the posture and sweep of root(.), namely striding and motionless.

      4. The right-hand operand child::account is governed by the rules in 12.7.9 Streamability of Axis Steps. The context posture is striding, and the axis is child, so the result posture is striding and the sweep is consuming.

      5. The posture of the path expression is the posture of the right-hand operand, that is striding, and its sweep is the wider sweep of the two operands, that is consuming.

    4. Returning to the outer path expression, the posture of the right hand operand child::transaction is striding, and its sweep is consuming.

    5. So the posture of the select expression as a whole is the posture of the right hand operand, that is striding; and its sweep is the wider of the sweeps of the operands, which is consuming.

  3. Rule 2 does not apply: there is no group-by attribute.

  4. Rule 3 does not apply: there is a group-adjacent attribute, but it is motionless. The reasoning is as follows:

    1. The value is a call to the constructor function xs:date. The rules in 12.7.14 Streamability of Static Function Calls apply. There is a single operand, whose required type is atomic, so the operand usage is absorption.

    2. These rules refer to the general streamability rules, so we need to determine the context item type, posture, and sweep of the operand expression @timestamp. This is done as follows:

      1. The expression is an AxisStep, so the relevant rules are in 12.7.9 Streamability of Axis Steps.

      2. The context posture is the posture of the controlling operand of the focus-setting container, that is, the select expression of the containing xsl:for-each-group instruction, which as established above is striding. The context item type is similarly the inferred type of the select expression, and is element().

      3. Rules 1 and 2 do not apply because the context posture is striding.

      4. Rule 3 does not apply because the attribute axis for an element node is not necessarily empty.

      5. Rule 4 does not apply because there is no predicate.

      6. So the posture and sweep of the expression @timestamp are given by the table in Rule 5 as climbing and motionless.

    3. Returning to the general streamability rules for the expression xs:date(@timestamp), the operand @timestamp has U = absorption, T = attribute(), P = climbing, S = motionless.

    4. Under Rule 1(b)(iii)(A), because T = attribute(), the operand usage U′ becomes inspection.

    5. Under Rule 1(b)(iii)(B), S′ = S = motionless.

    6. Under Rule 2(e), the expression xs:date(@timestamp) is grounded and motionless.

  5. Rule 4 (under xsl:for-each-group) does not apply, because there is no xsl:sort child.

  6. So Rule 5 applies. This relies on knowing the posture of the sequence constructor contained in the xsl:for-each-group instruction: that is, the posture of the total literal result elementXT. This is calculated as follows:

    1. The rules that apply are in 12.5.1 Streamability of Literal Result Elements. The general streamability rules apply; there are two operands, the attribute value templates {current-grouping-key()} and {sum(current-group()/@value)}, and in each case the usage is absorption. We can simplify the analysis by observing that the empty sequence constructorXT contained in the literal result element can be ignored, since it is grounded and motionless.

    2. Consider first the operand {current-grouping-key()}.

      1. Section 12.6 Classifying Value Templates applies. This refers to the general streamability rules; there is a single operand, the expression current-grouping-key(), with usage absorption.

      2. Section 12.8.5 Streamability of the current-grouping-key Function applies. This establishes that the expression is grounded and motionless.

      3. It follows that the operand {current-grouping-key()} is also grounded and motionless.

    3. Now consider the operand {sum(current-group()/@value)}.

    4. Section 12.6 Classifying Value Templates applies. This refers to the general streamability rules; there is a single operand, the expression sum(current-group()/@value), with usage absorption.

    5. The rules for the sum function appear in 12.8 Classifying Calls to Built-In Functions. The proforma is given there as fn:sum(A), which means that the general streamability rules apply, and that the single operand current-group()/@value has usage absorption. So we need to establish the posture, sweep, and type of this expression, which we can do as follows:

      1. The expression is a RelativePathExpr, so section 12.7.8 Streamability of Path Expressions applies.

      2. The expression is expanded to current-group()/attribute::value.

      3. The posture and sweep of the left-hand operand current-group() are defined in 12.8.4 Streamability of the current-group Function. Since all the required conditions are satisfied, the posture of current-group() is the posture of the select expression, that is striding, and its sweep is the sweep of the select expression, that is consuming.

      4. The posture and sweep of the right hand operand @value are defined in 12.7.9 Streamability of Axis Steps. The context posture is the posture of the left-hand operand current-group(), namely striding; the table in Rule 5 applies, giving the result climbing and motionless.

      5. The posture of the RelativePathExpr is the posture of the right hand operand, namely climbing. The sweep of the RelativePathExpr is the wider of the sweeps of its operands, which is consuming.

      6. The type of the expression current-group()/@value is determined using the rules in 3.5 Determining the Static Type of a Construct as attribute().

    6. So the sum function has a single operand with U = absorption, P = climbing, S = consuming, T = attribute().

    7. In the general streamability rules, Rule 1(b)(iii)(A) gives the adjusted usage as U′ = inspection, and Rule 1(b)(iii)(B) gives the adjusted sweep as S′ = S = consuming. Rule 2(d) gives the posture and sweep of the call to sum as grounded and consuming.

  7. So the literal result element has two operands, one of which is grounded and motionless, the other grounded and consuming. Rule 2(d) of the general streamability rules determines that the literal result element is grounded and consuming.

  8. So the content of the xsl:source-document instruction is grounded, which means that the instruction is guaranteed-streamable.
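
To make the grouping concrete, here is a hypothetical input that the streamed grouping example could process. The element and attribute names follow the example, but the values are invented for illustration. The stylesheet is also assumed to bind the xs prefix to the XML Schema namespace.

```xml
<!-- Hypothetical content for transactions.xml. Each timestamp is written
     in xs:date lexical form so that xs:date(@timestamp) succeeds, and
     the transactions appear in chronological order as the example assumes. -->
<account>
  <transaction timestamp="2025-01-02" value="10.00"/>
  <transaction timestamp="2025-01-02" value="5.50"/>
  <transaction timestamp="2025-01-03" value="7.25"/>
</account>
```

With this input the first two transactions form one adjacent group and the third forms another, so the result would contain two total elements, one per date. Each group is complete as soon as a transaction with a different date is encountered, so the processor never needs to retain more than the running state for the current group.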

12.11 Streamability and 1.0 Compatibility Mode

Processing an instructionXT with XSLT 1.0 behavior is not compatible with streaming. More specifically, and notwithstanding anything stated in 3 Streamability Analysis Principles, an instruction that is processed with XSLT 1.0 behavior is roaming and free-ranging, which has the effect that any construct containing such an instruction is not guaranteed-streamable.
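
The following sketch illustrates the kind of construct this rule excludes. It assumes a processor that supports XSLT 1.0 compatibility; the mode name s and the match pattern are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode name="s" streamable="yes"/>

  <xsl:template match="para" mode="s">
    <!-- version="1.0" requests XSLT 1.0 behavior for this instruction.
         Under the rule above the instruction is then roaming and
         free-ranging, so the enclosing template rule is not
         guaranteed-streamable, even though @id alone would be. -->
    <xsl:value-of select="@id" version="1.0"/>
  </xsl:template>

</xsl:stylesheet>
```

Removing the version="1.0" attribute would return the instruction to the ordinary streamability rules.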

13 Conformance

[Definition: A processor that claims conformance with the streaming feature must use streamed processing in cases where (a) streaming is requested (for example by using the attribute streamable="yes" on xsl:mode, or on the xsl:source-document instruction) and (b) the constructs in question are guaranteed-streamable according to this specification.]

A processor that does not claim conformance with the streaming feature is not required to use streamed processing and is not required to determine whether any construct is guaranteed-streamable. Such a processor must, however, implement the semantics of all constructs in the language provided that enough memory is available to perform the processing without streaming.

A processor that conforms with the feature must return the value "yes" in response to the function call system-property('xsl:supports-streaming'); a processor that does not conform with the feature must return the value "no".

Note:

The term streamed processing as used here means the ability to process arbitrarily large input documents without ever-increasing memory requirements.
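
A stylesheet can use the system property described in this section to report whether the streaming feature is available. The sketch below is a minimal illustration; the message texts are invented, and the use of the initial template as an entry point is an assumption.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <xsl:choose>
      <!-- A processor conforming with the streaming feature returns "yes". -->
      <xsl:when test="system-property('xsl:supports-streaming') = 'yes'">
        <xsl:message>Streaming feature: supported.</xsl:message>
      </xsl:when>
      <!-- Any other processor returns "no"; it may still run the
           stylesheet without streaming if memory allows. -->
      <xsl:otherwise>
        <xsl:message>Streaming feature: not supported.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```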

A References

A.1 Normative References

XDM 4.0
XQuery and XPath Data Model (XDM) 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
Functions and Operators 4.0
CITATION: T.B.D.
XML Information Set
XML Information Set (Second Edition), John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204. The latest version is available at http://www.w3.org/TR/xml-infoset.
ISO 15924
ISO (International Organization for Standardization) Information and documentation — Codes for the representation of names of scripts ISO 15924:2004, January 2004. See https://www.iso.org/obp/ui/#!iso:std:iso:15924:ed-1:v1:en.
ISO 15924 Register
Unicode Consortium. Codes for the representation of names of scripts — Alphabetical list of four-letter script codes. See http://www.unicode.org/iso15924/iso15924-codes.html. Retrieved February 2013; continually updated.
ISO 21320
ISO (International Organization for Standardization) Information technology — Document Container File, Part 1: Core ISO 21320-1:2015, October 2015. See https://www.iso.org/obp/ui/#iso:std:iso-iec:21320:-1:ed-1:v1:en.
Serialization 4.0
XSLT and XQuery Serialization 4.0, XSLT Extensions Community Group, World Wide Web Consortium.
RFC 7595
IETF. Guidelines and Registration Procedures for URI Schemes. June 2015. See http://www.ietf.org/rfc/rfc7595.txt
RFC 7159
IETF. The JavaScript Object Notation (JSON) Data Interchange Format. March 2014. See http://www.ietf.org/rfc/rfc7159.txt
UNICODE
Unicode Consortium. The Unicode Standard as updated from time to time by the publication of new versions. See http://www.unicode.org/standard/versions/ for the latest version and additional information on versions of the standard and of the Unicode Character Database. The version of Unicode to be used is implementation-defined, but implementations are recommended to use the latest Unicode version.
UNICODE TR10
Unicode Consortium. Unicode Technical Standard #10. Unicode Collation Algorithm. Unicode Technical Report. See http://www.unicode.org/reports/tr10/.
UNICODE TR35
Unicode Consortium. Unicode Technical Standard #35. Unicode Locale Data Markup Language. Unicode Technical Report. See http://www.unicode.org/reports/tr35/.
XML 1.0
World Wide Web Consortium. Extensible Markup Language (XML) 1.0. W3C Recommendation. See http://www.w3.org/TR/REC-xml/. The edition of XML 1.0 must be no earlier than the Third Edition; the edition used is implementation-defined, but we recommend that implementations use the latest version.
XML 1.1
Extensible Markup Language (XML) 1.1 (Second Edition), Tim Bray, Jean Paoli, Michael Sperberg-McQueen, et. al., Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml11-20060816. The latest version is available at http://www.w3.org/TR/xml11/.
XML Base
XML Base (Second Edition), Jonathan Marsh and Richard Tobin, Editors. World Wide Web Consortium, 28 Jan 2009. This version is http://www.w3.org/TR/2009/REC-xmlbase-20090128/. The latest version is available at http://www.w3.org/TR/xmlbase/.
xml:id
xml:id Version 1.0, Jonathan Marsh, Daniel Veillard, and Norman Walsh, Editors. World Wide Web Consortium, 09 Sep 2005. This version is http://www.w3.org/TR/2005/REC-xml-id-20050909/. The latest version is available at http://www.w3.org/TR/xml-id/.
Namespaces in XML
Namespaces in XML 1.0 (Third Edition), Tim Bray, Dave Hollander, Andrew Layman, et. al., Editors. World Wide Web Consortium, 08 Dec 2009. This version is http://www.w3.org/TR/2009/REC-xml-names-20091208/. The latest version is available at http://www.w3.org/TR/xml-names/.
Namespaces in XML 1.1
Namespaces in XML 1.1 (Second Edition), Tim Bray, Dave Hollander, Andrew Layman, and Richard Tobin, Editors. World Wide Web Consortium, 16 Aug 2006. This version is http://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available at http://www.w3.org/TR/xml-names11/.
XML Schema Part 1
XML Schema Part 1: Structures Second Edition, Henry Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
XML Schema Part 2
XML Schema Part 2: Datatypes Second Edition, Paul V. Biron and Ashok Malhotra, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.
XML Schema 1.1 Part 1
W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures, Sandy Gao, Michael Sperberg-McQueen, Henry Thompson, et. al., Editors. World Wide Web Consortium, 05 Apr 2012. This version is http://www.w3.org/TR/2012/REC-xmlschema11-1-20120405/. The latest version is available at http://www.w3.org/TR/xmlschema11-1/.
XML Schema 1.1 Part 2
W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes, David Peterson, Sandy Gao, Ashok Malhotra, et. al., Editors. World Wide Web Consortium, 05 Apr 2012. This version is http://www.w3.org/TR/2012/REC-xmlschema11-2-20120405/. The latest version is available at http://www.w3.org/TR/xmlschema11-2/.
XPath 4.0
CITATION: T.B.D.
XSLT Media Type
World Wide Web Consortium. Registration of MIME Media Type application/xslt+xml. In Appendix B.1 of the XSLT 2.0 specification.

A.2 Other References

Unicode CLDR
CLDR - Unicode Common Locale Data Repository. Available at: http://cldr.unicode.org
DOM Level 2
Document Object Model (DOM) Level 2 Core Specification, Arnaud Le Hors, Philippe Le Hégaret, Lauren Wood, et. al., Editors. World Wide Web Consortium, 13 Nov 2000. This version is http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113. The latest version is available at http://www.w3.org/TR/DOM-Level-2-Core/.
ECMA-404
ECMA International. The JSON Data Interchange Format October 2013. See http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf.
ICU
ICU - International Components for Unicode. Available at http://site.icu-project.org
RFC2119
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. IETF RFC 2119. See http://www.ietf.org/rfc/rfc2119.txt.
RFC3986
T. Berners-Lee, R. Fielding, and L. Masinter. Uniform Resource Identifiers (URI): Generic Syntax. IETF RFC 3986. See http://www.ietf.org/rfc/rfc3986.txt.
RFC3987
M. Duerst, M. Suignard. Internationalized Resource Identifiers (IRIs). IETF RFC 3987. See http://www.ietf.org/rfc/rfc3987.txt.
RFC4647
A. Phillips and M. Davis. Matching of Language Tags. IETF RFC 4647. See http://www.ietf.org/rfc/rfc4647.txt.
RFC7303
H. Thompson and C. Lilley. XML Media Types. IETF RFC 7303. See http://www.ietf.org/rfc/rfc7303.txt
SemVer
Tom Preston-Werner, Semantic Versioning 2.0.0. See http://semver.org/. Undated (retrieved 1 August 2014).
STX
Petr Cimprich et al, Streaming Transformations for XML (STX) Version 1.0. Working Draft 27 April 2007. See http://stx.sourceforge.net/documents/spec-stx-20070427.html
XLink
XML Linking Language (XLink) Version 1.0, Steven DeRose, Eve Maler, and David Orchard, Editors. World Wide Web Consortium, 27 Jun 2001. This version is http://www.w3.org/TR/2001/REC-xlink-20010627/. The latest version is available at http://www.w3.org/TR/xlink/.
XML Schema 1.0 and XML 1.1
World Wide Web Consortium. Processing XML 1.1 documents with XML Schema 1.0 processors. W3C Working Group Note 11 May 2005. See https://www.w3.org/TR/2005/NOTE-xml11schema10-20050511/
XML Stylesheet
Associating Style Sheets with XML documents 1.0 (Second Edition), James Clark, Simon Pieters, and Henry Thompson, Editors. World Wide Web Consortium, 28 Oct 2010. This version is http://www.w3.org/TR/2010/REC-xml-stylesheet-20101028. The latest version is available at http://www.w3.org/TR/xml-stylesheet.
XPointer Framework
XPointer Framework, Paul Grosso, Eve Maler, Jonathan Marsh, and Norman Walsh, Editors. World Wide Web Consortium, 25 Mar 2003. This version is http://www.w3.org/TR/2003/REC-xptr-framework-20030325/. The latest version is available at http://www.w3.org/TR/xptr-framework/.
XSL-FO
Extensible Stylesheet Language (XSL) Version 1.1, Anders Berglund, Editor. World Wide Web Consortium, 05 Dec 2006. This version is http://www.w3.org/TR/2006/REC-xsl11-20061205/. The latest version is available at http://www.w3.org/TR/xsl11/.
XSLT 1.0
XSL Transformations (XSLT) Version 1.0, James Clark, Editor. World Wide Web Consortium, 16 Nov 1999. This version is http://www.w3.org/TR/1999/REC-xslt-19991116. The latest version is available at http://www.w3.org/TR/xslt.
XSLT 2.0
XSL Transformations (XSLT) Version 2.0 (Second Edition), Michael Kay, Editor. World Wide Web Consortium, 23 January 2007. This version is https://www.w3.org/TR/2007/REC-xslt20-20070123/. The latest version is available at https://www.w3.org/TR/xslt20/.
XSLT 3.0
XSL Transformations (XSLT) Version 3.0, Michael Kay, Editor. World Wide Web Consortium, 7 February 2017. This version is https://www.w3.org/TR/2017/CR-xslt-30-20170207/. The latest version is available at https://www.w3.org/TR/xslt-30/.
XSLT 4.0
XSL Transformations (XSLT) Version 4.0, XSLT Extensions Community Group, World Wide Web Consortium.

B Summary of Error Conditions (Non-Normative)

This appendix provides a summary of error conditions that a processor may raise. This list includes all error codes defined in this specification, but this is not an exhaustive list of all errors that can occur. Implementations must raise errors using these error codes, and applications can test for these codes; however, when more than one rule in the specification is violated, different processors will not necessarily raise the same error code. Implementations are not required to raise errors using the descriptive text used here.

Note:

The appendix is non-normative because the same information is given normatively elsewhere.

Static errors

ERR XTSE0730

If an xsl:attribute-set element specifies streamable="yes" then every attribute set referenced in its use-attribute-sets attribute (if present) must also specify streamable="yes".

ERR XTSE3155

It is a static error if an xsl:function element with no xsl:param children has a streamability attribute with any value other than unclassified.

ERR XTSE3430

It is a static errorXT if a packageXT contains a construct that is declared to be streamable but which is not guaranteed-streamable, unless the user has indicated that the processor is to handle this situation by processing the stylesheet without streaming or by making use of processor extensions to the streamability rules where available.
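
As an illustration of the condition behind XTSE0730, the sketch below declares two attribute sets, one of which references the other. The attribute set names and attribute values are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Because "base" is referenced from a streamable attribute set, it
       must itself specify streamable="yes"; omitting the attribute here
       would be expected to raise XTSE0730. -->
  <xsl:attribute-set name="base" streamable="yes">
    <xsl:attribute name="status" select="'ok'"/>
  </xsl:attribute-set>

  <xsl:attribute-set name="extended" streamable="yes"
                     use-attribute-sets="base">
    <xsl:attribute name="source" select="'stream'"/>
  </xsl:attribute-set>

</xsl:stylesheet>
```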

C Change Log (Non-Normative)

This appendix lists changes made in version 4.0 of this specification.

  1. Use the arrows to browse significant changes since the 3.0 version of this specification.

    See 1 Introduction

  2. Sections with significant changes are marked Δ in the table of contents.

    See 1 Introduction

  3. The special rule allowing xsl:map to have multiple consumable operands does not apply if duplicate keys are permitted.

    See 12.5.23 Streamability of xsl:map

  4. PR 2011 

    The static typing rules have been updated to take account of new constructs in XPath 4.0.

    See 3.5 Determining the Static Type of a Construct