@qt4cg statuses in 2025

This page displays status updates about the QT4 CG project from 2025.

See also recent statuses.

QT4 CG meeting 133 draft minutes #minutes—08-19

19 Aug at 16:10:00 GMT

Draft minutes published.

Issue #2139 closed #closed-2139

19 Aug at 15:56:36 GMT

Binary comparisons

Issue #2168 closed #closed-2168

19 Aug at 15:56:35 GMT

2139 Make hexBinary and base64Binary fully comparable

Issue #2162 closed #closed-2162

19 Aug at 15:55:02 GMT

QT4CG-132-04 Expand the rectangle?area example

Issue #2143 closed #closed-2143

19 Aug at 15:48:05 GMT

JNodes and Methods

Issue #1714 closed #closed-1714

19 Aug at 15:48:00 GMT

sibling:: axis. Action Item QT4CG-097-03

Issue #350 closed #closed-350

19 Aug at 15:47:52 GMT

CompPath (Composite-objects path) Expressions

Issue #119 closed #closed-119

19 Aug at 15:47:46 GMT

Allow a map's key value to be any sequence

Issue #106 closed #closed-106

19 Aug at 15:47:41 GMT

Decorators' support

Issue #34 closed #closed-34

19 Aug at 15:47:36 GMT

Proposal to introduce the set datatype in XPath 4

Issue #2164 closed #closed-2164

19 Aug at 15:45:35 GMT

Fix return type in `fn:parse-csv` signature

Issue #2072 closed #closed-2072

19 Aug at 15:14:09 GMT

JNodes: accessing properties

Issue #2170 closed #closed-2170

18 Aug at 15:46:50 GMT

The current "?>" method call operator is ugly, difficult to read, difficult to find and understand. We have much better alternatives.

Issue #2170 created #created-2170

18 Aug at 15:17:43 GMT
The current "?>" method call operator is ugly, difficult to read, difficult to find and understand. We have much better alternatives.

The current "?>" method call operator is ugly, difficult to read, difficult to find and understand. We have much better alternatives.

One obvious big improvement is to have ==> .

This is:

  1. Readable.
  2. Distinctly distinguishable.
  3. Understandable and intuitive for anyone who has used an OOP language (C++, C#, Java)
  4. Expresses the similarity with the => operator. Just as => provides the LHS item as the first argument of the RHS function, the similar-looking but extended ==> provides the LHS map/object as an implicit argument to the RHS method.
  5. Because of 1, 2, 3, and 4 above, very little additional learning and understanding effort is required from the XPath user.

Proposed action: Replace the ugly, difficult to read, difficult to find and understand operator "?>" with ==> .

QT4 CG meeting 133 draft agenda #agenda-08-19

18 Aug at 10:00:00 GMT

Draft agenda published.

Issue #1938 closed #closed-1938

16 Aug at 18:10:47 GMT

Invoking coerced methods

Issue #2169 created #created-2169

16 Aug at 17:34:08 GMT
Longest-token rule incorrectly produces `StringInterpolation` delimiter

StringInterpolation currently defines a two-character token, curly right brace + backtick, to follow Expr as a terminator:

StringInterpolation ::= "`{" Expr? "}`"

On other occasions, Expr is followed by a single right curly brace:

EnclosedExpr ::= "{" Expr? "}"

The following applies to tokenization (the "longest-token" rule):

“If the current position is not the end of the input, then return the longest literal terminal or variable terminal that can be matched starting at the current position, regardless whether this terminal is valid at this point in the grammar.”

My concern is that input like

<a>{42}`</a>

is going to be mis-tokenized under the longest-token rule: after 42, the next (longest) token is the two-character StringInterpolation terminator, which however is not a valid terminator of the EnclosedExpr serving as CommonContent of the direct element constructor.

Proposed fix

My proposal is to replace the two-character tokens that introduce and terminate StringInterpolation with single backticks around an EnclosedExpr with no intervening whitespace:

StringInterpolation ::= "`" EnclosedExpr "`"   /* ws: explicit */

This replaces both of the two-character delimiters of StringInterpolation, while still describing the intended language, but without causing the longest-token rule to produce a token that cannot be handled afterwards.
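
The mis-tokenization described above can be reproduced with a small sketch. The Python below is only an illustration under stated assumptions: the terminal list is a hypothetical subset of the grammar's literal terminals, and longest_token is an invented helper, not part of any specification or implementation.

```python
# Hypothetical subset of literal terminals; the real grammar has many more.
TERMINALS = ["`{", "}`", "{", "}", "`"]


def longest_token(text, pos, terminals=TERMINALS):
    """Return the longest terminal matching at pos, or None if none matches.

    Mirrors the longest-token rule: grammatical context is deliberately ignored.
    """
    candidates = [t for t in terminals if text.startswith(t, pos)]
    return max(candidates, key=len) if candidates else None


source = "<a>{42}`</a>"
closing = source.index("}")

# With the two-character terminator in the token set, "}`" wins over "}".
with_two_char = longest_token(source, closing)

# With only single-character delimiters (the proposed fix), "}" is returned.
with_single_char = longest_token(source, closing, ["{", "}", "`"])
```

In this sketch the first call yields a token that the EnclosedExpr inside the element constructor cannot accept, while the second leaves the backtick to be handled separately.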

Issue #1736 closed #closed-1736

15 Aug at 08:52:40 GMT

Add option retain-order=false when constructing maps

Pull request #2168 created #created-2168

14 Aug at 19:33:46 GMT
2139 Make hexBinary and base64Binary fully comparable

Fix #2139

hexBinary and base64Binary become mutually comparable under all comparison operators, which may affect backward compatibility.

Pull request #2167 created #created-2167

14 Aug at 17:59:08 GMT
2166 Reinstate lost text for lookup expressions

Fix #2166

Issue #2166 created #created-2166

14 Aug at 09:40:28 GMT
Lookup expressions: we have deleted too much text

In reverting many of the features previously added to lookup expressions (for example deep lookup and lookup modifiers) we seem to have accidentally lost text that actually defines what the different key specifiers mean; we're left with lots of examples but no actual specification.

I was reading to see what the current spec says about array bound checking: it appears to say nothing.

Issue #2165 created #created-2165

14 Aug at 08:58:55 GMT
Treat expression: inconsistencies, questionable uses

Related: Michael’s observation in https://github.com/qt4cg/qtspecs/issues/2163#issuecomment-3185618064.

The current spec says for treat:

XPath 4.0 provides an expression called treat that can be used to modify the static type of its operand.

It further mentions the static analysis phase, which has been removed from the specs; maybe we should remove these references.

There are hardly any uses of the expression in the current spec. One is for Absolute Path Expressions:

An expression of the form /PP (that is, a path expression with a leading /) is treated as an abbreviation for the expression self::gnode()/(fn:root(.) treat as (document-node()|jnode()))/PP.

The use seems confusing, as in many cases self::gnode() can only be evaluated at runtime. Maybe we could rewrite it to a variant that coerces the node? It would generally be easier for optimizers to rewrite paths when there is no need to differentiate between treat and coercion (and I have never seen code that catches XPDY0050).

Indeed I think it would be helpful to have a coerce to expression, even if people will rarely use it explicitly. It would allow us to remove all remaining uses of treat as (except, of course, for the expression itself), and we could simplify various examples that use variable declarations only for coercing values.

Pull request #2164 created #created-2164

13 Aug at 13:51:40 GMT
Fix return type in `fn:parse-csv` signature

In f2e1f48, fn:parse-csv was changed to return an empty sequence, when its first argument is an empty sequence. This is however not reflected in the function's return type, which is here changed to parsed-csv-structure-record?.

Issue #2163 created #created-2163

12 Aug at 21:45:35 GMT
Method calls: `?>` or `=?>`

I propose using =?> for method calls rather than ?>

(a) I like the association with => to call a function item with an implicit first argument; =?> combines selection of an item from a map (?) with function invocation (=>)

(b) ?>, while technically unambiguous, smells strongly of XML processing instructions

Issue #2100 closed #closed-2100

12 Aug at 21:32:04 GMT

JNodes: functions

Issue #2149 closed #closed-2149

12 Aug at 21:32:03 GMT

2100 Make innermost, outermost, has-children, path apply to JNodes

Pull request #2162 created #created-2162

12 Aug at 20:54:16 GMT
QT4CG-132-04 Expand the rectangle?area example

Expands the explanation of the example of method chaining

QT4 CG meeting 132 draft minutes #minutes—08-12

12 Aug at 16:20:00 GMT

Draft minutes published.

Issue #2132 closed #closed-2132

12 Aug at 16:16:40 GMT

Error handling in and/or expressions

Issue #2133 closed #closed-2133

12 Aug at 16:16:39 GMT

2132 error handling in logical expressions

Issue #1996 closed #closed-1996

12 Aug at 16:14:38 GMT

Lookups, KeySpecifier: add NumericLiteral and ContextValueRef?

Issue #2134 closed #closed-2134

12 Aug at 16:14:37 GMT

1996 Lookups, KeySpecifier: Literal, ContextValueRef

Issue #2147 closed #closed-2147

12 Aug at 16:12:59 GMT

2143 Redesign of method calls

Issue #2152 closed #closed-2152

12 Aug at 16:08:32 GMT

"x" is not an instance of enum("x")

Issue #2154 closed #closed-2154

12 Aug at 16:08:31 GMT

2152 Revise rules for enumeration types

Issue #2156 closed #closed-2156

12 Aug at 16:06:13 GMT

2092 Drop map:pair, map:of-pairs, map:pairs

Issue #2135 closed #closed-2135

12 Aug at 16:04:15 GMT

QT4CG-131-01/02 Expand on example as actioned

Issue #2136 closed #closed-2136

12 Aug at 16:02:11 GMT

Drop full-width angle brackets

Issue #2137 closed #closed-2137

12 Aug at 16:02:10 GMT

2136 Drop full-width < and > symbols

Issue #2141 closed #closed-2141

12 Aug at 15:59:57 GMT

Remove nested paragraphs

Issue #2145 closed #closed-2145

12 Aug at 15:57:49 GMT

Allow implicit whitespace in StringInterpolation

Issue #2146 closed #closed-2146

12 Aug at 15:56:15 GMT

Require at least one character in StringTemplateFixedPart

Issue #1062 closed #closed-1062

12 Aug at 15:55:45 GMT

150bis revised proposal for fn:ranks

Issue #150 closed #closed-150

12 Aug at 15:55:29 GMT

fn:ranks: Produce all ranks in applying a function on the items of a sequence

Issue #714 closed #closed-714

12 Aug at 15:55:24 GMT

Function annotations in XSLT

Issue #1698 closed #closed-1698

12 Aug at 15:55:20 GMT

Allow select attribute for xsl:call-template instruction

Issue #1852 closed #closed-1852

12 Aug at 15:55:14 GMT

fn:values-except: Return atomic values that occur in A but not in B

Issue #2157 closed #closed-2157

12 Aug at 15:53:59 GMT

Unicode collation algorithm references

Issue #2158 closed #closed-2158

12 Aug at 15:53:58 GMT

2157 Editorial updates to F+O §5.5 (Unicode collations)

Issue #2161 created #created-2161

12 Aug at 15:35:43 GMT
Drop other non-ASCII operators (×, ÷)

Adopted from https://github.com/qt4cg/qtspecs/issues/2136#issuecomment-3135426200:

The feedback for U+00D7 (MULTIPLICATION SIGN, ×) and U+00F7 (DIVISION SIGN, ÷) that we got so far was not very positive either, so I would suggest dropping also those operators; they offer no real added value.

Pull request #2160 created #created-2160

12 Aug at 14:59:07 GMT
2073 data model changes for JNodes and Sequences

This is a first draft of a PR, giving the data model changes only, for a change to the JNode model affecting maps and arrays with sequence-valued entries. A sequence of length 2 or more now has children representing the items in the sequence. Although there is still an asymmetry between sequences of length 1 and longer sequences, it is more manageable than in the previous model.

QT4 CG meeting 132 draft agenda #agenda-08-12

11 Aug at 12:20:00 GMT

Draft agenda published.

Issue #2159 created #created-2159

10 Aug at 05:16:54 GMT
JNodes: Learning from JSONiq?

For those who have not stumbled upon JSONiq yet, I am adding some introductory links:

  • https://www.jsoniq.org/docs/JSONiq-usecases/html-single/
  • https://www.jsoniq.org/docs/Introduction_to_JSONiq/html/
  • https://www.jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html

JSONiq has been designed as a query and update language for JSON data. Its first versions were based on XQuery. Due to its similarities, it may give us some good inspirations for traversing and modifying JNodes.

RumbleDB is a current implementation maintained by Ghislain Fourny (@ghislainfourny).

Pull request #2158 created #created-2158

09 Aug at 11:29:04 GMT
2157 Editorial updates to F+O §5.5 (Unicode collations)

Fix #2157

Issue #2157 created #created-2157

09 Aug at 10:18:55 GMT
Unicode collation algorithm references

In the F&O reference to UTS#10 we say incorrectly that: The current version is 9.0.0, dated 2016-05-18.

Similarly for UTS#35 we say incorrectly: The current version is 29, dated 2016-03-15.

In §5.5, functions based on substring matching, we say

"In the definitions below, we refer to the terms match and minimal match as defined in definitions DS2 and DS4 of [[UTS #10]]."

It's not made clear what "the definitions below" is referring to: the terms "match" and "minimal match" are actually used in the rules of the individual functions.

The parenthetical sentence (“collation unit” is equivalent to "collation element" as defined in [[UTS #10]]) is not very elegantly expressed.

Pull request #2156 created #created-2156

08 Aug at 21:35:41 GMT
2092 Drop map:pair, map:of-pairs, map:pairs

Addresses part of issue #2092.

While the function family map:pair, map:of-pairs, and map:pairs can be handy, they are not necessary, especially now that we have JNodes. They are also very easily user-written:

map:pair => {'key': $key, 'value': $value}

map:of-pairs => map:build($pairs, fn{?key}, fn{?value})

map:pairs => map:for-each($map, fn($k, $v){ {'key': $k, 'value': $v} })

On the grounds that we should avoid providing multiple ways of solving the same problem, I propose dropping these three functions.

Note: in some ways I would have preferred to drop the alternative trio map:entry, map:entries, and map:merge; but two of these are present in the 3.1 specification.
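
To illustrate how little these three functions do, here is a rough Python analogue using plain dicts. The function names pair, of_pairs, and pairs are illustrative stand-ins only, and the sketch simplifies duplicate-key handling: a dict comprehension keeps the last value, which is not necessarily how map:build combines duplicates.

```python
def pair(key, value):
    """Analogue of map:pair: a two-entry map with 'key' and 'value'."""
    return {"key": key, "value": value}


def of_pairs(pairs_seq):
    """Analogue of map:of-pairs: build one map from key-value pair maps."""
    return {p["key"]: p["value"] for p in pairs_seq}


def pairs(mapping):
    """Analogue of map:pairs: one key-value pair map per entry."""
    return [pair(k, v) for k, v in mapping.items()]


original = {"a": 1, "b": 2}

# Decomposing a map into pairs and rebuilding it gives back the same map.
round_trip = of_pairs(pairs(original))
```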

Issue #2021 closed #closed-2021

08 Aug at 18:31:46 GMT

XSLT: Move "Patterns" section into "Template Rules"

Issue #2078 closed #closed-2078

08 Aug at 01:28:21 GMT

2031/2025 JNodes: inconsistency in data model taxonomy, definitions

Pull request #2155 created #created-2155

07 Aug at 12:02:37 GMT
2150 Define patterns for JNodes

Fix #2150 Fix #2010

Issue #2151 closed #closed-2151

07 Aug at 09:06:48 GMT

2021 Move the section on Patterns to a more logical place in the spec

Issue #2153 closed #closed-2153

07 Aug at 08:27:56 GMT

Remove limitations from `enum` type

Pull request #2154 created #created-2154

07 Aug at 08:25:22 GMT
2152 Revise rules for enumeration types

Fix #2152

Revises the rules for enumeration types: they are now structural subtypes of xs:string rather than nominative subtypes. The main effect is that "x" instance of enum("x") is now true. The change is motivated by use cases involving XSLT pattern matching, where strict "instance of" matching is required, with no coercion.
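
The effect of the structural rule can be sketched in Python. This is a simplified model, assuming that a plain Python str stands in for xs:string and that membership is tested by simple string equality; instance_of_enum is a hypothetical helper, not a function from the specification.

```python
def instance_of_enum(value, *allowed):
    """Structural test: any string equal to one of the allowed values matches.

    No type annotation or cast is involved; only the value is inspected.
    """
    return isinstance(value, str) and value in allowed


singleton = instance_of_enum("x", "x")
colour = instance_of_enum("red", "red", "green", "blue")
other_value = instance_of_enum("y", "x")
not_a_string = instance_of_enum(1, "1")
```

Under the earlier nominative rule, the first two tests would have required an explicit cast before they succeeded.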

Issue #2153 created #created-2153

06 Aug at 18:53:06 GMT
Remove limitations from `enum` type

"Tolkein" isn't an actual instance of enum("Tolkein"), it's only coercible to that type, and when types are used in paths it has to be an actual instance. I think we need to fix that.

Originally posted by @michaelhkay in #2150

It seems strange that there is no way to create a value that is an instance of a singleton enumeration type. Only casting (and annotation, which is a kind of casting too) is available.

On the other hand:

let $x as enum("foo") := "foo"
return ( ()
  , $x instance of enum("foo")
  , $x instance of xs:string
  , atomic-equal($x, "foo")
)
(: true(), true(), true() :)

This means that "foo" should be an instance of enum("foo"), and then enum("foo") is a subtype of xs:string.

And the following is unclear (from 3.2.6 Enumeration Types):

  • It follows from these rules that an atomic item will only satisfy an instance of test if it has the correct type annotation, and this typically requires an explicit cast. So the expression "red" instance of enum("red", "green", "blue") returns false, while "red" cast as enum("red") instance of enum("red", "green", "blue") returns true.

Probably, a more narrow reason is that a singleton enumeration type is an "anonymous atomic type derived from xs:string by restriction using an enumeration facet" that permits only one value. Yes, this makes type checking for an enum more complex, but seems not more complex than casting.

Anyway, is it possible to make any instance of xs:string also an instance of the corresponding singleton enumeration type? (that is, essentially make it so that this casting happens "hidden", if required).

Issue #2152 created #created-2152

06 Aug at 15:55:49 GMT
"x" is not an instance of enum("x")

The usefulness of enum() types is limited by the fact that the string "x" is not actually an instance of enum("x"), it is only coercible to that type. This means that in contexts where strict type matching is required (for example, in XSLT patterns), either (a) you can't use enum() the way you would like, or (b) you use it and fail to understand why it fails.

Pull request #2151 created #created-2151

06 Aug at 11:04:04 GMT
2021 Move the section on Patterns to a more logical place in the spec

This PR simply moves the section on Patterns to a more logical place in the XSLT specification. Unless anyone objects, I will merge the PR without waiting for group approval, so that I can use the result as a baseline for further work on patterns and templates, hopefully giving a better diff baseline.

Issue #1776 closed #closed-1776

06 Aug at 08:27:59 GMT

Using `?` and `??` in XSLT patterns

Issue #2150 created #created-2150

06 Aug at 08:26:10 GMT
XSLT Patterns to match JNodes

Supersedes #1776.

Part of the motivation for introducing JNodes was to make rule-based recursive-descent transformation of JSON structures much easier. This issue addresses part of that capability, namely defining patterns that match JNodes (and perhaps improving the patterns that match maps and arrays).

In general I think the patterns that match JNodes should be distinct from the patterns that match XNodes; although we have unified path expressions so that a/b can select either an XNode or a JNode, I think there would be too much scope for confusion if match="a/b" were able to match a JNode as well as an XNode.

My first idea would be to allow the syntax match="jnode(a)" for a template rule that matches JNodes having a selector property of "a", similarly jnode(a/b), jnode(a//b), jnode(a/*/b), jnode(a[x="c"]) with semantics defined in much the same way.

But there's a question how this relates to type patterns. With type patterns, we can already do match="type(jnode(record(Author, Title, *)))" which matches a JNode whose content is of type record(Author, Title, *). Where syntactically possible we allow type patterns to be abbreviated, so this would become match="jnode(record(Author, Title, *))" which conflicts with the above.

An analogy with element(N, T) might suggest match="jnode(K, V)" where K constrains the selector property of the JNode, and V constrains its content property. So we might have match="jnode(books, array(record(Author, Title, *)))" to match a JNode whose selector is "books" and whose content is of type array(record(Author, Title, *)).

At the same time, while matching maps by a type such as match="record(Author, Title, *)" works well, I find that this is often accompanied by a predicate, so it becomes match="record(Author, Title, *)[Author='Tolkein']". It would be nice to express this more concisely and readably, perhaps as match="record(Author[.='Tolkein'], Title, *)".

Issue #115 closed #closed-115

06 Aug at 07:52:48 GMT

Lookup operator on arrays of maps

Pull request #2149 created #created-2149

05 Aug at 11:52:53 GMT
2100 Make innermost, outermost, has-children, path apply to JNodes

Fix #2100

Issue #2148 created #created-2148

05 Aug at 09:21:22 GMT
fn:base-uri: Raise errors?

The (rather old) test case K2-BaseURIFunc-29 indicates that invalid URIs may result in an error:

<test-case name="K2-BaseURIFunc-29">
  <description> Use an URI in an xml:base element that is a valid URI, but an invalid HTTP URL. 
    Since implementations aren't required to validate specific schemes but allowed to, 
    this may either raise an error or return the URI. 
  </description>
  <created by="Frans Englich" on="2007-11-26"/>
  <dependency type="spec" value="XQ10+"/>
  <test><![CDATA[let $i := fn:base-uri(<anElement xml:base="http:\\example.com\\examples">Element content</anElement>) 
    return $i eq "http:\\example.com\\examples" or empty($i)]]></test>
  <result>
    <assert-true/>
  </result>
</test-case>

I raise this issue in the qtspecs repository as I wondered whether we should clarify how invalid URIs are to be handled by fn:base-uri.

If it’s the test that is misleading, I will be glad to correct the comment, or add an error code.

Pull request #2147 created #created-2147

05 Aug at 08:49:22 GMT
2143 Redesign of method calls

Although issue #2143 envisaged redefining method calls in terms of JNodes, this PR takes a different approach.

The "magic" performed by the lookup operator when the entry in a map is annotated %method is dropped. Instead we have a new operator ?> which is essentially defined as a macro: in simple cases $map ?> method (X) is defined to be essentially an abbreviation for ($map ? method)($map, X).

I have used the operator ?> suggested by Christian, but in some ways I prefer the operator we had originally, =?>, because (a) there is a stronger analogy with =>, and (b) ?> brings up images of XML syntax for processing instructions.
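
The macro-style expansion can be mimicked in Python as a rough sketch. The rectangle map, its width and height values, and the scaled_area entry are hypothetical; method_call is an invented helper that models only the simple case $map ?> method(X) as ($map ? method)($map, X).

```python
def method_call(mapping, name, *args):
    """Sketch of `$map ?> name(args)` as `($map ? name)($map, args)`."""
    return mapping[name](mapping, *args)


rectangle = {
    "width": 3,
    "height": 4,
    # Each function takes the containing map as an explicit first argument.
    "area": lambda self: self["width"] * self["height"],
    "scaled_area": lambda self, factor: self["area"](self) * factor,
}

area = method_call(rectangle, "area")
scaled = method_call(rectangle, "scaled_area", 2)
```

The lookup itself stays an ordinary lookup here; all of the special behaviour lives in the call helper, which is the point of dropping the "magic" from the lookup operator.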

Pull request #2146 created #created-2146

04 Aug at 19:01:16 GMT
Require at least one character in StringTemplateFixedPart

The grammar rules for StringTemplate are as follows:

StringTemplate              ::=  "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
                                                                                      /* ws: explicit */
StringTemplateFixedPart     ::=  ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")*
                                                                                      /* ws: explicit */
StringTemplateVariablePart  ::=  EnclosedExpr
                                                                                      /* ws: explicit */

But StringTemplateFixedPart should not be allowed as a zero-length token, because this is causing an ambiguity: the input `` currently can be parsed as any of

<StringTemplate>`<StringTemplateFixedPart/>`</StringTemplate>
<StringTemplate>`<StringTemplateFixedPart/><StringTemplateFixedPart/>`</StringTemplate>
<StringTemplate>`<StringTemplateFixedPart/><StringTemplateFixedPart/><StringTemplateFixedPart/>`</StringTemplate>

and so on.

In order to ensure an unambiguous result, StringTemplateFixedPart should be required to consist of at least one character. Also the /* ws: explicit */ on StringTemplateVariablePart is superfluous. The grammar rules thus should be changed to:

StringTemplate              ::=  "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
                                                                                      /* ws: explicit */
StringTemplateFixedPart     ::=  ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")+
                                                                                      /* ws: explicit */
StringTemplateVariablePart  ::=  EnclosedExpr
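
The ambiguity for an empty template body can be counted with a short sketch. This Python is illustrative only: parses_of_empty_body is a hypothetical helper, and max_parts is an artificial cap, since with zero-length parts the number of derivations is unbounded.

```python
def parses_of_empty_body(min_part_length, max_parts):
    """Enumerate ways to derive an empty template body from fixed parts.

    Each parse is a tuple of part lengths. A body of length zero can only be
    split into parts of length zero, so parts are possible only when
    min_part_length is 0.
    """
    parses = [()]  # zero parts is always a valid derivation
    if min_part_length == 0:
        parses += [(0,) * count for count in range(1, max_parts + 1)]
    return parses


# Old rule (`*`): zero-length fixed parts allowed, so many derivations exist.
ambiguous = parses_of_empty_body(min_part_length=0, max_parts=3)

# New rule (`+`): each fixed part needs at least one character.
unambiguous = parses_of_empty_body(min_part_length=1, max_parts=3)
```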

Pull request #2145 created #created-2145

03 Aug at 19:22:35 GMT
Allow implicit whitespace in StringInterpolation

Production StringInterpolation currently does not allow implicit whitespace:

StringInterpolation ::= "`{"  Expr?  "}`"
                                                                         /* ws: explicit */

But this is likely not intended - all examples in the spec do have whitespace adjacent to the braces.

This change thus removes /* ws: explicit */ in order to allow implicit whitespace.

Issue #2143 created #created-2143

03 Aug at 07:05:13 GMT
JNodes and Methods

I propose changing the mechanism for invoking methods to take advantage of JNodes.

Instead of the current magic rule for the "?" operator, we move the magic to the rules for dynamic function calls: in a dynamic function call F(X, Y), if the value of F is a JNode J whose content property is a function item annotated with %method, then the function body is executed with the parent of J (that is, the containing map or array) as the context value. It seems much cleaner semantics to make this a rule for dynamic function calls rather than for map lookup.

The downside is that the call syntax now would become ($rectangle/area)() rather than $rectangle?area(). Unfortunately $rectangle/area() parses as $rectangle/(area()). So we might want to invent some better syntax.

Issue #2142 closed #closed-2142

01 Aug at 12:56:54 GMT

Markup fixes in the HTML output

Pull request #2142 created #created-2142

01 Aug at 12:56:19 GMT
Markup fixes in the HTML output
  1. Moved all the processor comments to the end; this avoids having a comment before <!DOCTYPE HTML> which is frowned upon because ... reasons
  2. The XSLT stylesheet was adding links for sections, but so was the main stylesheet, so they were coming out nested.
  3. Don't attempt to link to functions or elements inside titles. (This also results in nested links)
  4. Attempt to "unwrap" <p> elements around things that can't be inside a <p>, like various sorts of lists. It's a bit ugly, but it makes for much cleaner HTML.

I'm just going to merge this because there's no practical way to see the consequences in the PR.

Apologies in advance that this will introduce some spurious diffs. I think those will go away after the build finishes and after you've rebased your PRs on the new stylesheets.

Pull request #2141 created #created-2141

01 Aug at 11:27:24 GMT
Remove nested paragraphs

I have no idea why the DTD allows <p> inside <p> but I assume this is a markup error and not intentional.

Issue #2140 closed #closed-2140

01 Aug at 11:25:05 GMT

Restore diffs

Pull request #2140 created #created-2140

01 Aug at 11:24:55 GMT
Restore diffs

With great appreciation to the fine folks at DeltaXignia!

Issue #2138 closed #closed-2138

31 Jul at 17:39:12 GMT

NodeTest `type(X|Y)`: double parentheses needed

Issue #2139 created #created-2139

31 Jul at 10:42:45 GMT
Binary comparisons

It seems confusing to me that deep-equal & atomic-equal return different results than eq for binary types:

let $hex := xs:hexBinary(''), $base64 := xs:base64Binary('')
return (
  (: false :) deep-equal($hex, $base64),
  (: false :) atomic-equal($hex, $base64),
  (: true  :) $hex eq $base64
)

The rules say:

fn:deep-equal

If both $i1 and $i2 are instances of xs:hexBinary or xs:base64Binary, $i1 eq $i2 returns true.

This can be interpreted in two ways, but it seems to mean that $i1 and $i2 need to have the same type?

fn:atomic-equal

One of the following conditions is true: $value1 and $value2 are both instances of xs:hexBinary. $value1 and $value2 are both instances of xs:base64Binary.

op:binary-equal

op:binary-equal(
  $value1 as (xs:hexBinary | xs:base64Binary),
  $value2 as (xs:hexBinary | xs:base64Binary)
) as xs:boolean

The function returns true if $value1 and $value2 are of the same length, measured in binary octets, and contain the same octets in the same order. Otherwise, it returns false.

As atomic-equal(xs:double(3), xs:float(3)) returns true, I would also expect true for binary items with the same contents.

Related (for numbers): #986
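
The octet-based comparison that op:binary-equal describes can be sketched in Python. The helpers hex_octets, base64_octets, and binary_equal are hypothetical names; the sketch assumes that both binary types reduce to the same underlying sequence of octets once decoded.

```python
import base64


def hex_octets(lexical):
    """Decode an xs:hexBinary-style lexical form into raw octets."""
    return bytes.fromhex(lexical)


def base64_octets(lexical):
    """Decode an xs:base64Binary-style lexical form into raw octets."""
    return base64.b64decode(lexical, validate=True)


def binary_equal(octets1, octets2):
    """Same length and same octets in the same order."""
    return octets1 == octets2


# "Hello" encoded both ways compares equal once reduced to octets.
same = binary_equal(hex_octets("48656C6C6F"), base64_octets("SGVsbG8="))
both_empty = binary_equal(hex_octets(""), base64_octets(""))
different = binary_equal(hex_octets("00"), base64_octets("AQ=="))
```

If deep-equal and atomic-equal compared values this way, they would agree with eq for the empty-value example above.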

Issue #2138 created #created-2138

30 Jul at 22:11:33 GMT
NodeTest `type(X|Y)`: double parentheses needed

It is not currently possible to write a NodeTest as (for example) child::type(xs:string | xs:integer). As a consequence of the way the grammar is defined, two pairs of parentheses are needed: child::type((xs:string | xs:integer))

It would be easy enough to fix this usability glitch.

Pull request #2137 created #created-2137

30 Jul at 07:29:36 GMT
2136 Drop full-width < and > symbols

Fix #2136

Issue #2136 created #created-2136

30 Jul at 06:58:58 GMT
Drop full-width angle brackets

The option of using full-width angle brackets doesn't seem to have attracted great enthusiasm, and now that we have the precedes and follows operators, I suggest we drop them. Nearly all cases of plain < and <= can be replaced with lt and le.

One of the problems with using non-ASCII characters is not just that it's hard to type them, it's also quite hard to recognise them by their appearance. There are so many characters that look a bit like less-than and greater-than symbols.

Pull request #2135 created #created-2135

29 Jul at 17:45:18 GMT
QT4CG-131-01/02 Expand on example as actioned

Expands on the let binding example

QT4 CG meeting 131 draft minutes #minutes—07-29

29 Jul at 16:30:00 GMT

Draft minutes published.

Issue #2130 closed #closed-2130

29 Jul at 16:23:39 GMT

Proposed new operator keywords: precedes, follows

Issue #2080 closed #closed-2080

29 Jul at 16:20:17 GMT

Destructuring let clauses: Bind remaining values

Issue #2119 closed #closed-2119

29 Jul at 16:20:16 GMT

2080 allow let $($head, $tail)

Issue #2087 closed #closed-2087

29 Jul at 16:17:14 GMT

Adaptive serialization: JNodes

Issue #2114 closed #closed-2114

29 Jul at 16:17:13 GMT

2087 Change adaptive serialization of JNodes

Issue #2084 closed #closed-2084

29 Jul at 16:14:02 GMT

Steps when the context value contains multiple nodes

Issue #2115 closed #closed-2115

29 Jul at 16:14:01 GMT

2084 - document order of axis steps when context value is a sequence

Issue #2082 closed #closed-2082

29 Jul at 16:12:13 GMT

parse-html options parameter conventions

Issue #2117 closed #closed-2117

29 Jul at 16:12:12 GMT

2082 parse-html options

Issue #2099 closed #closed-2099

29 Jul at 16:09:26 GMT

Choosing names for the jnode function and the jnode type

Issue #2129 closed #closed-2129

29 Jul at 16:09:25 GMT

2099 Rename fn:jnode and jnode-type

Issue #2086 closed #closed-2086

29 Jul at 16:09:12 GMT

Can the ¶value property of a JNode be (or contain) a JNode?

Issue #1978 closed #closed-1978

29 Jul at 16:09:07 GMT

Function `map:build` does not allow expressing the dependency of a value on its key. Some simple types of maps cannot be built.

Issue #1946 closed #closed-1946

29 Jul at 16:09:02 GMT

We need examples of a record with an entry that is a %method and invoking this method with the result it must produce

Issue #1514 closed #closed-1514

29 Jul at 16:08:55 GMT

Editorial: optional position argument in function signature for for-each and other HOF

Issue #1175 closed #closed-1175

29 Jul at 16:08:44 GMT

XPath: Optional parameters in the definition of an inline function

Issue #2102 closed #closed-2102

29 Jul at 16:06:57 GMT

Type diagrams: drop/add parentheses

Issue #2113 closed #closed-2113

29 Jul at 16:06:56 GMT

2102 Make type labels in diagram consistent

Pull request #2134 created #created-2134

29 Jul at 06:35:01 GMT
1996 Lookups, KeySpecifier: Literal, ContextValueRef

Revised; closes #1996

Issue #2063 closed #closed-2063

29 Jul at 06:30:13 GMT

1996 Lookups, KeySpecifier: Literal, ContextValueRef

Pull request #2133 created #created-2133

28 Jul at 22:10:02 GMT
2132 error handling in logical expressions

Note: depends on #2115 because of terminology changes.

Fix #2132

Issue #2132 created #created-2132

28 Jul at 19:49:07 GMT
Error handling in and/or expressions

In §2.4.5 we introduced the concept of guarded expressions, and included the rules

  • In an and expression, the second operand is guarded by the value of the first operand being true.
  • In an or expression, the second operand is guarded by the value of the first operand being false.

This change is not mentioned in 4.11 Logical Expressions. For example, this still says

The order in which the operands of a logical expression are evaluated is [implementation-dependent]

and the truth tables in 4.11 are unchanged from 3.1.

(We have also introduced defined terms "and expression" and "or expression", and should use them here).

My understanding of the new rule for guarded expressions is that with (A and B), if A is false then the result is false even if B raises an error; this is not what the truth table says.
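
That reading of the guard rule can be sketched in Python. This is a model under assumptions: the second operand is passed as a zero-argument callable, evaluation is fixed as left to right, and failing_operand is a hypothetical stand-in for an operand that raises a dynamic error.

```python
def guarded_and(first, second):
    """`second` is a zero-argument callable, evaluated only if `first` is true."""
    return bool(second()) if first else False


def guarded_or(first, second):
    """`second` is a zero-argument callable, evaluated only if `first` is false."""
    return True if first else bool(second())


def failing_operand():
    raise ZeroDivisionError("stands in for a dynamic error in operand B")


# A is false, so B is never evaluated and no error escapes.
and_result = guarded_and(False, failing_operand)

# A is true, so B is never evaluated and no error escapes.
or_result = guarded_or(True, failing_operand)
```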

Issue #2131 closed #closed-2131

28 Jul at 18:53:30 GMT

XSLT `xsl:for-each-group` `split-when` variables

QT4 CG meeting 131 draft agenda #agenda-07-29

28 Jul at 11:30:00 GMT

Draft agenda published.

Issue #2131 created #created-2131

28 Jul at 10:28:36 GMT
XSLT `xsl:for-each-group` `split-when` variables

Currently the spec says:

The expression is supplied with two variables: $group is set to the contents of the current group being constructed, and $next is the next item in the population.

1. Do these supplied variables shadow the user-defined variable of the same name? (probably they should, but it is not mentioned) For example:

  <xsl:variable name="group" select="(1,2,3)"/>  
  <xsl:for-each-group select="$input" split-when=" $next = $group "> ... </xsl:for-each-group>

Maybe these variables should be in the fn: namespace?

2. To access the current grouping key and current group the functions fn:current-grouping-key and fn:current-group are used. What is the rationale for introducing variables instead of functions to access the group and item in split-when?

Issue #2066 closed #closed-2066

28 Jul at 09:25:59 GMT

Cells in the F&O signature blocks should be vertically aligned to the top

Issue #2122 closed #closed-2122

28 Jul at 09:25:58 GMT

2066 CSS changes for function prototypes

Pull request #2130 created #created-2130

28 Jul at 05:06:08 GMT
Proposed new operator keywords: precedes, follows

The operators << and >>, in my opinion, are poorly known, and challenging for developers working in XSLT. Many punctuation-based operators have aliases in ordinary-language quasi-equivalents, but << and >> lack any ordinary verbal equivalents, and break this principle.

The attached proposal offers to make precedes a keyword equivalent to << and follows a keyword equivalent to >>. This means that //title[. &lt;&lt; following-sibling::isbn[1]] can now be expressed as //title[. precedes following-sibling::isbn[1]]

Pull request #2129 created #created-2129

26 Jul at 10:44:50 GMT
2099 Rename fn:jnode and jnode-type

Renames the function fn:jnode as fn:jtree, and the item type jnode-type() as jnode()

Fix #2099

Issue #2128 created #created-2128

25 Jul at 15:49:30 GMT
JNodes and XSLT Streaming

In principle, there's no reason why JTrees shouldn't be streamable. However, it's an immense amount of work both for the specification and for an implementation, so I intend to rule it out.

That then leaves the question about deciding streamability in the case of constructs that could be processing JNodes.

I think we might be able to define a rule something like: "if a template rule (etc.) is declared with streamable="yes", this amounts to a declaration that it will only be used to process XNodes, even if without this declaration it would also be capable of processing JNodes."

The details depend on other aspects of how we define template rule processing and pattern matching for JNodes. At present I'm inclined to say that the pattern syntax for matching JNodes should be distinct from that for matching XNodes, so that a template can only match one or the other, never both.

Issue #2127 created #created-2127

25 Jul at 09:31:01 GMT
JNodes: Include atomic items

With the introduction of JNodes, it feels like a natural step to extend the processing of documents, collections and databases to JSON data. Currently, the roots of JNodes are restricted to maps and arrays. We should generalize them and include support for atomic items. Background:

JSON data types are not restricted to maps (objects) and arrays, they can also be strings, numbers and booleans. As a consequence, a json-doc('input.json') call may also return atomic types. When iterating over JSON input, we should ensure that there is no need for additional type checks to ensure that code does not fail:

for $i in 1 to 10
for $doc in json-doc($i || '.json')
(: where $doc instance of (map(*)|array(*)) :)
return $doc/a/b/c

I assume it would be no substantial change to open $json/step for types other than nodes, maps and arrays. The only tricky special case is null, but it is converted to an empty sequence, which is why json-doc('document-with-single-null-value.json')/a/b/c already succeeds.

To address any concerns that JNodes do not make sense for standalone atomic items: There is a certain analogy to XML text nodes, which can also be created without serving any immediate purpose, but are helpful and necessary for mapping the entire XML data model.

Issue #2126 created #created-2126

25 Jul at 07:16:34 GMT
Absolute path expressions with JTrees

I'm thinking there may be a case for disallowing or restricting the use of absolute path expressions over JTrees.

Firstly, there's a strong likelihood that users are only dimly aware of where the root of the tree actually is. They are likely to imagine, if they parse a json text, that the root of the tree will be the root of that text. But in fact the root is wherever they started path navigation from, which may be different. It's likely to be particularly confusing if you move out of JNode space into map/array territory and then back again. Requiring an explicit call on root(), or on ancestor::*[last()], might mean that users think more carefully about it. This makes it clearer that the root is not some kind of absolute fixed point, it is simply "the place where you started your current journey".

Certainly, I don't think we should allow /x to implicitly construct a JNode wrapping the context item: the '/' here is redundant and means the user almost certainly doesn't understand what they are doing.

Another consideration is that if we restrict leading / to work only with XTrees, then we reinstate a lot of static type checking capability that we have currently lost.

Issue #2125 created #created-2125

24 Jul at 10:32:42 GMT
csv-to-xml() - untestable results

The machinery for generating test cases from spec examples is failing in the case of the csv-to-xml() tests.

The test generator produces an expected result assertion of

<assert-xml ignore-prefixes="false"><![CDATA[<substitute-for-unparseable-result-xml xmlns="http://www.w3.org/2010/09/qt-fots-catalog"/>]]></assert-xml>

The same problem occurs with some fn:analyze-string tests.

The failure basically means that parse-xml() on the expected test results has failed. (Stylesheet generate-qt3-test-set.xsl line 122).

I think that the problem is that the build is running Saxon in schema-aware mode, and this applies to all XML parsing including the parse-xml() function, so we're getting a schema validation failure where we really don't want to be validating in the first place.

Unfortunately we can't selectively switch parse-xml() validation off unless we move to a later (and not yet stable) Saxon version, and rewriting the whole stylesheet to not be schema-aware would be painful.

Pull request #2124 created #created-2124

24 Jul at 08:24:53 GMT
573 Functions to Construct Trees

A first cut at providing a functional approach to XNode and XTree construction.

At this stage I'm interested in comments on the general approach, not the fine detail (some of which, e.g. namespace inheritance, still needs work.)

Pull request #2123 created #created-2123

24 Jul at 03:23:10 GMT
2051: XSLT group by cluster

Companion PR to #2051 .

I have opted for only two examples, hoping they catalyze the imagination of what is possible. Comments welcome.

Pull request #2122 created #created-2122

23 Jul at 21:35:28 GMT
2066 CSS changes for function prototypes

Fix #2066

Issue #2121 closed #closed-2121

23 Jul at 21:32:30 GMT

2066 fo signature table format

Pull request #2121 created #created-2121

23 Jul at 21:31:59 GMT
2066 fo signature table format

CSS changes to improve alignment of complex signatures, e.g. fn:round

Fix #2066

Issue #1774 closed #closed-1774

23 Jul at 20:25:19 GMT

Nomenclature: relabelling

Issue #1775 closed #closed-1775

23 Jul at 20:18:29 GMT

Navigation in JSON trees

Pull request #2120 created #created-2120

23 Jul at 10:14:30 GMT
2007 Revised design for xsl:array

Revised design for xsl:array based on usage experience.

Fix #2007

Issue #2118 closed #closed-2118

23 Jul at 06:23:02 GMT

2080 Tweak the rules for destructuring variable bindings

Pull request #2119 created #created-2119

23 Jul at 06:20:52 GMT
2080 allow let $($head, $tail)

Fix #2080

With let $($x, $y, $z), $z binds to the rest of the sequence.

With let $[$a, $b, $c], FOAR0001 is raised if the array is too short.

Note, XPath and XQuery should be reviewed separately as the source text for let expressions is different.
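
The two binding rules can be illustrated with a small Python model. This is a sketch under stated assumptions, not the spec's algorithm: sequences are modelled as lists, a variable with no corresponding item is bound to an empty list, and surplus array members are silently ignored. Both of those choices are assumptions of the sketch. `bind_sequence`, `bind_array` and `ArrayIndexError` are hypothetical names; the exception merely stands in for FOAR0001.

```python
class ArrayIndexError(Exception):
    """Stand-in for the dynamic error FOAR0001."""


def bind_sequence(names, items):
    """Bind names to a sequence; the last name takes the rest of the sequence."""
    if not names:
        return {}
    items = list(items)
    # Every name except the last gets (at most) one item.
    bindings = {name: items[i:i + 1] for i, name in enumerate(names[:-1])}
    # The last name gets everything that remains.
    bindings[names[-1]] = items[len(names) - 1:]
    return bindings


def bind_array(names, members):
    """Bind names to array members; fail if there are more names than members."""
    members = list(members)
    if len(names) > len(members):
        raise ArrayIndexError("FOAR0001: more variables than array members")
    return dict(zip(names, members))


seq_bindings = bind_sequence(["x", "y", "z"], [1, 2, 3, 4, 5])
array_bindings = bind_array(["a", "b", "c"], [10, 20, 30])
```

Here `seq_bindings["z"]` holds the remaining items `[3, 4, 5]`, while `bind_array(["a", "b", "c"], [10, 20])` raises the stand-in error.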

Pull request #2118 created #created-2118

22 Jul at 23:01:43 GMT
2080 Tweak the rules for destructuring variable bindings
  1. When binding to a sequence, the last variable binds to the rest of the sequence.
  2. When binding to an array, an FOAR0001 occurs if there are more variables than array members.

Fix #2080.

Pull request #2117 created #created-2117

22 Jul at 22:34:15 GMT
2082 parse-html options
  1. Use non-optional types such as xs:boolean for options parameters
  2. Use regular error codes for bad options
  3. Drop error code relating to the discontinued method option.

Fix #2082

Pull request #2116 created #created-2116

22 Jul at 21:04:30 GMT
2112 Refine/revise the rules for get() in node tests

Proposed revision of the rules for get() in node tests.

Mainly editorial clarification; but also changes the rules for the focus - the expression is now evaluated with absent focus to ensure an error in preference to unexpected results.

Fix #2112

Pull request #2115 created #created-2115

22 Jul at 19:06:30 GMT
2084 - document order of axis steps when context value is a sequence

Clarifies that the results are in document order and deduplicated.

Fix #2084

Pull request #2114 created #created-2114

22 Jul at 17:36:07 GMT
2087 Change adaptive serialization of JNodes

Fix #2087

Pull request #2113 created #created-2113

22 Jul at 17:23:46 GMT
2102 Make type labels in diagram consistent

Fix #2102

Drops the parentheses in map(), array(), function(*)

QT4 CG meeting 130 draft minutes #minutes—07-22

22 Jul at 16:20:00 GMT

Draft minutes published.

Issue #2036 closed #closed-2036

22 Jul at 16:13:23 GMT

Streamability of xsl:map instruction

Issue #2037 closed #closed-2037

22 Jul at 16:13:22 GMT

2036 Add rule for streamability of xsl:map

Issue #2104 closed #closed-2104

22 Jul at 16:11:37 GMT

JNodes: unwrapping

Issue #2111 closed #closed-2111

22 Jul at 16:10:29 GMT

2104 Point out places where jnode-content is called implicitly

Issue #2098 closed #closed-2098

22 Jul at 16:07:57 GMT

JNodes: combining node sequences

Issue #2110 closed #closed-2110

22 Jul at 16:07:56 GMT

2098 Clarify when jnode() is called implicitly

Issue #2103 closed #closed-2103

22 Jul at 16:04:29 GMT

JNodes functions: 0-arity variant

Issue #2109 closed #closed-2109

22 Jul at 16:04:28 GMT

2103 Allow operand of JNode accessors to be omitted or empty

Issue #2107 closed #closed-2107

22 Jul at 16:01:41 GMT

QT4CG-129-01: Actions from review of PR2094

Issue #2108 closed #closed-2108

22 Jul at 15:58:32 GMT

QT4CG-123-01 Add example of library module using methods

Issue #2106 closed #closed-2106

22 Jul at 15:55:27 GMT

Add note on the impossibility of cyclic instances

Issue #2105 closed #closed-2105

22 Jul at 15:52:20 GMT

Fix type of `fn:schema-type-record` field `constructor`

Issue #2097 closed #closed-2097

22 Jul at 15:51:51 GMT

`jnode` as a subtype of `node`

Issue #2089 closed #closed-2089

22 Jul at 15:51:38 GMT

JNode properties: Presentation

Issue #2112 created #created-2112

22 Jul at 13:46:28 GMT
JNodes: get()

The documentation for get() says for XNodes…

A selector can also take the form get(Expr). The contained expression Expr is evaluated with the focus of the containing axis step (so its value is independent of the specific XNode being tested). The result of the expression after atomization must be a sequence of zero or more xs:QName values (otherwise a type error [err:XPTY0004] is raised). An XNode satisfies the selector if its node kind is the principal node kind of the axis and its node name is among the values returned by the selector expression.

…and for JNodes…

If the selector takes the form get(Expr), then the contained expression Expr is evaluated with the focus of the containing axis step (so its value is independent of the specific JNode being tested). A JNode satisfies the selector if its ·selector· property is equal to one or more of the values returned by the selector expression, under the rules of the fn:atomic-equal function.

Nitpicking:

  • With the existing rule for JNodes, I assume that no match would be returned for EXPR := [ 'a', 'b' ]. I would thus propose to atomize EXPR first and compare it afterwards (see below).
  • “A JNode satisfies the selector if [the value of] its ·selector· property”

Next, I would like us to unify the rules for XNodes and JNodes. The rationale (besides “simpler rules are simpler to explain”):

  1. XPath is well-known for being forgiving. Maybe we can maintain that tradition for name tests, by tolerating input other than QNames.
  2. By using identical rules for JNodes and XNodes, it will be easier to process input that mixes XNodes and JNodes. An example:
(<xml>ignored</xml>, { 1: 'one' }, [ 'one' ])/get(1)

I would propose to simplify the joint rules to the following XPath expression:

some(
  data(EXPR),
  atomic-equal(?, if(. instance of node()) then node-name() else jnode-selector())
)

Finally, I assume that the focus information can be utilized in the get expression, right? Is it correct to assume that all of the following expressions will return <a2/>?

let $xml := <xml><a3/><a2/><a1/></xml>
let $name := #a2
return (
  $xml/get(#a2),
  $xml/get($name),
  $xml/get(node-name()[. = $name]),
  $xml/get(xs:QName('a' || position())),
  $xml/get(if(position() = 2) { $name } ),
  $xml/get(xs:QName(`a{ last() - 1 }`))
)

QT4 CG meeting 130 draft agenda #agenda-07-22

21 Jul at 10:45:00 GMT

Draft agenda published.

Issue #1786 closed #closed-1786

17 Jul at 22:27:43 GMT

A case study for XSLT transformation of JSON: the transpiler

Issue #2025 closed #closed-2025

17 Jul at 22:25:51 GMT

Combine the concepts of pins/labels and modified lookups

Pull request #2111 created #created-2111

17 Jul at 22:09:12 GMT
2104 Point out places where jnode-content is called implicitly

Fix #2095

This PR is purely editorial: it adds notes and examples showing where jnode-content() is (or is not) called implicitly.

Pull request #2110 created #created-2110

17 Jul at 21:08:13 GMT
2098 Clarify when jnode() is called implicitly

Fix #2098

Pull request #2109 created #created-2109

17 Jul at 20:38:08 GMT
2103 Allow operand of JNode accessors to be omitted or empty

Fix #2103

Pull request #2108 created #created-2108

17 Jul at 08:32:26 GMT

QT4CG-123-01 Add example of library module using methods

Pull request #2107 created #created-2107

17 Jul at 07:50:34 GMT

QT4CG-129-01: Actions from review of PR2094

Pull request #2106 created #created-2106

15 Jul at 21:44:03 GMT
Add note on the impossibility of cyclic instances

Responding to an action at today's meeting, this adds a note to the effect that although types can contain cyclic references, instances cannot.

QT4 CG meeting 129 draft minutes #minutes—07-15

15 Jul at 16:45:00 GMT

Draft minutes published.

Pull request #2105 created #created-2105

15 Jul at 17:16:39 GMT
Fix type of `fn:schema-type-record` field `constructor`

The constructor of fn:schema-type-record is currently shown with a type of

fn(xs:anyAtomicType?) as xs:anyAtomicType?

However this does not cover list type constructors, returning multiple occurrences. It should thus be

fn(xs:anyAtomicType?) as xs:anyAtomicType*

A test exists that requires this: schema-type-005

Issue #2104 created #created-2104

15 Jul at 17:16:38 GMT
JNodes: unwrapping

Related (but not identical to) https://github.com/qt4cg/qtspecs/issues/2095:


As Michael indicated in https://github.com/qt4cg/qtspecs/issues/2095#issuecomment-3069173742, the implicit unwrapping of accessed/iterated JNode results may already be defined in the current spec, but it may need to be further clarified. Examples:

let $jnode := { 'array': [ 1, 2 ] }/array
return (
  (: FLWOR expressions :)
  for member $m in $jnode return $m,
  (: Functions :)
  array:size($jnode),
  (: Lookups :)
  $jnode?1
)

Issue #2054 closed #closed-2054

15 Jul at 17:08:53 GMT

JPath expression

Issue #2103 created #created-2103

15 Jul at 16:48:51 GMT
JNodes functions: 0-arity variant

Similar to fn:name and other accessor functions, the new JNode functions (fn:jnode-content, fn:jnode-position, fn:jnode-selector) should be given a 0-arity variant.

Issue #2102 created #created-2102

15 Jul at 16:38:18 GMT
Type diagrams: drop/add parentheses

The current presentation of the data types is inconsistent (https://qt4cg.org/specifications/xpath-datamodel-40/Overview.html#types-hierarchy):

It includes GNode, XNode, and JNode, attribute, document etc. (without parentheses), but function(*), array(*) and map(*). Shouldn’t we remove the parentheses from function(*) etc., or add them to the other types?

Or maybe we should even change function(*) to Function, etc.

Labels | Types
-- | --
GNode | gnode()
XNode | node()
JNode | jnode()
attribute | attribute(), attribute(*), attribute(a), …
document | document-node(), document-node(*), …
function() | function(*), function(xs:int) as xs:int, …
array() | array(*), array(xs:int), …
map(*) | map(*), map(xs:int, xs:int), …

Issue #2011 closed #closed-2011

15 Jul at 16:27:58 GMT

675(part): Add XSLT static typing rules for new kinds of XPath expression

Issue #2038 closed #closed-2038

15 Jul at 16:24:56 GMT

Drop dependency of fn:apply-templates on the default mode

Issue #2043 closed #closed-2043

15 Jul at 16:24:55 GMT

2038 Tweak the rules for fn:apply-templates references to modes

Issue #2101 created #created-2101

15 Jul at 16:21:58 GMT
Named record types: drop constructors, complete list

The spec defines various built-in named record types (https://qt4cg.org/specifications/xpath-functions-40/Overview.html#id-built-in-named-record-types):

key-value-pair
load-xquery-module-record
parsed-csv-structure-record
random-number-generator-record
schema-type-record
uri-structure-record

Suggestions:

  1. The spec says in https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-named-record-types:

Named record types implicitly create a constructor function that can be used to create instances of the record type.

I would propose excluding the constructors for built-in types. It will save us a lot of bulky code in the implementations, for functions that will hardly be used, as they all refer to return types of existing functions. Ironically, the only exception might be key-value-pair, but it is redundant anyway (we already have map:pair… unless it is dropped, together with the record type, by #2092).

  2. A type is missing for the result of fn:divide-decimals, and we should suffix key-value-pair with -record (provided we will keep it in the spec).

Issue #2014 closed #closed-2014

15 Jul at 16:19:43 GMT

QT4CG-122-01 Add notes, examples, and rationale for xsl:select

Issue #2003 closed #closed-2003

15 Jul at 16:17:04 GMT

Conditional entries in map constructors

Issue #2094 closed #closed-2094

15 Jul at 16:17:03 GMT

2003 Generalize Map Constructors

Issue #2083 closed #closed-2083

15 Jul at 16:13:32 GMT

2054 Generalized Path Expressions

Issue #2031 closed #closed-2031

15 Jul at 16:03:33 GMT

2025 JNodes

Issue #1307 closed #closed-1307

15 Jul at 16:02:56 GMT

For symmetry, add functions array:scan-left and array:scan-right

Issue #2057 closed #closed-2057

15 Jul at 16:02:51 GMT

Steps: variable element names

Issue #2035 closed #closed-2035

15 Jul at 16:02:37 GMT

Recursive record types: unrealistic example in XPath spec

Issue #2096 closed #closed-2096

15 Jul at 16:02:35 GMT

2035 Drop unworkable example of recursive record types

Issue #2100 created #created-2100

15 Jul at 15:55:40 GMT
JNodes: functions

With #2083, some XQFO functions were generalized for JNodes:

fn:distinct-ordered-nodes
fn:generate-id
fn:root
fn:siblings
fn:transitive-closure

Others are still pending (to be completed):

fn:has-children
fn:innermost
fn:outermost
fn:path

We should…

  1. analyze the rules of the existing functions (for example, what happens if both XNodes and JNodes are used in fn:transitive-closure?)
  2. add more functions.

Issue #2099 created #created-2099

15 Jul at 12:31:59 GMT
Choosing names for the jnode function and the jnode type

As a late-breaking change to PR #2083, the item type syntax for matching JNodes has been changed to jnode-type(), to avoid a clash with the name of the function fn:jnode.

We might prefer to resolve this clash in a different way.

Issue #2098 created #created-2098

15 Jul at 11:24:28 GMT
JNodes: combining node sequences

Internal questions and feedback on the union operator, triggered by the JNodes proposal:

  1. Will { 1: 2 } union { 3: 4 } be allowed, or will it be { 1: 2 }/. union { 3: 4 }/. ?
  2. Combining maps and arrays: one might expect { 1: 2 } union { 3: 4 } to result in { 1: 2, 3: 4 }.
  3. If union et al. are enhanced anyway, couldn’t they be generalized for sequences? (1, 2) union 3 → (1, 2, 3).

…which I answered as follows:

  1. I guess no; the conversion to JNodes is needed.
  2. …a good reason why we should not implicitly coerce maps/arrays to JNodes.
  3. For atomic-only sequences, it could be equivalent to fn:distinct-values((A, B)). For heterogeneous sequences, it gets tricky: How should <a>1</a> union 1e0 be combined? Similar to how functions like fn:min are defined, it could be the first item that determines how the remaining input is combined (but the operation would not be commutative anymore; A union B might yield different results than B union A).

QT4 CG meeting 129 draft agenda #agenda-07-15

14 Jul at 12:20:00 GMT

Draft agenda published.

Issue #366 closed #closed-366

14 Jul at 06:41:06 GMT

Support xsl:use-package with xsl:package-location

Issue #2097 created #created-2097

13 Jul at 10:31:47 GMT
`jnode` as a subtype of `node`

It seems that introducing jnode by extending node is more consistent than introducing jnode as a data type that is a sibling to node.

I would like to hear comments about pros and cons of this approach.

@ruv wrote:

I wonder what if jnode was a subtype of node

@michaelhkay wrote:

Then all operations on node would become available for jnode, including many that obviously don't make sense, for example getting the in-scope namespaces, applying schema validation, etc etc.

This does not seem to be a problem. There are many features that make sense for one node kind (or type) and don't make sense for others.

For example,

  • fn:name() does not make sense for document-node(), comment(), text() (but applies to them and returns the empty string);
  • fn:in-scope-prefixes() applies only to a node that is an element().
  • the constructor document { } does not accept an attribute node.

Thus, nodes that are a subtype of jnode can have their own restrictions.

jnode as subtype of node

There can be four direct subtypes of jnode:

  • map-node
  • array-node
  • map-entry
  • array-member (or maybe let's call it array-entry)

So, jnode is a union of them: jnode = map-node | array-node | map-entry | array-entry.

Nodes of the type map-node and array-node are similar to document-node. Their parent is always ().

The child axis of a jnode can contain only nodes of the type map-entry | array-entry.

An advantage of this approach is that there is no need to introduce XNode and GNode, along with the confusion they bring, such as having to say that node() matches XNode but not GNode (in the general case).

This probably also allows us to specify XSLT for jnodes more seamlessly.

Pull request #2096 created #created-2096

13 Jul at 10:07:48 GMT
2035 Drop unworkable example of recursive record types

Fix #2035

Issue #2095 created #created-2095

13 Jul at 09:37:28 GMT
JNodes: result processing

By playing around with the new JNodes syntax, I noticed that the somewhat bulky function fn:jnode-content needs to be used a lot to process the result of a path traversal, basically whenever coercion is not possible or does not make sense (iterations, any operation based on item()*).

An example:

{ 'Catania': { 'nomi': ('Alfredo', 'Andrea') } }
  1. Count the number of persons in Catania:
./Catania/nomi => jnode-content() => count()
  2. List all persons in upper case:
for $nome in .//nomi => jnode-content()
return upper-case($nome)

To prevent users from selectively resorting to the shorter lookup syntax…

.?Catania?nomi => count()

for $nome in .?Catania?*?nomi
return upper-case($nome)

…we could include another pseudo-function that returns the content/value instead of a JNode:

./Catania/content(nomi) => count()

for $nome in .//content(nomi) 
return upper-case($nome)

One drawback would be that this would violate the current principle that the righthand side is only a filter.

Pull request #2094 created #created-2094

13 Jul at 08:40:48 GMT
2003 Generalize Map Constructors

Allows conditional and repeated entries in a map constructor.

Fix #2003

Issue #2093 created #created-2093

12 Jul at 09:07:18 GMT
XQFO: structuring

The way the XQFO functions are structured is becoming increasingly arbitrary. Examples:

  • Processing sequences: fn:doc
  • Processing nodes: fn:string
  • Processing QNames: fn:in-scope-prefixes
  • Parsing and serializing: fn:xsd-validator

For at least 10% of the functions, it will be difficult to find a good categorization (categorization is a challenging topic in itself), but maybe we can improve the status quo.

We can certainly tackle this at a later stage; this issue is only about reminding us of the task.

Issue #1715 closed #closed-1715

11 Jul at 21:45:32 GMT

Array Lookups: partial removal of out-of-bounds checks

Issue #1995 closed #closed-1995

11 Jul at 21:42:54 GMT

Consistency: array lookups

Issue #1872 closed #closed-1872

11 Jul at 21:42:00 GMT

Arrays: members → values / entries?

Issue #1871 closed #closed-1871

11 Jul at 21:38:13 GMT

Arrays and maps: consistency

Issue #2092 created #created-2092

11 Jul at 20:34:18 GMT
Drop map:pair, map:of-pairs, map:pairs, array:members, array:of-members

I propose dropping the three functions map:pair, map:of-pairs, map:pairs, together with the built-in record type fn:key-value-pair.

With the introduction of JNodes, I think these are redundant.

  • In place of map:pairs($map), use $map/*.
  • In place of map:pair($key, $value), use {$key : $value}/*
  • In place of map:of-pairs, use map:build($jnodes, jnode-selector#1, jnode-content#1)

I'm proposing to keep the map:entry, map:entries, and map:merge trio which also do much the same thing.

Similarly, I propose dropping array:members and array:of-members, and "value records":

  • In place of array:members($array), use $array/*
  • In place of array:of-members, use array:build($jnodes, jnode-content#1)

(Alternatively: keep the functions but define them to return and consume JNodes, rather than key-value and value records.)

Issue #2046 closed #closed-2046

11 Jul at 15:29:25 GMT

Promote ".." to a primary expression

Issue #2076 closed #closed-2076

11 Jul at 14:18:37 GMT

TOC: Interaction

Issue #2091 closed #closed-2091

11 Jul at 14:18:35 GMT

ToC changes per issue #2076

Pull request #2091 created #created-2091

11 Jul at 14:18:10 GMT
ToC changes per issue #2076

Fix #2076

Hopefully this is an improvement!

Issue #2090 closed #closed-2090

11 Jul at 07:12:12 GMT

What does it mean to send an `encoding` serialization parameter to `fn:serialize`?

Issue #2090 created #created-2090

10 Jul at 16:35:28 GMT
What does it mean to send an `encoding` serialization parameter to `fn:serialize`?

We say that fn:serialize() returns a string, so I'd expect it to be UTF-8 (or UTF-16, or whatever the implementation's common character set is for strings). I don't see any mention of the meaning of encoding.

Issue #2089 created #created-2089

09 Jul at 11:46:09 GMT
JNode properties: Presentation

Maybe the presentation of XNode and JNode properties could be aligned.

The XDM uses square brackets for XML node properties:

Document node properties are derived from the infoset as follows:

base-uri     The value of the [base URI] property, if available, […]

…and the character ¶ for JNode properties:

JNode has the following properties:     ¶parent: a JNode […]

Would it make sense to always use ¶ or square brackets, or are the kinds of properties we talk about too different to be aligned? In the latter case, we may need to clarify this in the spec, or add some words on (possibly non-existing) GNode properties.

Issue #2088 created #created-2088

09 Jul at 11:24:17 GMT
File Module: Feedback, Observations

1. “Regular files” (QT4CG-128-01)

Add POSIX reference.

2. Permissions (QT4CG-128-02)

All functions should be checked with regard to permission handling. Due to the variety of file systems and the programming languages that operate on them, it may turn out that file:not-found and file:io-error are all we can offer.

3. file:is-absolute

The rule says: “A path is absolute if it does not need to be combined with other path information, such as the current directory, to locate a file.”

Thus, file:is-absolute('/') must not return true on Windows systems, as the drive letter is missing.

Other rules of the spec that refer to absolute file paths should reflect this.

4. file:resolve-path

Additionally, Rule 1 “If $path is an absolute path, it is returned unchanged.” contradicts the final sentence, which states that a separator must be added to directory paths:

…to be continued.
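
Point 3 above can be checked against an existing implementation of the same idea. Python's `pathlib` treats a Windows path as absolute only when it has both a drive (or UNC share) and a root, which happens to agree with the reading that '/' alone is not absolute on Windows. The sketch below is only a comparison with that library's rule; whether file:is-absolute should behave identically is an assumption, and the helper `is_absolute` is a hypothetical name.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute(path: str, windows: bool) -> bool:
    """Apply pathlib's absoluteness rule for the chosen path flavour."""
    flavour = PureWindowsPath if windows else PurePosixPath
    return flavour(path).is_absolute()


# A bare root has no drive letter, so it still depends on the current drive.
root_on_windows = is_absolute("/", windows=True)
# A drive letter plus a root needs no further context.
drive_root_on_windows = is_absolute("C:/", windows=True)
# On POSIX systems the root alone is sufficient.
root_on_posix = is_absolute("/", windows=False)
```

The pure path classes do no file system access, so the comparison runs the same way on any platform.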

Issue #2087 created #created-2087

09 Jul at 08:24:22 GMT
Adaptive serialization: JNodes

The (proposed) spec currently says (for the adaptive output method)

A JNode is serialized by serializing its ¶value property.

I propose changing the output to

JNode(k:v)

where k is the serialization of the selector property and v is the serialization of the value property.

The rule for the JSON output method remains the same.

The reason for the change is so that people can see when a query returns a JNode as distinct from returning its value.

Issue #2086 created #created-2086

08 Jul at 20:10:59 GMT
Can the ¶value property of a JNode be (or contain) a JNode?

The data model allows the ¶value property of a JNode to be (or contain) a JNode. But can it actually happen, and if so, what are the consequences?

I think it can happen. Although fn:jnode can't be applied directly to a JNode, it is possible to construct a map or array in which the entries/members are (or contain) JNodes. We can then wrap such an array or map in a JNode using the fn:jnode function, and the child axis applied to this containing array will return JNodes that have JNodes as their ¶value properties.

While the results may be confusing, I don't think they are harmful (and someone may find an imaginative way of making use of such a structure). For the time being therefore, I propose to allow it, perhaps with an explanatory note to point out any dangers.

Should atomization of a JNode unwrap multiple layers? We currently say that if a JNode J has a ¶value V, then the atomization of J is the atomization of V. I see no particular reason to change that rule, but again, it's an edge case we might draw attention to.
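
The atomization rule quoted above unwraps any number of layers simply because it is recursive. A minimal Python model shows this, assuming a JNode is reduced to just a selector and a value, that sequences and arrays are both modelled as lists, and that every other value is already atomic. The `JNode` class and the `atomize` function are hypothetical illustrations, not part of the data model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class JNode:
    """Simplified JNode carrying only a selector and a value."""
    selector: Any
    value: Any


def atomize(item):
    """Atomize an item: a JNode atomizes to the atomization of its value."""
    if isinstance(item, JNode):
        return atomize(item.value)  # recursion unwraps nested JNodes
    if isinstance(item, (list, tuple)):
        # Sequences and arrays atomize member by member.
        return [atom for member in item for atom in atomize(member)]
    return [item]  # anything else is treated as an atomic item


inner = JNode(selector="k", value=42)
outer = JNode(selector=1, value=inner)
nested_result = atomize(outer)
mixed_result = atomize([outer, "x", JNode(selector=2, value=[1, 2])])
```

In this model `atomize(outer)` and `atomize(inner)` give the same result, which is the edge case the paragraph above suggests drawing attention to.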

QT4 CG meeting 128 draft minutes #minutes—07-08

08 Jul at 16:10:00 GMT

Draft minutes published.

Issue #2085 closed #closed-2085

08 Jul at 16:24:12 GMT

Fix markup errors in the EXPath file: specification

Pull request #2085 created #created-2085

08 Jul at 16:24:06 GMT
Fix markup errors in the EXPath file: specification

The FOS schema does not allow an fos:example containing an fos:test to omit the fos:result element. I’ve inserted

<fos:result>���</fos:result>

as a placeholder to mark the problem. I also had to move some of the fos:errors sections to a different location.

@ChristianGruen apologies for just pushing these in. I wanted to get the specs building again.

Issue #2070 closed #closed-2070

08 Jul at 16:01:11 GMT

Map build patch

Issue #2016 closed #closed-2016

08 Jul at 16:00:59 GMT

File Module: Incorporate changes

Issue #2077 closed #closed-2077

08 Jul at 16:00:57 GMT

2016 File Module: Incorporate changes

QT4 CG meeting 128 draft agenda #agenda-07-08

07 Jul at 12:40:00 GMT

Draft agenda published.

Issue #2084 created #created-2084

05 Jul at 17:30:19 GMT
Steps when the context value contains multiple nodes

We have two conflicting statements in the spec.

§4.6.4 says: "The step expression S is equivalent to ./S. Thus, if the context value is a sequence containing multiple nodes, the semantics of a step expression are equivalent to a path expression in which the step is always applied to a single node."

§4.6.5 says: "When the context value for evaluation of a step includes multiple nodes, the step is evaluated separately for each of those nodes, and the results are combined without reordering."

I'm not sure which is intended: does S mean ./S, or .!S?
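
The two readings differ observably. As a rough Python sketch (nodes are modelled as integers whose value stands for document order, and the tree in `CHILDREN` is made up for illustration): the "/" operator removes duplicates and sorts into document order, while "!" concatenates the per-node results in context order.

```python
# Nodes are modelled as integers; the integer is the document-order position.
CHILDREN = {1: [2, 3], 4: [5, 6]}  # hypothetical tree: node -> child nodes


def step(node):
    """Evaluate the step S (here: child::*) for a single context node."""
    return CHILDREN.get(node, [])


def path(context):
    """./S : duplicates removed, result sorted into document order."""
    found = {n for ctx in context for n in step(ctx)}
    return sorted(found)


def simple_map(context):
    """.!S : per-node results concatenated in context order, no reordering."""
    return [n for ctx in context for n in step(ctx)]


context = [4, 1, 4]         # not in document order, with a duplicate
print(path(context))        # [2, 3, 5, 6]
print(simple_map(context))  # [5, 6, 2, 3, 5, 6]
```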

Pull request #2083 created #created-2083

05 Jul at 12:09:37 GMT
2054 Generalized Path Expressions

This proposal (which has the JNodes proposal as its baseline) is a first cut at defining generalised steps and path expressions that handle both XNodes and JNodes in a uniform way.

The proposal adds functionality to path expressions (using "/") but does not yet remove the corresponding functionality from lookup expressions (using "?") - that will follow in a subsequent draft.

The changes are largely confined to XPath section §4.6.

Obviously there is much scope to add notes and examples. There is also a need to reorganise sections so concepts are introduced before they are referenced.

Issue #2082 created #created-2082

04 Jul at 16:43:41 GMT
parse-html options parameter conventions

In most functions, the options parameters have types such as xs:boolean and xs:string. But in parse-html, they are xs:boolean? and xs:string?

Issue #2081 closed #closed-2081

03 Jul at 21:23:07 GMT

Destructuring let combined with for clause

Issue #2081 created #created-2081

03 Jul at 20:29:30 GMT
Destructuring let combined with for clause

We may need to look at expressions of the following kind…

let $($a, $b) := (1 to 6) ! string()
for $i in 1 to 3
return ($a, $b, $i)

…which currently yield an exception:

Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 12.1 beta
Java: Eclipse Adoptium, 17.0.14
OS: Windows 11, amd64
java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 32
	at org.basex.query.var.QueryStack.set(QueryStack.java:124)
	at org.basex.query.QueryContext.set(QueryContext.java:577)
	at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:147)
	at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:79)
	at org.basex.query.scope.MainModule$1.next(MainModule.java:65)
	at org.basex.query.QueryContext.next(QueryContext.java:395)
	at org.basex.query.QueryContext.lambda$6(QueryContext.java:629)
	at org.basex.query.QueryContext.run(QueryContext.java:770)
	at org.basex.query.QueryContext.cache(QueryContext.java:609)

Issue #2080 created #created-2080

03 Jul at 20:24:20 GMT
Destructuring let clauses: Bind remaining values

With the new LetSequenceBinding, LetArrayBinding and LetMapBinding clauses, single items of an evaluated expression can be partially bound:

let $($a, $b) := (1, 2, 3)
let $[$a, $b, $c] := [ 1, 2, 3 ]
let ${$a, $b} := { 'a': 1, 'b': 2, 'c': 3 }

It would be helpful to be able to also bind all remaining items, for example with the three-dot syntax. If we allow this syntax for maps, the result could be a submap with all entries except for the ones that were bound by previous bindings:

(: $first: 1, $remaining: (2, 3) :)
let $($first, $remaining...) := (1, 2, 3)

(: $first: 1, $remaining: (2, 3) :)
let $[$first, $remaining...] := [ 1, 2, 3 ]

(: $a: 1, $remaining: { 'b': 2, 'c': 3 } :)
let ${$a, $remaining...} := { 'a': 1, 'b': 2, 'c': 3 }
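
For comparison, the intended behaviour resembles star-unpacking in Python. The following is only an analogy, not a statement about the proposed XQuery semantics; `bind_rest` is a hypothetical helper, and Python returns the remaining items as a list where the proposal would yield a sequence or array.

```python
# Sequence and array case: star-unpacking collects the remaining items.
first, *remaining = (1, 2, 3)


def bind_rest(mapping, *names):
    """Hypothetical helper: bind the named keys, return the rest as a submap."""
    bound = [mapping[name] for name in names]
    rest = {k: v for k, v in mapping.items() if k not in names}
    return (*bound, rest)


# Map case: $a is bound by key, the other entries form the remaining submap.
a, remaining_map = bind_rest({'a': 1, 'b': 2, 'c': 3}, 'a')

print(first, remaining)   # 1 [2, 3]
print(a, remaining_map)   # 1 {'b': 2, 'c': 3}
```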

Issue #2079 created #created-2079

03 Jul at 12:18:40 GMT
Extend EQName with optional prefix

A QName has three components, prefix, uri, and local, and it's sometimes useful to be able to specify all three.

I suggest extending the definition of EQName to allow the format Q{uri}prefix:local, where the optional prefix is documentary only, but is present in the resulting QName value.

There may be a need to look at places where EQNames are used and to describe the consequences of using this format.
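
A rough Python sketch of how the extended format could be split into its three components. The regular expression is an assumption made for illustration: it does not check NCName validity or whitespace handling, and `parse_eqname` is a hypothetical name.

```python
import re

# Q{uri}, then an optional "prefix:", then the local name.
# NCName validation is deliberately omitted in this sketch.
EQNAME = re.compile(r'^Q\{([^{}]*)\}(?:([^:\s]+):)?([^:\s]+)$')


def parse_eqname(text):
    """Split Q{uri}prefix:local (prefix optional) into its components."""
    match = EQNAME.match(text)
    if match is None:
        raise ValueError(f"not an EQName in Q{{uri}} form: {text!r}")
    uri, prefix, local = match.groups()
    return {'uri': uri, 'prefix': prefix, 'local': local}


print(parse_eqname('Q{http://example.com/ns}ex:item'))
print(parse_eqname('Q{http://example.com/ns}item'))
```

In the second call the prefix component is `None`, which corresponds to the existing EQName form.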

Issue #2078 created #created-2078

03 Jul at 03:14:09 GMT
2031/2025 JNodes: inconsistency in data model taxonomy, definitions

Per chair request, I'm raising an issue on PR #2031 (on issue #2025), even though it has not been adopted by the CG.

I strongly support the JNodes proposal. But in its current state, I have concerns about the fundamentals. If I am right, or only partly right, adjustments will have cascading effects.

I realize that throughout the specs we use the "Definition" rubric loosely, but, for the points I raise below, I would ask that we at least aspire to a more robust definition of "definition." I draw on the classic Aristotelian model, where a good definition should specify the definiendum's genus ("definiendum" = "the thing to be defined"), then supply only those predicates needed to distinguish the definiendum from other species of that genus. The classic example is the definition of "human being" as a "rational animal." The animal is the genus, and the adjective "rational" delimits human beings from non-human animals. No need to get hung up on details -- that's the gist of what informs my comments below.

The PR proposes the new top-level structure:

  • GNode
    • XNode
    • JNode

The first term is defined: 

[Definition: The term generic node or GNode is a collective term for XNodes (more commonly called simply nodes) representing the parts of an XML document, and JNodes, often used to represent the parts of a JSON document.]

This definition attempts to define things outside the scope of the definiendum. It is presented here as a kind of abstract umbrella category for more specific things. Not a big deal; we carry on:

[Definition: An XNode, more commonly referred to simply as a node, represents a construct found in an XML document. There are seven kinds: document nodes, element nodes, attribute nodes, text nodes, comment nodes, processing instruction nodes, and namespace nodes] [Definition: A JNode represents an encapsulation of a value in a tree of maps and arrays, such as might be obtained by parsing a JSON document. XDM maps and arrays, however, are more general than those found in JSON.]

Each of these two definitions says not what the definiendum is, but rather what it does ("represents": like a member of parliament represents constituents? -- confusing). It also includes buffer words that introduce intermediaries between the definiendum and the thing you would think most immediate to it: "construct," "found," "encapsulation," "value."

More difficult is the fact that, like GNode, an XNode is an abstract category and not a thing in itself. But a JNode, we learn later, is not an abstract category, but an actual thing, with properties. So at the top level of the taxonomy, we have an inconsistency, between abstract categories that have no instantiation, versus those that do, and the two are put in parataxis.

The definition of JNode is not well formulated. It restricts itself to "a value in a tree of maps and arrays" but not to maps and arrays themselves. Does the quoted phrase mean a map entry or an array member? Or the value within said entry or member? 

Slight tangent: in the specs' definition of "value," the term is not really defined, but simply said to be synonymous with "sequence." But in practice the word substitution doesn't work. More often, the specs use "value" in a more restricted, common-sense meaning, to describe a two-term relationship. A thing "owns" a value and some datum inhabits the role of that thing's value. X has value Y. Y is value of X. We run into problems with the ambiguous word "value." Currently a JNode encapsulates a value (see above). But it also has the property (we learn later) of value. So the value has a value?

The JNode definition is sharpened, not in the data model, where it should be, but in the opening sentence of XSLT section 20: "A JNode is a wrapper around a map or array, or around a value that appears within the content of a map or array." This raises the question, what is a wrapper? And content? But it also raises the question about the relationship between JNode and map and between JNode and array, and the juxtaposition of JNode with XNode accentuates the difference. We would never say that an XNode is a wrapper around an element, an attribute, etc. The inconsistency is of a piece with the confusion I've pointed out above concerning the taxonomy of the data model.

Before I propose a solution, I need to probe a similar problem that already exists in the specs:

[Definition: A function is an item that can be called. ]

The word "called" is set in boldface, as if it is a technical term defined elsewhere. It is not, and is rarely used in the specs. What is it for something to be callable? Non-callable? To my mind, we do not have a proper definition of "function," and it is fair game for adjustment. As we have done. In 4.0 we have promoted the function and its proper parts into the topmost level of the data model taxonomy (with adjustments to a few definitions).

[Definition: An array item (also called simply an array) is a function item that represents an array.]

This suffers from the same flaws as GNode and XNode ("represents"), and is tautologous. In the version 3.1 definition of "map" we had the same problem, but the version 4.0 definition at least avoids the tautology.

So, to sum up, we have definitions that aren't, inconsistency in our data model taxonomy, and a variety of other problems.

A different approach

We all intuit that the new taxonomy GNode - (XNode | JNode) is meaningful, useful, and important. Arrays and maps really are trees as much as they are functions.

Let all three terms GNode, XNode, and JNode be defined as abstract categories.

Just as XNode is subdivided into specific xnodes, let JNode subdivide into four specific jnodes:

  1. map
  2. map entry
  3. array
  4. array member

Adopt the same approach we do for xnodes, and define each of the four on its own terms. Define map - map entry and array - array member along lines similar to the approach adopted in 6.6 to define element - attribute (quite analogous!). We have wrestled over having to have both sequence and selector properties. But with this new approach, we are not stuck. Only map and array jnodes require a sequence property. Map entry and array member jnodes require only the selector property, not the sequence.

This approach is extensible. Suppose we have a proposal for a new JNode. It's a blork, and every blork has one or more cheegs, each one of which has one or more drazers. We simply define three more JNodes: blork, cheeg, drazer. We make sure that the properties for each are suited to what they are individually (the same way we do for the 7 types of XNodes).

One more step, the most controversial: drop Map Items and Array Items from the Function Items category. Yes, there is a fundamental way in which maps and arrays behave like (non-map/array) functions, but there are also equally fundamental ways in which maps and arrays behave like XNodes. If we do not need to define maps and arrays subordinate to XNodes/nodes, then why should we define them subordinate to functions? JNodes have dual citizenship.

An alternative taxonomy is to drop the concept of GNode altogether, and let there be four kinds of item: 

  • item 
    • anyAtomicType 
    • XNode 
      • attribute 
      • document 
      • element 
      • text 
      • comment 
      • processing-instruction 
      • namespace 
    • JNode/JFunction 
      • map 
      • map entry 
      • array 
      • array member 
    • function(*)

Pull request #2077 created #created-2077

02 Jul at 11:00:37 GMT
2016 File Module: Incorporate changes

Includes general refactorings.

Closes #2016

Issue #2076 created #created-2076

02 Jul at 08:59:56 GMT
TOC: Interaction

Based on user feedback that I got, it seems that the solution to expand/collapse a sub-TOC could be more intuitive.

@ndw Would it be possible to remove the rightmost arrow icon, and to simply expand the subentries when a TOC entry is clicked? This would also decrease the number of icons, and it would remove the current, somewhat confusing behavior that a TOC entry is also expanded/collapsed when the empty area to the right of the arrow is clicked.

We could keep the arrow on top level to be able to expand/collapse all entries.

Issue #2075 created #created-2075

02 Jul at 08:50:01 GMT
Editorial notes (incremental)
  • The »Summary of Changes« sections contain outdated information that refers to the presentation of the specs. Maybe we don’t really need them:
    • Use the arrows to browse significant changes since the 3.1 version of this specification.
    • Sections with significant changes are marked Δ in the table of contents. New functions introduced in this version are marked ➕ in the table of contents.

…to be continued.

Issue #2074 closed #closed-2074

02 Jul at 08:13:38 GMT

JPath operator

Issue #2074 created #created-2074

02 Jul at 08:05:07 GMT
JPath operator

Christian (I'm not sure where....) has proposed using an operator other than ?, for example ?/, for paths involving JNodes.

This would eliminate some of the non-orthogonalities introduced in the interests of backwards compatibility. For example, the result could always be in document order with duplicates eliminated (which is not currently true for A?B). A?/B could then be truly synonymous with A?/child::B, with no ifs and buts. The result would always be a sequence of JNodes; they would only be unwrapped if used in a context (such as arithmetic) where coercion forces the unwrapping.

This might (perhaps?) also enable a tidier syntax for filtering by type: the current syntax A?~[sequenceType] is rather clunky.

Issue #2073 created #created-2073

02 Jul at 07:47:05 GMT
JNodes and Sequences

The JNode model as currently proposed doesn't handle sequences very elegantly: specifically maps and arrays whose entries/members contain values of more than one item.

Also (but separate) handling of empty maps, arrays, and sequences isn't ideal.

Consider the map {"a": 1, "b": [2], "c": (3, 4), "d": ([5], [6]), "e": (7, [8]), "f": []}

Applying child::* to this map gives you six JNodes, as you would expect, with selector properties "a", "b", "c", "d", "e", "f". After that, things get complicated.

  • The JNode child::a has a value of 1, and no children.

  • The JNode child::b has a value of [2], and has one child, with selector=1, value=2, position=1

  • The JNode child::c has a value of (3,4), and no children.

  • The JNode child::d has a value of ([5], [6]), and has two children. The first child has selector=1, position=1, value=5; the second has selector=1, position=2, value=6. Each of these two children itself has one child.

  • The JNode child::e has a value of (7, [8]), and has one child. The child has selector=1, position=2, value=8.

  • The JNode child::f has a value of [] and no children.

There is logic to this, but it isn't easy to explain. I don't at the moment have any clear ideas for improving matters, but raise the issue in the hope that we can come up with ideas.
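
The logic can be stated as a small model. The following Python sketch is one reading of the behaviour described in the list above, not the proposal's normative rules: a tuple stands for a sequence, a list for an array, a dict for a map, and the `JNode` class and `children` function are hypothetical. Each map or array item in the value contributes its entries or members as children, with `position` recording which item of the sequence they came from; atomic items contribute nothing.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class JNode:
    """Hypothetical JNode with the three properties used in the list above."""
    selector: Any
    position: int
    value: Any


def children(value):
    """Children of a JNode whose value is `value` (tuple = sequence)."""
    sequence = value if isinstance(value, tuple) else (value,)
    result = []
    for position, item in enumerate(sequence, start=1):
        if isinstance(item, dict):      # map: one child per entry
            result += [JNode(k, position, v) for k, v in item.items()]
        elif isinstance(item, list):    # array: one child per member
            result += [JNode(i, position, m)
                       for i, m in enumerate(item, start=1)]
        # atomic items have no children
    return result


data = {"a": 1, "b": [2], "c": (3, 4), "d": ([5], [6]), "e": (7, [8]), "f": []}
for node in children(data):
    print(node.selector,
          [(c.selector, c.position, c.value) for c in children(node.value)])
# a []
# b [(1, 1, 2)]
# c []
# d [(1, 1, 5), (1, 2, 6)]
# e [(1, 2, 8)]
# f []
```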

Issue #835 closed #closed-835

02 Jul at 06:23:10 GMT

Review names of record types

Issue #2040 closed #closed-2040

02 Jul at 05:53:05 GMT

XQuery context value declaration

Issue #2050 closed #closed-2050

02 Jul at 05:53:04 GMT

2040 Fix context value declaration issues

QT4 CG meeting 127 draft minutes #minutes—07-01

01 Jul at 16:10:00 GMT

Draft minutes published.

Issue #2072 created #created-2072

01 Jul at 16:27:30 GMT
JNodes: accessing properties

If we decided to introduce a custom JPath expression for JNodes (see #2054), we could possibly use the classic lookup operator to access the properties of JNodes:

For example, the expression…

let $data := { 'name': 'Achab' }
return $data/name

…would result in a JNode…

JNode(
  value     := 'Achab'
  parent    := JNode($data)
  selector  := 'name'
  position  := 1
)

…and we could use $name?selector to retrieve the selector. This way, we could possibly go without the specific functions fn:JNode-content, fn:JNode-selector and fn:JNode-position.

In principle, the classic lookup operator could be further extended to access properties of XNodes.

Issue #967 closed #closed-967

01 Jul at 15:09:42 GMT

XPath Appendix I: Comparisons

Issue #1021 closed #closed-1021

01 Jul at 15:09:38 GMT

Extend `fn:doc`, `fn:collection` and `fn:uri-collection` with options maps

Issue #1583 closed #closed-1583

01 Jul at 15:09:34 GMT

JSON: Parsing and serializing numbers, often undesired E notation

Issue #1903 closed #closed-1903

01 Jul at 15:09:30 GMT

`fn:scan-left`, `fn:scan-right`: missing steps

Issue #1283 closed #closed-1283

01 Jul at 11:12:22 GMT

77b Update expressions

Pull request #2071 created #created-2071

01 Jul at 11:11:21 GMT
77c deep update

Proposes a new fn:update function that can handle both JNodes and XNodes.

(this is a branch on a branch, so I don't know how well the diff'ing will work; but look in F&O for the fn:update function)

Pull request #2070 created #created-2070

30 Jun at 20:04:35 GMT
Map build patch

Small edits to map:build

  • fix parameter name
  • remove surplus blank lines
  • add example with multiple keys returned by key function (a map of sequences)

QT4 CG meeting 127 draft agenda #agenda-07-01

30 Jun at 10:30:00 GMT

Draft agenda published.

Issue #2068 closed #closed-2068

25 Jun at 12:53:35 GMT

Editorial notes

Issue #2069 closed #closed-2069

25 Jun at 12:53:34 GMT

1970, 2068 Editorial notes

Pull request #2069 created #created-2069

25 Jun at 11:58:17 GMT
1970, 2068 Editorial notes

#1970, Closes #2068.

As this PR includes changes that should have been part of #1970 (and minor other fixes), I will immediately merge the PR. If someone objects, I will be happy to revert the change.

Issue #2068 created #created-2068

25 Jun at 11:16:50 GMT
Editorial notes
  • #2017: Key should also be mandatory for array:sort-by

…to be continued.

Issue #2059 closed #closed-2059

24 Jun at 16:21:16 GMT

Literal QNames: Adaptive serialization

Issue #2060 closed #closed-2060

24 Jun at 16:21:15 GMT

2059 Literal QNames: Adaptive serialization

Issue #2017 closed #closed-2017

24 Jun at 16:19:56 GMT

`fn:sort-by`: Observations

Issue #2062 closed #closed-2062

24 Jun at 16:19:55 GMT

2017 fn:sort-by: Observations

Issue #1970 closed #closed-1970

24 Jun at 16:18:26 GMT

Editorial notes

Issue #2065 closed #closed-2065

24 Jun at 16:18:25 GMT

1970 Editorial notes

Issue #2056 closed #closed-2056

24 Jun at 16:16:38 GMT

Implicit Whitespace in `MarkedNCName` and `QNameLiteral`

Issue #2064 closed #closed-2064

24 Jun at 16:16:37 GMT

2056 Implicit Whitespace in MarkedNCName and QNameLiteral

Issue #2058 closed #closed-2058

24 Jun at 16:13:10 GMT

Literal QNames: Annotations

Issue #2061 closed #closed-2061

24 Jun at 16:13:09 GMT

2058 Literal QNames: Annotations

Issue #2067 closed #closed-2067

24 Jun at 16:12:32 GMT

Fix 'TODO' entries in the function catalog from PR 2013

Pull request #2067 created #created-2067

24 Jun at 16:12:25 GMT

Fix 'TODO' entries in the function catalog from PR 2013

Issue #2009 closed #closed-2009

24 Jun at 16:06:08 GMT

xsl:variable implicit document nodes

Issue #2015 closed #closed-2015

24 Jun at 16:06:07 GMT

2009 Avoid constructing document node when it makes no sense

Issue #2045 closed #closed-2045

24 Jun at 16:04:44 GMT

Functions taking "." as default argument, when "." is empty

Issue #2049 closed #closed-2049

24 Jun at 16:04:42 GMT

2045 Context value can be an empty sequence

Issue #748 closed #closed-748

24 Jun at 16:02:49 GMT

Parse functions: consistency

Issue #2013 closed #closed-2013

24 Jun at 16:02:48 GMT

748 Parse functions: consistency

Issue #1942 closed #closed-1942

24 Jun at 15:58:05 GMT

37 Support sequence, array, and map destructuring declarations

Issue #37 closed #closed-37

24 Jun at 15:57:42 GMT

Support sequence, array, and map destructuring declarations

Issue #2055 closed #closed-2055

24 Jun at 15:57:41 GMT

37 Sequence, Array, and Map destructuring

Issue #2066 created #created-2066

24 Jun at 15:41:14 GMT
Cells in the F&O signature blocks should be vertically aligned to the top

If a function parameter (e.g. in map:build) has a long type that wraps to multiple lines, the parameter name and default are center aligned. A similar issue will happen with the type if the default value wraps.

Setting vertical align to top will fix this.

Pull request #2065 created #created-2065

24 Jun at 11:39:07 GMT
1970 Editorial notes

Closes #1970

Editorial. The only controversial change may be to rename the second parameter of map:build from $keys to $key.

Pull request #2064 created #created-2064

24 Jun at 11:17:11 GMT
2056 Implicit Whitespace in MarkedNCName and QNameLiteral

Closes #2056

Issue #2002 closed #closed-2002

24 Jun at 11:10:28 GMT

Adaptive serialization: QNames

Pull request #2063 created #created-2063

24 Jun at 11:08:21 GMT
1996 Lookups, KeySpecifier: Literal, ContextValueRef

Closes #1996

Pull request #2062 created #created-2062

24 Jun at 10:51:17 GMT
2017 fn:sort-by: Observations

Closes #2017

Issue #850 closed #closed-850

24 Jun at 10:22:02 GMT

fn:parse-html: Finalization

Pull request #2061 created #created-2061

24 Jun at 09:59:58 GMT
2058 Literal QNames: Annotations

Closes #2058

Pull request #2060 created #created-2060

23 Jun at 14:35:21 GMT
2059 Literal QNames: Adaptive serialization

Closes #2059

QT4 CG meeting 126 draft agenda #agenda-06-24

23 Jun at 11:30:00 GMT

Draft agenda published.

Issue #2059 created #created-2059

23 Jun at 10:02:02 GMT
Literal QNames: Adaptive serialization

QNames that are output with the adaptive serialization method could be prefixed with a # character:

serialize((#a, #xml:a), { 'method': 'adaptive' })

(: Result :)
#Q{}a
#Q{http://www.w3.org/XML/1998/namespace}a

Issue #2058 created #created-2058

22 Jun at 16:38:01 GMT
Literal QNames: Annotations

The literal QName syntax could be useful for annotations:

(: Current RESTXQ syntax :)
%rest:query-param('search', '{$search}')
(: Alternative new syntax :)
%rest:query-param('search', #search)

To remove the current exotic status of literal QNames, we could allow the syntax at some more places, like:

  • Catch clauses: catch #err:XPTY0004 { ... }
  • Element/attribute tests: element(#a:b)

Issue #2057 created #created-2057

22 Jun at 10:30:15 GMT
Steps: variable element names

With lookups, it is simple to use dynamic keys:

$map?$name

An equivalent solution is missing for path expressions. The current approach is to check the name with a predicate:

$node/*[node-name() = $name]

We could provide a more concise syntax by extending element tests (and, similarly, attribute tests):

$node/element($name)

The new literal QName syntax simplifies things further:

for $name in (#h1, #h2, #h3)
return $node/element($name)

Issue #2056 created #created-2056

18 Jun at 13:52:01 GMT
Implicit Whitespace in `MarkedNCName` and `QNameLiteral`

The REx-generated XQuery parser has failed to raise the expected syntax error for test case nscons-046.

This is caused by REx's inability to precisely handle differing whitespace allowances of multiple viable parsing alternatives. In the actual case,

<foo>{namespace # ...

there are two possible interpretations for namespace # that still need to be distinguished:

  • a NamedFunctionRef referring to some arity of a function named namespace,
  • a CompNamespaceConstructor using a MarkedNCName.

Now the problem is that the former allows implicit whitespace to follow, while the latter does not. A REx parser accepts whitespace here and thus fails the test case.

While this is REx's problem, and can be worked around by creating multiple # tokens with different lexical lookahead, it would be nicer to avoid this situation. This would possibly also simplify the work for other parsers.

May I ask to allow implicit whitespace in both the MarkedNCName and QNameLiteral productions? This would be in line with variable declarations and references, which also allow implicit whitespace between $ and EQName.

Pull request #2055 created #created-2055

17 Jun at 20:11:24 GMT
37 Sequence, Array, and Map destructuring

Redrafting of PR 1942, after discussion, and extension to XQuery

Fix #37 Supersedes #1942

This PR implements the decisions of today's discussion to the best of my understanding. I don't think further discussion is needed, but it does merit careful checking.

QT4 CG meeting 125 draft minutes #minutes—06-17

17 Jun at 16:30:00 GMT

Draft minutes published.

Issue #1888 closed #closed-1888

17 Jun at 16:20:16 GMT

366 xsl:package-location

Issue #2029 closed #closed-2029

17 Jun at 16:18:59 GMT

fn:xsd-validator - more explanation needed

Issue #2030 closed #closed-2030

17 Jun at 16:18:58 GMT

2029 xsd validator notes and examples

Issue #2041 closed #closed-2041

17 Jun at 16:11:32 GMT

Incorrect example of xsl:namespace-alias

Issue #2042 closed #closed-2042

17 Jun at 16:11:31 GMT

2041 Correction to xsl:namespace-alias example

Issue #1127 closed #closed-1127

17 Jun at 16:11:25 GMT

Binary resources

Issue #2044 closed #closed-2044

17 Jun at 16:07:54 GMT

Hide `MarkedNCName` from XPath spec

Issue #2054 created #created-2054

16 Jun at 18:21:55 GMT
JPath expression

Edit: ?/ is preferred over \; see comments.


I feel that there is one aspect about the exciting new JNode proposal that may develop into a permanent crutch. It is the lack of symmetry between simple lookups and lookups with axes:

a?b
a?child::b

This particularly strikes me as this inconsistency does not exist in XPath.

What about the idea to keep XPath 3.1 lookups unchanged and simple – for the vast number of use cases that do not require navigation – and to introduce a new “JPath expression” instead that will exclusively produce JNodes?

The JStep separator that I would recommend for it is the backslash character \. It bears even more resemblance to the XPath step separator than the often questioned question mark. By using a new syntax, I believe we would have much more freedom in designing a clean solution for navigating maps and arrays that is more consistent as well as more similar to classical node paths.

Some syntactical examples:

let $countries := {
  'Japan': {
    'cities': [ { 'Fukuoka': { 'population': 1600000 } } ]
  }
}
return (
  $countries\*,
  $countries\\cities,
  $countries\Japan\cities[.\\population > 1000]\..,
  $countries\\*[.\ancestor::Japan]
)

Among other advantages that I see, the traversal of arrays of maps would also be less controversial (#115).

QT4 CG meeting 125 draft agenda #agenda-06-17

16 Jun at 12:30:00 GMT

Draft agenda published.

Issue #2053 created #created-2053

16 Jun at 12:43:12 GMT
Add fn:collection-available

Since the fn:collection function can raise errors, perhaps it should have a corresponding function to check if the collection is available?

For reference, see the other *-available functions:

Also, eXist has an implementation-specific xmldb:collection-available function: https://exist-db.org/exist/apps/fundocs/index.html?q=xmldb:collection-available. It's typically used to determine if a collection already exists before creating or deleting it.

A proposed signature, based on those linked above, could be:

fn:collection-available( $source as xs:string? := () ) as xs:boolean

Issue #2052 created #created-2052

16 Jun at 06:18:14 GMT
fn:collation-available: $usage

The function fn:collation-available defines a $usage parameter, but little information is given on how to interpret the collation URI to return the correct result. In addition, there are no test cases.

Do we believe that the additional parameter offers enough advantages, or should we simplify the function?

Issue #2051 created #created-2051

14 Jun at 01:37:09 GMT
XSLT group by cluster

I propose an enhancement of xsl:for-each-group to support clustering.

To start off with a simple use case, suppose one has the following population, <xsl:variable name="ages" as="xs:integer*" select="5, 24, 9, 5, 6, 8, 36, 38, 28"/> and one wishes to cluster the figures like so, in four groups: (5, 5, 6, 8, 9), (24), (28), (36, 38).

One is tempted to create an <xsl:for-each-group> with group-by="((. - 1) to (. + 1))". But this does not work. If @composite is absent or is no, eighteen groups are created. If @composite is yes, eight groups are created. In both cases, the results are not significantly close to the desired output.

I propose a new @group-by-cluster. The following code

<xsl:for-each-group select="$ages" group-by-cluster="((. - 1) to (. + 1))">
    <xsl:sort select="current-grouping-key()"/>
    <group key="{current-grouping-key()}" count="{count(current-group())}">
        <xsl:copy-of select="current-group()"/>
    </group>
</xsl:for-each-group> 

would produce this

   <group key="4 5 6 7 8 9 10" count="5">5 5 6 8 9</group>
   <group key="23 24 25" count="1">24</group>
   <group key="27 28 29" count="1">28</group>
   <group key="35 36 37 38 39" count="2">36 38</group>

There are numerous use cases for the proposed new feature. Here are a few:

  • Clustering map or spatial coordinates
  • Grouping disparate rectangles from OCR output
  • Reconciling triples in linked open data (RDF) that use different IRIs synonymously
  • Detecting typologies within large corpora of documents that have periodically repetitive formulaic paragraphs.
  • Discovering networks of connected things, e.g., networks of email correspondence or publication citations

Currently, the clustering I describe above is feasible in XSLT, but it requires creative strategies, usually a combination of preprocessing and the creation of specialized helper functions to recursively iterate over multiple grouping keys to create group numbers. These are challenging to write and debug, and one loses identity in a preprocessed copy of the original.

By putting clustering into a @group-by-cluster construct, users benefit not only from convenience but also from performance, as a processor might bring novel strategies for clustering.

The current-grouping-key() for a group would consist of a sequence of all members' grouping keys, duplicates removed. No two groups would have any overlap in their grouping key sequences. (That's the definition of a cluster.)

@group-by-cluster would have effect only if its value actually produced a sequence of length greater than one, and if @composite is no. (Should a user be warned if @composite is yes?)
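
To make the intended semantics concrete, here is a minimal Python sketch of the clustering described above: items whose grouping-key sets overlap, directly or transitively, end up in the same group, and a group's key is the union of its members' keys. The function name `cluster` and the output ordering are assumptions made for illustration, not part of the proposal.

```python
def cluster(items, keys_of):
    """Merge items whose grouping-key sets overlap, directly or transitively."""
    clusters = []  # (key set, member list) pairs; key sets stay pairwise disjoint
    for item in items:
        keys, members = set(keys_of(item)), [item]
        untouched = []
        for ckeys, cmembers in clusters:
            if ckeys & keys:
                # Shared key: absorb the existing cluster into the new one.
                keys |= ckeys
                members = cmembers + members
            else:
                untouched.append((ckeys, cmembers))
        clusters = untouched + [(keys, members)]
    # Sort keys, members, and clusters so that the output is easy to read.
    return sorted((sorted(k), sorted(m)) for k, m in clusters)


ages = [5, 24, 9, 5, 6, 8, 36, 38, 28]
for keys, members in cluster(ages, lambda n: range(n - 1, n + 2)):
    print(keys, members)
# [4, 5, 6, 7, 8, 9, 10] [5, 5, 6, 8, 9]
# [23, 24, 25] [24]
# [27, 28, 29] [28]
# [35, 36, 37, 38, 39] [36, 38]
```

Each new item is compared against every existing cluster, so this naive version is quadratic in the worst case; a processor could use a union-find structure or an index on keys instead.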

Pull request #2050 created #created-2050

13 Jun at 22:51:02 GMT
2040 Fix context value declaration issues

Fix #2040

Pull request #2049 created #created-2049

13 Jun at 16:45:01 GMT
2045 Context value can be an empty sequence

For functions like name(), local-name() etc. with as="node()?" default="." in the signature, allow the context value to be an empty sequence.

Fix #2045

Issue #2048 created #created-2048

13 Jun at 16:12:51 GMT
Untrusted execution, and security more generally

Discussion of untrusted execution (that is, a Processor executing code from an untrusted source), and security in general, is present in the spec, but spread out and not really connected.

Untrusted execution seems to me to be one of the biggest security issues for XSLT/XQuery and XPath in general, and I think it’s important to make the distinction between untrusted execution where an untrusted stylesheet or query is executed (perhaps via Saxon’s XsltCompiler.compile()), and when a trusted stylesheet or query causes untrusted code to be executed, as with fn:parse-xml(), fn:doc(), fn:transform() and <xsl:evaluate>.

The proposals in #2034 address one case, but have no effect on the other, and that raises the question of what would happen if an untrusted (unsafe) stylesheet executed fn:doc() with safe = true.

I would like to see the specs address the security implications of untrusted execution more explicitly, and to provide clearer guidance for implementation authors around both completely untrusted code, and trusted code which executes untrusted code.

I think that could take the form of:

  • Annotation of all functions that have potentially problematic external effects (primarily file / resource access).
  • Clear expectations for implementors about what (and how) security restrictions should be configurable.
  • Consistent error codes for security-related exceptions.
  • Consistent security-related options for functions which can cause untrusted source parsing or code execution.
  • An additional security section in the spec under section 2 (Concepts / Basics) in the XSLT/XQuery/XPath specs, and section 1 of XPFO, which collects the overview and points to what to look for in the rest of the spec.

Issue #2047 created #created-2047

13 Jun at 16:11:36 GMT
external resource-accessing functions, available resources, and error codes

While looking at a problem in an implementation of a vendor function that read data from an external resource I was looking through the spec to see what was said about external resources and untrusted execution contexts (for example a processor executing XSLT or XPath provided by a user).

The spec makes mention of ‘available documents’, ‘available text resources’, ‘available binary resources’, ‘available collections’, and ‘available URI collections’.

The documentation for fn:doc(), fn:collection(), and fn:uri-collection() mentions an error (err:FODC0002) to be raised if the URI requested is not in the relevant ‘available X’ (although there is some disagreement about what the ‘X’ should be - we have ‘available node collections’ and ‘available resource collections’ mentioned in the function docs). fn:unparsed-text() and fn:unparsed-binary() do not.

I think the documented behaviour of fn:unparsed-text() and fn:unparsed-binary() should be brought into line with the others.

Issue #2046 created #created-2046

13 Jun at 10:19:56 GMT
Promote ".." to a primary expression

I propose promoting ".." to be a primary expression (rather than an abbreviated step), and (assuming the JNodes proposal is accepted) allowing it to work on JNodes as well as XNodes.

Ignoring JNodes for now, I don't think the change would make any observable difference to the language syntax or semantics. It changes the way that a predicate is interpreted: ..[P] becomes a regular filter expression and is no longer subject to the special rules for predicates within steps; but the only difference is how position() is interpreted, and since .. can only return a singleton, this makes no difference.

For JNodes it means we will be able to write expressions such as $my-map??x[..?y = 3] rather than $my-map??x[?parent::*?y = 3]

Issue #2045 created #created-2045

12 Jun at 22:20:17 GMT
Functions taking "." as default argument, when "." is empty

A number of functions such as fn:name() take the context value as their default argument, with the implication that name() is exactly equivalent to name(.).

However, these functions also say that a type error XPTY0004 is raised if the context value is not a single node.

It seems to me that if the context value is an empty sequence, no type error should be raised; the effect of () -> name() should be the same as name(()).

Pull request #2044 created #created-2044

10 Jun at 19:42:23 GMT
Hide `MarkedNCName` from XPath spec

Today's merge of qt4cg/qtspecs#2028 has added the MarkedNCName production to both the XQuery and the XPath spec. It is however only used within the XQuery spec.

QT4 CG meeting 124 draft minutes #minutes—06-10

10 Jun at 16:15:00 GMT

Draft minutes published.

Issue #2027 closed #closed-2027

10 Jun at 16:09:38 GMT

QNameLiteral syntax for namespace and Processing Instruction constructors

Issue #2028 closed #closed-2028

10 Jun at 16:09:37 GMT

2027 '#' syntax for computed PIs and namespaces

Issue #2032 closed #closed-2032

10 Jun at 16:01:58 GMT

Simple typo in XPath 4.0 example - inherited from XPath 3.0 spec

Issue #2033 closed #closed-2033

10 Jun at 16:01:57 GMT

2032 Fix typo in example

Issue #2022 closed #closed-2022

10 Jun at 15:57:56 GMT

Simplify optional XQuery conformance features

Issue #2026 closed #closed-2026

10 Jun at 15:57:55 GMT

2022 Drop module feature

Pull request #2043 created #created-2043

09 Jun at 17:10:24 GMT
2038 Tweak the rules for fn:apply-templates references to modes

Fix #2038

Pull request #2042 created #created-2042

09 Jun at 16:34:58 GMT
2041 Correction to xsl:namespace-alias example

Fix #2041

Issue #2041 created #created-2041

09 Jun at 16:30:06 GMT
Incorrect example of xsl:namespace-alias

Reported against XSLT 3.0

https://github.com/w3c/qtspecs/issues/71

QT4 CG meeting 124 draft agenda #agenda-06-10

09 Jun at 07:30:00 GMT

Draft agenda published.

Issue #2040 created #created-2040

08 Jun at 19:56:22 GMT
XQuery context value declaration

Section 5.17:

  • there is no changes entry flagging the fact that the coercion rules are now applied (and corresponding test cases such as contextDecl-037a do not identify a PR). The relevant PR is PR #254.

  • The statement "The context value declaration has the effect of setting the context value static type T in the static context." is incorrect. The static context no longer includes a context value static type.

  • The statement "In all cases where the context value has a value, that value must match the type T according to the rules for SequenceType matching" is incorrect (or at least, misleading): as stated two paragraphs later, coercion is applied. But because there can be multiple context value declarations in different modules, specifying different types, perhaps the intent is that coercion is applied only to a value supplied in the query, and not to a value supplied externally? If so, this needs clarifying. There are apparently no tests for coercing an externally-supplied value to the required type.

Issue #2039 created #created-2039

07 Jun at 21:37:09 GMT
Generalize context item to context value in XSLT

Various places, for example the xsl:context-item declaration and the xsl:evaluate/@context-item attribute, should be updated to allow for there being a context value rather than a context item.

At present the context value at instruction level is always either a singleton or absent. We should consider generalizing this to align with XPath, where the -> and ?[....] operators allow the context value to be an arbitrary sequence.

An xsl:for-each-member instruction that iterates over an array and binds each member to the context value would make sense.

Issue #2038 created #created-2038

05 Jun at 14:04:27 GMT
Drop dependency of fn:apply-templates on the default mode

The new fn:apply-templates function in XSLT can invoke the "default mode", either by specifying mode="#default" or by not specifying a mode. The default mode is defined by the nearest containing instruction that has a [xsl:]default-mode attribute.

I would like to drop this dependency.

Most of the cases where a function call depends on the static context (especially an XSLT function) are cases where the relevant property is fixed for a package (e.g. the set of named keys, decimal formats, or character maps). There are places where there is a dependency on something that can vary in a more fine-grained way - notably (a) the default collation, and (b) the set of namespace bindings, but on the whole such dependencies are undesirable (a) because they introduce opportunities for user error (e.g. when copying and pasting code) and (b) because they increase the amount of information the processor has to keep around at runtime just in case it is needed (for example, in a dynamic function call). I would therefore like to avoid introducing this dependency.

The proposed change is (a) to drop "#default" as a value of the mode option for this function, and (b) to say that if no mode is specified by fn:apply-templates the unnamed mode is used.

Pull request #2037 created #created-2037

03 Jun at 14:45:18 GMT
2036 Add rule for streamability of xsl:map

Fix #2036

Issue #2036 created #created-2036

03 Jun at 14:34:31 GMT
Streamability of xsl:map instruction

The special condition that allows more than one operand of xsl:map to be consuming should apply only if the duplicates attribute is absent (defaulting to "error"). If duplicates are allowed then in general the result cannot be streamed.

Issue #1955 closed #closed-1955

02 Jun at 21:23:25 GMT

fn:doc, fn:parse-xml: entity expansion

Issue #2035 created #created-2035

02 Jun at 17:00:40 GMT
Recursive record types: unrealistic example in XPath spec

The example of mutually-recursive record types in XPath §3.2.8.3.1 (using the schema component model as an example) is unrealistic, because an instance of this structure would be cyclic at the instance level, and therefore would be non-instantiable. In practice, the only way to represent cyclic structures using maps and arrays is by use of functions to represent some of the relationships, as we do in the schema record type returned by functions such as fn:schema-type(). We should change the example to use this technique and explain why it is being used: it would still illustrate the point; although it would be in danger of becoming excessively complicated.

Issue #2034 created #created-2034

01 Jun at 18:33:53 GMT
fn:parse-xml, fn:doc: `safe` option

This issue replaces #1955.

The first feedback that we got for the entity-expansion-limit option indicates that our current solution is neither fish nor fowl (weder Fisch noch Fleisch?):

  1. With the initial suggestion in #1860, I hoped we could define sane defaults to prevent attacks caused by fn:parse-xml and fn:doc. This turned out to be difficult. Instead, we now have two specific options (allow-external-entities, entity-expansion-limit) that need to be explicitly assigned to make parsing safer.

  2. In order to parse certain XML documents, like dblp.xml.gz, more than one JDK 11 limit needs to be increased:

http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit
http://www.oracle.com/xml/jaxp/properties/maxGeneralEntitySizeLimit
http://www.oracle.com/xml/jaxp/properties/totalEntitySizeLimit

As we have observed, XML parsing depends on the specific XML parsers. I believe it would be more user-friendly to replace the specific settings with a single option safe, and to let the processor decide which properties are assigned:

  1. true (default): Avoid XXE and billion laughs attacks
  2. false: disable safe parsing (increase limits, allow parsing of external resources)

Pull request #2033 created #created-2033

29 May at 13:41:43 GMT
2032 Fix typo in example

Fix #2032

Issue #2032 created #created-2032

29 May at 05:43:47 GMT
Simple typo in XPath 4.0 example - inherited from XPath 3.0 spec

Location of typo: Array Types section.

Current text:

[ 1, 2 ] instance array(*) returns true()

Expected text:

[ 1, 2 ] instance of array(*) returns true()

Pull request #2031 created #created-2031

28 May at 17:55:57 GMT
2025 JNodes

Fix #2025

This is a first draft for review.

It includes changes to the data model, functions and operators, and XQuery/XPath. It does not yet include changes to XSLT.

It's a big proposal, but I think it removes more complexity from the spec than it adds. It's basically a unification of two concepts, both of which were addressing aspects of the same problem, namely that lookup expressions lose too much information. It gets rid of the pin/label mechanism, and modifiers on lookup expressions, and introduces JNodes and JAxes in their place. (Any suggestions for improved terminology are more than welcome.)

I think we get a lot more "bangs for the buck" with this solution, and it makes navigation of JSON trees work in a much closer way to familiar navigation of XML trees. It needs a lot more work on examples and explanation, of course.

Issue #1859 closed #closed-1859

28 May at 10:04:37 GMT

Question on `fn:chain` and `err:FOAP0001`

Issue #1894 closed #closed-1894

28 May at 09:46:57 GMT

Additional examples to fn:chain - in a new branch

Issue #1883 closed #closed-1883

28 May at 09:45:14 GMT

882 Replace fn:chain by fn:compose

Pull request #2030 created #created-2030

28 May at 09:14:04 GMT
2029 xsd validator notes and examples

Adds more explanation to xsd:validator

Extracts material from the XQuery and XSLT specs describing the validation process, moving this to a new section in F&O, to reduce duplication.

Fix #2029

Issue #1959 closed #closed-1959

27 May at 16:14:55 GMT

1953 (part) XSLT Worked example using methods to implement atomic sets

Issue #882 closed #closed-882

27 May at 16:11:35 GMT

fn:chain or fn:compose

Issue #1984 closed #closed-1984

27 May at 16:11:34 GMT

882 Drop fn:chain

Issue #2023 closed #closed-2023

27 May at 16:06:21 GMT

Semantics of X?$a

Issue #2024 closed #closed-2024

27 May at 16:06:20 GMT

Add rules for $V?$X

Issue #2029 created #created-2029

27 May at 09:26:25 GMT
fn:xsd-validator - more explanation needed

See action QT4CG-119-02

In the review when the function was accepted, I was asked to supply more notes and examples indicating how the various options for assembling a schema interacted with each other.

Pull request #2028 created #created-2028

27 May at 09:19:02 GMT
2027 '#' syntax for computed PIs and namespaces

Fix #2027

Issue #2027 created #created-2027

27 May at 08:05:58 GMT
QNameLiteral syntax for namespace and Processing Instruction constructors

Action QT4CG-021-01 (should be QT4CG-121-01)

We now allow QNameLiterals to be used in element and attribute constructors, for example attribute #xml:space {"default"}.

For symmetry we need a similar syntax for namespace and processing-instruction constructors. These have the same ambiguity problem if the node name clashes with a reserved word such as "div", but they are constrained to be NCNames (or no-namespace QNames, depending on your perspective).

One possible solution is to use a new construct such as NCNameLiteral (but the name is wrong, unless we also allow it to be used in contexts where a Literal is allowed). Another possibility is for the grammar to allow a QNameLiteral, but for the semantics to restrict it to be in no namespace.

Pull request #2026 created #created-2026

26 May at 23:32:23 GMT
2022 Drop module feature

Fix #2022

The effect is that support for library modules is no longer optional.

I decided not to pursue merging the "schema import" and "typed data" features into one.

QT4 CG meeting 123 draft agenda #agenda-05-27

25 May at 09:16:15 GMT

Draft agenda published.

Issue #2025 created #created-2025

25 May at 07:36:49 GMT
Combine the concepts of pins/labels and modified lookups

We have two rather separate mechanisms, both designed to solve aspects of what is essentially the same problem: lookup expressions lose too much information.

Pinning tries to solve the problem by saying that if the origin of the lookup is pinned, then the results of the lookup carry a label containing information about the key and the parent.

Modifiers like pair::* try to solve the problem by returning a map containing the key and the value as separate fields.

But pinning only solves part of the problem, in particular it doesn't prevent X?* flattening the result, and the pairs modifier only solves part of the problem, in particular it doesn't retain parentage.

I would like to try combining them and trying to create a mechanism that is better than either. I don't yet know exactly how this might work, but I'm thinking along the lines:

  • Replace the concept of labelled items with labelled values (that is, a label can be attached to any value, not just an item)
  • Scrap pin() as an explicit function
  • A lookup expression like $X ? child::Y returns a sequence of labelled values (no flattening)
  • The properties of a labelled value include:
      • target - the actual value
      • key - the associated key (or array index)
      • parent - the containing map or array
  • These properties might be made available through syntax such as $LV ? target::* , $LV ? key::*, $LV ? parent::* (or otherwise)
  • ancestor and ancestor-or-self can be made available as derived properties
  • Many operations when given a labelled value should automatically operate on its target, and ignore the label (rather like atomisation). Exactly which operations do this is an interesting question to which I don't yet know the answer. It's tricky because child::* returns a sequence of labelled values, and we want to be able to manipulate this in unflattened form. Perhaps child::* should instead return an array of labelled values? But then you end up with another lookup operation to extract the members of this array.
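The labelled-value idea in the list above can be sketched in Python to show how a lookup might keep the key and parentage that a plain lookup discards. This is only an illustration of the concept, not the proposed design: the names `Labelled`, `label_root`, and `children` are hypothetical, and here `parent` points at the parent's labelled value (rather than the bare map or array) so that ancestors can be derived.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass(frozen=True)
class Labelled:
    """A value plus a label recording where it was found (hypothetical model)."""
    target: Any                      # the actual value
    key: Any                         # the associated key, or 1-based array index
    parent: Optional["Labelled"]     # labelled container, None at the root

    def ancestors(self) -> Iterator["Labelled"]:
        # Derived property: walk the parent chain up to the root.
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def label_root(value: Any) -> Labelled:
    """Wrap the origin of a lookup; it has no key and no parent."""
    return Labelled(value, None, None)


def children(lv: Labelled) -> list:
    """Rough analogue of `$X ? child::*`: one labelled value per entry, unflattened."""
    target = lv.target
    if isinstance(target, dict):     # stand-in for a map
        return [Labelled(v, k, lv) for k, v in target.items()]
    if isinstance(target, list):     # stand-in for an array
        return [Labelled(v, i, lv) for i, v in enumerate(target, start=1)]
    return []                        # atomic values have no children
```

Note that the entry under key `x` stays a single labelled value whose target is the whole array; nothing is flattened until the caller asks for that value's own children.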

Pull request #2024 created #created-2024

23 May at 22:50:59 GMT
Add rules for $V?$X

Fix #2023

Issue #2023 created #created-2023

23 May at 08:56:21 GMT
Semantics of X?$a

In §4.14.3.1 we describe the semantics of lookup expressions. Rules 3a to 3e and 4a to 4e all start "if the KeySpecifier KS is..." and should enumerate all the possibilities for a KeySpecifier. But the case where the KeySpecifier is a VarRef is not mentioned.

Of course, X?$a is supposed to be a shorthand for X?($a); we just fail to state the fact.

Issue #2022 created #created-2022

22 May at 23:21:50 GMT
Simplify optional XQuery conformance features

In XQuery I propose:

  • Dropping the "module feature" - every conformant XQ40 implementation must support library modules
  • Merging the "schema-aware" and "typed-data" features into a single optional feature, aligned with schema-awareness in XSLT

Issue #2021 created #created-2021

22 May at 15:04:15 GMT
XSLT: Move "Patterns" section into "Template Rules"

I propose to move XSLT section §5.4 Patterns so it becomes §6.1, under Template Rules, where it will hopefully be easier to find.

Because this will produce a large number of diffs, I propose to make it a separate PR rather than combining it with the work I am currently doing on patterns and template rules for maps and arrays.

Issue #2020 closed #closed-2020

22 May at 12:05:45 GMT

Reconsider the rationale for the xsl:select instruction

Issue #2020 created #created-2020

22 May at 06:29:53 GMT
Reconsider the rationale for the xsl:select instruction

The section for xsl:select in the XSLT specification includes the following rationale:

An XPath expression written within an XML attribute is subjected by the XML parser to attribute value normalization, which changes the arrangement of whitespace within the value. While this will rarely affect the actual meaning of the expression, it can mean that formatting is lost. Multi-line attribute values are therefore best avoided. The loss of formatting also makes it difficult for an XSLT processor to provide precise error locations.

There are good reasons why xsl:select would be a useful instruction, but I don't think providing precise error locations is one of them. This is just circumventing a problem that is solvable today for select attributes. If an implementer wanted to supply a more precise error location in attribute values (and this would certainly help developers) they could adopt a solution similar to the Ecma SourceMap used by EcmaScript transpilers and minifiers.

In XSLT 3.0, we frequently work with multi-line select attribute values on XSLT instructions without major issues. Examples include calling fold-left() functions with one or more inner functions, or using multi-case if/else expressions. Using xsl:select just to get good error messages does not seem like a good trade-off for the added verbosity.

For these cases, one can use a simple XPath linter in the XSLT editor to highlight the specific error tokens caused by basic typos and unresolved references, and then fall back on using the compiler error messages with approximate line-numbers for (the many) cases that the linter cannot pick up.

Precise XSLT Error Locations and AI Agents

Modern XSLT editors are today fully integrated with AI Agents (e.g. GitHub Copilot or AI Positron). These agents use reported error-locations to explain and suggest a fix for the XSLT problem for the user. Precise error locations are critical to the quality of the explanation and the fix. This help should be available equally for XSLT select attributes or xsl:select instructions.

Pull request #2019 created #created-2019

21 May at 22:59:26 GMT
1776: XSLT template rules for maps and array

Currently work in progress, committed so that the draft can be reviewed.

Changes in three main areas:

  • Pattern syntax: patterns such as ?item and ?parent?item are defined to match items in a map by their key
  • Built-in template rules for on-no-match="shallow-copy-all". Revisits the built in template rules for this scenario.
  • General revision of the processing model for xsl:apply-templates applied to a tree of maps and arrays.

Issue #2018 created #created-2018

21 May at 19:44:07 GMT
Type-checking the result of xsl:apply-templates

Code that calls xsl:apply-templates inevitably has expectations about the type of the result. For example someone doing

<xsl:apply-templates select="@*"/>

may have an expectation that the result will be a sequence of attribute nodes, and the code might fail untidily if it is anything else. There is currently no way of stating this expectation, or of triggering coercion on the result. We have added an attribute xsl:mode/@as but different calls on apply-templates in the same mode may have different expectations. (I'm seeing this particularly with modes that process maps and arrays)

We could add an as attribute to xsl:apply-templates to make the expectation explicit.

Issue #2017 created #created-2017

21 May at 13:31:56 GMT
`fn:sort-by`: Observations

We should make the second parameter obligatory (fn:sort-by(1 to 3) seems confusing).

sort(keys := fn { ?key }) occurs twice in the remaining text; should be sort(key := fn { ?key }).

Issue #1795 closed #closed-1795

21 May at 09:19:21 GMT

XSLT templates: Matching values in a map by key

Issue #1981 closed #closed-1981

21 May at 08:44:06 GMT

Syntax for QName literals clashes with XQuery pragmas

Issue #2016 created #created-2016

21 May at 06:42:30 GMT
File Module: Incorporate changes

The EXPath File Module must be revised in several steps. First of all, several functions need to be incorporated that were added to the initial version (details: https://docs.basex.org/12/File_Functions).

Pull request #2015 created #created-2015

21 May at 00:09:08 GMT
2009 Avoid constructing document node when it makes no sense

Fix #2009

The rules for xsl:variable are changed so there is no attempt to construct an implicit temporary tree when the sequence constructor contains an xsl:map, xsl:array, or xsl:select instruction (perhaps mixed with other instructions).

Compatibility: note that xsl:array and xsl:select are new in 4.0, while xsl:map inside xsl:variable always throws an error in XSLT 3.0.

Justification:

  • a child xsl:select element behaves like a select attribute
  • if the content of xsl:variable is xsl:map or xsl:array it makes no sense to require the user to add as=map(*) or as=array(*) because the type is obvious anyway.

Pull request #2014 created #created-2014

20 May at 21:50:06 GMT
QT4CG-122-01 Add notes, examples, and rationale for xsl:select

Completes action QT4CG-122-01

Issue #2008 closed #closed-2008

20 May at 18:07:44 GMT

2004 Add xsl:select instruction

Issue #2004 closed #closed-2004

20 May at 18:07:44 GMT

xsl:xpath instruction

QT4 CG meeting 122 draft minutes #minutes—05-20

20 May at 16:20:00 GMT

Draft minutes published.

Issue #2006 closed #closed-2006

20 May at 16:10:56 GMT

2005 Add fn:apply-templates function

Issue #2005 closed #closed-2005

20 May at 16:10:56 GMT

apply-templates() as a function

Issue #1991 closed #closed-1991

20 May at 16:08:52 GMT

835 Add built-in named record types to static context

Issue #1085 closed #closed-1085

20 May at 16:06:35 GMT

Parameters to fn:sort

Issue #2001 closed #closed-2001

20 May at 16:06:34 GMT

1085 Revert fn:sort to the 3.1 spec; introduce fn:sort-by

Issue #1992 closed #closed-1992

20 May at 16:04:32 GMT

Type of fn:schema-type-record ? constructor

Issue #1999 closed #closed-1999

20 May at 16:04:31 GMT

1992 Correct type of constructor function in schema-type-record

Issue #1997 closed #closed-1997

20 May at 16:02:44 GMT

Coercion Rules: §3.4.1 rule 3(c)

Issue #1998 closed #closed-1998

20 May at 16:02:43 GMT

1997 Correct nesting of item coercion rules

Pull request #2013 created #created-2013

20 May at 09:15:35 GMT
748 Parse functions: consistency

Closes #748

Issue #2012 created #created-2012

19 May at 08:14:37 GMT
Add array:sort-with

Issue #655 PR #795 introduced fn:sort-with.

We should define array:sort-with for consistency.

Pull request #2011 created #created-2011

18 May at 10:33:15 GMT
675(part): Add XSLT static typing rules for new kinds of XPath expression

Updates the static typing rules in XSLT for new kinds of expression introduced in XPath 4.0. These rules are used in streamability analysis, but more work needs to be done to complete the streamability analysis.

Production rules are now referenced by name, as production numbers are no longer available.

Issue #2010 created #created-2010

17 May at 15:33:07 GMT
XSLT patterns: generalize union, intersect, and except

This is related to issue #402.

We have generalised the meaning of union, intersect, and except, when used in XSLT patterns, so that they now mean:

  • A union B - matches either A or B
  • A intersect B - matches both A and B
  • A except B - matches A and does not match B

With these semantics, there is no longer any sensible reason to restrict these pattern operators to apply only to node patterns. The semantics work equally well for patterns that match (for example) maps or arrays. For example

<xsl:template match="record(a, b) except record(a, b, c)">
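The generalised semantics above treat a pattern purely as a yes/no test on an item, which is why the operators need not be restricted to nodes. A minimal Python sketch, modelling patterns as predicates: the helper `has_keys` is a hypothetical stand-in and deliberately does not reproduce the real matching rules for record types.

```python
def union(a, b):
    """A union B: matches either A or B."""
    return lambda item: a(item) or b(item)


def intersect(a, b):
    """A intersect B: matches both A and B."""
    return lambda item: a(item) and b(item)


def except_(a, b):
    """A except B: matches A and does not match B."""
    return lambda item: a(item) and not b(item)


def has_keys(*names):
    """Hypothetical pattern: any map (dict) containing all the given keys."""
    return lambda item: isinstance(item, dict) and all(n in item for n in names)


# Maps that have keys a and b, but not also c.
pattern = except_(has_keys("a", "b"), has_keys("a", "b", "c"))
```

Because each combinator only calls its operands as predicates, the same three definitions work unchanged whether the operands test nodes, maps, or arrays.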

Issue #2009 created #created-2009

17 May at 08:16:02 GMT
xsl:variable implicit document nodes

In XSLT 3.0, an xsl:variable instruction with no select or as attribute implicitly wraps the value created by the sequence constructor in a document node. This inevitably fails if the content is a map, making it necessary to write

<xsl:variable name="m" as="map(*)">
   <xsl:map>...</xsl:map>
</xsl:variable>

I propose that this wrapping should not happen if the first item in the result of the sequence constructor is a function item (including a map or array). The practical effect on users is that they can leave out the as="map(*)" attribute in this situation.

For function items other than arrays, this is currently an error condition so there is no incompatibility.

For arrays it does represent an incompatible change -- the current rules ("Constructing complex content") say that an array is flattened. But XSLT 3.0 has no instruction to construct an array, it would have to be done using xsl:sequence; and no-one would deliberately construct an array merely in order to flatten it, so the situation is unlikely to arise in practice.

The proposal is that the decision whether or not to construct a wrapping document node should be based on the first item in the sequence. This is to allow lazy evaluation. A function item appearing later in the sequence would be handled the same way as now -- most likely an error. An empty sequence continues to result in a childless document node.
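The first-item rule can be sketched in Python as follows. This is an illustration under stated assumptions, not spec text: `DocumentNode`, `is_function_item`, and `variable_value` are hypothetical names, Python dicts, lists, and callables stand in for maps, arrays, and other function items, and the handling of a function item that appears later in a wrapped sequence is left unmodelled.

```python
from itertools import chain

_MISSING = object()  # sentinel, so that None can be a legitimate item


class DocumentNode:
    """Stand-in for the implicit document node wrapping the constructed content."""

    def __init__(self, children):
        self.children = list(children)


def is_function_item(item):
    # Assumption: dict = map, list = array, callable = other function item.
    return isinstance(item, (dict, list)) or callable(item)


def variable_value(sequence):
    """Decide on wrapping by inspecting only the first item of the sequence."""
    items = iter(sequence)
    first = next(items, _MISSING)
    if first is _MISSING:
        # An empty sequence still yields a childless document node.
        return DocumentNode([])
    if is_function_item(first):
        # No wrapping: the value is the sequence itself.
        return [first, *items]
    # Otherwise wrap as before; later function items are not checked here.
    return DocumentNode(chain([first], items))
```

Only `next()` is called before the decision is made, so the rest of the sequence need not be evaluated first, which is the lazy-evaluation point made above.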

Pull request #2008 created #created-2008

16 May at 21:38:00 GMT
2004 Add xsl:select instruction

Fix #2004

Issue #322 closed #closed-322

16 May at 19:56:12 GMT

Map construction in XSLT: xsl:record instruction

Issue #2007 created #created-2007

16 May at 17:06:57 GMT
Creating arrays in XSLT

This kind of code comes up a lot, and is hard to simplify except by dropping into XPath:

<xsl:array>
   <xsl:for-each select="?members?*[?_nodeType='MethodDeclaration']">
      <xsl:array-member>
         <xsl:apply-templates select="."/>
      </xsl:array-member>
   </xsl:for-each>
</xsl:array>

Issue #2005 (PR #2006) makes it feasible to do it all in XPath (by calling the apply-templates() function) but I don't feel that's the whole answer.

Perhaps we could define something like

<xsl:array-build for-each="?members?*[?_nodeType='MethodDeclaration']">
   <xsl:apply-templates select="."/>
</xsl:array-build>

Pull request #2006 created #created-2006

15 May at 23:26:18 GMT
2005 Add fn:apply-templates function

Fix #2005

Issue #2005 created #created-2005

15 May at 16:53:01 GMT
apply-templates() as a function

I propose introducing apply-templates() as an xslt-only function, with semantics broadly equivalent to the xsl:apply-templates instruction.

The main use case identified so far is constructing maps and arrays, where it enables the XPath syntax to be used rather than the much more verbose XSLT syntax.

Parameters:

  • select: the items to be processed using matching template rules
  • with-params: the parameters to be passed. Like with-params on xsl:evaluate, this means variable names exist at run-time, which is a bit of an innovation, but I think it's manageable
  • mode: again, this means mode names exist at run-time, which may have consequences. There are open questions about what the default should be, or how the various options (default mode, unnamed mode, current mode) should be expressed.

Issue #2004 created #created-2004

15 May at 16:41:25 GMT
xsl:xpath instruction

In the case study https://github.com/qt4cg/qtspecs/issues/1786#issuecomment-2884424739 I encountered a use case where an instruction xsl:xpath would be useful.

<xsl:xpath>
   { "class":
       { "name": f:degenerify(name/@identifier),
          "abstract": ? abstract,
           "extends": array{? extendedTypes =!> map:merge(apply-templates()) },
           "implements": array{? implementedTypes =!> map:merge(apply-templates()) }
             ...
       }
  }
 </xsl:xpath>

The instruction is very simple: <xsl:xpath>EXPR</xsl:xpath> is equivalent to <xsl:sequence select="EXPR"/>. It's particularly useful because XPath constructors for maps and arrays are so much more concise than the XSLT equivalents. Compared with using xsl:sequence, it means that:

  • XML attribute value normalization doesn't kick in, so your formatting is better protected (meaning also that the system has some chance of computing line numbers correctly for diagnostics)
  • You haven't tied up either single or double-quotes as an attribute delimiter; both can be freely used within the expression.
  • You aren't creating the false impression that you're returning a (multi-item) sequence

Note that the content is NOT a sequence constructor; no child elements are allowed; and the content is not interpreted as a text value template. Unlike xsl:evaluate, the XPath expression is statically fixed.

(This example also introduces apply-templates as a function, but that will be a separate proposal).

Issue #2003 created #created-2003

15 May at 16:08:37 GMT
Conditional entries in map constructors

If you're constructing a map using a map constructor, adding an entry conditionally can be a real pain, and typically involves a wholesale rewrite of the way the map is constructed. It would be nice to be able to mark an entry as optional so users don't have to resort to such wholesale rewrites.

The difficulty of course is finding a nice syntax: one that is both intuitively readable and grammatically unambiguous.

One possibility might be:

MapConstructorEntry ::= MapKeyExpr ":"  MapValueExpr "optional"?

with the semantics that if the "optional" keyword is present, and the result of evaluating MapValueExpr is an empty sequence, then the entry is omitted from the constructed map.

A more ambitious construct might be

MapConstructorEntry ::= MapKeyExpr ":"  MapValueExpr ("when" MapEntryCondition)?

which adds the entry to the map only if the condition is true.

For example:

let $map := {"height": string(@height), 
            "width": string(@width), 
            "weight": string(@weight) when exists(@weight)}
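
The intended semantics of the two forms can be modelled outside XPath. The following is a minimal Python sketch, not proposed syntax: the names Entry and build_map are hypothetical, an empty tuple stands in for the empty sequence, and attribute access is modelled as a plain dictionary lookup that yields "" when the attribute is absent (as string(@weight) would).

```python
from typing import Any, Callable, NamedTuple, Optional

EMPTY = ()  # stands in for the XPath empty sequence (assumption of this sketch)


class Entry(NamedTuple):
    key: str
    value: Any
    optional: bool = False                      # models the "optional" keyword
    when: Optional[Callable[[], bool]] = None   # models the "when" condition


def build_map(entries):
    """Build a dict, omitting entries the way the two proposed forms would."""
    result = {}
    for e in entries:
        if e.optional and e.value == EMPTY:
            continue  # "optional": drop the entry when the value is empty
        if e.when is not None and not e.when():
            continue  # "when": drop the entry when the condition is false
        result[e.key] = e.value
    return result


attrs = {"height": "10", "width": "20"}  # no "weight" attribute present

m = build_map([
    Entry("height", attrs.get("height", "")),
    Entry("width", attrs.get("width", "")),
    Entry("weight", attrs.get("weight", ""), when=lambda: "weight" in attrs),
])
print(m)  # {'height': '10', 'width': '20'}
```

In this model the "when" form is needed for the weight entry because its value is a zero-length string rather than an empty sequence, so "optional" alone would not omit it.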

Issue #2002 created #created-2002

15 May at 12:25:15 GMT
Adaptive serialization: QNames

We could use the new QName literal syntax when serializing QNames with the adaptive method:

serialize(xs:QName('x'), { 'method': 'adaptive' })

(: current output: Q{}x :)
(: proposed output: #x or #Q{}x :)

Pull request #2001 created #created-2001

14 May at 17:50:07 GMT
1085 Revert fn:sort to the 3.1 spec; introduce fn:sort-by

Fix #1085

The new functionality introduced into the 4.0 version of fn:sort is repackaged into a new function fn:sort-by with a much cleaner interface; the fn:sort function reverts to its 3.1 specification.

If this PR attracts support then the corresponding change will be applied to the array:sort function.

Issue #2000 created #created-2000

14 May at 14:11:20 GMT
element-to-map() - type signature of plan

The specifications of element-to-map() and element-to-map-plan() use different record types for the data structure representing the plan. In both cases the definition is less precise than it might be (though not wrong). The two functions should use a common named record type, which should be as precise as possible.

Issue #1982 closed #closed-1982

13 May at 17:24:02 GMT

1981 Ambiguity with qname literals and pragmas

Issue #1889 closed #closed-1889

13 May at 17:19:29 GMT

HTML serialization: `html-version` and `version` parameters; allowed values

Issue #1977 closed #closed-1977

13 May at 17:19:28 GMT

1889 Tidy up handling of HTML serialization version, default to HTML5

Issue #1985 closed #closed-1985

13 May at 17:13:44 GMT

Default namespace terminology

Issue #1987 closed #closed-1987

13 May at 17:13:43 GMT

1985 Tidy up namespace terminology

Issue #1986 closed #closed-1986

13 May at 16:49:56 GMT

Obsolete note on reporting errors

Issue #1988 closed #closed-1988

13 May at 16:49:55 GMT

1986 Drop obsolete notes on error reporting

Issue #1989 closed #closed-1989

13 May at 16:49:06 GMT

1983 QName literals in node constructors

Issue #1983 closed #closed-1983

13 May at 16:49:06 GMT

Computed node constructors - use QName literals rather than string literals

Issue #1990 closed #closed-1990

13 May at 16:48:10 GMT

Update schema-for-xslt40.xsd

Pull request #1999 created #created-1999

12 May at 15:24:53 GMT
1992 Correct type of constructor function in schema-type-record

Fix #1992

Pull request #1998 created #created-1998

12 May at 14:28:37 GMT
1997 Correct nesting of item coercion rules

Fix #1997

(A correction to an editorial error that made a substantive difference to the spec.)

Issue #1997 created #created-1997

12 May at 13:57:29 GMT
Coercion Rules: §3.4.1 rule 3(c)

This section reads:

If R is an [atomic type] and J is an [atomic item], then:

  • If J is an instance of R then it is used unchanged.
  • If J is an instance of type xs:untypedAtomic then:
       • If R is an [enumeration type] then A is cast to xs:string.
       • If R is [namespace-sensitive] then a [type error] [[err:XPTY0117]] is raised.
  • Otherwise, J is cast to type R.

The last line (rule 3(c)) looks all wrong. If we just did a cast at this point then rules 4 and 5 would be unnecessary.

It's not easy to trace the history, but I think it went wrong when the rules for choice/union types were refactored (around 2024-04-12). The line in question appears to have originally been under a conditional "if A is an instance of xs:untypedAtomic...".

Issue #1996 created #created-1996

12 May at 11:05:48 GMT
Lookups, KeySpecifier: add NumericLiteral and ContextValueRef?

Various types of expressions are allowed as a KeySpecifier:

Lookup  ::=  ("?" | "??") (Modifier "::")? KeySpecifier
KeySpecifier  ::=  NCName | IntegerLiteral | StringLiteral | VarRef | ParenthesizedExpr | LookupWildcard | TypeSpecifier

Maybe we could add ContextValueRef and NumericLiteral to the list, to make the following expressions legal:

{ '1.5': 'one and a half' }?1.5
array { 1 to 256 }?0x80
(3, 4, 5) ! $array?.

Issue #1995 created #created-1995

12 May at 10:56:19 GMT
Consistency: array lookups

The different variants to look up array members should be unified. For example (if I interpret the rules correctly), the following expressions can be evaluated…

[ 'a' ]?(<x>1</x>)
[ 'a' ]??(number(<x>1</x>))
[ 'a' ]??(1e0)
let $a := 1e0 return [ 'a' ]??$a

…whereas the following expressions raise errors:

[ 'a' ]?(number(<x>1</x>))
[ 'a' ]?(1e0)
let $a := 1e0 return [ 'a' ]?$a

We should probably try to make all of them legal, or try to justify what happens.

QT4 CG meeting 121 draft agenda #agenda-05-13

12 May at 09:00:00 GMT

Draft agenda published.

Issue #1797 closed #closed-1797

12 May at 09:30:43 GMT

elements-to-maps: separate function to construct a plan

Issue #1993 closed #closed-1993

12 May at 08:25:15 GMT

Incorrect test generated for map:pairs

Issue #1994 closed #closed-1994

12 May at 08:25:14 GMT

1993 Stylesheet fix to copy the occurrence indicator

Pull request #1994 created #created-1994

11 May at 21:04:02 GMT
1993 Stylesheet fix to copy the occurrence indicator

Fix #1993

Issue #1993 created #created-1993

11 May at 20:52:20 GMT
Incorrect test generated for map:pairs

The signature for map:pairs is

map:pairs($map as map(*)) as key-value-pair*

but the test case generated in misc-BuiltInKeywords is (incorrectly)

map:pairs(map := ?) instance of function(map(*)) as fn:key-value-pair

Note the missing * at the end.

Issue #1992 created #created-1992

11 May at 16:24:26 GMT
Type of fn:schema-type-record ? constructor

In the type fn:schema-type-record (returned by functions such as fn:schema-type), the field constructor is said to be of type fn(xs:anyAtomicType) as xs:anyAtomicType. It is also said to be "the same function as returned by [fn:function-lookup] applied to the type name (with arity one)". But that function has type fn(xs:anyAtomicType?) as T?

The correct type for the constructor field should be fn(xs:anyAtomicType?) as xs:anyAtomicType?

Pull request #1991 created #created-1991

11 May at 10:45:43 GMT
835 Add built-in named record types to static context

This PR adds six built-in named record types to the static context of every application:

  • Record [key-value-pair]
  • Record [load-xquery-module-record]
  • Record [parsed-csv-structure-record]
  • Record [random-number-generator-record]
  • Record [schema-type-record]
  • Record [uri-structure-record]

These are now listed in Appendix C of F&O

Issue 835 requests a review of the names of these records; perhaps putting them in one place will make that review easier. Personally, I am happy with the names as currently defined.

Pull request #1990 created #created-1990

08 May at 18:22:50 GMT
Update schema-for-xslt40.xsd

Fixed invalid syntax xs:simpleType/@ref (moved to @memberTypes) in simpleType named method

Based the type fixed-namespaces-type-default on xs:token instead of xs:string to allow for whitespace normalization

Changed collation attribute on xsl:merge-key to be an avt (according to spec)

Changed attributes that were previously of type "xsl:char-optionally-expanded" to just xs:string since the spec says they can be any string. I couldn't think of a reason they should be limited to one character optionally followed by a colon and more characters, so I assumed this was some sort of artifact from the past.

Changed xsl:next-iteration and xsl:evaluate to not allow mixed content

Corrected _split_when to _split-when

Added missing shadow attributes (in two places) for allow-duplicate-names, build-tree, json-lines and json-node-output-method

Added missing shadow attribute for select on perform-sort

Added assertions for required attributes:

  • xsl:use-package - name
  • xsl:expose - names, component and visibility

Made the errors attribute a list of tokens instead of just xs:token. Doesn't affect validation but I think it is clearer.

Changed default value of per-mille from a tilde to ‰

Changed the default value of the on-no-match attribute from shallow-skip to text-only-copy, per the spec

Gave xsl:exclude-result-prefixes the same type as no-namespace exclude-result-prefixes, to allow for #all and #default

Gave xsl:extension-element-prefixes the same type as no-namespace extension-element-prefixes, to allow for #default

Removed the xsl:prefixes and xsl:char-optionally-expanded types since they were no longer used after the above changes

Changed the type of the visibility attribute on xsl:attribute-set, xsl:function, xsl:template and xsl:variable to xsl:visibility-not-hidden-type to exclude "hidden" per the spec

Changed the keyword value of the fixed-namespaces attribute from #default to #standard (and adjusted type names)

Pull request #1989 created #created-1989

08 May at 11:53:05 GMT
1983 QName literals in node constructors

Fix #1983

Pull request #1988 created #created-1988

07 May at 13:15:31 GMT
1986 Drop obsolete notes on error reporting

Fix #1986

Pull request #1987 created #created-1987

07 May at 13:08:35 GMT
1985 Tidy up namespace terminology

Fix #1985

Editorial.

The main effect is to centralise the descriptions of how to expand unprefixed QNames into a few named rules which can be referenced and reused throughout the spec.

Issue #1986 created #created-1986

07 May at 10:00:35 GMT
Obsolete note on reporting errors

I propose dropping the following text in XQuery §2.4.2.

None of this text says anything prescriptive, and the suggested notation of URI#local appears outdated.

The method by which an XQuery 4.0 processor reports error information to the external environment is implementation-defined.

An error can be represented by a URI reference that is derived from the error QName as follows: an error with namespace URI NS and local part LP can be represented as the URI reference NS # LP . For example, an error whose QName is err:XPST0017 could be represented as http://www.w3.org/2005/xqt-errors#XPST0017.

Note:

Along with a code identifying an error, implementations may wish to return additional information, such as the location of the error or the processing phase in which it was detected. If an implementation chooses to do so, then the mechanism that it uses to return this information is implementation-defined.

Issue #1985 created #created-1985

07 May at 07:43:12 GMT
Default namespace terminology

There are places in the spec that use sloppy terminology regarding namespaces. For example 2.1.3 says

the namespace URI is inferred from the prefix by examining the in-scope namespaces in the static context

But the static context does not define "in-scope namespaces", it defines "statically known namespaces"

I propose to put together an editorial PR to tidy this up.

Pull request #1984 created #created-1984

06 May at 20:53:13 GMT
882 Drop fn:chain

Fix #882

Supersedes PR #1883

There has been a great deal of discussion about the relative merits of the status-quo fn:chain function and the proposed replacement fn:compose. The CG was polled on whether it preferred to have fn:chain only, fn:compose only, or both, or neither. There was no clear consensus. The only option which no-one seemed to favour was to have fn:chain only -- which is the status quo. Since no-one is happy with the status quo I am therefore proposing that we drop this function. We can then start with a clean slate.

For the record the main criticisms of the fn:chain function as currently specified were:

(a) it is more useful to have a function that combines several functions into a single function, without actually applying that function to a set of supplied arguments

(b) The function has special-case behaviour for arrays (if the input is not an array and the function has arity > 1 then the input sequence is converted to an array).

(c) The need for the function is not clearly motivated; the examples given can all be achieved in some simpler more intuitive way.

Issue #1983 created #created-1983

06 May at 20:32:33 GMT
Computed node constructors - use QName literals rather than string literals

We have introduced (in 4.0) the option to specify element and attribute names in computed node constructors in the form of string literals. We should replace this with QName literals.

Pull request #1982 created #created-1982

06 May at 20:30:39 GMT
1981 Ambiguity with qname literals and pragmas

Resolves the syntax problem identified in #1981 by requiring a space between ( and #.

Adds more examples and notes scattered around the specs.

Issue #1981 created #created-1981

06 May at 16:41:32 GMT
Syntax for QName literals clashes with XQuery pragmas

Unfortunately (as revealed by implementation and testing) the syntax for QName literals clashes with the syntax for pragmas in XQuery.

In the expression error(#err:XPTY0004), the longest token after error is (# which looks like the start of a pragma.

It's actually a wee bit complicated. Looking at the tokenization rules, we shouldn't be recognizing a pragma here because there is no closing #). The tokenization notes say "The lexical production rules for [variable terminals] have been designed so that there is minimal need for backtracking."; the introduction of the new syntax would mean that this is no longer the case. But regardless of the details, I think we have to change the QName literal syntax.

I propose we go for doubling the hash: error(##err:XPTY0004). We need to qualify the rules for tokenizing a pragma to say that a pragma is recognized when we see ((#, optional whitespace, EQName) - that's not unlike the rules we have for other "variable tokens".
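
The clash can be illustrated with a toy tokenizer. This is a Python sketch under stated assumptions, not the XQuery grammar: the token set is a small hypothetical subset, and longest-match behaviour is emulated by listing "(#" before "(" in the regular expression. It also shows why whitespace between ( and # (the approach taken in PR #1982) avoids the pragma-start token.

```python
import re

# Toy token set (hypothetical subset, not the real XQuery grammar).
# Alternatives are tried left to right, so "(#" is listed before "(" to
# emulate longest-match tokenization.
TOKEN = re.compile(
    r"\s+"                                      # whitespace (discarded below)
    r"|\(#|#\)"                                 # pragma delimiters
    r"|[()#]"                                   # single-character tokens
    r"|[A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)?"  # NCName or prefixed QName
)


def tokenize(text):
    """Return the non-whitespace tokens of text, in order."""
    return [t for t in TOKEN.findall(text) if not t.isspace()]


print(tokenize("error(#err:XPTY0004)"))
# ['error', '(#', 'err:XPTY0004', ')']
print(tokenize("error( #err:XPTY0004)"))
# ['error', '(', '#', 'err:XPTY0004', ')']
```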

Issue #1972 closed #closed-1972

06 May at 16:22:39 GMT

Dynamic function call applied to empty sequence

Issue #1240 closed #closed-1240

06 May at 16:22:39 GMT

$sequence-of-maps ? info()

Issue #1975 closed #closed-1975

06 May at 16:22:38 GMT

1240 Allow operand of dynamic function call to be a sequence

Issue #1661 closed #closed-1661

06 May at 16:19:25 GMT

QName arguments: also allow strings

Issue #1976 closed #closed-1976

06 May at 16:19:24 GMT

1661 Introduce QName literals

Issue #1973 closed #closed-1973

06 May at 16:16:21 GMT

Substantively disjoint types

Issue #1974 closed #closed-1974

06 May at 16:16:20 GMT

1973 Cross-reference from type analysis to definition of disjointedness

Issue #1951 closed #closed-1951

06 May at 16:14:11 GMT

Some nits regarding the method attribute

Issue #1971 closed #closed-1971

06 May at 16:14:10 GMT

1951 Clarifications on serialization parameters

Issue #1952 closed #closed-1952

06 May at 16:08:43 GMT

Change option name from xsi-schema-location to use-xsi-schema-location

Issue #1969 closed #closed-1969

06 May at 16:08:42 GMT

1952 Change option name xsi-schema-location

Issue #1967 closed #closed-1967

06 May at 16:05:41 GMT

Example for fn:unparsed-binary uses obsolete function name

Issue #1968 closed #closed-1968

06 May at 16:05:40 GMT

1967 r/binary-resource/unparsed-binary/

Issue #1957 closed #closed-1957

06 May at 16:02:47 GMT

Schema for XSLT incorrectly allows mixed content for xsl:output

Issue #1964 closed #closed-1964

06 May at 16:02:46 GMT

1957 xsl output allows mixed content

Issue #1958 closed #closed-1958

06 May at 15:59:33 GMT

Typo in map:build

Issue #1963 closed #closed-1963

06 May at 15:59:32 GMT

1958 Fix simple typo in map:build

Issue #1980 created #created-1980

06 May at 14:12:44 GMT
HTML serialization: the rules for adding a meta element need to be aligned with HTML5

See the Saxon bugs regarding the recognition and generation of META elements in the HTML and XHTML header sections:

https://saxonica.plan.io/issues/5852
https://saxonica.plan.io/issues/6772

Saxon is producing HTML5 output as mandated by the 3.1 serialization spec but this is apparently either invalid or deprecated by the HTML5 specification. The 4.0 serialization spec makes some adjustments in this area but I don't think it is fully in line yet with HTML5.

Issue #1979 created #created-1979

04 May at 09:36:58 GMT
Records: Type Safety

One cognitive challenge with records is to internalize that records are not independent types, but only map constraints. As a consequence, no type safety guarantees exist when records are accessed and updated:

  • A lookup of a non-existing key raises no error.
  • A record update may result in a map that does not match the original record definition.

This makes it hard and often impossible/illegal for processors to output helpful error messages.

There are reasons why we don’t want to make records too strict: an extensible record may include keys that are not defined in the record type:

(: must not raise an error :)
declare record local:r(a, *);
let $r as local:r := { 'a': 1, 'b': 2 }
return $r?b

However, for non-extensible records, I think we should allow processors to perform stricter checks when unknown keys are looked up, or when the result of an update would conflict with the original record type:

declare record local:r(a as xs:integer);
(: unknown key :)
local:r(1)?b,
(: invalid value type :)
map:put(local:r(1), 'a', 'string')

As records are not independent types, it will be difficult to enforce errors in all cases: it would require implementations to always know that a currently processed map has at some point been validated against a specific record type. But in many cases, implementations may be able to preserve record types for maps that have been coerced to a record, or created with a record declaration, and propagate them to updated maps. For example, we already do so when we can statically infer that the resulting map of a map:put call will match the original record type.

Issue #1978 created #created-1978

04 May at 00:55:54 GMT
Function `map:build` does not allow expressing the dependency of a value on its key. Some simple types of maps cannot be built.

The Problem

The function map:build does not allow the functional dependency of a value on its key to be defined explicitly.

As a result, it is unusable for creating even such simple maps as the following:

The input is:

("apple", "apricot", "banana", "blueberry", "cherry")

The $keys function is: $keys := fn($x){characters($x)}. That is, every character in every input string is a key.

The value for each key should be every input string that contains the key character two or more times, or the empty sequence if there is no such string.

The expected map to be produced is:

{
  "a":  "banana",
  "b": "blueberry",
  "c": (),
  "e": "blueberry",
  "h": (),
  "i": (),
  "l": (), (: Lowercase L :)
  "n": "banana",
  "o": (),
  "p": "apple",
  "r": ("cherry", "blueberry"),
  "t": (),
  "u": (),
  "y": ()
}

Solution

We provide a new definition of map:build - this can be a complete replacement of the current function, or could be added as a new overload. I am in the process of writing a PR, and your feedback would be appreciated.

The definition is simple:

let $mapBuild := fn(
    $input as item()*,
    $keys  as (fn($item as item(), $position as xs:integer) as xs:anyAtomicType*),
    $value as (fn($key as xs:anyAtomicType, $input as item()*) as item()*)
) as map(*)
{
  let $allKeys := distinct-values(for-each($input, $keys))
   return
     $allKeys ! map:pair(., $value(., $input)) => map:of-pairs()
}
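
The same idea can be modelled in a few lines of Python to check the fruit example by hand. This is only a sketch under stated assumptions: map_build is a hypothetical name, Python lists stand in for XDM sequences (so a single string appears as a one-element list), and a dict is used to keep distinct keys in order of first appearance.

```python
from typing import Any, Callable, Hashable, Iterable, Sequence


def map_build(
    items: Sequence[Any],
    keys: Callable[[Any, int], Iterable[Hashable]],
    value: Callable[[Hashable, Sequence[Any]], list],
) -> dict:
    """Collect the distinct keys, then compute each value from (key, input)."""
    all_keys: dict = {}
    for position, item in enumerate(items, start=1):
        for key in keys(item, position):
            all_keys.setdefault(key, None)  # keeps first-appearance order
    return {key: value(key, items) for key in all_keys}


fruits = ["apple", "apricot", "banana", "blueberry", "cherry"]

result = map_build(
    fruits,
    lambda s, pos: s,  # iterating a str yields its characters
    lambda key, items: [s for s in items if s.count(key) >= 2],
)
print(result["r"])  # ['blueberry', 'cherry']
print(result["u"])  # []
```

The point of the design is visible in the second lambda: the value is a function of the key and the whole input, rather than of a single input item.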

As can be seen from executing the code below, the redefined function can be successfully used to build the "problematic" map above, and also all currently provided examples in the FO Spec for the function map:build.

let $mapBuild := fn(
    $input as item()*,
    $keys  as (fn($item as item(), $position as xs:integer) as xs:anyAtomicType*),
    $value as (fn($key as xs:anyAtomicType, $input as item()*) as item()*)
) as map(*)
{
  let $allKeys := distinct-values(for-each($input, $keys))
   return
     $allKeys ! map:pair(., $value(., $input)) => map:of-pairs()
}
 return
   let $input := ("apple", "apricot", "banana", "blueberry", "cherry"),
       $employees :=
         <employees>
           <employee name="Jim Nelson" location="New York" ssn="1234567890" salary="123456"/>
           <employee name="Ann West" location="New York" ssn="0987654321" salary="99999"/>
           <employee name="Peter Smith" location="Seattle" ssn="123454321" salary="155223"/>
           <employee name="Karen Johnson" location="Seattle" ssn="5432198760" salary="175000"/>
           <employee name="Jonh Lagarde" location="Boston" ssn="9999999999" salary="145000"/>
           <employee name="Samantha Weird" location="Boston" ssn="1111111111" salary="153000"/>
         </employees>
    return
(    
     $mapBuild(
       $input,
       fn($string, $pos) {distinct-values(characters($string))},
       fn($key, $input)
       {
         filter($input, fn($string, $pos){$key = duplicate-values(characters($string))}) 
       }
     ),
     $mapBuild((), string#1, string#1),
     $mapBuild(1 to 10, fn {. mod 3}, fn($key, $input){filter($input, fn{$key = . mod 3})}),
     $mapBuild(1 to 5, identity#1, format-integer(?, "w")),
     $mapBuild(("January", "February", "March", "April", "May", "June",
                "July", "August", "September", "October", "November", "December"),
               substring(?, 1, 1), fn($key, $input){filter($input, fn{$key = substring(., 1, 1)})}
              ),
     $mapBuild(
        ("apple", "apricot", "banana", "blueberry", "cherry"),
        substring(?, 1, 1), fn($key, $input){sum($input[$key eq substring(., 1, 1)] ! string-length(.))}
     ),
     $mapBuild(
       ('Wang', 'Liu', 'Zhao'),
       fn($name, $pos) { $name },
       fn($key, $input){index-of($input, $key)}
     ),  
     let $titles := 
       <titles>
        <title>A Beginner’s Guide to <ix>Java</ix></title>
        <title>Learning <ix>XML</ix></title>
        <title>Using <ix>XML</ix> with <ix>Java</ix></title>
      </titles>
     return
       $mapBuild($titles/title, 
                 fn($title){$title/ix}, 
                 fn($key, $input){filter($input, fn($elem){$key = $elem/ix})}
               ),
      $mapBuild(
        $employees//employee, fn{@ssn}, fn($key, $input){filter($input, fn($elem){$key = $elem/@ssn})}
      ),
      $mapBuild(
        $employees//employee, fn{@location}, fn($key, $input) {count(filter($input, fn($elem){$key = $elem/@location}))}
      ),
      $mapBuild(
        $employees//employee, fn{@location}, fn($key, $input) {max((filter($input, fn($elem){$key = $elem/@location}))/xs:decimal(@salary))}
      )
)

All results (executed with BaseX) are the expected, correct ones:

{"a":"banana","p":"apple","l":(),"e":"blueberry","r":("blueberry","cherry"),"i":(),"c":(),"o":(),"t":(),"b":"blueberry","n":"banana","u":(),"y":(),"h":()}
{}
{1:(1,4,7,10),2:(2,5,8),0:(3,6,9)}
{1:"one",2:"two",3:"three",4:"four",5:"five"}
{"J":("January","June","July"),"F":"February","M":("March","May"),"A":("April","August"),"S":"September","O":"October","N":"November","D":"December"}
{"a":12,"b":15,"c":6}
{"Wang":1,"Liu":2,"Zhao":3}
{"Java":(<title>A Beginner’s Guide to <ix>Java</ix></title>,<title>Using <ix>XML</ix> with <ix>Java</ix></title>),"XML":(<title>Learning <ix>XML</ix></title>,<title>Using <ix>XML</ix> with <ix>Java</ix></title>)}
{"1234567890":<employee name="Jim Nelson" location="New York" ssn="1234567890" salary="123456"/>,"0987654321":<employee name="Ann West" location="New York" ssn="0987654321" salary="99999"/>,"123454321":<employee name="Peter Smith" location="Seattle" ssn="123454321" salary="155223"/>,"5432198760":<employee name="Karen Johnson" location="Seattle" ssn="5432198760" salary="175000"/>,"9999999999":<employee name="Jonh Lagarde" location="Boston" ssn="9999999999" salary="145000"/>,"1111111111":<employee name="Samantha Weird" location="Boston" ssn="1111111111" salary="153000"/>}
{"New York":2,"Seattle":2,"Boston":2}
{"New York":123456,"Seattle":175000,"Boston":153000}

Pull request #1977 created #created-1977

02 May at 15:59:18 GMT
1889 Tidy up handling of HTML serialization version, default to HTML5

Does some general tidying up of the serialization text, but the main substantive changes are (a) to make HTML5 the default version, and (b) to make support for earlier versions effectively optional.

Please review carefully. Marking as editorial because I'm not sure any test cases need to change, but I might be wrong.

Fix #1889

Pull request #1976 created #created-1976

02 May at 11:56:39 GMT
1661 Introduce QName literals

Fix #1661

See also #747

As discussed in the issue, I wasn't happy with the idea of changing the coercion rules to allow strings to be provided where a QName is expected, because of the need to keep the namespace context around at run-time, and because of potential confusion about exactly what namespace context is used.

Instead I have gone back to the idea of introducing QName literals, using the simple syntax #EQName.

Examples:

error(#err:XPTY0004)
node-name($node) = #xml:space
format-number($num, #de)
load-xquery-module($module)?variables?(#myvar)
transform({'initial-template':#xsl:initial-template})
{'last': 'Kay', 'first': 'Michael', 'suffix':#fn:null}

Pull request #1975 created #created-1975

02 May at 10:28:19 GMT
1240 Allow operand of dynamic function call to be a sequence

Fix #1240 Fix #1972

This PR enables use of expressions such as $rectangle?area() - sum($rectangle?contents()?area()) which would previously have failed with a type error.

Pull request #1974 created #created-1974

02 May at 09:13:16 GMT
1973 Cross-reference from type analysis to definition of disjointedness

Fix #1973

Issue #1973 created #created-1973

02 May at 08:41:52 GMT
Substantively disjoint types

Section §2.3.3.1 Static Analysis Phase mentions

A processor may raise a type error during static analysis if the inferred static type of an expression has no overlap (intersection) with the required type, and cannot be converted to the required type using the [coercion rules].

This should cross-refer to the more precisely defined concept of types being "substantively disjoint" - see §3.4.3.

Issue #1972 created #created-1972

02 May at 08:24:18 GMT
Dynamic function call applied to empty sequence

A note in F+O under map:get states

map:get(map:get(map:get($map, 'employee'), 'name'), 'first') can be written as $map('employee')('name')('first').

That's technically correct: both these expressions will fail in the same way if $map does not contain an entry for the key employee, unlike the lookup expression $map?employee?name?first, which returns an empty sequence in this situation.

The rules for dynamic function calls (xpath, §4.5.3.1) state that $F($X) raises a type error if $F is an empty sequence.

I think it would be more useful if both map:get() and dynamic function calls were changed to have "empty if empty" semantics.

This is related to #1240 which goes further by allowing $F to be a sequence of function items.
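
The difference between the two behaviours can be sketched in Python. This is a model under stated assumptions, not the specification: map_get and lookup are hypothetical helpers, an empty tuple stands in for the empty sequence, and a TypeError stands in for the XPath type error.

```python
def map_get(m, key):
    """Strict model: the first argument must be exactly one map."""
    if not isinstance(m, dict):
        raise TypeError("expected a single map, got %r" % (m,))
    return m.get(key, ())  # () models the empty sequence


def lookup(seq, key):
    """Lenient model of the ? lookup: applied to each map in a sequence."""
    maps = [seq] if isinstance(seq, dict) else list(seq)
    out = [m[key] for m in maps if isinstance(m, dict) and key in m]
    return out[0] if len(out) == 1 else tuple(out)


data = {"department": "R&D"}  # no "employee" entry

try:
    map_get(map_get(map_get(data, "employee"), "name"), "first")
except TypeError as exc:
    print("type error:", exc)

print(lookup(lookup(lookup(data, "employee"), "name"), "first"))  # ()
```

Giving map_get "empty if empty" semantics would amount to returning () when its first argument is (), which would make the strict chain behave like the lookup chain here.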

Pull request #1971 created #created-1971

01 May at 22:11:45 GMT
1951 Clarifications on serialization parameters

Fix #1951

Issue #1970 created #created-1970

30 Apr at 09:08:15 GMT
Editorial notes

XQFO

  • fn:fold-right has an obsolete change section saying that “The $action callback function accepts an optional position argument.”
  • “then [the] operation will fail”
  • remove whitespace before/after QName literals (#1982)
  • fn:unparsed-binary: return type: xs:base64Binary?

(everyone: feel free to add notes, I’ll create a PR sometime later)

Pull request #1969 created #created-1969

30 Apr at 08:53:36 GMT
1952 Change option name xsi-schema-location

Change to use-xsi-schema-location (because the value is a boolean, not a location)

Fix #1952

Pull request #1968 created #created-1968

30 Apr at 08:47:25 GMT
1967 r/binary-resource/unparsed-binary/

Fix #1967

Issue #1967 created #created-1967

30 Apr at 08:27:48 GMT
Example for fn:unparsed-binary uses obsolete function name

One of the examples for the new function fn:unparsed-binary uses the obsolete function name fn:binary-resource

Issue #1568 closed #closed-1568

30 Apr at 07:34:02 GMT

Define a Unicode case-insensitive collation

Issue #1966 closed #closed-1966

30 Apr at 07:34:01 GMT

1568b Add unicode case-blind collation

Issue #1945 closed #closed-1945

30 Apr at 06:11:07 GMT

1568 unicode case blind collation

Pull request #1966 created #created-1966

30 Apr at 06:08:51 GMT
1568b Add unicode case-blind collation

Replaces #1945 which was approved by the CG, but had pull conflicts because of incidental editorial changes

Fix #1568

Issue #1965 created #created-1965

30 Apr at 01:13:19 GMT
The Generator record

This is a continuation of the original issue https://github.com/qt4cg/qtspecs/issues/716, created almost 2 years ago, and having accumulated a lot of very useful discussion.

Now that we have methods as fields of records, it has become practical to produce the record type entirely in code, and this is the basis for the planned PR.

1. What it contains

  • The standard record fields as originally published:
     initialized as xs:boolean,
     endReached as xs:boolean,
     getCurrent as %method fn() as item()*,
     moveNext as %method fn(*)
  • The following 34 methods - this will form the signatures and formal definitions of the methods inside the documentation:
toArray := %method fn()
take := %method fn($n as xs:integer)
takeWhile := %method fn($pred as function(item()*) as xs:boolean)
skip := %method fn($n as xs:nonNegativeInteger)
skipWhile := %method fn($pred as function(item()*) as xs:boolean)
some := %method fn()
someWhere := %method fn($pred)
subrange := %method fn($m as xs:positiveInteger, $n as xs:integer)
chunk := %method fn($size as xs:positiveInteger)
head := %method fn()
tail := %method fn()
at := %method fn($ind as xs:nonNegativeInteger)
for-each := %method fn($fun as function(*))
for-each-pair := %method fn($gen2 as f:generator, $fun as function(*))
zip := %method fn($gen2 as f:generator)
concat := %method fn($gen2 as f:generator)
append := %method fn($value as item()*)
prepend := %method fn($value as item()*)
insertAt := %method fn($pos as xs:positiveInteger, $value as item()*)
removeAt := %method fn($pos as xs:nonNegativeInteger)
replace := %method fn($funIsMatching as function(item()*) as xs:boolean, $replacement as item()*)
reverse := %method fn()
filter := %method fn($pred as function(item()*) as xs:boolean)
fold-left := %method fn($init as item()*, $action as fn(*))
fold-right := %method fn($init as item()*, $action as fn(*))
fold-lazy := %method fn($init as item()*, $action as fn(*), $shortCircuitProvider as function(*))
scan-left := %method fn($init as item()*, $action as fn(*))
scan-right := %method fn($init as item()*, $action as fn(*))
makeGenerator := %method fn($provider as function(*))
makeGeneratorFromArray := %method fn($input as array(*))
makeGeneratorFromSequence := %method fn($input as item()*)
toSequence := %method fn()
emptyGenerator := %method fn()
  • 90 tests/examples - with calls to all the methods - in normal and edge cases

2. Where to get the executable (with BaseX) code?

For everyone's convenience, you will find the complete executable code at the end of this issue/initial-comment. Alternatively, the code is available here: https://github.com/dnovatchev/Articles/blob/main/Generators/Code/generator.xpath

The latter will always contain the latest, up-to-date code. Please execute the code with BaseX, as I have done many times.

3. What this gives us:

  1. Working with huge collections that would otherwise be restricted by the available memory.
  2. Deferred execution.
  3. Handling collections containing an unknown or infinite number of members.
  • A (next) member is produced only on request. No time is spent on producing all members of the collection.
  • A (next) member is produced only on request. No memory is consumed to store all members of the collection.
  4. Lazy evaluation, due to the above and also to the fold-lazy method (also described in this article).
  5. Implementation of the original idea about Kollection: https://github.com/qt4cg/qtspecs/issues/910
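
As a rough illustration of the points above outside XPath, here is a minimal Python sketch of the same idea: an immutable generator value that holds only the current member and a thunk for the next state. It is an analogue, not the f:generator record itself; the names Gen, naturals_from, take and to_list are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Gen:
    """Immutable generator state: a current value plus a thunk for the next state."""
    current: Any
    next_state: Callable[[], Optional["Gen"]]  # returns None when the end is reached

    def move_next(self) -> Optional["Gen"]:
        # The next member is computed only when it is requested.
        return self.next_state()


def naturals_from(n: int) -> Gen:
    """Infinite generator n, n+1, n+2, ... that is never materialised as a whole."""
    return Gen(n, lambda: naturals_from(n + 1))


def take(gen: Optional[Gen], count: int) -> Optional[Gen]:
    """Lazily restrict a generator to at most `count` members."""
    if gen is None or count <= 0:
        return None
    return Gen(gen.current, lambda: take(gen.move_next(), count - 1))


def to_list(gen: Optional[Gen]) -> list:
    """Materialise a finite generator; this is the only step that stores members."""
    out = []
    while gen is not None:
        out.append(gen.current)
        gen = gen.move_next()
    return out


print(to_list(take(naturals_from(2), 3)))  # [2, 3, 4]
```

The call mirrors $gen2ToInf?take(3)?toArray() from the tests: the infinite source is never expanded beyond the members that take actually asks for.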

4. What assistance is needed

I will greatly appreciate any recommendations on how to proceed with the actual PR:

  • Can this be a single PR?
  • If this is too big for a single PR, how should it be split into pieces?
  • Any observations and comments on the code itself.

5. References:

  1. The original issue: Generators in XPath: https://github.com/qt4cg/qtspecs/issues/716
  2. This article: Generators in XPath
  3. The article defining fold-lazy: "Laziness in XPath. The trouble with fn:fold-right"

6. Complete, executable definition of the generator record

declare namespace f = "http://www.w3.org/2005/xpath-functions-2025";
declare record f:generator 
   ( initialized as xs:boolean,
     endReached as xs:boolean,
     getCurrent as %method fn() as item()*,
     moveNext as %method fn(*) (: as f:generator, :),
     toArray := %method fn()
     {
       while-do( [., []],
                function( $inArr) 
                { $inArr(1)?initialized and not($inArr(1)?endReached) },                 
                function($inArr) 
                { array{$inArr(1)?moveNext(), 
                        array:append($inArr(2), $inArr(1)?getCurrent())
                       } 
                 }         
       ) (2)
     },
     
     take := %method fn($n as xs:integer) 
     {
      let $gen := if(not(?initialized)) then ?moveNext()
                    else .
       return
         if($gen?endReached or $n le 0) then $gen?emptyGenerator()
          else
            let $current := $gen?getCurrent(),
                $newResultGen := map:put(., "getCurrent", %method fn(){$current}),
                $nextGen := $gen?moveNext()
             return
               if($nextGen?endReached) then $newResultGen
                 else
                   let
                       $newResultGen2 :=  map:put($newResultGen, "moveNext", %method fn() {$nextGen?take($n -1)}) 
                     return
                       $newResultGen2
      },
      
      takeWhile := %method fn($pred as function(item()*) as xs:boolean)
      {
        let $gen := if(not(?initialized)) then ?moveNext()
                      else .
         return
           if($gen?endReached) then $gen?emptyGenerator()
            else      
              let $current := $gen?getCurrent()
                return
                  if(not($pred($current))) then $gen?emptyGenerator()
                  else
                    let $newResultGen := map:put(., "getCurrent", %method fn(){$current}),
                        $nextGen := $gen?moveNext()
                     return
                        if($nextGen?endReached) then $newResultGen
                        else
                          let $newResultGen2 :=  map:put($newResultGen, "moveNext", %method fn() {$nextGen?takeWhile($pred)}) 
                           return $newResultGen2  
      },
     
     skipStrict := %method fn($n as xs:nonNegativeInteger, $issueErrorOnEmpty as xs:boolean) 
     {
            if($n eq 0) then .
              else if(?endReached) 
                     then if($issueErrorOnEmpty)
                           then error((), "Input Generator too-short") 
                           else ?emptyGenerator()
              else 
                let $gen := if(not(?initialized)) then ?moveNext()
                             else .
                  return
                    if(not($gen?endReached)) then $gen?moveNext()?skipStrict($n -1, $issueErrorOnEmpty)
                      else $gen?emptyGenerator()                 

     },
     skip := %method fn($n as xs:nonNegativeInteger) 
     {
       ?skipStrict($n, false())
     },
     
     skipWhile := %method fn($pred as function(item()*) as xs:boolean)
     {
        let $gen := if(not(?initialized)) then ?moveNext()
                      else .
         return
           if($gen?endReached) then $gen?emptyGenerator()
            else
              let $current := $gen?getCurrent()
               return
                 if(not($pred($current))) then $gen
                  else $gen?moveNext()?skipWhile($pred)                    
     },
     
     some := %method fn()
     {
       ?initialized and not(?endReached)
     },
     
     someWhere := %method fn($pred)
     {
       ?filter($pred)?some()
     },
     
     subrange := %method fn($m as xs:positiveInteger, $n as xs:integer)
     {
       ?skip($m - 1)?take($n - $m + 1)
     },
     
     chunk := %method fn($size as xs:positiveInteger)
     {
        let $gen := if(not(?initialized)) then ?moveNext()
                      else .
         return
           if($gen?endReached) then $gen?emptyGenerator()
           else
             let $thisChunk := $gen?take($size)?toArray(),
                 $cutGen := $gen?skip($size),
                 $resultGen := $gen => map:put("getCurrent", %method fn(){$thisChunk})
                                    => map:put("moveNext", %method fn(){$cutGen?chunk($size)})
              return $resultGen
     },
     
     head := %method fn() {?take(1)?getCurrent()},
     tail := %method fn() {?skip(1)},
     
     at := %method fn($ind as xs:nonNegativeInteger) {?subrange($ind, $ind)?getCurrent()},
           
     for-each := %method fn($fun as function(*))
     {
      let $gen := if(not(?initialized)) then ?moveNext()
                    else .        
       return
         if($gen?endReached) then $gen?emptyGenerator()
          else
           let $current := $fun($gen?getCurrent()),
                $newResultGen := map:put(., "getCurrent", %method fn(){$current}),
                $nextGen := $gen?moveNext()
            return
              if($nextGen?endReached) then $newResultGen
                else
                  let $newResultGen2 :=  map:put($newResultGen, "moveNext", %method fn() {$nextGen?for-each($fun)}) 
                     return
                       $newResultGen2                    
      },
      
      for-each-pair := %method fn($gen2 as f:generator, $fun as function(*))
      {
        let $gen := if(not(?initialized)) then ?moveNext()
                    else .,
            $gen2 := if(not($gen2?initialized)) then $gen2?moveNext()
                    else $gen2
         return
            if($gen?endReached or $gen2?endReached) then $gen?emptyGenerator() 
             else  
               let $current := $fun($gen?getCurrent(), $gen2?getCurrent()),
                   $newResultGen := map:put(., "getCurrent", %method fn(){$current}),
                   $nextGen1 := $gen?moveNext(),
                   $nextGen2 := $gen2?moveNext()
                return
                   if($nextGen1?endReached or $nextGen2?endReached) then $newResultGen
                     else
                       let $newResultGen2 := map:put($newResultGen, "moveNext", %method fn(){$nextGen1?for-each-pair($nextGen2, $fun)})
                         return
                           $newResultGen2                        
      },
      
      zip := %method fn($gen2 as f:generator)
      {
        ?for-each-pair($gen2, fn($x1, $x2){[$x1, $x2]})
      },

      concat := %method fn($gen2 as f:generator)
      {
        let $gen := if(not(?initialized)) then ?moveNext()
                    else .,
            $gen2 := if(not($gen2?initialized)) then $gen2?moveNext()
                    else $gen2,
            $resultGen := if($gen?endReached) then $gen2
                            else if($gen2?endReached) then $gen
                            else
                              $gen  => map:put(  "moveNext", 
                                                %method fn()
                                                 {
                                                 let $nextGen := $gen?moveNext()
                                                   return 
                                                     $nextGen?concat($gen2)
                                                 }
                                              )                                   
        return 
           $resultGen            
      },

      append := %method fn($value as item()*)
      {
        let $gen := if(not(?initialized)) then ?moveNext()
                    else .,
            $genSingle := $gen => map:put("getCurrent", %method fn(){$value})
                               => map:put("moveNext", %method fn(){?emptyGenerator()})
                               => map:put("endReached", false())
         return
           $gen?concat($genSingle)                    
      },
      
      prepend := %method fn($value as item()*)
      {
        let $gen := if(not(?initialized)) then ?moveNext()
                    else .,
            $genSingle := $gen => map:put("getCurrent", %method fn(){$value})
                               => map:put("moveNext", %method fn(){?emptyGenerator()})
                               => map:put("endReached", false())
         return
           $genSingle?concat($gen)  
      },
      
      insertAt := %method fn($pos as xs:positiveInteger, $value as item()*)
      {
        let $genTail := ?skipStrict($pos - 1, true())
         return
            if($pos gt 1)
              then ?take($pos - 1)?append($value)?concat($genTail)
              else $genTail?prepend($value)               
      },
      
      removeAt := %method fn($pos as xs:nonNegativeInteger)
      {
        let $genTail := ?skipStrict($pos, true())
          return
            if($pos gt 1)
              then ?take($pos - 1)?concat($genTail)
              else $genTail
      },
    
      replace := %method fn($funIsMatching as function(item()*) as xs:boolean, $replacement as item()*)
      {
        if(?endReached) then .
          else
            let $current := ?getCurrent()
              return
                if($funIsMatching($current))
                  then let $nextGen := ?moveNext()
                     return
                       . => map:put("getCurrent", %method fn() {$replacement})
                         => map:put("moveNext", %method fn() { $nextGen } 
                                  )
                  else (: $current is not the match for replacement :)
                    let $nextGen := ?moveNext()
                      return . => map:put("moveNext", 
                                           %method fn()
                                           {
                                             let $intendedReplace := function($z) {$z?replace($funIsMatching, $replacement)}
                                              return
                                                if($nextGen?endReached) then $nextGen
                                                else $intendedReplace($nextGen)
                                           }
                                        )
      },
      
      reverse := %method fn()
      {
        if(?endReached) then ?emptyGenerator()
          else
           let $current := ?getCurrent()
             return
               ?tail()?reverse()?append($current)
      },

      filter := %method fn($pred as function(item()*) as xs:boolean)
      {
             if(?initialized and ?endReached) then ?emptyGenerator()
              else
                let $getNextGoodGen := function($gen as map(*), 
                                             $pred as function(item()*) as xs:boolean)
                   {
                      if($gen?endReached) then $gen?emptyGenerator()
                      else
                        let $mapResult := 
                              while-do(
                                       $gen,
                                       function($x) { not($x?endReached) and not($pred($x?getCurrent()))},
                                       function($x) { $x?moveNext() }
                                       )   
                        return 
                          if($mapResult?endReached) then $gen?emptyGenerator()
                           else $mapResult                  
                   },
                   
                   $gen := if(?initialized) then . 
                             else ?moveNext(),
                   $nextGoodGen := $getNextGoodGen($gen, $pred)
                return
                  if($nextGoodGen?endReached) then $gen?emptyGenerator()
                  else
                    $nextGoodGen => map:put("moveNext", 
                                            %method fn() 
                                              {
                                                let $nextGoodGen := $getNextGoodGen(?inputGen?moveNext(), $pred)
                                                  return
                                                    if($nextGoodGen?endReached) then $nextGoodGen?emptyGenerator()
                                                    else
                                                      map:put(map:put($nextGoodGen, "moveNext", %method fn() {$nextGoodGen?moveNext()?filter($pred)}),
                                                                      "inputGen", $nextGoodGen
                                                              )
                                               }
                                           )
                                   =>
                                     map:put("inputGen", $nextGoodGen)
        },     
        fold-left := %method fn($init as item()*, $action as fn(*))
        {
          if(?endReached) then $init
            else ?tail()?fold-left($action($init, ?getCurrent()), $action)
        },
        
        fold-right := %method fn($init as item()*, $action as fn(*))
        {
          if(?endReached) then $init
            else $action(?head(), ?tail()?fold-right($init, $action))
        },
        
        fold-lazy := %method fn($init as item()*, $action as fn(*), $shortCircuitProvider as function(*))
        {
          if(?endReached) then $init
          else
           let $current := ?getCurrent()
             return
               if(function-arity($shortCircuitProvider($current, $init)) eq 0)
                 then $shortCircuitProvider($current, $init)()
                 else $action($current, ?moveNext()?fold-lazy($init, $action, $shortCircuitProvider))
        },
        
        scan-left := %method fn($init as item()*, $action as fn(*))
        {
          let $resultGen := ?emptyGenerator() 
                                => map:put("endReached", false())
                                => map:put("getCurrent", %method fn(){$init})
           return
             if(?endReached) 
               then $resultGen => map:put("moveNext", %method fn(){?emptyGenerator()})
               else
                 let $resultGen := $resultGen => map:put("getCurrent", %method fn(){$init}),
                     $partialFoldResult := $action($init, ?getCurrent())
                   return
                     let $nextGen := ?moveNext()
                      return
                        $resultGen => map:put("moveNext", %method fn()
                                              { 
                                                  $nextGen?scan-left($partialFoldResult, $action)
                                               }
                                              )            
        },
      
        scan-right := %method fn($init as item()*, $action as fn(*))
        {
          ?reverse()?scan-left($init, $action)?reverse()                         
        },
        
        makeGenerator := %method fn($provider as function(*))
        {
         let $gen := if(not(?initialized)) then ?moveNext()
                    else .,
              $nextDataItemGetter := $provider(0),
              $nextGen := if(not($nextDataItemGetter instance of function(*))) then $gen?emptyGenerator()  
                           else $gen?emptyGenerator()
                            => map:put("numDataItems", 1)
                            => map:put("current", $nextDataItemGetter())
                            => map:put("endReached", false())
                            => map:put("getCurrent", %method fn() {?current})
                            => map:put("moveNext",  
                                       %method fn() 
                                        {
                                          let $nextDataItemGetter := $provider(?numDataItems)
                                            return
                                              if(not($nextDataItemGetter instance of function(*))) then ?emptyGenerator()
                                              else
                                                . => map:put("current", $nextDataItemGetter())
                                                  => map:put("numDataItems", ?numDataItems + 1)
                                        }
                                       )
           return $nextGen                                                  
        },
        
        makeGeneratorFromArray := %method fn($input as array(*))
        {
          let $size := array:size($input),
              $arrayProvider := fn($ind as xs:integer)
                                {
                                  if($ind +1 gt $size) then -1
                                   else fn(){$input($ind + 1)}
                                }
           return ?makeGenerator($arrayProvider)
        },
        
        makeGeneratorFromSequence := %method fn($input as item()*)
        {
          let $size := count($input),
              $seqProvider := fn($ind as xs:integer)
                                {
                                  if($ind +1 gt $size) then -1
                                   else fn(){$input[$ind + 1]}
                                }
           return ?makeGenerator($seqProvider)
        },
        
        toSequence := %method fn() {?toArray() => array:items()},     
        
        emptyGenerator := %method fn() 
        {
          . => map:put("initialized", true()) => map:put("endReached", true())
            => map:put("getCurrent", %method fn() {error((),"getCurrent() called on an emptyGenerator")})
            => map:put("moveNext", %method fn() {error((),"moveNext() called on an emptyGenerator")})
        },      
     *
   );

let $gen2ToInf := f:generator(initialized := true(), endReached := false(), 
                              getCurrent := %method fn(){?last +1},
                              moveNext := %method fn()
                              {
                                if(not(?initialized))
                                  then map:put(., "initialized", true())
                                  else map:put(., "last", ?last + 1)
                              },
                              options := {"last" : 1}
                             ),
    $double := fn($n) {2*$n},
    $sum2 := fn($m, $n) {$m + $n},
    $product := fn($m, $n) {$m * $n}
  return    
  (
    "$gen2ToInf?take(3)?toArray()",
    $gen2ToInf?take(3)?toArray(),
    "================",    
    "$gen2ToInf?take(3)?skip(2)?getCurrent()",
    $gen2ToInf?take(3)?skip(2)?getCurrent(),
    (: $gen2ToInf?take(3)?moveNext()?moveNext()?moveNext()?getCurrent(), :)
    "================",
    "$gen2ToInf?getCurrent()",
    $gen2ToInf?getCurrent(),
    "$gen2ToInf?moveNext()?getCurrent()",
    $gen2ToInf?moveNext()?getCurrent(),
    "================",
    "$gen2ToInf?take(5) instance of f:generator",
    $gen2ToInf?take(5) instance of f:generator,
    "==>  $gen2ToInf?skip(7) instance of f:generator",
    $gen2ToInf?skip(7) instance of f:generator,  
    "================",
    "$gen2ToInf?subrange(4, 6)?getCurrent()",
    $gen2ToInf?subrange(4, 6)?getCurrent(), 
    "$gen2ToInf?subrange(4, 6)?moveNext()?getCurrent()",
    $gen2ToInf?subrange(4, 6)?moveNext()?getCurrent(),
    "$gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?getCurrent()",
    $gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?getCurrent(),
    (: $gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?moveNext()?getCurrent() :) (: Must raise error:)    
    "================",    
    "$gen2ToInf?subrange(4, 6)?head()",
    $gen2ToInf?subrange(4, 6)?head(),  
    "$gen2ToInf?subrange(4, 6)?tail()?head()",
    $gen2ToInf?subrange(4, 6)?tail()?head(),
    "$gen2ToInf?subrange(4, 6)?toArray()",
    $gen2ToInf?subrange(4, 6)?toArray(),
    "$gen2ToInf?head()",
    $gen2ToInf?head(),
    "==>  $gen2ToInf?tail()?head()",
    $gen2ToInf?tail()?head(),
    "================", 
    "$gen2ToInf?subrange(4, 6)?tail()?toArray()",
    $gen2ToInf?subrange(4, 6)?tail()?toArray(),
    "================",
    "$gen2ToInf?at(5)",
    $gen2ToInf?at(5), 
    "================",
    "$gen2ToInf?subrange(1, 5)?toArray()",
    $gen2ToInf?subrange(1, 5)?toArray(),
    "$gen2ToInf?subrange(1, 5)?for-each($double)?toArray()",
    $gen2ToInf?subrange(1, 5)?for-each($double)?toArray(),
    "$gen2ToInf?take(5)?for-each($double)?toArray()",
    $gen2ToInf?take(5)?for-each($double)?toArray(),
    "==>  $gen2ToInf?for-each($double)?take(5)?toArray()",
    $gen2ToInf?for-each($double)?take(5)?toArray(),
    "================",
    "$gen2ToInf?subrange(1, 5)?toArray()",
    $gen2ToInf?subrange(1, 5)?toArray(),
    "$gen2ToInf?subrange(6, 10)?toArray()",
    $gen2ToInf?subrange(6, 10)?toArray(),
    "$gen2ToInf?subrange(1, 5)?for-each-pair($gen2ToInf?subrange(6, 10), $sum2)?toArray()",
    $gen2ToInf?subrange(1, 5)?for-each-pair($gen2ToInf?subrange(6, 10), $sum2)?toArray(), 
    "==>  $gen2ToInf?for-each-pair($gen2ToInf, $sum2)?take(5)?toArray()",
    $gen2ToInf?for-each-pair($gen2ToInf, $sum2)?take(5)?toArray(),
    "================",
    "==>  $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?getCurrent()",
    $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?getCurrent(),
    "$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?moveNext()?getCurrent()",
    $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?moveNext()?getCurrent(),
    "================", 
    "$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toArray()",
    $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toArray(),  
    "================", 
    "$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toSequence()",
    $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toSequence(),
    "================", 
    "$gen2ToInf?takeWhile(fn($n){$n < 11})?toArray()",
    $gen2ToInf?takeWhile(fn($n){$n < 11})?toArray(), 
    "$gen2ToInf?takeWhile(fn($n){$n < 2})?toArray()",
    $gen2ToInf?takeWhile(fn($n){$n < 2})?toArray(), 
    "================", 
    "$gen2ToInf?skipWhile(fn($n){$n < 11})?take(5)?toArray()",
    $gen2ToInf?skipWhile(fn($n){$n < 11})?take(5)?toArray(),
    "==> $gen2ToInf?skipWhile(fn($n){$n < 2})",
    $gen2ToInf?skipWhile(fn($n){$n < 2}),
    "
     ==> $gen2ToInf?skipWhile(fn($n){$n < 2})?skip(1)",
    $gen2ToInf?skipWhile(fn($n){$n < 2})?skip(1),
(:    $gen2ToInf?skipWhile(fn($x) {$x ge 2}) :) (: ?skip(1) :)
    "================", 
    "$gen2ToInf?some()",
     $gen2ToInf?some(),
     "let $empty := $gen2ToInf?emptyGenerator()
      return $empty?some()",
     let $empty := $gen2ToInf?emptyGenerator()
      return $empty?some(),
    "================",
    "$gen2ToInf?take(5)?filter(fn($n){$n ge 7})?some()",
     $gen2ToInf?take(5)?filter(fn($n){$n ge 7})?some(),  
     "$gen2ToInf?take(5)?someWhere(fn($n){$n ge 7})",
     $gen2ToInf?take(5)?someWhere(fn($n){$n ge 7}), 
     "$gen2ToInf?take(5)?someWhere(fn($n){$n ge 6})",
     $gen2ToInf?take(5)?someWhere(fn($n){$n ge 6}),
     "$gen2ToInf?someWhere(fn($n){$n ge 100})",
     $gen2ToInf?someWhere(fn($n){$n ge 100}),
     "================",
     "$gen2ToInf?take(10)?take(11)?toArray()",
     $gen2ToInf?take(10)?take(11)?toArray(),
     "$gen2ToInf?take(10)?skip(10)?toArray()",
     $gen2ToInf?take(10)?skip(10)?toArray(),
     "$gen2ToInf?take(10)?skip(9)?toArray()",     
     $gen2ToInf?take(10)?skip(9)?toArray(),
     "$gen2ToInf?take(10)?subrange(3, 12)?toArray()",
     $gen2ToInf?take(10)?subrange(3, 12)?toArray(),
     "$gen2ToInf?take(10)?subrange(5, 3)?toArray()",
     $gen2ToInf?take(10)?subrange(5, 3)?toArray(),
     "================",
     "$gen2ToInf?take(100)?chunk(20)?getCurrent()",
      $gen2ToInf?take(100)?chunk(20)?getCurrent(),
      "==>  $gen2ToInf?chunk(20)?take(5)?toArray()",
      $gen2ToInf?chunk(20)?take(5)?toArray(),
     "================",
     "$gen2ToInf?take(100)?chunk(20)?moveNext()?getCurrent()",
      $gen2ToInf?take(100)?chunk(20)?moveNext()?getCurrent(),
     "$gen2ToInf?take(100)?chunk(20)?moveNext()?moveNext()?getCurrent()", 
      $gen2ToInf?take(100)?chunk(20)?moveNext()?moveNext()?getCurrent(),
     "$gen2ToInf?take(100)?chunk(20)?skip(1)?getCurrent()",      
      $gen2ToInf?take(100)?chunk(20)?skip(1)?getCurrent(),
     "================",      
     "$gen2ToInf?take(100)?chunk(20)?for-each(fn($genX){$genX})?toArray()",      
      $gen2ToInf?take(100)?chunk(20)?for-each(fn($genX){$genX})?toArray(),
     "================",  
     "$gen2ToInf?take(10)?chunk(4)?toArray()",
      $gen2ToInf?take(10)?chunk(4)?toArray(),
      "$gen2ToInf?take(10)?chunk(4)?for-each(fn($arr){array:size($arr)})?toArray()",
      $gen2ToInf?take(10)?chunk(4)?for-each(fn($arr){array:size($arr)})?toArray(),
     "================", 
     "$gen2ToInf?subrange(10, 15)?concat($gen2ToInf?subrange(1, 9))?toArray()",
     $gen2ToInf?subrange(10, 15)?concat($gen2ToInf?subrange(1, 9))?toArray(),
     "================", 
     "$gen2ToInf?subrange(1, 5)?append(101)?toArray()",
     $gen2ToInf?subrange(1, 5)?append(101)?toArray(),
     "$gen2ToInf?subrange(1, 5)?prepend(101)?toArray()",
     $gen2ToInf?subrange(1, 5)?prepend(101)?toArray(),
     "==>  $gen2ToInf?append(101)",
     $gen2ToInf?append(101),
     "$gen2ToInf?prepend(101)?take(5)?toArray()",
     $gen2ToInf?prepend(101)?take(5)?toArray(),
     "================", 
     "$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(6, 10))?toArray()",
     $gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(6, 10))?toArray(),
     "$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(10, 20))?toArray()",
     $gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(10, 20))?toArray(),
     "==>  $gen2ToInf?zip($gen2ToInf?skip(5))?take(10)?toArray()",
     $gen2ToInf?zip($gen2ToInf?skip(5))?take(10)?toArray(),
     "================", 
     "$gen2ToInf?makeGenerator(fn($numGenerated as xs:integer)
                                 {if($numGenerated le 9) then fn() {$numGenerated + 1} else -1} 
                             )?toArray()",
     $gen2ToInf?makeGenerator(fn($numGenerated as xs:integer)
                                 {if($numGenerated le 9) then fn() {$numGenerated + 1} else -1} 
                             )?toArray(),
     "================", 
     "$gen2ToInf?makeGeneratorFromArray([1, 4, 9, 16, 25])?toArray()",
      $gen2ToInf?makeGeneratorFromArray([1, 4, 9, 16, 25])?toArray(),
      "$gen2ToInf?makeGeneratorFromSequence((1, 8, 27, 64, 125))?toArray()",
      $gen2ToInf?makeGeneratorFromSequence((1, 8, 27, 64, 125))?toArray(), 
     "================", 
     "$gen2ToInf?take(10)?insertAt(3, ""XYZ"")?toArray()",
      $gen2ToInf?take(10)?insertAt(3, "XYZ")?toArray(),
      "$gen2ToInf?take(10)?insertAt(1, ""ABC"")?toArray()",
      $gen2ToInf?take(10)?insertAt(1, "ABC")?toArray(),
      "$gen2ToInf?take(10)?insertAt(11, ""PQR"")?toArray()",
      $gen2ToInf?take(10)?insertAt(11, "PQR")?toArray(),
      "==>  $gen2ToInf?insertAt(3, ""XYZ"")?take(10)?toArray()", 
      $gen2ToInf?insertAt(3, "XYZ")?take(10)?toArray(),
     (: , $gen2ToInf?take(10)?insertAt(12, "GHI")?toArray() :)  (:  Must raise error "Input Generator too-short." :) 
     "================", 
     "$gen2ToInf?take(10)?removeAt(3)?toArray()",
      $gen2ToInf?take(10)?removeAt(3)?toArray(),
      "$gen2ToInf?take(10)?removeAt(1)?toArray()",
      $gen2ToInf?take(10)?removeAt(1)?toArray(),
      "$gen2ToInf?take(10)?removeAt(10)?toArray()",
      $gen2ToInf?take(10)?removeAt(10)?toArray(),
      "==>  $gen2ToInf?removeAt(3)?take(10)?toArray()",
      $gen2ToInf?removeAt(3)?take(10)?toArray(),
      (: , $gen2ToInf?take(10)?removeAt(11)?toArray() :)        (:  Must raise error "Input Generator too-short." :) 
     "================",
     "$gen2ToInf?take(10)?replace(fn($x){$x gt 4}, ""Replacement"")?toArray()",
      $gen2ToInf?take(10)?replace(fn($x){$x gt 4}, "Replacement")?toArray(),
      "$gen2ToInf?take(10)?replace(fn($x){$x lt 3}, ""Replacement"")?toArray()",
      $gen2ToInf?take(10)?replace(fn($x){$x lt 3}, "Replacement")?toArray(),
      "$gen2ToInf?take(10)?replace(fn($x){$x gt 10}, ""Replacement"")?toArray()",
      $gen2ToInf?take(10)?replace(fn($x){$x gt 10}, "Replacement")?toArray(),
      "$gen2ToInf?take(10)?replace(fn($x){$x gt 11}, ""Replacement"")?toArray()",
      $gen2ToInf?take(10)?replace(fn($x){$x gt 11}, "Replacement")?toArray(),
      "$gen2ToInf?take(10)?replace(fn($x){$x lt 2}, ""Replacement"")?toArray()",
      $gen2ToInf?take(10)?replace(fn($x){$x lt 2}, "Replacement")?toArray(),
      "==> $gen2ToInf?replace(fn($x){$x gt 4}, ""Replacement"")?take(10)?toArray()",
      $gen2ToInf?replace(fn($x){$x gt 4}, "Replacement")?take(10)?toArray(),
      "$gen2ToInf?replace(fn($x){$x lt 3}, ""Replacement"")?take(10)?toArray()",
      $gen2ToInf?replace(fn($x){$x lt 3}, "Replacement")?take(10)?toArray(),
    (:  
      Will result in endless loop:
      
      , "==>  ==>  ==>  $gen2ToInf?replace(fn($x){$x lt 2}, ""Replacement"")?take(10)?toArray() <==  <==  <==",
      $gen2ToInf?replace(fn($x){$x lt 2}, "Replacement")?take(10)?toArray() 
    :)
    "================",
    "$gen2ToInf?emptyGenerator()?reverse()?toArray()",
    $gen2ToInf?emptyGenerator()?reverse()?toArray(),
    "$gen2ToInf?emptyGenerator()?append(2)?reverse()?toArray()",
    $gen2ToInf?emptyGenerator()?append(2)?reverse()?toArray(),
    "$gen2ToInf?take(10)?reverse()?toArray()",
    $gen2ToInf?take(10)?reverse()?toArray(),
    "================",
    "$gen2ToInf?take(5)?fold-left(0, fn($x, $y){$x + $y})",
    $gen2ToInf?take(5)?fold-left(0, fn($x, $y){$x + $y}),
    "================",
    "$gen2ToInf?take(5)?fold-right(0, fn($x, $y){$x + $y})",
    $gen2ToInf?take(5)?fold-right(0, fn($x, $y){$x + $y}),
    "================",
    "$gen2ToInf?emptyGenerator()?scan-left(0, fn($x, $y){$x + $y})?toArray()",
    $gen2ToInf?emptyGenerator()?scan-left(0, fn($x, $y){$x + $y})?toArray(),
    "$gen2ToInf?take(5)?scan-left(0, fn($x, $y){$x + $y})?toArray()",
    $gen2ToInf?take(5)?scan-left(0, fn($x, $y){$x + $y})?toArray(),
    "================",
    "$gen2ToInf?makeGeneratorFromSequence((1 to 10))?scan-right(0, fn($x, $y){$x + $y})?toArray()",
    $gen2ToInf?makeGeneratorFromSequence((1 to 10))?scan-right(0, fn($x, $y){$x + $y})?toArray(),
    "================",
    let $multShortCircuitProvider := fn($x, $y)
        {
          if($x eq 0) then fn(){0}
            else fn($z) {$x * $z}
        },
        $gen-5ToInf := $gen2ToInf?for-each(fn($n){$n -7})
     return
     (
       "let $multShortCircuitProvider := fn($x, $y)
        {
          if($x eq 0) then fn(){0}
            else fn($z) {$x * $z}
        },
            $gen-5ToInf := $gen2ToInf?for-each(fn($n){$n -7})
          return
            $gen2ToInf?take(5)?fold-lazy(1, $product, $multShortCircuitProvider),
            $gen-5ToInf?fold-lazy(1, $product, $multShortCircuitProvider)",
       $gen2ToInf?take(5)?fold-lazy(1, $product, $multShortCircuitProvider),
       $gen-5ToInf?fold-lazy(1, $product, $multShortCircuitProvider)
     )
   )
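
The short-circuiting behaviour of fold-lazy in the last test can be sketched in Python as follows. This is a simplified analogue, not a translation: the XPath version signals a short circuit by returning a zero-arity function, whereas this sketch assumes a hypothetical short_circuit callback that returns None when the fold should continue. The names fold_lazy, product and zero_stops are illustrative only.

```python
from itertools import count
from typing import Callable, Iterator, Optional


def fold_lazy(it: Iterator[int],
              init: int,
              action: Callable[[int, int], int],
              short_circuit: Callable[[int], Optional[int]]) -> int:
    """Right fold that stops pulling members once short_circuit gives a result."""
    try:
        current = next(it)
    except StopIteration:
        return init                      # end reached: return the initial value
    final = short_circuit(current)
    if final is not None:
        return final                     # remaining members are never requested
    return action(current, fold_lazy(it, init, action, short_circuit))


def product(x: int, y: int) -> int:
    return x * y


def zero_stops(x: int) -> Optional[int]:
    # A factor of 0 fixes the whole product, so the fold can stop here.
    return 0 if x == 0 else None


print(fold_lazy(iter(range(2, 7)), 1, product, zero_stops))  # 720
print(fold_lazy(count(-5), 1, product, zero_stops))          # 0
```

The second call terminates although count(-5) is infinite, because the member 0 ends the fold, just as the $gen-5ToInf test relies on.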

Issue #1954 closed #closed-1954

29 Apr at 20:43:22 GMT

Private fields in records

Pull request #1964 created #created-1964

29 Apr at 20:39:37 GMT
1957 xsl output allows mixed content

Change to schema-for-xslt40

Fix #1957 (xsl:output disallow mixed content)

Add support for xsl:import-schema/@role

QT4 CG meeting 119 draft minutes #minutes—04-29

29 Apr at 17:30:00 GMT

Draft minutes published.

Issue #1961 closed #closed-1961

29 Apr at 16:20:50 GMT

Attempt to show that xsl:record allows extra attributes

Issue #1956 closed #closed-1956

29 Apr at 16:17:36 GMT

1954 (part) Private variables and functions don't need to be in the module namespace

Issue #1271 closed #closed-1271

29 Apr at 16:15:31 GMT

Schema validation in XPath

Issue #1933 closed #closed-1933

29 Apr at 16:15:30 GMT

1271 fn:xsd-validator() function

Issue #557 closed #closed-557

29 Apr at 16:12:32 GMT

fn:unparsed-binary: accessing and manipulating binary types

Issue #1587 closed #closed-1587

29 Apr at 16:12:31 GMT

557 Add fn:unparsed-binary function

Issue #1319 closed #closed-1319

29 Apr at 16:10:16 GMT

Specification Documents: Editors and Contributors

Issue #1416 closed #closed-1416

29 Apr at 16:10:12 GMT

Key-value pairs: built-in record type `pair`

Issue #1844 closed #closed-1844

29 Apr at 16:10:09 GMT

Drop mapping arrow operator

Issue #1704 closed #closed-1704

29 Apr at 16:09:30 GMT

Ignore the byte order mark more completely/globally

Issue #1950 closed #closed-1950

29 Apr at 16:09:29 GMT

1704 Add rules/notes for BOM and related topics

Issue #1906 closed #closed-1906

29 Apr at 16:05:42 GMT

1797 elements-to-maps-conversion-plan function

Pull request #1963 created #created-1963

29 Apr at 14:10:54 GMT
1958 Fix simple typo in map:build

Fix #1958

Issue #1962 created #created-1962

26 Apr at 11:05:28 GMT
fn:map-to-element

As this feature request has been reported back to us more than once, I want to raise the question here of whether we want to introduce a function that inverts the result of fn:element-to-map back to an XML representation, provided that a conversion plan exists.

Many results would certainly be lossy, but as the plan is now available separately, it would be possible to roundtrip a lot of data with regular structures, without having to write custom conversion code.

Pull request #1961 created #created-1961

25 Apr at 11:26:40 GMT
Attempt to show that xsl:record allows extra attributes

By the simple expedient of adding

<e:attribute name="*" required="yes">
   <e:data-type name="expression"/>
</e:attribute>

to the syntax summary we get

[Screenshot, 2025-04-25: XSL Transformations (XSLT) Version 4.0]

With some additional prose, would that be sufficient?

Issue #1960 closed #closed-1960

25 Apr at 08:59:38 GMT

Attempt to improve rendering of the dynamic ToC

Pull request #1960 created #created-1960

25 Apr at 08:58:57 GMT
Attempt to improve rendering of the dynamic ToC

This is an attempt to complete action QT4CG-116-03.

Part of the confusion in the rendering is that it’s partly done by CSS and partly done by JavaScript and those had gotten out of sync. Given that the JavaScript code has to do some of the work, I changed things so it does all of the work.

I also discovered that the weird Firefox bug where the font size changed was, wait for it, caused by the particular codepoint being used for “changed”. So I, uh, changed it.

We now get ✚ for new sections, ✭ for changed sections, and both when there are both new and changed sections. I spent a bit of time trying to find a third symbol, but gave up.

The markup in the XSLT specification isn’t quite the same as in other specifications, so the results are a tiny bit odd in places. There are some sections that get marked “both” in the ToC but when you expand the ToC, there’s only one mark. I haven’t tried to work out what’s going on there yet.

Pull request #1959 created #created-1959

24 Apr at 15:49:46 GMT
1953 (part) XSLT Worked example using methods to implement atomic sets

Provides an XSLT package that uses named record types and methods to implement an atomic set data type, as an example of how abstract data types can now be implemented.

Issue #1958 created #created-1958

24 Apr at 14:41:28 GMT
Typo in map:build

If the key is already present, the processor combines the new value for the key with the existing value as determined by the and duplicates option.

Issue #1957 created #created-1957

24 Apr at 11:58:19 GMT
Schema for XSLT incorrectly allows mixed content for xsl:output

The declaration (line 1412) says mixed="true".

Pull request #1956 created #created-1956

24 Apr at 11:34:34 GMT
1954 (part) Private variables and functions don't need to be in the module namespace

See issue #1954

The PR removes the requirement for private variables and functions declared in library modules to be in the module namespace. There has never been any sensible reason for this restriction.

The restriction is retained for public variables and functions; one could argue that it is unnecessary in that case also, but it does no harm and enforces good coding discipline.

Issue #1955 created #created-1955

24 Apr at 09:58:46 GMT
fn:doc, fn:parse-xml: entity expansion

The current rule for the entity-expansion-limit option is:

The processor should impose a limit on the number of entity references that are expanded, or on the size of the expanded entities, depending on the options available in the underlying XML parser; the limit should be commensurate with the value requested, but the precise effect may be implementation-dependent. If the XML parser does not offer the ability to impose a limit, or if the value is zero, then entity expansion should if possible be disabled entirely, leading to a dynamic error if the input contains any entity references. A negative value should be interpreted as placing no limits on entity expansion.

By default, Java uses 64000 as the limit. An explicit value less than or equal to 0 indicates no limit (https://docs.oracle.com/javase/tutorial/jaxp/limits/limits.html, https://docs.oracle.com/en/java/javase/17/docs/api/java.xml/module-summary.html). I don’t know about other languages.

I would like to…

  1. change the option to an expand-entities Boolean, or
  2. change the rules and make 0 disable the limit.

Any favorites?
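
To make the mismatch concrete, here is a minimal Python sketch contrasting the rule quoted above with the Java convention described in the linked documentation. The function names and the returned labels are hypothetical, chosen only for illustration; they are not part of any specification or parser API.

```python
def spec_policy(limit: int) -> str:
    """Interpret entity-expansion-limit as the quoted rule describes it."""
    if limit < 0:
        return "unlimited"       # negative: no limit on entity expansion
    if limit == 0:
        return "disabled"        # zero: entity expansion switched off
    return f"limit {limit}"      # positive: a limit commensurate with the value


def java_policy(limit: int) -> str:
    """Interpret the value following the Java convention (assumed from the links)."""
    if limit <= 0:
        return "unlimited"       # zero or negative: no limit
    return f"limit {limit}"


# The two conventions only disagree for the value zero.
for value in (-1, 0, 64000):
    print(value, spec_policy(value), java_policy(value))
```

Under these assumptions the only conflicting value is 0, which is what the second option in the list above would change.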

Issue #1954 created #created-1954

24 Apr at 09:39:56 GMT
Private fields in records

It would be nice to have some way of indicating that some of the fields in a record are (in some sense) private, intended for internal use.

I'm not proposing full encapsulation - the instances of a record type are maps, and can be manipulated by functions such as map:keys(), map:get(), and map:put() which expose all the keys.

Rather I'm proposing a convention that makes it difficult to access the fields "accidentally" using lookup expressions: a bit like naming the fields using a leading underscore, but something a bit stronger. Analogous to reflection in Java, which allows you to break encapsulation with a bit of effort.

I'd suggest making the keys for these "private" fields QNames rather than strings:

  • In the record declaration, we allow a field name to be a QName rather than a string: record(private:data as item()*, long, lat).
  • QNames can't be used directly in a lookup; to access the field, you need to know what namespace "private" is bound to, which doesn't need to be published information (though it is of course discoverable)
  • Internally the implementor of this interface can bind a QName to a private variable and use this:

declare %private variable $private:data as xs:QName := QName('http://my.private.namespace/', 'data');

and then access it using $record?$private:data
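
The convention can be illustrated with a small Python analogy, in which a dictionary stands in for the record and a named tuple stands in for xs:QName. The names QName, PRIVATE_DATA and lookup are hypothetical and exist only for this sketch; it models the idea, not the XPath semantics in full.

```python
from typing import NamedTuple


class QName(NamedTuple):
    """Stand-in for an xs:QName value: namespace URI plus local name."""
    namespace: str
    local: str


# Hypothetical private key; callers are not expected to know the namespace.
PRIVATE_DATA = QName("http://my.private.namespace/", "data")

record = {PRIVATE_DATA: [1, 2, 3], "long": 25, "lat": 10}


def lookup(rec: dict, key):
    """Model of a lookup: returns None when the key is absent."""
    return rec.get(key)


print(lookup(record, "long"))        # public field, reachable by its string key
print(lookup(record, "data"))        # the string "data" does not match the QName key
print(lookup(record, PRIVATE_DATA))  # reachable once the QName is known
print(len(record.keys()))            # reflection still exposes every key
```

As in the proposal, nothing is truly hidden: enumerating the keys reveals the QName, so the protection is against accidental access only.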

Issue #1953 created #created-1953

24 Apr at 09:19:38 GMT
Make generation of constructor function for named record types optional

I propose that when a named record type is declared in XQuery or XSLT, the generation of a constructor function should be optional.

Perhaps in XQuery it should only happen if there is an annotation %constructor, and in XSLT if there is an attribute constructor="yes". I think it's better for the default to be "no constructor" because it's better to make the existence of a constructor explicitly visible.

There are cases where you don't want a system-generated constructor primarily because you want to provide your own constructor which perhaps accepts the data in a slightly different form, or perhaps imposes constraints like cross-validation of supplied arguments.

Issue #1952 created #created-1952

23 Apr at 21:47:20 GMT
Change option name from xsi-schema-location to use-xsi-schema-location

Functions (such as fn:doc and fn:parse-xml) that have a boolean option xsi-schema-location should change this to use-xsi-schema-location to make it clearer that the expected value is a boolean and not a schema location.

(Comment made in passing at the last meeting).

Issue #1951 created #created-1951

23 Apr at 15:54:00 GMT
Some nits regarding the method attribute

A few minor comments on the method attribute (that also apply to XSLT 3.0):

  1. A note in section 25.1 says "In the case of the attributes method, cdata-section-elements, suppress-indentation, and use-character-maps, the effective value of the attribute contains a space-separated list of EQNames." The effective value of the method attribute should be a single value rather than a list of values, right?

  2. The XSD schema requires that the method attribute contain a colon if it is not one of the 6 "built-in" values. (The type xsl:method restricts xsl:EQName with the pattern "\c*:\c*"). Now that it can be an expanded QName, it should also allow for Q{...}.

  3. A note in section 3.2 says: "Extension attributes may also be used to influence the behavior of the serialization methods xml, xhtml, html, or text, to the extent that... If a serialization method other than one of these four is requested (using a prefixed QName in the method parameter) then...".
    a. This lists only 4 methods. Should "json" and "adaptive" be added to the list and "four" changed to "six"?
    b. Should "using a prefixed QName" be changed to something like "using an EQName with a non-absent namespace"?

  4. The note about error XTSE1570 in Section 26.2 says: "The value must (if present) be a valid EQName. If it is a lexical QName with no a prefix, then it identifies a method specified in [XSLT and XQuery Serialization] and must be one of xml, html, xhtml, or text."
    a. It lists only the 4 methods, leaving out "json" and "adaptive".
    b. Typo: "no a prefix" should be "no prefix".

Thanks!

Pull request #1950 created #created-1950

23 Apr at 15:52:08 GMT
1704 Add rules/notes for BOM and related topics

Fix #1704

The main substantive change is that unparsed-text() now explicitly discards any leading BOM.

Other functions that involve decoding of octets to strings are updated to reflect the changes that we made to reference the concept of "permitted characters".

Issue #1644 closed #closed-1644

23 Apr at 15:29:02 GMT

fn:elements-to-maps: Mixed Content

Issue #1658 closed #closed-1658

23 Apr at 15:28:43 GMT

fn:elements-to-maps: `empty`, normalize space ?

Issue #1647 closed #closed-1647

23 Apr at 15:28:36 GMT

fn:elements-to-maps: Explicit Layouts

Issue #1949 created #created-1949

23 Apr at 15:26:51 GMT
fn:element-to-map: Updated Feedback

My feedback is based on the latest version of the PR (#1906):

1. Boolean types

I think we should be careful about changing data to a representation that differs from the input data. If the input contains 0 and 1, it seems too invasive to me to return a boolean. Many users will not be aware that those numbers are valid candidates for Boolean conversions in XPath. That’s why I would still plead for adapting the type detection rules, and placing numeric before boolean (related: 5).

Things are even more awkward (if I got the rules right) when working without a conversion plan:

(: Query :)
element-to-map(<x><a>1</a><a>2</a></x>)

(: Result :)
{ "x": [ true(), 2 ] }

2. Explicit types

I still feel uneasy that we ignore the specified type if it does not match, all the more because XML is known for its rigor: documents must be well-formed to be accepted. I agree we should allow users to be lax about their generated output by deliberately omitting types, but if type hints are supplied, I think we should take them seriously.

An example:

element-to-map(
  <a>2</a>,
  { 'plan': { 'a': { 'layout': 'simple', 'type': 'boolean' } } }
)

3. Numeric casts

If the prescribed type is numeric and the value is castable as xs:numeric, then it is output as an instance of xs:integer, xs:decimal, or xs:double depending on the lexical form of the value, following the same rules as for XPath numeric literals.

Unless we use the same rule somewhere else in the spec, I would definitely vote for making things easier and choose consistency. xs:numeric(<a>1</a>) returns a double value, so I think we should do exactly the same here. If the result will be serialized as JSON, everything will be a number anyway.
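
For comparison, the two options can be sketched in Python. The function literal_type below classifies a value by its lexical form using the unsigned integer, decimal and double literal shapes of XPath; it ignores signs, whitespace, INF/NaN and any newer literal forms, and both function names are hypothetical.

```python
import re

INTEGER = re.compile(r"[0-9]+")
DECIMAL = re.compile(r"\.[0-9]+|[0-9]+\.[0-9]*")
DOUBLE = re.compile(r"(\.[0-9]+|[0-9]+(\.[0-9]*)?)[eE][+-]?[0-9]+")


def literal_type(lexical: str) -> str:
    """Type chosen by the lexical form, as for numeric literals."""
    if INTEGER.fullmatch(lexical):
        return "xs:integer"
    if DECIMAL.fullmatch(lexical):
        return "xs:decimal"
    if DOUBLE.fullmatch(lexical):
        return "xs:double"
    raise ValueError(f"not a numeric literal in this sketch: {lexical!r}")


def uniform_type(lexical: str) -> str:
    """Alternative suggested in the text: every numeric value is a double."""
    literal_type(lexical)  # reuse the check that the value is numeric
    return "xs:double"


for sample in ("1", "1.0", "1e0"):
    print(sample, literal_type(sample), uniform_type(sample))
```

The first rule yields three different types for what JSON would serialize as the same kind of number; the second yields one.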

4. Normalized space

  1. If empty($EE/(* | text())) …the layout is empty

I still believe empty($EE/(* | text()[normalize-space()])) would be a better choice. The error sections for both empty and list state that “whitespace-only text nodes are discarded.”, so it is not clear to me why the rules for whitespace text nodes differ for these layouts.
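
The difference between the two tests can be sketched with Python's standard ElementTree module. The function names are hypothetical, and the sketch only approximates the XPath expressions: ElementTree exposes the leading text of an element as `.text` rather than as separate text nodes.

```python
import xml.etree.ElementTree as ET

XML_WHITESPACE = " \t\r\n"


def is_empty_strict(elem: ET.Element) -> bool:
    """Approximates empty($EE/(* | text())): no child elements, no text at all."""
    return len(elem) == 0 and not elem.text


def is_empty_normalized(elem: ET.Element) -> bool:
    """Approximates empty($EE/(* | text()[normalize-space()])):
    whitespace-only text does not count."""
    return len(elem) == 0 and not (elem.text or "").strip(XML_WHITESPACE)


for source in ("<a/>", "<a>  </a>", "<a>x</a>", "<a><b/></a>"):
    elem = ET.fromstring(source)
    print(source, is_empty_strict(elem), is_empty_normalized(elem))
```

Only the whitespace-only element `<a>  </a>` is treated differently by the two tests.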

5. 18.5.2 Creating a conversion plan

The current rules do not mention yet that child keys need to be added for list and list-plus.

In general, I would appreciate it if redundancy could be removed. I’m still struggling to find all relevant information without resorting to the tests. For example, I think that due to the new XQuery code, a lot of informal and possibly lossy rules can be dropped.

6. Function signatures: document-node(), element()

Both functions should accept only elements, or accept both document nodes and elements. Maybe it’s better to only accept elements; it would resemble the name of the function.

7. deep-skip option

I wouldn’t be able to tell how a shallow-skip option could work, so maybe skip is sufficient?

8. Streamability

  • The conversion is not streamable.

It is not clear (to me) what this means. Is this XSLT-related? Maybe a reference would be helpful, or we should drop the phrase if it’s not relevant anymore?

My observation was that fn:element-to-map can be implemented without keeping the full document in main-memory, so maybe we should let the processors decide what to do?

9. JSON

Example output: …shown as serialized JSON. The result is always shown as a singleton map…

This sounds contradictory, as there are no maps in the JSON terminology (but objects). Maybe there is no need to mention the JSON serialization, as the presented results are maps & arrays that can be run as XPath expressions out of the box.

10. Layout rules: errors

The error rules for empty and empty-plus say: “If any other child nodes are present, this layout fails.”. For the simple and simple-plus layouts, it is “If any child elements are present, this layout fails.”.

Am I right to assume that in both cases it’s only child elements that result in a failure?

Issue #1948 created #created-1948

23 Apr at 14:43:55 GMT
fn:element-to-map: Tests

My feedback is based on the latest version of the PR (#1906) and the latest test cases (https://github.com/qt4cg/qt4tests/pull/223). I decided to list my observations in this repository, as I am not sure whether it’s the tests or the spec that may possibly need to be revised:

  1. element-to-map-017:

As discussed in the last meeting (see also https://github.com/qt4cg/qtspecs/pull/1906#issuecomment-2821502378), the xsi:type attribute should already be ignored in the choice of the conversion plan, in order to choose a plan that does not include attributes:

element-to-map(parse-xml('<a xmlns="http://a.com/" xsi:type="xs:integer"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">2</a>')/*)
Result: { "Q{http://a.com/}a": 2 }
Expect: { "Q{http://a.com/}a": { "#content": "2" } }
  2. element-to-map-401:

I would expect the record layout to be also applied for the b child node:

element-to-map(parse-xml('<a id="3"><b/></a>')/a, {'plan': {'a': {'layout': 'empty'},
  '*': {'layout': 'record'}, '@id': {'type': 'numeric'} }})
Result: { "a": { "@id": 3, "b": { } } }
Expect: { "a": { "@id": 3, "b": "" } }
  3. element-to-map-420 and others:

As discussed in the meeting, list layouts require a child key:

element-to-map(parse-xml('<a id="zz"><b/><b/><c/></a>')/a, {'plan':{'a':{'layout':'list'}}})
Error : XPTY0004: Missing key 'child' (node: a).
Expect: FOJS0008
  4. element-to-map-511:

I would expect the mixed layout to be also applied for the a child node:

element-to-map(parse-xml('<a>The <i>short</i> introduction</a>')/a, {'plan': {'a': {'layout':'simple'}, '*': {'layout':'mixed'} }})
Result: { "a": [ "The ", { "i": [ "short" ] }, " introduction" ] }
Expect: { "a": array { ("The ", { "i": "short" }, " introduction") } }

Issue #1936 closed #closed-1936

22 Apr at 20:19:39 GMT

XSD for XSLT 4.0 is missing form="qualified" on several attributes within the attribute group "literal-result-element-attributes"

Issue #1943 closed #closed-1943

22 Apr at 20:16:24 GMT

Mark attribute declarations as form=qualified

Issue #1947 closed #closed-1947

22 Apr at 20:16:23 GMT

1936 Mark attributes with form=qualified

Pull request #1947 created #created-1947

22 Apr at 20:16:12 GMT
1936 Mark attributes with form=qualified

Fix #1936

Issue #1946 created #created-1946

22 Apr at 16:27:12 GMT
We need examples of a record with an entry that is a %method and invoking this method with the result it must produce

We have two new great features in XPath 4.0 - the record and %method entries of a map.

It is natural to combine the two and have a record, one of whose entries is a method.

Unfortunately, at present there is no such example in the relevant Specs, and this leaves the reader trying to guess even the syntax of the specific record definition.

Therefore, we need at least one such example, to help readers and implementors in this.

BaseX did a great job in implementing both records and %method map-entries, and I constructed the following example, which is syntactically correct, but results in an error (due to function coercion - explained in a separate issue here: https://github.com/qt4cg/qtspecs/issues/1938).

This code:

declare namespace t = "my:t";
declare record t:location 
   ( longitude as xs:integer,
     latitude as xs:integer,
     myFun2 as %method fn() as xs:integer,
     *
   );
   
   let $r := t:location(longitude := 25, latitude := 10, myFun2 := %method fn() {?longitude + ?latitude}
                        )
     return ($r, $r?myFun2())

When executed with BaseX, raises an error: [XPDY0002] .: Context value is undefined.

We need the specification to provide a similar example, and what the result should be.

In particular:

  1. Must the return type of the method myFun2 (above) be specified or can it be omitted? BaseX raises a syntax error if this is specified as: myFun2 as %method fn() : "Expecting 'as', found ','".
  2. If the method entry is specified as having a particular type (such as: as xs:integer) then can the corresponding function, provided in the construction of this record omit the function type (such as myFun2 := %method fn() {?longitude + ?latitude}) , even though it is clear that the type of the result is xs:integer ?

Issue #1941 closed #closed-1941

22 Apr at 16:13:50 GMT

Add PR numbers and dates to change metadata

Issue #1921 closed #closed-1921

22 Apr at 16:10:38 GMT

XSLT: semantics of PatternVersionRange

Issue #1922 closed #closed-1922

22 Apr at 16:10:37 GMT

1921 Expand definition of version ranges in XSLT

Issue #1907 closed #closed-1907

22 Apr at 16:07:36 GMT

Method lookup: wildcards

Issue #1926 closed #closed-1926

22 Apr at 16:07:35 GMT

1907 method lookup (disallow wildcard selection)

Issue #1928 closed #closed-1928

22 Apr at 16:04:17 GMT

1844b Arrow Expressions

Issue #1724 closed #closed-1724

22 Apr at 16:03:05 GMT

Allow @copy-namespaces on <xsl:mode>?

Issue #1929 closed #closed-1929

22 Apr at 16:03:04 GMT

1725 xsl:mode/@copy-namespaces

Issue #1939 closed #closed-1939

22 Apr at 16:01:27 GMT

XQDY0153 (from try/finally) should be a type error

Issue #1940 closed #closed-1940

22 Apr at 16:01:25 GMT

1939 XQDY0153 (from try/finally) should be a type error

Issue #1931 closed #closed-1931

22 Apr at 15:58:13 GMT

QT4-CG-116-02 improve description of validation

Issue #910 closed #closed-910

22 Apr at 15:57:41 GMT

Introduce a Kollection object with functions that operate on all types of items that can be containers of unlimited number of "members"

Pull request #1945 created #created-1945

22 Apr at 14:46:22 GMT
1568 unicode case blind collation

Fix #1568

Issue #1944 created #created-1944

22 Apr at 09:14:55 GMT
Try/Catch/Finally - order of evaluation

I'm struggling a bit with try/catch/finally. Is there an implied constraint on the order of evaluation? The spec appears to suggest so:

its expression will be evaluated after the expressions of the try clause and a possibly evaluated catch clause.

What does evaluated after actually mean in a functional language? I can't see how to reconcile this with the general principles of the language regarding lazy evaluation etc. Is there some kind of exception to the general rule that you only have to evaluate as much of an expression as is needed to work out what the result is going to be?

It seems to me that the hidden unstated purpose of "finally" is to execute expressions that have side-effects, and without a proper semantic framework for handling side-effects, this is going to get us into trouble.

Pull request #1943 created #created-1943

21 Apr at 21:18:43 GMT
Mark attribute declarations as form=qualified

Resubmitted because of some clerical error...

Issue #1937 closed #closed-1937

21 Apr at 21:15:43 GMT

1936 Mark attribute declarations as form=qualified

Pull request #1942 created #created-1942

21 Apr at 19:54:42 GMT
37 Support sequence, array, and map destructuring declarations

Closes #37.

This currently only supports XPath. I'm working on the wording for XQuery.

Pull request #1941 created #created-1941

21 Apr at 16:23:44 GMT
Add PR numbers and dates to change metadata

Purely editorial. No issue raised.

Pull request #1940 created #created-1940

21 Apr at 10:33:56 GMT
1939 XQDY0153 (from try/finally) should be a type error

Closes #1939

Issue #1939 created #created-1939

20 Apr at 11:58:58 GMT
XQDY0153 (from try/finally) should be a type error

The finally clause is required to return an empty sequence. If not, it raises XQDY0153. This should be a type error rather than a dynamic error, so that it can be raised statically when appropriate.

Also noted in passing: "the the" in item 4 of the "changes" section of §4.20.

Issue #1938 created #created-1938

18 Apr at 20:36:04 GMT
Invoking coerced methods

@ChristianGruen, in BaseXdb/basex#2420, brought up a test case similar to this:

declare record local:r(
    f as fn() as item()
);

local:r(%method fn() {.})
? f()
=> map:keys()

With a method passed to the constructor, and retrieved by the lookup operator, one would expect the function call to return the map, and the result to be f, the set of keys of the map. With BaseX's current implementation however, it fails with [XPDY0002] .: Context value is undefined.

The reason for this failure is function coercion. The record constructor asks for a more specific type than that of the supplied function, so it is subject to function coercion. This creates a new function item, which preserves the method annotation, and which effects a call to the original one. The lookup operator then is applied to the newly created function item, which as a method is equipped with the map as its context item. But when the original function gets called, that context item is not propagated to it. So there is a (rightful?) complaint about an undefined context.

I may be missing something here, but I do not see anything in the spec that makes the context item available to the coerced function, so I think that the described behavior is in fact conformant to the spec.

But as it contradicts the original expectation, I would be grateful for a clarification.
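
The mechanism described above can be modeled in a few lines of Python. This is only a sketch of the reported behavior under stated assumptions, not of the specification or of BaseX: FunctionItem, coerce, coerce_propagating and lookup_and_call are hypothetical names, and the context value is passed as an explicit argument.

```python
class FunctionItem:
    """A function item whose body receives the context value (or None)."""

    def __init__(self, body, is_method=False):
        self.body = body
        self.is_method = is_method

    def call(self, context=None):
        return self.body(context)


def context_body(context):
    """Models the body `.`: return the context value or fail if it is absent."""
    if context is None:
        raise LookupError("XPDY0002: context value is undefined")
    return context


def coerce(fn):
    """Coercion as described: keeps the method annotation, drops the context."""
    return FunctionItem(lambda _ctx: fn.call(), is_method=fn.is_method)


def coerce_propagating(fn):
    """Variant that forwards the wrapper's context to the original function."""
    return FunctionItem(lambda ctx: fn.call(ctx), is_method=fn.is_method)


def lookup_and_call(record, key):
    """Models `$record?key()`: methods receive the record as their context."""
    fn = record[key]
    return fn.call(record if fn.is_method else None)


method = FunctionItem(context_body, is_method=True)

failing = {"f": coerce(method)}
try:
    lookup_and_call(failing, "f")
except LookupError as err:
    print(err)

working = {"f": coerce_propagating(method)}
print(sorted(lookup_and_call(working, "f").keys()))
```

In this model the first call fails exactly as reported, and the second returns the record itself, so its keys can be listed as originally expected.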

Pull request #1937 created #created-1937

17 Apr at 16:32:41 GMT
1936 Mark attribute declarations as form=qualified

Fix #1936

Issue #1936 created #created-1936

16 Apr at 17:37:25 GMT
XSD for XSLT 4.0 is missing form="qualified" on several attributes within the attribute group "literal-result-element-attributes"

The attributes in attribute group "literal-result-element-attributes" should be in the XSL namespace, and most of them are, except for xsl:default-mode, xsl:default-validation and xsl:expand-text. Those 3 are missing their form="qualified" attribute, so they would default to being in no namespace.

I know it's non-normative but it would be good to make the correction. (This is also an issue for the XSD for XSLT 3.0).

Issue #1799 closed #closed-1799

15 Apr at 23:06:04 GMT

"well-formed HTML document"?

Issue #1891 closed #closed-1891

15 Apr at 23:06:03 GMT

`fn:parse-html`: `html-version`

Issue #1918 closed #closed-1918

15 Apr at 23:06:02 GMT

1891 clarifications on HTML versions and errors

Issue #1363 closed #closed-1363

15 Apr at 22:50:57 GMT

map:get and array:get

Issue #1901 closed #closed-1901

15 Apr at 22:50:56 GMT

1363 fallback becomes a value not a function

Issue #1896 closed #closed-1896

15 Apr at 16:10:29 GMT

Drop "parameter names" as a property of a function item

Issue #1916 closed #closed-1916

15 Apr at 16:10:28 GMT

1896 Drop parameter names as a property of function items

Issue #1932 closed #closed-1932

15 Apr at 16:07:14 GMT

QT4-CG-115-01 xsl:next-match examples

Issue #1930 closed #closed-1930

15 Apr at 16:04:14 GMT

QT4-CG-116-04 correction to fn:function-identity

Issue #1923 closed #closed-1923

15 Apr at 16:02:04 GMT

Arithmetic Expressions needlessly mentions UnionExpr

Issue #1924 closed #closed-1924

15 Apr at 16:02:03 GMT

1923 Editorial adjustments for arithmetic expressions

Issue #269 closed #closed-269

15 Apr at 16:01:48 GMT

Function for URI relativization

Issue #826 closed #closed-826

15 Apr at 16:01:32 GMT

Arrays: Representation of single members of an array

Issue #1566 closed #closed-1566

15 Apr at 16:01:24 GMT

EXPath Modules: Future

Issue #1754 closed #closed-1754

15 Apr at 16:01:17 GMT

Inverse functions to bin:hex, bin:bin, and bin:octal

Issue #1780 closed #closed-1780

15 Apr at 16:01:09 GMT

xsl:for-each optional variable introduction

Issue #1905 closed #closed-1905

15 Apr at 15:58:43 GMT

Editorial edits

Issue #1919 closed #closed-1919

15 Apr at 15:58:42 GMT

1905 Editorial edits

Issue #1935 created #created-1935

15 Apr at 14:11:06 GMT
doc-available() with invalid options

The doc-available function needs to make it clear what happens when invalid options are supplied.

Clearly invalid options such as xinclude="yes-or-no" should be an error (rather than resulting in a return value of false, as the current spec might suggest).

It's less clear what should happen if say you request schema validation with a non-schema-aware processor. I think this probably calls for returning false rather than an error.

Issue #1934 created #created-1934

14 Apr at 14:26:24 GMT
Supporting RELAX NG validation

At meeting 116, the question was raised: why don't we support RELAX NG validation?

I think that's a good question. Further, I think we should, if we can work out the technical details and arrive at consensus.

The good news is that it's a lot simpler than XSD validation. For those not familiar with RELAX NG, the 50,000 foot summary is that it's (more-or-less) regular expressions over trees. The grammar defines a number of patterns, including at least one designated as a start pattern. If the document matches (any one of the) start pattern(s), then it's valid. A trivial example looks like this:

start = doc

doc =
    element doc {
        attribute date { xsd:date },
        p+
    }

p =
    element p {
        text
    }

(RELAX NG also has an XML syntax, but the "compact syntax" is isomorphic and many people find it easier to read.)

A couple of things to note: the "p" in "p+" in the doc pattern is a reference to the "p" pattern, not to the element named "p". And although the date attribute has to conform to an xsd:date, that does not do any type assignment. RELAX NG allows user-defined data types; I suggest we make that an implementation-defined feature. Since no type assignment is performed, it doesn't really matter.
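
To show what the trivial grammar accepts, here is a hand-written Python check that mirrors its three patterns using only the standard library. It is a sketch, not a RELAX NG implementation: the function names are hypothetical, and the xsd:date test is simplified to the plain YYYY-MM-DD form (no time zones or negative years).

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

XML_WHITESPACE = " \t\r\n"


def is_xsd_date(value: str) -> bool:
    """Simplified xsd:date check: YYYY-MM-DD naming a real calendar day."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if not match:
        return False
    try:
        date(int(match[1]), int(match[2]), int(match[3]))
    except ValueError:
        return False
    return True


def is_blank(text) -> bool:
    """True for absent or whitespace-only text."""
    return not (text or "").strip(XML_WHITESPACE)


def matches_p(elem: ET.Element) -> bool:
    """Pattern p: element p { text }, so no attributes and no child elements."""
    return elem.tag == "p" and not elem.attrib and len(elem) == 0


def matches_doc(elem: ET.Element) -> bool:
    """Pattern doc: a date attribute followed by one or more p elements."""
    children = list(elem)
    return (
        elem.tag == "doc"
        and set(elem.attrib) == {"date"}
        and is_xsd_date(elem.attrib["date"])
        and len(children) >= 1
        and all(matches_p(child) for child in children)
        and is_blank(elem.text)
        and all(is_blank(child.tail) for child in children)
    )


def validate(xml_text: str) -> bool:
    """start = doc: the document element must match the doc pattern."""
    return matches_doc(ET.fromstring(xml_text))


print(validate('<doc date="2025-04-14"><p>Hello</p><p>World</p></doc>'))
print(validate('<doc date="14 April"><p>Hello</p></doc>'))
print(validate('<doc date="2025-04-14"/>'))
```

Note that the check returns only a yes/no answer and assigns no types, in line with the remark that RELAX NG validation does not perform type assignment.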

How might we add support for RELAX NG validation? A sketch...

  1. Add to the static context a set of RELAX NG patterns. Initially empty, these are patterns that can match the document element during RELAX NG validation. (The union of all of the "start" patterns from all of the imported RELAX NG grammars.) There's no user-access to these patterns, so we don't technically need to add them to the data model, though I suppose we could.

  2. In XQuery, allow schema import to import RELAX NG grammars. This has no effect except that the start patterns defined in that grammar are added to the start patterns in the static context. It is an error to specify “fixed” or “default element namespace” if the imported schema is a RELAX NG grammar.

  3. Add “relax-ng” as a ValidationMode in ValidateExpr. It is an error to also specify a “type”.

    If RELAX NG validation is requested, the patterns in the static context are used to attempt to validate the document. If one succeeds, the validated document is returned. If none succeed, that’s an error.

  4. In XSLT, allow schema import to import RELAX NG grammars.

    • Details about role, TBD
    • Details about literal schema elements, TBD

  5. In XSLT, on elements that have a validation attribute, allow the value “relax-ng” with semantics analogous to the validate expression in XQuery.

  6. In the F&O functions that have “dtd-validation” and “xsd-validation” options, add a boolean “relax-ng” validation option. If true, validation is done with the patterns in the static context.

There's one small wrinkle: the RELAX NG DTD Compatibility specification defines some annotations that allow a RELAX NG grammar to return default attributes. That means RELAX NG validation can return a different document than was validated, but I'm not sure how important that support is in 2025, if there were strong objections.

Pull request #1933 created #created-1933

14 Apr at 11:26:56 GMT
1271 fn:xsd-validator() function

This proposal makes schema validation (as performed by the XQuery validate expression) available as a function. This allows additional options to be defined without extending the grammar, it makes it easier to incorporate validation within a pipeline of function calls, and it makes validation available from XPath.

If the proposal is accepted I would propose doing some editorial reorganisation so that the current XQuery and XSLT text describing the semantics of validation are directed to the definition of this function, reducing duplication in the specs.

Fix #1271

Pull request #1932 created #created-1932

14 Apr at 09:25:14 GMT
QT4-CG-115-01 xsl:next-match examples

Adds an example demonstrating the passing of parameters through a chain of xsl:next-match instructions.

Pull request #1931 created #created-1931

14 Apr at 08:57:23 GMT
QT4-CG-116-02 improve description of validation

Improves the description of the semantics of the xsd-validation option on parse-xml() and doc(). Also brings the two functions into line by adding the xsi-schema-location option from doc() to parse-xml().

Pull request #1930 created #created-1930

14 Apr at 08:42:52 GMT
QT4-CG-116-04 correction to fn:function-identity

Fix a simple typo.

Pull request #1929 created #created-1929

13 Apr at 08:55:30 GMT
1725 xsl:mode/@copy-namespaces

Fix #1724

Issue #1742 closed #closed-1742

12 Apr at 23:20:54 GMT

Maps constructed using streamed xsl:fork instruction should not be ordered

Issue #1925 closed #closed-1925

12 Apr at 17:08:19 GMT

1844 Arrow Expressions

Pull request #1928 created #created-1928

12 Apr at 17:05:20 GMT
1844b Arrow Expressions

This PR doesn't do what issue https://github.com/qt4cg/qtspecs/issues/1844 suggests, namely dropping the mapping arrow. Instead it picks up a couple of points made in passing in that issue:

(a) drops remaining references to the obsolete =?> operator

(b) simplifies the grammar for arrow expressions

(c) improves the way arrow expressions are described, including their relationship to pipeline expressions.

Issue #1927 closed #closed-1927

12 Apr at 15:33:34 GMT

1907b method lookup

Pull request #1927 created #created-1927

12 Apr at 15:33:09 GMT
1907b method lookup

Fix #1907

Pull request #1926 created #created-1926

12 Apr at 15:29:51 GMT
1907 method lookup (disallow wildcard selection)

Fix #1907

Issue #1341 closed #closed-1341

12 Apr at 14:55:21 GMT

Remove the `$position` argument from the `$action` function passed to folds

Pull request #1925 created #created-1925

12 Apr at 13:49:51 GMT
1844 Arrow Expressions

This PR doesn't do what issue #1844 suggests, namely dropping the mapping arrow. Instead it picks up a couple of points made in passing in that issue:

(a) drops remaining references to the obsolete =?> operator

(b) simplifies the grammar for arrow expressions

(c) improves the way arrow expressions are described, including their relationship to pipeline expressions.

Pull request #1924 created #created-1924

12 Apr at 09:28:30 GMT
1923 Editorial adjustments for arithmetic expressions

Fix #1923

Issue #1923 created #created-1923

11 Apr at 16:52:05 GMT
Arithmetic Expressions needlessly mentions UnionExpr

Minor editorial adjustment needed at XPath specs, 4.8 Arithmetic Expressions. The definition of UnionExpr is mentioned in the EBNF snippets at the top, but that definition is not discussed, nor should it be. This snippet should be dropped.

Pull request #1922 created #created-1922

11 Apr at 10:46:41 GMT
1921 Expand definition of version ranges in XSLT

Fix #1921

Simple editorial bug fix.

Issue #1921 created #created-1921

10 Apr at 21:26:33 GMT
XSLT: semantics of PatternVersionRange

In XSLT §3.5.1, the semantics of VersionTo and VersionFromTo are described as if the keyword to is always followed by a VersionPrefix, whereas the syntax allows a choice of a VersionPrefix or a PackageVersion.

This problem is present in XSLT 3.0.

See also Saxon bug https://saxonica.plan.io/issues/6746

Note also the absence of tests for the form to VersionPrefix.

Issue #1737 closed #closed-1737

10 Apr at 12:44:54 GMT

Grammar problems introduced by #1732

Issue #1798 closed #closed-1798

10 Apr at 12:42:40 GMT

Getting the value of the new identity-(DM)property of a function. `fn:function-identity`

Issue #1920 created #created-1920

10 Apr at 12:39:27 GMT
Parse functions: determinism

The function fn:parse-xml is nondeterministic: Every function call may return a different node instance. Most other parse functions (fn:parse-json, fn:parse-csv, fn:csv-to-xml, etc) are deterministic, and I believe we should change that and make them nondeterministic as well.

We could also make fn:json-doc nondeterministic. If we don’t, we should probably add a stable option.
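As a loose analogy only (Python, not XPath), the distinction at stake is between results that are equal in content and results that are the same instance. The sketch below uses the standard `json` module; the variable names are illustrative and say nothing about how a QT4 processor behaves.

```python
import json

text = '{"a": [1, 2]}'

# Two separate parse calls on identical input.
first = json.loads(text)
second = json.loads(text)

# The results are structurally equal ...
same_content = first == second
# ... but each call produced a distinct object, much as each
# fn:parse-xml call may return a different node instance.
same_identity = first is second
```

A deterministic function would have to behave as if both calls returned the same instance, which only becomes observable once the results have identity.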

Pull request #1919 created #created-1919

10 Apr at 09:44:42 GMT
1905 Editorial edits

Closes #1905

Issue #1917 closed #closed-1917

09 Apr at 16:31:36 GMT

1891 HTML versions and errors

Pull request #1918 created #created-1918

09 Apr at 16:30:48 GMT
1891 clarifications on HTML versions and errors

Fix #1891 Fix #1799

partial fix for #1889

Pull request #1917 created #created-1917

09 Apr at 16:24:38 GMT
1891 HTML versions and errors

Fix #1891 Fix #1799

partial fix for #1889

Pull request #1916 created #created-1916

08 Apr at 18:31:29 GMT
1896 Drop parameter names as a property of function items

Fix #1896

Issue #1911 closed #closed-1911

08 Apr at 18:15:53 GMT

Remarks on recent changes to regular expression handling

Issue #1902 closed #closed-1902

08 Apr at 18:02:19 GMT

`binary:unpack-integer`, overflow/underflow

Issue #451 closed #closed-451

08 Apr at 16:32:17 GMT

Multiple Schemas

Issue #1819 closed #closed-1819

08 Apr at 16:32:16 GMT

451 Multiple schemas in XSLT

Issue #1881 closed #closed-1881

08 Apr at 16:29:06 GMT

fn:function-identity for maps and arrays

Issue #1895 closed #closed-1895

08 Apr at 16:29:05 GMT

1881 Function identity for maps and arrays

Issue #1876 closed #closed-1876

08 Apr at 16:26:02 GMT

`fn:replace`: Combine $replacement and $action parameters

Issue #1897 closed #closed-1897

08 Apr at 16:26:01 GMT

1876 In fn:replace(), merge the $replacement and $action parameters

Issue #1520 closed #closed-1520

08 Apr at 16:22:56 GMT

Type declarations of cyclically dependent modules

Issue #1908 closed #closed-1908

08 Apr at 16:22:55 GMT

1520 Allow forwards references to named item types

Issue #1910 closed #closed-1910

08 Apr at 16:20:10 GMT

1021 (part 1) Add $options arg to doc() and doc-available()

Issue #501 closed #closed-501

08 Apr at 16:16:33 GMT

Error handling: try/finally

Issue #1914 closed #closed-1914

08 Apr at 16:16:32 GMT

501 Error handling: try/finally

Issue #1915 closed #closed-1915

08 Apr at 16:13:49 GMT

1902b bin:unpack out of range error

Issue #1624 closed #closed-1624

08 Apr at 16:11:06 GMT

document-node(a|b) is the same type as document-node(a)|document-node(b)

Issue #1898 closed #closed-1898

08 Apr at 16:11:05 GMT

1624b Expand rules for document node subtyping

Issue #1832 closed #closed-1832

08 Apr at 16:07:15 GMT

Associativity of Operators, especially "||" (Appendix A.5)

Issue #1904 closed #closed-1904

08 Apr at 16:07:14 GMT

1832 Operator Associativity

Issue #564 closed #closed-564

08 Apr at 16:06:27 GMT

Sorted maps

Issue #982 closed #closed-982

08 Apr at 16:06:21 GMT

scan-left, scan-right: position argument, array functions

Issue #1846 closed #closed-1846

08 Apr at 16:06:17 GMT

%method functions, dynamic function calls

Issue #1900 closed #closed-1900

08 Apr at 16:06:14 GMT

Records: instance checks

Issue #1913 closed #closed-1913

08 Apr at 16:03:43 GMT

1911 Clarifications for regular expressions

Issue #1645 closed #closed-1645

08 Apr at 14:29:06 GMT

fn:elements-to-maps: Debugging

Issue #1646 closed #closed-1646

08 Apr at 14:28:54 GMT

fn:elements-to-maps: Robustness

Issue #1648 closed #closed-1648

08 Apr at 14:27:03 GMT

fn:elements-to-maps: Types

Issue #1909 closed #closed-1909

06 Apr at 23:22:47 GMT

1902 bin unpack out of range

Pull request #1915 created #created-1915

06 Apr at 23:21:47 GMT
1902b bin:unpack out of range error

Replaces PR #1909

Adds error conditions for unpacking an integer that is too large for the implementation

Pull request #1914 created #created-1914

04 Apr at 16:08:40 GMT
501 Error handling: try/finally

Closes #501

Pull request #1913 created #created-1913

04 Apr at 15:08:02 GMT
1911 Clarifications for regular expressions
  1. Reinstates the non-capturing group syntax (?: xxx )
  2. Clarifies that a zero-length matching segment does not overlap an immediately preceding adjacent (but non-zero-length) segment.

Issue #1912 created #created-1912

04 Apr at 14:05:19 GMT
Error handling: `fn:throw`

Adopted from #501:

In https://github.com/qt4cg/qtspecs/pull/493, a function/expression was suggested to re-throw errors:

try {
  (: wild stuff :)
} catch * {
  module:log($err:description),
  fn:throw($err:map)
}

An existing error map can be modified before it is rethrown:

try {
  1 div 0
} catch * {
  module:log($err:description),
  fn:throw($err:map => map:put('description', 'Arithmetic error'))
}
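For readers more familiar with mainstream exception handling, the second pattern (log, adjust the description, rethrow) looks roughly like this in Python. This is an analogy under stated assumptions, not a rendering of the proposed fn:throw: the names `risky` and `run` are hypothetical, and Python chains a new exception rather than modifying an error map.

```python
import logging


def risky():
    # Stands in for the "1 div 0" expression.
    return 1 / 0


def run():
    try:
        return risky()
    except ZeroDivisionError as err:
        # Log the original description first ...
        logging.warning("caught: %s", err)
        # ... then rethrow with a replaced description,
        # keeping the original error attached as the cause.
        raise ArithmeticError("Arithmetic error") from err
```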

Issue #1911 created #created-1911

02 Apr at 13:45:40 GMT
Remarks on recent changes to regular expression handling

I would like to share these observations that I made while working on recent changes of regular expression handling per #1856.

The section that mentions potential rewrites of \b and \B fails to consider the start and end of the string, as well as the empty string. It should rather read:

\b can be rewritten to an equivalent form in terms of lookbehind and lookahead assertions:

(?:(*positive_lookbehind:\w)(?:$|(*positive_lookahead:\W))|(?:^|(*positive_lookbehind:\W))(*positive_lookahead:\w))

A similar rewrite is possible for \B, but it must additionally take care of the empty string.
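The rewrite can be checked empirically in any regex engine that has both \b and lookaround. The sketch below transcribes it into Python's `re` syntax, with `(?<=...)` for positive lookbehind and `(?=...)` for positive lookahead, and compares match positions against the built-in \b on a few sample strings. The sample strings and helper name are illustrative, and Python's \w is assumed to be close enough to the XPath definition for this purpose.

```python
import re

# The proposed rewrite of \b, transcribed into Python syntax.
WORD_BOUNDARY_REWRITE = r"(?:(?<=\w)(?:$|(?=\W))|(?:^|(?<=\W))(?=\w))"


def boundary_positions(pattern, text):
    """Return the offsets at which the zero-width pattern matches."""
    return [m.start() for m in re.finditer(pattern, text)]


# Includes the empty string and strings with a word character
# at the very start and end, the cases the issue calls out.
samples = ["", "a", "hello world", " a ", "foo-bar_baz 9", "..."]

agree = all(
    boundary_positions(r"\b", s) == boundary_positions(WORD_BOUNDARY_REWRITE, s)
    for s in samples
)
```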

For fn:analyze-string, it might be useful to add a clarifying remark about empty matches at the end of the result (see qt4cg/qt4tests#224).

The specification of fn:analyze-string contains a duplicated word, the the. The same also occurs in several other places in the documents.

Pull request #1910 created #created-1910

31 Mar at 16:50:14 GMT
1021 (part 1) Add $options arg to doc() and doc-available()

To follow: options for collection() and uri-collection().

Pull request #1909 created #created-1909

31 Mar at 15:40:31 GMT
1902 bin unpack out of range

Add error condition.

Fix #1902

I also did some work on removing errors and warnings from the EXPath binary build. There are a couple of outstanding issues I'm not sure how to fix:

(a) The function bin:bin had the incorrect id value func-bin-binary instead of func-bin-bin. I've corrected it, but the database of section ids needs updating.

(b) In database.xml, the EXPath binary spec is identified as document-summary/@uri = "https://qt4cg.org/specifications/EXPath/binary-40/". But the actual location of the specification is "https://qt4cg.org/specifications/expath-binary-40/"

(c) There are tags such as <code>bin:index-of-range</code> which the stylesheet is trying to interpret as function names rather than error codes. They actually refer to obsolete error codes so we can't use <errorref>

Pull request #1908 created #created-1908

31 Mar at 11:32:55 GMT
1520 Allow forwards references to named item types

Fix #1520

Issue #1907 created #created-1907

29 Mar at 17:42:57 GMT
Method lookup: wildcards

We should ignore the %method annotation for wildcard lookups:

let $data := { 'fn': %method fn() { . } }
return $data?*

I cannot see when this makes sense as it only seems to work if the wildcard lookup returns a single item (→ “selects a key/value pair whose value part is a singleton method”). In addition, it makes streaming of wildcard results troublesome.

If we believe we should support context value bindings for wildcards, I think it would be better to apply it to each item of the returned value, instead of the value as a whole.

Pull request #1906 created #created-1906

28 Mar at 15:16:41 GMT
1797 elements-to-maps-conversion-plan function

The PR moves the "uniform" option of elements-to-maps into a separate function elements-to-maps-conversion-plan, which can be used to analyze a corpus of data and generate a conversion plan for use by elements-to-maps. This is useful when the conversion is to be applied to documents that are not part of the corpus, for example when new documents arrive for conversion every day and need to be converted in a consistent way. It also provides a more general mechanism for users to override the system decisions on what layouts to use for what elements.

The PR is not entirely complete at this stage: the technical detail is all there, but examples need to be reviewed. Comments are welcome at this stage.

There are a few other minor changes. The most notable are:

  • More consistent fallback when an inappropriate layout is chosen. If the layout does not allow attributes, then attributes are discarded; if there is any other mismatch, the converter falls back to serialized XML layout.
  • Better handling of boolean and numeric element and attribute content.

Issue #1905 created #created-1905

27 Mar at 15:40:09 GMT
Editorial edits

XQFO:

  • Buggy examples/results: map:put, map:of-pairs, fn:scan-left
  • duplicates: the the, …
  • Boolean defaults: true()/false() vs. true/false

…to be continued

Pull request #1904 created #created-1904

27 Mar at 11:42:29 GMT
1832 Operator Associativity

Update the table and explanatory notes.

Fix #1832

Issue #1903 created #created-1903

27 Mar at 11:27:31 GMT
`fn:scan-left`, `fn:scan-right`: missing steps

I have labeled #982 (which included position arguments) to be closed to focus on the remaining todos:

  1. The types of the $action parameters of fn:fold-right and fn:scan-right should be aligned. In particular, item()* and item() of the scan function should be swapped: → #1919
fn:fold-right(
  $input   as item()*,
  $init    as item()*,
  $action  as fn(item(), item()*) as item()*
) as item()*

fn:scan-right(
  $input   as item()*,
  $init    as item()*,
  $action  as fn(item()*, item()) as item()*
) as array(*)*
  2. The result of the last example of fn:scan-left is syntactically wrong. → #1919
  3. The equivalent array functions are still missing (if we still believe we want to include them).
  4. We need tests.
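To make the argument-order point concrete, here is a minimal Python sketch of right fold and the two scans. It follows the usual functional-programming definitions (as in Haskell's foldr, scanl and scanr), which are assumed rather than taken from the QT4 text. In `fold_right` and `scan_right` the callback receives the current item first and the accumulated value second, which is the alignment the first item asks for.

```python
from functools import reduce


def fold_right(items, init, action):
    """Right fold: action(item, accumulated), starting from the last item."""
    return reduce(lambda acc, item: action(item, acc), reversed(items), init)


def scan_left(items, init, action):
    """All intermediate results of a left fold; action(accumulated, item)."""
    results = [init]
    for item in items:
        results.append(action(results[-1], item))
    return results


def scan_right(items, init, action):
    """All intermediate results of a right fold; action(item, accumulated)."""
    results = [init]
    for item in reversed(items):
        results.insert(0, action(item, results[0]))
    return results
```

With a non-commutative callback such as subtraction the order of the two callback arguments changes the result, which is why the fold and scan signatures need to agree.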

Issue #1902 created #created-1902

27 Mar at 10:20:49 GMT
`binary:unpack-integer`, overflow/underflow

If binary:unpack-integer or binary:unpack-unsigned-integer generates a value that exceeds the range supported by the implementation, err:FOAR0002 should be raised.

Related: https://github.com/expath/expath-cg/issues/116
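The requested behaviour amounts to a range check after decoding. The Python sketch below assumes a hypothetical implementation limited to 64-bit signed integers and a simplified signature (no offset, size or octet-order arguments), so it is not the binary module's actual function. Python's `OverflowError` stands in for err:FOAR0002.

```python
# Hypothetical implementation limit: 64-bit signed integers.
INT_MIN, INT_MAX = -(2**63), 2**63 - 1


def unpack_integer(data: bytes, signed: bool = True) -> int:
    """Decode big-endian bytes, rejecting values outside the supported range."""
    value = int.from_bytes(data, byteorder="big", signed=signed)
    if not INT_MIN <= value <= INT_MAX:
        # Corresponds to raising err:FOAR0002 in the proposal.
        raise OverflowError("FOAR0002: value out of supported range")
    return value
```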

Pull request #1901 created #created-1901

27 Mar at 00:44:43 GMT
1363 fallback becomes a value not a function

Issue #1363 generated a large amount of discussion on how to handle absent keys in map:get() and out-of-range indexes in array:get().

I felt that one of the simplest proposals was to change the $fallback argument to be a simple default value, rather than a function. This eliminates some of the more "clever" use cases, but these can always be achieved in other ways, as the discussion thread demonstrates. Meanwhile reducing $fallback to a simple default value makes life easier for the 90% of cases where this is all that is needed (especially for arrays, when the desire is to return a default value rather than throwing an error).

This PR therefore implements that simple proposal.

Fix #1363
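The "simple default value" shape is the one most languages already offer. The Python sketch below shows it for both cases; `map_get` and `array_get` are hypothetical helper names, positions are taken to be 1-based as in XPath arrays, and nothing here reflects the exact signatures in the PR.

```python
def map_get(mapping, key, fallback=None):
    """Return the value for key, or the fallback value if the key is absent."""
    return mapping.get(key, fallback)


def array_get(members, position, fallback):
    """Return the member at a 1-based position, or the fallback if out of range."""
    if 1 <= position <= len(members):
        return members[position - 1]
    return fallback
```

Because the fallback is a plain value, the caller does not need to wrap it in a function, which covers the common case described above.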

Issue #1766 closed #closed-1766

26 Mar at 15:26:21 GMT

1715 Drop array bound checking

Issue #1900 created #created-1900

26 Mar at 14:13:12 GMT
Records: instance checks

Continues #1862:

In the last meeting, we discussed whether the order of record entries should be considered in instance checks. After further reflection and attempts to implement it, I believe this will make things much easier in the long term:

In 3.4.1 Item Coercion Rules, the coercion of records was added as a second exceptional case: The coercion may change the item in question even if the upstream instance check is successful. This leads to additional action that I believe could simply be avoided if the successful instance means that no further action is required. I think it will also reduce possible cost that was indicated in https://github.com/qt4cg/qtspecs/issues/1862#issuecomment-2709104860.

Note: This issue clearly focuses on implications for the implementation. From a user perspective, I assume it will hardly ever make a difference whether we consider order or not. My assumption is that nearly all records will have the expected order anyway, or they will match the order once the first coercion has taken place.

All this takes time to specify. I will be glad to make an attempt and write the PR.

Issue #1899 closed #closed-1899

26 Mar at 08:42:42 GMT

Superfluous whitespace change to nudge CI; apologies for the noise.

Pull request #1899 created #created-1899

26 Mar at 08:26:18 GMT

Superfluous whitespace change to nudge CI; apologies for the noise.

Issue #1660 closed #closed-1660

25 Mar at 20:23:15 GMT

Further suggestions for fn:path

Issue #1747 closed #closed-1747

25 Mar at 20:05:57 GMT

Function finder is broken

Issue #1858 closed #closed-1858

25 Mar at 19:07:10 GMT

Initial xsl:record

Issue #1870 closed #closed-1870

25 Mar at 19:00:43 GMT

Rename $zero keyword of fold-left and fold-right

Issue #1887 closed #closed-1887

25 Mar at 18:59:31 GMT

1870 rename $zero keyword of fold functions

Issue #1886 closed #closed-1886

25 Mar at 18:48:40 GMT

1660 Additional options for fn:path

Issue #1862 closed #closed-1862

25 Mar at 18:47:11 GMT

Records: consider order

Issue #1874 closed #closed-1874

25 Mar at 18:47:10 GMT

1862 Coercing to a record type changes map order

Issue #1861 closed #closed-1861

25 Mar at 18:46:08 GMT

xsl:next-match with-all-params

Issue #1875 closed #closed-1875

25 Mar at 18:46:07 GMT

1861 Params passed automatically through next-match

Pull request #1898 created #created-1898

25 Mar at 15:33:05 GMT
1624b Expand rules for document node subtyping

Fix #1624

Pull request #1897 created #created-1897

25 Mar at 11:57:55 GMT
1876 In fn:replace(), merge the $replacement and $action parameters

Fix #1876

Issue #1884 closed #closed-1884

25 Mar at 11:10:48 GMT

Deep-equality keys

Issue #1896 created #created-1896

25 Mar at 10:59:40 GMT
Drop "parameter names" as a property of a function item

One of the properties of function items is the parameter names.

This property is unused; there is nothing that depends on the value of this property, and no way of discovering the value, and it isn't defined for all function items, e.g. maps and arrays, or functions returned by functions such as fn:op. It causes complications, such as whether two functions can have the same identity if they have different parameter names. I propose to drop it.

Of course, there are open issues that suggest allowing parameter names to be used in dynamic function calls. But I see little chance of coming up with a design that achieves this, because in general when you're given a function item to call, you have no idea what the parameter names are, and the person supplying the function item has very little control over what the parameter names will be.

Pull request #1895 created #created-1895

25 Mar at 10:15:49 GMT
1881 Function identity for maps and arrays

Supplies rules for how fn:function-identity() should handle maps and arrays.

Also makes the point that labels are ignored. There's a general statement to that effect in XDM that labels are ignored except where otherwise specified, but it's useful to avoid any doubt here.

Fix #1881

Issue #1892 closed #closed-1892

24 Mar at 18:07:33 GMT

Dnovatchev dn examples (ignore this)

Pull request #1894 created #created-1894

24 Mar at 17:32:02 GMT
Additional examples to fn:chain - in a new branch

Re-submitted the same as PR 1890. Added some new examples to fn:chain.

Issue #1890 closed #closed-1890

24 Mar at 17:11:11 GMT

More examples added to fn:chain

Issue #1893 closed #closed-1893

24 Mar at 17:01:50 GMT

Fix broken markup

Pull request #1893 created #created-1893

24 Mar at 16:48:52 GMT
Fix broken markup

I cannot imagine how we got a merged PR that included broken markup, but it's probably made a mess of the diffs recently.

Pull request #1892 created #created-1892

24 Mar at 15:56:42 GMT
Dnovatchev dn examples (ignore this)

This PR #1890 rebased off master to test if it makes for cleaner diffs.

Issue #1891 created #created-1891

24 Mar at 12:26:59 GMT
`fn:parse-html`: `html-version`

Maybe we can align the HTML versions that fn:parse-html needs to support with the remaining specification. It currently says:

Valid values an implementation must support for the html method are:

  • 3, 3.2 for HTML 3.2 W3C Recommendation, 14 January 1997
  • 4, 4.01 for HTML 4.01 W3C Recommendation, 24 December 1999
  • 5.0 for HTML5 W3C Recommendation, 28 October 2014
  • 5.1 for HTML 5.1 W3C Recommendation, 1 November 2016
  • 5.2 for HTML 5.2 W3C Recommendation, 14 December 2017
  • LS for HTML Living Standard, WHATWG
  • 5 may be equivalent to any of 5.0, 5.1, 5.2, or LS

In the XQFO and Serialization specs, only HTML 4.0/4.01 and HTML 5 are mentioned.

@rhdunn Do you have an opinion on this?

Related: #1889

Pull request #1890 created #created-1890

23 Mar at 22:34:52 GMT
More examples added to fn:chain

Added 6 more examples and tests. All are correctly executed.

Issue #1889 created #created-1889

20 Mar at 15:32:00 GMT
HTML serialization: `html-version` and `version` parameters; allowed values

The serialization spec says (HTML Output Method: the version and html-version Parameters):

If the html-version serialization parameter is not absent, the requested HTML version is the value of the html-version serialization parameter; otherwise, it is the value of the version serialization parameter.

fn:serialize defines the following defaults:

  • html-version: 5
  • version: 1.0

I wonder whether these rules cover all possible cases:

  1. Is it correct that HTML will be serialized as HTML5 if no options are supplied?
(: html-version=5 :)
serialize(<html/>, { 'method': 'html' })
  2. If only version is supplied, is it correct that it is ignored because of html-version defaulting to 5?
(: html-version=5 ? :)
serialize(<html/>, { 'method': 'html', 'version': '4.01' })
  3. If no, i.e., if { 'version': '4.01' } is expected to overwrite the default for html-version, how can we know at which stage the default values are to be considered?
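The ambiguity in these questions is about when defaults are merged in. The Python sketch below contrasts two hypothetical readings of the rule; neither is confirmed by the serialization spec, and the function names and the use of strings for both parameters are assumptions made for illustration.

```python
# Defaults as listed for fn:serialize above.
DEFAULTS = {"html-version": "5", "version": "1.0"}


def requested_version_defaults_first(options):
    """Reading A: defaults are merged before the rule is applied,
    so html-version is never absent and version is never consulted."""
    merged = {**DEFAULTS, **options}
    if merged.get("html-version") is not None:
        return merged["html-version"]
    return merged["version"]


def requested_version_defaults_last(options):
    """Reading B: the rule is applied to the supplied options only,
    and the html-version default is a last resort."""
    if "html-version" in options:
        return options["html-version"]
    if "version" in options:
        return options["version"]
    return DEFAULTS["html-version"]
```

Both readings agree when no version option is supplied; they differ only for the second case, where just version is given.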

In addition, the serialization specification mentions versions HTML 4.01 and HTML5 various times, but it seems to be up to the implementation to decide which HTML versions to support. However, we seem to have test cases for 4.0 and 5. Would it make sense to define a minimum set of versions that need to be supported?

Finally, for some reason, the html-version parameter was defined to be a decimal, whereas version is defined as a string (since XQFO 3.1). Maybe this leads to the surprising result that Saxon seems to accept the option { 'version': '4.0' }, but rejects { 'html-version': 4 }.

Pull request #1888 created #created-1888

20 Mar at 04:57:46 GMT
366 xsl:package-location

First draft, for initial feedback.

Notes:

  • Because the CG has little energy/resources to develop the EXPath Zip module, I have situated the question of archive (compressed or not) in the URI scheme itself. There are dozens of archives, dozens of URI schemes. The only case where I have found overlap is in the jar: scheme/archive. Yes, I've seen zip: used as an alias for jar:, but it's not an official IANA URI scheme. This may need discussion.
  • I have opted to bind @priority to a non-zero integer. This is the first time the constraint for the union of positive and negative integers has been placed on an XSLT attribute, so I may not have correctly set up element-catalog.xml.
  • I have opted to not make attribute values format, name, and version as criteria for the priority package location (new term), so that developers can be warned when the package is at odds with the declaration. To make them criteria would mean that inconsistencies between the declaration and the referenced packages would remain undetected.
  • I adopted the terms "URL" and "entry" based upon the IANA nomenclature for the jar: scheme.
  • I may have overthought the distinction between archive and non-archive URIs. Feedback is appreciated.
  • Error code 3000 has been broken up into different possible errors.
  • Suggestions on the type and number of tests that need to be written for the test suite are welcome.

Pull request #1887 created #created-1887

18 Mar at 21:29:35 GMT
1870 rename $zero keyword of fold functions

Fix #1870

Issue #998 closed #closed-998

18 Mar at 20:54:21 GMT

regular expression addition - lookbehind assertions and lookahead assertions

Issue #1848 closed #closed-1848

18 Mar at 20:53:46 GMT

Define regular expressions using XSD 1.1 as baseline

Issue #1856 closed #closed-1856

18 Mar at 20:50:50 GMT

998 Add boundary and lookahead/behind assertions

Pull request #1886 created #created-1886

18 Mar at 18:33:52 GMT

1660 Additional options for fn:path

Issue #1860 closed #closed-1860

18 Mar at 17:36:43 GMT

fn:parse-xml: DTDs, external resources

Issue #1857 closed #closed-1857

18 Mar at 17:36:43 GMT

fn:parse-xml: `xinclude`

Issue #1879 closed #closed-1879

18 Mar at 17:36:42 GMT

1857, 1860: Add more options to parse-xml

Issue #1882 closed #closed-1882

18 Mar at 17:34:35 GMT

982 Editorial rewrite of scan-left and scan-right

Issue #1866 closed #closed-1866

18 Mar at 17:32:31 GMT

Ambiguities introduced by #1864

Issue #1877 closed #closed-1877

18 Mar at 17:32:30 GMT

1866 Disambiguate TypeSpecifier syntax

Issue #1867 closed #closed-1867

18 Mar at 17:30:57 GMT

1341 Drop position from fold callbacks

Issue #1869 closed #closed-1869

18 Mar at 17:28:04 GMT

`fn:duplicate-values`: Order of results

Issue #1873 closed #closed-1873

18 Mar at 17:28:03 GMT

1869 duplicate values

Issue #1851 closed #closed-1851

18 Mar at 17:26:24 GMT

Questions on `fn:atomic-type-annotation`

Issue #1878 closed #closed-1878

18 Mar at 17:26:23 GMT

1851 Make ?variety optional; explain namespace-sensitive

Issue #1863 closed #closed-1863

18 Mar at 17:24:40 GMT

add \U \u L \u \E to replace() (case conversion)

Issue #1880 closed #closed-1880

18 Mar at 17:23:07 GMT

Editorial revision of fn:function-identity

Issue #1885 created #created-1885

18 Mar at 16:51:25 GMT
Use the specification grammar markup to define the regular expression grammar in F&O

The grammar for regular expressions in the regular expression section of F&O is currently defined as a code block. Making it use the grammar markup used to define the pattern, XPath, and XQuery grammars would:

  1. give the grammar a unified appearance with the other grammars;
  2. allow grammar elements to be cross referenced and linked back to the grammar.

Issue #1884 created #created-1884

17 Mar at 18:18:00 GMT
Deep-equality keys

Issue #119 proposes extending maps to allow arbitrary values as keys. This is very difficult to achieve, (a) because the fact that keys are atomic items is deeply embedded in the design of a number of functions and operations on maps, and (b) because it's very hard to define an equality function that suits everyone.

The way we tackled variable equality semantics for strings was via the collation-key() function, which takes a string and a collation as input and produces an opaque key value, which can be used as a key in maps, and which reflects the desired equality semantics.

We could extend the same idea to values other than strings. In particular, we could define a deep-equality-key() that can be calculated for any sequence, and that takes all the matching options of the deep-equal() function as a parameter. (We could then redefine deep-equal(a, b, options) to mean deep-equality-key(a, options) eq deep-equality-key(b, options)).

The main drawback is that the deep-equality-keys for large node trees or maps would be rather long strings. People might use the functionality without realising the expense.

Another problem is that one of our options in deep-equal() is a callback function for item equality, and we couldn't replicate this when computing a key. But this callback is the only way we have, for example, to compare nodes by identity rather than by content.

Note that an internal deep-equality-key concept (or at least a deep-equality hashcode) is needed anyway for efficient implementation of deep-equal where order is deemed irrelevant. Without it, the function becomes O(n^2). Quite independently of this proposal, we should perhaps have an explicit option on deep-equal() to compare nodes by identity.
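The idea can be illustrated outside XPath. The Python sketch below derives a hashable key from nested lists and dicts so that two values are deeply equal exactly when their keys are equal, then uses those keys for an order-insensitive comparison that avoids pairwise checks. It is only an analogy: `deep_equality_key` here is a hypothetical helper, it returns a nested tuple rather than a string, and it models none of the deep-equal() options.

```python
from collections import Counter


def deep_equality_key(value):
    """Return a hashable key that is equal for deeply equal values."""
    if isinstance(value, dict):
        # Entry order is irrelevant for maps, so use a frozenset of pairs.
        return ("map", frozenset((k, deep_equality_key(v)) for k, v in value.items()))
    if isinstance(value, (list, tuple)):
        # Order matters for sequences.
        return ("seq", tuple(deep_equality_key(v) for v in value))
    # Tag atoms with their type so that 1, 1.0 and True stay distinct.
    return ("atom", type(value).__name__, value)


def deep_equal_unordered(a, b):
    """Compare two sequences ignoring order, by counting keys."""
    return Counter(map(deep_equality_key, a)) == Counter(map(deep_equality_key, b))
```

Since the key is hashable it can also serve directly as a dictionary key, which mirrors the proposed use as a map key. The cost concern raised above applies here too, because the key grows with the size of the value.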

Issue #1296 closed #closed-1296

17 Mar at 09:47:55 GMT

982 Rewrite of scan-left and scan-right

Pull request #1883 created #created-1883

16 Mar at 18:19:14 GMT
882 Replace fn:chain by fn:compose

Drops the existing fn:chain function and replaces it with a new fn:compose function.

This combines two separate changes:

(a) whereas fn:chain applies a sequence of functions to an input, fn:compose returns a composite function that can be used repeatedly with different inputs.

(b) the fn:compose function is restricted to arity-1 functions, which leads to a much simpler specification that still handles the vast majority of practical use cases.

In particular, note that if the sequence of functions to be applied is statically known, then it can always be written out explicitly; the real use case for this function is when the sequence of functions is constructed dynamically. And in this situation, fn:chain in its current form can easily fail because of problems with the arity of the functions included in the chain.
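The difference described in (a) and (b) can be sketched in a few lines of Python. This is an illustration under assumptions, not the proposed fn:compose: the name `compose` and the left-to-right application order are choices made here, and only arity-1 functions are supported.

```python
from functools import reduce


def compose(*functions):
    """Return a single function that applies the given arity-1
    functions in order, first to last."""
    def composite(value):
        return reduce(lambda acc, fn: fn(acc), functions, value)
    return composite


# The list of steps is built dynamically, then reused on several inputs.
steps = [str.strip, str.title]
normalise = compose(*steps)
```

The composite is built once and can then be called repeatedly, for example `normalise("  hello world ")` and `normalise(" qt4 cg")`, without restating the chain of steps.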

Issue #1865 closed #closed-1865

16 Mar at 17:13:35 GMT

Callback functions, position argument: consistency

Pull request #1882 created #created-1882

16 Mar at 00:09:09 GMT
982 Editorial rewrite of scan-left and scan-right

This is intended to be purely an editorial rewrite; it does not change the functionality.

Replaces #1296.

Addresses #982, but we still need to add corresponding functions for arrays.

Issue #1881 created #created-1881

15 Mar at 08:28:39 GMT
fn:function-identity for maps and arrays

The data model spec says that function identity is not defined for maps and arrays.

The specification of fn:function-identity() fails to mention this fact.

Pull request #1880 created #created-1880

15 Mar at 01:16:03 GMT
Editorial revision of fn:function-identity

Tidies up the text and adds examples

Pull request #1879 created #created-1879

14 Mar at 15:50:21 GMT
1857, 1860: Add more options to parse-xml

Add options to control entity expansion and XInclude processing.

Fix #1857 Fix #1860

Pull request #1878 created #created-1878

14 Mar at 14:58:34 GMT
1851 Make ?variety optional; explain namespace-sensitive

Fix #1851

Allow ?variety to be absent e.g. for xs:anySimpleType

Define namespace-sensitive by an xtermref to the definition in the XP/XQ spec.

Pull request #1877 created #created-1877

14 Mar at 14:38:56 GMT
1866 Disambiguate TypeSpecifier syntax

Fix #1866

Issue #1876 created #created-1876

14 Mar at 13:10:48 GMT
`fn:replace`: Combine $replacement and $action parameters

We could combine the competing $replacement and $action parameters:

replace(
  'this is a test',
  '(\w)(\w+)?',
  fn($s, $g) { upper-case($g[1]) || lower-case($g[2]) }
)

Original comment: https://github.com/qt4cg/qtspecs/issues/1863#issuecomment-2711149296
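Several regex APIs already accept either a replacement string or a callback in the same argument position. Python's `re.sub` is one of them, and the sketch below mirrors the example above with it. The helper name `capitalise` is illustrative; unlike the proposed callback, which receives the matched string and a sequence of groups, Python passes a single match object.

```python
import re


def capitalise(match):
    # Group 2 is optional, so it is None for one-letter words.
    first, rest = match.group(1), match.group(2) or ""
    return first.upper() + rest.lower()


result = re.sub(r"(\w)(\w+)?", capitalise, "this is a test")
```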

Pull request #1875 created #created-1875

14 Mar at 11:27:02 GMT
1861 Params passed automatically through next-match

Fix #1861

Pull request #1874 created #created-1874

14 Mar at 10:49:05 GMT
1862 Coercing to a record type changes map order

Fix #1862

Pull request #1873 created #created-1873

14 Mar at 09:55:31 GMT
1869 duplicate values

Fix #1869

Issue #1872 created #created-1872

13 Mar at 11:31:22 GMT
Arrays: members → values / entries?

I am pretty sure the first reaction will be DONT!, but for the sake of consistency it seems important enough for me to bring this up:

Could we rename array “members” to “values”?

Some advantages that I would see:

  • We could treat arrays and maps more similarly.
  • We already have a values lookup key specifier for arrays.
  • No 3.1 array function contains the string “member”, so we will not introduce any backward inconsistencies.
  • All 4.0 features that use this string could be safely renamed.

Of course the term “values” is a very common one, but we have decided to stick with “map values” – and arrays and maps are very similar.

Finally, I noticed that also the term “member” has different meanings in the spec and is not exclusively used for arrays (e.g. in the rules for fn:innermost, fn:format-integer or for members of union types).

Issue #1338 closed #closed-1338

13 Mar at 10:40:15 GMT

Arrays and maps: Members, entries, values, contents, pairs, …

Issue #1871 created #created-1871

13 Mar at 10:39:55 GMT
Arrays and maps: consistency

Suggestions (based on #1338, related: #1868)

  1. In symmetry with the pairs lookup specifier, we should add array:pairs and an inverse array:of-pairs function.
  2. In symmetry with the values lookup specifier, we should add array:values and map:values functions, to retrieve the values of maps and the members of arrays as a sequence of arrays.
  3. In return, array:members and array:of-members seem redundant, and we should drop them.
  4. In analogy with the keys specifier and map:keys, we should add array:keys (which returns a dense integer range).

Background

With version 4.0, we are adding a lot of promising and powerful new map and array features. This is a big step forward, compared to the obvious limitations of 3.1.

Some aspects of the 3.1 design have made it difficult (or impossible) to fully adjust arrays and maps, but (in my opinion) the old overall concept was impressively consistent – and it is definitely a big challenge to achieve a 4.0 design that is not too fragmented.

To me, this becomes particularly evident in the case of arrays. The following example sums up the items of all members of an array. For the cumbersome 3.1 solution…

for $pos in 1 to array:size($array)
return sum($array($pos))

…we now have several (roughly?) equivalent options to do this:

  1. for member $m in $array return sum($m)
  2. array:members($array) ! sum(?value)
  3. $array?pairs::* ! sum(?value)
  4. $array?values::* ! sum(.)

The examples above imply that:

  • for 1., an array member is a sequence;
  • for 2., an array member is a map;
  • for 3., an array has pairs (but there is no array:pairs);
  • for 4., an array has values (but there is no array:values).

Issue #1870 created #created-1870

12 Mar at 22:19:13 GMT
Rename $zero keyword of fold-left and fold-right

I find the name $zero for this parameter unhelpful and confusing.

I suggest $accum, short for "accumulator" or "accumulated result".

Issue #1869 created #created-1869

12 Mar at 18:44:11 GMT
`fn:duplicate-values`: Order of results

With https://github.com/qt4cg/qtspecs/pull/987, a rule was added to fn:duplicate-values:

For any set of values that compare equal, the one that is returned is the one that appears first in $values.

I think we should adapt the behavior to return the duplicates in the order they appear, not the original values:

  • A common use case for this function is to find the first duplicate in a list.
  • If we return the original values in the correct order, we need to parse the full sequence before we can know which will be the first result. A worst-case example:
(0x7FFFFFFFFFFFFFFF, 1, 1 to 0x7FFFFFFFFFFFFFFF)
=> duplicate-values()
=> head()
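
To make the difference between the two orderings concrete, here is a small Python model. It is not XPath and not the formal definition from the spec; the function names are hypothetical and equality is plain Python equality (no collations).

```python
from typing import Hashable, Iterable, Iterator, List


def duplicates_by_first_occurrence(values: Iterable[Hashable]) -> List[Hashable]:
    """Current rule: results are ordered by the FIRST occurrence of each value.

    The whole input must be consumed before the first result is known.
    """
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    # dicts keep insertion order, which here is first-occurrence order
    return [value for value, count in counts.items() if count > 1]


def duplicates_as_detected(values: Iterable[Hashable]) -> Iterator[Hashable]:
    """Suggested rule: emit a value as soon as its second occurrence is seen."""
    seen = set()
    reported = set()
    for value in values:
        if value in seen and value not in reported:
            reported.add(value)
            yield value
        seen.add(value)
```

For the input 3, 1, 1, 3 the first function returns 3, 1 and the second yields 1, 3. Because the second function is a generator, asking only for its first result stops reading the input at the first repeated value, which is the point of the worst-case example above.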

Issue #1868 created #created-1868

12 Mar at 18:19:26 GMT
array:members() to include index position

Currently array:members(["a", "b"]) returns

{'value': "a"},
{'value': "b"}

I suggest that it should instead return

{'key': 1, 'value': "a"},
{'key': 2, 'value': "b"}

The extra information is useful for any operation that wants to take account of positions as well as values. For example, rearranging an array into multiple columns. Using the names "key" and "value" also means that the data is suitable for converting an array to a map by means of map:of-pairs.

The function array:of-members() should change to accept record('value', *) (making the record type extensible) so that the key part is ignored if present.
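
The suggested behaviour can be modelled in a few lines of Python. This is only a sketch: arrays are modelled as lists, records as dicts, and the function names are hypothetical stand-ins for array:members, array:of-members and map:of-pairs.

```python
from typing import Any, Dict, List


def members_with_keys(array: List[Any]) -> List[Dict[str, Any]]:
    """Model of the suggested array:members: one record per member, 1-based key."""
    return [{"key": pos, "value": member} for pos, member in enumerate(array, start=1)]


def of_members(records: List[Dict[str, Any]]) -> List[Any]:
    """Model of the suggested array:of-members: only 'value' is read, 'key' is ignored."""
    return [record["value"] for record in records]


def of_pairs(records: List[Dict[str, Any]]) -> Dict[Any, Any]:
    """Model of map:of-pairs applied to the same records."""
    return {record["key"]: record["value"] for record in records}
```

With this model, `of_members(members_with_keys(a))` returns `a` unchanged, and records without a key are still accepted by `of_members`.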

Pull request #1867 created #created-1867

12 Mar at 18:02:44 GMT
1341 Drop position from fold callbacks

Following up on issue 1341, we decided to drop the position argument from the 4 fold functions.

Most of the changes in this PR are dealing with the collateral damage - changes to "formal equivalents" of other functions that previously relied on fold-left having the position available to the callback function.

Issue #1866 created #created-1866

12 Mar at 08:15:55 GMT
Ambiguities introduced by #1864

The grammar check done by RExification of XQuery and XPath 4.0 Grammars has detected a bunch of LALR(2) conflicts caused by the recent addition of TypeSpecifier to the KeySpecifier production.

In fact these are ambiguities between the following being used as a QName (via EQName, TypeName), or as a keyword:

  • array
  • attribute
  • comment
  • document-node
  • element
  • empty-sequence
  • enum
  • fn
  • function
  • item
  • map
  • namespace-node
  • node
  • processing-instruction
  • record
  • schema-attribute
  • schema-element
  • text

E.g. element in

$A?~element()

can be parsed as an element test, element(), or as a type name element followed by a PositionalArgumentList. This is similar to what the "reserved-function-names" constraint covers, but that does not apply here because there is no function name involved.

The SequenceType in TypeSpecifier, enclosed in extra parentheses, does not present a problem, so my proposal is to drop ItemType from TypeSpecifier and rewrite the production to

TypeSpecifier
         ::= '~' '(' SequenceType ')' 

Issue #1865 created #created-1865

11 Mar at 20:37:36 GMT
Callback functions, position argument: consistency
  • In https://github.com/qt4cg/qtspecs/pull/1735#issuecomment-2715090198, it was decided to remove the position argument from fn:fold-left and fn:fold-right. → #1867
  • As maps are ordered now, we should add the position argument to iterative map functions (e.g., map:for-each; basically all functions for which equivalent sequence and array functions exist).

Issue #1227 closed #closed-1227

11 Mar at 17:41:32 GMT

150 PR resubmission for fn ranks

Issue #1456 closed #closed-1456

11 Mar at 17:04:17 GMT

Filtering by type in lookup expressions

Issue #1864 closed #closed-1864

11 Mar at 17:04:15 GMT

1456 Lookup expressions filtered by type

Issue #1740 closed #closed-1740

11 Mar at 17:02:56 GMT

1725b Further elaboration of duplicates handling in maps

Issue #1735 closed #closed-1735

11 Mar at 17:02:28 GMT

1341 Drop $position callback from many functions

Issue #1794 closed #closed-1794

11 Mar at 15:16:00 GMT

Lookup: select all except

Issue #1778 closed #closed-1778

10 Mar at 16:22:41 GMT

1456 Lookup expressions filtered by type

Pull request #1864 created #created-1864

10 Mar at 16:21:58 GMT
1456 Lookup expressions filtered by type

Fix #1456

Technically identical to PR #1778, but reworked because it had become impossible to resolve the merge conflicts.

Issue #1863 created #created-1863

10 Mar at 05:49:18 GMT
add \U \u \L \u \E to replace() (case conversion)

Many systems using regular expressions support case conversion in the replacement strings.

For example,

sed -e 's/[aA]*/\L\u&/'

given AAA as input, produces Aaa.

It’s not 100% clear to me it’s worth adding, since an action function can do the same thing with more or less work, but for reference,

  • \U turns the replaced text into upper case until \E, \L, or the end of the replacement string.
  • \L turns the replaced text into lower case in the same way.
  • \u and \l affect the single next character and operate independently of \U, \L, \E.

I wrote up some more precise spec text and can make a pull request; the case in the sed example above is common in text conversion projects but slightly tricky to get right with a function,

          fn { upper-case(substring(., 1, 1)) || lower-case(substring(., 2)) }

This is simple, but consider \2 \L\u\1\3 as a function, where \1 may be empty.

Overall i don’t have strong feelings either way, except that supporting them may help people migrate from other systems or languages. \E feels uncomfortably procedural. In Perl and libpcre i think, \E also turns off \Q (which disables all metacharacters up until \E).

Like < and > in patterns, \L and friends can be emulated with some care, but that’s true of a lot of regular expression syntax, and one point of the shorthands (as i see it) is to move the feature towards being accessible by people with less of a programming background.
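
As an illustration of the "emulate it with a function" route, here is a Python sketch of a replacement callback equivalent to the template \2 \L\u\1\3. It uses Python's re module rather than fn:replace, and the pattern and helper names are made up for the example.

```python
import re


def lower_then_upper_first(text: str) -> str:
    r"""Effect of \L\u on a piece of text: lower-case it, then upper-case its first character."""
    return text[:1].upper() + text[1:].lower()


def replacement(match: re.Match) -> str:
    r"""Hand-written equivalent of the replacement template `\2 \L\u\1\3`."""
    g1, g2, g3 = (match.group(i) or "" for i in (1, 2, 3))
    # \u applies to the next character actually emitted, so \1 and \3 are
    # concatenated first; this is what keeps the result right when \1 is empty.
    return g2 + " " + lower_then_upper_first(g1 + g3)


# Hypothetical pattern: optional lower-case prefix, digits, upper-case suffix.
PATTERN = re.compile(r"([a-z]*)([0-9]+)([A-Z]+)")
```

`PATTERN.sub(replacement, "ab12CD")` gives `12 Abcd`, and with an empty first group `PATTERN.sub(replacement, "12CD")` gives `12 Cd`. Upper-casing \1 on its own would leave the second result as `12 cd`, which is the subtle part the post refers to.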

Issue #1862 created #created-1862

09 Mar at 18:55:18 GMT
Records: consider order

I think we should make the order of record entries part of instance checks and coercion rules:

  1. It will be less confusing for users if records have a well-defined order (similar to objects in OOL), in particular if records are serialized.
  2. It will be much easier for implementations to access record entries by their internal index if the order is statically known. There will still be opportunities for optimizing lookups in arbitrary maps (index-based access has generally become easier with maps being ordered).

Issue #1861 created #created-1861

07 Mar at 08:27:30 GMT
xsl:next-match with-all-params

Problem

The <xsl:next-match> instruction is useful when writing local templates to customize the behavior of an imported XSLT. Unfortunately, there is a limitation due to the fact that <xsl:next-match> does not pass along parameters unless the parameters are defined as tunneling or the parameters are explicitly coded using <xsl:with-param>.

The fact that <xsl:next-match> does not automatically pass along parameters can be surprising or lead to cumbersome workarounds, and limits how <xsl:next-match> can be used when writing local templates to customize the behavior of imported XSLT.

  • In situations where parameters are defined in an imported XSLT it might not be feasible to change parameters to tunneling.

  • In situations where a variety of parameters might be in scope when a template that uses <xsl:next-match> is invoked, currently each parameter needs to be explicitly coded using <xsl:param> and <xsl:with-param> in <xsl:next-match>, even though the parameters might not be relevant to the purpose or logic of the template. This may lead to fragile and less maintainable code and increases the cognitive load for developers, especially when working with complex, multi-layered stylesheets.

This proposal aims to simplify the use of the <xsl:next-match> instruction while being backwards compatible.

Proposal

  • Add an option to <xsl:next-match> to enable passing along all parameters. This option might take the form of a new optional attribute on <xsl:next-match> named with-all-params (this name is similar to the existing element name <xsl:with-param>) that takes a yes/no (or Boolean) value and defaults to no (false).

  • An instruction <xsl:next-match with-all-params="no"/> would operate the same as <xsl:next-match/> currently does.

  • An instruction <xsl:next-match with-all-params="yes"/> would operate the same as <xsl:next-match/> currently does with the difference that all parameters that were in scope when the current template was invoked will remain in scope for the next matching template.

  • An instruction <xsl:next-match with-all-params="yes"> that contains <xsl:with-param> should operate the same as described in the preceding paragraph with the difference that parameters defined by <xsl:with-param> will also be in scope for the next matching template.

  • If a parameter defined by <xsl:with-param> within <xsl:next-match with-all-params="yes"> has the same name as a parameter that was in scope when the current template was invoked, then the effective value of that parameter should be the value defined by <xsl:with-param>. This will allow a template to override parameters when necessary.

To summarize, <xsl:next-match with-all-params="yes"> should invoke the next matching template and automatically pass along all parameters that were in scope when the current template was invoked, and optionally allow using <xsl:with-param> to set additional parameters or modify parameter values.

Example

Given this input document:

<!-- input.xml -->
<section>
    <p>hello</p>
</section>

This stylesheet import.xsl provides a set of base templates. The template matching element "p" uses <xsl:next-match/> in its current (default) operation.

<!-- import.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="yes">
    
    <xsl:output indent="yes"/>
    <xsl:mode on-no-match="shallow-copy"/>
    
    <xsl:template match="section">
        <section>
            <xsl:apply-templates>
                <xsl:with-param name="a" select="'a'"/>
                <xsl:with-param name="b" select="'b'"/>
            </xsl:apply-templates>
        </section>
    </xsl:template>
    
    <xsl:template match="p">
        <xsl:param name="a"/>
        <xsl:param name="b"/>
        <xsl:param name="c"/>
        <p>a {$a}</p>
        <p>b {$b}</p>
        <p>c {$c}</p>
        <xsl:next-match/>
    </xsl:template>
    
</xsl:stylesheet>

This is the output of the above stylesheet import.xsl and the input document:

<section>
   <p>a a</p>
   <p>b b</p>
   <p>c </p>
   <p>hello</p>
</section>

This stylesheet before.xsl imports the stylesheet import.xsl and defines a template to customize how <p> elements are processed. The parameter $a needs to be intercepted and forwarded even though this template is not doing anything with $a. The parameter $b is overridden, and the parameter $c is added within <xsl:next-match>.

<!-- before.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    
    <xsl:import href="import.xsl"/>
    
    <xsl:template match="p">
        <xsl:param name="a"/>
        <p>customization</p>
        <xsl:next-match>
            <xsl:with-param name="a" select="$a"/>
            <xsl:with-param name="b" select="'buzz'"/>
            <xsl:with-param name="c" select="'c'"/>
        </xsl:next-match>
    </xsl:template>
    
</xsl:stylesheet>

This stylesheet after.xsl does the same thing as the previous stylesheet but uses with-all-params="yes". The template does not need to intercept and forward the parameter $a because this is handled automatically by with-all-params="yes". The parameter $b is overridden, and the parameter $c is added within <xsl:next-match> in the same way as the previous stylesheet. Although this is a small example in which the parameter $a is the only savings, the benefit of with-all-params="yes" can be significant in scenarios where there are more parameters.

<!-- after.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="4.0">
    
    <xsl:import href="import.xsl"/>
    
    <xsl:template match="p">
        <p>customization</p>
        <xsl:next-match with-all-params="yes">
            <xsl:with-param name="b" select="'buzz'"/>
            <xsl:with-param name="c" select="'c'"/>
        </xsl:next-match>
    </xsl:template>
    
</xsl:stylesheet>

The two stylesheets above (before.xsl and after.xsl) should produce the same output.

<!-- output.xml -->
<section>
   <p>customization</p>
   <p>a a</p>
   <p>b buzz</p>
   <p>c c</p>
   <p>hello</p>
</section>
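
The parameter rule described in the proposal amounts to a dictionary merge in which explicit <xsl:with-param> values win. A minimal Python model follows; it is not XSLT, the function name is hypothetical, and tunnel parameters are deliberately left out.

```python
from typing import Any, Dict, Optional


def params_for_next_match(
    in_scope: Dict[str, Any],
    with_param: Optional[Dict[str, Any]] = None,
    with_all_params: bool = False,
) -> Dict[str, Any]:
    """Return the (non-tunnel) parameters handed to the next matching template.

    in_scope        -- parameters that were passed when the current template was invoked
    with_param      -- parameters set by xsl:with-param children of xsl:next-match
    with_all_params -- model of the proposed with-all-params="yes" attribute
    """
    passed = dict(in_scope) if with_all_params else {}
    # Explicit xsl:with-param values are applied last, so they override.
    passed.update(with_param or {})
    return passed
```

For the example above, `params_for_next_match({"a": "a", "b": "b"}, {"b": "buzz", "c": "c"}, with_all_params=True)` yields a = a, b = buzz, c = c, matching output.xml. With `with_all_params=False` only b and c are passed, which is why before.xsl has to forward $a by hand.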

Issue #1801 closed #closed-1801

05 Mar at 19:07:46 GMT

1798 Function fn:function-identity

Issue #1860 created #created-1860

05 Mar at 17:51:08 GMT
fn:parse-xml: DTDs, external resources

The text doesn’t say much about what DTD validation means. Is my assumption correct that it boils down to a SAXParserFactory.setValidating call in Java?

What about DTDs in general? Given the following snippets (using the default false for DTD validation)…

<!-- xml.dtd -->
<!ENTITY arrow "→">

parse-xml(`
  <!DOCTYPE xml SYSTEM 'xml.dtd'>
  <xml>&arrow;</xml>`
)

…should the result be <xml/>, <xml>→</xml>, or an error? In other words, should the (potentially external) xml.dtd resource be resolved and interpreted?

Maybe we should introduce an additional DTD option (or options?) to control the loading of external DTDs and the handling of entities, for example:

http://apache.org/xml/features/nonvalidating/load-external-dtd
http://xml.org/sax/features/external-general-entities
http://xml.org/sax/features/external-parameter-entities

Thoughts are welcome.

Issue #1859 created #created-1859

05 Mar at 15:36:21 GMT
Question on `fn:chain` and `err:FOAP0001`

Per #1280, fn:apply has been changed to allow the number of arguments to be greater than the arity of the function.

fn:chain is defined in terms of fn:apply, and it also refers to the error code err:FOAP0001 which belongs to fn:apply.

However the condition for the error differs between the two:

  • fn:chain

    An error [err:FOAP0001] is raised if the arity of any function $f in $functions is different from the number of members in the array that is passed to fn:apply.

  • fn:apply

    A dynamic error is raised if the arity of the function $function is greater than the size of the array $arguments ([err:FOAP0001]).

Also in B Error Codes, err:FOAP0001 still asks for the arity to exactly match the number of arguments:

err:FOAP0001, Wrong number of arguments.

Raised when fn:apply is called and the arity of the supplied function is not the same as the number of members in the supplied array.

Should not the description of fn:chain and the error summary be adapted to the changed behaviour of fn:apply?
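
The two conditions can be contrasted with a small Python model. It assumes that surplus arguments are simply ignored under the relaxed rule; the function and exception names are hypothetical stand-ins for fn:apply and err:FOAP0001.

```python
import inspect
from typing import Any, Callable, Sequence


class WrongNumberOfArguments(Exception):
    """Stands in for err:FOAP0001 in this model."""


def _arity(function: Callable[..., Any]) -> int:
    return len(inspect.signature(function).parameters)


def apply_strict(function: Callable[..., Any], arguments: Sequence[Any]) -> Any:
    """Old wording: the arity must equal the number of array members."""
    if _arity(function) != len(arguments):
        raise WrongNumberOfArguments("arity differs from number of arguments")
    return function(*arguments)


def apply_relaxed(function: Callable[..., Any], arguments: Sequence[Any]) -> Any:
    """New wording: an error only if the arity is GREATER than the array size."""
    arity = _arity(function)
    if arity > len(arguments):
        raise WrongNumberOfArguments("arity exceeds number of arguments")
    # Assumption of this sketch: extra arguments are dropped.
    return function(*arguments[:arity])
```

A two-argument function applied to three arguments fails under `apply_strict` but succeeds under `apply_relaxed`. That is the case in which the fn:chain wording and the error summary currently disagree with fn:apply.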

Issue #1853 closed #closed-1853

04 Mar at 18:57:49 GMT

1845 Revised design of methods to use . rather than $this

Issue #1845 closed #closed-1845

04 Mar at 18:14:51 GMT

Should we add additional syntactic sugar for use with %method functions?

Issue #1820 closed #closed-1820

04 Mar at 17:21:56 GMT

Delta markers in collapsed TOC

Issue #1838 closed #closed-1838

04 Mar at 17:21:55 GMT

1820 Attempt to add change markup in collapsed ToC

Issue #1796 closed #closed-1796

04 Mar at 17:18:46 GMT

Allow fn:invisible-xml to return a function that returns an item()

Issue #1839 closed #closed-1839

04 Mar at 17:18:45 GMT

Relax the return type of the Invisible XML parsing function

Issue #1849 closed #closed-1849

04 Mar at 17:15:45 GMT

Reduce the indentation in the ToC

Issue #1850 closed #closed-1850

04 Mar at 17:12:15 GMT

Actions from meeting 111

Issue #1771 closed #closed-1771

04 Mar at 17:10:12 GMT

fn:deep-equal: map order

Issue #1855 closed #closed-1855

04 Mar at 17:10:11 GMT

1771 Add option for deep-equal to consider map order

Issue #1847 closed #closed-1847

04 Mar at 16:23:32 GMT

%method functions: explicit self reference?

Pull request #1858 created #created-1858

04 Mar at 12:04:24 GMT
Initial xsl:record

An initial draft of the xsl:record instruction, for discussion

Issue #1656 closed #closed-1656

03 Mar at 18:39:03 GMT

Ordered Maps: Updates

Issue #1829 closed #closed-1829

03 Mar at 18:30:00 GMT

Problems with new arrow expression syntax

Issue #1854 closed #closed-1854

03 Mar at 18:24:56 GMT

Can someone direct me to the motivating use case of objects?

Issue #1857 created #created-1857

03 Mar at 16:59:36 GMT
fn:parse-xml: `xinclude`

We should allow XInclude processing to be enabled/disabled, as it can potentially lead to memory leaks.

Issue #1835 closed #closed-1835

03 Mar at 12:38:52 GMT

add zero-width assertions to regular expressions

Pull request #1856 created #created-1856

03 Mar at 12:34:31 GMT
998 Add boundary and lookahead/behind assertions

Incorporates and supersedes #1835

Issue #1836 closed #closed-1836

02 Mar at 19:12:15 GMT

unparsed-text-lines() - line endings

Pull request #1855 created #created-1855

02 Mar at 19:06:38 GMT
1771 Add option for deep-equal to consider map order

Adds an option for deep-equal to treat order of entries in a map as significant.

Fix #1771

Issue #1854 created #created-1854

28 Feb at 11:37:21 GMT
Can someone direct me to the motivating use case of objects?

There's a LOT of conversations about "this" and methods and really quite complex syntax, but (despite writing OO software for about 30+ years), I can't think of a motivating use case in the context of XSLT/XQuery.

  • in imperative languages with mutable state...yes
  • in very large code bases requiring some abstraction/encapsulation - well maybe...there are probably easier ways to do this without 'objects'.

this is the sort of canonical example people use (to illustrate simple technical points)

$rect := {'x': 10, 'y': 7, 'area': fn(){?x * ?y}}

but actually I would write

$rect := {'x': 10, 'y': 7, 'area': 10 * 7}

(and similarly write a data constructor for the record in that manner)

i.e.

fn($x, $y) { {'x': $x, 'y': $y, 'area': $x * $y} }

I'm not sure at the moment it's worth the effort.

Pull request #1853 created #created-1853

27 Feb at 10:38:54 GMT
1845 Revised design of methods to use . rather than $this

Proposal is that in methods, the containing map should be bound to the context item rather than to the special variable $this, so fields of that map are referenced as ?x rather than $this?x.

Issue #1852 created #created-1852

27 Feb at 08:56:40 GMT
fn:values-except: Return atomic values that occur in A but not in B

fn:distinct-values can be used to perform a union on atomic values:

(: returns 1 to 5 :)
let $one := 1 to 4, $two := 2 to 5
return distinct-values(($one, $two))

fn:duplicate-values can be used for intersect:

(: returns 2 to 4 :)
let $one := 1 to 4, $two := 2 to 5
return duplicate-values(($one, $two))

A (roughly) equivalent alternative is $one[. = $two].

I think we should add an equivalent for except (it requires 2 arguments instead of 1):

fn:values-except(
  $values     as xs:anyAtomicType*,
  $exclude    as xs:anyAtomicType*,
  $collation  as xs:string?         := fn:default-collation()
) as xs:anyAtomicType*

An example:

(: returns 1 :)
let $one := 1 to 4, $two := 2 to 5
return values-except($one, $two)

In principle, this function can also be written as $one[not(. = $two)], but a dedicated function will be easier to understand for users and easier to optimize for processors.
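
For comparison, the three operations can be modelled in Python as follows. This is a sketch only: collations are ignored, values must be hashable, and the except variant keeps repeated values from the first argument, as the filter expression $one[not(. = $two)] would. The function names are hypothetical.

```python
from typing import Hashable, Iterable, List


def values_union(one: Iterable[Hashable], two: Iterable[Hashable]) -> List[Hashable]:
    """Model of distinct-values(($one, $two))."""
    return list(dict.fromkeys([*one, *two]))


def values_intersect(one: Iterable[Hashable], two: Iterable[Hashable]) -> List[Hashable]:
    """Model of duplicate-values(($one, $two)); assumes each input is duplicate-free."""
    counts = {}
    for value in [*one, *two]:
        counts[value] = counts.get(value, 0) + 1
    return [value for value, count in counts.items() if count > 1]


def values_except(values: Iterable[Hashable], exclude: Iterable[Hashable]) -> List[Hashable]:
    """Model of the suggested fn:values-except: values not equal to any excluded value."""
    excluded = set(exclude)  # built once, so each membership test is cheap
    return [value for value in values if value not in excluded]
```

Building the exclusion set once is the kind of optimization a dedicated function makes easy, whereas the general-comparison form has to be recognized by the processor first.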

Issue #1247 closed #closed-1247

26 Feb at 23:55:45 GMT

`??type(T)` in lookup expressions - shortcuts

Issue #1851 created #created-1851

26 Feb at 15:42:11 GMT
Questions on `fn:atomic-type-annotation`

These questions came up while working on fn:atomic-type-annotation:

  • what is the variety of xs:anySimpleType?
  • should not constructor be absent in an fn:schema-type-record describing xs:QName?

Here are the detailed observations:

fn:schema-type-record?variety

Consider the following query:

  (
    <x>42</x>
    => fn:atomic-type-annotation()
  )
  ?base-type()
  ?base-type()
  ?variety

My interpretation is as follows:

  • the x element node is atomized to a value of type xs:untypedAtomic,
  • so fn:atomic-type-annotation returns the information for xs:untypedAtomic,
  • the base type of that is xs:anyAtomicType,
  • the base type of that is xs:anySimpleType,
  • per XML schema 1.1, 3.16.7.1 xs:anySimpleType, the {variety} of xs:anySimpleType is absent,
  • so variety should be absent in an fn:schema-type-record describing xs:anySimpleType,
  • the result thus should be an empty sequence.

According to the current spec, variety must always be present with a value of type enum("atomic", "list", "union", "empty", "simple", "element-only", "mixed"), but also correspond to the {variety} of the simple type in the XSD component model.

Should not variety be optional, and omitted for xs:anySimpleType?

fn:schema-type-record?constructor

The spec says this about constructor:

The field is absent for complex types and for the abstract types xs:anyAtomicType, xs:anySimpleType, and xs:NOTATION. It is also absent for all namespace-sensitive types, that is, types derived from xs:QName or xs:NOTATION.

The formulation does not include xs:QName, but should not its constructor be absent for the same reasons as for the types derived from it?

Pull request #1850 created #created-1850

26 Feb at 12:36:25 GMT
Actions from meeting 111

[ ] QT4CG-111-01: MK to review the editorial comments on PR #1837 and then merge the PR.

Done (along with a couple of other minor corrections noted in passing)

[ ] QT4CG-111-02: MK to fix the typo $in as xs:double+ and 1.3. 1.4 that middle “.” should be a “,”

Already done before the PR was merged

[ ] QT4CG-111-03: MK to add a %method example that uses the arrow syntax.

Done (though the example isn't especially convincing).

Also added another couple of examples and notes in passing.

Pull request #1849 created #created-1849

26 Feb at 09:57:32 GMT

Reduce the indentation in the ToC

Issue #1848 created #created-1848

26 Feb at 08:31:33 GMT

Define regular expressions using XSD 1.1 as baseline

Issue #1800 closed #closed-1800

25 Feb at 21:17:38 GMT

The `=?>` lookup arrow expression operator is weird, difficult to use, difficult to understand, difficult to read and unnatural

Issue #1817 closed #closed-1817

25 Feb at 21:17:37 GMT

1800 Methods

Issue #1843 closed #closed-1843

25 Feb at 18:20:58 GMT

XQFO: TOC texts

Issue #1847 created #created-1847

25 Feb at 18:14:33 GMT
%method functions: explicit self reference?

This is a discussion issue; I am torn and would be interested in feedback:

With the just added %method annotation, basically two things happen:

  1. An implicit $this parameter is prepended to the remaining parameters of a function.
  2. The current map will be bound to the first parameter by the lookup operator.

The inner workings of the example in the spec were not entirely obvious in today’s meeting…

let $area := %method fn() { $this?x * $this?y }
return $area({ 'x': 3, 'y': 4 })

…and I am wondering if we are not more flexible by making the self-referencing parameter explicit. This way, it would be up to the user to decide how the parameter is called…

let $number := { 'value': 3, 'inc': %method fn($self) { $self?value + 1 } }
return $number?inc()

…the focus function syntax could be used alternatively…

let $number := { 'value': 3, 'inc': %method fn { ?value + 1 } }
return $number?inc()

…and it would allow for a stricter typing ($this as map(*) is not very specific), and thus for better error reporting:

declare record coord(
  x as xs:double,
  y as xs:double,
  product := %method fn($coord as coord) { $coord?x * $coord?y }
);
coord(3, 4)?product()

Obviously, it would cause new issues:

  • %method fn() {} would need to be made illegal
  • The type of the first argument would need to be map(*) or a subtype of it.
  • Users may be led to write…
let $map := { 'fn': %method fn($a, $b) { $a * $b } }
return $map?fn(2, 3)

On the other hand, the existence of the %method annotation should indicate that this function type differs from others.

If we stick with the invisible $this parameter, I wonder what function-arity(%method fn() {}) is supposed to return?
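
The explicit-self variant behaves much like bound methods in other languages. Here is a Python model, offered only as an analogy: `Method` plays the role of the %method annotation, `lookup` the role of the ? operator, and both names are invented for this sketch.

```python
import functools
from typing import Any, Callable, Dict


class Method:
    """Marks a function whose first parameter receives the containing map."""

    def __init__(self, function: Callable[..., Any]) -> None:
        self.function = function


def lookup(mapping: Dict[str, Any], key: str) -> Any:
    """Model of `$map?key`: method entries come back with the map already bound."""
    entry = mapping[key]
    if isinstance(entry, Method):
        return functools.partial(entry.function, mapping)
    return entry


number = {"value": 3, "inc": Method(lambda self: self["value"] + 1)}
```

`lookup(number, "inc")()` returns 4. The declared function has one parameter while the bound result takes none, which is one possible answer to the function-arity question. The pitfall listed above shows up too: a method declared with parameters ($a, $b) and called with two arguments receives three, because the map is passed first.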

Issue #1846 created #created-1846

25 Feb at 17:48:00 GMT
%method functions, dynamic function calls

With #1817, the %method annotation was introduced for functions. It is interpreted by the lookup operator:

let $number := { 'value': -3, 'abs': %method fn() { abs($this?value) } }
return $number?abs()

I think we should extend this mechanism to dynamic function calls, as many people use the constructs interchangeably:

return $number('abs')()

I agree that the binding mechanism should not apply for map:get or any other map functions and iterations.

Issue #1830 closed #closed-1830

25 Feb at 17:32:21 GMT

1829 Reintroduce restrictions on RHS of `=>`

Issue #1815 closed #closed-1815

25 Feb at 17:29:57 GMT

Function annotations on function items

Issue #1828 closed #closed-1828

25 Feb at 17:29:56 GMT

1815 Add more detail on annotations of function items

Issue #1834 closed #closed-1834

25 Feb at 17:27:37 GMT

json-lines - refinement

Issue #1837 closed #closed-1837

25 Feb at 17:27:36 GMT

1834 Additional clarification on JSON lines

Issue #583 closed #closed-583

25 Feb at 17:24:27 GMT

(array|map):replace → *:substitute or *:change

Issue #1833 closed #closed-1833

25 Feb at 17:24:26 GMT

583 Drop map:replace and array:replace

Issue #1816 closed #closed-1816

25 Feb at 17:21:08 GMT

Programmatic partial application

Issue #1825 closed #closed-1825

25 Feb at 17:21:07 GMT

1816 New function fn:partial-apply

Issue #1818 closed #closed-1818

25 Feb at 17:18:20 GMT

Grammar problem introduced by #1802

Issue #1826 closed #closed-1826

25 Feb at 17:18:19 GMT

Fix grammar bug #1818

Issue #1823 closed #closed-1823

25 Feb at 17:15:47 GMT

Clearer top-level section headings in F+O

Issue #1824 closed #closed-1824

25 Feb at 17:15:46 GMT

1823 Revise top-level headings in F+O spec

Issue #1845 created #created-1845

25 Feb at 17:14:05 GMT
Should we add additional syntactic sugar for use with %method functions?

During meeting 111, DN was arguing for additional syntactic sugar when his connection to the call ended abruptly. This issue is to make sure we come back to those discussions.

Specifically, should we allow ^x as an abbreviation for $this?x ?

Issue #1813 closed #closed-1813

25 Feb at 17:12:48 GMT

Reorganise top-level sections in XDM

Issue #1814 closed #closed-1814

25 Feb at 17:12:47 GMT

1813 Reorganise the XDM spec at top level

Issue #1811 closed #closed-1811

25 Feb at 17:09:23 GMT

Add note concerning non-XML characters in character maps

Issue #1812 closed #closed-1812

25 Feb at 17:09:21 GMT

1811 Add note regarding non-XML chars in xsl:output-character

Issue #1844 created #created-1844

25 Feb at 16:18:34 GMT
Drop mapping arrow operator

To reduce the number of new operators, I suggest removing the mapping arrow operator =!>, in favor of the recently added -> operator (which now allows us to arbitrarily create chains for single items and sequences).

Related: https://github.com/qt4cg/qtspecs/issues/1685

Issue #1843 created #created-1843

25 Feb at 15:35:10 GMT
XQFO: TOC texts

The XQFO TOC is overly verbose, and inconsistent nevertheless. With the addition of arrows and symbols, many headers stretch across several lines.

If no one objects, I will remove all the redundant "Functions ..." strings:

Current:

  1. Introduction
  2. Functions on nodes and node sequences
  3. Errors and diagnostics
  4. Functions and operators on numerics
  5. Functions on strings
  6. Functions that manipulate URIs
  7. Functions and operators on Boolean values
  8. Functions and operators on durations
  9. Functions and operators on dates and times
  10. Functions related to QNames ...

Proposed:

  1. Introduction
  2. Nodes
  3. Errors and diagnostics
  4. Numerics
  5. Strings
  6. URIs
  7. Boolean values
  8. Durations
  9. Dates and times
  10. QNames ...

Issue #1842 closed #closed-1842

25 Feb at 09:14:49 GMT

This is a test of the emergency broadcast system. This is only a test.

Issue #1842 created #created-1842

25 Feb at 09:13:44 GMT
This is a test of the emergency broadcast system. This is only a test.

Had this been a real emergency, we would have fled in terror and you would not have been informed.

Issue #1840 closed #closed-1840

25 Feb at 09:13:06 GMT

GH action remove-label-on-reopen.yml

Issue #1841 closed #closed-1841

25 Feb at 09:13:05 GMT

Action to remove label on reopen

Pull request #1841 created #created-1841

25 Feb at 09:12:58 GMT
Action to remove label on reopen

Close #1840

Pull request #1840 created #created-1840

24 Feb at 21:03:12 GMT
GH action remove-label-on-reopen.yml

In response to the mailing list post by @ndw: https://lists.w3.org/Archives/Public/public-xslt-40/2025Feb/0024.html

This is untested, but it might at least serve as an inspiration how to avoid the unwanted tag in an automated manner.

Pull request #1839 created #created-1839

24 Feb at 17:09:40 GMT
Relax the return type of the Invisible XML parsing function

Fix #1796

This change does not appear to change any test results. (In other words, none of our tests checked that the return type was explicitly a document node.)

Pull request #1838 created #created-1838

24 Feb at 15:59:24 GMT
1820 Attempt to add change markup in collapsed ToC

Fix #1820

This PR updates the styling so that a small "Δ" is added to the expand arrow when there are changes or additions in the concealed subsections. It's smaller and not blue. I could argue that this is on purpose so that the marking is different and perhaps more subtle. But the truth is, it was just easier to add the Δ without any markup that would make it larger or blue.

I've opted to conceal the Δ when the ToC is "open" on the grounds that you can see what is or isn't marked new on the revealed subsections.

Issue #1827 closed #closed-1827

24 Feb at 11:24:50 GMT

XPath TOC: For and Let Expressions: whitespace

Issue #1831 closed #closed-1831

24 Feb at 11:24:49 GMT

1827 Fix excess whitespace in TOC

Pull request #1837 created #created-1837

24 Feb at 10:07:00 GMT
1834 Additional clarification on JSON lines

Fix #1834

Issue #1836 created #created-1836

24 Feb at 09:42:23 GMT
unparsed-text-lines() - line endings

The description of the unparsed-text-lines function contradicts itself regarding line endings.

First it says that the function is equivalent to calling unparsed-text() and applying tokenize(., '\n') to the result.

Then it says that it accepts x0A, x0D, or x0D0A as line endings.
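
The discrepancy can be illustrated with a small Python sketch. Python is used here only as an analogy, the helper names are hypothetical, and this is not the XPath definition of either function: splitting on "\n" alone leaves carriage returns behind, while accepting all three line endings does not.

```python
import re

def split_on_newline_only(text):
    # Mirrors the first description: tokenize on "\n" only.
    return text.split("\n")

def split_on_any_line_ending(text):
    # Mirrors the second description: x0D0A, x0D, or x0A each end a line.
    return re.split(r"\r\n|\r|\n", text)

sample = "a\r\nb\rc\nd"
print(split_on_newline_only(sample))     # ['a\r', 'b\rc', 'd']
print(split_on_any_line_ending(sample))  # ['a', 'b', 'c', 'd']
```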

Pull request #1835 created #created-1835

24 Feb at 08:15:41 GMT
add zero-width assertions to regular expressions

Proposal for issues !998 and !1006 to add zero-width assertions - lookahead, lookbehind, and word boundary.

Word boundaries use the already-defined \w and \W from XML Schema.

The syntax for lookahead and lookbehind assertions supports the two most common variants, one using < and > and the other using (*positive_lookahead:expr), which is at least amenable to Web searches, and doesn’t need escaping in XSLT or XQuery.

Note that word boundary < \b \B > assertions can be rewritten in terms of lookahead and lookbehind assertions.
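
As a rough illustration of that rewriting, the following Python sketch checks that a word boundary and its lookaround equivalent match at the same positions. It uses Python's own `re` syntax, which is an assumption made for the sake of a runnable example and is not the syntax proposed for XPath; the helper name is hypothetical.

```python
import re

# Python's own syntax for the assertions; the syntax proposed for XPath differs.
WORD_BOUNDARY = r"\b"
WORD_BOUNDARY_VIA_LOOKAROUND = r"(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))"

def match_positions(pattern, text):
    # Offsets of every zero-width match of the pattern in the text.
    return [m.start() for m in re.finditer(pattern, text)]

text = "copy regex, then test_it"
assert match_positions(WORD_BOUNDARY, text) == match_positions(WORD_BOUNDARY_VIA_LOOKAROUND, text)

# Lookahead and lookbehind constrain a match without consuming characters.
print(re.findall(r"\d+(?= GMT)", "16:10 GMT, 09:13 UTC"))  # ['10']
print(re.findall(r"(?<=#)\d+", "Fix #1820 and #1827"))     # ['1820', '1827']
```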

Perl has a more powerful form of \b and \B that can match grapheme clusters, the Unicode linebreaking algorithm, and more, but supporting that would require language and script based mechanisms; if the graphemes() function is added, it would be worth considering. For now, i made it an error to write \b{...} so that the support could be added later if wanted, and also so that copying regular expressions into XPath would raise an error for the unsupported feature.

I will reopen !998 - if this is accepted i can produce test cases. Of course, i’m also happy to edit/rewrite etc. The syntax is widely supported, although \K is i think not in libpcre (but, libpcre has looser restrictions on negative backward assertions).

Issue #1834 created #created-1834

23 Feb at 17:42:56 GMT
json-lines - refinement

Some suggestions regarding support for json-lines:

(a) The json-lines spec has no official standing. It might therefore be a good idea if we summarize its essentials, just in case it disappears off the web.
(b) The spec makes the final newline optional. Our test cases assume no final newline. We should probably mandate this for interoperability.
(c) We should tell people how to read files in json-lines format - specifically unparsed-text-lines() ! parse-json()
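
The reading pattern suggested in (c), together with the optional final newline from (b), might look like this in Python. This is only an analogy to unparsed-text-lines() ! parse-json(); the function name is hypothetical and the tolerance of a final newline is an assumption, not a decision of the group.

```python
import json
import re

def parse_json_lines(text):
    # Split on x0D0A, x0D, or x0A, then drop the empty string
    # produced by an optional final newline.
    lines = re.split(r"\r\n|\r|\n", text)
    if lines and lines[-1] == "":
        lines.pop()
    # Each remaining line is parsed as one independent JSON value.
    return [json.loads(line) for line in lines]

print(parse_json_lines('{"a": 1}\n[2, 3]\n'))  # [{'a': 1}, [2, 3]]
```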

Pull request #1833 created #created-1833

21 Feb at 09:32:04 GMT
583 Drop map:replace and array:replace

Fix #583

Issue #1832 created #created-1832

21 Feb at 00:22:03 GMT
Associativity of Operators, especially "||" (Appendix A.5)

The associativity of the || operator is given as "left-to-right" - it should surely be "either" (like comma, "or", and "union").

Other aspects of this table are questionable.

  • The operator ?[] for filtering a map or array should probably be included.
  • Arguably => and ? should be omitted because the RHS is not actually an expression, though it's true that if A => B => C is allowed, then it means (A => B) => C.
  • + and * are associative, it's only in conjunction with other operators that they aren't.
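
The distinction between "either" and "left-to-right" can be checked mechanically. The sketch below is a Python analogy (string concatenation with `+` stands in for `||`; the helper name is hypothetical): an operator can be listed as "either" only when both groupings give the same result.

```python
import operator

def groupings_agree(op, a, b, c):
    # True when (a op b) op c equals a op (b op c) for these operands.
    return op(op(a, b), c) == op(a, op(b, c))

print(groupings_agree(operator.add, "x", "y", "z"))  # True: concatenation
print(groupings_agree(operator.add, 1, 2, 3))        # True: addition on its own
print(groupings_agree(operator.sub, 10, 4, 3))       # False: subtraction
```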

Pull request #1831 created #created-1831

20 Feb at 21:10:38 GMT
1827 Fix excess whitespace in TOC

Fix #1827

Pull request #1830 created #created-1830

20 Feb at 12:41:08 GMT
1829 Reintroduce restrictions on RHS of `=>`

Partial reversion of PR #1763

Issue #1829 created #created-1829

20 Feb at 11:58:48 GMT
Problems with new arrow expression syntax

I'm hitting problems with implementing the changes in PR #1763

The problem is that the => can now be followed by either a static function call or a dynamic function call, and I think we need unbounded lookahead to distinguish them.

Consider

3 => function-lookup(xs:QName('fn:abs'), 1)()

at first sight the arrow appears to be followed by a static function call, function-lookup(xs:QName('fn:abs'), 1). But treating it as such causes a parsing error when we get to the () - what we actually have here is a dynamic function call that starts with a static function call.
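
A toy Python sketch of why the lookahead is unbounded: to decide which kind of call follows the arrow, a scanner has to skip an arbitrarily long balanced argument list before it can see whether another "(" follows. The function name and the string-based scanning are assumptions for illustration only; this is not how any real XPath parser is structured, and it handles only targets that begin with a function name.

```python
def classify_arrow_target(rhs):
    # Scan past the first balanced "( ... )" group, ignoring quoted strings,
    # then look at what follows it.
    depth = 0
    quote = None
    for index, char in enumerate(rhs):
        if quote:
            if char == quote:
                quote = None
            continue
        if char in "'\"":
            quote = char
        elif char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0:
                rest = rhs[index + 1:].lstrip()
                return "dynamic" if rest.startswith("(") else "static"
    raise ValueError("unbalanced parentheses")

print(classify_arrow_target("abs()"))                                      # static
print(classify_arrow_target("function-lookup(xs:QName('fn:abs'), 1)()"))  # dynamic
```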

I propose that we revert to allowing a dynamic function call only in the form

a => x ( argument-list )

where x is a variable reference, a parenthesized expression, an inline function expression, or a map or array constructor.

Pull request #1828 created #created-1828

20 Feb at 11:25:43 GMT
1815 Add more detail on annotations of function items

Fix #1815

Issue #1827 created #created-1827

20 Feb at 10:11:37 GMT
XPath TOC: For and Let Expressions: whitespace

The table of contents for XPath, section 4.12, "For and Let Expressions", contains spurious whitespace. The whitespace appears to be present in the HTML, but it is not there in the source XML. In the actual section heading, there are two <a> elements before and after the heading text, each having as content a single space character.

The problem is also there in the equivalent section heading "FLWOR Expressions" in the XQuery spec.

In the "xpath-assembled" document, the heading appears as

            <head>
               <phrase role="xpath">For and Let Expressions</phrase>
            </head>

It seems to be the phrase element that's causing the trouble: or more likely, the whitespace text nodes that surround it.
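
The effect of those whitespace text nodes can be reproduced with Python's standard XML library. This is a minimal sketch of the suspected cause, not a diagnosis of the actual stylesheet: the string value of the head element includes the whitespace around the phrase element.

```python
import xml.etree.ElementTree as ET

head = ET.fromstring(
    '<head>\n'
    '   <phrase role="xpath">For and Let Expressions</phrase>\n'
    '</head>'
)
# Concatenate every text node inside <head>, including whitespace-only ones.
heading_text = "".join(head.itertext())
print(repr(heading_text))          # '\n   For and Let Expressions\n'
print(repr(heading_text.strip()))  # 'For and Let Expressions'
```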

Pull request #1826 created #created-1826

20 Feb at 09:55:46 GMT
Fix grammar bug #1818

Fix #1818

Pull request #1825 created #created-1825

20 Feb at 09:01:45 GMT
1816 New function fn:partial-apply

Fix #1816

Pull request #1824 created #created-1824

19 Feb at 21:05:47 GMT
1823 Revise top-level headings in F+O spec

Revises the headings for consistency and brevity, to make the ToC easier to navigate at a glance

Fix #1823

Issue #1823 created #created-1823

19 Feb at 20:52:09 GMT
Clearer top-level section headings in F+O

The improved rendition of the table of contents makes it apparent that the top-level sections headings in F+O are inconsistent and unnecessarily verbose.

Issue #1821 closed #closed-1821

19 Feb at 17:53:45 GMT

Generated appendices in XDM

Issue #1822 closed #closed-1822

19 Feb at 17:53:44 GMT

1821 Fix the generated appendixes in the Data Model

Pull request #1822 created #created-1822

19 Feb at 17:25:14 GMT
1821 Fix the generated appendixes in the Data Model

Fix #1821

Issue #1821 created #created-1821

19 Feb at 09:21:59 GMT
Generated appendices in XDM

The last four appendices in XDM are stylesheet-generated, and their TOC entries are added "by hand", and as a result they are incorrectly rendered.

I suggest using the same process for these appendices as other specs use: they should have a skeletal presence in the XML master, with a processing instruction to direct the stylesheet to expand the content; there is then no need for special machinery in the stylesheet to generate the TOC.

Issue #1808 closed #closed-1808

18 Feb at 22:07:34 GMT

Add pipeline operator to list of tokens using '<' and '>' characters

Issue #1820 created #created-1820

18 Feb at 22:05:20 GMT
Delta markers in collapsed TOC

When the TOC is shown in collapsed mode, it would be nice to promote the Δ change markers to the level where they become visible.

Pull request #1819 created #created-1819

18 Feb at 21:23:05 GMT
451 Multiple schemas in XSLT

Fix #451

Issue #1818 created #created-1818

18 Feb at 20:10:11 GMT
Grammar problem introduced by #1802

The recent merge of #1802 has incorrectly changed production ArrowExpr from

ArrowExpr
          ::= UnaryExpr ( SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget )

to

ArrowExpr
          ::= UnaryExpr ( SequenceArrowTarget MappingArrowTarget LookupArrowTarget )*

It has changed a g:choice operator to a g:zeroOrMore operator, while it should have added a g:zeroOrMore around the g:choice.
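
The difference between the two productions can be shown with a regular-expression analogy in Python, using one letter per grammar symbol (U for UnaryExpr, and S, M, L for the three arrow targets). The encoding and names are assumptions made purely for illustration.

```python
import re

# One letter per symbol: U = UnaryExpr, S/M/L = the three arrow targets.
INTENDED = re.compile(r"U(?:S|M|L)*")  # zero or more of a choice
MERGED = re.compile(r"U(?:SML)*")      # zero or more of a fixed sequence

def accepts(grammar, tokens):
    # True when the whole token string is derivable from the production.
    return grammar.fullmatch(tokens) is not None

print(accepts(INTENDED, "US"))  # True
print(accepts(MERGED, "US"))    # False
print(accepts(MERGED, "USML"))  # True
```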

Issue #1716 closed #closed-1716

18 Feb at 18:37:42 GMT

Variable lookahead needed for `ArrowTarget`

Issue #1763 closed #closed-1763

18 Feb at 18:37:41 GMT

1716 Generalize syntax of arrow expressions

Issue #1789 closed #closed-1789

18 Feb at 17:32:55 GMT

Terminology: "singleton map"

Issue #1791 closed #closed-1791

18 Feb at 17:32:54 GMT

1789 Fix singleton terminology

QT4 CG meeting 110 draft minutes #minutes—02-18

18 Feb at 17:30:00 GMT

Draft minutes published.

Issue #1769 closed #closed-1769

18 Feb at 17:17:48 GMT

Add links from processing model diagrams

Issue #1788 closed #closed-1788

18 Feb at 17:14:42 GMT

Drop reference to maps being unordered

Issue #1790 closed #closed-1790

18 Feb at 17:14:41 GMT

1788 Replace statement that maps are unordered

Issue #1785 closed #closed-1785

18 Feb at 17:11:39 GMT

XQuery 4.0 grammar: `ArrowExpr` target, `ReverseAxis`

Issue #1802 closed #closed-1802

18 Feb at 17:11:38 GMT

1785 Fix two simple grammar bugs

Issue #1803 closed #closed-1803

18 Feb at 17:08:36 GMT

Drop "(Non-Normative)" from table of contents

Issue #1804 closed #closed-1804

18 Feb at 17:08:35 GMT

Drop "(Non-Normative)" from ToC

Issue #1805 closed #closed-1805

18 Feb at 17:05:35 GMT

Drop middle dots from term references in F&O

Issue #1806 closed #closed-1806

18 Feb at 17:05:33 GMT

1805 Drop middle dots from termref rendition in F+O

Issue #1807 closed #closed-1807

18 Feb at 17:02:40 GMT

Two exceptions or three?

Issue #1809 closed #closed-1809

18 Feb at 17:02:39 GMT

1807 Two exceptions to the rule, not three

Issue #1631 closed #closed-1631

18 Feb at 17:00:52 GMT

xsl:apply-templates (without select) should allow inline content

Issue #1810 closed #closed-1810

18 Feb at 16:59:15 GMT

1808 Add -> to list of tokens using lt and gt characters

Pull request #1817 created #created-1817

18 Feb at 15:42:57 GMT
1800 Methods

Fix #1800

Issue #1816 created #created-1816

18 Feb at 15:33:40 GMT
Programmatic partial application

We don't have a programmatic way of doing partial function application; in particular, it's very difficult to do a partial application supplying the first argument of a function item without knowing statically how many other arguments there are.

I suggest extending fn:apply so that the second argument can be a map with integer keys; it can supply some or all of the arguments to a function, and arguments that aren't supplied are retained in the returned partially-applied function item. For the example cited where only the first argument is to be supplied, supplying an array of length one would be equivalent.
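
The idea of supplying arguments by position and leaving the rest open can be sketched in Python. The name `partial_apply` and the use of a dict with 1-based integer keys are assumptions that mirror the suggestion above; this is not the specified behaviour of fn:apply.

```python
def partial_apply(function, supplied):
    # `supplied` maps 1-based argument positions to values.
    def applied(*remaining):
        total = len(supplied) + len(remaining)
        if any(p < 1 or p > total for p in supplied):
            raise TypeError("a supplied position is outside the argument list")
        rest = iter(remaining)
        # Fill each position from `supplied`, else from the later call.
        args = [supplied[p] if p in supplied else next(rest)
                for p in range(1, total + 1)]
        return function(*args)
    return applied

def describe(first, second, third):
    return f"{first}-{second}-{third}"

bind_first = partial_apply(describe, {1: "x"})
print(bind_first("y", "z"))                         # x-y-z
print(partial_apply(describe, {2: "b"})("a", "c"))  # a-b-c
```

Note that the caller of `partial_apply` never needs to know how many other arguments the function takes; the arity only matters when the returned function is finally called.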

Issue #1815 created #created-1815

18 Feb at 15:29:45 GMT
Function annotations on function items

We say very little about function annotations on function items.

For example,

  • we don't say that the function item constructed by a named function reference inherits the function annotations of the function declaration

  • we don't say whether a function item constructed by partial application (whether static or dynamic) has any function annotations, and if so what they are.

  • we don't mention them in function-lookup().

Pull request #1814 created #created-1814

17 Feb at 12:51:57 GMT
1813 Reorganise the XDM spec at top level

Fix #1813

The diff version is probably not too useful because a lot of material has moved around. Very little text has actually changed, and none of it substantively.

QT4 CG meeting 110 draft agenda #agenda-02-18

17 Feb at 11:30:00 GMT

Draft agenda published.

Issue #1813 created #created-1813

17 Feb at 11:15:48 GMT
Reorganise top-level sections in XDM

The improved presentation of the TOC for our specs makes it rather obvious that the structure of the XDM spec has become unbalanced. Most of the spec is about node trees; information about atomic values, maps, functions etc is hard to find, and sometimes appears in strange places such as "Terminology".

In addition, the way the spec is assembled from multiple entities serves little purpose. It makes it harder for editors to find the text that needs to be edited and to locate markup errors introduced in the course of editing.

Pull request #1812 created #created-1812

17 Feb at 10:51:31 GMT
1811 Add note regarding non-XML chars in xsl:output-character

Fix #1811

Issue #1811 created #created-1811

17 Feb at 10:12:07 GMT
Add note concerning non-XML characters in character maps

We have relaxed the rules for using non-XML characters in strings. It would be useful to explain how to take advantage of this in character maps.

Pull request #1810 created #created-1810

17 Feb at 09:33:53 GMT

1808 Add -> to list of tokens using lt and gt characters

Pull request #1809 created #created-1809

17 Feb at 09:23:09 GMT
1807 Two exceptions to the rule, not three

Fix #1807

Issue #1808 created #created-1808

16 Feb at 23:10:38 GMT
Add pipeline operator to list of tokens using '<' and '>' characters

Add the operator -> to the list of tokens in XPath A3.3.

Issue #1807 created #created-1807

16 Feb at 22:50:42 GMT
Two exceptions or three?

XPath 4.5.2.7 on function identity says "There are two exceptions to this rule:" and then lists three.

Pull request #1806 created #created-1806

16 Feb at 22:44:16 GMT
1805 Drop middle dots from termref rendition in F+O

Brings F+O into line with the other specs.

Fix #1805

Issue #1805 created #created-1805

16 Feb at 22:36:40 GMT
Drop middle dots from term references in F&O

The F&O spec renders termref links between middle dots. None of the other specs use this convention.

Pull request #1804 created #created-1804

16 Feb at 22:05:00 GMT
Drop "(Non-Normative)" from ToC

Fix #1803

Issue #1803 created #created-1803

16 Feb at 21:54:06 GMT
Drop "(Non-Normative)" from table of contents

Proposal - drop the phrase "(Non-Normative)" from section titles in the table of contents (but not in the body of the document).

In several of the specs this phrase disrupts the indentation and formatting of the ToC, and it adds very little value.

Pull request #1802 created #created-1802

16 Feb at 19:16:56 GMT
1785 Fix two simple grammar bugs

Fix #1785

Pull request #1801 created #created-1801

15 Feb at 23:08:30 GMT
1798 Function fn:function-identity

The function fn:function-identity as already described and discussed in #1798

Issue #1800 created #created-1800

14 Feb at 20:17:03 GMT
The `=?>` lookup arrow expression operator is weird, difficult to use, difficult to understand, difficult to read and unnatural

The XPath 4.0 language now includes a way for a function defined as a member of a map to easily access other members (siblings) that belong to the same map instance. Special syntax, the =?> operator, was introduced to call such a function. As a whole this is a huge step forward providing the user with a new, powerful mechanism to conveniently express relationships and calculations over several member-values of a map instance.

I am raising this issue with the goal of further improving and simplifying for the user the way to define and call a member function of a map/record, giving it a convenient way to access the values of other members of the instance of the map, on which the call has been issued.

In my work, I have been trying to define a number of functions that must belong to a map/record and that should be able to access other members of the same map/record to which these functions belong.

The experience was far from satisfying and here I describe the main problems I encountered when trying to use the =?> operator, and some obvious suggestions how we can further simplify the syntax for calling any member function of a map or record.

1. Problems trying to use the =?> operator

Here are the main problems I ran into.

Problem 1. The =?> operator was:

  • weird-looking;
  • difficult to use;
  • difficult to understand;
  • difficult to read;
  • feeling unnatural. It would be much better if we didn't have to use any special operator at all in order to call a member function "myFunction" of a map $m by simply: $m?myFunction(<tuple of any arguments defined in the signature of the function>)

Problem 2. There is no example, in the sections that describe the record type (3.2.8.3), showing a record member-function that accesses the values of other members of the same instance of the record. Thus, the new feature is effectively hidden for people who want to work with records. We need such an example for a record, so that we don't forget that any record is also a map and possesses all functionality a map has to offer. And a statement to this effect must be added to the description of records.

Problem 3. This syntax is overcomplicated and difficult to use and remember, resulting in unnecessarily long and complex expressions:

let $rectangle := {
  "width": 20,
  "height": 12,
  "area": fn($this) { $this?width * $this?height }
} 
return $rectangle =?> area()
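
For readers unfamiliar with the calling convention in this example, here is a rough Python analogy: the member function is looked up in the map, and the map itself is passed as the first argument. The helper name `call_method` is hypothetical and the sketch says nothing about how an XPath processor implements the operator.

```python
def call_method(mapping, name, *args):
    # Look up the member function, then pass the map itself as the first argument.
    return mapping[name](mapping, *args)

rectangle = {
    "width": 20,
    "height": 12,
    "area": lambda this: this["width"] * this["height"],
}
print(call_method(rectangle, "area"))  # 240
```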

It would be significantly better to use a much simplified syntax such as:

let $rectangle := {
  "width": 20,
  "height": 12,
  "area": fn() { ?width * ?height }
} 
return $rectangle ? area()

Recognizing that ?name is already used since XPath 3.1 as Unary Lookup Operator, and to avoid the unlikely case of collision, when a member function accesses other members of the map-owner-instance that happen to have identically the same names as expected constituents of the current context item (upon which the function is applied), we can introduce a special character to denote the current map-owner-instance, thus the above example could look like this:

let $rectangle := {
  "width": 20,
  "height": 12,
  "area": fn() { ^width * ^height }
} 
return $rectangle ? area()

Solutions

Solution for Problem 1 above (weirdness of the =?> operator):
Do not introduce any special operator. Just use ? to invoke the member-function.

Solution for Problem 2 above (lack of example of a record having a member-function that accesses other members of the same map-owner-instance). Obviously, provide such an example. Also reiterate there that all features and functionality of a map continue to be available for records.

Solution for Problem 3 above (overcomplicated syntax):

  • Get rid of the =?> operator. Use ? for all references to member-functions.
  • Don't use any special variable like $this. For example, the current example in the documentation: "area": fn($this) { $this?width * $this?height } should instead be: "area": fn() { ^width * ^height }
  • use the ^ character to denote owner-map-instance membership. Thus ^width means: "The member named "width" of the map instance upon which the current function was invoked"

Conclusion

I will issue a PR with the solutions, provided there are not any substantial comments highlighting problems with this proposal.

Issue #1799 created #created-1799

13 Feb at 20:43:29 GMT
"well-formed HTML document"?

There is an apparent ambiguity in the XPath Functions specification as to whether fn:parse-html raises dynamic error err:FODC0011 when html-version is set to one of the HTML5 versions and the content of $html is not well-formed, given the general expectation that HTML5 parsers can always parse an input string regardless of syntactic validity.

The HTML5 standard, for its part, has actually always allowed the parser to be aborted upon encountering a parse error (though no browser does this), so the function definition would seem to require that parse-html("<p>Hello</p>", { "method": "html", "html-version": 5 }) invariably raises an error, given that the input is invalid (missing opening <html> tag, etc.).

I don't think this is the intended behavior; my suggestion is to either have it be explicitly implementation-defined as to which parse errors cause err:FODC0011 to be raised, or require that it is never raised for HTML5.

Issue #1238 closed #closed-1238

12 Feb at 23:39:45 GMT

XSLT on-no-match="shallow-copy-all" - revised rules

Issue #1798 created #created-1798

12 Feb at 20:37:31 GMT
Getting the value of the new identity-(DM)property of a function. `fn:function-identity`

The current set of Functions on Functions (https://qt4cg.org/specifications/xpath-functions-40/Overview.html#functions-on-functions) was recently updated with a new function that returns all annotations of a given function: fn:function-annotations. However, we are still missing the ability to reference another important, newly added property of a function, the function identity, in both the DM and XPath.


fn:function-identity

Summary Returns the identity of the function item.

Signature

fn:function-identity(   $function as fn(*) ) as xs:string

Properties This function is ·deterministic·, ·context-independent·, and ·focus-independent·.

Notes

  1. This function can be useful in any scenario, where the evaluation of a function call requires repeated evaluation of this same, or other functions. Often in such cases, the algorithm needs access to a general structure, containing the cached results of executing possibly many different functions applied on specific arguments-tuples. What unique key is needed under which to group all invocations of a specific function and then the mapping between their function-call arguments and the result of the call? Remarkably, the identity of a function fits exactly the requirements (uniqueness / one per function) for such a key. As we already have a function-identity property in the Data Model for each function-item, it is straightforward to provide this identity, and the fn:function-identity does exactly that.
  2. The function identity, by definition, is generated upon the creation of a function and has meaning throughout the life of that function. It is not meaningful to store this value across different executions, because the identity given to a function in execution1 will generally be different from the identity given to it in execution2. However, the definitions of system functions (functions defined in the specifications under the system namespaces - with standard prefixes: xs, fn, map, array, math, err, output) can be assigned permanent identities in the official Specs documents, requiring every implementation to use exactly this published identity value for the official, system functions, thus achieving efficiency and convenience during debugging.
  3. The function identity, being a string, can be used as a key in a map, thus making it possible to map a particular function to a sequence of items. It becomes possible to allow function items as map-keys by extending the definition of same-keys with: "If both keys are function items: $f1 and $f2, then they are the same if and only if: function-identity($f1) eq function-identity($f2)".
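
The caching scenario in note 1 can be sketched in Python, where the built-in `id()` plays a role loosely comparable to the proposed function identity: it is unique per live function object but, as in note 2, not stable across separate executions. The names `cached_call` and `_results` are hypothetical.

```python
_results = {}

def cached_call(function, *args):
    # Group cached results under the identity of the function,
    # then under the tuple of arguments it was called with.
    key = (id(function), args)
    if key not in _results:
        _results[key] = function(*args)
    return _results[key]

calls = []

def square(n):
    calls.append(n)  # record each real evaluation
    return n * n

print(cached_call(square, 4), cached_call(square, 4))  # 16 16
print(len(calls))                                      # 1
```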

Issue #1797 created #created-1797

12 Feb at 16:20:09 GMT
elements-to-maps: separate function to construct a plan

I propose separating out the uniform=true option of elements-to-maps() into a separate function. This function analyses the data and produces a conversion plan, which can be supplied to the "layouts" option (perhaps renamed) of the main function.

The benefits are:

  • The plan can be tweaked after it is created by manual adjustment, for example if the user wants to use "empty-plus" layout wherever the system's choice would be "empty", or to take account of anticipated future changes in the structure.
  • The plan can be used to process documents that did not exist at the time it was created, thus ensuring that future documents are all converted in the same way, and avoiding the overhead of rebuilding the plan each time.
  • The plan can be created from a small sample of the documents to be converted.
  • The plan can be created from a large collection of documents including documents that don't need to be converted but which may contain structural elements that are not revealed by the documents that need converting now.
  • The user can examine the plan to see what it is doing, which is useful for diagnostics.

We should define the format of the plan (a map from element names to layouts) so that it can conveniently be serialized as a JSON document.

Issue #1796 created #created-1796

12 Feb at 14:46:44 GMT
Allow fn:invisible-xml to return a function that returns an item()

Our current fn:invisible-xml function returns a document node. That makes perfect sense when held up against the Invisible XML specification. But I wonder if we should leave the door open to some extensibility. I can imagine, for example, an implementation of an Invisible XML processor that has the ability to return a map or even a CSV structure instead of XML. (XML is the required, standard result in 1.0 but implementors have been known to offer user options to produce other serializations and one area of potential change in the future is other serialization formats.)

Pro: more extensible. Con: less type information about the result.

Issue #1795 created #created-1795

12 Feb at 12:51:15 GMT
XSLT templates: Matching values in a map by key

The simplest coding pattern for template rule processing for JSON structures would be to take a structure like this:

[
   {"name": "John", "address": { .... }, "job-history": [ { .... }, {....} ]},
   {"name": "Jane", "address": { .... }, "job-history": [ { .... }, {....} ]}
]

and to process it using template rules of the form:

<xsl:template match="record(name, address, job-history)">
    <xsl:apply-templates select="?*"/>
</xsl:template>

<xsl:template match="(pattern matching key 'name')">...</xsl:template>

<xsl:template match="(pattern matching key 'address')">...</xsl:template>

<xsl:template match="(pattern matching key 'job-history')">...</xsl:template>

We have nearly all the ingredients in place for this. In particular, we can ensure that the select="?*" selects values that are labelled with the relevant key, making it technically possible to match values according to that key: select="?*" might select an xs:string value "John", but the string is labelled with the property key="name", so it can in principle match a template rule designed to process the "name" value.

The only piece that's missing is how to write the match patterns. We can write match=".[label()?key = 'name']", but that's hopelessly long-winded.

I propose that we use the syntax match="?name" to match a value that is labelled with the key "name". This feels intuitive and natural, and 99% of users won't trouble with the complex underlying semantics.

We can extend this by borrowing other parts of the Lookup expression syntax, for example match="?('X', 'Y', 'Z')" to match several keys.

I would also suggest promoting the operators "union", "intersect" and "except" so they can be used to combine any patterns (not just node patterns) so this could be written match="?X | ?Y | ?Z", or we could write match="?* except ?X". But note that this would create an expectation that users can also write select="?* except ?X" in an XPath expression; and that's quite hard to achieve: see also #1794.

Issue #1794 created #created-1794

12 Feb at 12:03:14 GMT
Lookup: select all except

In lookup expressions we have ?* to select all entries, and ?X to select a specific entry. There is frequently a requirement to select all entries with specific exceptions.

One way of doing this is $map => map:remove('X')?*

Another is to do $map?pairs::*[?key != 'X']?value

A third option is $map?[?key != 'X']?*

Or $map => map:filter(fn($k, $v){$k != 'X'})?*

Or for key $k value $v in $map where $k != 'X' return $v

None of these feels particularly user-friendly.

A possible syntax might be $map?-X or more generally "?" "-" KeySpecifier to select all entries that are not selected by the KeySpecifier. For example this would allow $map?-('X', 'Y') to exclude X and Y.
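
The intended result of such a lookup can be stated compactly as a Python analogy. The function name is hypothetical, and the sketch only models the selection over a flat map, not the proposed syntax.

```python
def values_except(mapping, *excluded_keys):
    # All values whose key is not in the excluded set, in map order.
    return [value for key, value in mapping.items() if key not in excluded_keys]

data = {"X": 1, "Y": 2, "Z": 3}
print(values_except(data, "X"))       # [2, 3]
print(values_except(data, "X", "Y"))  # [3]
```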

Issue #1782 closed #closed-1782

12 Feb at 11:42:20 GMT

1776 Add lookup patterns using ? and ??

Issue #1781 closed #closed-1781

11 Feb at 21:52:25 GMT

XSLT: drop section 23 (Processing JSON Data) and Appendix B

Issue #1792 closed #closed-1792

11 Feb at 17:34:20 GMT

Schema validation errors on function catalog for EXPath binary spec

Issue #1793 closed #closed-1793

11 Feb at 17:34:19 GMT

1792 Make function-catalog file schema-valid

Pull request #1793 created #created-1793

11 Feb at 17:34:06 GMT
1792 Make function-catalog file schema-valid

Fix #1792

Issue #1792 created #created-1792

11 Feb at 17:29:35 GMT
Schema validation errors on function catalog for EXPath binary spec

I'm seeing schema validation errors after rebasing, it looks like PR #1765 introduced lines like

<fos:changes issue="1751">

when the @issue attribute should be on the child fos:change element.

Perhaps the validation done by the build has improved.

I'll fix this in a separate PR to be emergency-applied.

QT4 CG meeting 109 draft minutes #minutes—02-11

11 Feb at 17:10:00 GMT

Draft minutes published.

Issue #1779 closed #closed-1779

11 Feb at 17:04:38 GMT

XPath 4.0 EBNF grammar

Issue #1783 closed #closed-1783

11 Feb at 17:04:37 GMT

1779 Make CharRef XQuery-only

Issue #1752 closed #closed-1752

11 Feb at 17:02:34 GMT

Return type of fn:partition()

Issue #1761 closed #closed-1761

11 Feb at 17:02:33 GMT

1752 Correct return type of fn:partition()

Issue #1751 closed #closed-1751

11 Feb at 17:00:32 GMT

bin:encode-string - should the result have a BOM?

Issue #1765 closed #closed-1765

11 Feb at 17:00:31 GMT

1751 Clarify BOM handling

Issue #1770 closed #closed-1770

11 Feb at 16:58:28 GMT

Union patterns in XSLT

Issue #1772 closed #closed-1772

11 Feb at 16:58:27 GMT

1770 Default priority of rules with a union pattern

Issue #402 closed #closed-402

11 Feb at 16:56:09 GMT

XSLT patterns: intersect and except

Issue #1773 closed #closed-1773

11 Feb at 16:56:08 GMT

402 Change the semantics of intersect and except in patterns

Issue #1784 closed #closed-1784

11 Feb at 16:53:58 GMT

1781 Drop obsolete material from XSLT spec

Issue #755 closed #closed-755

11 Feb at 16:51:30 GMT

with expression; chaining and concatenation

Issue #877 closed #closed-877

11 Feb at 16:51:22 GMT

Inconsistency in XQFO comparator functions/operators with recursive rules

Issue #1729 closed #closed-1729

11 Feb at 16:50:42 GMT

Grammar problems introduced by #1721

Issue #1767 closed #closed-1767

11 Feb at 16:50:41 GMT

1729/1737 Fix grammar for "declare record"

Pull request #1791 created #created-1791

11 Feb at 11:12:21 GMT
1789 Fix singleton terminology

Replaces "singleton map" with "single-entry map" and "singleton array" with "single-member array"; the term "singleton" now always means count()=1, not size()=1.

Fix #1789

Pull request #1790 created #created-1790

11 Feb at 10:04:33 GMT
1788 Replace statement that maps are unordered

Fix #1788

Issue #1789 created #created-1789

11 Feb at 09:00:35 GMT
Terminology: "singleton map"

We often use the term "singleton map" to mean a map containing a single entry (key-value pair).

But in XQ 4.14.3.1 we use the same term to mean "a sequence containing a single map".

Issue #1788 created #created-1788

10 Feb at 14:50:28 GMT
Drop reference to maps being unordered

In F&O 17.5.1.5 elements-to-maps record layout, mapping rules, delete

Because the child elements are converted to a map, their order is not retained.

Substitute a rule that the entries in the map will correspond with "order of first appearance".

QT4 CG meeting 109 draft agenda #agenda-02-11

10 Feb at 14:00:00 GMT

Draft agenda published.

Issue #1787 created #created-1787

10 Feb at 10:47:24 GMT
Sorted maps revisited

Now that we have ordered maps established, I'd like to make another attempt to introduce sorted maps - that is, maps whose ordering is by key value. The entries in such a map would be sorted by key, but there's no attempt to maintain sort order in subsequent put() operations.

We introduce map:sort($m) essentially as a convenient shorthand for map:of-pairs(sort(map:pairs($m), fn{?key})).

And then we introduce something like map:get-range($from, $to) which returns the keys (or pairs, or entries) whose keys are in a given range -- which the implementation can optimize if it knows the map has been sorted.
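
A Python sketch of the two operations, under stated assumptions: `sort_map` and `get_range` are hypothetical names, the range is taken to be inclusive at both ends, and binary search stands in for the optimization an implementation could apply once it knows the map is sorted.

```python
from bisect import bisect_left, bisect_right

def sort_map(mapping):
    # Rebuild the map with entries ordered by key; later insertions
    # are not kept in order, matching the proposal.
    return dict(sorted(mapping.items()))

def get_range(sorted_map, low, high):
    # Binary search is valid only because the keys are already sorted.
    keys = list(sorted_map)
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return {key: sorted_map[key] for key in keys[start:end]}

prices = sort_map({"pear": 3, "apple": 1, "fig": 5, "kiwi": 2})
print(list(prices))                   # ['apple', 'fig', 'kiwi', 'pear']
print(get_range(prices, "b", "kiwi"))  # {'fig': 5, 'kiwi': 2}
```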

Issue #1786 created #created-1786

09 Feb at 09:38:14 GMT
A case study for XSLT transformation of JSON: the transpiler

One of the design aims of XSLT 4.0 is that it should be easier to transform JSON. Back in 2016 I published a paper at XML Prague (https://www.saxonica.com/papers/xmlprague-2016mhk.pdf) with the rather disappointing result that for a couple of non-trivial JSON transformation tasks, the easiest solution was to convert the JSON to XML, transform the XML, and then convert it back. In many ways it was that discovery that motivated the whole XSLT 4.0 project. So I want to review to what extent we have solved that problem, and what remains to be done. In particular, I have recently raised a number of open issues related to how we transform JSON-derived trees of maps and arrays using template rules, and I'm not sure we can resolve those issues without testing the proposals against real use cases.

I'm proposing to take as a case study the Java-to-C# transpiler which we described in a 2021 paper at https://www.saxonica.com/papers/markupuk-2021mhk.pdf. This is a real XSLT application in daily use. It invokes the (open source) JavaParser to emit an XML representation of Java source code, it performs various transformations of that XML, and then finally spits out equivalent C# source code. My basic question is: suppose the JavaParser had chosen to emit JSON instead of XML (as it might perfectly reasonably have chosen to do). Would we be able to write the transpiler in XSLT 4.0 to work entirely within the JSON space, avoiding all use of XML?

I chose this case study for several reasons:

  • It's entirely plausible that the input might have been JSON rather than XML
  • The application relies very heavily (and successfully) on rule-based processing: if we didn't have template rules, then it would be dominated by large xsl:choose statements with hundreds of branches.
  • At around 5000 lines of XSLT, it's large enough to be non-trivial, yet small enough to be tractable as a case study.

I looked at a couple of other candidates, and found they were things that could be readily done in XSLT 3.0 without any enhancements. For example we have production XSLT 3.0 code that takes a JSON data feed from our online shop at saxonica.com and uses it to update our sales database and to generate license keys. The JSON is voluminous but the structure is simple, and the constructs in XSLT 3.0 for handling maps and arrays are entirely up to the job. The transpiler differs in that the JSON has a much more interesting recursive structure, making rule-based transformation a natural fit to the task.

I'm not proposing to actually produce a complete replacement of the current transpiler, only to explore the task of doing so in enough detail to get some useful insights. I propose to use this issue tracker to capture my working notes as the study proceeds, but if there are recommendations affecting the 4.0 specs (as seems likely), then I will extract those into separate issues. Perhaps at the end of the process I will write up the case study as a conference paper.

My rough plan is as follows:

  1. Explore conversion of the current XML output by JavaParser to JSON using the new elements-to-maps() function. We have a number of open issues on the usability of this function and it will be interesting to see whether we encounter similar difficulties to those that have already been raised, and whether the suggested solutions are appropriate.
  2. Convert the xml-to-java stylesheet to work on this JSON input. This stylesheet is not actually a working part of the transpiler, rather it's something we built as a stepping stone; before attempting to convert the XML syntax tree to C#, we felt it would be instructive to write code that converted it back to Java. This is an 820-line stylesheet and it should be feasible to convert it completely.
  3. The transpiler currently produces, as an intermediate output, a "digest" file containing summary information about all the classes and methods found in the Java code, and their subtyping/override relationships. We then have a process that augments this digest with attributes that are needed by the C# generation, for example which methods to label with "virtual" or "override" modifiers. I propose to experiment with producing (and transforming) this digest in JSON rather than XML format.
  4. Examine the XSLT code that generates C# output to look for features that appear to be tricky to convert, for example anything that uses the parent or ancestor axis, and study to what extent we now have the capability in XSLT 4.0 to handle those situations.

Using this format (a GitHub issue) to record progress carries a risk that there will be comments that take things off at a tangent. Please help by resisting that temptation: if there are interesting issues raised in your mind, please take those up as separate issues.

Issue #1785 created #created-1785

08 Feb at 08:57:57 GMT
XQuery 4.0 grammar: `ArrowExpr` target, `ReverseAxis`

While testing the parser generated from the specification grammar, I encountered two issues.

1. ArrowExpr target must be optional

The current definition in the specification is as follows:

ArrowExpr ::= UnaryExpr (SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget)

However, the target part must at least be optional, or better zero-or-more:

ArrowExpr ::= UnaryExpr (SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget)*

Otherwise arrow targets are expected almost everywhere. Making it zero-or-more allows parsing of

a => b() => c()

which would not be possible without extra parentheses if it was optional.
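A toy parser makes the difference concrete. This is a Python sketch under heavy simplification: it handles only the => arrow, and names with an optional empty argument list stand in for UnaryExpr and the arrow targets. The names tokenize and parse_arrow_expr are hypothetical.

```python
import re

# A token is either "=>" or a name optionally followed by "()".
TOKEN = re.compile(r"\s*(=>|[A-Za-z_][\w-]*(?:\(\))?)")


def tokenize(text):
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_arrow_expr(text):
    """ArrowExpr ::= Operand ("=>" Target)*   (zero or more targets)."""
    tokens = tokenize(text)
    if not tokens or tokens[0] == "=>":
        raise SyntaxError("expected an operand")
    tree = tokens[0]
    i = 1
    # The loop is the "*" in the production: it may run zero times.
    while i < len(tokens):
        if tokens[i] != "=>" or i + 1 >= len(tokens) or tokens[i + 1] == "=>":
            raise SyntaxError("expected '=>' followed by a target")
        tree = ("=>", tree, tokens[i + 1])
        i += 2
    return tree


print(parse_arrow_expr("a"))                # a
print(parse_arrow_expr("a => b() => c()"))  # ('=>', ('=>', 'a', 'b()'), 'c()')
```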

2. Missing preceding-sibling in ReverseAxis

The ReverseAxis production currently appears as:

ReverseAxis ::= ( "ancestor"
                | "ancestor-or-self"
                | "parent"
                | "preceding"
                | "preceding-or-self"
                | "preceding-sibling-or-self" ) "::"

It is missing the preceding-sibling axis.

Pull request #1784 created #created-1784

07 Feb at 21:45:41 GMT
1781 Drop obsolete material from XSLT spec

Drops material mainly deriving from when XSLT 3.0 had to work with both XPath 3.0 and 3.1. Includes non-normative exposition and some obsolete conformance statements.

Pull request #1783 created #created-1783

07 Feb at 21:03:24 GMT
1779 Make CharRef XQuery-only

Fix #1779

Makes the CharRef token XQuery-only.

Pull request #1782 created #created-1782

07 Feb at 12:46:20 GMT
1776 Add lookup patterns using ? and ??

Fix #1776

Issue #1026 closed #closed-1026

07 Feb at 12:44:03 GMT

XSLT match patterns on pinned maps and arrays

Issue #1781 created #created-1781

07 Feb at 12:15:33 GMT
XSLT: drop section 23 (Processing JSON Data) and Appendix B

Section 23 Processing JSON data at one time contained the specification of maps, before this moved into XPath 3.1. This has now gone, and what's left is pretty much content-free.

Appendix B contains a stylesheet for converting XML to JSON. It has some educational value, but not much, and I think it can go.

Issue #1780 created #created-1780

07 Feb at 11:49:43 GMT
xsl:for-each optional variable introduction

I spend quite a lot of time writing

<xsl:for-each select="foo">
   <xsl:variable name="foo" select="." as="element(foo)"/>
   <xsl:for-each select="$foo/bar">
      <xsl:variable name="bar" select="." as="element(bar)"/>
      .... do some stuff with $foo and $bar
   </xsl:for-each>
</xsl:for-each>

I'd prefer to go (much like XQuery)

<xsl:for-each name="foo" as="element(foo)" select="foo">
   <xsl:for-each name="bar" as="element(bar)" select="$foo/bar">
      .... do some stuff with $foo and $bar
   </xsl:for-each>
</xsl:for-each>

Issue #1779 created #created-1779

07 Feb at 11:44:47 GMT
XPath 4.0 EBNF grammar

The grammar extraction and transformation in RExify XQuery 4.0 grammar has been extended to cover the XPath 4.0 specification document, resulting in an LALR(1) grammar for XPath 4.0 that is suitable for REx.

This update revealed a minor issue: section A.3.1 Terminal Symbols lists CharRef, which is unreferenced:

CharRef       ::= [http://www.w3.org/TR/REC-xml#NT-CharRef]
                                                         /* xgc: xml-version */

Currently, the transformation process includes a rule that removes this production. The rules will be adjusted as the grammar evolves.

Pull request #1778 created #created-1778

07 Feb at 10:31:36 GMT
1456 Lookup expressions filtered by type

Fix #1456

Allows selection of records by type within a JSON tree, for example $json ?? ~record(first, last) ? last.

I'm aware that the use of the tilde here is controversial but I think this kind of query is going to be very common; it needs something simple and I think people will get used to it. No-one has suggested anything that is obviously better, and I propose to also use ~ in other similar contexts, for example type patterns in XSLT, which will increase familiarity.

I suggest reading ~ as "of type".
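For readers who think in other languages, the example query can be approximated in Python. This is a sketch under assumptions: the record test is read as "a dict with exactly the keys first and last", and the helper names deep_items and last_names are invented for the illustration.

```python
def deep_items(value):
    """Yield value and every value nested inside it (dicts and lists), depth-first."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from deep_items(child)
    elif isinstance(value, list):
        for child in value:
            yield from deep_items(child)


def last_names(json_value):
    """Rough analogue of: $json ?? ~record(first, last) ? last"""
    return [
        item["last"]
        for item in deep_items(json_value)
        if isinstance(item, dict) and set(item) == {"first", "last"}
    ]


data = {
    "authors": [
        {"first": "Ada", "last": "Lovelace"},
        {"first": "Alan", "last": "Turing"},
    ],
    "title": "Notes",
}
print(last_names(data))  # ['Lovelace', 'Turing']
```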

Issue #1777 created #created-1777

06 Feb at 18:35:20 GMT
Shallow copy in XSLT with maps and arrays

Currently the xsl:copy instruction, if applied to a map or array, does a deep copy, and ignores the content of the contained sequence constructor.

I propose that if the contained sequence constructor is non-empty then instead of ignoring it, we should use it to create the content of the new map or array. Specifically, for maps xsl:copy will behave essentially like xsl:map, and for arrays it will behave essentially like xsl:array.

This is an incompatibility with 3.1, but since a contained sequence constructor is currently totally useless in this situation, it doesn't seem likely to cause any trouble.

I also propose that rather than using the new built-in on-no-match="shallow-copy-all", we should extend the semantics of shallow-copy to cover maps and arrays (as currently defined for shallow-copy-all). Again, there is an incompatibility, but the current rules are so unhelpful that it's unlikely people are relying on them.

I also propose that when apply-templates is applied to a map or array, it should be automatically pinned if it is not pinned already. This means that match patterns can be used with a lot more context to match the deep contents of the map or array and override the processing of the built-in templates.

And I propose that when apply-templates is applied to a map or array and there is no select attribute, it should "do the right thing" by applying templates to the map or array contents, rather than using the useless default of child::node().

Issue #1776 created #created-1776

06 Feb at 18:24:49 GMT
Using `?` and `??` in XSLT patterns

I propose that the pattern P1 ? P2, where P1 and P2 are patterns, should match any labelled item $L provided that $L matches P2, and $L?.. (that is, ($L => label())?parent ) matches P1.

Similarly, the pattern P1 ?? P2, where P1 and P2 are patterns, should match any labelled item $L provided that $L matches P2, and $L?... (that is, ($L => label())?ancestors() ) matches P1.

Note that neither the syntax nor the semantics are directly related to the lookup operator in XPath. In particular, P2 is a pattern, not a KeySpecifier. But there is a strong analogy, both with the use of ? and ?? in XPath expressions, and with the use of / and // in patterns.

Issue #1775 created #created-1775

06 Feb at 18:11:55 GMT
Navigation in JSON trees

I propose that the parse-json function should create a pinned tree, so that upwards navigation to parent and ancestor j-nodes becomes possible.

I propose introducing the key specifier .., with $M?.. being a shorthand for ($M => label())?parent, giving a convenient and familiar way to navigate from a j-node to its parent in a pinned tree. For example, $M?..?name gives the value of the name property in the immediately containing map.

Similarly, I propose introducing the key specifier ... to navigate to ancestors, so $M?... becomes a shorthand for ($M => label())?ancestors(), and $M?...?name returns the name property of all containing maps.

For symmetry I suggest we also provide ... as an abbreviated axis step, short for ancestor::node().

I'd like to find a better name for "pinned". Perhaps "tracked" better captures that what it does is to track downward navigation steps and make them reversible.

I'd also like to introduce the terms j-tree and j-node. A j-tree is a map or array, recursively expanded to include its entries or members. A j-node is a value in a j-tree. Perhaps confine the usage to maps and arrays that have been pinned/tracked.
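The idea of tracking downward steps so that they become reversible can be sketched in Python. The class JNode and its methods are hypothetical names, and returning ancestors nearest-first is an assumption; none of this is the proposed XDM model.

```python
class JNode:
    """A value in a tree of dicts/lists that remembers how it was reached."""

    def __init__(self, value, parent=None, key=None):
        self.value = value
        self.parent = parent  # analogue of $M?..
        self.key = key        # key or index used to reach this node

    def get(self, key):
        """Downward step that records the parent, so it can be reversed later."""
        return JNode(self.value[key], parent=self, key=key)

    def ancestors(self):
        """Analogue of $M?... : the containing nodes, nearest first."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


root = JNode({"name": "library", "books": [{"name": "XSLT", "pages": 900}]})
pages = root.get("books").get(0).get("pages")

# Like $M?..?name: the name property of the immediately containing map.
print(pages.parent.value["name"])  # XSLT

# Like $M?...?name: the name property of every containing map.
print([a.value["name"] for a in pages.ancestors() if isinstance(a.value, dict)])
# ['XSLT', 'library']
```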

Issue #1774 created #created-1774

06 Feb at 17:55:40 GMT
Nomenclature: relabelling

The term relabelling - used when we are down-casting, for example when an xs:integer is supplied and the required type is xs:unsignedByte - is easily confused with the concept of a label, being a set of properties that can be associated with any item in XDM 4.0, and which is accessible through the fn:label function.

I suggest we rename relabelling as rebadging.

Apart from anything else, this has the virtue that my spell-checker won't auto-correct it...

Issue #1713 closed #closed-1713

06 Feb at 12:18:29 GMT

Patchy exposition of XSLT type pattern syntax

Pull request #1773 created #created-1773

05 Feb at 17:49:55 GMT
402 Change the semantics of intersect and except in patterns

Fixes a bug in the 3.0 spec whereby the intersect and except operators in a pattern have counter-intuitive semantics.

Fix #402

Pull request #1772 created #created-1772

05 Feb at 16:50:06 GMT
1770 Default priority of rules with a union pattern

Scraps the increasingly-complicated rules for handling priority of rules with a union pattern.

Fix #1770

Issue #1771 created #created-1771

05 Feb at 16:09:29 GMT
fn:deep-equal: map order

It may not come as a big surprise: A first feature request we received for ordered maps was to be able to take the order into account when comparing maps.

I would propose to add an ordered-map option to fn:deep-equal, which defaults to false:

(: returns false :)
deep-equal(
  { 1: 'one', 2: 'two' },
  { 2: 'two', 1: 'one' },
  { 'ordered-map': true() }
)

It should be simple to use and easy to implement.
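The intended semantics can be mimicked in Python, where dict equality ignores entry order. This is a sketch only: the function name deep_equal_maps is hypothetical and the order check is applied to the top level alone, not recursively.

```python
def deep_equal_maps(a, b, ordered_map=False):
    """Compare two dicts; when ordered_map is true, entry order must match too."""
    if a != b:  # same keys and values, regardless of order
        return False
    if ordered_map:
        return list(a) == list(b)  # key sequences must be identical
    return True


left = {1: "one", 2: "two"}
right = {2: "two", 1: "one"}
print(deep_equal_maps(left, right))                    # True
print(deep_equal_maps(left, right, ordered_map=True))  # False
```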

Issue #1770 created #created-1770

05 Feb at 16:07:01 GMT
Union patterns in XSLT

The original XSLT 1.0 rule for union patterns such as match="A|B" said that the default priority was calculated as if there were two separate template rules with match="A" and match="B". This became more complicated with the introduction of xsl:next-match in XSLT 2.0 - what should happen if the item matches both branches? It became more complicated again in XSLT 3.0 with the introduction of on-multiple-match - is it a multiple match if an item matches both branches? And in 4.0 it's complicated further by the introduction of constructs like match="element(A|B)" which is deemed equivalent to match="A|B".

I would like to break this cycle with a backwards-incompatible change. The default priority of a union pattern should be the numeric maximum of the default priorities of its branches; the treatment as being somewhat-equivalent to two separate template rules should go. We should encourage implementations to issue a compatibility warning if a union pattern appears with no explicit priority, and with multiple branches having different default priority.
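The proposed rule is small enough to state as code. In this Python sketch the function names are hypothetical, and the branch priorities are supplied directly rather than computed from patterns.

```python
def union_default_priority(branch_priorities):
    """Default priority of a union pattern: the numeric maximum of its branches."""
    return max(branch_priorities)


def needs_compatibility_warning(branch_priorities, explicit_priority=None):
    """Warn when no explicit priority is given and the branches disagree."""
    return explicit_priority is None and len(set(branch_priorities)) > 1


# match="A | *": a name test has default priority 0, the wildcard -0.5.
branches = [0, -0.5]
print(union_default_priority(branches))           # 0
print(needs_compatibility_warning(branches))      # True
print(needs_compatibility_warning(branches, 1))   # False
```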

Pull request #1769 created #created-1769

05 Feb at 15:56:36 GMT
Add links from processing model diagrams

Completes action QT4CG-108-02

I’ve added link targets where necessary. I didn’t try to link closer than that paragraph level, partly because I think that’s the context the reader needs, but also partly because we don’t copy ID values from all elements.

There’s no definition of DM4.

(Review of the link targets and comments on what (if anything) the remaining boxes and labels should link to most appreciated.)

Issue #1768 closed #closed-1768

05 Feb at 15:21:42 GMT

Inline SVG images

Pull request #1768 created #created-1768

05 Feb at 15:21:31 GMT
Inline SVG images

In order for links to work in the browser, the SVG has to be inline, not loaded from a separate file. For self-document links, I guess this makes sense.

This is a tools-only change.

Pull request #1767 created #created-1767

05 Feb at 15:20:26 GMT
1729/1737 Fix grammar for "declare record"

Fix #1729

  • The syntax should be "declare record", not "declare type record".
  • All the declarations using annotations should allow multiple annotations.
  • Added a note about refactoring the grammar to avoid unbounded lookahead.

Pull request #1766 created #created-1766

05 Feb at 12:51:19 GMT
1715 Drop array bound checking

Fix #1715

Drops array bound checking from array:get, arrays-as-functions, and array lookup. Returns () instead of an error FOAY0001 when the index is out of bounds. This brings arrays and maps into closer alignment.

Drops the $fallback argument of array:get()

Adds a new function array:get-if-present() which replicates the old behaviour of array:get().

Functions such as array:put, array:replace, array:insert-before, array:head, array:tail continue to perform bound checking.
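The split between lenient lookup and strict update can be sketched in Python. Assumptions: a Python list stands in for an array with single-item members, a list result stands in for a sequence, and the function names are illustrative.

```python
def array_get(array, index):
    """1-based lookup; an out-of-bounds index yields the empty sequence."""
    if 1 <= index <= len(array):
        return [array[index - 1]]
    return []


def array_put(array, index, value):
    """1-based replacement; still bound-checked, as for array:put."""
    if not 1 <= index <= len(array):
        raise IndexError("FOAY0001: array index out of bounds")
    return array[:index - 1] + [value] + array[index:]


letters = ["a", "b"]
print(array_get(letters, 2))       # ['b']
print(array_get(letters, 3))       # []
print(array_put(letters, 1, "z"))  # ['z', 'b']
```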

Issue #1738 closed #closed-1738

05 Feb at 11:41:04 GMT

Formatting of lists within notes

Pull request #1765 created #created-1765

05 Feb at 11:36:04 GMT
1751 Clarify BOM handling

Fix #1751

Clarifies BOM handling (and byte order generally) in bin:encode-string and bin:decode-string.

Also adds a note to bin:octal for the prevention of possible misunderstanding.

Issue #1758 closed #closed-1758

05 Feb at 09:43:29 GMT

EXPath specification validation problems

Issue #1759 closed #closed-1759

05 Feb at 09:43:28 GMT

Fix validation issues in the EXPath module function catalogs

Issue #1739 closed #closed-1739

04 Feb at 23:50:42 GMT

Obsolete references to ordering mode

Issue #1741 closed #closed-1741

04 Feb at 23:50:41 GMT

1739 drop references to ordering mode in the static context

QT4 CG meeting 108 draft minutes #minutes—02-04

04 Feb at 17:30:00 GMT

Draft minutes published.

Issue #1757 closed #closed-1757

04 Feb at 17:10:10 GMT

Build cleanup: remove the "by hand" diffs

Issue #1760 closed #closed-1760

04 Feb at 17:10:09 GMT

Remove hand-generated diffs from the builds

Issue #1743 closed #closed-1743

04 Feb at 17:07:56 GMT

1738 Formatting of Notes in F&O

Issue #1733 closed #closed-1733

04 Feb at 17:05:43 GMT

ACTION QT4CG-088-04, reworking the processing model diagram

Issue #1746 closed #closed-1746

04 Feb at 17:05:42 GMT

Replace processing model diagrams

Issue #1750 closed #closed-1750

04 Feb at 17:03:41 GMT

EXPath Binary : copy-edits and minor enhancements

Issue #1753 closed #closed-1753

04 Feb at 17:03:40 GMT

1750 Overhaul of EXPath binary spec

Issue #1571 closed #closed-1571

04 Feb at 17:03:03 GMT

Discussion: On the implementability of the specs and helping implementors

Issue #1699 closed #closed-1699

04 Feb at 17:02:52 GMT

XPath function to calculate edit distance between two strings

Issue #1682 closed #closed-1682

04 Feb at 17:01:23 GMT

Type Promotion

Issue #1734 closed #closed-1734

04 Feb at 17:01:22 GMT

1682 Type promotion and operator mapping

Issue #1764 closed #closed-1764

04 Feb at 09:18:02 GMT

Remove the BOM from unparsed text input?

Issue #1764 created #created-1764

04 Feb at 08:49:18 GMT
Remove the BOM from unparsed text input?

XML parsing handles the BOM for us, and we say something explicit about the BOM when parsing JSON, but we're silent about the BOM when loading unparsed text. I think the right answer is to discard the BOM and return the text that follows it...
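In Python terms the suggested behaviour looks like this (a sketch only; strip_bom is a hypothetical helper, not a spec function):

```python
def strip_bom(text):
    """Drop a single leading U+FEFF, if present, and return the text after it."""
    return text[1:] if text.startswith("\ufeff") else text


raw = b"\xef\xbb\xbfhello"  # UTF-8 encoded BOM followed by "hello"

# The plain utf-8 codec keeps the BOM as a U+FEFF character.
print(len(raw.decode("utf-8")))        # 6
print(strip_bom(raw.decode("utf-8")))  # hello

# The utf-8-sig codec discards it while decoding.
print(raw.decode("utf-8-sig"))         # hello
```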

Issue #1762 closed #closed-1762

03 Feb at 15:36:40 GMT

Combining different kinds of arrow

Pull request #1763 created #created-1763

03 Feb at 15:35:56 GMT
1716 Generalize syntax of arrow expressions

Fix #1716

QT4 CG meeting 108 draft agenda #agenda-02-04

03 Feb at 11:30:00 GMT

Draft agenda published.

Issue #1762 created #created-1762

03 Feb at 10:44:20 GMT
Combining different kinds of arrow

In the spec, under arrow expressions, we have this example:

(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn($a) { $a + 1 }() => sum()

That use of an inline function is pretty clumsy, and it would be nice to think we could eliminate it using the new -> operator. But it ain't easy.

We can't do

(1 to 5) =!> xs:double() =!> math:sqrt() -> .+1 => sum()

because the precedence is wrong.

We can't do

(1 to 5) =!> (xs:double() => math:sqrt() -> .+1 ) => sum()

because we can't have a parenthesised construct on the RHS of the mapping arrow.

We can use the bang operator but the parentheses are awkward:

((1 to 5) ! (xs:double(.) => math:sqrt() -> (.+1) )) => sum()

If we changed the precedences we could allow

(1 to 5) ! xs:double(.) ! math:sqrt(.) ! (.+1) -> sum(.)

Which would require moving -> so it has lower precedence than !. But this would disrupt its relationship with =>.
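For reference, every variant above is meant to compute the same thing, which in Python (used here purely as neutral notation) is:

```python
import math

# Convert each integer to a double, take its square root, add 1, then sum.
result = sum(math.sqrt(float(n)) + 1 for n in range(1, 6))
print(round(result, 4))  # 13.3823
```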

Pull request #1761 created #created-1761

03 Feb at 10:01:18 GMT
1752 Correct return type of fn:partition()

Fix #1752

Pull request #1760 created #created-1760

03 Feb at 09:44:29 GMT
Remove hand-generated diffs from the builds

Fix #1757

The PR build isn't going to be very informative, but I'll leave this one open in case anyone wants to review the source code diffs.

I did not attempt to remove the XML markup from the specs. Perhaps we should, but I think we'd want to manage that carefully to avoid an absolute mountain of merge conflicts.

Issue #1744 closed #closed-1744

03 Feb at 09:15:06 GMT

Remove dead wood re: SVG diagrams from the XSLT build

Pull request #1759 created #created-1759

03 Feb at 09:01:16 GMT
Fix validation issues in the EXPath module function catalogs

Fix #1758

Issue #1758 created #created-1758

03 Feb at 08:50:44 GMT
EXPath specification validation problems

As @michaelhkay noted in email, the function catalogs for the EXPath specifications are not being validated.

That validation only occurs during test generation ¯\_(ツ)_/¯

  1. Add an example to the EXPath file specification so that it's possible to run test generation
  2. Add test generation for EXPath file and binary to the build
  3. Fix the validation errors in the function catalog

Issue #1756 closed #closed-1756

03 Feb at 08:43:10 GMT

Make DeltaXML diffs on the main build too

Issue #1757 created #created-1757

03 Feb at 08:20:45 GMT
Build cleanup: remove the "by hand" diffs

Unless I'm mistaken, the 'by hand' diffs, the ones that are created from explicit diff markup added by the editors, have not been consistently maintained for some time.

We still have places that point to them, and I think this could be confusing.

I propose that we pull all of that machinery out and remove references to them.

Pull request #1756 created #created-1756

03 Feb at 07:18:17 GMT
Make DeltaXML diffs on the main build too

This PR should build DeltaXML diffs of the EXPath specs...and when merged, should build them on the main build as well.

Issue #1755 closed #closed-1755

03 Feb at 07:02:23 GMT

Attempt to make DeltaXML diffs for EXPath specs

Pull request #1755 created #created-1755

03 Feb at 07:02:15 GMT

Attempt to make DeltaXML diffs for EXPath specs

Issue #1754 created #created-1754

02 Feb at 23:57:35 GMT
Inverse functions to bin:hex, bin:bin, and bin:octal

In writing formal equivalents for the functions in the binary EXPath module, I found that while we have bin:bin() which turns a string of 0s and 1s into a binary value, we don't have any convenient way of doing the inverse. The same is true for octal. For hex we can cast to hexBinary and then cast to string, but that's a bit of a circumlocution.

I propose functions bin:to-bin, bin:to-octal and bin:to-hex that convert a binary value to a string of binary, octal, or hexadecimal digits respectively. Perhaps with an options parameter that allows a grouping separator and grouping size to be specified.
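A Python sketch of two of the proposed functions, with the grouping option. The parameter names group_size and separator are assumptions, and groups are counted from the left. Octal is left out because the issue does not say how bits should be grouped when the bit count is not a multiple of three.

```python
def _group(digits, size, separator):
    """Insert separator between groups of `size` digits, counting from the left."""
    if size <= 0:
        return digits
    return separator.join(digits[i:i + size] for i in range(0, len(digits), size))


def to_bin(data, group_size=0, separator="_"):
    """Render each octet as eight binary digits."""
    digits = "".join(f"{byte:08b}" for byte in data)
    return _group(digits, group_size, separator)


def to_hex(data, group_size=0, separator="_"):
    """Render each octet as two uppercase hexadecimal digits."""
    return _group(data.hex().upper(), group_size, separator)


value = bytes([0x0F, 0xA0])
print(to_bin(value))                # 0000111110100000
print(to_bin(value, group_size=8))  # 00001111_10100000
print(to_hex(value))                # 0FA0
```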

Pull request #1753 created #created-1753

02 Feb at 21:13:19 GMT
1750 Overhaul of EXPath binary spec

Apart from general copy-editing, the main changes are:

  • A lot more examples, presented in executable markup format (though they are not yet tested)
  • Many functions now have formal equivalents (again, currently untested)
  • Allow underscores and spaces in input to bin:hex, bin:octal, and bin:bin
  • Use type xs:unsignedByte for octet arguments
  • Use an enum() type for the octet-order argument

Fix #1750

Issue #1752 created #created-1752

02 Feb at 19:18:45 GMT
Return type of fn:partition()

The return type of fn:partition should be array(item()*)* not array(item())*.

Issue #1751 created #created-1751

02 Feb at 00:28:22 GMT
bin:encode-string - should the result have a BOM?

Test cases in the EXPath test suite using bin:encode-string with encoding=utf-16 include a BOM at the start of the output, but the spec says nothing about this. It's probably useful for some use case but a nuisance for others.
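Python's codecs show the same split and may help frame the choice. This only illustrates the two behaviours; it says nothing about what bin:encode-string should do.

```python
import codecs

with_bom = "A".encode("utf-16")        # generic codec: BOM, then native byte order
without_bom = "A".encode("utf-16-be")  # explicit byte order: no BOM

print(with_bom.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)))  # True
print(without_bom)                                                      # b'\x00A'
```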

Issue #1750 created #created-1750

01 Feb at 00:44:37 GMT
EXPath Binary : copy-edits and minor enhancements

Suggested minor enhancements:

  • Allow underscores and whitespace in strings of binary, octal, or hex digits supplied as strings.
  • Use type xs:unsignedByte rather than xs:integer for octet values
  • Use an enum type for params like "little-endian".

The following are some suggested copy-edits:

Abstract para 4 - link to XQuery 4.1. The last sentence of the para ("The signatures and summaries of functions defined in this document...") makes no sense.

1.1 para 1, twice, ".)" should be ").".

1.2 Mention that the coercion rules in 4.0 mean that wherever a function accepts xs:base64Binary, it also accepts xs:hexBinary (but we've changed the signature to allow either, anyway).

para 2. " if the result return"?

The Note is inelegantly worded.

1.3 I guess we should integrate the test suite into QT4.

1.5 para 2 "In accordance with current practice" eh?

2.1 Example would benefit from reformatting.

2.2 Example, similarly. Could use underscores in the long integers. "and the examples from above reverse"??

  1. "fn:fn:binary-resource" does not yet exist and is triple-barrelled.

  2. Avoid "apologetic quotes" in 'constants'. And elsewhere. If it doesn't work as plain English without quotes, then it needs to be a defined term.

4.1 and throughout, in Examples, use the F&O rendition rather than the right arrow. Also, add these functions to the example checking mechanism.

4.1 Notes, be more precise than "similarly". Define formal equivalent. Non-editorial enhancement: allow underscores in the string.

4.2 There must be a more elegant way of saying "(8-wise) (ASCII) binary digits ([01])". Allow underscores.

4.2 "a xs:base64Binary with no embedded data" - use the term "zero-length".

4.3 similarly. Allow underscores.

Function properties: I think all these functions are pure functions so it's a waste of space to say this explicitly for each function.

4.4, 4.5 Use xs:unsignedByte to represent octets now that we have implicit downcasting. (Changes error code [bin:octet-out-of-range] to XPTY0004).

5.6 "blank octets"?

7.1.2 "or assumed to be represented"

7.1.3 "Care should be taken" - what does this mean?

"Positive and negative infinities are supported" - who or what is doing the supporting?

Use underscore rather than space as separators between digits.

'quiet' NaN - avoid apologetic quotes.

7.4 - I find the note regarding signed/unsigned integers very confusing.

8.1: "bitwise or" - avoid apologetic quotes. For these three functions we should say what they do rather than assuming the reader will guess from the names. bin:shift could do with more precision. Perhaps the functions could be explained more formally by a mapping from a binary value to a sequence of booleans, then for example bin:and becomes something like for-each-pair(op:from-bits($a), op:from-bits($b), op('and') ) => op:to-bits().

8.5: avoid the notation |$by| for absolute value. Not all of us remember our schooldays. (And when I was at school, by meant b × y, and $ meant dollars.)
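The bits-based explanation suggested under 8.1 can be tried out in Python. This is a sketch: from_bits and to_bits mirror the hypothetical op:from-bits and op:to-bits named above, bits are taken most significant first, and rejecting operands of different lengths is an assumption.

```python
def from_bits(data):
    """Binary value -> list of booleans, most significant bit of each octet first."""
    return [bool((byte >> shift) & 1) for byte in data for shift in range(7, -1, -1)]


def to_bits(bits):
    """List of booleans (length a multiple of 8) -> binary value."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | int(bit)
        out.append(byte)
    return bytes(out)


def bin_and(a, b):
    """Pairwise 'and' over the two bit sequences, as in the for-each-pair formulation."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")  # assumption
    return to_bits([x and y for x, y in zip(from_bits(a), from_bits(b))])


print(bin_and(bytes([0xF0, 0x0F]), bytes([0x3C, 0x3C])).hex())  # 300c
```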

Issue #1749 closed #closed-1749

31 Jan at 12:56:33 GMT

Don't set the function finder position to 'fixed' on small devices

Pull request #1749 created #created-1749

31 Jan at 12:56:11 GMT
Don't set the function finder position to 'fixed' on small devices

This is also related to 1747. It "fixes" the problem that @ChristianGruen reported where the function finder obscured content on mobile (narrow) devices. I've changed things so that it isn't at a fixed location on narrow devices. It still appears above the ToC, but it scrolls as normal.

Issue #1748 closed #closed-1748

31 Jan at 09:41:01 GMT

Fix 'window.onload' bug in ToC JS

Pull request #1748 created #created-1748

31 Jan at 09:40:51 GMT
Fix 'window.onload' bug in ToC JS

This fixes 1747 so I'm going to push it immediately.

I'm leaving the bug open because I'll also look at @ChristianGruen 's report that it is problematic on mobile.

Issue #1747 created #created-1747

31 Jan at 07:16:29 GMT
Function finder is broken

The function finder in F&O (and elsewhere) is broken. I believe that it uses the ToC to find the link target and now that the ToC structure has changed, it's failing.

Issue #1745 closed #closed-1745

30 Jan at 15:50:52 GMT

Implement expanding/collapsing ToC

Pull request #1746 created #created-1746

30 Jan at 15:49:10 GMT
Replace processing model diagrams

Fix #1733

Pull request #1745 created #created-1745

30 Jan at 13:48:18 GMT
Implement expanding/collapsing ToC

I'm just going to merge this one because

  1. The CG agreed they wanted this
  2. All of the changes are presentational, there are no technical changes
  3. The PR build won't work anyway

I did make a couple of executive decisions.

The use of "..." as the target to click on didn't seem like a practical affordance. It's not a common use of ellipsis and it looked too much like it simply meant that part of the title was elided. I went with right and down triangles instead. And I added the few lines of JS required to make them "turn".

I added a top-level expand/collapse that does all of the sections. I wasn't happy that with the new UI, there was no way to get an overview of the document by seeing all of the section titles.

I tinkered with the CSS. I'm not uniformly happy with it, especially with the treatment of long titles, but I think the aesthetic failings are infrequent.

We need to review accessibility before we try to publish as a CG Report.

Pull request #1744 created #created-1744

30 Jan at 10:47:05 GMT
Remove dead wood re: SVG diagrams from the XSLT build

Completes action QT4CG-106-01

Pull request #1743 created #created-1743

30 Jan at 10:11:43 GMT
1738 Formatting of Notes in F&O
  1. Improves the stylesheets and CSS so that Notes sections in the F&O spec are rendered with a single continuous green stripe, rather than a separate (and sometimes indented) stripe per paragraph or list item.
  2. Makes some other markup changes identified in passing, especially using <char> to mark up individual characters.

Issue #1742 created #created-1742

29 Jan at 20:20:29 GMT
Maps constructed using streamed xsl:fork instruction should not be ordered

One of the techniques used in XSLT streaming is to build multiple outputs during a single streamed pass of the input, and the multiple outputs can be captured in different entries in a map (in different prongs of an xsl:fork instruction). The ordering of such a map should be implementation-dependent, in order to allow construction in parallel threads.

Furthermore, I think that an xsl:map instruction used in this way should probably not allow duplicate keys. In principle we could collect key/value pairs during the streamed processing and then resolve duplicates at the end, but it's extra complexity.

Pull request #1741 created #created-1741

29 Jan at 11:08:57 GMT
1739 drop references to ordering mode in the static context

Fix #1739

Pull request #1740 created #created-1740

29 Jan at 10:22:51 GMT
1725b Further elaboration of duplicates handling in maps

Actions QT4CG-107-02 and QT4CG-107-03.

The three functions map:build, map:of-pairs, and map:merge now all have the same options parameters, and avoid duplication in the specification. The xsl:map instruction is defined by reference to map:merge.

Although the action suggested specifying these functions to use the first key from a set of duplicates, I found this was not possible because of the way map:put is defined. They therefore use the last key from the set of duplicates.

Fix #1725

Issue #1739 created #created-1739

29 Jan at 09:45:44 GMT
Obsolete references to ordering mode

The functions fn:distinct-values and fn:duplicate-values refer to the ordering mode in the static context, a concept that we have abolished.

Issue #1738 created #created-1738

29 Jan at 00:20:56 GMT
Formatting of lists within notes

The formatting of lists within notes in F&O is weird: see for example the math:atan2 function.

Issue #1737 created #created-1737

28 Jan at 21:25:59 GMT
Grammar problems introduced by #1732

Today's merge of #1732 has introduced two problems to the grammar as now shown in the spec:

  • ValueExpr has changed from

    ValueExpr ::= ValidateExpr | ExtensionExpr | SimpleMapExpr
    

    to

    ValueExpr ::= SimpleMapExpr
    

    This disconnects ValidateExpr and ExtensionExpr from the rest of the grammar.

  • AnnotatedDecl has been added without being referenced. Also it describes something that would look like

    declare declare variable $x external
    

Issue #1722 closed #closed-1722

28 Jan at 19:40:06 GMT

1717 define focus functions using pipeline operator

Issue #1717 closed #closed-1717

28 Jan at 19:40:06 GMT

Define focus functions in terms of the pipeline operator

Issue #1736 created #created-1736

28 Jan at 18:14:28 GMT
Add option retain-order=false when constructing maps

I would like to provide an option on functions that potentially create large maps, including

xsl:map
map:build
map:merge
map:of-pairs
parse-json
json-doc

If retain-order=false is specified, the user declares to the processor that they don't require the resulting map to be in any particular order. An implementation is of course free to ignore this and deliver an ordered map anyway, but if the implementation can save time or space by not retaining order then it is free to do so.

I propose to provide some data quantifying the potential benefits of this option. I realise that some optimisation hints provided in the past, for example the unordered{} expression, have been ineffective, but I think there is a difference here because changing maps to be ordered may result in a performance regression for people moving from 3.1 to 4.0.

QT4 CG meeting 107 draft minutes #minutes—01-28

28 Jan at 17:20:00 GMT

Draft minutes published.

Issue #1719 closed #closed-1719

28 Jan at 17:13:50 GMT

Purging dead build code

Issue #1731 closed #closed-1731

28 Jan at 17:13:49 GMT

1719 drop shared spec from build

Issue #1725 closed #closed-1725

28 Jan at 17:10:39 GMT

Position of duplicates in ordered maps

Issue #1727 closed #closed-1727

28 Jan at 17:10:38 GMT

1725 Define more detailed rules for duplicates in maps

Issue #1485 closed #closed-1485

28 Jan at 17:07:53 GMT

Record declarations in XSLT

Issue #1708 closed #closed-1708

28 Jan at 17:07:52 GMT

1485 Add xsl:record-type declaration

Issue #76 closed #closed-76

28 Jan at 17:06:21 GMT

non-deterministic time

Issue #747 closed #closed-747

28 Jan at 17:06:16 GMT

QName literals

Issue #885 closed #closed-885

28 Jan at 17:06:09 GMT

fn:uuid

Issue #981 closed #closed-981

28 Jan at 17:05:50 GMT

Identify optional arguments in callback functions

Issue #1720 closed #closed-1720

28 Jan at 17:04:37 GMT

Grammar overhaul

Issue #1732 closed #closed-1732

28 Jan at 17:04:36 GMT

1720 grammar simplification

Issue #1069 closed #closed-1069

28 Jan at 17:03:31 GMT

fn:ucd

Issue #1124 closed #closed-1124

28 Jan at 17:03:17 GMT

Formatting XPath/XQuery: Preferences, Conventions

Issue #1252 closed #closed-1252

28 Jan at 17:03:09 GMT

Add a new function `fn:html-doc`

Issue #1728 closed #closed-1728

28 Jan at 17:01:15 GMT

Fix CSS for production tables

Pull request #1735 created #created-1735

27 Jan at 17:09:40 GMT
1341 Drop $position callback from many functions

Responding to the discussion in #1341, this (somewhat experimental) PR explores the possibility of dropping the optional $position argument to the callback of many higher-order functions such as some(), every(), filter(), for-each(), fold-left(), fold-right(). Instead, it provides the option to wrap the input sequence in a call of numbered-items() which replaces each item in the input with an (item, position) pair.

I've done this only (so far) for higher-order sequence functions, but the intent is that the same could be done for arrays and (potentially) maps.

I left the position argument in place for a few functions where losing it seemed to cause genuine inconvenience:

  • partition(), where the function wraps the supplied items into arrays, and you don't want to have to remove the positions afterwards
  • subsequence-where(), where many use cases are likely to use positional information
  • for-each-pair(), where there are two input sequences and it seems clumsy to associate position information with one or the other

The main benefit is that we provide one basic mechanism which is automatically available everywhere, which means we don't have to have debates about whether or not there is a use case for adding position information to (say) fold-left or scan-right.

A further benefit is that the functions defined for sequences automatically become available for arrays and maps. I haven't yet explored the impact on maps and arrays; I will wait first to see what the reaction is to this proposal.
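
The idea can be sketched in Python, assuming each pair is represented as a tuple of (item, 1-based position); the PR's actual representation may differ. `numbered_items` mirrors the proposed numbered-items(), and `filter_items` is a hypothetical stand-in for a filter whose callback takes a single argument.

```python
def numbered_items(items):
    """Pair each item with its 1-based position, as (item, position)."""
    return [(item, pos) for pos, item in enumerate(items, start=1)]


def filter_items(items, predicate):
    """A filter whose callback takes exactly one argument."""
    return [item for item in items if predicate(item)]


letters = ["a", "b", "c", "d"]

# Keep items at even positions, then drop the positions again.
numbered = numbered_items(letters)
kept = filter_items(numbered, lambda pair: pair[1] % 2 == 0)
even = [item for item, pos in kept]
```

The last line shows the cost the PR mentions for functions such as partition(): the positions have to be removed again afterwards.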

Issue #1730 closed #closed-1730

27 Jan at 15:12:45 GMT

Consistency in default handling of map duplicates

Pull request #1734 created #created-1734

27 Jan at 13:05:48 GMT
1682 Type promotion and operator mapping

Fix #1682

Moves the relevant parts of the operator mapping table into the sections for Arithmetic Expressions and Value Comparisons. Adds links to the op: functions in F&O.

Drops the Type Promotion appendix, moving the rules inline; and drops the term "type promotion"

Adjusts the specs for sum() and avg() so they are now defined directly in terms of pairwise addition of values.
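
As a rough illustration of "defined directly in terms of pairwise addition", here is a Python sketch that folds addition over the input from left to right. The names `xp_sum` and `xp_avg` are hypothetical, and the sketch does not model the type rules of the specification.

```python
from functools import reduce
from operator import add


def xp_sum(values, zero=0):
    """Sum by folding pairwise addition from left to right."""
    values = list(values)
    if not values:
        return zero
    return reduce(add, values)


def xp_avg(values):
    """Average as the pairwise sum divided by the count; None for empty input."""
    values = list(values)
    if not values:
        return None
    return xp_sum(values) / len(values)
```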

Issue #1733 created #created-1733

27 Jan at 10:17:12 GMT
ACTION QT4CG-088-04, reworking the processing model diagram

I have no idea what tool was used to create the current processing model diagram. We know it needs to be updated, but I've no particular skill with drawing programs, so I spent half an hour constructing a Graphviz diagram:

digraph Processing_Model {
    subgraph clusterQT4 {
        Exec [label="Execution\nEngine" ];
        XDM [label="XPath Data\nModel"; shape="note" ];
        AST [label="Abstract\nSyntax Tree" ];
        Static [label="Static\nContext"; shape="box3d" ];
        Dynamic [label="Dynamic\nContext"; shape="box3d" ];
        Schema [label="Schema\nDefinitions"; shape="note" ];

        XPath -> AST [label=" SQ1" ];
        AST -> AST [label=" SQ5" ];
        AST -> Exec [label=" DQ1" ];
        Schema -> Static;
        Static -> AST [label=" SQ4" ];
        Static -> Dynamic [label=" DQ2" ];
        Dynamic -> Exec [ dir="both"; label=" DQ5" ];
        Exec -> XDM [ dir="both"; label=" DQ4" ];
    }
    XML [ shape="note" ];
    PSVI [ shape="note" ];
    XML -> PSVI [ label=" DM1" ];
    PSVI -> XDM  [ label=" DM2" ];
    XML -> XDM [ label=" DM1" ];

    Direct [ label=" Direct\nGeneration" ];
    Direct -> XDM [ label=" DM3" ];

    Host [label="Host\nEnvironment" ];
    Host -> Schema [label=" SI1" ];
    Host -> Static [label=" SQ2" ];
    Host -> Dynamic [label=" DQ3" ];

    Serialize [ shape="note" ];
    XDM -> Serialize [ label=" DM4" ];
}

It looks something like this:

[Image: rendering of the Graphviz diagram above]

Is this worth pursuing, or is that just half an hour of my life I'll never get back?

QT4 CG meeting 107 draft agenda #agenda-01-28

27 Jan at 09:30:00 GMT

Draft agenda published.

Pull request #1732 created #created-1732

26 Jan at 22:12:14 GMT
1720 grammar simplification

This PR primarily affects the grammar file, simplifying it to remove most of the material that is only there to support the generation of a JavaCC parser (which has probably not been achievable since XPath/XQuery 2.0).

The section of the grammar that defines the binary operators starting with OrExpr is now expressed using conventional production rules, rather than the precedence-based grammar previously used. This allows deletion of some convoluted code in grammar2spec.xsl.

The DTD for the grammar file is revised to exclude many constructs that are no longer used.

Many simple token definitions (especially those that consist of a simple constant string) have been inlined.

Fix #1720

Pull request #1731 created #created-1731

25 Jan at 23:20:57 GMT
1719 drop shared spec from build

Removes tasks from the gradle build, and associated stylesheets, that are there only to construct the "shared XPath/XQuery specification" which is no longer used by the editors or made visible to readers.

Also fixes a couple of link errors/warnings in the build.

Fix #1719

Issue #1730 created #created-1730

25 Jan at 18:03:46 GMT
Consistency in default handling of map duplicates

In 3.1:

  • map:merge() defaults to duplicates = use-first
  • xsl:map defaults to duplicates = reject

In 4.0

  • map:build defaults to duplicates = combine
  • map:of-pairs defaults to duplicates = combine

Should we try to align the defaults?

Issue #1729 created #created-1729

24 Jan at 22:04:53 GMT
Grammar problems introduced by #1721

Some productions of the XQuery 4.0 grammar were made obsolete by recent changes, but still occur in the document:

  • StringConstructorStart
  • StringInterpolationStart
  • StringInterpolationEnd
  • StringConstructorEnd
  • TagQName
  • EndTagQName
  • ProcessingInstructionStart
  • ProcessingInstructionEnd
  • DirCommentContentChar
  • DirCommentContentDashChar

Also, the replacement of declare record by declare type record has introduced a new ambiguity. For example, with the input

declare type A as xs:integer;
declare type record as (A);

it remains unclear whether the second line declares a type named "record", in which case

42 instance of record

or a type named "as", where

{'A': 42} instance of as

My proposal would be to return to declare record. There are also 13 examples in the document using declare record.

Issue #1723 closed #closed-1723

24 Jan at 21:00:42 GMT

`ThenAction` left over after removal of `BracedActions`

Pull request #1728 created #created-1728

24 Jan at 13:15:53 GMT
Fix CSS for production tables

This PR removes some extraneous space between the rows and columns in the production tables (cellspacing) and turns off the odd grey background on comments. (I don't think the grey background was helping any, but if you disagree...)

Issue #1721 closed #closed-1721

24 Jan at 13:01:23 GMT

1713 Revise code for generating production rules

Pull request #1727 created #created-1727

23 Jan at 17:58:36 GMT
1725 Define more detailed rules for duplicates in maps

Clarifies the rules for how duplicates are handled by map:merge, map:build, map:of-pairs, and xsl:map.

Introduces a callback option for map:merge that is compatible with map:build and map:of-pairs, to increase commonality between all four functions/instructions.

Fix #1725

Issue #1726 closed #closed-1726

23 Jan at 16:50:50 GMT

1726 Control order when map input has duplicate keys

Pull request #1726 created #created-1726

23 Jan at 12:54:05 GMT

1726 Control order when map input has duplicate keys

Issue #1725 created #created-1725

23 Jan at 11:20:38 GMT
Position of duplicates in ordered maps

It became clear to me when writing test cases that the specs aren't entirely clear about what happens when building a map from an input sequence that contains duplicate keys. It says clearly what entry should be created for the duplicated key, but it doesn't say clearly where this entry should appear in the result.

There are four functions/instructions that this applies to: map:merge, map:build, map:of-pairs, and xsl:map.

I propose that in each case, the position of the entry for the duplicated key in the resulting map should correspond to the position of the first occurrence of that key in the input sequence. That is, "order of first appearance": the effect should be the same as if new entries are always created using a map:put() operation.

This might be slightly unexpected in the case of map:merge() with the option duplicates=use-last. It means the value will be that of the last duplicate, but its position will be that of the first duplicate. However, the other three functions/instructions achieve the effect of use-last with the callback on-duplicates=fn($a, $b){$b}, which only controls the value of the entry and cannot be used to control its position, and I think it makes sense for map:merge with duplicates=use-last to behave in the same way.

Of course we could introduce a separate option to control the position of the combined entry, but I think that would be overkill. xsl:for-each-group and distinct-values both use the "order of first appearance" rule and this has never caused any problems. (group-by in XQuery delivers groups in implementation-dependent order, however).
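
The "order of first appearance" rule can be sketched in Python, whose dict happens to behave the same way: assigning to an existing key replaces the value without moving the entry. `build_map` and its `on_duplicates` callback are hypothetical names for illustration, not the specified functions.

```python
def build_map(pairs, on_duplicates):
    """Build a dict from (key, value) pairs.

    A duplicate key keeps the position of its first occurrence; the
    callback only decides the value stored there.
    """
    result = {}
    for key, value in pairs:
        if key in result:
            # Assigning to an existing key does not move it in a Python dict.
            result[key] = on_duplicates(result[key], value)
        else:
            result[key] = value
    return result


pairs = [("x", 1), ("y", 2), ("x", 3)]
use_last = build_map(pairs, lambda a, b: b)
use_first = build_map(pairs, lambda a, b: a)
```

With the use-last callback, "x" keeps its first position but takes the value of the last duplicate, which is the slightly unexpected case described above.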

Issue #1724 created #created-1724

22 Jan at 23:54:22 GMT
Allow @copy-namespaces on <xsl:mode>?

As part of an XSLT transformation I need to remove an unused (anywhere) namespace declaration and <xsl:mode on-no-match="shallow-copy"/> doesn’t appear to accept a @copy-namespaces attribute where I can tell it not to copy unused namespaces. The unused namespace was used on a single attribute on the input document, but I’m removing the attribute entirely as part of the transformation, so nothing will remain in the output document that uses the namespace in question. With <xsl:mode on-no-match="shallow-copy"/> the namespace declaration is copied into the result, even though it is not used. If I use the old identity template and set the value of @copy-namespaces on it to something falsy, I get the result I want, that is, no unneeded namespace declaration.

Insofar as <xsl:mode on-no-match="shallow-copy"/> has come to fill the role formerly occupied by the identity template, would it be reasonable to allow it also to declare that unused namespaces should not be copied? If that request is reasonable, is it reasonable to think of it as a bug fix, rather than a new-feature request?

Issue #1723 created #created-1723

22 Jan at 22:07:49 GMT
`ThenAction` left over after removal of `BracedActions`

Thanks for fixing the IfExpr ambiguity.

In #1712, BracedAction was introduced to replace the previous rules for braced actions. Of these, ThenAction still appears in the EBNF summary, but it is no longer referenced.

Issue #1651 closed #closed-1651

22 Jan at 17:27:22 GMT

Ordered Maps: maps that retain insertion order

Issue #1703 closed #closed-1703

22 Jan at 17:27:21 GMT

1651 ordered maps

Issue #1709 closed #closed-1709

22 Jan at 11:55:07 GMT

Extend diagram of item types to include record types etc

Pull request #1722 created #created-1722

22 Jan at 11:46:31 GMT
1717 define focus functions using pipeline operator

Fix #1717

Provides a formal definition of focus functions making use of the new pipeline operator.

Pull request #1721 created #created-1721

22 Jan at 11:27:43 GMT
1713 Revise code for generating production rules

The main change here is to change the way "scraps" are expanded: these are the local collections of production rules that appear inline within the spec. These are now driven by a single prodrecap element naming the rule to be expanded, and the logic is now automated for deciding (a) which subsidiary production rules to include in the scrap, and (b) which occurrence of a production rule to use as the target for a hyperlinked reference to that rule, depending on where the reference appears.

Along with this there has been a fair bit of deletion of legacy code and general modernisation (e.g. using XSLT 2.0 and 3.0 constructs where appropriate).

Issue #1720 created #created-1720

22 Jan at 10:50:32 GMT
Grammar overhaul

There is a lot of dead wood in the xpath-grammar.xml file. This issue is raised to capture some observations and suggestions about how it can be simplified.

  • The DTD lists 19 attributes that can appear on g:token, and documents the meaning of 5 of them (very briefly). I suspect that many of the attributes are never used. Many of them were probably intended primarily for use by the JavaCC parser generator.
  • As if that weren't enough, the grammar2spec stylesheet has logic that looks for additional attributes (an example is @alias-for) which are not even allowed by the DTD let alone being in active use.
  • The "if" logic to assign productions to different languages (xpath, xquery, XSLT patterns) is hard to maintain and could be automated: just search for productions that are reachable from the top-level production for each language. This could be done by a preprocessing stylesheet that generates a grammar file for each language.
  • The switch into a precedence-based grammar for binary operators (g:exprProduction name="OperatorExpr") doesn't really help anyone. For generating production rules in the spec, it just complicates the generation logic. The same is true for anyone else writing applications that use the grammar as input. It doesn't really make life easier for maintainers of the grammar, because it means there is more to learn.

All the JavaCC machinery is still in the repo and I think it could probably go. Leaving stuff like that lying around makes things more difficult when you need to search filestore for references to things.
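
The reachability search suggested in the list above could look roughly like the following Python sketch. It assumes the grammar has already been reduced to a mapping from each production name to the names it references; the toy grammar here is illustrative only and is not the real grammar file.

```python
def reachable(grammar, start):
    """Return the set of production names reachable from `start`.

    `grammar` maps each production name to the names it references.
    """
    seen = set()
    pending = [start]
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        pending.extend(grammar.get(name, ()))
    return seen


# Toy grammar, not the real one: names are illustrative only.
grammar = {
    "XPath": ["Expr"],
    "Expr": ["ValueExpr"],
    "ValueExpr": ["SimpleMapExpr"],
    "SimpleMapExpr": [],
    "ValidateExpr": ["Expr"],
    "ThenAction": ["Expr"],
}

used = reachable(grammar, "XPath")
unused = sorted(set(grammar) - used)
```

The same search also reports productions that are defined but never referenced from the top-level production, which is one way to spot leftover rules.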

Issue #1719 created #created-1719

21 Jan at 22:57:23 GMT
Purging dead build code

In the course of working on #1713 I've been exploring some dark corners of the build system. There's a lot of dead code. Some of it might come in useful in the future (e.g. code supporting XQuery Update) but most will be very hard to revive. For example there's a lot of grammar machinery which is there only to allow generation of a JavaCC parser.

The main purpose of this issue is to capture notes that might lead to some reduction of technical debt.

The gradle build is currently giving me

Warning: link-text-with-check was unable to make a link for $ref-id="doc-shared40-Prolog"

That message comes from xmlspec-override.xsl. This stylesheet looks like dead code because it has lots of references to XPath30 and XQuery30. But it can't be completely dead if we're getting errors from it. It's imported from two places: xpath-functions-30.xsl in the F&O tree and shared.xsl in the xquery40 tree. The message comes from gradle task xquery_shared_html. As far as I can see the build system is constructing an XPath specification, an XQuery specification, and a "shared" specification which is a union of the two. (It starts off "XQuery 4.0 and XPath 4.0 is an expression language that allows..."). Presumably this was intended to allow editors and WG members to review a single document rather than reviewing XPath and XQuery separately. But I don't think it's used today, I think we could kill it off.

shared.xsl is referenced only from build.gradle when building the shared specification.

xpath-functions-30.xsl doesn't appear to be referenced from anywhere, and it carries a comment saying

 Created 17 Dec 2008 by MHK.
 No longer used 16 Feb 2009?

In the short term I've deleted the code in xmlspec-override.xsl starting with the comment "Our inability to create a link for $ref-id may be a sign of something wrong, so...". This gets rid of the warning messages. In the longer term, subject to confirmation, I think we can delete the build targets associated with the "shared" language spec, and delete the stylesheets xmlspec-override.xsl, xpath-functions-30.xsl, and shared.xsl.

Issue #1718 created #created-1718

21 Jan at 17:45:35 GMT
Ordered Maps: positions in callback functions

Now that maps have a defined order, we should add the position to HOF parameters in map functions (in alignment with sequence and array functions). Examples:

map:for-each(
  $map     as map(*),	
  $action  as fn($key as xs:anyAtomicType, $value as item()*, $pos as xs:integer) as item()*	
) as item()*

map:filter(
  $map        as map(*),	
  $predicate  as fn($key as xs:anyAtomicType, $value as item()*, $pos as xs:integer) as xs:boolean?	
) as map(*)
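
For comparison, a Python sketch of the same shape, assuming 1-based positions that follow the map's entry order. `map_for_each` and `map_filter` are hypothetical analogues of the signatures above, not implementations of them.

```python
def map_for_each(mapping, action):
    """Call action(key, value, pos) for each entry in order; pos is 1-based."""
    return [action(key, value, pos)
            for pos, (key, value) in enumerate(mapping.items(), start=1)]


def map_filter(mapping, predicate):
    """Keep the entries for which predicate(key, value, pos) is true."""
    return {key: value
            for pos, (key, value) in enumerate(mapping.items(), start=1)
            if predicate(key, value, pos)}


m = {"a": 10, "b": 20, "c": 30}
labels = map_for_each(m, lambda k, v, pos: f"{pos}:{k}={v}")
first_two = map_filter(m, lambda k, v, pos: pos <= 2)
```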

QT4 CG meeting 106 draft minutes #minutes—01-21

21 Jan at 17:40:00 GMT

Draft minutes published.

Issue #1706 closed #closed-1706

21 Jan at 17:22:44 GMT

Ambiguous `if` syntax

Issue #1712 closed #closed-1712

21 Jan at 17:22:43 GMT

1706 Drop "else if" and "else" clauses from braced conditionals

Issue #1685 closed #closed-1685

21 Jan at 17:20:39 GMT

Pipeline Operator

Issue #1686 closed #closed-1686

21 Jan at 17:20:38 GMT

1685 Pipeline Operator

Issue #1701 closed #closed-1701

21 Jan at 17:18:47 GMT

Add dedication to MSM (action QT4CG-088-01)

Issue #1705 closed #closed-1705

21 Jan at 17:16:22 GMT

fn:divide-decimals, fn:round: large precision values

Issue #1711 closed #closed-1711

21 Jan at 17:16:21 GMT

1705 Say that max precision is implementation-defined

Issue #1710 closed #closed-1710

21 Jan at 17:14:14 GMT

1709 Updated type diagrams

Issue #1606 closed #closed-1606

21 Jan at 17:13:37 GMT

Drop named item types other than named record types

Issue #1494 closed #closed-1494

21 Jan at 17:13:32 GMT

Records: Introduction?

Issue #1176 closed #closed-1176

21 Jan at 17:13:27 GMT

Use fn:parse-uri to check whether a filepath is relative or absolute

Issue #1700 closed #closed-1700

21 Jan at 17:11:44 GMT

Remove some dead .DS_Store files

Issue #1717 created #created-1717

21 Jan at 16:59:58 GMT
Define focus functions in terms of the pipeline operator

Now that we have accepted the pipeline operator into the language, we can define the semantics of focus functions to take advantage of it. Specifically, fn() { EXPR } can be defined to be equivalent to fn($v) { $v -> EXPR } where $v is an otherwise-unused variable name.

QT4 CG meeting 106 draft agenda #agenda-01-21

21 Jan at 10:30:00 GMT

Draft agenda published.

Issue #1716 created #created-1716

20 Jan at 21:13:15 GMT
Variable lookahead needed for `ArrowTarget`

The current grammar definition allows any QName (via EQName) as an ArrowStaticFunction:

ArrowTarget
         ::= ArrowStaticFunction ArgumentList
           | ArrowDynamicFunction PositionalArgumentList
ArrowStaticFunction
         ::= EQName
ArrowDynamicFunction
         ::= VarRef
           | InlineFunctionExpr
           | ParenthesizedExpr 

This complicates the distinction of the static and dynamic variants of ArrowTarget, as it cannot be done with a fixed number of lookahead tokens. E.g. in an expression starting like this

A => fn ( $A, $B, $C, (: ... :) $Z ) { } ( 

the distinction cannot be made before the left brace is seen. While constructing an LR parser, there is a shift-reduce conflict between shifting fn as a keyword of an InlineFunctionExpr, or reducing fn to the QName of EQName.

This can easily be fixed by adding xgc: reserved-function-names to ArrowStaticFunction, which would also be consistent with other function calls in disallowing reserved function names:

ArrowStaticFunction
         ::= EQName
                          /* xgc: reserved-function-names */		 

But could not ArrowTarget also be written like the following?

ArrowTarget
         ::= FunctionCall
           | DynamicFunctionCall

In this case, the xgc: reserved-function-names constraint would be inherited from FunctionCall. It eliminates ArrowStaticFunction and ArrowDynamicFunction and at the same time lifts some restrictions imposed by the current ArrowTarget. It does not cause any LALR(2) conflicts.

Issue #1715 created #created-1715

20 Jan at 13:59:18 GMT
Array Lookups: partial removal of out-of-bounds checks

Various QT4 tests imply that the out-of-bounds checks for arrays have been removed. An example:

<test-case name="UnaryLookup-005a">
  <description>Integer subscript into an array: array index too low</description>
  <created by="Michael Kay" on="2014-11-27"/>
  <modified by="Michael Kay" on="2024-07-22" change="returns () in 4.0"/>
  <dependency type="spec" value="XP40+ XQ40+"/>
  <test>(['a', 'b'], ['c', 'd'])[ ?0 eq 'c']</test>
  <result>
    <assert-empty/>
  </result>
</test-case>

I believe this is not reflected in the spec yet, or at least it includes examples that need to be updated:

[ "a", "b" ]?3 raises a dynamic error err:FOAY0001.

I guess that #832 would have been the PR with the relevant changes (we have already observed in another issue that some changes of this PR need to survive; see https://github.com/qt4cg/qtspecs/pull/1283#issuecomment-2568330191).
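
The two behaviours being contrasted can be sketched in Python, using 1-based indexes. `array_get` stands for the strict behaviour in the spec example (an `IndexError` stands in for FOAY0001), and `array_lookup` stands for the lenient behaviour implied by the test case; both names are hypothetical.

```python
def array_get(array, index):
    """Strict 1-based access: out-of-range raises (standing in for FOAY0001)."""
    if not 1 <= index <= len(array):
        raise IndexError(f"FOAY0001: index {index} out of bounds")
    return array[index - 1]


def array_lookup(array, index):
    """Lenient 1-based lookup: out-of-range yields an empty sequence."""
    if not 1 <= index <= len(array):
        return []
    return [array[index - 1]]


# Mirrors the test case: (['a', 'b'], ['c', 'd'])[ ?0 eq 'c' ]
arrays = [["a", "b"], ["c", "d"]]
matches = [a for a in arrays if "c" in array_lookup(a, 0)]
```

Because index 0 is out of range, every lookup is empty and no array is selected, which corresponds to the empty result the test asserts.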


Edit (2025-05-26): Outdated:

That leads me to the original reason for creating this issue:

  • I think it’s a good idea to drop the range check for array lookups, and it would seem consistent to me to also drop it for dynamic function calls.
  • As map/array lookups and dynamic function calls are often used interchangeably, $array?0 and $array(0) should behave identically.
  • The FOAY0001 error would (and should) still be raised by the array functions, including array:get, array:put, array:remove, or array:insert-before.

Issue #1714 created #created-1714

19 Jan at 23:34:09 GMT
sibling:: axis. Action Item QT4CG-097-03

This issue is a reflection of the following Action Item:

QT4CG-097-03: DN to propose an axis for accessing the siblings of a node.

I have prepared a PDF file that contains the relevant updated sections from the "XPath 4.0" document:

  • There are no deletions or conflicting changes.
  • The additions to the text are highlighted in turquoise.
  • The file that contains all relevant updated sections of the document is at: https://github.com/dnovatchev/qtspecs/blob/dn-siblings/sibling-axis.pdf

If the above doesn't work, please try: https://github.com/dnovatchev/MathPuzzles/blob/master/sibling-axis.pdf

Issue #1713 created #created-1713

19 Jan at 11:49:13 GMT
Patchy exposition of XSLT type pattern syntax

In XSLT §5.4.2.2 Type Patterns, the exposition of the grammar is "patchy" - it includes some production rules such as FieldDeclaration that are in the subtree of the main production rule (TypePattern) without giving all the intermediate rules that connect this rule to the root.

It's easy enough to correct this by hand, but it would be nice to prevent this happening by automating the generation of these families of grammar rules, perhaps by including all rules in the subtree up to a depth of 3, say. It would also be nice to simply list the productions to be included without having to decide manually which of them should be the principal target of termref references (by being marked with an ID).

Pull request #1712 created #created-1712

18 Jan at 18:51:28 GMT
1706 Drop "else if" and "else" clauses from braced conditionals

Fix #1706

Pull request #1711 created #created-1711

18 Jan at 18:29:32 GMT
1705 Say that max precision is implementation-defined

Applies to fn:round, fn:round-half-to-even, fn:divide-decimals

Fix #1705

Pull request #1710 created #created-1710

17 Jan at 23:17:00 GMT
1709 Updated type diagrams

Added a few details to the type diagrams: user-defined array, map, and record types; enumeration types; untypedAtomic

Issue #1709 created #created-1709

17 Jan at 22:44:49 GMT
Extend diagram of item types to include record types etc

I propose to extend the diagram of item types (common to DM and FO) to include more detail of the hierarchy below function types.

Issue #1617 closed #closed-1617

17 Jan at 12:55:10 GMT

1606 Drop named item types, refine named record types, esp in XSLT

Pull request #1708 created #created-1708

17 Jan at 12:52:56 GMT
1485 Add xsl:record-type declaration

Adds named record types to XSLT, with much the same spec as for XQuery, but some extra tweaks for handling visibility and overriding.

Fix #1485

Issue #1707 closed #closed-1707

17 Jan at 08:58:55 GMT

Fix bug in build dependencies

Pull request #1707 created #created-1707

17 Jan at 08:58:46 GMT
Fix bug in build dependencies

Changing xslt.xml didn't actually cause the HTML for the XSLT specification to be rebuilt. 👎

Issue #1706 created #created-1706

16 Jan at 21:08:24 GMT
Ambiguous `if` syntax

The optional else in a braced if expression introduces an ambiguity in the XQuery 4.0 grammar.

Here is an example of an ambiguous expression:

if (A) then if (B) {C} else if (D) {E} else if (F) {G} else {H}

It can be parsed like this

if (A) then if (B) {C}                               else {}
       else if (D) {E} else if (F) {G} else {H}

but also like the following

if (A) then if (B) {C} else if (D) {E}               else {}
       else if (F) {G} else {H}

The corresponding part of the grammar is

IfExpr   ::= 'if' '(' Expr ')' ( UnbracedActions | BracedActions )
UnbracedActions
         ::= 'then' ExprSingle 'else' ExprSingle
BracedActions
         ::= ThenAction ElseIfAction* ElseAction?
ThenAction
         ::= EnclosedExpr
ElseIfAction
         ::= 'else' 'if' '(' Expr ')' EnclosedExpr
ElseAction
         ::= 'else' EnclosedExpr

The ambiguity could be resolved by making the ElseAction in BracedActions mandatory, i.e.:

BracedActions
         ::= ThenAction ElseIfAction* ElseAction

Issue #1705 created #created-1705

16 Jan at 15:53:51 GMT
fn:divide-decimals, fn:round: large precision values

We may need to specify what is going to happen if very large (positive and negative) precisions are specified:

divide-decimals(1, 1, 0x7FFFFFFF)

A simple implementation in Java to compute the quotient for this function throws an overflow exception:

BigDecimal.ONE.divide(BigDecimal.ONE, 0x7FFFFFFF, RoundingMode.DOWN)

This also affects fn:round: The query round(1, -0x80000000) seems to behave unexpectedly in existing implementations.

In general, the computation gets very slow for large precision values, and it may not be simple to interrupt such low-level operations, so maybe (if it makes sense, I haven’t really thought about it) we could define precision limits.
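
One way a precision limit could work is to reject out-of-range precisions before any arithmetic starts. The Python sketch below assumes a hypothetical limit `MAX_PRECISION` and a simplified reading of the function: the quotient is truncated toward zero at the given number of decimal places and the remainder is whatever is left over. It uses exact fractions to keep the example short.

```python
import math
from fractions import Fraction

MAX_PRECISION = 10_000  # hypothetical implementation-defined limit


def divide_decimals(value, divisor, precision=0):
    """Return (quotient, remainder), truncating the quotient toward zero
    at `precision` decimal places. Rejects precisions beyond the limit."""
    if abs(precision) > MAX_PRECISION:
        raise ValueError(f"precision {precision} exceeds limit {MAX_PRECISION}")
    value, divisor = Fraction(value), Fraction(divisor)
    scale = Fraction(10) ** precision
    quotient = Fraction(math.trunc(value / divisor * scale)) / scale
    return quotient, value - quotient * divisor
```

A call such as `divide_decimals(1, 1, 0x7FFFFFFF)` then fails immediately with a clear error instead of starting a computation that may be slow or hard to interrupt.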

Issue #1704 created #created-1704

16 Jan at 15:09:03 GMT
Ignore the byte order mark more completely/globally

Following on a discussion with @line-o on the XML.com Slack, I took a peek at the way we deal with the byte order mark in Functions and Operators. We seem to be explicit about it in a couple of JSON functions but not elsewhere. I think we should assert that the byte order mark is explicitly ignored in all of the input functions (json-*, parse-*, unparsed-* etc.)
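
As an illustration of what "ignoring the byte order mark" means for an input function, here is a small Python sketch that drops a single leading U+FEFF before parsing. The helper names are hypothetical; Python's own `json.loads` rejects a string that starts with a byte order mark, which is why the stripping step is needed here.

```python
import json

BOM = "\ufeff"


def strip_bom(text):
    """Drop a single leading byte order mark (U+FEFF), if present."""
    return text[1:] if text.startswith(BOM) else text


def parse_json(text):
    """Parse JSON text, ignoring a leading byte order mark."""
    return json.loads(strip_bom(text))
```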

Issue #1136 closed #closed-1136

15 Jan at 18:53:02 GMT

Defining names for parameters on typed function tests

Issue #1696 closed #closed-1696

15 Jan at 18:53:01 GMT

1136 Optional names in typed function types

Issue #1688 closed #closed-1688

15 Jan at 00:04:04 GMT

In rendered HTML, link to definition is missing its link text

Pull request #1703 created #created-1703

14 Jan at 22:48:18 GMT
1651 ordered maps

Reopened pull request introducing ordered maps.

Fix #1651.

Issue #1609 closed #closed-1609

14 Jan at 17:23:50 GMT

1651 Ordered Maps

QT4 CG meeting 105 draft minutes #minutes—01-14

14 Jan at 17:15:00 GMT

Draft minutes published.

Issue #1632 closed #closed-1632

14 Jan at 17:06:08 GMT

Add xsl:map/@select

Issue #1694 closed #closed-1694

14 Jan at 17:06:07 GMT

1632 Add xsl:map/@select

Issue #1684 closed #closed-1684

14 Jan at 17:05:05 GMT

[XSLT] Composite merge keys

Issue #1689 closed #closed-1689

14 Jan at 17:05:04 GMT

1684 Composite merge keys; current-merge-key-array function

Issue #1680 closed #closed-1680

14 Jan at 17:03:00 GMT

Ambiguous `switch` syntax

Issue #1692 closed #closed-1692

14 Jan at 17:02:59 GMT

1680 Fix switch syntax ambiguity

Issue #1672 closed #closed-1672

14 Jan at 17:00:55 GMT

array:values, map:values: Alternatives

Issue #1687 closed #closed-1687

14 Jan at 17:00:54 GMT

1672 array:values, map:values: Alternatives

Issue #1006 closed #closed-1006

14 Jan at 16:59:43 GMT

regular expression addition - word boundaries

Issue #490 closed #closed-490

14 Jan at 16:59:36 GMT

Control over schema validation in parse-xml(), doc(), etc.

Issue #108 closed #closed-108

14 Jan at 16:59:30 GMT

Template match using values of [tunnel] parameters

Issue #1284 closed #closed-1284

14 Jan at 16:58:49 GMT

Build issue: Unsupported specref to [streamability-fn-distinct-ordered-nodes]

Issue #1695 closed #closed-1695

14 Jan at 16:58:48 GMT

1284 Define streamability of distinct-ordered-nodes

Issue #1693 closed #closed-1693

14 Jan at 16:56:43 GMT

1683 Extend xpath-functions schema with CSV components

Issue #1690 closed #closed-1690

14 Jan at 16:54:20 GMT

1688 In "implementation-defined" appendix, fix absent generated link

Issue #1702 created #created-1702

14 Jan at 15:19:10 GMT
Node Updates: Functions

In #1225, I have summarized some thoughts on generalizing updates for both nodes and structured items (maps/arrays).

XQuery Update is complex, as updates are in general, so we may still decide that it is too ambitious to introduce update features in the core language. If we want to give it a try, we could offer functions that are based on XQUF, but that only perform one update operation at a time on a given input. This way, we could ignore the sophisticated Pending Update List semantics, which is only important when multiple updating expressions are specified and need to be checked and brought into order.

A function set that provides an equivalent functionality for all XQUF update operations could look as follows (the presented functions are valid XQuery Update code):

declare namespace update = 'http://www.w3.org/TR/xquery-update';

declare function update:delete(
  $node  as node(),
  $path  as fn(node()) as node()*
) as node() {
  copy $c := $node
  modify delete node $path($c)
  return $c
};

declare function update:rename(
  $node  as node(),
  $path  as fn(node()) as node()*,
  $name  as (xs:QName | xs:NCName | fn(node(), xs:integer) as (xs:QName | xs:NCName))
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := if($name instance of fn(*)) {
      $name($target, $pos)
    } else {
      $name
    }
    return rename node $target as $result
  )
  return $c
};

declare function update:replace(
  $node      as node(),
  $path      as fn(node()) as node()*,
  $contents  as (node() | xs:anyAtomicType | fn(node(), xs:integer) as node()*)*,
  $options   as record(value? as xs:boolean)? := {}
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := (
      for $content in $contents
      return if($content instance of fn(*)) {
        $content($target, $pos)
      } else {
        $content
      }
    )
    return if($options?value) {
      replace value of node $target with $result
    } else {
      replace node $target with $result
    }
  )
  return $c
};

declare function update:insert(
  $node      as node(),
  $path      as fn(node()) as node()*,
  $contents  as (node() | xs:anyAtomicType | fn(node(), xs:integer) as (node() | xs:anyAtomicType))*,
  $options   as record(position? as enum('last', 'first', 'before', 'after'))? := {}
) as node() {
  copy $c := $node
  modify (
    for $target at $pos in $path($c)
    let $result := (
      for $content in $contents
      return if($content instance of fn(*)) {
        $content($target, $pos)
      } else {
        $content
      }
    )
    return switch($options?position) {
      case 'before' return insert node $result before $target
      case 'after'  return insert node $result after $target
      case 'first'  return insert node $result as first into $target
      default       return insert node $result as last into $target
    }
  )
  return $c
};

Here are some example function calls:

let $node := <xml><e/><e/></xml>
return (
  (: deletes all <e/> child nodes :)
  update:delete($node, fn { e }),
  (: renames the <e/> child nodes to <f/> :)
  update:rename($node, fn { e }, 'f'),
  (: replaces the <e/> child nodes with <replaced/> :)
  update:replace($node, fn { e }, <replaced/>),
  (: replaces the string value of the <e/> child nodes with 'text' :)
  update:replace($node, fn { e }, 'text', { 'value': true() }),
  (: inserts a 'text' text node into the <e/> child nodes :)
  update:insert($node, fn { e }, 'text'),
  (: inserts 'text1' and 'text2' text nodes into the <e/> child nodes :)
  update:insert($node, fn { e }, fn($node, $pos) { 'text' || $pos }),
  (: inserts an <x/> element after each <e/> child node :)
  update:insert($node, fn { e }, <x/>, { 'position': 'after' })
)

Multiple update operations can easily be chained:

(: rename <e/> child nodes to <f/>, insert 'x' text nodes :)
<xml><e/><e/></xml>
=> update:rename(fn { e }, 'f')
=> update:insert(fn { f }, 'x')

Ideally, we could offer a similar function set (or maybe even the same) for maps and arrays in a next step (see #77). The map/array syntax would be similar for deletions…

let $data := { 'a': [ 1, 2, 3 ] }
return update:delete($data, fn { ?a?2 })

…but it certainly gets trickier for other operations.

If some of you believe that the presented approach is something that we should pursue, I will be happy to add details. As an alternative, we could pursue the XQUF light approach that I have sketched in #1225, based on the existing XQUF update keywords.

Yet another solution could be to stick with what we have, but add map/array update features to XQUF.

Pull request #1701 created #created-1701

14 Jan at 14:07:18 GMT
Add dedication to MSM (action QT4CG-088-01)

I've had this action on my plate for a while. Having written a dedication, there's a follow-up question of where to put it. Having it in only one specification isn't wrong, but it seems slightly odd given that MSM contributed to them all. In the end, I decided to put a full dedication in the XPath specification and link to it from the others.

My rationale for the XPath spec is that it's probably one that everyone reads. Another possibility was the Data Model as it's "foundational" but I think it's less read than XPath.

The published PR won't be right because there are tooling changes required. I've attached a couple of screenshots, one of the full dedication in XPath:


[Screenshot 2025-01-14 at 14-00-08: XML Path Language (XPath) 4.0 WG Review Draft]

And another of the link from the other specs (from XSLT, I think, but they're all the same).


[Screenshot 2025-01-14 at 13-59-08: XSL Transformations (XSLT) Version 4.0]

Pull request #1700 created #created-1700

14 Jan at 14:01:14 GMT
Remove some dead .DS_Store files

I'm not sure how these got checked in...

QT4 CG meeting 105 draft agenda #agenda-01-14

14 Jan at 10:11:15 GMT

Draft agenda published.

Issue #1699 created #created-1699

14 Jan at 03:48:16 GMT
XPath function to calculate edit distance between two strings

I propose a new XPath function to calculate the edit distance between two strings. It could use a specific algorithm, for example fn:levenshtein-distance(s1,s2).

The function could also be designed more generically, like fn:edit-distance(s1, s2, algorithm), where algorithm could be levenshtein, hamming, lcs ... (see edit distance).

Use Case: Schematron Quick Fix when checking glossentry elements against terms defined in glossary. "Your term is not defined in Glossary, did you mean ...".
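
To make the use case concrete, here is a minimal Python sketch of the Levenshtein variant and of a "did you mean" lookup built on it. It is only an illustration of the idea, not a proposed definition of fn:levenshtein-distance; the names levenshtein_distance and suggest_terms and the max_distance threshold are assumptions made for this example.

```python
from typing import Iterable, List


def levenshtein_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn s1 into s2."""
    # previous[j] is the distance between the first i-1 characters of s1
    # and the first j characters of s2.
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            substitution = previous[j - 1] + (0 if c1 == c2 else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def suggest_terms(term: str, glossary: Iterable[str],
                  max_distance: int = 2) -> List[str]:
    """Hypothetical helper: glossary terms within max_distance, closest first."""
    scored = [(levenshtein_distance(term, entry), entry) for entry in glossary]
    return [entry for distance, entry in sorted(scored) if distance <= max_distance]


print(suggest_terms("glosary", ["grammar", "glossary"]))
```

A Schematron quick fix could offer the returned terms as replacement candidates; the threshold would probably need tuning per glossary.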

Thanks, Frank

Issue #1407 closed #closed-1407

13 Jan at 17:53:37 GMT

Improve the spec prose and table of content layout for types

Issue #1698 created #created-1698

13 Jan at 13:53:53 GMT
Allow select attribute for xsl:call-template instruction

The lack of the following feature is something that bothers me from time to time. I hope this is the right place for my proposal. Even though I did some searching, I am not sure whether something similar was discussed before.

I propose to allow a select attribute for xsl:call-template instructions. When the select attribute is set, then the named template is called for each selected item as context item.

When the empty sequence is selected, the template is not invoked.

When the select attribute is omitted, then the instruction works as before (Invoked once and "[...] does not change focus [...]").

For extension instructions derived from named templates, this could work the same way with a prefixed attribute (e.g. xsl:select).

Benefits I see:

  • Change the context without xsl:apply-templates
  • Avoid template parameter with current() as default value (annoying when you have nested named template calls)
  • Avoid xsl:for-each workaround where context just must be adjusted for a single item (no such parameter available, see before)
  • Save an xsl:for-each instruction with this shorter form
  • Harmonize xsl:call-template with xsl:apply-templates concept a little bit

Simple example:

<xsl:template match="elem">
  <xsl:call-template name="t:make-something" select="child-elem"/>
  <!-- ... or as extension instruction: -->
  <t:make-something xsl:select="child-elem"/>
</xsl:template>

<xsl:template name="t:make-something">
  <xsl:context-item use="required" as="element(child-elem)"/>
  <!-- ... -->
</xsl:template>

The call to t:make-something above is equivalent to:

<xsl:template match="elem">
  <xsl:for-each select="child-elem">
    <xsl:call-template name="t:make-something"/>
  </xsl:for-each>
</xsl:template>

Issue #1675 closed #closed-1675

13 Jan at 12:03:50 GMT

CSV parsing

Issue #1677 closed #closed-1677

13 Jan at 12:03:49 GMT

1675 Fixes for CSV parsing

Issue #1673 closed #closed-1673

13 Jan at 11:55:27 GMT

1407 TOC structure for types

Issue #1681 closed #closed-1681

13 Jan at 09:21:11 GMT

Δ in the table of contents

Issue #1691 closed #closed-1691

13 Jan at 09:21:10 GMT

1681 - Delta marker in TOC

Issue #1697 created #created-1697

13 Jan at 00:17:32 GMT
Add documentary names to callback function signatures

If PR #1696 is accepted we can add documentary names to the parameters of callback function signatures, for example fn:filter can become

fn:filter(
  $input as item()*,
  $predicate as fn($item as item(), $position as xs:integer) as xs:boolean?
) as item()*

and we can (if we need to) use the parameter names in the prose

Pull request #1696 created #created-1696

13 Jan at 00:07:47 GMT
1136 Optional names in typed function types

Fix #1136

Pull request #1695 created #created-1695

12 Jan at 23:36:22 GMT
1284 Define streamability of distinct-ordered-nodes

Fix #1284

Issue #1610 closed #closed-1610

11 Jan at 10:48:57 GMT

Some cross references are incorrect

Issue #1683 closed #closed-1683

10 Jan at 09:35:55 GMT

There are validity errors in the function catalog related to csv elements

Pull request #1694 created #created-1694

09 Jan at 23:10:22 GMT
1632 Add xsl:map/@select

Fix #1632

Pull request #1693 created #created-1693

09 Jan at 22:25:52 GMT
1683 Extend xpath-functions schema with CSV components

This was an unsuccessful attempt to fix issue #1683, but the change is still worth making. It extends the aggregated schema for the XPath functions namespace to include definitions for the result of the csv-to-xml function.

Pull request #1692 created #created-1692

09 Jan at 21:55:59 GMT
1680 Fix switch syntax ambiguity

Fix #1680 (as suggested in the issue)

Pull request #1691 created #created-1691

09 Jan at 21:42:57 GMT
1681 - Delta marker in TOC

Fix #1681

Pull request #1690 created #created-1690

09 Jan at 20:28:19 GMT
1688 In "implementation-defined" appendix, fix absent generated link

For F&O the automatically-generated appendix of implementation-defined items should link each such item to the nearest containing section that has a head child as well as an id attribute.

Pull request #1689 created #created-1689

09 Jan at 18:10:34 GMT
1684 Composite merge keys; current-merge-key-array function

Acknowledges that as a result of changes to xsl:sort, xsl:merge now accepts composite merge keys; introduces the current-merge-key-array() function to handle them.

Fix #1684

Issue #1688 created #created-1688

09 Jan at 17:39:39 GMT
In rendered HTML, link to definition is missing its link text

https://qt4cg.org/specifications/xpath-functions-40/Overview.html#impl-def Item 6 contains a sentence that renders as "See ." In the raw HTML, there is a link <a href="#dt-nondeterministic-wrt-ordering"></a> with no link text.

I thought I might make this issue a little more substantive by reporting a second typo or broken link, but I can't find a second one at the moment. :)

Pull request #1687 created #created-1687

09 Jan at 12:16:22 GMT
1672 array:values, map:values: Alternatives

Issue: #1672

Pull request #1686 created #created-1686

09 Jan at 11:46:49 GMT
1685 Pipeline Operator

Issue: #1685

The PR introduces the pipeline operator ->. If we decide to add it, we could drop =!> in a second step and update various examples in the text.

Issue #1685 created #created-1685

09 Jan at 09:46:19 GMT
Pipeline Operator

This is an attempt to find a solution for the discussion in #755, which was originally about defining an expression to bind the context value. It serves as a summary for an upcoming PR.

We have two operators in the language that can be used for pipelining:

  1. With the simple map operator !, single items of an input can be bound to the context value.
  2. With the arrow operator =>, an input can be bound as first argument in a function call.

The current restrictions are:

  A) There is no way to bind a sequence with 0 or more than 1 items to the context value.
  B) We can only bind the input to the first function argument.

In addition, we have introduced the mapping arrow expression =!> to bind single items of an input to the first function argument.

We could generalize and simplify the situation by introducing a dedicated and very basic pipeline operator: A -> B evaluates A to a value, which is bound to the context value before evaluating B.

With the operator, restriction A) would be resolved. Restriction B) would be tackled indirectly, as -> and ! can often be combined. For example, the following examples from the specification could be simplified…

(: current vs. simplified syntax :)
$s => tokenize() =!> fn { `"{.}"` }()
$s -> tokenize(.) ! `"{.}"`

(: current vs. simplified syntax :)
(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn($a) { $a + 1 }() => sum()
(1 to 5) ! xs:double(.) ! math:sqrt(.) ! (. + 1) -> sum(.)

…and we could drop =!> in favor of the new operator.

An equivalent representation for the focus function fn { E } would be fn($c) { $c -> E }.
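
As a rough mental model, the three operators can be sketched in Python, with sequences modelled as flat lists. The function names simple_map, arrow and pipeline are stand-ins invented for this sketch; it only mirrors the binding behaviour described above and ignores everything else about XPath evaluation.

```python
import math
from typing import Any, Callable, Iterable, List


def simple_map(seq: Iterable[Any], step: Callable[[Any], List[Any]]) -> List[Any]:
    """A ! B: evaluate B once per item, with that item as the context value."""
    result: List[Any] = []
    for item in seq:
        result.extend(step(item))  # results are concatenated into a flat sequence
    return result


def arrow(seq: Iterable[Any], func: Callable[..., List[Any]], *args: Any) -> List[Any]:
    """A => f(args): pass the whole sequence as the first argument of f."""
    return func(list(seq), *args)


def pipeline(seq: Iterable[Any], step: Callable[[List[Any]], List[Any]]) -> List[Any]:
    """A -> B: evaluate B once, with the whole sequence as the context value."""
    return step(list(seq))


# (1 to 5) ! xs:double(.) ! math:sqrt(.) ! (. + 1) -> sum(.)
doubles = simple_map(range(1, 6), lambda item: [float(item)])
roots = simple_map(doubles, lambda item: [math.sqrt(item)])
shifted = simple_map(roots, lambda item: [item + 1])
result = pipeline(shifted, lambda value: [sum(value)])
print(result)

# Restriction A: an empty sequence can be bound with ->, but ! never runs its step.
print(pipeline([], lambda value: [len(value)]), simple_map([], lambda item: [item]))
```

In this model, simple_map calls its step once per item, while pipeline calls it exactly once, even for an empty input.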

Issue #1684 created #created-1684

08 Jan at 21:41:35 GMT
[XSLT] Composite merge keys

The changes in PR #1674 to allow composite sort keys automatically propagate to xsl:merge, because the semantics of xsl:merge-key are defined entirely by reference to xsl:sort.

No immediate problem, except that (1) we should acknowledge the fact and point out that composite merge keys are now allowed, and (2) we need to consider the effect on the current-merge-key() function. Its result is the sequence-concatenation of the merge keys for multiple merge sources. The spec says:

the [current merge key] will be a single atomic item if there is a single merge key, or a sequence of atomic items if there are multiple merge keys.

Actually I think that's already wrong, because it forgets that an individual merge key may be an empty sequence. If that happens then the current-merge-key() function is somewhat useless. I suggest we simply document the fact: if there are multiple merge sources generating multiple merge keys and they are not all singletons, then the sequence concatenation of the merge keys may not be especially useful.

We could provide a variant current-merge-key-array() that returns an array of sort key values, one for each xsl:merge-key element, each one being a sequence of atomic items.
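
The loss of information can be illustrated with a small Python sketch, where each merge key value is modelled as a list of atomic items, one list per xsl:merge-key element. The two functions are stand-ins for current-merge-key() and the suggested current-merge-key-array(); the data is invented for the example.

```python
from typing import Any, List, Sequence


def current_merge_key(keys: Sequence[Sequence[Any]]) -> List[Any]:
    """Stand-in for current-merge-key(): sequence concatenation of all keys."""
    return [item for key in keys for item in key]


def current_merge_key_array(keys: Sequence[Sequence[Any]]) -> List[List[Any]]:
    """Stand-in for the suggested variant: one member per merge key."""
    return [list(key) for key in keys]


# Two different key combinations; in each, one merge key is an empty sequence.
first = [["Smith"], []]
second = [[], ["Smith"]]

print(current_merge_key(first), current_merge_key(second))              # same
print(current_merge_key_array(first), current_merge_key_array(second))  # distinct
```

Under these assumptions the concatenated form cannot tell the two cases apart, while the array form keeps one member per key, including the empty ones.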

QT4 CG meeting 104 draft minutes #minutes—01-07

07 Jan at 17:30:00 GMT

Draft minutes published.

Issue #1261 closed #closed-1261

07 Jan at 17:24:17 GMT

Add decimal-divide function

Issue #1671 closed #closed-1671

07 Jan at 17:24:16 GMT

1261 New fn:divide-decimals() function

Issue #1662 closed #closed-1662

07 Jan at 17:22:47 GMT

xsl:sort - add composite sort keys

Issue #1674 closed #closed-1674

07 Jan at 17:22:46 GMT

1662 Allow composite sort keys in xsl:sort

Issue #1621 closed #closed-1621

07 Jan at 17:20:01 GMT

compare() with collations that do not support ordering

Issue #1676 closed #closed-1676

07 Jan at 17:20:00 GMT

1621 Capabilities of Collations

Issue #1678 closed #closed-1678

07 Jan at 17:17:01 GMT

Semantics of element(N, T) where T is a union type

Issue #1679 closed #closed-1679

07 Jan at 17:17:00 GMT

1678 Define element(E,T) and attribute(A,T) in terms of "derives-from"

Issue #1670 closed #closed-1670

07 Jan at 17:11:45 GMT

Action QT4CS-097-02: Enable xtermref links to XSD SCM property names

Issue #1667 closed #closed-1667

07 Jan at 17:08:39 GMT

Invalid XML characters in JSON input

Issue #1669 closed #closed-1669

07 Jan at 17:08:38 GMT

1667 Revise handling of non-XML characters in parse-json

Issue #1668 closed #closed-1668

07 Jan at 17:05:34 GMT

Minor copy edits (no issue raised)

Issue #1649 closed #closed-1649

07 Jan at 17:02:13 GMT

Result type of fn:function-annotations()

Issue #1666 closed #closed-1666

07 Jan at 17:02:12 GMT

1649 result of function annotations

Issue #1650 closed #closed-1650

07 Jan at 16:59:55 GMT

fn:node-kind, fn:type-of: Editorial

Issue #1665 closed #closed-1665

07 Jan at 16:59:54 GMT

1650 Tidy up fn:type-of

Issue #1663 closed #closed-1663

07 Jan at 16:57:47 GMT

Remove DTD/stylesheet distractions at the top of the schema

QT4 CG meeting 104 draft agenda #agenda-01-07

07 Jan at 10:11:30 GMT

Draft agenda published.

Issue #1683 created #created-1683

06 Jan at 11:10:56 GMT
There are validity errors in the function catalog related to csv elements

The build reports:

Processing file:/Volumes/Saxonica/src/qt4cg/qtspecs/specifications/xpath-functions-40/src/function-catalog.xml
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/Volumes/Saxonica/src/qt4cg/qtspecs/specifications/xpath-functions-40/src/function-catalog.xml using class net.sf.saxon.tree.tiny.TinyBuilder
Tree built in 215.684667ms
Tree size: 38655 nodes, 773635 characters, 7637 attributes
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 3 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-002
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 5 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-003
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 3 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-004
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 8 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-005
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 16 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-006
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 24 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-007
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 37 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-008
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 40 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-009
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
  XTTE1512  Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 36 column 7 of generate-qt3-test-set.xsl:
  XTTE1512  One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
  no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-010
Execution time: 532.103792ms
Memory used: 98Mb

Issue #1682 created #created-1682

06 Jan at 11:05:57 GMT
Type Promotion

The description of type promotion in Appendix B.1 has become outdated.

Firstly, the coercion rules no longer invoke type promotion; instead, they use a custom table of "implicit casts". So B.1 is wrong to say that type promotion is invoked by the coercion rules.

Secondly, for selecting an entry in the operator mapping table, I don't think type promotion comes into play.

  • The rules for value comparisons do all the necessary type conversions of operands BEFORE invoking a search of the operator mapping table.
  • The rules for arithmetic operators don't require any type promotion: for numerics, they invoke a function such as op:numeric-add, and it is the definition of this function (not the selection of the function in the mapping table) that invokes type promotion.

The statement in B.1 that 'If the result type of an operator is listed as numeric, it means "the first type in the ordered list (xs:integer, xs:decimal, xs:float, xs:double) into which all operands can be converted by [subtype substitution] and [type promotion]"' seems wrong in general: for example it doesn't cover integer div integer. The result type is actually defined by the rules of the selected function, e.g. op:numeric-divide, and not by the "result type" column of the operator mapping table. Perhaps it should say that the result type is a subtype of numeric, as defined by the particular function.

References to Type Promotion in F&O section 1.6 are also outdated.

The sum() and avg() functions invoke "numeric promotion" to convert all values in the input to a common type, but the rules for doing this aren't entirely clear. In particular, the equivalent expression given for sum() doesn't do what the prose says. For example, given sum() applied to a sequence (X as decimal, Y as decimal, Z as float), the prose says the result is float(X) + float(Y) + Z, whereas the equivalent expression gives float(decimal(X + Y)) + Z, which is not necessarily the same thing.
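
The difference between the two readings can be reproduced in a Python sketch, assuming xs:float is IEEE 754 single precision and xs:decimal can hold the values exactly. The helper names and the sample values are chosen for this illustration only; to_float32 rounds via a struct round trip, because Python has no built-in single-precision type.

```python
import struct
from decimal import Decimal


def to_float32(value) -> float:
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def add_float32(a: float, b: float) -> float:
    """Single-precision addition: add, then round the result to float32."""
    return to_float32(a + b)


def sum_per_prose(x: Decimal, y: Decimal, z: float) -> float:
    """float(X) + float(Y) + Z: promote every item first, then add."""
    return add_float32(add_float32(to_float32(x), to_float32(y)), z)


def sum_per_expression(x: Decimal, y: Decimal, z: float) -> float:
    """float(decimal(X + Y)) + Z: add the decimals exactly, then promote."""
    return add_float32(to_float32(x + y), z)


x, y, z = Decimal("16777217"), Decimal("1"), 0.0  # 16777217 = 2**24 + 1
print(sum_per_prose(x, y, z))       # 16777216.0
print(sum_per_expression(x, y, z))  # 16777218.0
```

The sample values sit just above 2**24, where single precision can no longer represent odd integers, so the early promotion rounds X down before the addition happens.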

Issue #1681 created #created-1681

05 Jan at 23:38:07 GMT
Δ in the table of contents

All the specs say in their first changes section:

Sections with significant changes are marked Δ in the table of contents.

However, these markers are present only in the F&O specification.

Issue #1680 created #created-1680

05 Jan at 19:27:47 GMT
Ambiguous `switch` syntax

Unless I am overlooking some constraint preventing this, an ambiguity has been introduced to the XQuery 4.0 grammar by allowing the SwitchComparand to be omitted per #671/#678.

Here is an example of an ambiguous expression:

switch case A return switch case B return switch case C return D default return E default return F

It can be parsed along the lines of

switch
case A return SWITCH 
case B return switch
              case C return D 
              default return E 
default return F

but also like the following

switch
case A return switch
              case B return SWITCH
              case C return D
              default return E
default return F

Pull request #1679 created #created-1679

05 Jan at 14:28:41 GMT
1678 Define element(E,T) and attribute(A,T) in terms of "derives-from"

Fix #1678

Issue #1678 created #created-1678

05 Jan at 00:24:02 GMT
Semantics of element(N, T) where T is a union type

The semantics of element(N, T) say that to get a match, the type annotation A of the element must be derived from T by restriction. This means you will never get a match if T is a union type.

Furthermore, if T is a complex type, there is no match if the type annotation is a complex type derived by extension from T.

I think this is a simple error in the spec. It should say that derived-from(A, T) must be true. The derived-from() relationship handles union types and derivation by extension correctly.

We do use derived-from when specifying subtyping. This means that an element E can be an instance of element(E, xs:integer), and not be an instance of element(E, xs:numeric), even though element(E, xs:integer) is a subtype of element(E, xs:numeric).

The error seems to have crept in when the rules were redrafted for 4.0. Up to and including 3.1, the semantics of ElementTest and AttributeTest reference the derived-from() function.

Pull request #1677 created #created-1677

04 Jan at 18:32:51 GMT
1675 Fixes for CSV parsing

Fix #1675

Pull request #1676 created #created-1676

03 Jan at 20:08:21 GMT
1621 Capabilities of Collations

Fix #1621

This PR is largely editorial, except that it makes a substantive change to the fn:collation-available function.

Issue #1675 created #created-1675

03 Jan at 16:02:05 GMT

CSV parsing

Pull request #1674 created #created-1674

03 Jan at 13:32:17 GMT
1662 Allow composite sort keys in xsl:sort

Fix #1662

Pull request #1673 created #created-1673

03 Jan at 01:07:40 GMT
1407 TOC structure for types

Addresses part of #1407:

  • Improves the section headings and levels for the Types and Subtyping sections
  • Level-4 headings (and level-5 if there were any) are no longer omitted from the F&O TOC.

There are other suggestions in #1407 regarding the spec prose that are not (yet) implemented.

Changing the CSS to adjust presentation of level-4 and level-5 headings in the TOC is way above my level of CSS competence. There's some very elaborate logic in this area, and anyone who wants to tackle it is welcome.

Issue #1672 created #created-1672

02 Jan at 20:50:26 GMT
array:values, map:values: Alternatives

We still have array:values and map:values in the spec, even though the names were considered suboptimal: When retrieving values of struct(ure(d item))s, one would expect to get not a flat, but a structured result.

A while ago, the items key specifier was introduced to mimic the classical wildcard lookup syntax (making $A?* and $A?items::* equivalent), and I suggest renaming our functions to array:items and map:items:

  $map?*
≍ map:items($map)

  $array?*
≍ array:items($array)

Plan B could be to extend the second argument of map:get (and array:get) to also accept predicate functions…

map:get(
  $map  as map(*),
  $key  as (xs:anyAtomicType|fn(xs:anyAtomicType) as xs:boolean?)
) as item()*

…which would allow us to write:

  $map?a
≍ $map => map:get('a')
≍ $map => map:get(fn { . = 'a' })

  $map?(1 to 5)
≍ $map => map:get(fn { . = 1 to 5 })

  $map?*
≍ $map => map:get(true#0)

(: and things like :)
$map => map:get(fn { . mod 2 = 1 })
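
To illustrate Plan B, here is a small Python sketch of a lookup whose second argument is either a key or a predicate on keys. The name map_get and the dict-based model are assumptions for this example; values are treated as single items, so the result is simply a list of the matching values.

```python
from typing import Any, Callable, Dict, Hashable, List, Union

KeyOrPredicate = Union[Hashable, Callable[[Any], bool]]


def map_get(mapping: Dict[Any, Any], key: KeyOrPredicate) -> List[Any]:
    """Return the values whose key equals `key` or satisfies the predicate."""
    if callable(key):
        # Predicate form: collect every value whose key passes the test.
        return [value for k, value in mapping.items() if key(k)]
    # Plain key form: zero or one value.
    return [mapping[key]] if key in mapping else []


data = {1: "one", 2: "two", 3: "three", 4: "four"}

print(map_get(data, 2))                      # like $map?2
print(map_get(data, lambda k: 1 <= k <= 3))  # like $map?(1 to 3)
print(map_get(data, lambda k: True))         # like $map?*
print(map_get(data, lambda k: k % 2 == 1))   # keys that are odd
```

One design question the sketch leaves open is how a predicate should behave for keys it cannot be applied to, such as a numeric test on a string key.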

Pull request #1671 created #created-1671

01 Jan at 16:50:28 GMT
1261 New fn:divide-decimals() function

Fix #1261

Pull request #1670 created #created-1670

01 Jan at 12:51:37 GMT

Action QT4CS-097-02: Enable xtermref links to XSD SCM property names