@qt4cg statuses in 2025
This page displays status updates about the QT4 CG project from 2025.
See also recent statuses.
QT4 CG meeting 133 draft minutes #minutes—08-19
Draft minutes published.
Issue #2139 closed #closed-2139
Binary comparisons
Issue #2168 closed #closed-2168
2139 Make hexBinary and base64Binary fully comparable
Issue #2162 closed #closed-2162
QT4CG-132-04 Expand the rectangle?area example
Issue #2143 closed #closed-2143
JNodes and Methods
Issue #1714 closed #closed-1714
sibling:: axis. Action Item QT4CG-097-03
Issue #350 closed #closed-350
CompPath (Composite-objects path) Expressions
Issue #119 closed #closed-119
Allow a map's key value to be any sequence
Issue #106 closed #closed-106
Decorators' support
Issue #34 closed #closed-34
Proposal to introduce the set datatype in XPath 4
Issue #2164 closed #closed-2164
Fix return type in `fn:parse-csv` signature
Issue #2072 closed #closed-2072
JNodes: accessing properties
Issue #2170 closed #closed-2170
The current "?>" method call operator is ugly, difficult to read, difficult to find and understand. We have much better alternatives.
Issue #2170 created #created-2170
The current "?>" method call operator is ugly, difficult to read, difficult to find and understand. We have much better alternatives.
One obvious big improvement is to have ==>.
This is:
1. Readable.
2. Distinctly distinguishable.
3. Understandable and intuitive for anyone who has used an OOP language (C++, C#, Java).
4. Expressive of the similarity with the => operator. In the same way that => provides the LHS item as the first argument of the RHS function, the similar-looking but extended ==> provides the LHS map/object as an implicit argument to the RHS method.
5. Because of 1, 2, 3, and 4 above, very little additional learning and understanding effort is required from the XPath user.
Proposed action: Replace the ugly, difficult to read, difficult to find and understand operator "?>" with ==>.
QT4 CG meeting 133 draft agenda #agenda-08-19
Draft agenda published.
Issue #1938 closed #closed-1938
Invoking coerced methods
Issue #2169 created #created-2169
Longest-token rule incorrectly produces `StringInterpolation` delimiter
StringInterpolation currently defines a two-character token, curly right brace + backtick, to follow Expr as a terminator:
StringInterpolation ::= "`{" Expr? "}`"
On other occasions, Expr is followed by a single right curly brace:
EnclosedExpr ::= "{" Expr? "}"
The following applies to tokenization (the "longest-token" rule):
“If the current position is not the end of the input, then return the longest literal terminal or variable terminal that can be matched starting at the current position, regardless whether this terminal is valid at this point in the grammar.”
My concern is that input like
<a>{42}`</a>
is going to be mis-tokenized under the longest-token rule: after 42, the next (longest) token is the two-character StringInterpolation terminator, which however is not a valid terminator of the EnclosedExpr serving as CommonContent of the direct element constructor.
Proposed fix
My proposal is to replace the two-character tokens that introduce and terminate StringInterpolation with single backticks around an EnclosedExpr with no intervening whitespace:
StringInterpolation ::= "`" EnclosedExpr "`" /* ws: explicit */
This replaces both of the two-character delimiters of StringInterpolation, while still describing the intended language, but without causing the longest-token rule to produce a token that cannot be handled afterwards.
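The mis-tokenization described in this issue can be reproduced with a minimal longest-match sketch in Python. The terminal list is an assumption that contains only the delimiters relevant here, not the full XQuery terminal set, and longest_token is a hypothetical helper name.

```python
# Assumed subset of terminals: only the delimiters discussed in this issue.
TERMINALS = ["`{", "}`", "{", "}", "`"]


def longest_token(text: str, pos: int):
    """Return the longest terminal that matches at pos, ignoring grammar context."""
    matches = [t for t in TERMINALS if text.startswith(t, pos)]
    return max(matches, key=len) if matches else None


source = "<a>{42}`</a>"
brace_pos = source.index("}")

# The enclosed expression should end with "}", but the longest match wins.
print(longest_token(source, brace_pos))  # }`
```

With single-backtick delimiters around an EnclosedExpr, the two-character terminals would disappear from the list, and the same position would yield a plain right brace.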
Issue #1736 closed #closed-1736
Add option retain-order=false when constructing maps
Pull request #2168 created #created-2168
2139 Make hexBinary and base64Binary fully comparable
Fix #2139
hexBinary and base64Binary become mutually comparable under all comparison operators: which may affect backward compatibility.
Pull request #2167 created #created-2167
2166 Reinstate lost text for lookup expressions
Fix #2166
Issue #2166 created #created-2166
Lookup expressions: we have deleted too much text
In reverting many of the features previously added to lookup expressions (for example deep lookup and lookup modifiers) we seem to have accidentally lost text that actually defines what the different key specifiers mean; we're left with lots of examples but no actual specification.
I was reading to see what the current spec says about array bound checking: it appears to say nothing.
Issue #2165 created #created-2165
Treat expression: inconsistencies, questionable uses
Related: Michael’s observation in https://github.com/qt4cg/qtspecs/issues/2163#issuecomment-3185618064.
The current spec says for treat:
XPath 4.0 provides an expression called treat that can be used to modify the static type of its operand.
It further mentions the static analysis phase, which has been removed from the specs; maybe we should remove these references.
There are hardly any uses of the expression in the current spec. One is for Absolute Path Expressions:
An expression of the form /PP (that is, a path expression with a leading /) is treated as an abbreviation for the expression self::gnode()/(fn:root(.) treat as (document-node()|jnode()))/PP.
The use seems confusing, as in many cases self::gnode() can only be evaluated at runtime. Maybe we could rewrite it to a variant that coerces the node? It would generally be easier for optimizers to rewrite paths when there is no need to differentiate between treat and coercion (and I have never seen code that catches XPDY0050).
Indeed I think it would be helpful to have a coerce to expression, even if people will rarely use it explicitly. It would allow us to remove all remaining uses of treat as (except, of course, for the expression itself), and we could simplify various examples that use variable declarations only for coercing values.
Pull request #2164 created #created-2164
Fix return type in `fn:parse-csv` signature
In f2e1f48, fn:parse-csv was changed to return an empty sequence when its first argument is an empty sequence. This is however not reflected in the function's return type, which is here changed to parsed-csv-structure-record?.
Issue #2163 created #created-2163
Method calls: `?>` or `=?>`
I propose using =?> for method calls rather than ?>
(a) I like the association with => to call a function item with an implicit first argument; =?> combines selection of an item from a map (?) with function invocation (=>)
(b) ?>, while technically unambiguous, smells strongly of XML processing instructions
Issue #2100 closed #closed-2100
JNodes: functions
Issue #2149 closed #closed-2149
2100 Make innermost, outermost, has-children, path apply to JNodes
Pull request #2162 created #created-2162
QT4CG-132-04 Expand the rectangle?area example
Expands the explanation of the example of method chaining
QT4 CG meeting 132 draft minutes #minutes—08-12
Draft minutes published.
Issue #2132 closed #closed-2132
Error handling in and/or expressions
Issue #2133 closed #closed-2133
2132 error handling in logical expressions
Issue #1996 closed #closed-1996
Lookups, KeySpecifier: add NumericLiteral and ContextValueRef?
Issue #2134 closed #closed-2134
1996 Lookups, KeySpecifier: Literal, ContextValueRef
Issue #2147 closed #closed-2147
2143 Redesign of method calls
Issue #2152 closed #closed-2152
"x" is not an instance of enum("x")
Issue #2154 closed #closed-2154
2152 Revise rules for enumeration types
Issue #2156 closed #closed-2156
2092 Drop map:pair, map:of-pairs, map-pairs
Issue #2135 closed #closed-2135
QT4CG-131-01/02 Expand on example as actioned
Issue #2136 closed #closed-2136
Drop full-width angle brackets
Issue #2137 closed #closed-2137
2136 Drop full-width < and > symbols
Issue #2141 closed #closed-2141
Remove nested paragraphs
Issue #2145 closed #closed-2145
Allow implicit whitespace in StringInterpolation
Issue #2146 closed #closed-2146
Require at least one character in StringTemplateFixedPart
Issue #1062 closed #closed-1062
150bis revised proposal for fn:ranks
Issue #150 closed #closed-150
fn:ranks: Produce all ranks in applying a function on the items of a sequence
Issue #714 closed #closed-714
Function annotations in XSLT
Issue #1698 closed #closed-1698
Allow select attribute for xsl:call-template instruction
Issue #1852 closed #closed-1852
fn:values-except: Return atomic values that occur in A but not in B
Issue #2157 closed #closed-2157
Unicode collation algorithm references
Issue #2158 closed #closed-2158
2157 Editorial updates to F+O §5.5 (Unicode collations)
Issue #2161 created #created-2161
Drop other non-ASCII operators (×, ÷)
Adopted from https://github.com/qt4cg/qtspecs/issues/2136#issuecomment-3135426200:
The feedback for U+00D7 (MULTIPLICATION SIGN, ×) and U+00F7 (DIVISION SIGN, ÷) that we got so far was not very positive either, so I would suggest dropping also those operators; they offer no real added value.
Pull request #2160 created #created-2160
2073 data model changes for JNodes and Sequences
This is a first draft of a PR, giving the data model changes only, for a change to the JNode model affecting maps and arrays with sequence-valued entries. A sequence of length 2 or more now has children representing the items in the sequence. Although there is still an asymmetry between sequences of length 1 and longer sequences, it is more manageable than in the previous model.
QT4 CG meeting 132 draft agenda #agenda-08-12
Draft agenda published.
Issue #2159 created #created-2159
JNodes: Learning from JSONiq?
For those who have not stumbled upon JSONiq yet, I am adding some introductory links:
- https://www.jsoniq.org/docs/JSONiq-usecases/html-single/
- https://www.jsoniq.org/docs/Introduction_to_JSONiq/html/
- https://www.jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html
JSONiq has been designed as a query and update language for JSON data. Its first versions were based on XQuery. Due to its similarities, it may give us some good inspirations for traversing and modifying JNodes.
RumbleDB is a current implementation maintained by Ghislain Fourny (@ghislainfourny).
Pull request #2158 created #created-2158
2157 Editorial updates to F+O §5.5 (Unicode collations)
Fix #2157
Issue #2157 created #created-2157
Unicode collation algorithm references
In the F&O reference to UTS#10 we say incorrectly that: The current version is 9.0.0, dated 2016-05-18.
Similarly for UTS#35 we say incorrectly: The current version is 29, dated 2016-03-15.
In §5.5, functions based on substring matching, we say
"In the definitions below, we refer to the terms match and minimal match as defined in definitions DS2 and DS4 of [[UTS #10]]."
It's not made clear what "the definitions below" is referring to: the terms "match" and "minimal match" are actually used in the rules of the individual functions.
The parenthetical sentence (“collation unit” is equivalent to "collation element" as defined in [[UTS #10]]) is not very elegantly expressed.
Pull request #2156 created #created-2156
2092 Drop map:pair, map:of-pairs, map-pairs
Addresses part of issue #2092.
While the function family map:pair, map:of-pairs, and map:pairs can be handy, they are not necessary, especially now that we have JNodes. They are also very easily user-written:
map:pair => {'key': $key, 'value': $value}
map:of-pairs => map:build($pairs, fn{?key}, fn{?value})
map:pairs => map:for-each($map, fn($k, $v){ {'key': $k, 'value': $v} })
On the grounds that we should avoid providing multiple ways of solving the same problem, I propose dropping these three functions.
Note: in some ways I would have preferred to drop the alternative trio map:entry, map:entries, and map:merge; but two of these are present in the 3.1 specification.
Issue #2021 closed #closed-2021
XSLT: Move "Patterns" section into "Template Rules"
Issue #2078 closed #closed-2078
2031/2025 JNodes: inconsistency in data model taxonomy, definitions
Pull request #2155 created #created-2155
2150 Define patterns for JNodes
Fix #2150 Fix #2010
Issue #2151 closed #closed-2151
2021 Move the section on Patterns to a more logical place in the spec
Issue #2153 closed #closed-2153
Remove limitations from `enum` type
Pull request #2154 created #created-2154
2152 Revise rules for enumeration types
Fix #2152
Revises the rules for enumeration types: they are now structural subtypes of xs:string rather than nominative subtypes. The main effect is that "x" instance of enum("x") is now true. The change is motivated by use cases involving XSLT pattern matching, where strict "instance of" matching is required, with no coercion.
Issue #2153 created #created-2153
Remove limitations from `enum` type
"Tolkein" isn't an actual instance of enum("Tolkein"), it's only coercible to that type, and when types are used in paths it has to be an actual instance. I think we need to fix that.
Originally posted by @michaelhkay in #2150
It seems strange that there is no way to create a value that is an instance of a singleton enumeration type. Only casting (and annotation, which is a kind of casting too) is available.
On the other hand:
let $x as enum("foo") := "foo"
return ( ()
, $x instance of enum("foo")
, $x instance of xs:string
, atomic-equal($x, "foo")
)
(: true(), true(), true() :)
This means that "foo" should be an instance of enum("foo"), and then enum("foo") is a subtype of xs:string.
And the following is unclear (from 3.2.6 Enumeration Types):
- It follows from these rules that an atomic item will only satisfy an instance of test if it has the correct type annotation, and this typically requires an explicit cast. So the expression "red" instance of enum("red", "green", "blue") returns false, while "red" cast as enum("red") instance of enum("red", "green", "blue") returns true.
Probably, a more narrow reason is that a singleton enumeration type is an "anonymous atomic type derived from xs:string by restriction using an enumeration facet" that permits only one value. Yes, this makes type checking for an enum more complex, but seems not more complex than casting.
Anyway, is it possible to make any instance of xs:string also an instance of the corresponding singleton enumeration type? (that is, essentially make it so that this casting happens "hidden", if required).
Issue #2152 created #created-2152
"x" is not an instance of enum("x")
The usefulness of enum() types is limited by the fact that the string "x" is not actually an instance of enum("x"), it is only coercible to that type. This means that in contexts where strict type matching is required (for example, in XSLT patterns), either (a) you can't use enum() the way you would like, or (b) you use it and fail to understand why it fails.
Pull request #2151 created #created-2151
2021 Move the section on Patterns to a more logical place in the spec
This PR simply moves the section on Patterns to a more logical place in the XSLT specification. Unless anyone objects, I will merge the PR without waiting for group approval, so that I can use the result as a baseline for further work on patterns and templates, hopefully giving a better diff baseline.
Issue #1776 closed #closed-1776
Using `?` and `??` in XSLT patterns
Issue #2150 created #created-2150
XSLT Patterns to match JNodes
Supersedes #1776.
Part of the motivation for introducing JNodes was to make rule-based recursive-descent transformation of JSON structures much easier. This issue addresses part of that capability, namely defining patterns that match JNodes (and perhaps improving the patterns that match maps and arrays).
In general I think the patterns that match JNodes should be distinct from the patterns that match XNodes; although we have unified path expressions so that a/b can select either an XNode or a JNode, I think there would be too much scope for confusion if match="a/b" were able to match a JNode as well as an XNode.
My first idea would be to allow the syntax match="jnode(a)" for a template rule that matches JNodes having a selector property of "a", similarly jnode(a/b), jnode(a//b), jnode(a/*/b), jnode(a[x="c"]) with semantics defined in much the same way.
But there's a question how this relates to type patterns. With type patterns, we can already do match="type(jnode(record(Author, Title, *)))" which matches a JNode whose content is of type record(Author, Title, *). Where syntactically possible we allow type patterns to be abbreviated, so this would become match="jnode(record(Author, Title, *))" which conflicts with the above.
An analogy with element(N, T) might suggest match="jnode(K, V)" where K constrains the selector property of the JNode, and V constrains its content property. So we might have match="jnode(books, array(record(Author, Title, *)))" to match a JNode whose selector is "books" and whose content is of type array(record(Author, Title, *)).
At the same time, while matching maps by a type such as match="record(Author, Title, *)" works well, I find that this is often accompanied by a predicate so it becomes match="record(Author, Title, *)[Author='Tolkein']". It would be nice to express this more concisely and readably, perhaps as match="record(Author[.='Tolkein'], Title, *)"
Issue #115 closed #closed-115
Lookup operator on arrays of maps
Pull request #2149 created #created-2149
2100 Make innermost, outermost, has-children, path apply to JNodes
Fix #2100
Issue #2148 created #created-2148
fn:base-uri: Raise errors?
The (rather old) test case K2-BaseURIFunc-29 indicates that invalid URIs may result in an error:
<test-case name="K2-BaseURIFunc-29">
<description> Use an URI in an xml:base element that is a valid URI, but an invalid HTTP URL.
Since implementations aren't required to validate specific schemes but allowed to,
this may either raise an error or return the URI.
</description>
<created by="Frans Englich" on="2007-11-26"/>
<dependency type="spec" value="XQ10+"/>
<test><![CDATA[let $i := fn:base-uri(<anElement xml:base="http:\\example.com\\examples">Element content</anElement>)
return $i eq "http:\\example.com\\examples" or empty($i)]]></test>
<result>
<assert-true/>
</result>
</test-case>
I raise this issue in the qtspecs repository as I wondered whether we should clarify how invalid URIs are to be handled by fn:base-uri.
If it’s the test that is misleading, I will be glad to correct the comment, or add an error code.
Pull request #2147 created #created-2147
2143 Redesign of method calls
Although issue #2143 envisaged redefining method calls in terms of JNodes, this PR takes a different approach.
The "magic" performed by the lookup operator when the entry in a map is annotated %method is dropped. Instead we have a new operator ?> which is essentially defined as a macro: in simple cases $map ?> method (X) is defined to be essentially an abbreviation for ($map ? method)($map, X).
I have used the operator ?> suggested by Christian, but in some ways I prefer the operator we had originally, =?>, because (a) there is a stronger analogy with =>, and (b) ?> brings up images of XML syntax for processing instructions.
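The macro expansion described in this PR, where $map ?> method (X) abbreviates ($map ? method)($map, X) in simple cases, can be modelled outside XPath. The Python sketch below is only an analogy: a dict stands in for the map, and call_method, rectangle, and its entries are hypothetical names chosen for illustration.

```python
def call_method(record: dict, name: str, *args):
    """Model of `$map ?> name(args)`: look up the entry, then pass the map itself first."""
    method = record[name]          # corresponds to ($map ? method)
    return method(record, *args)   # corresponds to (...)($map, X)


# Hypothetical record with two function-valued entries.
rectangle = {
    "width": 3,
    "height": 4,
    "area": lambda self: self["width"] * self["height"],
    "scale": lambda self, k: {**self, "width": self["width"] * k, "height": self["height"] * k},
}

print(call_method(rectangle, "area"))                             # 12
print(call_method(call_method(rectangle, "scale", 2), "area"))    # 48
```

Because the containing map is passed explicitly, no special behaviour is needed in the lookup itself, which is the point of dropping the %method "magic".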
Pull request #2146 created #created-2146
Require at least one character in StringTemplateFixedPart
The grammar rules for StringTemplate are as follows:
StringTemplate ::= "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
/* ws: explicit */
StringTemplateFixedPart ::= ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")*
/* ws: explicit */
StringTemplateVariablePart ::= EnclosedExpr
/* ws: explicit */
But StringTemplateFixedPart should not be allowed as a zero-length token, because this is causing an ambiguity: the input `` currently can be parsed as any of
<StringTemplate>`<StringTemplateFixedPart/>`</StringTemplate>
<StringTemplate>`<StringTemplateFixedPart/><StringTemplateFixedPart/>`</StringTemplate>
<StringTemplate>`<StringTemplateFixedPart/><StringTemplateFixedPart/><StringTemplateFixedPart/>`</StringTemplate>
and so on.
In order to ensure an unambiguous result, StringTemplateFixedPart should be required to consist of at least one character. Also the /* ws: explicit */ on StringTemplateVariablePart is superfluous. The grammar rules thus should be changed to:
StringTemplate ::= "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
/* ws: explicit */
StringTemplateFixedPart ::= ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")+
/* ws: explicit */
StringTemplateVariablePart ::= EnclosedExpr
Pull request #2145 created #created-2145
Allow implicit whitespace in StringInterpolation
Production StringInterpolation currently does not allow implicit whitespace:
StringInterpolation ::= "`{" Expr? "}`"
/* ws: explicit */
But this is likely not intended - all examples in the spec do have whitespace adjacent to the braces.
This change thus removes /* ws: explicit */ in order to allow implicit whitespace.
Issue #2143 created #created-2143
JNodes and Methods
I propose changing the mechanism for invoking methods to take advantage of JNodes.
Instead of the current magic rule for the "?" operator, we move the magic to the rules for dynamic function calls: in a dynamic function call F(X, Y), if the value of F is a JNode J whose content property is a function item annotated with %method, then the function body is executed with the parent of J (that is, the containing map or array) as the context value. It seems much cleaner semantics to make this a rule for dynamic function calls rather than for map lookup.
The downside is that the call syntax now would become ($rectangle/area)() rather than $rectangle?area(). Unfortunately $rectangle/area() parses as $rectangle/(area()). So we might want to invent some better syntax.
Issue #2142 closed #closed-2142
Markup fixes in the HTML output
Pull request #2142 created #created-2142
Markup fixes in the HTML output
- Moved all the processor comments to the end; this avoids having a comment before <!DOCTYPE HTML> which is frowned upon because ... reasons
- The XSLT stylesheet was adding links for sections, but so was the main stylesheet, so they were coming out nested.
- Don't attempt to link to functions or elements inside titles. (This also results in nested links)
- Attempt to "unwrap" <p> elements around things that can't be inside a <p>, like various sorts of lists. It's a bit ugly, but it makes for much cleaner HTML.
I'm just going to merge this because there's no practical way to see the consequences in the PR.
Apologies in advance that this will introduce some spurious diffs. I think those will go away after the build finishes and after you've rebased your PRs on the new stylesheets.
Pull request #2141 created #created-2141
Remove nested paragraphs
I have no idea why the DTD allows <p> inside <p> but I assume this is a markup error and not intentional.
Issue #2140 closed #closed-2140
Restore diffs
Pull request #2140 created #created-2140
Restore diffs
With great appreciation to the fine folks at DeltaXignia!
Issue #2138 closed #closed-2138
NodeTest `type(X|Y)`: double parentheses needed
Issue #2139 created #created-2139
Binary comparisons
It seems confusing to me that deep-equal & atomic-equal return different results than eq for binary types:
let $hex := xs:hexBinary(''), $base64 := xs:base64Binary('')
return (
(: false :) deep-equal($hex, $base64),
(: false :) atomic-equal($hex, $base64),
(: true :) $hex eq $base64
)
The rules say:
fn:deep-equal
If both $i1 and $i2 are instances of xs:hexBinary or xs:base64Binary, $i1 eq $i2 returns true.
This can be interpreted in two ways, but it seems to mean that $i1 and $i2 need to have the same type?
fn:atomic-equal
One of the following conditions is true:
- $value1 and $value2 are both instances of xs:hexBinary.
- $value1 and $value2 are both instances of xs:base64Binary.
op:binary-equal
op:binary-equal( $value1 as (xs:hexBinary | xs:base64Binary), $value2 as (xs:hexBinary | xs:base64Binary) ) as xs:boolean
The function returns true if $value1 and $value2 are of the same length, measured in binary octets, and contain the same octets in the same order. Otherwise, it returns false.
As atomic-equal(xs:double(3), xs:float(3)) returns true, I would also expect true for binary items with the same contents.
Related (for numbers): #986
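The octet-based comparison quoted for op:binary-equal can be illustrated with a small Python sketch. It is an analogy rather than an implementation of the F&O rules: the helper names are hypothetical, and Python bytes objects stand in for the value space shared by xs:hexBinary and xs:base64Binary.

```python
import base64


def octets_from_hex(lexical: str) -> bytes:
    """Decode a hexBinary-style lexical form into its octets."""
    return bytes.fromhex(lexical)


def octets_from_base64(lexical: str) -> bytes:
    """Decode a base64Binary-style lexical form into its octets."""
    return base64.b64decode(lexical, validate=True)


def binary_equal(value1: bytes, value2: bytes) -> bool:
    """True if both values have the same length and the same octets in the same order."""
    return value1 == value2


# The same five octets ("Hello") written in both lexical forms.
hex_value = octets_from_hex("48656C6C6F")
base64_value = octets_from_base64("SGVsbG8=")

print(binary_equal(hex_value, base64_value))  # True
```

Under this reading the lexical form is irrelevant once the octets are decoded, which matches the expectation stated in the issue for binary items with the same contents.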
Issue #2138 created #created-2138
NodeTest `type(X|Y)`: double parentheses needed
It is not currently possible to write a NodeTest as (for example) child::type(xs:string | xs:integer). As a consequence of the way the grammar is defined, two pairs of parentheses are needed: child::type((xs:string | xs:integer))
It would be easy enough to fix this usability glitch.
Pull request #2137 created #created-2137
2136 Drop full-width < and > symbols
Fix #2136
Issue #2136 created #created-2136
Drop full-width angle brackets
The option of using full-width angle brackets doesn't seem to have attracted great enthusiasm, and now that we have the precedes and follows operators, I suggest we drop them. Nearly all cases of plain < and <= can be replaced with lt and le.
One of the problems with using non-ASCII characters is not just that it's hard to type them, it's also quite hard to recognise them by their appearance. There are so many characters that look a bit like less-than and greater-than symbols.
Pull request #2135 created #created-2135
QT4CG-131-01/02 Expand on example as actioned
Expands on the let binding example
QT4 CG meeting 131 draft minutes #minutes—07-29
Draft minutes published.
Issue #2130 closed #closed-2130
Proposed new operator keywords: precedes, follows
Issue #2080 closed #closed-2080
Destructuring let clauses: Bind remaining values
Issue #2119 closed #closed-2119
2080 allow let $($head, $tail)
Issue #2087 closed #closed-2087
Adaptive serialization: JNodes
Issue #2114 closed #closed-2114
2087 Change adaptive serialization of JNodes
Issue #2084 closed #closed-2084
Steps when the context value contains multiple nodes
Issue #2115 closed #closed-2115
2084 - document order of axis steps when context value is a sequence
Issue #2082 closed #closed-2082
parse-html options parameter conventions
Issue #2117 closed #closed-2117
2082 parse-html options
Issue #2099 closed #closed-2099
Choosing names for the jnode function and the jnode type
Issue #2129 closed #closed-2129
2099 Rename fn:jnode and jnode-type
Issue #2086 closed #closed-2086
Can the ¶value property of a JNode be (or contain) a JNode?
Issue #1978 closed #closed-1978
Function `map:build` does not allow expressing the dependency of a value on its key. Some simple types of maps cannot be built.
Issue #1946 closed #closed-1946
We need examples of a record with an entry that is a %method and invoking this method with the result it must produce
Issue #1514 closed #closed-1514
Editorial: optional position argument in function signature for for-each and other HOF
Issue #1175 closed #closed-1175
XPath: Optional parameters in the definition of an inline function
Issue #2102 closed #closed-2102
Type diagrams: drop/add parentheses
Issue #2113 closed #closed-2113
2102 Make type labels in diagram consistent
Pull request #2134 created #created-2134
1996 Lookups, KeySpecifier: Literal, ContextValueRef
Revised; closes #1996
Issue #2063 closed #closed-2063
1996 Lookups, KeySpecifier: Literal, ContextValueRef
Pull request #2133 created #created-2133
2132 error handling in logical expressions
Note: depends on #2115 because of terminology changes.
Fix #2132
Issue #2132 created #created-2132
Error handling in and/or expressions
In §2.4.5 we introduced the concept of guarded expressions, and included the rules
- In an and expression, the second operand is guarded by the value of the first operand being true.
- In an or expression, the second operand is guarded by the value of the first operand being false.
This change is not mentioned in 4.11 Logical Expressions. For example, this still says
The order in which the operands of a logical expression are evaluated is [implementation-dependent]
and the truth tables in 4.11 are unchanged from 3.1.
(We have also introduced defined terms "and expression" and "or expression", and should use them here).
My understanding of the new rule for guarded expressions is that with (A and B), if A is false then the result is false even if B raises an error; this is not what the truth table says.
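The reading of guarded expressions given in this issue can be sketched in Python, with operands passed as zero-argument callables so that evaluation can be deferred. This is an illustration of the stated understanding only, not the normative rule; guarded_and, guarded_or, and raises_error are hypothetical names.

```python
def raises_error():
    """Stand-in for a second operand that raises a dynamic error."""
    raise ValueError("dynamic error in second operand")


def guarded_and(first, second) -> bool:
    """The second operand is evaluated only when the first is true."""
    if not first():
        return False
    return bool(second())


def guarded_or(first, second) -> bool:
    """The second operand is evaluated only when the first is false."""
    if first():
        return True
    return bool(second())


print(guarded_and(lambda: False, raises_error))  # False
print(guarded_or(lambda: True, raises_error))    # True
```

In both calls the error in the second operand is never triggered, which is the outcome the unchanged truth tables in 4.11 do not describe.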
Issue #2131 closed #closed-2131
XSLT `xsl:for-each-group` `split-when` variables
QT4 CG meeting 131 draft agenda #agenda-07-29
Draft agenda published.
Issue #2131 created #created-2131
XSLT `xsl:for-each-group` `split-when` variables
Currently the spec says:
The expression is supplied with two variables: $group is set to the contents of the current group being constructed, and $next is the next item in the population.
1. Do these supplied variables shadow the user-defined variable of the same name? (probably they should, but it is not mentioned) For example:
<xsl:variable name="group" select="(1,2,3)"/>
<xsl:for-each-group select="$input" split-when=" $next = $group "> ... </xsl:for-each-group>
Maybe these variables should be in the fn: namespace?
2. To access the current grouping key and current group the functions fn:current-grouping-key and fn:current-group are used. What is the rationale of introducing variables instead of functions to access the group and item in split-when?
Issue #2066 closed #closed-2066
Cells in the F&O signature blocks should be vertically aligned to the top
Issue #2122 closed #closed-2122
2066 CSS changes for function prototypes
Pull request #2130 created #created-2130
Proposed new operator keywords: precedes, follows
The operators << and >>, in my opinion, are poorly known, and challenging for developers working in XSLT. Many punctuation-based operators have aliases in ordinary-language quasi-equivalents, but << and >> lack any ordinary verbal equivalents, and break this principle.
The attached proposal offers to make precedes a keyword equivalent to << and follows a keyword equivalent to >>. This means that //title[. << following-sibling::isbn[1]] can now be expressed as //title[. precedes following-sibling::isbn[1]]
Pull request #2129 created #created-2129
2099 Rename fn:jnode and jnode-type
Renames the function fn:jnode as fn:jtree, and the item type jnode-type() as jnode()
Fix #2099
Issue #2128 created #created-2128
JNodes and XSLT Streaming
In principle, there's no reason why JTrees shouldn't be streamable. However, it's an immense amount of work both for the specification and for an implementation, so I intend to rule it out.
That then leaves the question about deciding streamability in the case of constructs that could be processing JNodes.
I think we might be able to define a rule something like: "if a template rule (etc.) is declared with streamable="yes", this amounts to a declaration that it will only be used to process XNodes, even if without this declaration it would also be capable of processing JNodes."
The details depend on other aspects of how we define template rule processing and pattern matching for JNodes. At present I'm inclined to say that the pattern syntax for matching JNodes should be distinct from that for matching XNodes, so that a template can only match one or the other, never both.
Issue #2127 created #created-2127
JNodes: Include atomic items
With the introduction of JNodes, it feels like a natural step to extend the processing of documents, collections and databases to JSON data. Currently, the roots of JNodes are restricted to maps and arrays. We should generalize them and include support for atomic items.
The Background:
JSON data types are not restricted to maps (objects) and arrays; they can also be strings, numbers and booleans. As a consequence, a json-doc('input.json') call may also return atomic types. When iterating over JSON input, we should ensure that there is no need for additional type checks to ensure that code does not fail:
for $i in 1 to 10
for $doc in json-doc($i || '.json')
(: where $doc instance of (map(*)|array(*)) :)
return $doc/a/b/c
I assume it would be no substantial change to open $json/step
for types other than nodes, maps and arrays. The only tricky special case is null
, but it is converted to an empty sequence, which is why json-doc('document-with-single-null-value.json')/a/b/c
already succeeds.
To address any concerns that JNodes do not make sense for standalone atomic items: There is a certain analogy to XML text nodes, which can also be created without serving any immediate purpose, but are helpful and necessary for mapping the entire XML data model.
Issue #2126 created #created-2126
Absolute path expressions with JTrees
I'm thinking there may be a case for disallowing or restricting the use of absolute path expressions over JTrees.
Firstly, there's a strong likelihood that users are only dimly aware of where the root of the tree actually is. They are likely to imagine, if they parse a json text, that the root of the tree will be the root of that text. But in fact the root is wherever they started path navigation from, which may be different. It's likely to be particularly confusing if you move out of JNode space into map/array territory and then back again. Requiring an explicit call on root(), or on ancestor::*[last()], might mean that users think more carefully about it. This makes it clearer that the root is not some kind of absolute fixed point, it is simply "the place where you started your current journey".
Certainly, I don't think we should allow /x
to implicitly construct a JNode wrapping the context item: the '/' here is redundant and means the user almost certainly doesn't understand what they are doing.
Another consideration is that if we restrict leading /
to work only with XTrees, then we reinstate a lot of static type checking capability that we have currently lost.
Issue #2125 created #created-2125
csv-to-xml() - untestable results
The machinery for generating test cases from spec examples is failing in the case of the csv-to-xml()
tests.
The test generator produces an expected result assertion of
<assert-xml ignore-prefixes="false"><![CDATA[<substitute-for-unparseable-result-xml xmlns="http://www.w3.org/2010/09/qt-fots-catalog"/>]]></assert-xml>
The same problem occurs with some fn:analyze-string tests.
The failure basically means that parse-xml() on the expected test results has failed. (Stylesheet generate-qt3-test-set.xsl line 122).
I think that the problem is that the build is running Saxon in schema-aware mode, and this applies to all XML parsing including the parse-xml() function, so we're getting a schema validation failure where we really don't want to be validating in the first place.
Unfortunately we can't selectively switch parse-xml() validation off unless we move to a later (and not yet stable) Saxon version, and rewriting the whole stylesheet to not be schema-aware would be painful.
Pull request #2124 created #created-2124
573 Functions to Construct Trees
A first cut at providing a functional approach to XNode and XTree construction.
At this stage I'm interested in comments on the general approach, not the fine detail (some of which, e.g. namespace inheritance, still needs work.)
Pull request #2123 created #created-2123
2051: XSLT group by cluster
Companion PR to #2051.
I have opted for only two examples, hoping they catalyze the imagination of what is possible. Comments welcome.
Pull request #2122 created #created-2122
2066 CSS changes for function prototypes
Fix #2066
Issue #2121 closed #closed-2121
2066 fo signature table format
Pull request #2121 created #created-2121
2066 fo signature table format
CSS changes to improve alignment of complex signatures, e.g. fn:round
Fix #2066
Issue #1774 closed #closed-1774
Nomenclature: relabelling
Issue #1775 closed #closed-1775
Navigation in JSON trees
Pull request #2120 created #created-2120
2007 Revised design for xsl:array
Revised design for xsl:array
based on usage experience.
Fix #2007
Issue #2118 closed #closed-2118
2080 Tweak the rules for destructuring variable bindings
Pull request #2119 created #created-2119
2080 allow let $($head, $tail)
Fix #2080
With let $($x, $y, $z)
, $z binds to the rest of the sequence.
With let $[$a, $b, $c]
, FOAR0001 is raised if the array is too short.
Note, XPath and XQuery should be reviewed separately as the source text for let expressions is different.
Pull request #2118 created #created-2118
2080 Tweak the rules for destructuring variable bindings
- When binding to a sequence, the last variable binds to the rest of the sequence.
- When binding to an array, an FOAR0001 occurs if there are more variables than array members.
Fix #2080.
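The two binding rules above can be modelled outside XPath. The following Python sketch is illustrative only: `bind_sequence` and `bind_array` are hypothetical helper names, Python lists stand in for XDM sequences and arrays, and binding a surplus sequence variable to an empty sequence is an assumption that the PR summary does not state.

```python
class FOAR0001(Exception):
    """Stand-in for the XPath error code FOAR0001 (array index out of bounds)."""


def bind_sequence(names, items):
    """Bind variables to a sequence; the last variable takes the remaining items."""
    bindings = {}
    for i, name in enumerate(names):
        if i == len(names) - 1:
            bindings[name] = list(items[i:])        # rest of the sequence
        else:
            # Assumption: a variable with no corresponding item binds to ().
            bindings[name] = list(items[i:i + 1])
    return bindings


def bind_array(names, members):
    """Bind variables to array members; more variables than members is an error."""
    if len(names) > len(members):
        raise FOAR0001("more variables than array members")
    return dict(zip(names, members))


print(bind_sequence(["x", "y", "z"], [1, 2, 3, 4, 5]))
print(bind_array(["a", "b"], [1, 2, 3]))
```

In this model, `$z` receives `[3, 4, 5]`, while `bind_array(["a", "b", "c"], [1, 2])` raises the stand-in error.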
Pull request #2117 created #created-2117
2082 parse-html options
- Use non-optional types such as xs:boolean for options parameters
- Use regular error codes for bad options
- Drop error code relating to the discontinued method option.
Fix #2082
Pull request #2116 created #created-2116
2112 Refine/revise the rules for get() in node tests
Proposed revision of the rules for get() in node tests.
Mainly editorial clarification; but also changes the rules for the focus - the expression is now evaluated with absent focus to ensure an error in preference to unexpected results.
Fix #2112
Pull request #2115 created #created-2115
2084 - document order of axis steps when context value is a sequence
Clarifies that the results are in document order and deduplicated.
Fix #2084
Pull request #2114 created #created-2114
2087 Change adaptive serialization of JNodes
Fix #2087
Pull request #2113 created #created-2113
2102 Make type labels in diagram consistent
Fix #2102
Drops the parentheses in map(), array(), function(*)
QT4 CG meeting 130 draft minutes #minutes—07-22
Draft minutes published.
Issue #2036 closed #closed-2036
Streamability of xsl:map instruction
Issue #2037 closed #closed-2037
2036 Add rule for streamability of xsl:map
Issue #2104 closed #closed-2104
JNodes: unwrapping
Issue #2111 closed #closed-2111
2104 Point out places where jnode-content is called implicitly
Issue #2098 closed #closed-2098
JNodes: combining node sequences
Issue #2110 closed #closed-2110
2098 Clarify when jnode() is called implicitly
Issue #2103 closed #closed-2103
JNodes functions: 0-arity variant
Issue #2109 closed #closed-2109
2103 Allow operand of JNode accessors to be omitted or empty
Issue #2107 closed #closed-2107
QT4CG-129-01: Actions from review of PR2094
Issue #2108 closed #closed-2108
QT4CG-123-01 Add example of library module using methods
Issue #2106 closed #closed-2106
Add note on the impossibility of cyclic instances
Issue #2105 closed #closed-2105
Fix type of `fn:schema-type-record` field `constructor`
Issue #2097 closed #closed-2097
`jnode` as a subtype of `node`
Issue #2089 closed #closed-2089
JNode properties: Presentation
Issue #2112 created #created-2112
JNodes: get()
The documentation for get()
says for XNodes…
A selector can also take the form
get(Expr)
. The contained expression Expr is evaluated with the focus of the containing axis step (so its value is independent of the specific XNode being tested). The result of the expression after atomization must be a sequence of zero or more xs:QName
values (otherwise a type error [err:XPTY0004] is raised). An XNode satisfies the selector if its node kind is the principal node kind of the axis and its node name is among the values returned by the selector expression.
…and for JNodes…
If the selector takes the form
get(Expr)
, then the contained expression Expr is evaluated with the focus of the containing axis step (so its value is independent of the specific JNode being tested). A JNode satisfies the selector if its ·selector· property is equal to one or more of the values returned by the selector expression, under the rules of the fn:atomic-equal function.
Nitpicking:
- With the existing rule for JNodes, I assume that no match would be returned for
EXPR := [ 'a', 'b' ]
. I would thus propose to atomize EXPR first and compare it afterwards (see below).
- “A JNode satisfies the selector if [the value of] its ·selector· property”
Next, I would like us to unify the rules for XNodes and JNodes. The rationale (besides “simpler rules are simpler to explain”):
- XPath is well-known for being forgiving. Maybe we can maintain that tradition for name tests, by tolerating input other than QNames.
- By using identical rules for JNodes and XNodes, it will be easier to process input that mixed XNodes and JNodes. An example:
(<xml>ignored</xml>, { 1: 'one' }, [ 'one' ])/get(1)
I would propose to simplify the joint rules to the following XPath expression:
some(
data(EXPR),
atomic-equal(?, if(. instance of node()) then node-name() else jnode-selector())
)
Finally, I assume that the focus information can be utilized in the get expression, right? Is it correct to assume that all of the following expressions will return <a2/>
?
let $xml := <xml><a3/><a2/><a1/></xml>
let $name := #a2
return (
$xml/get(#a2),
$xml/get($name),
$xml/get(node-name()[. = $name]),
$xml/get(xs:QName('a' || position())),
$xml/get(if(position() = 2) { $name } ),
$xml/get(xs:QName(`a{ last() - 1 }`))
)
QT4 CG meeting 130 draft agenda #agenda-07-22
Draft agenda published.
Issue #1786 closed #closed-1786
A case study for XSLT transformation of JSON: the transpiler
Issue #2025 closed #closed-2025
Combine the concepts of pins/labels and modified lookups
Pull request #2111 created #created-2111
2104 Point out places where jnode-content is called implicitly
Fix #2095
This PR is purely editorial: it adds notes and examples showing where jnode-content() is (or is not) called implicitly.
Pull request #2110 created #created-2110
2098 Clarify when jnode() is called implicitly
Fix #2098
Pull request #2109 created #created-2109
2103 Allow operand of JNode accessors to be omitted or empty
Fix #2103
Pull request #2108 created #created-2108
QT4CG-123-01 Add example of library module using methods
Pull request #2107 created #created-2107
QT4CG-129-01: Actions from review of PR2094
Pull request #2106 created #created-2106
Add note on the impossibility of cyclic instances
Responding to an action at today's meeting, this adds a note to the effect that although types can contain cyclic references, instances can not.
QT4 CG meeting 129 draft minutes #minutes—07-15
Draft minutes published.
Pull request #2105 created #created-2105
Fix type of `fn:schema-type-record` field `constructor`
The constructor
of fn:schema-type-record
is currently shown with a type of
fn(xs:anyAtomicType?) as xs:anyAtomicType?
However this does not cover list type constructors, returning multiple occurrences. It should thus be
fn(xs:anyAtomicType?) as xs:anyAtomicType*
A test exists that requires this: schema-type-005
Issue #2104 created #created-2104
JNodes: unwrapping
Related (but not identical to) https://github.com/qt4cg/qtspecs/issues/2095:
As Michael indicated in https://github.com/qt4cg/qtspecs/issues/2095#issuecomment-3069173742, the implicit unwrapping of accessed/iterated JNode results may already be defined in the current spec, but it may need to be further clarified. Examples:
let $jnode := { 'array': [ 1, 2 ] }/array
return (
(: FLWOR expressions :)
for member $m in $jnode return $m,
(: Functions :)
array:size($jnode),
(: Lookups :)
$jnode?1
)
Issue #2054 closed #closed-2054
JPath expression
Issue #2103 created #created-2103
JNodes functions: 0-arity variant
Similar to fn:name
and other accessor functions, the new JNode functions (fn:node-content
, fn:node-position
, fn:node-selector
) should be gifted with a 0-arity variant.
Issue #2102 created #created-2102
Type diagrams: drop/add parentheses
The current presentation of the data types is inconsistent (https://qt4cg.org/specifications/xpath-datamodel-40/Overview.html#types-hierarchy):
It includes GNode
, XNode
, and JNode
, attribute
, document
etc. (without parentheses), but function(*)
, array(*)
and map(*)
. Shouldn’t we remove the parentheses from function(*)
etc., or add them to the other types?
Or maybe we should even change function(*)
to Function, etc.
Labels | Types
-- | --
GNode | gnode()
XNode | node()
JNode | jnode()
attribute | attribute(), attribute(*), attribute(a), …
document | document-node(), document-node(*), …
function() | function(*), function(xs:int) as xs:int, …
array() | array(*), array(xs:int), …
map(*) | map(*), map(xs:int, xs:int), …
Issue #2011 closed #closed-2011
675(part): Add XSLT static typing rules for new kinds of XPath expression
Issue #2038 closed #closed-2038
Drop dependency of fn:apply-templates on the default mode
Issue #2043 closed #closed-2043
2038 Tweak the rules for fn:apply-templates references to modes
Issue #2101 created #created-2101
Named record types: drop constructors, complete list
The spec defines various built-in named record types (https://qt4cg.org/specifications/xpath-functions-40/Overview.html#id-built-in-named-record-types):
key-value-pair
load-xquery-module-record
parsed-csv-structure-record
random-number-generator-record
schema-type-record
uri-structure-record
Suggestions:
- The spec says in https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-named-record-types:
Named record types implicitly create a constructor function that can be used to create instances of the record type.
I would propose excluding the constructors for built-in types. It will save us a lot of bulky code in the implementations, for functions that will hardly be used, as they all refer to return types of existing functions. Ironically, the only exception might be key-value-pair
, but it is redundant anyway (we already have map:pair
… unless it is dropped, together with the record type, by #2092).
- A type is missing for the result of
fn:divide-decimals
, and we should suffix key-value-pair
with -record
(provided we will keep it in the spec).
Issue #2014 closed #closed-2014
QT4CG-122-01 Add notes, examples, and rationale for xsl:select
Issue #2003 closed #closed-2003
Conditional entries in map constructors
Issue #2094 closed #closed-2094
2003 Generalize Map Constructors
Issue #2083 closed #closed-2083
2054 Generalized Path Expressions
Issue #2031 closed #closed-2031
2025 JNodes
Issue #1307 closed #closed-1307
For symmetry, add functions array:scan-left and array:scan-right
Issue #2057 closed #closed-2057
Steps: variable element names
Issue #2035 closed #closed-2035
Recursive record types: unrealistic example in XPath spec
Issue #2096 closed #closed-2096
2035 Drop unworkable example of recursive record types
Issue #2100 created #created-2100
JNodes: functions
With #2083, some XQFO functions were generalized for JNodes:
fn:distinct-ordered-nodes
fn:generate-id
fn:root
fn:siblings
fn:transitive-closure
Others are still pending (to be completed):
fn:has-children
fn:innermost
fn:outermost
fn:path
We should…
- analyze the rules of the existing functions (for example, what happens if both XNodes and JNodes are used in fn:transitive-closure?)
- add more functions.
Issue #2099 created #created-2099
Choosing names for the jnode function and the jnode type
As a late-breaking change to PR #2083, the item type syntax for matching JNodes has been changed to jnode-type()
, to avoid a clash with the name of the function fn:jnode
.
We might prefer to resolve this clash in a different way.
Issue #2098 created #created-2098
JNodes: combining node sequences
Internal questions and feedback on the union
operator, triggered by the JNodes proposal:
- Will { 1: 2 } union { 3: 4 } be allowed, or will it be { 1: 2 }/. union { 3: 4 }/. ?
- Combining maps and arrays: one might expect { 1: 2 } union { 3: 4 } to result in { 1: 2, 3: 4 }.
- If union et al. are enhanced anyway, couldn’t they be generalized for sequences? (1, 2) union 3 → (1, 2, 3).
…which I answered as follows:
- I guess no; the conversion to JNodes is needed.
- …a good reason why we should not implicitly coerce maps/arrays to JNodes.
- For atomic-only sequences, it could be equivalent to fn:distinct-values((A, B)). For heterogeneous sequences, it gets tricky: How should <a>1</a> union 1e0 be combined? Similar to how functions like fn:min are defined, it could be the first item that determines how the remaining input is combined (but the operation would not be commutative anymore; A union B might yield different results than B union A).
QT4 CG meeting 129 draft agenda #agenda-07-15
Draft agenda published.
Issue #366 closed #closed-366
Support xsl:use-package with xsl:package-location
Issue #2097 created #created-2097
`jnode` as a subtype of `node`
It seems that introducing jnode
by extending node
is more consistent than introducing jnode
as a data type that is a sibling to node
.
I would like to hear comments about pros and cons of this approach.
@ruv wrote:
I wonder what if
jnode
was a subtype of node
@michaelhkay wrote:
Then all operations on node would become available for jnode, including many that obviously don't make sense, for example getting the in-scope namespaces, applying schema validation, etc etc.
This does not seem to be a problem. There are many features that make sense for one node kind (or type) and don't make sense for others.
For example,
- fn:name() does not make sense for document-node(), comment(), text() (but applies to them and returns the empty string);
- fn:in-scope-prefixes() applies only to a node that is an element();
- the constructor document { } does not accept an attribute node.
Thus, nodes that are a subtype of jnode can have their own restrictions.
jnode
as subtype of node
There can be four direct subtypes of jnode
:
map-node
array-node
map-entry
array-member
(or maybe let's call it array-entry)
So, jnode
is a union of them: jnode = map-node | array-node | map-entry | array-entry
.
Nodes of the type map-node
and array-node
are similar to document-node
. Their parent is always ()
.
The child axis of a jnode
can contain only nodes of the type map-entry | array-entry
.
An advantage of this approach is that there is no need to introduce XNode and GNode, along with the corresponding confusion, such as having to say that node()
matches XNode, but not GNode (in the general case).
This probably also allows us to specify XSLT for jnodes more seamlessly.
Pull request #2096 created #created-2096
2035 Drop unworkable example of recursive record types
Fix #2035
Issue #2095 created #created-2095
JNodes: result processing
By playing around with the new JNodes syntax, I noticed that the somewhat bulky function fn:jnode-content
needs to be used a lot to process the result of a path traversal, basically every time when coercion is not possible or does not make sense (iterations, any operation based on item()*
).
An example:
{ 'Catania': { 'nomi': ('Alfredo', 'Andrea') } }
- Count the number of persons in Catania:
./Catania/nomi => jnode-content() => count()
- List all persons in upper case:
for $nome in .//nomi => jnode-content()
return upper-case($nome)
To prevent users from selectively resorting to the shorter lookup syntax…
.?Catania?nomi => count()
for $nome in .?Catania?*?nomi
return upper-case($nome)
…we could include another pseudo-function that returns the content/value instead of a JNode:
./Catania/content(nomi) => count()
for $nome in .//content(nomi)
return upper-case($nome)
One drawback would be that this would violate the current principle that the righthand side is only a filter.
Pull request #2094 created #created-2094
2003 Generalize Map Constructors
Allows conditional and repeated entries in a map constructor.
Fix #2003
Issue #2093 created #created-2093
XQFO: structuring
The way the XQFO functions are structured is becoming increasingly arbitrary. Examples:
- Processing sequences:
fn:doc
- Processing nodes:
fn:string
- Processing QNames:
fn:in-scope-prefixes
- Parsing and serializing:
fn:xsd-validator
For at least 10% of the functions, it will be difficult to find a good categorization (categorization is a challenging topic in itself), but maybe we can improve the status quo.
We can certainly tackle this at a later stage; this issue is only about reminding us of the task.
Issue #1715 closed #closed-1715
Array Lookups: partial removal of out-of-bounds checks
Issue #1995 closed #closed-1995
Consistency: array lookups
Issue #1872 closed #closed-1872
Arrays: members → values / entries?
Issue #1871 closed #closed-1871
Arrays and maps: consistency
Issue #2092 created #created-2092
Drop map:pair, map:of-pairs, map:pairs, array:members, array:of-members
I propose dropping the three functions map:pair
, map:of-pairs
, map:pairs
, together with the built-in record type fn:key-value-pair
.
With the introduction of JNodes, I think these are redundant.
- In place of map:pairs($map), use $map/*.
- In place of map:pair($key, $value), use {$key : $value}/*
- In place of map:of-pairs, use map:build($jnodes, jnode-selector#1, jnode-content#1)
I'm proposing to keep the map:entry, map:entries, and map:merge trio which also do much the same thing.
Similarly, I propose dropping array:members
and array:of-members
, and "value records":
- In place of array:members($array), use $array/*
- In place of array:of-members, use array:build($jnodes, jnode-content#1)
(Alternatively: keep the functions but define them to return and consume JNodes, rather than key-value and value records.)
Issue #2046 closed #closed-2046
Promote ".." to a primary expression
Issue #2076 closed #closed-2076
TOC: Interaction
Issue #2091 closed #closed-2091
ToC changes per issue #2076
Pull request #2091 created #created-2091
ToC changes per issue #2076
Fix #2076
Hopefully this is an improvement!
Issue #2090 closed #closed-2090
What does it mean to send an `encoding` serialization parameter to `fn:serialize`?
Issue #2090 created #created-2090
What does it mean to send an `encoding` serialization parameter to `fn:serialize`?
We say that fn:serialize()
returns a string, so I'd expect it to be UTF-8 (or UTF-16, or whatever the implementation's common character set is for strings). I don't see any mention of the meaning of encoding.
Issue #2089 created #created-2089
JNode properties: Presentation
Maybe the presentation of XNode and JNode properties could be aligned.
The XDM uses square brackets for XML node properties:
Document node properties are derived from the infoset as follows:
base-uri The value of the [base URI] property, if available, […]
…and the character ¶ for JNode properties:
JNode has the following properties: ¶parent: a JNode […]
Would it make sense to always use ¶ or square brackets, or are the kinds of properties we talk about too different to be aligned? In the latter case, we may need to clarify this in the spec, or add some words on (possibly non-existing) GNode properties.
Issue #2088 created #created-2088
File Module: Feedback, Observations
1. “Regular files” (QT4CG-128-01)
Add POSIX reference.
2. Permissions (QT4CG-128-02)
All functions should be checked with regard to permission handling. Due to the variety of file systems and the programming languages that operate on them, it may turn out that file:not-found
and file:io-error
are all we can offer.
3. file:is-absolute
The rule says: “A path is absolute if it does not need to be combined with other path information, such as the current directory, to locate a file.”
Thus, file:is-absolute('/')
must not return true on Windows systems, as the drive letter is missing.
Other rules of the spec that refer to absolute file paths should reflect this.
4. file:resolve-path
Additionally, Rule 1 “If $path
is an absolute path, it is returned unchanged.” contradicts the final sentence, which states that a separator must be added to directory paths:
…to be continued.
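Point 3 can be compared against an existing path library. The sketch below uses Python's pathlib pure-path classes, which apply each platform's rules without touching the file system. It only illustrates where one library draws the line and is not a statement about how file:is-absolute must behave; the variable names are chosen for this example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# POSIX: "/" is the file system root, so no other path information is needed.
posix_root = PurePosixPath("/").is_absolute()

# Windows: "/" has a root but no drive, so it still depends on the current drive.
windows_root = PureWindowsPath("/").is_absolute()

# Windows: with a drive letter and a root, the path is complete.
windows_drive_root = PureWindowsPath("C:/").is_absolute()

print(posix_root, windows_root, windows_drive_root)  # True False True
```

This matches the reading in point 3: a path consisting only of a separator is absolute on POSIX systems but not on Windows.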
Issue #2087 created #created-2087
Adaptive serialization: JNodes
The (proposed) spec currently says (for the adaptive output method)
A JNode is serialized by serializing its ¶value property.
I propose changing the output to
JNode(k:v)
where k is the serialization of the selector property and v is the serialization of the value property.
The rule for the JSON output method remains the same.
The reason for the change is so that people can see when a query returns a JNode as distinct from returning its value.
Issue #2086 created #created-2086
Can the ¶value property of a JNode be (or contain) a JNode?
The data model allows the ¶value property of a JNode to be (or contain) a JNode. But can it actually happen, and if so, what are the consequences?
I think it can happen. Although fn:JNode can't be applied directly to a JNode, it is possible to construct a map or array in which the entries/members are (or contain) JNodes. We can then wrap such an array or map in a JNode using the fn:JNode function, and the child axis applied to this containing array will return JNodes that have JNodes as their ¶value properties.
While the results may be confusing, I don't think they are harmful (and someone may find an imaginative way of making use of such a structure). For the time being therefore, I propose to allow it, perhaps with an explanatory note to point out any dangers.
Should atomization of a JNode unwrap multiple layers? We currently say that if a JNode J has a ¶value V, then the atomization of J is the atomization of V. I see no particular reason to change that rule, but again, it's an edge case we might draw attention to.
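The atomization rule quoted above already unwraps nested JNodes if it is applied recursively. The following Python sketch models this under stated assumptions: the `JNode` class is a toy that carries only the value property, Python lists stand in for both sequences and arrays, and maps (whose atomization is an error in XDM) are not modelled.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class JNode:
    """Toy JNode carrying only the value property."""
    value: Any


def atomize(item):
    """Atomize an item: JNodes delegate to their value, lists are flattened."""
    if isinstance(item, JNode):
        # Rule from the text: the atomization of J is the atomization of V.
        return atomize(item.value)
    if isinstance(item, (list, tuple)):
        return [atom for member in item for atom in atomize(member)]
    return [item]  # an atomic item atomizes to itself


nested = JNode(JNode([1, JNode("two")]))
print(atomize(nested))  # [1, 'two']
```

No special case for JNodes inside JNodes is needed in this model: each layer is removed by the same rule.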
QT4 CG meeting 128 draft minutes #minutes—07-08
Draft minutes published.
Issue #2085 closed #closed-2085
Fix markup errors in the EXPath file: specification
Pull request #2085 created #created-2085
Fix markup errors in the EXPath file: specification
The FOS schema does not allow an fos:example containing an fos:test to omit the fos:result element. I’ve inserted
<fos:result>���</fos:result>
as a placeholder to mark the problem. I also had to move some of the fos:errors sections to a different location.
@ChristianGruen apologies for just pushing these in. I wanted to get the specs building again.
Issue #2070 closed #closed-2070
Map build patch
Issue #2016 closed #closed-2016
File Module: Incorporate changes
Issue #2077 closed #closed-2077
2016 File Module: Incorporate changes
QT4 CG meeting 128 draft agenda #agenda-07-08
Draft agenda published.
Issue #2084 created #created-2084
Steps when the context value contains multiple nodes
We have two conflicting statements in the spec.
§4.6.4 says: "The step expression S is equivalent to ./S. Thus, if the context value is a sequence containing multiple nodes, the semantics of a step expression are equivalent to a path expression in which the step is always applied to a single node."
§4.6.5 says: "When the context value for evaluation of a step includes multiple nodes, the step is evaluated separately for each of those nodes, and the results are combined without reordering."
I'm not sure which is intended: does S mean ./S, or .!S?
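The practical difference between the two readings shows up when the context value is out of document order or contains duplicates. The Python sketch below is a toy model, not spec text: nodes are integers that encode document position, `CHILDREN` is a hypothetical tree, and `slash` and `bang` are illustrative names for the two interpretations.

```python
# Toy tree: each node is identified by its position in document order.
CHILDREN = {
    0: [1, 3],  # root has children a (1) and b (3)
    1: [2],     # a has child x (2)
    3: [4],     # b has child y (4)
}


def child_step(node):
    """Evaluate the step child::* for a single context node."""
    return CHILDREN.get(node, [])


def slash(context, step):
    """Model of ./S: apply the step per node, deduplicate, sort into document order."""
    results = {n for node in context for n in step(node)}
    return sorted(results)


def bang(context, step):
    """Model of .!S: apply the step per node and concatenate without reordering."""
    return [n for node in context for n in step(node)]


context = [3, 1, 3]  # b, a, b: out of document order, with a duplicate
print(slash(context, child_step))  # [2, 4]
print(bang(context, child_step))   # [4, 2, 4]
```

Under the first reading the result is `x, y`; under the second it is `y, x, y`.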
Pull request #2083 created #created-2083
2054 Generalized Path Expressions
This proposal (which has the JNodes proposal as its baseline) is a first cut at defining generalised steps and path expressions that handle both XNodes and JNodes in a uniform way.
The proposal adds functionality to path expressions (using "/") but does not yet remove the corresponding functionality from lookup expressions (using "?") - that will follow in a subsequent draft.
The changes are largely confined to XPath section §4.6.
Obviously there is much scope to add notes and examples. There is also a need to reorganise sections so concepts are introduced before they are referenced.
Issue #2082 created #created-2082
parse-html options parameter conventions
In most functions, the options parameters have types such as xs:boolean
and xs:string
. But in parse-html
, they are xs:boolean?
and xs:string?
Issue #2081 closed #closed-2081
Destructuring let combined with for clause
Issue #2081 created #created-2081
Destructuring let combined with for clause
We may need to look at expressions of the following kind…
let $($a, $b) := (1 to 6) ! string()
for $i in 1 to 3
return ($a, $b, $i)
…which currently yield an exception:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 12.1 beta
Java: Eclipse Adoptium, 17.0.14
OS: Windows 11, amd64
java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 32
at org.basex.query.var.QueryStack.set(QueryStack.java:124)
at org.basex.query.QueryContext.set(QueryContext.java:577)
at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:147)
at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:79)
at org.basex.query.scope.MainModule$1.next(MainModule.java:65)
at org.basex.query.QueryContext.next(QueryContext.java:395)
at org.basex.query.QueryContext.lambda$6(QueryContext.java:629)
at org.basex.query.QueryContext.run(QueryContext.java:770)
at org.basex.query.QueryContext.cache(QueryContext.java:609)
Issue #2080 created #created-2080
Destructuring let clauses: Bind remaining values
With the new LetSequenceBinding, LetArrayBinding and LetMapBinding clauses, single items of an evaluated expression can be partially bound:
let $($a, $b) := (1, 2, 3)
let $[$a, $b, $c] := [ 1, 2, 3 ]
let ${$a, $b} := { 'a': 1, 'b': 2, 'c': 3 }
It would be helpful to be able to also bind all remaining items, for example with the three-dot syntax. If we allow this syntax for maps, the result could be a submap with all entries except for the ones that were bound by previous bindings:
(: $first: 1, $remaining: (2, 3) :)
let $($first, $remaining...) := (1, 2, 3)
(: $first: 1, $remaining: (2, 3) :)
let $[$first, $remaining...] := [ 1, 2, 3 ]
(: $a: 1, $remaining: { 'b': 2, 'c': 3 } :)
let ${$a, $remaining...} := { 'a': 1, 'b': 2, 'c': 3 }
Issue #2079 created #created-2079
Extend EQName with optional prefix
A QName has three components, prefix, uri, and local, and it's sometimes useful to be able to specify all three.
I suggest extending the definition of EQName to allow the format Q{uri}prefix:local, where the optional prefix is documentary only, but is present in the resulting QName value.
There may be a need to look at places where EQNames are used and to describe the consequences of using this format.
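To make the proposed format concrete, here is a rough Python sketch of a parser for it. Everything in it is an assumption made for illustration: the `QName` tuple, the `parse_eqname` helper, and the simplified regular expression, which does not enforce the NCName rules and does not normalize whitespace in the URI.

```python
import re
from typing import NamedTuple, Optional


class QName(NamedTuple):
    """Three-part QName value: namespace URI, optional prefix, local name."""
    uri: str
    prefix: Optional[str]
    local: str


# Simplified pattern for Q{uri}prefix:local with an optional prefix.
_EQNAME = re.compile(r"Q\{([^{}]*)\}(?:([^:{}\s]+):)?([^:{}\s]+)")


def parse_eqname(text):
    """Parse Q{uri}local or the proposed Q{uri}prefix:local form."""
    match = _EQNAME.fullmatch(text)
    if match is None:
        raise ValueError(f"not an EQName: {text!r}")
    uri, prefix, local = match.groups()
    return QName(uri, prefix, local)


print(parse_eqname("Q{http://example.com/ns}ex:item"))
print(parse_eqname("Q{http://example.com/ns}item"))
```

In this sketch the prefix does not take part in resolving the namespace; it is simply carried along in the resulting value, as the proposal describes.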
Issue #2078 created #created-2078
2031/2025 JNodes: inconsistency in data model taxonomy, definitions
Per chair request, I'm raising an issue on PR #2031 (on issue #2025), even though it has not been adopted by the CG.
I strongly support the JNodes proposal. But in its current state, I have concerns about the fundamentals. If I am right, or only partly right, adjustments will have cascading effects.
I realize that throughout the specs we use the "Definition" rubric loosely, but, for the points I raise below, I would ask that we at least aspire to a more robust definition of "definition." I draw on the classic Aristotelian model, where a good definition should specify the definiendum's genus ("definiendum" = "the thing to be defined"), then supply only those predicates needed to distinguish the definiendum from other species of that genus. The classic example is the definition of "human being" as a "rational animal." The animal is the genus, and the adjective "rational" delimits human beings from non-human animals. No need to get hung up on details -- that's the gist of what informs my comments below.
The PR proposes the new top-level structure:
- GNode
- XNode
- JNode
The first term is defined:
[Definition: The term generic node or GNode is a collective term for XNodes (more commonly called simply nodes) representing the parts of an XML document, and JNodes, often used to represent the parts of a JSON document.]
This definition attempts to define things outside the scope of the definiendum. It is presented here as a kind of abstract umbrella category for more specific things. Not a big deal; we carry on:
[Definition: An XNode, more commonly referred to simply as a node, represents a construct found in an XML document. There are seven kinds: document nodes, element nodes, attribute nodes, text nodes, comment nodes, processing instruction nodes, and namespace nodes] [Definition: A JNode represents an encapsulation of a value in a tree of maps and arrays, such as might be obtained by parsing a JSON document. XDM maps and arrays, however, are more general than those found in JSON.]
Each of these two definitions says not what the definiendum is, but rather what it does ("represents": like a member of parliament represents constituents? -- confusing). It also includes buffer words that introduce intermediaries between the definiendum and the thing you would think most immediate to it: "construct," "found," "encapsulation," "value."
More difficult is the fact that, like GNode, an XNode is an abstract category and not a thing in itself. But a JNode, we learn later, is not an abstract category, but an actual thing, with properties. So at the top level of the taxonomy, we have an inconsistency, between abstract categories that have no instantiation, versus those that do, and the two are put in parataxis.
The definition of JNode is not well formulated. It restricts itself to "a value in a tree of maps and arrays" but not to maps and arrays themselves. Does the quoted phrase mean a map entry or an array member? Or the value within said entry or member?
Slight tangent: in the specs' definition of "value," the term is not really defined, but simply said to be synonymous with "sequence." But in practice the word substitution doesn't work. More often, the specs use "value" in a more restricted, common-sense meaning, to describe a two-term relationship. A thing "owns" a value and some datum inhabits the role of that thing's value. X has value Y. Y is value of X. We run into problems with the ambiguous word "value." Currently a JNode encapsulates a value (see above). But it also has the property (we learn later) of value. So the value has a value?
The JNode definition is sharpened, not in the data model, where it should be, but in the opening sentence of XSLT section 20: "A JNode is a wrapper around a map or array, or around a value that appears within the content of a map or array." This raises the question, what is a wrapper? And content? But it also raises the question about the relationship between JNode and map and between JNode and array, and the juxtaposition of JNode with XNode accentuates the difference. We would never say that an XNode is a wrapper around an element, an attribute, etc. The inconsistency is of a piece with the confusion I've pointed out above concerning the taxonomy of the data model.
Before I propose a solution, I need to probe a similar problem that already exists in the specs:
[Definition: A function is an item that can be called. ]
The word "called" is set in boldface, as if it is a technical term defined elsewhere. It is not, and is rarely used in the specs. What is it for something to be callable? Non-callable? To my mind, we do not have a proper definition of "function," and it is fair game for adjustment. As we have done. In 4.0 we have promoted the function and its proper parts into the topmost level of the data model taxonomy (with adjustments to a few definitions).
[Definition: An array item (also called simply an array) is a function item that represents an array.]
This suffers from the same flaws as GNode and XNode ("represents"), and is tautologous. In the version 3.1 definition of "map" we had the same problem, but the version 4.0 definition at least avoids the tautology.
So, to sum, we have definitions that aren't, inconsistency in our data model taxonomy, and a variety of other problems.
A different approach
We all intuit that the new taxonomy GNode - (XNode | JNode) is meaningful, useful, and important. Arrays and maps really are trees as much as they are functions.
Let all three terms GNode, XNode, and JNode be defined as abstract categories.
Just as XNode is subdivided into specific xnodes, let JNode subdivide into four specific jnodes:
- map
- map entry
- array
- array member
Adopt the same approach we do for xnodes, and define each of the four on its own terms. Define map - map entry and array - array member along lines similar to the approach adopted in 6.6 to define element - attribute (quite analogous!). We have wrestled over having to have both sequence and selector properties. But with this new approach, we are not stuck. Only map and array jnodes require a sequence property. Map entry and array member jnodes require only the selector property, not the sequence.
This approach is extensible. Suppose we have a proposal for a new JNode. It's a blork, and every blork has one or more cheegs, each one of which has one or more drazers. We simply define three more JNodes: blork, cheeg, drazer. We make sure that the properties for each are suited to what they are individually (the same way we do for the 7 types of XNodes).
One more step, the most controversial: drop Map Items and Array Items from the Function Items category. Yes, there is a fundamental way in which maps and arrays behave like (non-map/array) functions, but there are also equally fundamental ways in which maps and arrays behave like XNodes. If we do not need to define maps and arrays subordinate to XNodes/nodes, then why should we define them subordinate to functions? JNodes have dual citizenship.
An alternative taxonomy is to drop the concept of GNode altogether, and let there be four kinds of item:
- item
  - anyAtomicType
  - XNode
    - attribute
    - document
    - element
    - text
    - comment
    - processing-instruction
    - namespace
  - JNode/JFunction
    - map
    - map entry
    - array
    - array member
  - function(*)
Pull request #2077 created #created-2077
2016 File Module: Incorporate changes
Includes general refactorings.
Closes #2016
Issue #2076 created #created-2076
TOC: Interaction
Based on user feedback that I got, it seems that the solution to expand/collapse a sub-TOC could be more intuitive.
@ndw Would it be possible to remove the rightmost arrow icon, and to simply expand the subentries when a TOC entry is clicked? This would also decrease the number of icons, and it would remove the current, somewhat confusing behavior that a TOC entry is also expanded/collapsed when the empty area to the right of the arrow is clicked.
We could keep the arrow on top level to be able to expand/collapse all entries.
Issue #2075 created #created-2075
Editorial notes (incremental)
- The »Summary of Changes« sections contain outdated information that refers to the presentation of the specs. Maybe we don’t really need them:
  - Use the arrows to browse significant changes since the 3.1 version of this specification.
  - Sections with significant changes are marked Δ in the table of contents. New functions introduced in this version are marked ➕ in the table of contents.
…to be continued.
Issue #2074 closed #closed-2074
JPath operator
Issue #2074 created #created-2074
JPath operator
Christian (I'm not sure where....) has proposed using an operator other than `?`, for example `?/`, for paths involving JNodes.
This would eliminate some of the non-orthogonalities introduced in the interests of backwards compatibility. For example, the result could always be in document order with duplicates eliminated (which is not currently true for `A?B`). `A?/B` could then be truly synonymous with `A?/child::B`, with no ifs and buts. The result would always be a sequence of JNodes, they would only be unwrapped if used in a context (such as arithmetic) where coercion forces the unwrapping.
This might (perhaps?) also enable a tidier syntax for filtering by type: the current syntax `A?~[sequenceType]` is rather clunky.
Issue #2073 created #created-2073
JNodes and Sequences
The JNode model as currently proposed doesn't handle sequences very elegantly: specifically maps and arrays whose entries/members contain values of more than one item.
Also (but separate) handling of empty maps, arrays, and sequences isn't ideal.
Consider the map {"a": 1, "b": [2], "c": (3, 4), "d": ([5], [6]), "e": (7, [8]), "f": []}
Applying `child::*` to this map gives you six JNodes, as you would expect, with selector properties "a", "b", "c", "d", "e", "f". After that, things get complicated.
- The JNode `child::a` has a value of `1`, and no children.
- The JNode `child::b` has a value of `[2]`, and has one child, with selector=1, value=2, position=1.
- The JNode `child::c` has a value of (3,4), and no children.
- The JNode `child::d` has a value of `([5], [6])`, and has two children. The first child has selector=1, position=1, value=5; the second has selector=1, position=2, value=6. Each of these two children itself has one child.
- The JNode `child::e` has a value of `(7, [8])`, and has one child. The child has selector=1, position=2, value=8.
- The JNode `child::f` has a value of `[]` and no children.
There is logic to this, but it isn't easy to explain. I don't at the moment have any clear ideas for improving matters, but raise the issue in the hope that we can come up with ideas.
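One way to read the cases listed above is as a single rule: walk the items of a JNode's value, and let every array or map among them contribute its members or entries as children, with `position` recording which item of the sequence the child came from. The following is a minimal Python sketch of that reading, not the proposal's definition. The stand-ins are assumptions of the sketch: a tuple plays the role of an XDM sequence, a list an array, a dict a map, and the names `JNode` and `children` are hypothetical. It reproduces only the child counts and the selector/position/value triples listed for `a` through `f`, and takes no position on what lies a level deeper.

```python
from typing import Any, NamedTuple


class JNode(NamedTuple):
    """Illustrative stand-in carrying the three properties listed above."""
    selector: Any
    position: int
    value: Any


def as_sequence(value):
    """Treat a non-tuple as a one-item sequence (tuple = XDM sequence here)."""
    return value if isinstance(value, tuple) else (value,)


def children(value):
    """Children contributed by each array (list) or map (dict) in the value."""
    result = []
    for position, item in enumerate(as_sequence(value), start=1):
        if isinstance(item, dict):
            for key, entry in item.items():
                result.append(JNode(key, position, entry))
        elif isinstance(item, list):
            for index, member in enumerate(item, start=1):
                result.append(JNode(index, position, member))
    return result


data = {"a": 1, "b": [2], "c": (3, 4), "d": ([5], [6]), "e": (7, [8]), "f": []}

for node in children(data):
    # Print each top-level child's selector and the children of its value.
    print(node.selector, children(node.value))
```

Under this reading, atomic items in a sequence simply contribute nothing, which is why `c` has no children and `e` has a single child at position 2.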
Issue #835 closed #closed-835
Review names of record types
Issue #2040 closed #closed-2040
XQuery context value declaration
Issue #2050 closed #closed-2050
2040 Fix context value declaration issues
QT4 CG meeting 127 draft minutes #minutes—07-01
Draft minutes published.
Issue #2072 created #created-2072
JNodes: accessing properties
If we decided to introduce a custom JPath expression for JNodes (see #2054), we could possibly use the classic lookup operator to access the properties of JNodes:
For example, the expression…
let $data := { 'name': 'Achab' }
return $data/name
…would result in a JNode…
JNode(
  value := 'Achab'
  parent := JNode($data)
  selector := 'name'
  position := 1
)
…and we could use `$name?selector` to retrieve the selector. This way, we could possibly go without the specific functions `fn:JNode-content`, `fn:JNode-selector` and `fn:JNode-position`.
In principle, the classic lookup operator could be further extended to access properties of XNodes.
Issue #967 closed #closed-967
XPath Appendix I: Comparisons
Issue #1021 closed #closed-1021
Extend `fn:doc`, `fn:collection` and `fn:uri-collection` with options maps
Issue #1583 closed #closed-1583
JSON: Parsing and serializing numbers, often undesired E notation
Issue #1903 closed #closed-1903
`fn:scan-left`, `fn:scan-right`: missing steps
Issue #1283 closed #closed-1283
77b Update expressions
Pull request #2071 created #created-2071
77c deep update
Proposes a new fn:update function that can handle both JNodes and XNodes.
(this is a branch on a branch, so I don't know how well the diff'ing will work; but look in F&O for the fn:update function)
Pull request #2070 created #created-2070
Map build patch
Small edits to map:build
- fix parameter name
- remove surplus blank lines
- add example with multiple keys returned by key function (a map of sequences)
QT4 CG meeting 127 draft agenda #agenda-07-01
Draft agenda published.
Issue #2068 closed #closed-2068
Editorial notes
Issue #2069 closed #closed-2069
1970, 2068 Editorial notes
Pull request #2069 created #created-2069
1970, 2068 Editorial notes
#1970, Closes #2068.
As this PR includes changes that should have been part of #1970 (and minor other fixes), I will immediately merge the PR. If someone objects, I will be happy to revert the change.
Issue #2068 created #created-2068
Editorial notes
- #2017: Key should also be mandatory for `array:sort-by`
…to be continued.
Issue #2059 closed #closed-2059
Literal QNames: Adaptive serialization
Issue #2060 closed #closed-2060
2059 Literal QNames: Adaptive serialization
Issue #2017 closed #closed-2017
`fn:sort-by`: Observations
Issue #2062 closed #closed-2062
2017 fn:sort-by: Observations
Issue #1970 closed #closed-1970
Editorial notes
Issue #2065 closed #closed-2065
1970 Editorial notes
Issue #2056 closed #closed-2056
Implicit Whitespace in `MarkedNCName` and `QNameLiteral`
Issue #2064 closed #closed-2064
2056 Implicit Whitespace in MarkedNCName and QNameLiteral
Issue #2058 closed #closed-2058
Literal QNames: Annotations
Issue #2061 closed #closed-2061
2058 Literal QNames: Annotations
Issue #2067 closed #closed-2067
Fix 'TODO' entries in the function catalog from PR 2013
Pull request #2067 created #created-2067
Fix 'TODO' entries in the function catalog from PR 2013
Issue #2009 closed #closed-2009
xsl:variable implicit document nodes
Issue #2015 closed #closed-2015
2009 Avoid constructing document node when it makes no sense
Issue #2045 closed #closed-2045
Functions taking "." as default argument, when "." is empty
Issue #2049 closed #closed-2049
2045 Context value can be an empty sequence
Issue #748 closed #closed-748
Parse functions: consistency
Issue #2013 closed #closed-2013
748 Parse functions: consistency
Issue #1942 closed #closed-1942
37 Support sequence, array, and map destructuring declarations
Issue #37 closed #closed-37
Support sequence, array, and map destructuring declarations
Issue #2055 closed #closed-2055
37 Sequence, Array, and Map destructuring
Issue #2066 created #created-2066
Cells in the F&O signature blocks should be vertically aligned to the top
If a function parameter type (e.g. in map:build) has a long type that wraps to multiple lines, the parameter name and default are center aligned. A similar issue will happen with the type if the default value wraps.
Setting vertical align to top will fix this.
Pull request #2065 created #created-2065
1970 Editorial notes
Closes #1970
Editorial. The only controversial change may be to rename the second parameter of `map:build` from `$keys` to `$key`.
Pull request #2064 created #created-2064
2056 Implicit Whitespace in MarkedNCName and QNameLiteral
Closes #2056
Issue #2002 closed #closed-2002
Adaptive serialization: QNames
Pull request #2063 created #created-2063
1996 Lookups, KeySpecifier: Literal, ContextValueRef
Closes #1996
Pull request #2062 created #created-2062
2017 fn:sort-by: Observations
Closes #2017
Issue #850 closed #closed-850
fn:parse-html: Finalization
Pull request #2061 created #created-2061
2058 Literal QNames: Annotations
Closes #2058
Pull request #2060 created #created-2060
2059 Literal QNames: Adaptive serialization
Closes #2059
QT4 CG meeting 126 draft agenda #agenda-06-24
Draft agenda published.
Issue #2059 created #created-2059
Literal QNames: Adaptive serialization
QNames that are output with the `adaptive` serialization method could be prefixed with a `#` character:
serialize((#a, #xml:a), { 'method': 'adaptive' })
(: Result :)
#Q{}a
#Q{http://www.w3.org/XML/1998/namespace}a
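The output format shown above is a `#` followed by the `Q{uri}local` form of the name. A minimal Python sketch of that formatting step follows; the function name `serialize_qname` is hypothetical, and the namespace URI is passed in explicitly rather than resolved from a prefix.

```python
def serialize_qname(namespace_uri, local_name):
    """Format a QName as '#Q{uri}local', the form shown in the example above."""
    return "#Q{" + namespace_uri + "}" + local_name


XML_NS = "http://www.w3.org/XML/1998/namespace"

print(serialize_qname("", "a"))      # the no-namespace name a
print(serialize_qname(XML_NS, "a"))  # the name xml:a
```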
Issue #2058 created #created-2058
Literal QNames: Annotations
The literal QName syntax could be useful for annotations:
(: Current RESTXQ syntax :)
%rest:query-param('search', '{$search}')
(: Alternative new syntax :)
%rest:query-param('search', #search)
To remove the current exotic status of literal QNames, we could allow the syntax at some more places, like:
- Catch clauses: `catch #err:XPTY0004 { ... }`
- Element/attribute tests: `element(#a:b)`
Issue #2057 created #created-2057
Steps: variable element names
With lookups, it is simple to use dynamic keys:
$map?$name
An equivalent solution is missing for path expressions. The current approach is to check the name with a predicate:
$node/*[node-name() = $name]
We could provide a more concise syntax by extending element tests (and, similarly, attribute tests):
$node/element($name)
The new literal QName syntax simplifies things further:
for $name in (#h1, #h2, #h3)
return $node/element($name)
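For comparison, selecting child elements by a name held in a variable looks like this in Python's standard `xml.etree.ElementTree`. This is only an analogy for the proposed `element($name)` test: the helper `children_named` and the sample document are assumptions of the sketch, and only no-namespace names are handled.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<body><h2>B</h2><p>text</p><h1>A</h1><h3>C</h3></body>")


def children_named(node, name):
    """Child elements of `node` whose tag equals the dynamically supplied name."""
    return [child for child in node if child.tag == name]


# Mirrors the for-expression above: one child step per name, results concatenated.
headings = [el for name in ("h1", "h2", "h3") for el in children_named(doc, name)]
print([el.tag for el in headings])
```

Because the loop runs once per name, the results come back grouped by the order of the names and not in document order.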
Issue #2056 created #created-2056
Implicit Whitespace in `MarkedNCName` and `QNameLiteral`
The REx-generated XQuery parser has failed to raise the expected syntax error for test case `nscons-046`.
This is caused by REx' inability to precisely handle differing whitespace allowances of multiple viable parsing alternatives. In the actual case,
<foo>{namespace # ...
there are two possible interpretations for `namespace` `#` that still need to be distinguished:
- a `NamedFunctionRef` referring to some arity of a function named `namespace`,
- a `CompNamespaceConstructor` using a `MarkedNCName`.
Now the problem is that the former allows implicit whitespace to follow, while the latter does not. A REx parser accepts whitespace here and thus fails the test case.
While this is REx' problem, and can be worked around by creating multiple `#` tokens with different lexical lookahead, it would be nicer to avoid this situation. This would possibly also simplify the work for other parsers.
May I ask to allow implicit whitespace in both the `MarkedNCName` and `QNameLiteral` productions? This would be in line with variable declarations and references, which also allow implicit whitespace between `$` and `EQName`.
Pull request #2055 created #created-2055
37 Sequence, Array, and Map destructuring
Redrafting of PR 1942, after discussion, and extension to XQuery
Fix #37 Supersedes #1942
This PR implements the decisions of today's discussion to the best of my understanding. I don't think further discussion is needed, but it does merit careful checking.
QT4 CG meeting 125 draft minutes #minutes—06-17
Draft minutes published.
Issue #1888 closed #closed-1888
366 xsl:package-location
Issue #2029 closed #closed-2029
fn:xsd-validator - more explanation needed
Issue #2030 closed #closed-2030
2029 xsd validator notes and examples
Issue #2041 closed #closed-2041
Incorrect example of xsl:namespace-alias
Issue #2042 closed #closed-2042
2041 Correction to xsl:namespace-alias example
Issue #1127 closed #closed-1127
Binary resources
Issue #2044 closed #closed-2044
Hide `MarkedNCName` from XPath spec
Issue #2054 created #created-2054
JPath expression
Edit: `?/` is preferred over `\`; see comments.
I feel that there is one aspect about the exciting new JNode proposal that may develop into a permanent crutch. It is the lack of symmetry between simple lookups and lookups with axes:
a?b
a?child::b
This particularly strikes me as this inconsistency does not exist in XPath.
What about the idea to keep XPath 3.1 lookups unchanged and simple – for the vast number of use cases that do not require navigation – and to introduce a new “JPath expression” instead that will exclusively produce JNodes?
The JStep separator that I would recommend for it is the backslash character `\`. It bears even more resemblance to the XPath step separator than the often questioned question mark. By using a new syntax, I believe we would have much more freedom in designing a clean solution for navigating maps and arrays that is more consistent as well as more similar to classical node paths.
Some syntactical examples:
let $countries := {
  'Japan': {
    'cities': [ { 'Fukuoka': { 'population': 1600000 } } ]
  }
}
return (
  $countries\*,
  $countries\\cities,
  $countries\Japan\cities[.\\population > 1000]\..,
  $countries\\*[.\ancestor::Japan]
)
Among other advantages that I see, the traversal of arrays of maps would also be less controversial (#115).
QT4 CG meeting 125 draft agenda #agenda-06-17
Draft agenda published.
Issue #2053 created #created-2053
Add fn:collection-available
Since the fn:collection function can raise errors, perhaps it should have a corresponding function to check if the collection is available?
For reference, see the other *-available functions:
Also, eXist has an implementation-specific `xmldb:collection-available` function: https://exist-db.org/exist/apps/fundocs/index.html?q=xmldb:collection-available. It's typically used to determine if a collection already exists before creating or deleting it.
A proposed signature, based on those linked above, could be:
fn:collection-available( $source as xs:string? := () ) as xs:boolean
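The relationship between the two functions can be sketched as "available means the corresponding call would not raise". The Python below is a minimal illustration of that pattern only. The in-memory registry, the example URI, and the exception class standing in for `err:FODC0002` are all assumptions of the sketch, and how a processor actually resolves collections is implementation-defined.

```python
class FODC0002(Exception):
    """Stand-in for the error raised when a collection cannot be retrieved."""


# Hypothetical registry playing the role of the "available collections".
_AVAILABLE = {"http://example.org/docs": ["a.xml", "b.xml"]}


def collection(source=None):
    """Return the collection for `source`, or raise FODC0002 if unavailable."""
    try:
        return _AVAILABLE[source]
    except KeyError:
        raise FODC0002(f"collection not available: {source!r}") from None


def collection_available(source=None):
    """True if collection(source) would succeed, False otherwise."""
    try:
        collection(source)
    except FODC0002:
        return False
    return True


print(collection_available("http://example.org/docs"))
print(collection_available("http://example.org/missing"))
```

In this sketch no default collection is registered, so calling `collection_available()` with no argument reports that it is unavailable.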
Issue #2052 created #created-2052
fn:collation-available: $usage
The function `fn:collation-available` defines a `$usage` parameter, but little information is given on how to interpret the collation URI to return the correct result. In addition, there are no test cases.
Do we believe that the additional parameter offers enough advantages, or should we simplify the function?
Issue #2051 created #created-2051
XSLT group by cluster
I propose an enhancement of `xsl:for-each-group` to support clustering.
To start off with a simple use case, suppose one has the following population, `<xsl:variable name="ages" as="xs:integer*" select="5, 24, 9, 5, 6, 8, 36, 38, 28"/>` and one wishes to cluster the figures like so, in four groups: `(5, 5, 6, 8, 9)`, `(24)`, `(28)`, `(36, 38)`.
One is tempted to create an `<xsl:for-each-group>` with `group-by="((. - 1) to (. + 1))"`. But this does not work. If `@composite` is absent or is `no`, eighteen groups are created. If `@composite` is `yes`, eight groups are created. In both cases, the results are not significantly close to the desired output.
I propose a new `@group-by-cluster`. The following code
<xsl:for-each-group select="$ages" group-by-cluster="((. - 1) to (. + 1))">
  <xsl:sort select="current-grouping-key()"/>
  <group key="{current-grouping-key()}" count="{count(current-group())}">
    <xsl:copy-of select="current-group()"/>
  </group>
</xsl:for-each-group>
would produce this
<group key="4 5 6 7 8 9 10" count="5">5 5 6 8 9</group>
<group key="23 24 25" count="1">24</group>
<group key="27 28 29" count="1">28</group>
<group key="35 36 37 38 39" count="2">36 38</group>
There are numerous use cases for the proposed new feature. Here are a few:
- Clustering map or spatial coordinates
- Grouping disparate rectangles from OCR output
- Reconciling triples in linked open data (RDF) that use different IRIs synonymously
- Detecting typologies within large corpora of documents that have periodically repetitive formulaic paragraphs.
- Discovering networks of connected things, e.g., networks of email correspondence or publication citations
Currently, the clustering I describe above is feasible in XSLT, but it requires creative strategies, usually a combination of preprocessing and the creation of specialized helper functions to recursively iterate over multiple grouping keys to create group numbers. These are challenging to write and debug, and one loses identity in a preprocessed copy of the original.
By putting clustering into a `@group-by-cluster` construct, users benefit not only from convenience but also from performance, as a processor might bring novel strategies for clustering.
The `current-grouping-key()` for a group would consist of a sequence of all members' grouping keys, duplicates removed. No two groups would have any overlap in their grouping key sequences. (That's the definition of a cluster.)
`@group-by-cluster` would have effect only if its value actually produced a sequence of length greater than one, and if `@composite` is no. (Should a user be warned if `@composite` is yes?)
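The rule described in this proposal (items fall into the same group when their grouping-key sequences overlap, directly or through a chain of other items) amounts to finding connected components. The following is a minimal Python sketch of that idea using a union-find over key values. The function name `cluster`, the shape of its result, and the choice to skip items that produce no keys are assumptions of the sketch, not part of the proposal.

```python
from collections import defaultdict


def cluster(items, keys_of):
    """Group items whose key sets overlap, directly or transitively.

    Returns (sorted_keys, members) pairs ordered by smallest key.
    Items that yield no keys are left out (an assumption of this sketch).
    """
    parent = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    item_keys = [list(keys_of(item)) for item in items]

    # All keys produced by one item belong to the same cluster.
    for keys in item_keys:
        for key in keys[1:]:
            root_a, root_b = find(keys[0]), find(key)
            if root_a != root_b:
                parent[root_b] = root_a

    key_sets = defaultdict(set)
    members = defaultdict(list)
    for item, keys in zip(items, item_keys):
        if keys:
            root = find(keys[0])
            key_sets[root].update(keys)
            members[root].append(item)

    groups = [(sorted(key_sets[root]), members[root]) for root in members]
    return sorted(groups, key=lambda group: group[0][0])


ages = [5, 24, 9, 5, 6, 8, 36, 38, 28]
clusters = cluster(ages, lambda age: range(age - 1, age + 2))
for keys, group in clusters:
    print(keys, group)
```

Each pair holds the union of the members' keys (the proposed `current-grouping-key()`) and the members in population order. By construction no key appears in more than one group.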
Pull request #2050 created #created-2050
2040 Fix context value declaration issues
Fix #2040
Pull request #2049 created #created-2049
2045 Context value can be an empty sequence
For functions like name(), local-name() etc. with `as="node()?" default="."` in the signature, allow the context value to be an empty sequence.
Fix #2045
Issue #2048 created #created-2048
Untrusted execution, and security more generally
Discussion of untrusted execution (that is, a Processor executing code from an untrusted source), and security in general, is present in the spec, but spread out and not really connected.
Untrusted execution seems to me to be one of the biggest security issues for XSLT/XQuery and XPath in general, and I think it’s important to make the distinction between untrusted execution where an untrusted stylesheet or query is executed (perhaps via Saxon’s `XsltCompiler.compile()`), and when a trusted stylesheet or query causes untrusted code to be executed, as with `fn:parse-xml()`, `fn:doc()`, `fn:transform()` and `<xsl:evaluate>`.
The proposals in #2034 address one case, but have no effect on the other, and that raises the question of what would happen if an untrusted (unsafe) stylesheet executed `fn:doc()` with `safe = true`.
I would like to see the specs address the security implications of untrusted execution more explicitly, and to provide clearer guidance for implementation authors around both completely untrusted code, and trusted code which executes untrusted code.
I think that could take the form of:
- Annotation of all functions that have potentially problematic external effects (primarily file / resource access).
- Clear expectations for implementors about what (and how) security restrictions should be configurable.
- Consistent Error codes for security-related exceptions
- Consistent security-related options for functions which can cause untrusted source parsing or code execution.
- An additional security section in the spec under section 2 (Concepts / Basics) in the XSLT/XQuery/XPath specs, and section 1 of XPFO, which collects the overview and points to what to look for in the rest of the spec.
Issue #2047 created #created-2047
external resource-accessing functions, available resources, and error codes
While looking at a problem in an implementation of a vendor function that read data from an external resource I was looking through the spec to see what was said about external resources and untrusted execution contexts (for example a processor executing XSLT or XPath provided by a user).
The spec makes mention of ‘available documents’, ‘available text resources’, ‘available binary resources’, ‘available collections’, and ‘available URI collections’.
The documentation for `fn:doc()`, `fn:collection()`, and `fn:uri-collection()` mentions an error (`err:FODC0002`) to be raised if the URI requested is not in the relevant ‘available X’ (although there is some disagreement about what the ‘X’ should be - we have ‘available node collections’ and ‘available resource collections’ mentioned in the function docs). `fn:unparsed-text()` and `fn:unparsed-binary()` do not.
I think the documented behaviour of `fn:unparsed-text()` and `fn:unparsed-binary()` should be brought into line with the others.
Issue #2046 created #created-2046
Promote ".." to a primary expression
I propose promoting ".." to be a primary expression (rather than an abbreviated step), and (assuming the JNodes proposal is accepted) allowing it to work on JNodes as well as XNodes.
Ignoring JNodes for now, I don't think the change would make any observable difference to the language syntax or semantics. It changes the way that a predicate is interpreted: `..[P]` becomes a regular filter expression and is no longer subject to the special rules for predicates within steps; but the only difference is how position() is interpreted, and since `..` can only return a singleton, this makes no difference.
For JNodes it means we will be able to write expressions such as `$my-map??x[..?y = 3]` rather than `$my-map??x[?parent::*?y = 3]`
Issue #2045 created #created-2045
Functions taking "." as default argument, when "." is empty
A number of functions such as `fn:name()` take the context value as their default argument, with the implication that `name()` is exactly equivalent to `name(.)`.
However, these functions also say that a type error XPTY0004 is raised if the context value is not a single node.
It seems to me that if the context value is an empty sequence, no type error should be raised; the effect of `() -> name()` should be the same as `name(())`.
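The behaviour asked for here can be sketched as a function whose omitted argument falls back to the context value, with an empty context treated exactly like an explicitly supplied empty sequence. The following is a minimal Python illustration under loudly stated assumptions: sequences are tuples, nodes are plain dicts with a `"name"` entry, the context is passed as a keyword argument, and `XPTY0004` is a stand-in exception class. None of these names come from the specification.

```python
_ABSENT = object()  # marks "argument not supplied"


class XPTY0004(TypeError):
    """Stand-in for the XPath type error of the same name."""


def name(arg=_ABSENT, *, context=()):
    """Toy fn:name: an omitted argument defaults to the context value."""
    value = context if arg is _ABSENT else arg
    if len(value) == 0:
        return ""  # same result as name(()), the zero-length string
    if len(value) > 1:
        raise XPTY0004("a single node is required")
    return value[0]["name"]


para = {"name": "para"}
print(repr(name(context=(para,))))  # argument omitted, context is one node
print(repr(name(context=())))       # argument omitted, context is empty
print(repr(name(())))               # empty sequence supplied explicitly
```

With this shape the omitted-argument and explicit-argument paths share one code path, so the two calls on an empty sequence cannot diverge.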
Pull request #2044 created #created-2044
Hide `MarkedNCName` from XPath spec
Today's merge of qt4cg/qtspecs#2028 has added the `MarkedNCName` production to both the XQuery and the XPath spec. It is however only used within the XQuery spec.
QT4 CG meeting 124 draft minutes #minutes—06-10
Draft minutes published.
Issue #2027 closed #closed-2027
QNameLiteral syntax for namespace and Processing Instruction constructors
Issue #2028 closed #closed-2028
2027 '#' syntax for computed PIs and namespaces
Issue #2032 closed #closed-2032
Simple typo in XPath 4.0 example - inherited from XPath 3.0 spec
Issue #2033 closed #closed-2033
2032 Fix typo in example
Issue #2022 closed #closed-2022
Simplify optional XQuery conformance features
Issue #2026 closed #closed-2026
2022 Drop module feature
Pull request #2043 created #created-2043
2038 Tweak the rules for fn:apply-templates references to modes
Fix #2038
Pull request #2042 created #created-2042
2041 Correction to xsl:namespace-alias example
Fix #2041
Issue #2041 created #created-2041
Incorrect example of xsl:namespace-alias
Reported against XSLT 3.0
https://github.com/w3c/qtspecs/issues/71
QT4 CG meeting 124 draft agenda #agenda-06-10
Draft agenda published.
Issue #2040 created #created-2040
XQuery context value declaration
Section 5.17:
- There is no changes entry flagging the fact that the coercion rules are now applied (and corresponding test cases such as contextDecl-037a do not identify a PR). The relevant PR is PR #254.
- The statement "The context value declaration has the effect of setting the context value static type T in the static context." is incorrect. The static context no longer includes a context value static type.
- The statement "In all cases where the context value has a value, that value must match the type T according to the rules for SequenceType matching" is incorrect (or at least, misleading): as stated two paragraphs later, coercion is applied. But because there can be multiple context value declarations in different modules, specifying different types, perhaps the intent is that coercion is applied only to a value supplied in the query, and not to a value supplied externally? If so, this needs clarifying. There are apparently no tests for coercing an externally-supplied value to the required type.
Issue #2039 created #created-2039
Generalize context item to context value in XSLT
Various places, for example the xsl:context-item declaration and the xsl:evaluate/@context-item attribute, should be updated to allow for a context value rather than a context item.
At present the context value at instruction level is always either a singleton or absent. We should consider generalizing this to align with XPath, where the `->` and `?[....]` operators allow the context value to be an arbitrary sequence.
An xsl:for-each-member instruction that iterates over an array and binds each member to the context value would make sense.
Issue #2038 created #created-2038
Drop dependency of fn:apply-templates on the default mode
The new `fn:apply-templates` function in XSLT can invoke the "default mode", either by specifying `mode="#default"` or by not specifying a mode. The default mode is defined by the nearest containing instruction that has a `[xsl:]default-mode` attribute.
I would like to drop this dependency.
Most of the cases where a function call depends on the static context (especially an XSLT function) are cases where the relevant property is fixed for a package (e.g. the set of named keys, decimal formats, or character maps). There are places where there is a dependency on something that can vary in a more fine-grained way - notably (a) the default collation, and (b) the set of namespace bindings, but on the whole such dependencies are undesirable (a) because they introduce opportunities for user error (e.g. when copying and pasting code) and (b) because they increase the amount of information the processor has to keep around at runtime just in case it is needed (for example, in a dynamic function call). I would therefore like to avoid introducing this dependency.
The proposed change is (a) to drop "#default" as a value of the `mode` option for this function, and (b) to say that if no mode is specified by `fn:apply-templates` the unnamed mode is used.
Pull request #2037 created #created-2037
2036 Add rule for streamability of xsl:map
Fix #2036
Issue #2036 created #created-2036
Streamability of xsl:map instruction
The special condition that allows more than one operand of xsl:map to be consuming should apply only if the `duplicates` attribute is absent (defaulting to "error"). If duplicates are allowed then in general the result cannot be streamed.
Issue #1955 closed #closed-1955
fn:doc, fn:parse-xml: entity expansion
Issue #2035 created #created-2035
Recursive record types: unrealistic example in XPath spec
The example of mutually-recursive record types in XPath §3.2.8.3.1 (using the schema component model as an example) is unrealistic, because an instance of this structure would be cyclic at the instance level, and therefore would be non-instantiable. In practice, the only way to represent cyclic structures using maps and arrays is by use of functions to represent some of the relationships, as we do in the schema record type returned by functions such as fn:schema-type(). We should change the example to use this technique and explain why it is being used: it would still illustrate the point; although it would be in danger of becoming excessively complicated.
Issue #2034 created #created-2034
fn:parse-xml, fn:doc: `safe` option
This issue replaces #1955.
The first feedback that we got for the `entity-expansion-limit` option indicates that our current solution is neither fish nor fowl (weder Fisch noch Fleisch?):
- With the initial suggestion in #1860, I hoped we could define sane defaults to prevent attacks caused by `fn:parse-xml` and `fn:doc`. This turned out to be difficult. Instead, we now have two specific options (allow-external-entities, entity-expansion-limit) that need to be explicitly assigned to make parsing safer.
- In order to parse certain XML documents, like `dblp.xml.gz`, more than one JDK 11 limit needs to be increased:
http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit
http://www.oracle.com/xml/jaxp/properties/maxGeneralEntitySizeLimit
http://www.oracle.com/xml/jaxp/properties/totalEntitySizeLimit
As we have observed, XML parsing depends on the specific XML parsers. I believe it would be more user-friendly to replace the specific settings with a single option `safe`, and to let the processor decide which properties are assigned:
- `true` (default): avoid XXE and billion laughs attacks
- `false`: disable safe parsing (increase limits, allow parsing of external resources)
Pull request #2033 created #created-2033
2032 Fix typo in example
Fix #2032
Issue #2032 created #created-2032
Simple typo in XPath 4.0 example - inherited from XPath 3.0 spec
Location of typo: Array Types section.
Current text:
[ 1, 2 ] instance array(*) returns true()
Expected text:
[ 1, 2 ] instance of array(*) returns true()
Pull request #2031 created #created-2031
2025 JNodes
Fix #2025
This is a first draft for review.
It includes changes to the data model, functions and operators, and XQuery/XPath. It does not yet include changes to XSLT.
It's a big proposal, but I think it removes more complexity from the spec than it adds. It's basically a unification of two concepts, both of which were addressing aspects of the same problem, namely that lookup expressions lose too much information. It gets rid of the pin/label mechanism, and modifiers on lookup expressions, and introduces JNodes and JAxes in their place. (Any suggestions for improved terminology are more than welcome.)
I think we get a lot more "bangs for the buck" with this solution, and it makes navigation of JSON trees work in a much closer way to familiar navigation of XML trees. It needs a lot more work on examples and explanation, of course.
Issue #1859 closed #closed-1859
Question on `fn:chain` and `err:FOAP0001`
Issue #1894 closed #closed-1894
Additional examples to fn:chain - in a new branch
Issue #1883 closed #closed-1883
882 Replace fn:chain by fn:compose
Pull request #2030 created #created-2030
2029 xsd validator notes and examples
Adds more explanation to xsd:validator
Extracts material from the XQuery and XSLT specs describing the validation process, moving this to a new section in F&O, to reduce duplication.
Fix #2029
Issue #1959 closed #closed-1959
1953 (part) XSLT Worked example using methods to implement atomic sets
Issue #882 closed #closed-882
fn:chain or fn:compose
Issue #1984 closed #closed-1984
882 Drop fn:chain
Issue #2023 closed #closed-2023
Semantics of X?$a
Issue #2024 closed #closed-2024
Add rules for $V?$X
Issue #2029 created #created-2029
fn:xsd-validator - more explanation needed
See action QT4CG-119-02
In the review when the function was accepted, I was asked to supply more notes and examples indicating how the various options for assembling a schema interacted with each other.
Pull request #2028 created #created-2028
2027 '#' syntax for computed PIs and namespaces
Fix #2027
Issue #2027 created #created-2027
QNameLiteral syntax for namespace and Processing Instruction constructors
Action QT4CG-021-01 (should be QT4CG-121-01)
We now allow QNameLiterals to be used in element and attribute constructors, for example `attribute #xml:space {"default"}`.
For symmetry we need a similar syntax for namespace and processing-instruction constructors. These have the same ambiguity problem if the node name clashes with a reserved word such as "div", but they are constrained to be NCNames (or no-namespace QNames, depending on your perspective).
One possible solution is to use a new construct such as NCNameLiteral (but the name is wrong, unless we also allow it to be used in contexts where a Literal is allowed). Another possibility is for the grammar to allow a QNameLiteral, but for the semantics to restrict it to be in no namespace.
Pull request #2026 created #created-2026
2022 Drop module feature
Fix #2022
The effect is that support for library modules is no longer optional.
I decided not to pursue merging the "schema import" and "typed data" features into one.
QT4 CG meeting 123 draft agenda #agenda-05-27
Draft agenda published.
Issue #2025 created #created-2025
Combine the concepts of pins/labels and modified lookups
We have two rather separate mechanisms, both designed to solve aspects of what is essentially the same problem: lookup expressions lose too much information.
Pinning tries to solve the problem by saying that if the origin of the lookup is pinned, then the results of the lookup carry a label containing information about the key and the parent.
Modifiers like pair::* try to solve the problem by returning a map containing the key and the value as separate fields.
But pinning only solves part of the problem, in particular it doesn't prevent X?* flattening the result, and the `pairs` modifier only solves part of the problem, in particular it doesn't retain parentage.
I would like to try combining them and trying to create a mechanism that is better than either. I don't yet know exactly how this might work, but I'm thinking along the lines:
- Replace the concept of labelled items with labelled values (that is, a label can be attached to any value, not just an item)
- Scrap pin() as an explicit function
- A lookup expression like $X ? child::Y returns a sequence of labelled values (no flattening)
- The properties of a labelled value include:
  - target - the actual value
  - key - the associated key (or array index)
  - parent - the containing map or array
- These properties might be made available through syntax such as `$LV ? target::*`, `$LV ? key::*`, `$LV ? parent::*` (or otherwise)
- ancestor and ancestor-or-self can be made available as derived properties
- Many operations when given a labelled value should automatically operate on its target, and ignore the label (rather like atomisation). Exactly which operations do this is an interesting question to which I don't yet know the answer. It's tricky because child::* returns a sequence of labelled values, and we want to be able to manipulate this in unflattened form. Perhaps child::* should instead return an array of labelled values? But then you end up with another lookup operation to extract the members of this array.
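To make the idea of labelled values more concrete, here is a minimal Python model of the bullet points above. It is only a sketch of the idea in this issue, not of any specified semantics: the class name `Labelled` and the functions `children` and `ancestors` are hypothetical, Python dicts and lists stand in for maps and arrays, and storing the parent as a labelled value (so that ancestors can be derived) is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Labelled:
    """A value together with its label (hypothetical model, not spec text)."""
    target: Any                          # the actual value
    key: Any = None                      # map key or 1-based array index
    parent: Optional["Labelled"] = None  # labelled container, None at the root


def children(lv: Labelled) -> list:
    """Model of `$X ? child::*`: one labelled value per entry, no flattening."""
    t = lv.target
    if isinstance(t, dict):
        return [Labelled(v, k, lv) for k, v in t.items()]
    if isinstance(t, list):
        return [Labelled(v, i, lv) for i, v in enumerate(t, start=1)]
    return []


def ancestors(lv: Labelled) -> Iterator[Labelled]:
    """Derived property: walk the parent chain up to the root."""
    node = lv.parent
    while node is not None:
        yield node
        node = node.parent
```

In this model a member that is itself a list stays intact as the `target` of a single labelled value, which is the "no flattening" behaviour the issue asks for.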
Pull request #2024 created #created-2024
Add rules for $V?$X
Fix #2023
Issue #2023 created #created-2023
Semantics of X?$a
In §4.14.3.1 we describe the semantics of lookup expressions. Rules 3a to 3e and 4a to 4e all start "if the KeySpecifier KS is..." and should enumerate all the possibilities for a KeySpecifier. But the case where the KeySpecifier is a VarRef is not mentioned.
Of course, `X?$a` is supposed to be a shorthand for `X?($a)`, we just fail to state the fact.
Issue #2022 created #created-2022
Simplify optional XQuery conformance features
In XQuery I propose:
- Dropping the "module feature" - every conformant XQ40 implementation must support library modules
- Merging the "schema-aware" and "typed-data" features into a single optional feature, aligned with schema-awareness in XSLT
Issue #2021 created #created-2021
XSLT: Move "Patterns" section into "Template Rules"
I propose to move XSLT section §5.4 Patterns so it becomes §6.1, under Template Rules, where it will hopefully be easier to find.
Because this will produce a large number of diffs, I propose to make it a separate PR rather than combining it with the work I am currently doing on patterns and template rules for maps and arrays.
Issue #2020 closed #closed-2020
Reconsider the rationale for the xsl:select instruction
Issue #2020 created #created-2020
Reconsider the rationale for the xsl:select instruction
The section for xsl:select in the XSLT specification includes the following rationale:
An XPath expression written within an XML attribute is subjected by the XML parser to attribute value normalization, which changes the arrangement of whitespace within the value. While this will rarely affect the actual meaning of the expression, it can mean that formatting is lost. Multi-line attribute values are therefore best avoided. The loss of formatting also makes it difficult for an XSLT processor to provide precise error locations.
There are good reasons why `xsl:select` would be a useful instruction, but I don't think providing precise error locations is one of them. This is just circumventing a problem that is solvable today for `select` attributes. If an implementer wanted to supply a more precise error location in attribute values (and this would certainly help developers) they could adopt a solution similar to the Ecma SourceMap used by EcmaScript transpilers and minifiers.
In XSLT 3.0, we frequently work with multi-line `select` attribute values on XSLT instructions without major issues. Examples include calling `fold-left()` with one or more inner functions, or using multi-case if/else expressions. Using `xsl:select` just to get good error messages does not seem like a good trade-off for the added verbosity.
For these cases, one can use a simple XPath linter in the XSLT editor to highlight the specific error tokens caused by basic typos and unresolved references, and then fall back on the compiler error messages with approximate line numbers for the (many) cases that the linter cannot pick up.
Precise XSLT Error Locations and AI Agents
Modern XSLT editors are today fully integrated with AI Agents (e.g. GitHub Copilot or AI Positron). These agents use reported error-locations to explain and suggest a fix for the XSLT problem for the user. Precise error locations are critical to the quality of the explanation and the fix. This help should be available equally for XSLT `select` attributes or `xsl:select` instructions.
Pull request #2019 created #created-2019
1776: XSLT template rules for maps and array
Currently work in progress, committed so that the draft can be reviewed.
Changes in three main areas:
- Pattern syntax: patterns such as `?item` and `?parent?item` are defined to match items in a map by their key
- Built-in template rules for on-no-match="shallow-copy-all". Revisits the built-in template rules for this scenario.
- General revision of the processing model for xsl:apply-templates applied to a tree of maps and arrays.
Issue #2018 created #created-2018
Type-checking the result of xsl:apply-templates
Code that calls `xsl:apply-templates` inevitably has expectations about the type of the result. For example someone doing
<xsl:apply-templates select="@*"/>
may have an expectation that the result will be a sequence of attribute nodes, and the code might fail untidily if it is anything else. There is currently no way of stating this expectation, or of triggering coercion on the result. We have added an attribute `xsl:mode/@as` but different calls on apply-templates in the same mode may have different expectations. (I'm seeing this particularly with modes that process maps and arrays.)
We could add an `as` attribute to xsl:apply-templates to make the expectation explicit.
Issue #2017 created #created-2017
`fn:sort-by`: Observations
We should make the second parameter obligatory (`fn:sort-by(1 to 3)` seems confusing).
`sort(keys := fn { ?key })` occurs twice in the remaining text; should be `sort(key := fn { ?key })`.
Issue #1795 closed #closed-1795
XSLT templates: Matching values in a map by key
Issue #1981 closed #closed-1981
Syntax for QName literals clashes with XQuery pragmas
Issue #2016 created #created-2016
File Module: Incorporate changes
The EXPath File Module must be revised in several steps. First of all, several functions need to be incorporated that were added to the initial version (details: https://docs.basex.org/12/File_Functions).
Pull request #2015 created #created-2015
2009 Avoid constructing document node when it makes no sense
Fix #2009
The rules for xsl:variable are changed so there is no attempt to construct an implicit temporary tree when the sequence constructor contains an `xsl:map`, `xsl:array`, or `xsl:select` instruction (perhaps mixed with other instructions).
Compatibility: note that `xsl:array` and `xsl:select` are new in 4.0, while xsl:map inside `xsl:variable` always throws an error in XSLT 3.0.
Justification:
- a child `xsl:select` element behaves like a `select` attribute
- if the content of xsl:variable is `xsl:map` or `xsl:array` it makes no sense to require the user to add `as=map(*)` or `as=array(*)` because the type is obvious anyway.
Pull request #2014 created #created-2014
QT4CG-122-01 Add notes, examples, and rationale for xsl:select
Completes action QT4CG-122-01
Issue #2008 closed #closed-2008
2004 Add xsl:select instruction
Issue #2004 closed #closed-2004
xsl:xpath instruction
QT4 CG meeting 122 draft minutes #minutes—05-20
Draft minutes published.
Issue #2006 closed #closed-2006
2005 Add fn:apply-templates function
Issue #2005 closed #closed-2005
apply-templates() as a function
Issue #1991 closed #closed-1991
835 Add built-in named record types to static context
Issue #1085 closed #closed-1085
Parameters to fn:sort
Issue #2001 closed #closed-2001
1085 Revert fn:sort to the 3.1 spec; introduce fn:sort-by
Issue #1992 closed #closed-1992
Type of fn:schema-type-record ? constructor
Issue #1999 closed #closed-1999
1992 Correct type of constructor function in schema-type-record
Issue #1997 closed #closed-1997
Coercion Rules: §3.4.1 rule 3(c)
Issue #1998 closed #closed-1998
1997 Correct nesting of item coercion rules
Pull request #2013 created #created-2013
748 Parse functions: consistency
Closes #748
Issue #2012 created #created-2012
Add array:sort-with
Issue #655 PR #795 introduced fn:sort-with.
We should define array:sort-with for consistency.
Pull request #2011 created #created-2011
675(part): Add XSLT static typing rules for new kinds of XPath expression
Updates the static typing rules in XSLT for new kinds of expression introduced in XPath 4.0. These rules are used in streamability analysis, but more work needs to be done to complete the streamability analysis.
Production rules are now referenced by name, as production numbers are no longer available.
Issue #2010 created #created-2010
XSLT patterns: generalize union, intersect, and except
This is related to issue #402.
We have generalised the meaning of union, intersect, and except, when used in XSLT patterns, so that they now mean:
- A union B - matches either A or B
- A intersect B - matches both A and B
- A except B - matches A and does not match B
With these semantics, there is no longer any sensible reason to restrict these pattern operators to apply only to node patterns. The semantics work equally well for patterns that match (for example) maps or arrays. For example
<xsl:template match="record(a, b) except record(a, b, c)">
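The generalised semantics can be read as plain boolean combinations of match predicates. The Python sketch below models a pattern as a function from an item to a boolean; the helper names (`union`, `intersect`, `except_`, `is_map`, `has_key`) are hypothetical, and the example predicates are deliberately simpler than real record-type matching.

```python
def union(a, b):
    """A union B: matches either A or B."""
    return lambda item: a(item) or b(item)


def intersect(a, b):
    """A intersect B: matches both A and B."""
    return lambda item: a(item) and b(item)


def except_(a, b):
    """A except B: matches A and does not match B."""
    return lambda item: a(item) and not b(item)


# Illustrative predicates only: a Python dict stands in for a map.
def is_map(item):
    return isinstance(item, dict)


def has_key(key):
    return lambda item: isinstance(item, dict) and key in item


# "Any map, except those that have a key c"
maps_without_c = except_(is_map, has_key("c"))
```

Nothing in these combinators depends on the item being a node, which is the point made above.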
Issue #2009 created #created-2009
xsl:variable implicit document nodes
In XSLT 3.0, an `xsl:variable` instruction with no `select` or `as` attribute implicitly wraps the value created by the sequence constructor in a document node. This inevitably fails if the content is a map, making it necessary to write
<xsl:variable name="m" as="map(*)">
<xsl:map>...</xsl:map>
</xsl:variable>
I propose that this wrapping should not happen if the first item in the result of the sequence constructor is a function item (including a map or array). The practical effect on users is that they can leave out the `as="map(*)"` attribute in this situation.
For function items other than arrays, this is currently an error condition so there is no incompatibility.
For arrays it does represent an incompatible change -- the current rules ("Constructing complex content") say that an array is flattened. But XSLT 3.0 has no instruction to construct an array, it would have to be done using `xsl:sequence`; and no-one would deliberately construct an array merely in order to flatten it, so the situation is unlikely to arise in practice.
The proposal is that the decision whether or not to construct a wrapping document node should be based on the first item in the sequence. This is to allow lazy evaluation. A function item appearing later in the sequence would be handled the same way as now -- most likely an error. An empty sequence continues to result in a childless document node.
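The decision rule proposed in this issue can be modelled in a few lines of Python. This is only a sketch of the rule as worded here (PR #2015 later takes a different, instruction-based approach): the function names are hypothetical, and dicts, lists, and callables stand in for maps, arrays, and other function items.

```python
def is_function_item(item):
    """Stand-in test: maps, arrays and functions all count as function items."""
    return isinstance(item, (dict, list)) or callable(item)


def wraps_in_document_node(sequence):
    """Return True if an implicit document node would be constructed.

    Only the first item is inspected, so a lazily evaluated sequence
    never has to be read past its first item to make the decision.
    """
    for first in sequence:
        return not is_function_item(first)
    return True  # empty sequence: still a childless document node
```

Because the loop returns on the first item, passing a generator shows the lazy-evaluation property: later items are never requested.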
Pull request #2008 created #created-2008
2004 Add xsl:select instruction
Fix #2004
Issue #322 closed #closed-322
Map construction in XSLT: xsl:record instruction
Issue #2007 created #created-2007
Creating arrays in XSLT
This kind of code comes up a lot, and is hard to simplify except by dropping into XPath:
<xsl:array>
<xsl:for-each select="?members?*[?_nodeType='MethodDeclaration']">
<xsl:array-member>
<xsl:apply-templates select="."/>
</xsl:array-member>
</xsl:for-each>
</xsl:array>
Issue #2005 (PR #2006) makes it feasible to do it all in XPath (by calling the apply-templates() function) but I don't feel that's the whole answer.
Perhaps we could define something like
<xsl:array-build for-each="?members?*[?_nodeType='MethodDeclaration']">
<xsl:apply-templates select="."/>
</xsl:array-build>
Pull request #2006 created #created-2006
2005 Add fn:apply-templates function
Fix #2005
Issue #2005 created #created-2005
apply-templates() as a function
I propose introducing `apply-templates()` as an XSLT-only function, with semantics broadly equivalent to the `xsl:apply-templates` instruction.
The main use case identified so far is constructing maps and arrays, where it enables the XPath syntax to be used rather than the much more verbose XSLT syntax.
Parameters:
- select: the items to be processed using matching template rules
- with-params: the parameters to be passed. Like with-params on xsl:evaluate, this means variable names exist at run-time, which is a bit of an innovation, but I think it's manageable
- mode: again, this means mode names exist at run-time, which may have consequences. There are open questions about what the default should be, or how the various options (default mode, unnamed mode, current mode) should be expressed.
Issue #2004 created #created-2004
xsl:xpath instruction
In the case study https://github.com/qt4cg/qtspecs/issues/1786#issuecomment-2884424739 I encountered a use case where an instruction `xsl:xpath` would be useful.
<xsl:xpath>
{ "class":
{ "name": f:degenerify(name/@identifier),
"abstract": ? abstract,
"extends": array{? extendedTypes =!> map:merge(apply-templates()) },
"implements": array{? implementedTypes =!> map:merge(apply-templates()) }
...
}
}
</xsl:xpath>
The instruction is very simple: `<xsl:xpath>EXPR</xsl:xpath>` is equivalent to `<xsl:sequence select="EXPR"/>`. It's particularly useful because XPath constructors for maps and arrays are so much more concise than the XSLT equivalents. Compared with using xsl:sequence, it means that:
- XML attribute value normalization doesn't kick in, so your formatting is better protected (meaning also that the system has some chance of computing line numbers correctly for diagnostics)
- You haven't tied up either single or double-quotes as an attribute delimiter; both can be freely used within the expression.
- You aren't creating the false impression that you're returning a (multi-item) sequence
Note that the content is NOT a sequence constructor; no child elements are allowed; and the content is not interpreted as a text value template. Unlike xsl:evaluate, the XPath expression is statically fixed.
(This example also introduces apply-templates as a function, but that will be a separate proposal).
Issue #2003 created #created-2003
Conditional entries in map constructors
If you're constructing a map using a map constructor, adding an entry conditionally can be a real pain, and typically involves a wholesale rewrite of the way the map is constructed. It would be nice to be able to mark an entry as optional so users don't have to resort to such wholesale rewrites.
The difficulty of course is finding a nice syntax: one that is both intuitively readable and grammatically unambiguous.
One possibility might be:
MapConstructorEntry ::= MapKeyExpr ":" MapValueExpr "optional"?
with the semantics that if the "optional" keyword is present, and the result of evaluating MapValueExpr is an empty sequence, then the entry is omitted from the constructed map.
A more ambitious construct might be
MapConstructorEntry ::= MapKeyExpr ":" MapValueExpr ("when" MapEntryCondition)?
which adds the entry to the map only if the condition is true.
For example:
let $map := {"height": string(@height),
"width": string(@width),
"weight": string(@weight) when exists(@weight)}
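The two candidate semantics can be compared with a small Python model. It is a sketch under stated assumptions rather than anything from the proposal's grammar: the function names are hypothetical, an empty tuple stands in for the empty sequence, and values are computed eagerly, so it says nothing about whether a "when" entry's value would be evaluated when its condition is false.

```python
EMPTY = ()  # stand-in for the XPath empty sequence


def map_with_optional(entries):
    """Entries are (key, value, optional) triples.

    An entry is omitted only if it is marked optional and its value
    is the empty sequence.
    """
    return {k: v for k, v, optional in entries if not (optional and v == EMPTY)}


def map_with_when(entries):
    """Entries are (key, value, condition) triples.

    An entry is added to the map only if its condition is true.
    """
    return {k: v for k, v, condition in entries if condition}
```

Under the "optional" variant an unmarked entry whose value is empty is still kept, whereas the "when" variant lets the condition be independent of the value.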
Issue #2002 created #created-2002
Adaptive serialization: QNames
We could use the new QName literal syntax when serializing QNames with the `adaptive` method:
serialize(xs:QName('x'), { 'method': 'adaptive' })
(: current output: Q{}x :)
(: proposed output: #x or #Q{}x :)
Pull request #2001 created #created-2001
1085 Revert fn:sort to the 3.1 spec; introduce fn:sort-by
Fix #1085
The new functionality introduced into the 4.0 version of fn:sort is repackaged into a new function fn:sort-by with a much cleaner interface; the fn:sort function reverts to its 3.1 specification.
If this PR attracts support then the corresponding change will be applied to the array:sort function.
Issue #2000 created #created-2000
element-to-map() - type signature of plan
The specifications of element-to-map() and element-to-map-plan() use different record types for the data structure representing the plan. In both cases the definition is less precise than it might be (though not wrong). The two functions should use a common named record type, which should be as precise as possible.
Issue #1982 closed #closed-1982
1981 Ambiguity with qname literals and pragmas
Issue #1889 closed #closed-1889
HTML serialization: `html-version` and `version` parameters; allowed values
Issue #1977 closed #closed-1977
1889 Tidy up handling of HTML serialization version, default to HTML5
Issue #1985 closed #closed-1985
Default namespace terminology
Issue #1987 closed #closed-1987
1985 Tidy up namespace terminology
Issue #1986 closed #closed-1986
Obsolete note on reporting errors
Issue #1988 closed #closed-1988
1986 Drop obsolete notes on error reporting
Issue #1989 closed #closed-1989
1983 QName literals in node constructors
Issue #1983 closed #closed-1983
Computed node constructors - use QName literals rather than string literals
Issue #1990 closed #closed-1990
Update schema-for-xslt40.xsd
Pull request #1999 created #created-1999
1992 Correct type of constructor function in schema-type-record
Fix #1992
Pull request #1998 created #created-1998
1997 Correct nesting of item coercion rules
Fix #1997
(A correction to an editorial error that made a substantive difference to the spec.)
Issue #1997 created #created-1997
Coercion Rules: §3.4.1 rule 3(c)
This section reads:
If R is an [atomic type] and J is an [atomic item], then:
- If J is an instance of R then it is used unchanged.
- If J is an instance of type xs:untypedAtomic then:
  - If R is an [enumeration type] then A is cast to xs:string.
  - If R is [namespace-sensitive] then a [type error] [[err:XPTY0117]] is raised.
- Otherwise, J is cast to type R.
The last line (rule 3(c)) looks all wrong. If we just did a cast at this point then rules 4 and 5 would be unnecessary.
It's not easy to trace the history, but I think it went wrong when the rules for choice/union types were refactored (around 2024-04-12). The line in question appears to have originally been under a conditional "if A is an instance of xs:untypedAtomic...".
Issue #1996 created #created-1996
Lookups, KeySpecifier: add NumericLiteral and ContextValueRef?
Various types of expressions are allowed as a `KeySpecifier`:
Lookup ::= ("?" | "??") (Modifier "::")? KeySpecifier
KeySpecifier ::= NCName | IntegerLiteral | StringLiteral | VarRef | ParenthesizedExpr | LookupWildcard | TypeSpecifier
Maybe we could add `ContextValueRef` and `NumericLiteral` to the list, to make the following expressions legal:
{ '1.5': 'one and a half' }?1.5
array { 1 to 256 }?0x80
(3, 4, 5) ! $array?.
Issue #1995 created #created-1995
Consistency: array lookups
The different variants to look up array members should be unified. For example (if I interpret the rules correctly), the following expressions can be evaluated…
[ 'a' ]?(<x>1</x>)
[ 'a' ]??(number(<x>1</x>))
[ 'a' ]??(1e0)
let $a := 1e0 return [ 'a' ]??$a
…whereas the following expressions raise errors:
[ 'a' ]?(number(<x>1</x>))
[ 'a' ]?(1e0)
let $a := 1e0 return [ 'a' ]?$a
We should probably try to make all of them legal, or try to justify what happens.
QT4 CG meeting 121 draft agenda #agenda-05-13
Draft agenda published.
Issue #1797 closed #closed-1797
elements-to-maps: separate function to construct a plan
Issue #1993 closed #closed-1993
Incorrect test generated for map:pairs
Issue #1994 closed #closed-1994
1993 Stylesheet fix to copy the occurrence indicator
Pull request #1994 created #created-1994
1993 Stylesheet fix to copy the occurrence indicator
Fix #1993
Issue #1993 created #created-1993
Incorrect test generated for map:pairs
The signature for map:pairs is
map:pairs(
$map as map(*)
) as key-value-pair*
but the test case generated in misc-BuiltInKeywords is (incorrectly)
map:pairs(map := ?) instance of function(map(*)) as fn:key-value-pair
Note the missing `*` at the end.
Issue #1992 created #created-1992
Type of fn:schema-type-record ? constructor
In the type fn:schema-type-record (returned by functions such as fn:schema-type), the field `constructor` is said to be of type `fn(xs:anyAtomicType) as xs:anyAtomicType`. It is also said to be "the same function as returned by [fn:function-lookup] applied to the type name (with arity one)". But that function has type `fn(xs:anyAtomicType?) as T?`
The correct type for the `constructor` field should be `fn(xs:anyAtomicType?) as xs:anyAtomicType?`
Pull request #1991 created #created-1991
835 Add built-in named record types to static context
This PR adds six built-in named record types to the static context of every application:
- Record [key-value-pair]
- Record [load-xquery-module-record]
- Record [parsed-csv-structure-record]
- Record [random-number-generator-record]
- Record [schema-type-record]
- Record [uri-structure-record]
These are now listed in Appendix C of F&O
Issue 835 requests a review of the names of these records; perhaps putting them in one place will make that review easier. Personally, I am happy with the names as currently defined.
Pull request #1990 created #created-1990
Update schema-for-xslt40.xsd
Fixed invalid syntax xs:simpleType/@ref (moved to @memberTypes) in simpleType named method
Based the type fixed-namespaces-type-default on xs:token instead of xs:string to allow for whitespace normalization
Changed collation attribute on xsl:merge-key to be an avt (according to spec)
Changed attributes that were previously of type "xsl:char-optionally-expanded" to just xs:string since the spec says they can be any string. I couldn't think of a reason they should be limited to one character optionally followed by a colon and more characters, so I assumed this was some sort of artifact from the past.
Changed xsl:next-iteration and xsl:evaluate to not allow mixed content
Corrected _split_when to _split-when
Added missing shadow attributes (in two places) for allow-duplicate-names, build-tree, json-lines and json-node-output-method
Added missing shadow attribute for select on perform-sort
Added assertions for required attributes:
- xsl:use-package - name
- xsl:expose - names, component and visibility
Made the errors attribute a list of tokens instead of just xs:token. Doesn't affect validation but I think it is more clear.
Changed default value of per-mille from a tilde to ‰
Changed the default value of the on-no-match attribute from shallow-skip to text-only-copy, per the spec
Gave xsl:exclude-result-prefixes the same type as no-namespace exclude-result-prefixes, to allow for #all and #default
Gave xsl:extension-element-prefixes the same type as no-namespace extension-element-prefixes, to allow for #default
Removed the xsl:prefixes and xsl:char-optionally-expanded types since they were no longer used after the above changes
Changed the type of the visibility attribute on xsl:attribute-set, xsl:function, xsl:template and xsl:variable to xsl:visibility-not-hidden-type to exclude "hidden" per the spec
Changed the keyword value of the fixed-namespaces attribute from #default to #standard (and adjusted type names)
Pull request #1989 created #created-1989
1983 QName literals in node constructors
Fix #1983
Pull request #1988 created #created-1988
1986 Drop obsolete notes on error reporting
Fix #1986
Pull request #1987 created #created-1987
1985 Tidy up namespace terminology
Fix #1985
Editorial.
The main effect is to centralise the descriptions of how to expand unprefixed QNames into a few named rules which can be referenced and reused throughout the spec.
Issue #1986 created #created-1986
Obsolete note on reporting errors
I propose dropping the following text in XQuery §2.4.2.
None of this text says anything prescriptive, and the suggested notation of URI#local appears outdated.
The method by which an XQuery 4.0 processor reports error information to the external environment is implementation-defined.
An error can be represented by a URI reference that is derived from the error QName as follows: an error with namespace URI NS and local part LP can be represented as the URI reference NS # LP . For example, an error whose QName is err:XPST0017 could be represented as http://www.w3.org/2005/xqt-errors#XPST0017.
Note:
Along with a code identifying an error, implementations may wish to return additional information, such as the location of the error or the processing phase in which it was detected. If an implementation chooses to do so, then the mechanism that it uses to return this information is implementation-defined.
Issue #1985 created #created-1985
Default namespace terminology
There are places in the spec that use sloppy terminology regarding namespaces. For example 2.1.3 says
the namespace URI is inferred from the prefix by examining the in-scope namespaces in the static context
But the static context does not define "in-scope namespaces", it defines "statically known namespaces"
I propose to put together an editorial PR to tidy this up.
Pull request #1984 created #created-1984
882 Drop fn:chain
Fix #882
Supersedes PR #1883
There has been a great deal of discussion about the relative merits of the status-quo fn:chain function and the proposed replacement fn:compose. The CG was polled on whether it preferred to have fn:chain only, fn:compose only, or both, or neither. There was no clear consensus. The only option which no-one seemed to favour was to have fn:chain only -- which is the status quo. Since no-one is happy with the status quo I am therefore proposing that we drop this function. We can then start with a clean slate.
For the record the main criticisms of the fn:chain function as currently specified were:
(a) it is more useful to have a function that combines several functions into a single function, without actually applying that function to a set of supplied arguments
(b) The function has special-case behaviour for arrays (if the input is not an array and the function has arity > 1 then the input sequence is converted to an array).
(c) The need for the function is not clearly motivated; the examples given can all be achieved in some simpler more intuitive way.
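Criticism (a) is about the difference between applying a chain of functions immediately and merely building the combined function. A minimal Python sketch of the latter follows; the name `compose` and the left-to-right application order are assumptions made for illustration and are not taken from the fn:compose proposal.

```python
from functools import reduce


def compose(*functions):
    """Combine several one-argument functions into a single function.

    Nothing is applied here: the result is a new function that, when
    called, feeds its argument through the functions from left to right.
    """
    def composed(value):
        return reduce(lambda acc, f: f(acc), functions, value)
    return composed


# The combined function can be stored and applied later, any number of times.
increment_then_double = compose(lambda x: x + 1, lambda x: x * 2)
```

Calling `increment_then_double(3)` first adds one and then doubles, giving 8.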
Issue #1983 created #created-1983
Computed node constructors - use QName literals rather than string literals
We have introduced (in 4.0) the option to specify element and attribute names in computed node constructors in the form of string literals. We should replace this with QName literals.
Pull request #1982 created #created-1982
1981 Ambiguity with qname literals and pragmas
Resolves the syntax problem identified in #1981 by requiring a space between ( and #.
Adds more examples and notes scattered around the specs.
Issue #1981 created #created-1981
Syntax for QName literals clashes with XQuery pragmas
Unfortunately (as revealed by implementation and testing) the syntax for QName literals clashes with the syntax for pragmas in XQuery.
In the expression error(#err:XPTY0004), the longest token after error is (# which looks like the start of a pragma.
It's actually a wee bit complicated. Looking at the tokenization rules, we shouldn't be recognizing a pragma here because there is no closing #). The tokenization notes say "The lexical production rules for [variable terminals] have been designed so that there is minimal need for backtracking."; the introduction of the new syntax would mean that this is no longer the case. But regardless of the details, I think we have to change the QName literal syntax.
I propose we go for doubling the hash: error(##err:XPTY0004). We need to qualify the rules for tokenizing a pragma to say that a pragma is recognized when we see "(#", optional whitespace, and then an EQName; that's not unlike the rules we have for other "variable tokens".
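The proposed rule can be sketched in Python as follows. This is a simplified model, not the grammar in the spec: EQName is reduced to an optional prefix plus a local name (URI-qualified names are left out), and the function name classify is hypothetical.

```python
import re

# Simplified EQName (assumption): optional prefix and local name only.
NAME = r"[A-Za-z_][\w.-]*"
EQNAME = rf"{NAME}(?::{NAME})?"

# Proposed rule: "(#", optional whitespace, then an EQName starts a pragma.
PRAGMA_START = re.compile(rf"\(#\s*{EQNAME}")
# Proposed QName literal syntax: a doubled hash followed by an EQName.
QNAME_LITERAL = re.compile(rf"##{EQNAME}")


def classify(text: str, pos: int) -> str:
    """Report which construct starts at text[pos] under the proposed rules."""
    if PRAGMA_START.match(text, pos):
        return "pragma"
    if QNAME_LITERAL.match(text, pos):
        return "qname-literal"
    return "other"


# Single hash: the "(" after "error" is read as the start of a pragma.
print(classify("error(#err:XPTY0004)", 5))   # prints pragma
# Doubled hash: "(#" is followed by "#", not an EQName, so no pragma is seen.
print(classify("error(##err:XPTY0004)", 5))  # prints other
print(classify("error(##err:XPTY0004)", 6))  # prints qname-literal
```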
Issue #1972 closed #closed-1972
Dynamic function call applied to empty sequence
Issue #1240 closed #closed-1240
$sequence-of-maps ? info()
Issue #1975 closed #closed-1975
1240 Allow operand of dynamic function call to be a sequence
Issue #1661 closed #closed-1661
QName arguments: also allow strings
Issue #1976 closed #closed-1976
1661 Introduce QName literals
Issue #1973 closed #closed-1973
Substantively disjoint types
Issue #1974 closed #closed-1974
1973 Cross-reference from type analysis to definition of disjointedness
Issue #1951 closed #closed-1951
Some nits regarding the method attribute
Issue #1971 closed #closed-1971
1951 Clarifications on serialization parameters
Issue #1952 closed #closed-1952
Change option name from xsi-schema-location to use-xsi-schema-location
Issue #1969 closed #closed-1969
1952 Change option name xsi-schema-location
Issue #1967 closed #closed-1967
Example for fn:unparsed-binary uses obsolete function name
Issue #1968 closed #closed-1968
1967 r/binary-resource/unparsed-binary/
Issue #1957 closed #closed-1957
Schema for XSLT incorrectly allows mixed content for xsl:output
Issue #1964 closed #closed-1964
1957 xsl output allows mixed content
Issue #1958 closed #closed-1958
Typo in map:build
Issue #1963 closed #closed-1963
1958 Fix simple typo in map:build
Issue #1980 created #created-1980
HTML serialization: the rules for adding a meta element need to be aligned with HTML5
See Saxon bugs:
https://saxonica.plan.io/issues/5852 https://saxonica.plan.io/issues/6772
regarding the recognition and generation of META elements in the HTML and XHTML header sections.
Saxon is producing HTML5 output as mandated by the 3.1 serialization spec but this is apparently either invalid or deprecated by the HTML5 specification. The 4.0 serialization spec makes some adjustments in this area but I don't think it is fully in line yet with HTML5.
Issue #1979 created #created-1979
Records: Type Safety
One cognitive challenge with records is to internalize that records are not independent types, but only map constraints. As a consequence, no type safety guarantees exist when records are accessed and updated:
- A lookup of a non-existing key raises no error.
- A record update may result in a map that does not match the original record definition.
This makes it hard and often impossible/illegal for processors to output helpful error messages.
There are reasons why we don’t want to make records too strict: an extensible record may include keys that are not defined in the record type:
(: must not raise an error :)
declare record local:r(a, *);
let $r as local:r := { 'a': 1, 'b': 2 }
return $r?b
However, for non-extensible records, I think we should allow processors to perform stricter checks when unknown keys are looked up, or when the result of an update would conflict with the original record type:
declare record local:r(a as xs:integer);
(: unknown key :)
local:r(1)?b,
(: invalid value type :)
map:put(local:r(1), 'a', 'string')
As records are not independent types, it will be difficult to enforce errors in all cases: it would require implementations to always know that a currently processed map has once been validated against a specific record type. But in many cases, implementations may be able to preserve record types for maps that have been coerced to a record, or created with a record declaration, and propagate them to updated maps. For example, we already do so when we can statically infer that the resulting map of a map:put call will match the original record type.
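The stricter checks suggested above can be modelled with a small Python sketch. It assumes the processor still knows which record type a map was created with; RecordType, lookup, and put are hypothetical names, and None stands in for the empty sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordType:
    """Hypothetical stand-in for a record declaration: field name -> required type."""
    fields: dict
    extensible: bool = False


def lookup(record: RecordType, value: dict, key: str):
    """Look up a key, rejecting keys a non-extensible record can never contain."""
    if not record.extensible and key not in record.fields:
        raise KeyError(f"{key!r} is not a field of this non-extensible record")
    return value.get(key)  # None models the empty sequence


def put(record: RecordType, value: dict, key: str, item):
    """Return an updated copy, rejecting updates that would break the record type."""
    if key in record.fields:
        if not isinstance(item, record.fields[key]):
            raise TypeError(f"{key!r} requires {record.fields[key].__name__}")
    elif not record.extensible:
        raise KeyError(f"{key!r} is not a field of this non-extensible record")
    return {**value, key: item}


open_r = RecordType({"a": object}, extensible=True)  # models: declare record local:r(a, *)
strict_r = RecordType({"a": int})                    # models: declare record local:r(a as xs:integer)

print(lookup(open_r, {"a": 1, "b": 2}, "b"))  # prints 2
```

With strict_r, both lookup(strict_r, {"a": 1}, "b") and put(strict_r, {"a": 1}, "a", "string") raise an error, which corresponds to the two cases marked "unknown key" and "invalid value type" above.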
Issue #1978 created #created-1978
Function `map:build` does not allow expressing the dependency of a value on its key. Some simple types of maps cannot be built.
The Problem
The function map:build does not allow the functional dependency of a value on its key to be defined explicitly.
As a result, it cannot be used to create even simple maps such as the following:
The input is:
("apple", "apricot", "banana", "blueberry", "cherry")
The $keys function is:
$keys := fn($x){characters($x)}
That is, every character, in every input string, is a key.
The value for each key must be the sequence of input strings that contain the key two or more times, or the empty sequence if there are none.
The expected map to be produced is:
{
"a": "banana",
"b": "blueberry",
"c": (),
"e": "blueberry",
"h": (),
"i": (),
"l": (), (: Lowercase L :)
"n": "banana",
"o": (),
"p": "apple",
"r": ("cherry", "blueberry"),
"t": (),
"u": (),
"y": ()
}
Solution
We provide a new definition of map:build. This can be a complete replacement of the current function, or it could be added as a new overload.
I am in the process of writing a PR, and your feedback would be appreciated.
The definition is simple:
let $mapBuild := fn(
$input as item()*,
$keys as (fn($item as item(), $position as xs:integer) as xs:anyAtomicType*),
$value as (fn($key as xs:anyAtomicType, $input as item()*) as item()*)
) as map(*)
{
let $allKeys := distinct-values(for-each($input, $keys))
return
$allKeys ! map:pair(., $value(., $input)) => map:of-pairs()
}
As can be seen from executing the code below, the redefined function can be used to build the "problematic" map above, as well as all the examples currently provided in the FO spec for the function map:build.
let $mapBuild := fn(
$input as item()*,
$keys as (fn($item as item(), $position as xs:integer) as xs:anyAtomicType*),
$value as (fn($key as xs:anyAtomicType, $input as item()*) as item()*)
) as map(*)
{
let $allKeys := distinct-values(for-each($input, $keys))
return
$allKeys ! map:pair(., $value(., $input)) => map:of-pairs()
}
return
let $input := ("apple", "apricot", "banana", "blueberry", "cherry"),
$employees :=
<employees>
<employee name="Jim Nelson" location="New York" ssn="1234567890" salary="123456"/>
<employee name="Ann West" location="New York" ssn="0987654321" salary="99999"/>
<employee name="Peter Smith" location="Seattle" ssn="123454321" salary="155223"/>
<employee name="Karen Johnson" location="Seattle" ssn="5432198760" salary="175000"/>
<employee name="Jonh Lagarde" location="Boston" ssn="9999999999" salary="145000"/>
<employee name="Samantha Weird" location="Boston" ssn="1111111111" salary="153000"/>
</employees>
return
(
$mapBuild(
$input,
fn($string, $pos) {distinct-values(characters($string))},
fn($key, $input)
{
filter($input, fn($string, $pos){$key = duplicate-values(characters($string))})
}
),
$mapBuild((), string#1, string#1),
$mapBuild(1 to 10, fn {. mod 3}, fn($key, $input){filter($input, fn{$key = . mod 3})}),
$mapBuild(1 to 5, identity#1, format-integer(?, "w")),
$mapBuild(("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"),
substring(?, 1, 1), fn($key, $input){filter($input, fn{$key = substring(., 1, 1)})}
),
$mapBuild(
("apple", "apricot", "banana", "blueberry", "cherry"),
substring(?, 1, 1), fn($key, $input){sum($input[$key eq substring(., 1, 1)] ! string-length(.))}
),
$mapBuild(
('Wang', 'Liu', 'Zhao'),
fn($name, $pos) { $name },
fn($key, $input){index-of($input, $key)}
),
let $titles :=
<titles>
<title>A Beginner’s Guide to <ix>Java</ix></title>
<title>Learning <ix>XML</ix></title>
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
</titles>
return
$mapBuild($titles/title,
fn($title){$title/ix},
fn($key, $input){filter($input, fn($elem){$key = $elem/ix})}
),
$mapBuild(
$employees//employee, fn{@ssn}, fn($key, $input){filter($input, fn($elem){$key = $elem/@ssn})}
),
$mapBuild(
$employees//employee, fn{@location}, fn($key, $input) {count(filter($input, fn($elem){$key = $elem/@location}))}
),
$mapBuild(
$employees//employee, fn{@location}, fn($key, $input) {max((filter($input, fn($elem){$key = $elem/@location}))/xs:decimal(@salary))}
)
)
All results (executed with BaseX) are as expected:
{"a":"banana","p":"apple","l":(),"e":"blueberry","r":("blueberry","cherry"),"i":(),"c":(),"o":(),"t":(),"b":"blueberry","n":"banana","u":(),"y":(),"h":()}
{}
{1:(1,4,7,10),2:(2,5,8),0:(3,6,9)}
{1:"one",2:"two",3:"three",4:"four",5:"five"}
{"J":("January","June","July"),"F":"February","M":("March","May"),"A":("April","August"),"S":"September","O":"October","N":"November","D":"December"}
{"a":12,"b":15,"c":6}
{"Wang":1,"Liu":2,"Zhao":3}
{"Java":(<title>A Beginner’s Guide to <ix>Java</ix></title>,<title>Using <ix>XML</ix> with <ix>Java</ix></title>),"XML":(<title>Learning <ix>XML</ix></title>,<title>Using <ix>XML</ix> with <ix>Java</ix></title>)}
{"1234567890":<employee name="Jim Nelson" location="New York" ssn="1234567890" salary="123456"/>,"0987654321":<employee name="Ann West" location="New York" ssn="0987654321" salary="99999"/>,"123454321":<employee name="Peter Smith" location="Seattle" ssn="123454321" salary="155223"/>,"5432198760":<employee name="Karen Johnson" location="Seattle" ssn="5432198760" salary="175000"/>,"9999999999":<employee name="Jonh Lagarde" location="Boston" ssn="9999999999" salary="145000"/>,"1111111111":<employee name="Samantha Weird" location="Boston" ssn="1111111111" salary="153000"/>}
{"New York":2,"Seattle":2,"Boston":2}
{"New York":123456,"Seattle":175000,"Boston":153000}
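For readers who want to check the logic outside an XPath 4.0 processor, the following Python sketch models the same idea. It is an illustration, not part of the proposal: map_build and repeated_in are hypothetical names, lists stand in for sequences, and the position argument of $keys is omitted.

```python
def map_build(items, keys, value):
    """Model of the proposed $mapBuild: every distinct key gets value(key, items)."""
    # dict.fromkeys keeps first-seen order while removing duplicate keys.
    all_keys = dict.fromkeys(k for item in items for k in keys(item))
    return {key: value(key, items) for key in all_keys}


def repeated_in(key, items):
    """All input strings that contain the key character two or more times."""
    return [s for s in items if s.count(key) >= 2]


fruit = ["apple", "apricot", "banana", "blueberry", "cherry"]
by_char = map_build(fruit, list, repeated_in)

print(by_char["r"])  # prints ['blueberry', 'cherry']
print(by_char["u"])  # prints []

# The grouping example: integers 1 to 10 keyed by their remainder modulo 3.
by_mod = map_build(range(1, 11),
                   lambda n: [n % 3],
                   lambda key, xs: [n for n in xs if n % 3 == key])
print(by_mod)  # prints {1: [1, 4, 7, 10], 2: [2, 5, 8], 0: [3, 6, 9]}
```

Because the value function receives the key and the whole input, a key can map to an empty result, which a grouping-only construction cannot express.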
Pull request #1977 created #created-1977
1889 Tidy up handling of HTML serialization version, default to HTML5
Does some general tidying up of the serialization text, but the main substantive changes are (a) to make HTML5 the default version, and (b) to make support for earlier versions effectively optional.
Please review carefully. Marking as editorial because I'm not sure any test cases need to change, but I might be wrong.
Fix #1889
Pull request #1976 created #created-1976
1661 Introduce QName literals
Fix #1661
See also #747
As discussed in the issue, I wasn't happy with the idea of changing the coercion rules to allow strings to be provided where a QName is expected, because of the need to keep the namespace context around at run-time, and because of potential confusion about exactly what namespace context is used.
Instead I have gone back to the idea of introducing QName literals, using the simple syntax #EQName.
Examples:
error(#err:XPTY0004)
node-name($node) = #xml:space
format-number($num, #de)
load-xquery-module($module)?variables?(#myvar)
transform({'initial-template':#xsl:initial-template})
{'last': 'Kay', 'first': 'Michael', 'suffix':#fn:null}
Pull request #1975 created #created-1975
1240 Allow operand of dynamic function call to be a sequence
Fix #1240 Fix #1972
This PR enables use of expressions such as $rectangle?area() - sum($rectangle?contents()?area()) which would previously have failed with a type error.
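The effect can be modelled in Python, purely as an illustration of the intended semantics. Methods are represented as zero-argument closures, lists stand in for sequences, and rectangle and call_each are hypothetical helpers.

```python
def call_each(functions, *args):
    """Call every function in a sequence and concatenate the results in order."""
    results = []
    for fn in functions:
        out = fn(*args)
        results.extend(out if isinstance(out, list) else [out])
    return results


def rectangle(width, height, contents=()):
    """A map-like object whose 'area' and 'contents' entries are callable."""
    return {
        "area": lambda: width * height,
        "contents": lambda: list(contents),
    }


inner = [rectangle(1, 2), rectangle(2, 2)]
outer = rectangle(4, 5, inner)

# Models $rectangle?area() - sum($rectangle?contents()?area()):
# the second call is applied to a sequence of two function items.
free_area = outer["area"]() - sum(call_each([r["area"] for r in outer["contents"]()]))
print(free_area)  # prints 14
```

An empty sequence of functions yields an empty result in this model, which is also the behaviour requested in #1972.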
Pull request #1974 created #created-1974
1973 Cross-reference from type analysis to definition of disjointedness
Fix #1973
Issue #1973 created #created-1973
Substantively disjoint types
Section §2.3.3.1 Static Analysis Phase mentions
A processor may raise a type error during static analysis if the inferred static type of an expression has no overlap (intersection) with the required type, and cannot be converted to the required type using the [coercion rules].
This should cross-refer to the more precisely defined concept of types being "substantively disjoint" - see §3.4.3.
Issue #1972 created #created-1972
Dynamic function call applied to empty sequence
A note in F+O under map:get states
map:get(map:get(map:get($map, 'employee'), 'name'), 'first') can be written as $map('employee')('name')('first').
That's technically correct: both these expressions will fail in the same way if $map does not contain an entry for the key employee, unlike the lookup expression $map?employee?name?first, which returns an empty sequence in this situation.
The rules for dynamic function calls (xpath, §4.5.3.1) state that $F($X) raises a type error if $F is an empty sequence.
I think it would be more useful if both map:get() and dynamic function calls were changed to have "empty if empty" semantics.
This is related to #1240 which goes further by allowing $F to be a sequence of function items.
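The difference between the two behaviours can be sketched in Python. This is only a model: None stands in for the empty sequence, the helper names are hypothetical, and the sample data is invented for illustration.

```python
from functools import reduce


def get_strict(m, key):
    """Current behaviour: looking up a key in the empty sequence is a type error."""
    if m is None:
        raise TypeError("cannot look up a key in an empty sequence")
    return m.get(key)


def get_empty_if_empty(m, key):
    """Proposed behaviour: an empty input simply yields an empty result."""
    return None if m is None else m.get(key)


def path(getter, m, *keys):
    """Apply the getter along a chain of keys, like nested map:get calls."""
    return reduce(getter, keys, m)


data = {"employee": {"name": {"first": "Ann"}}}  # illustrative data only
print(path(get_strict, data, "employee", "name", "first"))        # prints Ann
print(path(get_empty_if_empty, {}, "employee", "name", "first"))  # prints None
```

With get_strict, the same path over the empty map raises an error at the second step, which mirrors the failure described above.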
Pull request #1971 created #created-1971
1951 Clarifications on serialization parameters
Fix #1951
Issue #1970 created #created-1970
Editorial notes
XQFO
- fn:fold-right has an obsolete change section saying that “The $action callback function accepts an optional position argument.”
- “then [the] operation will fail”
- remove whitespace before/after QName literals (#1982)
- fn:unparsed-binary: return type: xs:base64Binary?
(everyone: feel free to add notes, I’ll create a PR sometime later)
Pull request #1969 created #created-1969
1952 Change option name xsi-schema-location
Change to use-xsi-schema-location (because the value is a boolean, not a location)
Fix #1952
Pull request #1968 created #created-1968
1967 r/binary-resource/unparsed-binary/
Fix #1967
Issue #1967 created #created-1967
Example for fn:unparsed-binary uses obsolete function name
One of the examples for the new function fn:unparsed-binary uses the obsolete function name fn:binary-resource
Issue #1568 closed #closed-1568
Define a Unicode case-insensitive collation
Issue #1966 closed #closed-1966
1568b Add unicode case-blind collation
Issue #1945 closed #closed-1945
1568 unicode case blind collation
Pull request #1966 created #created-1966
1568b Add unicode case-blind collation
Replaces #1945 which was approved by the CG, but had pull conflicts because of incidental editorial changes
Fix #1568
Issue #1965 created #created-1965
The Generator record
This is a continuation of the original issue https://github.com/qt4cg/qtspecs/issues/716, created almost 2 years ago, and having accumulated a lot of very useful discussion.
Now that we have methods that are fields of records, it has become practical to produce the record type entirely in code, and this is the basis for the planned PR.
1. What it contains
- The standard record fields as originally published:
initialized as xs:boolean,
endReached as xs:boolean,
getCurrent as %method fn() as item()*,
moveNext as %method fn(*)
- The following 34 methods - this will form the signatures and formal definitions of the methods inside the documentation:
toArray := %method fn()
take := %method fn($n as xs:integer)
takeWhile := %method fn($pred as function(item()*) as xs:boolean)
skip := %method fn($n as xs:nonNegativeInteger)
skipWhile := %method fn($pred as function(item()*) as xs:boolean)
some := %method fn()
someWhere := %method fn($pred)
subrange := %method fn($m as xs:positiveInteger, $n as xs:integer)
chunk := %method fn($size as xs:positiveInteger)
head := %method fn()
tail := %method fn()
at := %method fn($ind as xs:nonNegativeInteger)
for-each := %method fn($fun as function(*))
for-each-pair := %method fn($gen2 as f:generator, $fun as function(*))
zip := %method fn($gen2 as f:generator)
concat := %method fn($gen2 as f:generator)
append := %method fn($value as item()*)
prepend := %method fn($value as item()*)
insertAt := %method fn($pos as xs:positiveInteger, $value as item()*)
removeAt := %method fn($pos as xs:nonNegativeInteger)
replace := %method fn($funIsMatching as function(item()*) as xs:boolean, $replacement as item()*)
reverse := %method fn()
filter := %method fn($pred as function(item()*) as xs:boolean)
fold-left := %method fn($init as item()*, $action as fn(*))
fold-right := %method fn($init as item()*, $action as fn(*))
fold-lazy := %method fn($init as item()*, $action as fn(*), $shortCircuitProvider as function(*))
scan-left := %method fn($init as item()*, $action as fn(*))
scan-right := %method fn($init as item()*, $action as fn(*))
makeGenerator := %method fn($provider as function(*))
makeGeneratorFromArray := %method fn($input as array(*))
makeGeneratorFromSequence := %method fn($input as item()*)
toSequence := %method fn()
emptyGenerator := %method fn()
- 90 tests/examples - with calls to all the methods - in normal and edge cases
2. Where to get the executable (with BaseX) code?
For everyone's convenience, you will find the complete executable code at the end of this issue/initial-comment. Alternatively, the code is available here: https://github.com/dnovatchev/Articles/blob/main/Generators/Code/generator.xpath
The latter will always contain the latest, up-to-date code. And, of course, please execute the code with BaseX, as I have done many times.
3. What this gives us:
- Working with huge collections, that would otherwise be restricted by the available memory.
- Deferred execution.
- Handling collections containing unknown or infinite number of members
- A (next) member is produced only on request. No time is spent on producing all members of the collection.
- A (next) member is produced only on request. No memory is consumed to store all members of the collection.
- Lazy evaluation - due to the above and also using the fold-lazy method (also described in this article)
- Implementation of the original idea about Kollection - https://github.com/qt4cg/qtspecs/issues/910 .
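The deferred-execution idea can be illustrated with a reduced Python model. It is a sketch under simplifying assumptions, not a port of the record: the initialized field is dropped, only take and filter are modelled, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Generator:
    """Immutable generator: move_next returns a new generator, never mutates."""
    end_reached: bool
    get_current: Callable[[], Any]
    move_next: Callable[[], "Generator"]


def _fail():
    raise RuntimeError("called on an empty generator")


EMPTY = Generator(True, _fail, _fail)


def count_from(n: int) -> Generator:
    """Infinite generator n, n + 1, n + 2, ...; each member is made on request."""
    return Generator(False, lambda: n, lambda: count_from(n + 1))


def take(gen: Generator, n: int) -> Generator:
    if gen.end_reached or n <= 0:
        return EMPTY
    return Generator(False, gen.get_current, lambda: take(gen.move_next(), n - 1))


def filter_gen(gen: Generator, pred: Callable[[Any], bool]) -> Generator:
    # Advance only as far as the next matching member.
    while not gen.end_reached and not pred(gen.get_current()):
        gen = gen.move_next()
    if gen.end_reached:
        return EMPTY
    found = gen
    return Generator(False, found.get_current,
                     lambda: filter_gen(found.move_next(), pred))


def to_list(gen: Generator) -> list:
    out = []
    while not gen.end_reached:
        out.append(gen.get_current())
        gen = gen.move_next()
    return out


print(to_list(take(count_from(2), 3)))                                   # prints [2, 3, 4]
print(to_list(take(filter_gen(count_from(2), lambda n: n % 2 == 1), 4)))  # prints [3, 5, 7, 9]
```

Only the members that are actually requested are ever computed, so the infinite source never has to be stored.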
4. What assistance is needed
I will greatly appreciate any recommendations on how to proceed with the actual PR:
- Can this be a single PR?
- If this is too big for a single PR, how should I proceed, for example by splitting it into pieces?
- Any observations and comments on the code itself.
5. References:
- The original issue: Generators in XPath: https://github.com/qt4cg/qtspecs/issues/716
- This article: Generators in XPath
- The article defining fold-lazy : "Laziness in XPath. The trouble with fn:fold-right"
6. Complete, executable definition of the generator record
declare namespace f = "http://www.w3.org/2005/xpath-functions-2025";
declare record f:generator
( initialized as xs:boolean,
endReached as xs:boolean,
getCurrent as %method fn() as item()*,
moveNext as %method fn(*) (: as f:generator, :),
toArray := %method fn()
{
while-do( [., []],
function( $inArr)
{ $inArr(1)?initialized and not($inArr(1)?endReached) },
function($inArr)
{ array{$inArr(1)?moveNext(),
array:append($inArr(2), $inArr(1)?getCurrent())
}
}
) (2)
},
take := %method fn($n as xs:integer)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if($gen?endReached or $n le 0) then $gen?emptyGenerator()
else
let $current := $gen?getCurrent(),
$newResultGen := map:put(., "getCurrent", %method fn(){$current}),
$nextGen := $gen?moveNext()
return
if($nextGen?endReached) then $newResultGen
else
let
$newResultGen2 := map:put($newResultGen, "moveNext", %method fn() {$nextGen?take($n -1)})
return
$newResultGen2
},
takeWhile := %method fn($pred as function(item()*) as xs:boolean)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if($gen?endReached) then $gen?emptyGenerator()
else
let $current := $gen?getCurrent()
return
if(not($pred($current))) then $gen?emptyGenerator()
else
let $newResultGen := map:put(., "getCurrent", %method fn(){$current}),
$nextGen := ?moveNext()
return
if($nextGen?endReached) then $newResultGen
else
let $newResultGen2 := map:put($newResultGen, "moveNext", %method fn() {$nextGen?takeWhile($pred)})
return $newResultGen2
},
skipStrict := %method fn($n as xs:nonNegativeInteger, $issueErrorOnEmpty as xs:boolean)
{
if($n eq 0) then .
else if(?endReached)
then if($issueErrorOnEmpty)
then error((), "Input Generator too-short")
else ?emptyGenerator()
else
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if(not($gen?endReached)) then $gen?moveNext()?skipStrict($n -1, $issueErrorOnEmpty)
else $gen?emptyGenerator()
},
skip := %method fn($n as xs:nonNegativeInteger)
{
?skipStrict($n, false())
},
skipWhile := %method fn($pred as function(item()*) as xs:boolean)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if($gen?endReached) then $gen?emptyGenerator()
else
let $current := $gen?getCurrent()
return
if(not($pred($current))) then $gen
else $gen?moveNext()?skipWhile($pred)
},
some := %method fn()
{
?initialized and not(?endReached)
},
someWhere := %method fn($pred)
{
?filter($pred)?some()
},
subrange := %method fn($m as xs:positiveInteger, $n as xs:integer)
{
?skip($m - 1)?take($n - $m + 1)
},
chunk := %method fn($size as xs:positiveInteger)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if($gen?endReached) then $gen?emptyGenerator()
else
let $thisChunk := $gen?take($size)?toArray(),
$cutGen := $gen?skip($size),
$resultGen := $gen => map:put("getCurrent", %method fn(){$thisChunk})
=> map:put("moveNext", %method fn(){$cutGen?chunk($size)})
return $resultGen
},
head := %method fn() {?take(1)?getCurrent()},
tail := %method fn() {?skip(1)},
at := %method fn($ind as xs:nonNegativeInteger) {?subrange($ind, $ind)?getCurrent()},
for-each := %method fn($fun as function(*))
{
let $gen := if(not(?initialized)) then ?moveNext()
else .
return
if(?endReached) then ?emptyGenerator()
else
let $current := $fun(?getCurrent()),
$newResultGen := map:put(., "getCurrent", %method fn(){$current}),
$nextGen := ?moveNext()
return
if($nextGen?endReached) then $newResultGen
else
let $newResultGen2 := map:put($newResultGen, "moveNext", %method fn() {$nextGen?for-each($fun)})
return
$newResultGen2
},
for-each-pair := %method fn($gen2 as f:generator, $fun as function(*))
{
let $gen := if(not(?initialized)) then ?moveNext()
else .,
$gen2 := if(not($gen2?initialized)) then $gen2?moveNext()
else $gen2
return
if(?endReached or $gen2?endReached) then ?emptyGenerator()
else
let $current := $fun(?getCurrent(), $gen2?getCurrent()),
$newResultGen := map:put(., "getCurrent", %method fn(){$current}),
$nextGen1 := ?moveNext(),
$nextGen2 := $gen2?moveNext()
return
if($nextGen1?endReached or $nextGen2?endReached) then $newResultGen
else
let $newResultGen2 := map:put($newResultGen, "moveNext", %method fn(){$nextGen1?for-each-pair($nextGen2, $fun)})
return
$newResultGen2
},
zip := %method fn($gen2 as f:generator)
{
?for-each-pair($gen2, fn($x1, $x2){[$x1, $x2]})
},
concat := %method fn($gen2 as f:generator)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .,
$gen2 := if(not($gen2?initialized)) then $gen2?moveNext()
else $gen2,
$resultGen := if($gen?endReached) then $gen2
else if($gen2?endReached) then $gen
else
$gen => map:put( "moveNext",
%method fn()
{
let $nextGen := $gen?moveNext()
return
$nextGen?concat($gen2)
}
)
return
$resultGen
},
append := %method fn($value as item()*)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .,
$genSingle := $gen => map:put("getCurrent", %method fn(){$value})
=> map:put("moveNext", %method fn(){?emptyGenerator()})
=> map:put("endReached", false())
return
$gen?concat($genSingle)
},
prepend := %method fn($value as item()*)
{
let $gen := if(not(?initialized)) then ?moveNext()
else .,
$genSingle := $gen => map:put("getCurrent", %method fn(){$value})
=> map:put("moveNext", %method fn(){?emptyGenerator()})
return
$genSingle?concat($gen)
},
insertAt := %method fn($pos as xs:positiveInteger, $value as item()*)
{
let $genTail := ?skipStrict($pos - 1, true())
return
if($pos gt 1)
then ?take($pos - 1)?append($value)?concat($genTail)
else $genTail?prepend($value)
},
removeAt := %method fn($pos as xs:nonNegativeInteger)
{
let $genTail := ?skipStrict($pos, true())
return
if($pos gt 1)
then ?take($pos - 1)?concat($genTail)
else $genTail
},
replace := %method fn($funIsMatching as function(item()*) as xs:boolean, $replacement as item()*)
{
if(?endReached) then .
else
let $current := ?getCurrent()
return
if($funIsMatching($current))
then let $nextGen := ?moveNext()
return
. => map:put("getCurrent", %method fn() {$replacement})
=> map:put("moveNext", %method fn() { $nextGen }
)
else (: $current is not the match for replacement :)
let $nextGen := ?moveNext()
return . => map:put("moveNext",
%method fn()
{
let $intendedReplace := function($z) {$z?replace($funIsMatching, $replacement)}
return
if($nextGen?endReached) then $nextGen
else $intendedReplace($nextGen)
}
)
},
reverse := %method fn()
{
if(?endReached) then ?emptyGenerator()
else
let $current := ?getCurrent()
return
?tail()?reverse()?append($current)
},
filter := %method fn($pred as function(item()*) as xs:boolean)
{
if(?initialized and ?endReached) then ?emptyGenerator()
else
let $getNextGoodGen := function($gen as map(*),
$pred as function(item()*) as xs:boolean)
{
if($gen?endReached) then $gen?emptyGenerator()
else
let $mapResult :=
while-do(
$gen,
function($x) { not($x?endReached) and not($pred($x?getCurrent()))},
function($x) { $x?moveNext() }
)
return
if($mapResult?endReached) then $gen?emptyGenerator()
else $mapResult
},
$gen := if(?initialized) then .
else ?moveNext(),
$nextGoodGen := $getNextGoodGen($gen, $pred)
return
if($nextGoodGen?endReached) then $gen?emptyGenerator()
else
$nextGoodGen => map:put("moveNext",
%method fn()
{
let $nextGoodGen := $getNextGoodGen(?inputGen?moveNext(), $pred)
return
if($nextGoodGen?endReached) then $nextGoodGen?emptyGenerator()
else
map:put(map:put($nextGoodGen, "moveNext", %method fn() {$nextGoodGen?moveNext()?filter($pred)}),
"inputGen", $nextGoodGen
)
}
)
=>
map:put("inputGen", $nextGoodGen)
},
fold-left := %method fn($init as item()*, $action as fn(*))
{
if(?endReached) then $init
else ?tail()?fold-left($action($init, ?getCurrent()), $action)
},
fold-right := %method fn($init as item()*, $action as fn(*))
{
if(?endReached) then $init
else $action(?head(), ?tail()?fold-right($init, $action))
},
fold-lazy := %method fn($init as item()*, $action as fn(*), $shortCircuitProvider as function(*))
{
if(?endReached) then $init
else
let $current := ?getCurrent()
return
if(function-arity($shortCircuitProvider($current, $init)) eq 0)
then $shortCircuitProvider($current, $init)()
else $action($current, ?moveNext()?fold-lazy($init, $action, $shortCircuitProvider))
},
scan-left := %method fn($init as item()*, $action as fn(*))
{
let $resultGen := ?emptyGenerator()
=> map:put("endReached", false())
=> map:put("getCurrent", %method fn(){$init})
return
if(?endReached)
then $resultGen => map:put("moveNext", %method fn(){?emptyGenerator()})
else
let $resultGen := $resultGen => map:put("getCurrent", %method fn(){$init}),
$partialFoldResult := $action($init, ?getCurrent())
return
let $nextGen := ?moveNext()
return
$resultGen => map:put("moveNext", %method fn()
{
$nextGen?scan-left($partialFoldResult, $action)
}
)
},
scan-right := %method fn($init as item()*, $action as fn(*))
{
?reverse()?scan-left($init, $action)?reverse()
},
makeGenerator := %method fn($provider as function(*))
{
let $gen := if(not(?initialized)) then ?moveNext()
else .,
$nextDataItemGetter := $provider(0),
$nextGen := if(not($nextDataItemGetter instance of function(*))) then $gen?emptyGenerator()
else $gen?emptyGenerator()
=> map:put("numDataItems", 1)
=> map:put("current", $nextDataItemGetter())
=> map:put("endReached", false())
=> map:put("getCurrent", %method fn() {?current})
=> map:put("moveNext",
%method fn()
{
let $nextDataItemGetter := $provider(?numDataItems)
return
if(not($nextDataItemGetter instance of function(*))) then ?emptyGenerator()
else
. => map:put("current", $nextDataItemGetter())
=> map:put("numDataItems", ?numDataItems + 1)
}
)
return $nextGen
},
makeGeneratorFromArray := %method fn($input as array(*))
{
let $size := array:size($input),
$arrayProvider := fn($ind as xs:integer)
{
if($ind +1 gt $size) then -1
else fn(){$input($ind + 1)}
}
return ?makeGenerator($arrayProvider)
},
makeGeneratorFromSequence := %method fn($input as item()*)
{
let $size := count($input),
$seqProvider := fn($ind as xs:integer)
{
if($ind +1 gt $size) then -1
else fn(){$input[$ind + 1]}
}
return ?makeGenerator($seqProvider)
},
toSequence := %method fn() {?toArray() => array:items()},
emptyGenerator := %method fn()
{
. => map:put("initialized", true()) => map:put("endReached", true())
=> map:put("getCurrent", %method fn() {error((),"getCurrent() called on an emptyGenerator")})
=> map:put("moveNext", %method fn() {error((),"moveNext() called on an emptyGenerator")})
},
*
);
let $gen2ToInf := f:generator(initialized := true(), endReached := false(),
getCurrent := %method fn(){?last +1},
moveNext := %method fn()
{
if(not(?initialized))
then map:put(., "initialized", true())
else map:put(., "last", ?last + 1)
},
options := {"last" : 1}
),
$double := fn($n) {2*$n},
$sum2 := fn($m, $n) {$m + $n},
$product := fn($m, $n) {$m * $n}
return
(
"$gen2ToInf?take(3)?toArray()",
$gen2ToInf?take(3)?toArray(),
"================",
"$gen2ToInf?take(3)?skip(2)?getCurrent()",
$gen2ToInf?take(3)?skip(2)?getCurrent(),
(: $gen2ToInf?take(3)?moveNext()?moveNext()?moveNext()?getCurrent(), :)
"================",
"$gen2ToInf?getCurrent()",
$gen2ToInf?getCurrent(),
"$gen2ToInf?moveNext()?getCurrent()",
$gen2ToInf?moveNext()?getCurrent(),
"================",
"$gen2ToInf?take(5) instance of f:generator",
$gen2ToInf?take(5) instance of f:generator,
"==> $gen2ToInf?skip(7) instance of f:generator",
$gen2ToInf?skip(7) instance of f:generator,
"================",
"$gen2ToInf?subrange(4, 6)?getCurrent()",
$gen2ToInf?subrange(4, 6)?getCurrent(),
"$gen2ToInf?subrange(4, 6)?moveNext()?getCurrent()",
$gen2ToInf?subrange(4, 6)?moveNext()?getCurrent(),
"$gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?getCurrent()",
$gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?getCurrent(),
(: $gen2ToInf?subrange(4, 6)?moveNext()?moveNext()?moveNext()?getCurrent() :) (: Must raise error:)
"================",
"$gen2ToInf?subrange(4, 6)?head()",
$gen2ToInf?subrange(4, 6)?head(),
"$gen2ToInf?subrange(4, 6)?tail()?head()",
$gen2ToInf?subrange(4, 6)?tail()?head(),
"$gen2ToInf?subrange(4, 6)?toArray()",
$gen2ToInf?subrange(4, 6)?toArray(),
"$gen2ToInf?head()",
$gen2ToInf?head(),
"==> $gen2ToInf?tail()?head()",
$gen2ToInf?tail()?head(),
"================",
"$gen2ToInf?subrange(4, 6)?tail()?toArray()",
$gen2ToInf?subrange(4, 6)?tail()?toArray(),
"================",
"$gen2ToInf?at(5)",
$gen2ToInf?at(5),
"================",
"$gen2ToInf?subrange(1, 5)?toArray()",
$gen2ToInf?subrange(1, 5)?toArray(),
"$gen2ToInf?subrange(1, 5)?for-each($double)?toArray()",
$gen2ToInf?subrange(1, 5)?for-each($double)?toArray(),
"$gen2ToInf?take(5)?for-each($double)?toArray()",
$gen2ToInf?take(5)?for-each($double)?toArray(),
"==> $gen2ToInf?for-each($double)?take(5)?toArray()",
$gen2ToInf?for-each($double)?take(5)?toArray(),
"================",
"$gen2ToInf?subrange(1, 5)?toArray()",
$gen2ToInf?subrange(1, 5)?toArray(),
"$gen2ToInf?subrange(6, 10)?toArray()",
$gen2ToInf?subrange(6, 10)?toArray(),
"$gen2ToInf?subrange(1, 5)?for-each-pair($gen2ToInf?subrange(6, 10), $sum2)?toArray()",
$gen2ToInf?subrange(1, 5)?for-each-pair($gen2ToInf?subrange(6, 10), $sum2)?toArray(),
"==> $gen2ToInf?for-each-pair($gen2ToInf, $sum2)?take(5)?toArray()",
$gen2ToInf?for-each-pair($gen2ToInf, $sum2)?take(5)?toArray(),
"================",
"==> $gen2ToInf?filter(fn($n){$n mod 2 eq 1})?getCurrent()",
$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?getCurrent(),
"$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?moveNext()?getCurrent()",
$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?moveNext()?getCurrent(),
"================",
"$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toArray()",
$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toArray(),
"================",
"$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toSequence()",
$gen2ToInf?filter(fn($n){$n mod 2 eq 1})?take(10)?toSequence(),
"================",
"$gen2ToInf?takeWhile(fn($n){$n < 11})?toArray()",
$gen2ToInf?takeWhile(fn($n){$n < 11})?toArray(),
"$gen2ToInf?takeWhile(fn($n){$n < 2})?toArray()",
$gen2ToInf?takeWhile(fn($n){$n < 2})?toArray(),
"================",
"$gen2ToInf?skipWhile(fn($n){$n < 11})?take(5)?toArray()",
$gen2ToInf?skipWhile(fn($n){$n < 11})?take(5)?toArray(),
"==> $gen2ToInf?skipWhile(fn($n){$n < 2})",
$gen2ToInf?skipWhile(fn($n){$n < 2}),
"
==> $gen2ToInf?skipWhile(fn($n){$n < 2})?skip(1)",
$gen2ToInf?skipWhile(fn($n){$n < 2})?skip(1),
(: $gen2ToInf?skipWhile(fn($x) {$x ge 2}) :) (: ?skip(1) :)
"================",
"$gen2ToInf?some()",
$gen2ToInf?some(),
"let $empty := $gen2ToInf?emptyGenerator()
return $empty?some()",
let $empty := $gen2ToInf?emptyGenerator()
return $empty?some(),
"================",
"$gen2ToInf?take(5)?filter(fn($n){$n ge 7})?some()",
$gen2ToInf?take(5)?filter(fn($n){$n ge 7})?some(),
"$gen2ToInf?take(5)?someWhere(fn($n){$n ge 7})",
$gen2ToInf?take(5)?someWhere(fn($n){$n ge 7}),
"$gen2ToInf?take(5)?someWhere(fn($n){$n ge 6})",
$gen2ToInf?take(5)?someWhere(fn($n){$n ge 6}),
"$gen2ToInf?someWhere(fn($n){$n ge 100})",
$gen2ToInf?someWhere(fn($n){$n ge 100}),
"================",
"$gen2ToInf?take(10)?take(11)?toArray()",
$gen2ToInf?take(10)?take(11)?toArray(),
"$gen2ToInf?take(10)?skip(10)?toArray()",
$gen2ToInf?take(10)?skip(10)?toArray(),
"$gen2ToInf?take(10)?skip(9)?toArray()",
$gen2ToInf?take(10)?skip(9)?toArray(),
"$gen2ToInf?take(10)?subrange(3, 12)?toArray()",
$gen2ToInf?take(10)?subrange(3, 12)?toArray(),
"$gen2ToInf?take(10)?subrange(5, 3)?toArray()",
$gen2ToInf?take(10)?subrange(5, 3)?toArray(),
"================",
"$gen2ToInf?take(100)?chunk(20)?getCurrent()",
$gen2ToInf?take(100)?chunk(20)?getCurrent(),
"==> $gen2ToInf?chunk(20)?take(5)?toArray()",
$gen2ToInf?chunk(20)?take(5)?toArray(),
"================",
"$gen2ToInf?take(100)?chunk(20)?moveNext()?getCurrent()",
$gen2ToInf?take(100)?chunk(20)?moveNext()?getCurrent(),
"$gen2ToInf?take(100)?chunk(20)?moveNext()?moveNext()?getCurrent()",
$gen2ToInf?take(100)?chunk(20)?moveNext()?moveNext()?getCurrent(),
"$gen2ToInf?take(100)?chunk(20)?skip(1)?getCurrent()",
$gen2ToInf?take(100)?chunk(20)?skip(1)?getCurrent(),
"================",
"$gen2ToInf?take(100)?chunk(20)?for-each(fn($genX){$genX})?toArray()",
$gen2ToInf?take(100)?chunk(20)?for-each(fn($genX){$genX})?toArray(),
"================",
"$gen2ToInf?take(10)?chunk(4)?toArray()",
$gen2ToInf?take(10)?chunk(4)?toArray(),
"$gen2ToInf?take(10)?chunk(4)?for-each(fn($arr){array:size($arr)})?toArray()",
$gen2ToInf?take(10)?chunk(4)?for-each(fn($arr){array:size($arr)})?toArray(),
"================",
"$gen2ToInf?subrange(10, 15)?concat($gen2ToInf?subrange(1, 9))?toArray()",
$gen2ToInf?subrange(10, 15)?concat($gen2ToInf?subrange(1, 9))?toArray(),
"================",
"$gen2ToInf?subrange(1, 5)?append(101)?toArray()",
$gen2ToInf?subrange(1, 5)?append(101)?toArray(),
"$gen2ToInf?subrange(1, 5)?prepend(101)?toArray()",
$gen2ToInf?subrange(1, 5)?prepend(101)?toArray(),
"==> $gen2ToInf?append(101)",
$gen2ToInf?append(101),
"$gen2ToInf?prepend(101)?take(5)?toArray()",
$gen2ToInf?prepend(101)?take(5)?toArray(),
"================",
"$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(6, 10))?toArray()",
$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(6, 10))?toArray(),
"$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(10, 20))?toArray()",
$gen2ToInf?subrange(1, 5)?zip($gen2ToInf?subrange(10, 20))?toArray(),
"==> $gen2ToInf?zip($gen2ToInf?skip(5))?take(10)?toArray()",
$gen2ToInf?zip($gen2ToInf?skip(5))?take(10)?toArray(),
"================",
"$gen2ToInf?makeGenerator(fn($numGenerated as xs:integer)
{if($numGenerated le 9) then fn() {$numGenerated + 1} else -1}
)?toArray()",
$gen2ToInf?makeGenerator(fn($numGenerated as xs:integer)
{if($numGenerated le 9) then fn() {$numGenerated + 1} else -1}
)?toArray(),
"================",
"$gen2ToInf?makeGeneratorFromArray([1, 4, 9, 16, 25])?toArray()",
$gen2ToInf?makeGeneratorFromArray([1, 4, 9, 16, 25])?toArray(),
"$gen2ToInf?makeGeneratorFromSequence((1, 8, 27, 64, 125))?toArray()",
$gen2ToInf?makeGeneratorFromSequence((1, 8, 27, 64, 125))?toArray(),
"================",
"$gen2ToInf?take(10)?insertAt(3, ""XYZ"")?toArray()",
$gen2ToInf?take(10)?insertAt(3, "XYZ")?toArray(),
"$gen2ToInf?take(10)?insertAt(1, ""ABC"")?toArray()",
$gen2ToInf?take(10)?insertAt(1, "ABC")?toArray(),
"$gen2ToInf?take(10)?insertAt(11, ""PQR"")?toArray()",
$gen2ToInf?take(10)?insertAt(11, "PQR")?toArray(),
"==> $gen2ToInf?insertAt(3, ""XYZ"")?take(10)?toArray()",
$gen2ToInf?insertAt(3, "XYZ")?take(10)?toArray(),
(: , $gen2ToInf?take(10)?insertAt(12, "GHI")?toArray() :) (: Must raise error "Input Generator too-short." :)
"================",
"$gen2ToInf?take(10)?removeAt(3)?toArray()",
$gen2ToInf?take(10)?removeAt(3)?toArray(),
"$gen2ToInf?take(10)?removeAt(1)?toArray()",
$gen2ToInf?take(10)?removeAt(1)?toArray(),
"$gen2ToInf?take(10)?removeAt(10)?toArray()",
$gen2ToInf?take(10)?removeAt(10)?toArray(),
"==> $gen2ToInf?removeAt(3)?take(10)?toArray()",
$gen2ToInf?removeAt(3)?take(10)?toArray(),
(: , $gen2ToInf?take(10)?removeAt(11)?toArray() :) (: Must raise error "Input Generator too-short." :)
"================",
"$gen2ToInf?take(10)?replace(fn($x){$x gt 4}, ""Replacement"")?toArray()",
$gen2ToInf?take(10)?replace(fn($x){$x gt 4}, "Replacement")?toArray(),
"$gen2ToInf?take(10)?replace(fn($x){$x lt 3}, ""Replacement"")?toArray()",
$gen2ToInf?take(10)?replace(fn($x){$x lt 3}, "Replacement")?toArray(),
"$gen2ToInf?take(10)?replace(fn($x){$x gt 10}, ""Replacement"")?toArray()",
$gen2ToInf?take(10)?replace(fn($x){$x gt 10}, "Replacement")?toArray(),
"$gen2ToInf?take(10)?replace(fn($x){$x gt 11}, ""Replacement"")?toArray()",
$gen2ToInf?take(10)?replace(fn($x){$x gt 11}, "Replacement")?toArray(),
"$gen2ToInf?take(10)?replace(fn($x){$x lt 2}, ""Replacement"")?toArray()",
$gen2ToInf?take(10)?replace(fn($x){$x lt 2}, "Replacement")?toArray(),
"==> $gen2ToInf?replace(fn($x){$x gt 4}, ""Replacement"")?take(10)?toArray()",
$gen2ToInf?replace(fn($x){$x gt 4}, "Replacement")?take(10)?toArray(),
"$gen2ToInf?replace(fn($x){$x lt 3}, ""Replacement"")?take(10)?toArray()",
$gen2ToInf?replace(fn($x){$x lt 3}, "Replacement")?take(10)?toArray(),
(:
Will result in endless loop:
, "==> ==> ==> $gen2ToInf?replace(fn($x){$x lt 2}, ""Replacement"")?take(10)?toArray() <== <== <==",
$gen2ToInf?replace2(fn($x){$x lt 2}, "Replacement")?take(10)?toArray()
:)
"================",
"$gen2ToInf?emptyGenerator()?reverse()?toArray()",
$gen2ToInf?emptyGenerator()?reverse()?toArray(),
"$gen2ToInf?emptyGenerator()?append(2)?reverse()?toArray()",
$gen2ToInf?emptyGenerator()?append(2)?reverse()?toArray(),
"$gen2ToInf?take(10)?reverse()?toArray()",
$gen2ToInf?take(10)?reverse()?toArray(),
"================",
"$gen2ToInf?take(5)?fold-left(0, fn($x, $y){$x + $y})",
$gen2ToInf?take(5)?fold-left(0, fn($x, $y){$x + $y}),
"================",
"$gen2ToInf?take(5)?fold-right(0, fn($x, $y){$x + $y})",
$gen2ToInf?take(5)?fold-right(0, fn($x, $y){$x + $y}),
"================",
"$gen2ToInf?emptyGenerator()?scan-left(0, fn($x, $y){$x + $y})?toArray()",
$gen2ToInf?emptyGenerator()?scan-left(0, fn($x, $y){$x + $y})?toArray(),
"$gen2ToInf?take(5)?scan-left(0, fn($x, $y){$x + $y})?toArray()",
$gen2ToInf?take(5)?scan-left(0, fn($x, $y){$x + $y})?toArray(),
"================",
"$gen2ToInf?makeGeneratorFromSequence((1 to 10))?scan-right(0, fn($x, $y){$x + $y})?toArray()",
$gen2ToInf?makeGeneratorFromSequence((1 to 10))?scan-right(0, fn($x, $y){$x + $y})?toArray(),
"================",
let $multShortCircuitProvider := fn($x, $y)
{
if($x eq 0) then fn(){0}
else fn($z) {$x * $z}
},
$gen-5ToInf := $gen2ToInf?for-each(fn($n){$n -7})
return
(
"let $multShortCircuitProvider := fn($x, $y)
{
if($x eq 0) then fn(){0}
else fn($z) {$x * $z}
},
$gen-5ToInf := $gen2ToInf?for-each(fn($n){$n -7})
return
$gen2ToInf?take(5)?fold-lazy(1, $product, $multShortCircuitProvider),
$gen-5ToInf?fold-lazy(1, $product, $multShortCircuitProvider)",
$gen2ToInf?take(5)?fold-lazy(1, $product, $multShortCircuitProvider),
$gen-5ToInf?fold-lazy(1, $product, $multShortCircuitProvider)
)
)
Issue #1954 closed #closed-1954
Private fields in records
Pull request #1964 created #created-1964
1957 xsl output allows mixed content
Change to schema-for-xslt40
Fix #1957 (xsl:output disallow mixed content)
Add support for xsl:import-schema/@role
QT4 CG meeting 119 draft minutes #minutes—04-29
Draft minutes published.
Issue #1961 closed #closed-1961
Attempt to show that xsl:record allows extra attributes
Issue #1956 closed #closed-1956
1954 (part) Private variables and functions don't need to be in the module namespace
Issue #1271 closed #closed-1271
Schema validation in XPath
Issue #1933 closed #closed-1933
1271 fn:xsd-validator() function
Issue #557 closed #closed-557
fn:unparsed-binary: accessing and manipulating binary types
Issue #1587 closed #closed-1587
557 Add fn:unparsed-binary function
Issue #1319 closed #closed-1319
Specification Documents: Editors and Contributors
Issue #1416 closed #closed-1416
Key-value pairs: built-in record type `pair`
Issue #1844 closed #closed-1844
Drop mapping arrow operator
Issue #1704 closed #closed-1704
Ignore the byte order mark more completely/globally
Issue #1950 closed #closed-1950
1704 Add rules/notes for BOM and related topics
Issue #1906 closed #closed-1906
1797 elements-to-maps-conversion-plan function
Pull request #1963 created #created-1963
1958 Fix simple typo in map:build
Fix #1958
Issue #1962 created #created-1962
fn:map-to-element
As this feature request has been reported back to us more than once, I want to raise the question here if we want to introduce a function that inverts the result of fn:map-to-element back to an XML representation – provided that a conversion plan exists.
Many results would certainly be lossy, but as the plan is now available separately, it would be possible to roundtrip a lot of data with regular structures, without having to write custom conversion code.
Pull request #1961 created #created-1961
Attempt to show that xsl:record allows extra attributes
By the simple expedient of adding
<e:attribute name="*" required="yes">
<e:data-type name="expression"/>
</e:attribute>
to the syntax summary we get
With some additional prose, would that be sufficient?
Issue #1960 closed #closed-1960
Attempt to improve rendering of the dynamic ToC
Pull request #1960 created #created-1960
Attempt to improve rendering of the dynamic ToC
This is an attempt to complete action QT4CG-116-03.
Part of the confusion in the rendering is that it’s partly done by CSS and partly done by JavaScript and those had gotten out of sync. Given that the JavaScript code has to do some of the work, I changed things so it does all of the work.
I also discovered that the weird Firefox bug where the font size changed was, wait for it, caused by the particular codepoint being used for “changed”. So I, uh, changed it.
We now get ✚ for new sections, ✭ for changed sections, and both when there are both new and changed sections. I spent a bit of time trying to find a third symbol, but gave up.
The markup in the XSLT specification isn’t quite the same as in other specifications, so the results are a tiny bit odd in places. There are some sections that get marked “both” in the ToC but when you expand the ToC, there’s only one mark. I haven’t tried to work out what’s going on there yet.
Pull request #1959 created #created-1959
1953 (part) XSLT Worked example using methods to implement atomic sets
Provides an XSLT package that uses named record types and methods to implement an atomic set data type, as an example of how abstract data types can now be implemented.
Issue #1958 created #created-1958
Typo in map:build
If the key is already present, the processor combines the new value for the key with the existing value as determined by the and duplicates option.
Issue #1957 created #created-1957
Schema for XSLT incorrectly allows mixed content for xsl:output
The declaration (line 1412) says mixed="true".
Pull request #1956 created #created-1956
1954 (part) Private variables and functions don't need to be in the module namespace
See issue #1954
The PR removes the requirement for private variables and functions declared in library modules to be in the module namespace. There has never been any sensible reason for this restriction.
The restriction is retained for public variables and functions; one could argue that it is unnecessary in that case also, but it does no harm and enforces good coding discipline.
Issue #1955 created #created-1955
fn:doc, fn:parse-xml: entity expansion
The current rule for the entity-expansion-limit option is:
The processor should impose a limit on the number of entity references that are expanded, or on the size of the expanded entities, depending on the options available in the underlying XML parser; the limit should be commensurate with the value requested, but the precise effect may be implementation-dependent. If the XML parser does not offer the ability to impose a limit, or if the value is zero, then entity expansion should if possible be disabled entirely, leading to a dynamic error if the input contains any entity references. A negative value should be interpreted as placing no limits on entity expansion.
By default, Java uses 64000 as limit. An explicit value less than or equal to 0 indicates no limit (https://docs.oracle.com/javase/tutorial/jaxp/limits/limits.html, https://docs.oracle.com/en/java/javase/17/docs/api/java.xml/module-summary.html). I don’t know about other languages.
I would like to…
- change the option to an expand-entities Boolean, or
- change the rules and make 0 disable the limit.
Any favorites?
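The quoted rule can be read as a three-way decision on the requested value. The following is a minimal Python sketch of that reading only; the function name, the `parser_supports_limit` flag, and the returned labels are hypothetical, and the order of the checks (negative values first) is an assumption, since the rule does not say which clause wins for a parser that cannot impose limits.

```python
def interpret_entity_expansion_limit(value: int, parser_supports_limit: bool = True) -> str:
    """Classify a requested entity-expansion-limit per the quoted rule (illustrative only)."""
    if value < 0:
        # "A negative value should be interpreted as placing no limits"
        return "unlimited"
    if value == 0 or not parser_supports_limit:
        # Zero, or a parser without limit support: disable expansion entirely
        return "disabled"
    # Positive value: impose a limit commensurate with the request
    return "limited"


print(interpret_entity_expansion_limit(64000))  # limited
print(interpret_entity_expansion_limit(0))      # disabled
print(interpret_entity_expansion_limit(-1))     # unlimited
```

Under this reading, 0 and -1 have opposite effects, which is the reverse of the Java convention described above, where a value less than or equal to 0 means no limit.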
Issue #1954 created #created-1954
Private fields in records
It would be nice to have some way of indicating that some of the fields in a record are (in some sense) private, intended for internal use.
I'm not proposing full encapsulation - the instances of a record type are maps, and can be manipulated by functions such as map:keys(), map:get(), and map:put() which expose all the keys.
Rather I'm proposing a convention that makes it difficult to access the fields "accidentally" using lookup expressions: a bit like naming the fields using a leading underscore, but something a bit stronger. Analogous to reflection in Java, which allows you to break encapsulation with a bit of effort.
I'd suggest making the keys for these "private" fields QNames rather than strings:
- In the record declaration, we allow a field name to be a QName rather than a string: record(private:data as item()*, long, lat).
- QNames can't be used directly in a lookup; to access the field, you need to know what namespace "private" is bound to, which doesn't need to be published information (though it is of course discoverable)
- Internally the implementor of this interface can bind a QName to a private variable and use this:
declare %private variable $private:data as xs:QName := QName('http://my.private.namespace/', 'data');
and then access it using $record?$private:data
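The effect of QName keys can be illustrated outside XPath. The following Python sketch models a record as a dictionary whose "private" field is keyed by a QName-like value instead of a string; the `QName` class and the field values are hypothetical stand-ins, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QName:
    """Hypothetical stand-in for xs:QName: namespace URI plus local name."""
    namespace: str
    local: str


# The implementor keeps this key private; callers would need to know the namespace.
PRIVATE_DATA = QName("http://my.private.namespace/", "data")

record = {PRIVATE_DATA: [1, 2, 3], "long": 12.5, "lat": 48.1}

# A plain string lookup does not reach the private field by accident...
print("data" in record)          # False
# ...but the field is still reachable with the right key,
print(record[PRIVATE_DATA])      # [1, 2, 3]
# and enumerating the keys exposes everything, as with map:keys().
print(len(record.keys()))        # 3
```

This matches the stated intent: the field is hard to reach accidentally but is not fully encapsulated.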
Issue #1953 created #created-1953
Make generation of constructor function for named record types optional
I propose that when a named record type is declared in XQuery or XSLT, the generation of a constructor function should be optional.
Perhaps in XQuery it should only happen if there is an annotation %constructor, and in XSLT if there is an attribute constructor="yes". I think it's better for the default to be "no constructor" because it's better to make the existence of a constructor explicitly visible.
There are cases where you don't want a system-generated constructor primarily because you want to provide your own constructor which perhaps accepts the data in a slightly different form, or perhaps imposes constraints like cross-validation of supplied arguments.
Issue #1952 created #created-1952
Change option name from xsi-schema-location to use-xsi-schema-location
Functions (such as fn:doc and fn:parse-xml) that have a boolean option xsi-schema-location should change this to use-xsi-schema-location to make it clearer that the expected value is a boolean and not a schema location.
(Comment made in passing at the last meeting).
Issue #1951 created #created-1951
Some nits regarding the method attribute
A few minor comments on the method attribute (that also apply to XSLT 3.0):
- A note in section 25.1 says "In the case of the attributes method, cdata-section-elements, suppress-indentation, and use-character-maps, the effective value of the attribute contains a space-separated list of EQNames." The effective value of the method attribute should not be a list of values, just one value, right?
- The XSD schema requires that the method attribute contain a colon if it is not one of the 6 "built-in" values. (The type xsl:method restricts xsl:EQName with the pattern "\c*:\c*"). Now that it can be an expanded QName, it should also allow for Q{...}.
- A note in section 3.2 says: "Extension attributes may also be used to influence the behavior of the serialization methods xml, xhtml, html, or text, to the extent that... If a serialization method other than one of these four is requested (using a prefixed QName in the method parameter) then...".
  a. This lists only 4 methods. Should "json" and "adaptive" be added to the list and "four" changed to "six"?
  b. Should "using a prefixed QName" be changed to something like "using an EQName with a non-absent namespace"?
- In the note about error XTSE1570 in Section 26.2: "The value must (if present) be a valid EQName. If it is a lexical QName with no a prefix, then it identifies a method specified in [XSLT and XQuery Serialization] and must be one of xml, html, xhtml, or text."
  a. It only lists the 4 methods, leaving out "json" and "adaptive".
  b. Typo - "no a prefix" should be "no prefix".
Thanks!
Pull request #1950 created #created-1950
1704 Add rules/notes for BOM and related topics
Fix #1704
The main substantive change is that unparsed-text() now explicitly discards any leading BOM.
Other functions that involve decoding of octets to strings are updated to reflect the changes that we made to reference the concept of "permitted characters".
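As an illustration of the substantive change, the following Python sketch decodes octets and then drops a single leading byte order mark (U+FEFF). The helper name is hypothetical, and the sketch shows only the general idea, not the spec's exact rules for encoding detection.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark once decoded


def decode_discarding_bom(octets: bytes, encoding: str = "utf-8") -> str:
    """Decode octets and discard one leading BOM, if present (illustrative helper)."""
    text = octets.decode(encoding)
    return text[1:] if text.startswith(BOM) else text


print(decode_discarding_bom(b"\xef\xbb\xbfhello"))  # hello
print(decode_discarding_bom(b"hello"))              # hello
```

Python's plain "utf-8" codec keeps the BOM as a character, which is why the explicit check is needed here.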
Issue #1644 closed #closed-1644
fn:elements-to-maps: Mixed Content
Issue #1658 closed #closed-1658
fn:elements-to-maps: `empty`, normalize space ?
Issue #1647 closed #closed-1647
fn:elements-to-maps: Explicit Layouts
Issue #1949 created #created-1949
fn:element-to-map: Updated Feedback
My feedback is based on the latest version PR (#1906):
1. Boolean types
I think we should be careful about changing data to a representation that differs from the input data. If the input contains 0 and 1, it seems too invasive to me to return a boolean. Many users will not be aware that those numbers are valid candidates for Boolean conversions in XPath. That’s why I would still plead for adapting the type detection rules, and placing numeric before boolean (related: 5).
Things are even more awkward (if I got the rules right) when working without a conversion plan:
(: Query :)
element-to-map(<x><a>1</a><a>2</a></x>)
(: Result :)
{ "x": [ true(), 2 ] }
2. Explicit types
I still feel uneasy that we ignore the specified type if it does not match – even more so because XML is known for its rigor that documents must be well-formed to be accepted. I agree we should allow users to be lax about their generated output – by deliberately omitting types – but if type hints are supplied, I think we should take them seriously.
An example:
element-to-map(
<a>2</a>,
{ 'plan': { 'a': { 'layout': 'simple', 'type': 'boolean' } } }
)
3. Numeric casts
If the prescribed type is numeric and the value is castable as xs:numeric, then it is output as an instance of xs:integer, xs:decimal, or xs:double depending on the lexical form of the value, following the same rules as for XPath numeric literals.
Unless we use the same rule somewhere else in the spec, I would definitely vote for making things easier and choose consistency. xs:numeric(<a>1</a>) returns a double value, so I think we should do exactly the same here. If the result will be serialized as JSON, everything will be a number anyway.
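To make the difference concrete, the following Python sketch classifies a lexical form the way XPath numeric literals are classified. It assumes the XPath 3.1 literal grammar (digits only, no sign, no hexadecimal or binary forms, no underscores), so it is an approximation of the quoted rule, not the spec's definition.

```python
import re

# Approximations of the XPath 3.1 literal productions (assumption: unsigned forms only)
INTEGER = re.compile(r"[0-9]+")
DECIMAL = re.compile(r"\.[0-9]+|[0-9]+\.[0-9]*")
DOUBLE = re.compile(r"(\.[0-9]+|[0-9]+(\.[0-9]*)?)[eE][+-]?[0-9]+")


def literal_type(lexical: str):
    """Return the type an XPath numeric literal of this form would have, or None."""
    if INTEGER.fullmatch(lexical):
        return "xs:integer"
    if DECIMAL.fullmatch(lexical):
        return "xs:decimal"
    if DOUBLE.fullmatch(lexical):
        return "xs:double"
    return None


for form in ("1", "1.0", "1e0"):
    print(form, literal_type(form))
```

Under the quoted rule the three forms yield three different types; under the alternative suggested here, all three would be xs:double.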
4. Normalized space
- If empty($EE/(* | text()) … the layout is empty
I still believe empty($EE/(* | text()[normalize-space()]) would be a better choice. The error sections for both empty and list state that “whitespace-only text nodes are discarded.”, so it is not clear to me why the rules for whitespace text nodes differ for these layouts.
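The two tests quoted above differ only for elements that contain whitespace-only text. The following Python sketch, using the standard xml.etree.ElementTree module, approximates both; the function names are hypothetical, and ElementTree's text and tail attributes stand in for XPath text nodes.

```python
import xml.etree.ElementTree as ET

XML_WHITESPACE = " \t\r\n"


def text_nodes(elem):
    """Yield the text children of elem: its leading text plus each child's tail."""
    if elem.text:
        yield elem.text
    for child in elem:
        if child.tail:
            yield child.tail


def is_empty_current(elem) -> bool:
    # Current rule: no child elements and no text nodes at all
    return len(elem) == 0 and next(text_nodes(elem), None) is None


def is_empty_proposed(elem) -> bool:
    # Suggested rule: whitespace-only text nodes do not count
    return len(elem) == 0 and not any(t.strip(XML_WHITESPACE) for t in text_nodes(elem))


padded = ET.fromstring("<a> </a>")
print(is_empty_current(padded), is_empty_proposed(padded))  # False True
```

For `<a/>`, `<a>x</a>`, and `<a><b/></a>` both tests agree; only the whitespace-padded element is treated differently.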
5. 18.5.2 Creating a conversion plan
The current rules do not yet mention that child keys need to be added for list and list-plus.
In general, I would appreciate if redundancy could be removed. I’m still struggling finding all relevant information without resorting to the tests. For example, I think that due to the new XQuery code, a lot of informal and possibly lossy rules can be dropped.
6. Function signatures: document-node(), element()
Both functions should accept only elements, or accept both document nodes and elements. Maybe it’s better to only accept elements; it would resemble the name of the function.
7. deep-skip option
I wouldn’t be able to tell how a shallow-skip option could work, so maybe skip is sufficient?
8. Streamability
- The conversion is not streamable.
It is not clear (to me) what this means. Is this XSLT-related? Maybe a reference would be helpful, or we should drop the phrase if it’s not relevant anymore?
My observation was that fn:element-to-map can be implemented without keeping the full document in main-memory, so maybe we should let the processors decide what to do?
9. JSON
Example output: …shown as serialized JSON. The result is always shown as a singleton map…
This sounds contradictory, as there are no maps in the JSON terminology (but objects). Maybe there is no need to mention the JSON serialization, as the presented results are maps & arrays that can be run as XPath expressions out of the box.
10. Layout rules: errors
The error rules for empty and empty-plus say: “If any other child nodes are present, this layout fails.”. For the simple and simple-plus layouts, it is “If any child elements are present, this layout fails.”.
Am I right to assume that in both cases it’s only child elements that result in a failure?
Issue #1948 created #created-1948
fn:element-to-map: Tests
My feedback is based on the latest version PR (#1906) and the latest test cases (https://github.com/qt4cg/qt4tests/pull/223). I decided to list my observations in this repository, as I am not sure whether it’s the tests or the spec that may possibly need to be revised:
element-to-map-017:
As discussed in the last meeting (see also https://github.com/qt4cg/qtspecs/pull/1906#issuecomment-2821502378), the xsi:type attribute should already be ignored in the choice of the conversion plan, in order to choose a plan that does not include attributes:
element-to-map(parse-xml('<a xmlns="http://a.com/" xsi:type="xs:integer"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">2</a>')/*)
Result: { "Q{http://a.com/}a": 2 }
Expect: { "Q{http://a.com/}a": { "#content": "2" } }
element-to-map-401:
I would expect the record layout to be also applied for the b child node:
element-to-map(parse-xml('<a id="3"><b/></a>')/a, {'plan': {'a': {'layout': 'empty'},
'*': {'layout': 'record'}, '@id': {'type': 'numeric'} }})
Result: { "a": { "@id": 3, "b": { } } }
Expect: { "a": { "@id": 3, "b": "" } }
element-to-map-420 and others:
As discussed in the meeting, list layouts require a child key:
element-to-map(parse-xml('<a id="zz"><b/><b/><c/></a>')/a, {'plan':{'a':{'layout':'list'}}})
Error : XPTY0004: Missing key 'child' (node: a).
Expect: FOJS0008
element-to-map-511:
I would expect the mixed layout to be also applied for the a child node:
element-to-map(parse-xml('<a>The <i>short</i> introduction</a>')/a, {'plan': {'a': {'layout':'simple'}, '*': {'layout':'mixed'} }})
Result: { "a": [ "The ", { "i": [ "short" ] }, " introduction" ] }
Expect: { "a": array { ("The ", { "i": "short" }, " introduction") } }
Issue #1936 closed #closed-1936
XSD for XSLT 4.0 is missing form="qualified" on several attributes within the attribute group "literal-result-element-attributes"
Issue #1943 closed #closed-1943
Mark attribute declarations as form=qualified
Issue #1947 closed #closed-1947
1936 Mark attributes with form=qualified
Pull request #1947 created #created-1947
1936 Mark attributes with form=qualified
Fix #1936
Issue #1946 created #created-1946
We need examples of a record with an entry that is a %method and invoking this method with the result it must produce
We have two new great features in XPath 4.0 - the record and %method entries of a map.
It is natural to combine the two and have a record, one of whose entries is a method.
Unfortunately, at present there is no such example in the relevant Specs, and this leaves the reader trying to guess even the syntax of the specific record definition.
Therefore, we need at least one such example, to help readers and implementors in this.
BaseX did a great job in implementing both records and %method map-entries, and I constructed the following example, which is syntactically correct, but results in error (due to function coercion - explained in a separate issue here: https://github.com/qt4cg/qtspecs/issues/1938).
This code:
declare namespace t = "my:t";
declare record t:location
( longitude as xs:integer,
latitude as xs:integer,
myFun2 as %method fn() as xs:integer,
*
);
let $r := t:location(longitude := 25, latitude := 10, myFun2 := %method fn() {?longitude + ?latitude}
)
return ($r, $r?myFun2())
When executed with BaseX, raises an error: [XPDY0002] .: Context value is undefined.
We need the specification to provide a similar example, and what the result should be.
In particular:
- Must the return type of the method myFun2 (above) be specified or can it be omitted? BaseX raises a syntax error if this is specified as myFun2 as %method fn(): "Expecting 'as', found ','"
- If the method entry is specified as having a particular type (such as: as xs:integer) then can the corresponding function, provided in the construction of this record, omit the function type (such as myFun2 := %method fn() {?longitude + ?latitude}), even though it is clear that the type of the result is xs:integer?
Issue #1941 closed #closed-1941
Add PR numbers and dates to change metadata
Issue #1921 closed #closed-1921
XSLT: semantics of PatternVersionRange
Issue #1922 closed #closed-1922
1921 Expand definition of version ranges in XSLT
Issue #1907 closed #closed-1907
Method lookup: wildcards
Issue #1926 closed #closed-1926
1907 method lookup (disallow wildcard selection)
Issue #1928 closed #closed-1928
1844b Arrow Expressions
Issue #1724 closed #closed-1724
Allow @copy-namespaces on <xsl:mode>?
Issue #1929 closed #closed-1929
1725 xsl:mode/@copy-namespaces
Issue #1939 closed #closed-1939
XQDY0153 (from try/finally) should be a type error
Issue #1940 closed #closed-1940
1939 XQDY0153 (from try/finally) should be a type error
Issue #1931 closed #closed-1931
QT4-CG-116-02 improve description of validation
Issue #910 closed #closed-910
Introduce a Kollection object with functions that operate on all types of items that can be containers of unlimited number of "members"
Pull request #1945 created #created-1945
1568 unicode case blind collation
Fix #1568
Issue #1944 created #created-1944
Try/Catch/Finally - order of evaluation
I'm struggling a bit with try/catch/finally. Is there an implied constraint on the order of evaluation? The spec appears to suggest so:
its expression will be evaluated after the expressions of the try clause and a possibly evaluated catch clause.
What does evaluated after actually mean in a functional language? I can't see how to reconcile this with the general principles of the language regarding lazy evaluation etc. Is there some kind of exception to the general rule that you only have to evaluate as much of an expression as is needed to work out what the result is going to be?
It seems to me that the hidden unstated purpose of "finally" is to execute expressions that have side-effects, and without a proper semantic framework for handling side-effects, this is going to get us into trouble.
Pull request #1943 created #created-1943
Mark attribute declarations as form=qualified
Resubmitted because of some clerical error...
Issue #1937 closed #closed-1937
1936 Mark attribute declarations as form=qualified
Pull request #1942 created #created-1942
37 Support sequence, array, and map destructuring declarations
Closes #37.
This currently only supports XPath. I'm working on the wording for XQuery.
Pull request #1941 created #created-1941
Add PR numbers and dates to change metadata
Purely editorial. No issue raised.
Pull request #1940 created #created-1940
1939 XQDY0153 (from try/finally) should be a type error
Closes #1939
Issue #1939 created #created-1939
XQDY0153 (from try/finally) should be a type error
The finally clause is required to return an empty sequence. If not, it raises XQDY0153. This should be a type error rather than a dynamic error, so that it can be raised statically when appropriate.
(Also noted in passing: "the the" in item 4 of the "changes" section of §4.20.)
Issue #1938 created #created-1938
Invoking coerced methods
@ChristianGruen, in BaseXdb/basex#2420, brought up a test case similar to this:
declare record local:r(
f as fn() as item()
);
local:r(%method fn() {.})
? f()
=> map:keys()
With a method passed to the constructor, and retrieved by the lookup operator, one would expect the function call to return the map, and the result to be f, the set of keys of the map. With BaseX's current implementation however, it fails with [XPDY0002] .: Context value is undefined.
The reason for this failure is function coercion. The record constructor asks for a more specific type than that of the supplied function, so it is subject to function coercion. This creates a new function item, which preserves the method annotation, and which effects a call to the original one. The lookup operator then is applied to the newly created function item, which as a method is equipped with the map as its context item. But when the original function gets called, that context item is not propagated to it. So there is a (rightful?) complaint about an undefined context.
I may be missing something here, but I do not see anything in the spec that makes the context item available to the coerced function, so I think that the described behavior is in fact conformant to the spec.
But as it contradicts the original expectation, I would be grateful for a clarification.
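The mechanism described above can be modelled in a few lines of Python. This is a toy model of the behaviour as described in this issue, not BaseX's implementation and not the spec's formal semantics; the class and function names are all hypothetical.

```python
class ContextUndefined(Exception):
    """Stands in for XPDY0002: context value is undefined."""


class FunctionItem:
    def __init__(self, body, is_method=False):
        self.body = body            # callable taking the context (or None)
        self.is_method = is_method  # models the %method annotation


def context_item(context):
    """Models evaluating '.' inside a function body."""
    if context is None:
        raise ContextUndefined("XPDY0002")
    return context


def coerce(original):
    """Models function coercion: a new function item that calls the original.

    The method annotation is preserved, but the wrapper's own context
    is not passed on to the original function.
    """
    return FunctionItem(lambda context: original.body(None), is_method=original.is_method)


def lookup_and_call(record, key):
    """Models record?key(): a method receives the containing map as its context."""
    item = record[key]
    return item.body(record if item.is_method else None)


method = FunctionItem(context_item, is_method=True)
direct = {"f": method}
coerced = {"f": coerce(method)}

print(sorted(lookup_and_call(direct, "f").keys()))  # ['f']
try:
    lookup_and_call(coerced, "f")
except ContextUndefined as err:
    print("error:", err)                            # error: XPDY0002
```

In the model, the fix would be for the wrapper to forward its context to the original function; whether the specification should require that is the question raised here.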
Pull request #1937 created #created-1937
1936 Mark attribute declarations as form=qualified
Fix #1936
Issue #1936 created #created-1936
XSD for XSLT 4.0 is missing form="qualified" on several attributes within the attribute group "literal-result-element-attributes"
The attributes in attribute group "literal-result-element-attributes" should be in the XSL namespace, and most of them are, except for xsl:default-mode, xsl:default-validation and xsl:expand-text. Those 3 are missing their form="qualified" attribute, so they would default to being in no namespace.
I know it's non-normative but it would be good to make the correction. (This is also an issue for the XSD for XSLT 3.0).
Issue #1799 closed #closed-1799
"well-formed HTML document"?
Issue #1891 closed #closed-1891
`fn:parse-html`: `html-version`
Issue #1918 closed #closed-1918
1891 clarifications on HTML versions and errors
Issue #1363 closed #closed-1363
map:get and array:get
Issue #1901 closed #closed-1901
1363 fallback becomes a value not a function
Issue #1896 closed #closed-1896
Drop "parameter names" as a property of a function item
Issue #1916 closed #closed-1916
1896 Drop parameter names as a property of function items
Issue #1932 closed #closed-1932
QT4-CG-115-01 xsl:next-match examples
Issue #1930 closed #closed-1930
QT4-CG-116-04 correction to fn:function-identity
Issue #1923 closed #closed-1923
Arithmetic Expressions needlessly mentions UnionExpr
Issue #1924 closed #closed-1924
1923 Editorial adjustments for arithmetic expressions
Issue #269 closed #closed-269
Function for URI relativization
Issue #826 closed #closed-826
Arrays: Representation of single members of an array
Issue #1566 closed #closed-1566
EXPath Modules: Future
Issue #1754 closed #closed-1754
Inverse functions to bin:hex, bin:bin, and bin:octal
Issue #1780 closed #closed-1780
xsl:for-each optional variable introduction
Issue #1905 closed #closed-1905
Editorial edits
Issue #1919 closed #closed-1919
1905 Editorial edits
Issue #1935 created #created-1935
doc-available() with invalid options
The doc-available function needs to make it clear what happens when invalid options are supplied.
Clearly invalid options such as xinclude="yes-or-no" should be an error (rather than resulting in a return value of false, as the current spec might suggest).
It's less clear what should happen if, say, you request schema validation with a non-schema-aware processor. I think this probably calls for returning false rather than an error.
Issue #1934 created #created-1934
Supporting RELAX NG validation
At meeting 116, the question was raised: why don't we support RELAX NG validation?
I think that's a good question. Further, I think we should, if we can work out the technical details and arrive at consensus.
The good news is that it's a lot simpler than XSD validation. For those not familiar with RELAX NG, the 50,000 foot summary is that it's (more-or-less) regular expressions over trees. The grammar defines a number of patterns, including at least one designated as a start pattern. If the document matches (any one of the) start pattern(s), then it's valid. A trivial example looks like this:
start = doc
doc =
  element doc {
    attribute date { xsd:date },
    p+
  }
p =
  element p {
    text
  }
(RELAX NG also has an XML syntax, but the "compact syntax" is isomorphic and many people find it easier to read.)
A couple of things to note: the "p" in "p+" in the doc pattern is a reference to the "p" pattern, not to the element named "p". And although the date attribute has to conform to an xsd:date, that does not do any type assignment. RELAX NG allows user-defined data types; I suggest we make that an implementation-defined feature. Since no type assignment is performed, it doesn't really matter.
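To make the "regular expressions over trees" idea concrete, here is a rough Python sketch that hand-codes the three patterns of the grammar above against an ElementTree document. It is not a RELAX NG processor: the function names are invented for illustration, xsd:date is only approximated with date.fromisoformat, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET
from datetime import date


def matches_p(element):
    """Pattern p: element p { text } (no attributes, no child elements)."""
    return element.tag == "p" and not element.attrib and len(element) == 0


def matches_doc(element):
    """Pattern doc: element doc { attribute date { xsd:date }, p+ }."""
    if element.tag != "doc" or set(element.attrib) != {"date"}:
        return False
    try:
        # Approximation of xsd:date; the real lexical space is richer.
        date.fromisoformat(element.attrib["date"])
    except ValueError:
        return False
    children = list(element)
    if not children:
        return False  # p+ requires at least one p
    # Only whitespace is tolerated between the p children.
    stray_text = [element.text] + [child.tail for child in children]
    if any((text or "").strip() for text in stray_text):
        return False
    return all(matches_p(child) for child in children)


def matches_start(element):
    """The document is valid if it matches (one of) the start pattern(s)."""
    return matches_doc(element)
```

For example, matches_start(ET.fromstring('<doc date="2025-04-01"><p>one</p></doc>')) holds, while a doc without a date attribute or without any p child is rejected. Note that nothing is annotated with a type: the function only answers yes or no.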
How might we add support for RELAX NG validation? A sketch...
- Add to the static context a set of RELAX NG patterns. Initially empty, these are patterns that can match the document element during RELAX NG validation. (The union of all of the "start" patterns from all of the imported RELAX NG grammars.) There's no user-access to these patterns, so we don't technically need to add them to the data model, though I suppose we could.
- In XQuery, allow schema import to import RELAX NG grammars. This has no effect except that the start patterns defined in that grammar are added to the start patterns in the static context. It is an error to specify “fixed” or “default element namespace” if the imported schema is a RELAX NG grammar.
- Add “relax-ng” as a ValidationMode in ValidateExpr. It is an error to also specify a “type”. If RELAX NG validation is requested, the patterns in the static context are used to attempt to validate the document. If one succeeds, the validated document is returned. If none succeed, that’s an error.
- In XSLT, allow schema import to import RELAX NG grammars.
  - Details about role, TBD
  - Details about literal schema elements, TBD
- In XSLT, on elements that have a validation attribute, allow the value “relax-ng” with semantics analogous to the validate expression in XQuery.
- In the F&O functions that have “dtd-validation” and “xsd-validation” options, add a boolean “relax-ng” validation option. If true, validation is done with the patterns in the static context.
There's one small wrinkle: the RELAX NG DTD Compatibility specification defines some annotations that allow a RELAX NG grammar to return default attributes. That means RELAX NG validation can return a different document than was validated, but I'm not sure how important that support is in 2025, and it could be left out if there were strong objections.
Pull request #1933 created #created-1933
1271 fn:xsd-validator() function
This proposal makes schema validation (as performed by the XQuery validate expression) available as a function. This allows additional options to be defined without extending the grammar, it makes it easier to incorporate validation within a pipeline of function calls, and it makes validation available from XPath.
If the proposal is accepted I would propose doing some editorial reorganisation so that the current XQuery and XSLT text describing the semantics of validation are directed to the definition of this function, reducing duplication in the specs.
Fix #1271
Pull request #1932 created #created-1932
QT4-CG-115-01 xsl:next-match examples
Adds an example demonstrating the passing of parameters through a chain of xsl:next-match instructions.
Pull request #1931 created #created-1931
QT4-CG-116-02 improve description of validation
Improves the description of the semantics of the xsd-validation option on parse-xml() and doc(). Also brings the two functions into line by adding the xsi-schema-location option from doc() to parse-xml().
Pull request #1930 created #created-1930
QT4-CG-116-04 correction to fn:function-identity
Fix a simple typo.
Pull request #1929 created #created-1929
1725 xsl:mode/@copy-namespaces
Fix #1724
Issue #1742 closed #closed-1742
Maps constructed using streamed xsl:fork instruction should not be ordered
Issue #1925 closed #closed-1925
1844 Arrow Expressions
Pull request #1928 created #created-1928
1844b Arrow Expressions
This PR doesn't do what issue https://github.com/qt4cg/qtspecs/issues/1844 suggests, namely dropping the mapping arrow. Instead it picks up a couple of points made in passing in that issue:
(a) drops remaining references to the obsolete =?> operator
(b) simplifies the grammar for arrow expressions
(c) improves the way arrow expressions are described, including their relationship to pipeline expressions.
Issue #1927 closed #closed-1927
1907b method lookup
Pull request #1927 created #created-1927
1907b method lookup
Fix #1907
Pull request #1926 created #created-1926
1907 method lookup (disallow wildcard selection)
Fix #1907
Issue #1341 closed #closed-1341
Remove the `$position` argument from the `$action` function passed to folds
Pull request #1925 created #created-1925
1844 Arrow Expressions
This PR doesn't do what issue #1844 suggests, namely dropping the mapping arrow. Instead it picks up a couple of points made in passing in that issue:
(a) drops remaining references to the obsolete =?> operator
(b) simplifies the grammar for arrow expressions
(c) improves the way arrow expressions are described, including their relationship to pipeline expressions.
Pull request #1924 created #created-1924
1923 Editorial adjustments for arithmetic expressions
Fix #1923
Issue #1923 created #created-1923
Arithmetic Expressions needlessly mentions UnionExpr
Minor editorial adjustment needed at XPath specs, 4.8 Arithmetic Expressions. The definition of UnionExpr is mentioned in the EBNF snippets at the top, but that definition is not discussed, nor should it be. This snippet should be dropped.
Pull request #1922 created #created-1922
1921 Expand definition of version ranges in XSLT
Fix #1921
Simple editorial bug fix.
Issue #1921 created #created-1921
XSLT: semantics of PatternVersionRange
In XSLT §3.5.1, the semantics of VersionTo and VersionFromTo are described as if the keyword "to" is always followed by a VersionPrefix, whereas the syntax allows a choice of a VersionPrefix or a PackageVersion.
This problem is present in XSLT 3.0.
See also Saxon bug https://saxonica.plan.io/issues/6746
Note also the absence of tests for the form "to VersionPrefix".
Issue #1737 closed #closed-1737
Grammar problems introduced by #1732
Issue #1798 closed #closed-1798
Getting the value of the new identity-(DM)property of a function. `fn:function-identity`
Issue #1920 created #created-1920
Parse functions: determinism
The function fn:parse-xml is nondeterministic: Every function call may return a different node instance. Most other parse functions (fn:parse-json, fn:parse-csv, fn:csv-to-xml, etc) are deterministic, and I believe we should change that and make them nondeterministic as well.
We could also make fn:json-doc nondeterministic. If we don’t, we should probably add a stable option.
Pull request #1919 created #created-1919
1905 Editorial edits
Closes #1905
Issue #1917 closed #closed-1917
1891 HTML versions and errors
Pull request #1918 created #created-1918
1891 clarifications on HTML versions and errors
Fix #1891 Fix #1799
partial fix for #1889
Pull request #1917 created #created-1917
1891 HTML versions and errors
Fix #1891 Fix #1799
partial fix for #1889
Pull request #1916 created #created-1916
1896 Drop parameter names as a property of function items
Fix #1896
Issue #1911 closed #closed-1911
Remarks on recent changes to regular expression handling
Issue #1902 closed #closed-1902
`binary:unpack-integer`, overflow/underflow
Issue #451 closed #closed-451
Multiple Schemas
Issue #1819 closed #closed-1819
451 Multiple schemas in XSLT
Issue #1881 closed #closed-1881
fn:function-identity for maps and arrays
Issue #1895 closed #closed-1895
1881 Function identity for maps and arrays
Issue #1876 closed #closed-1876
`fn:replace`: Combine $replacement and $action parameters
Issue #1897 closed #closed-1897
1876 In fn:replace(), merge the $replacement and $action parameters
Issue #1520 closed #closed-1520
Type declarations of cyclically dependent modules
Issue #1908 closed #closed-1908
1520 Allow forwards references to named item types
Issue #1910 closed #closed-1910
1021 (part 1) Add $options arg to doc() and doc-available()
Issue #501 closed #closed-501
Error handling: try/finally
Issue #1914 closed #closed-1914
501 Error handling: try/finally
Issue #1915 closed #closed-1915
1902b bin:unpack out of range error
Issue #1624 closed #closed-1624
document-node(a|b) is the same type as document-node(a)|document-node(b)
Issue #1898 closed #closed-1898
1624b Expand rules for document node subtyping
Issue #1832 closed #closed-1832
Associativity of Operators, especially "||" (Appendix A.5)
Issue #1904 closed #closed-1904
1832 Operator Associativity
Issue #564 closed #closed-564
Sorted maps
Issue #982 closed #closed-982
scan-left, scan-right: position argument, array functions
Issue #1846 closed #closed-1846
%method functions, dynamic function calls
Issue #1900 closed #closed-1900
Records: instance checks
Issue #1913 closed #closed-1913
1911 Clarifications for regular expressions
Issue #1645 closed #closed-1645
fn:elements-to-maps: Debugging
Issue #1646 closed #closed-1646
fn:elements-to-maps: Robustness
Issue #1648 closed #closed-1648
fn:elements-to-maps: Types
Issue #1909 closed #closed-1909
1902 bin unpack out of range
Pull request #1915 created #created-1915
1902b bin:unpack out of range error
Replaces PR #1909
Adds error conditions for unpacking an integer that is too large for the implementation
Pull request #1914 created #created-1914
501 Error handling: try/finally
Closes #501
Pull request #1913 created #created-1913
1911 Clarifications for regular expressions
- Reinstates the non-capturing group syntax (?: xxx )
- Clarifies that a zero-length matching segment does not overlap an immediately preceding adjacent (but non-zero-length) segment.
Issue #1912 created #created-1912
Error handling: `fn:throw`
Adopted from #501:
In https://github.com/qt4cg/qtspecs/pull/493, a function/expression was suggested to re-throw errors:
try {
  (: wild stuff :)
} catch * {
  module:log($err:description),
  fn:throw($err:map)
}
An existing error map can be modified before it is rethrown:
try {
  1 div 0
} catch * {
  module:log($err:description),
  fn:throw($err:map => map:put('description', 'Arithmetic error'))
}
Issue #1911 created #created-1911
Remarks on recent changes to regular expression handling
I would like to share these observations that I made while working on recent changes of regular expression handling per #1856.
The section that mentions potential rewrites of \b and \B fails to consider the start and end of the string, as well as the empty string. It should rather read:
\b can be rewritten to an equivalent form in terms of lookbehind and lookahead assertions:
(?:(*positive_lookbehind:\w)(?:$|(*positive_lookahead:\W))|(?:^|(*positive_lookbehind:\W))(*positive_lookahead:\w))
A similar rewrite is possible for \B, but it must additionally take care of the empty string.
For fn:analyze-string, it might be useful to add a clarifying remark about empty matches at the end of the result (see qt4cg/qt4tests#224).
The specification of fn:analyze-string contains a duplicated word, "the the". The same also occurs in several other places in the documents.
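The proposed \b rewrite can be sanity-checked outside XPath. The sketch below transcribes it into Python's re syntax, writing (?<=...) and (?=...) for the named lookbehind and lookahead forms, and compares the positions it matches with those of the built-in \b. This only shows that the shape of the rewrite handles string start, string end and the empty string under Python's rules; it says nothing definitive about the XPath regex flavor, and the helper names are made up.

```python
import re

# Built-in word boundary.
WORD_BOUNDARY = re.compile(r"\b")

# The rewrite from the issue, transcribed to Python lookaround syntax:
#   word char behind, then end of string or non-word char ahead, OR
#   start of string or non-word char behind, then word char ahead.
REWRITTEN = re.compile(r"(?:(?<=\w)(?:$|(?=\W))|(?:^|(?<=\W))(?=\w))")


def boundary_positions(pattern, text):
    """Return the offsets of all (zero-length) matches of pattern in text."""
    return [match.start() for match in pattern.finditer(text)]


def same_boundaries(text):
    """True if the rewrite finds exactly the positions that \\b finds."""
    return boundary_positions(WORD_BOUNDARY, text) == boundary_positions(REWRITTEN, text)
```

For "hello world" both patterns report the offsets 0, 5, 6 and 11, and for the empty string both report no match at all. The \B case is deliberately left out of the sketch.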
Pull request #1910 created #created-1910
1021 (part 1) Add $options arg to doc() and doc-available()
To follow: options for collection() and uri-collection().
Pull request #1909 created #created-1909
1902 bin unpack out of range
Add error condition.
Fix #1902
I also did some work on removing errors and warnings from the EXPath binary build. There are a couple of outstanding issues I'm not sure how to fix:
(a) The function bin:bin had the incorrect id value func-bin-binary instead of func-bin-bin. I've corrected it, but the database of section ids needs updating.
(b) In database.xml, the EXPath binary spec is identified as document-summary/@uri = "https://qt4cg.org/specifications/EXPath/binary-40/". But the actual location of the specification is "https://qt4cg.org/specifications/expath-binary-40/"
(c) There are tags such as <code>bin:index-of-range</code> which the stylesheet is trying to interpret as function names rather than error codes. They actually refer to obsolete error codes so we can't use <errorref>.
Pull request #1908 created #created-1908
1520 Allow forwards references to named item types
Fix #1520
Issue #1907 created #created-1907
Method lookup: wildcards
We should ignore the %method annotation for wildcard lookups:
let $data := { 'fn': %method fn() { . } }
return $data?*
I cannot see when this makes sense as it only seems to work if the wildcard lookup returns a single item (→ “selects a key/value pair whose value part is a singleton method”). In addition, it makes streaming of wildcard results troublesome.
If we believe we should support context value bindings for wildcards, I think it would be better to apply it to each item of the returned value, instead of the value as a whole.
Pull request #1906 created #created-1906
1797 elements-to-maps-conversion-plan function
The PR moves the "uniform" option of elements-to-maps into a separate function elements-to-maps-conversion-plan, which can be used to analyze a corpus of data and generate a conversion plan for use by elements-to-maps. This is useful when the conversion is to be applied to documents that are not part of the corpus, for example when new documents arrive for conversion every day and need to be converted in a consistent way. It also provides a more general mechanism for users to override the system decisions on what layouts to use for what elements.
The PR is not entirely complete at this stage: the technical detail is all there, but examples need to be reviewed. Comments are welcome at this stage.
There are a few other minor changes. The most notable are:
- More consistent fallback when an inappropriate layout is chosen. If the layout does not allow attributes, then attributes are discarded; if there is any other mismatch, the converter falls back to serialized XML layout.
- Better handling of boolean and numeric element and attribute content.
Issue #1905 created #created-1905
Editorial edits
XQFO:
- Buggy examples/results: map:put, map:of-pairs, fn:scan-left
- duplicates: the the, …
- Boolean defaults: true()/false() vs. true/false
…to be continued
Pull request #1904 created #created-1904
1832 Operator Associativity
Update the table and explanatory notes.
Fix #1832
Issue #1903 created #created-1903
`fn:scan-left`, `fn:scan-right`: missing steps
I have labeled #982 (which included position arguments) to be closed to focus on the remaining todos:
- The types of the $action parameters of fn:fold-right and fn:scan-right should be aligned. In particular, item()* and item() of the scan function should be swapped: → #1919
fn:fold-right(
  $input as item()*,
  $init as item()*,
  $action as fn(item(), item()*) as item()*
) as item()*
fn:scan-right(
  $input as item()*,
  $init as item()*,
  $action as fn(item()*, item()) as item()*
) as array(*)*
- The result of the last example of fn:scan-left is syntactically wrong. → #1919
- The equivalent array functions are still missing (if we still believe we want to include them).
- We need tests.
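As an illustration of why the two $action signatures should agree, here is a small Python sketch in which both functions call the action as action(item, accumulator), mirroring fn(item(), item()*) from fn:fold-right. The Python names and the list-based result are assumptions for the sketch, not the specified behavior.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
A = TypeVar("A")


def fold_right(items: Sequence[T], init: A, action: Callable[[T, A], A]) -> A:
    """Combine items from right to left, calling action(item, accumulator)."""
    accumulator = init
    for item in reversed(items):
        accumulator = action(item, accumulator)
    return accumulator


def scan_right(items: Sequence[T], init: A, action: Callable[[T, A], A]) -> List[A]:
    """Like fold_right, but keep every intermediate accumulator.

    The first entry is the overall fold result and the last entry is init.
    """
    results = [init]
    for item in reversed(items):
        results.append(action(item, results[-1]))
    results.reverse()
    return results
```

With aligned parameter order, the same callback can be handed to either function: scan_right([1, 2, 3], 0, lambda item, acc: item + acc) gives [6, 5, 3, 0], and its first entry equals the fold_right result for the same arguments.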
Issue #1902 created #created-1902
`binary:unpack-integer`, overflow/underflow
If binary:unpack-integer or binary:unpack-unsigned-integer generates a value that exceeds the range supported by the implementation, err:FOAR0002 should be raised.
Related: https://github.com/expath/expath-cg/issues/116
Pull request #1901 created #created-1901
1363 fallback becomes a value not a function
Issue #1363 generated a large amount of discussion on how to handle absent keys in map:get() and out-of-range indexes in array:get().
I felt that one of the simplest proposals was to change the $fallback argument to be a simple default value, rather than a function. This eliminates some of the more "clever" use cases, but these can always be achieved in other ways, as the discussion thread demonstrates. Meanwhile reducing $fallback to a simple default value makes life easier for the 90% of cases where this is all that is needed (especially for arrays, when the desire is to return a default value rather than throwing an error).
This PR therefore implements that simple proposal.
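For readers who think in other languages, the "fallback as a value" design is roughly what Python's dict.get offers. The sketch below is only an analogy under assumed semantics: array_get and map_get are hypothetical stand-ins, positions are taken to be 1-based, and an out-of-range position without a fallback is assumed to raise an error.

```python
_MISSING = object()  # sentinel: distinguishes "no fallback given" from None


def array_get(members, position, fallback=_MISSING):
    """Return the member at a 1-based position, or the fallback value."""
    if 1 <= position <= len(members):
        return members[position - 1]
    if fallback is _MISSING:
        raise IndexError(f"position {position} is out of range")
    return fallback


def map_get(entries, key, fallback=None):
    """Return the value for key, or the fallback value if the key is absent."""
    return entries.get(key, fallback)
```

The caller passes the default directly, as in array_get(["a", "b"], 3, "none"), instead of wrapping it in a callback. Computing a default lazily, which a function-valued fallback allowed, has to be written out by the caller.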
Fix #1363
Issue #1766 closed #closed-1766
1715 Drop array bound checking
Issue #1900 created #created-1900
Records: instance checks
Continues #1862:
In the last meeting, we discussed whether the order of record entries should be considered in instance checks. After further reflection and attempts to implement it, I believe this will make things much easier in the long term:
In 3.4.1 Item Coercion Rules, the coercion of records was added as a second exceptional case: The coercion may change the item in question even if the upstream instance check is successful. This leads to additional action that I believe could simply be avoided if the successful instance means that no further action is required. I think it will also reduce possible cost that was indicated in https://github.com/qt4cg/qtspecs/issues/1862#issuecomment-2709104860.
Note: This issue clearly focuses on implications for the implementation. From a user perspective, I assume it will hardly ever make a difference whether we consider order or not. My assumption is that nearly all records will have the expected order anyway, or they will match the order once the first coercion has taken place.
All this takes time to specify. I will be glad to make an attempt and write the PR.
Issue #1899 closed #closed-1899
Superflous whitespace change to nudge CI; apologies for the noise.
Pull request #1899 created #created-1899
Superflous whitespace change to nudge CI; apologies for the noise.
Issue #1660 closed #closed-1660
Further suggestions for fn:path
Issue #1747 closed #closed-1747
Function finder is broken
Issue #1858 closed #closed-1858
Initial xsl:record
Issue #1870 closed #closed-1870
Rename $zero keyword of fold-left and fold-right
Issue #1887 closed #closed-1887
1870 rename $zero keyword of fold functions
Issue #1886 closed #closed-1886
1660 Additional options for fn:path
Issue #1862 closed #closed-1862
Records: consider order
Issue #1874 closed #closed-1874
1862 Coercing to a record type changes map order
Issue #1861 closed #closed-1861
xsl:next-match with-all-params
Issue #1875 closed #closed-1875
1861 Params passed automatically through next-match
Pull request #1898 created #created-1898
1624b Expand rules for document node subtyping
Fix #1624
Pull request #1897 created #created-1897
1876 In fn:replace(), merge the $replacement and $action parameters
Fix #1876
Issue #1884 closed #closed-1884
Deep-equality keys
Issue #1896 created #created-1896
Drop "parameter names" as a property of a function item
One of the properties of function items is the parameter names.
This property is unused; there is nothing that depends on the value of this property, and no way of discovering the value, and it isn't defined for all function items, e.g. maps and arrays, or functions returned by functions such as fn:op. It causes complications, such as whether two functions can have the same identity if they have different parameter names. I propose to drop it.
Of course, there are open issues that suggest allowing parameter names to be used in dynamic function calls. But I see little chance of coming up with a design that achieves this, because in general when you're given a function item to call, you have no idea what the parameter names are, and the person supplying the function item has very little control over what the parameter names will be.
Pull request #1895 created #created-1895
1881 Function identity for maps and arrays
Supplies rules for how fn:function-identity() should handle maps and arrays.
Also makes the point that labels are ignored. There's a general statement to that effect in XDM that labels are ignored except where otherwise specified, but it's useful to avoid any doubt here.
Fix #1881
Issue #1892 closed #closed-1892
Dnovatchev dn examples (ignore this)
Pull request #1894 created #created-1894
Additional examples to fn:chain - in a new branch
Re-submitted the same as PR 1890. Added some new examples to fn:chain.
Issue #1890 closed #closed-1890
More examples added to fn:chain
Issue #1893 closed #closed-1893
Fix broken markup
Pull request #1893 created #created-1893
Fix broken markup
I cannot imagine how we got a merged PR that included broken markup, but it's probably made a mess of the diffs recently.
Pull request #1892 created #created-1892
Dnovatchev dn examples (ignore this)
This is PR #1890 rebased off master to test if it makes for cleaner diffs.
Issue #1891 created #created-1891
`fn:parse-html`: `html-version`
Maybe we can align the HTML versions that fn:parse-html needs to support with the remaining specification. It currently says:
Valid values an implementation must support for the html method are:
- 3, 3.2 for HTML 3.2 W3C Recommendation, 14 January 1997
- 4, 4.01 for HTML 4.01 W3C Recommendation, 24 December 1999
- 5.0 for HTML5 W3C Recommendation, 28 October 2014
- 5.1 for HTML 5.1 W3C Recommendation, 1 November 2016
- 5.2 for HTML 5.2 W3C Recommendation, 14 December 2017
- LS for HTML Living Standard, WHATWG
- 5 may be equivalent to any of 5.0, 5.1, 5.2, or LS
In the XQFO and Serialization specs, only HTML 4.0/4.01 and HTML 5 are mentioned.
@rhdunn Do you have an opinion on this?
Related: #1889
Pull request #1890 created #created-1890
More examples added to fn:chain
Added 6 more examples and tests. All are correctly executed.
Issue #1889 created #created-1889
HTML serialization: `html-version` and `version` parameters; allowed values
The serialization spec says (HTML Output Method: the version and html-version Parameters):
If the html-version serialization parameter is not absent, the requested HTML version is the value of the html-version serialization parameter; otherwise, it is the value of the version serialization parameter.
fn:serialize defines the following defaults:
- html-version: 5
- version: 1.0
I wonder whether these rules cover all possible cases:
- Is it correct that HTML will be serialized as HTML5 if no options are supplied?
(: html-version=5 :)
serialize(<html/>, { 'method': 'html' })
- If only version is supplied, is it correct that it is ignored because of html-version defaulting to 5?
(: html-version=5 ? :)
serialize(<html/>, { 'method': 'html', 'version': '4.01' })
- If no, i.e., if { 'version': '4.01' } is expected to overwrite the default for html-version, how can we know at which stage the default values are to be considered?
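The ambiguity in these questions comes down to when the defaults are merged in. The following Python sketch models the quoted rule with two hypothetical strategies, defaults_first and defaults_last; neither is claimed to be what the specification or any processor actually does.

```python
# Defaults as listed for fn:serialize in this issue (values kept as strings here).
DEFAULTS = {"html-version": "5", "version": "1.0"}


def requested_html_version(params):
    """The quoted rule: use html-version if it is not absent, else version."""
    if params.get("html-version") is not None:
        return params["html-version"]
    return params.get("version")


def defaults_first(user_params):
    """Merge the defaults into the user's parameters, then apply the rule."""
    return requested_html_version({**DEFAULTS, **user_params})


def defaults_last(user_params):
    """Apply the rule to the user's parameters; fall back to defaults if empty."""
    requested = requested_html_version(user_params)
    if requested is not None:
        return requested
    return requested_html_version(DEFAULTS)
```

Both strategies agree on "5" when no options are supplied. They diverge for { 'version': '4.01' }: defaults_first returns "5", so the supplied version is ignored, while defaults_last returns "4.01".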
In addition, the serialization specification mentions versions HTML 4.01 and HTML5 various times, but it seems to be up to the implementation to decide which HTML versions to support. However, we seem to have test cases for 4.0 and 5. Would it make sense to define a minimum set of versions that need to be supported?
Finally, for some reason, the html-version parameter was defined to be a decimal, whereas version is defined as a string (since XQFO 3.1). Maybe this leads to the surprising result that Saxon seems to accept the option { 'version': '4.0' }, but rejects { 'html-version': 4 }.
Pull request #1888 created #created-1888
366 xsl:package-location
First draft, for initial feedback.
Notes:
- Because the CG has little energy/resources to develop the EXPath Zip module, I have situated the question of archive (compressed or not) in the URI scheme itself. There are dozens of archives, dozens of URI schemes. The only case where I have found overlap is in the jar: scheme/archive. Yes, I've seen zip: used as an alias for jar:, but it's not an official IANA URI scheme. This may need discussion.
- I have opted to bind @priority to a non-zero integer. This is the first time the constraint for the union of positive and negative integers has been placed on an XSLT attribute, so I may not have correctly set up element-catalog.xml.
- I have opted not to make the attribute values format, name, and version criteria for the priority package location (new term), so that developers can be warned when the package is at odds with the declaration. To make them criteria would mean that inconsistencies between the declaration and the referenced packages would remain undetected.
- I adopted the terms "URL" and "entry" based upon the IANA nomenclature for the jar: scheme.
- I may have overthought the distinction between archive and non-archive URIs. Feedback is appreciated.
- Error code 3000 has been broken up into different possible errors.
- Suggestions on the type and number of tests that need to be written for the test suite are welcome.
Pull request #1887 created #created-1887
1870 rename $zero keyword of fold functions
Fix #1870
Issue #998 closed #closed-998
regular expression addition - lookbehind assertions and lookahead assertions
Issue #1848 closed #closed-1848
Define regular expressions using XSD 1.1 as baseline
Issue #1856 closed #closed-1856
998 Add boundary and lookahead/behind assertions
Pull request #1886 created #created-1886
1660 Additional options for fn:path
Issue #1860 closed #closed-1860
fn:parse-xml: DTDs, external resources
Issue #1857 closed #closed-1857
fn:parse-xml: `xinclude`
Issue #1879 closed #closed-1879
1857, 1860: Add more options to parse-xml
Issue #1882 closed #closed-1882
982 Editorial rewrite of scan-left and scan-right
Issue #1866 closed #closed-1866
Ambiguities introduced by #1864
Issue #1877 closed #closed-1877
1866 Disambiguate TypeSpecifier syntax
Issue #1867 closed #closed-1867
1341 Drop position from fold callbacks
Issue #1869 closed #closed-1869
`fn:duplicate-values`: Order of results
Issue #1873 closed #closed-1873
1869 duplicate values
Issue #1851 closed #closed-1851
Questions on `fn:atomic-type-annotation`
Issue #1878 closed #closed-1878
1851 Make ?variety optional; explain namespace-sensitive
Issue #1863 closed #closed-1863
add \U \u L \u \E to replace() (case conversion)
Issue #1880 closed #closed-1880
Editorial revision of fn:function-identity
Issue #1885 created #created-1885
Use the spcification grammar markup to define the regular expression grammar in F&O
The grammar for regular expressions in the regular expression section of F&O is currently defined as a code block. Making it use the grammar markup used to define the pattern, XPath, and XQuery grammars would:
- give the grammar a unified appearance with the other grammars;
- allow grammar elements to be cross referenced and linked back to the grammar.
Issue #1884 created #created-1884
Deep-equality keys
Issue #119 proposes extending maps to allow arbitrary values as keys. This is very difficult to achieve, (a) because the fact that keys are atomic items is deeply embedded in the design of a number of functions and operations on maps, and (b) because it's very hard to define an equality function that suits everyone.
The way we tackled variable equality semantics for strings was via the collation-key() function, which takes a string and a collation as input and produces an opaque key value, which can be used as a key in maps, and which reflects the desired equality semantics.
We could extend the same idea to values other than strings. In particular, we could define a deep-equality-key() that can be calculated for any sequence, and that takes all the matching options of the deep-equal() function as a parameter. (We could then redefine deep-equal(a, b, options) to mean deep-equality-key(a, options) eq deep-equality-key(b, options)).
The main drawback is that the deep-equality-keys for large node trees or maps would be rather long strings. People might use the functionality without realising the expense.
Another problem is that one of our options in deep-equal() is a callback function for item equality, and we couldn't replicate this when computing a key. But this callback is the only way we have, for example, to compare nodes by identity rather than by content.
Note that an internal deep-equality-key concept (or at least a deep-equality hashcode) is needed anyway for efficient implementation of deep-equal where order is deemed irrelevant. Without it, the function becomes O(n^2). Quite independently of this proposal, we should perhaps have an explicit option on deep-equal() to compare nodes by identity.
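A toy version of the idea, in Python rather than XPath, may help: compute a hashable key for a nested value so that two values are "deep-equal" exactly when their keys are equal, and the key can then index a dictionary. Everything here is an assumption made for the sketch, including the function name, the single ordered option, and the restriction to dicts, lists, tuples and hashable atoms (no nodes, no callback).

```python
from collections import Counter


def deep_equality_key(value, ordered=True):
    """Return a hashable key; equal keys mean the values are deep-equal.

    With ordered=False, the order of sequence members is ignored, but the
    number of occurrences of each member still matters.
    """
    if isinstance(value, dict):
        entries = ((k, deep_equality_key(v, ordered)) for k, v in value.items())
        return ("map", frozenset(entries))
    if isinstance(value, (list, tuple)):
        member_keys = [deep_equality_key(member, ordered) for member in value]
        if ordered:
            return ("seq", tuple(member_keys))
        # Compare as a multiset: one pass, no pairwise comparison needed.
        return ("bag", frozenset(Counter(member_keys).items()))
    # Atoms: include the type name so that 1 and 1.0 yield different keys.
    return ("atom", type(value).__name__, value)


def deep_equal(a, b, ordered=True):
    """deep-equal redefined in terms of the key, as suggested above."""
    return deep_equality_key(a, ordered) == deep_equality_key(b, ordered)
```

The sketch also shows the cost mentioned above: the key is as large as the value it describes. In exchange, the unordered comparison no longer needs pairwise matching of members.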
Issue #1296 closed #closed-1296
982 Rewrite of scan-left and scan-right
Pull request #1883 created #created-1883
882 Replace fn:chain by fn:compose
Drops the existing fn:chain function and replaces it with a new fn:compose function.
This combines two separate changes:
(a) whereas fn:chain applies a sequence of functions to an input, fn:compose returns a composite function that can be used repeatedly with different inputs.
(b) the fn:compose function is restricted to arity-1 functions, which leads to a much simpler specification that still handles the vast majority of practical use cases.
In particular, note that if the sequence of functions to be applied is statically known, then it can always be written out explicitly; the real use case for this function is when the sequence of functions is constructed dynamically. And in this situation, fn:chain in its current form can easily fail because of problems with the arity of the functions included in the chain.
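The difference described in (a) and (b) can be sketched in a few lines of Python. This is an illustration only: the left-to-right application order and the behavior for an empty list of functions are assumptions, not taken from the PR.

```python
from functools import reduce


def compose(*functions):
    """Return one arity-1 function that applies the given arity-1 functions
    in order, from left to right (assumed order)."""
    def composite(value):
        return reduce(lambda accumulator, function: function(accumulator), functions, value)
    return composite


# The sequence of functions can be built dynamically and the result reused.
steps = [str.strip, str.upper]
normalize = compose(*steps)
```

Because the composite is itself a function, it can be applied repeatedly, for example normalize("  a ") and then normalize(" b"), without passing the list of steps again. Restricting every step to arity 1 removes the questions about spreading a result over several arguments that a chain of mixed-arity functions raises.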
Issue #1865 closed #closed-1865
Callback functions, position argument: consistency
Pull request #1882 created #created-1882
982 Editorial rewrite of scan-left and scan-right
This is intended to be purely an editorial rewrite, it does not change the functionality.
Replaces #1296.
Addresses #982, but we still need to add corresponding functions for arrays.
Issue #1881 created #created-1881
fn:function-identity for maps and arrays
The data model spec says that function identity is not defined for maps and arrays.
The specification of fn:function-identity() fails to mention this fact.
Pull request #1880 created #created-1880
Editorial revision of fn:function-identity
Tidies up the text and adds examples
Pull request #1879 created #created-1879
1857, 1860: Add more options to parse-xml
Add options to control entity expansion and XInclude processing.
Fix #1857 Fix #1860
Pull request #1878 created #created-1878
1851 Make ?variety optional; explain namespace-sensitive
Fix #1851
Allow ?variety to be absent e.g. for xs:anySimpleType
Define namespace-sensitive by an xtermref to the definition in the XP/XQ spec.
Pull request #1877 created #created-1877
1866 Disambiguate TypeSpecifier syntax
Fix #1866
Issue #1876 created #created-1876
`fn:replace`: Combine $replacement and $action parameters
We could combine the competing `$replacement` and `$action` parameters:
replace(
'this is a test',
'(\w)(\w+)?',
fn($s, $g) { upper-case($g[1]) || lower-case($g[2]) }
)
Original comment: https://github.com/qt4cg/qtspecs/issues/1863#issuecomment-2711149296
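For comparison, Python's `re.sub` already accepts either a replacement string or a callback in the same argument position. The sketch below mirrors the example above; note that the Python callback receives a match object rather than the `($s, $g)` pair assumed in the XQuery snippet, and `title_case_words` is a hypothetical name.

```python
import re

def title_case_words(text: str) -> str:
    """Upper-case the first character of each word and lower-case the rest."""
    def action(match: re.Match) -> str:
        first = match.group(1)
        rest = match.group(2) or ""  # group 2 is optional and may be unmatched
        return first.upper() + rest.lower()

    return re.sub(r"(\w)(\w+)?", action, text)

print(title_case_words("this is a test"))
```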
Pull request #1875 created #created-1875
1861 Params passed automatically through next-match
Fix #1861
Pull request #1874 created #created-1874
1862 Coercing to a record type changes map order
Fix #1862
Pull request #1873 created #created-1873
1869 duplicate values
Fix #1869
Issue #1872 created #created-1872
Arrays: members → values / entries?
I am pretty sure the first reaction will be DONT!, but for the sake of consistency it seems important enough for me to bring this up:
Could we rename array “members” to “values”?
Some advantages that I would see:
- We could treat arrays and maps more similarly.
- We already have a `values` lookup key specifier for arrays.
- No 3.1 array function contains the string “member”, so we will not introduce any backward inconsistencies.
- All 4.0 features that use this string could be safely renamed.
Of course the term “values” is a very common one, but we have decided to stick with “map values” – and arrays and maps are very similar.
Finally, I noticed that the term “member” also has different meanings in the spec and is not exclusively used for arrays (e.g. in the rules for `fn:innermost`, `fn:format-integer` or for members of union types).
Issue #1338 closed #closed-1338
Arrays and maps: Members, entries, values, contents, pairs, …
Issue #1871 created #created-1871
Arrays and maps: consistency
Suggestions (based on #1338, related: #1868)
- In symmetry with the `pairs` lookup specifier, we should add `array:pairs` and an inverse `array:of-pairs` function.
- In symmetry with the `values` lookup specifier, we should add `array:values` and `map:values` functions, to retrieve the values of maps and the members of arrays as a sequence of arrays.
- In return, `array:members` and `array:of-members` seem redundant, and we should drop them.
- In analogy with the `keys` specifier and `map:keys`, we should add `array:keys` (which returns a dense integer range).
Background
With version 4.0, we are adding a lot of promising and powerful new map and array features. This is a big step forward, compared to the obvious limitations of 3.1.
Some aspects of the 3.1 design have made it difficult (or impossible) to fully adjust array and maps, but (in my opinion) the old overall concept was impressively consistent – and it is definitely a big challenge to achieve a 4.0 design that is not too fragmented.
To me, this becomes particularly evident in the case of arrays. The following example sums up the items of all members of an array. For the cumbersome 3.1 solution…
for $pos in 1 to array:size($array)
return sum($array($pos))
…we now have several (roughly?) equivalent options to do this:
for member $m in $array return sum($m)
array:members($array) ! sum(?value)
$array?pairs::* ! sum(?value)
$array?values::* ! sum(.)
The examples above imply that:
- for 1., an array member is a sequence;
- for 2., an array member is a map;
- for 3., an array has pairs (but there is no `array:pairs`);
- for 4., an array has values (but there is no `array:values`).
Issue #1870 created #created-1870
Rename $zero keyword of fold-left and fold-right
I find the name $zero for this parameter unhelpful and confusing.
I suggest `$accum`, short for "accumulator" or "accumulated result".
Issue #1869 created #created-1869
`fn:duplicate-values`: Order of results
With https://github.com/qt4cg/qtspecs/pull/987, a rule was added to `fn:duplicate-values`:
For any set of values that compare equal, the one that is returned is the one that appears first in $values.
I think we should adapt the behavior to return the duplicates in the order they appear, not the original values:
- A common use case for this function is to find the first duplicate in a list.
- If we return the original values in the correct order, we need to parse the full sequence before we can know which will be the first result. A worst-case example:
(0x7FFFFFFFFFFFFFFF, 1, 1 to 0x7FFFFFFFFFFFFFFF)
=> duplicate-values()
=> head()
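The streaming argument can be sketched in Python with a generator. This models the proposed behaviour only (each duplicate is reported at the point where its second occurrence is seen); `duplicate_values_streaming` is a hypothetical name and Python equality stands in for the real comparison rules.

```python
from itertools import chain

def duplicate_values_streaming(values):
    """Yield each duplicated value once, at the point its second occurrence is seen."""
    seen = set()
    reported = set()
    for value in values:
        if value in seen:
            if value not in reported:
                reported.add(value)
                yield value
        else:
            seen.add(value)

MAX = 0x7FFFFFFFFFFFFFFF
# Same shape as the worst-case example: (MAX, 1, 1 to MAX).
worst_case = chain([MAX, 1], range(1, MAX + 1))

# Only three items are consumed before the first duplicate is known.
first_duplicate = next(duplicate_values_streaming(worst_case))
```

With the current rule (results ordered by first appearance), the value `MAX` would have to be returned first, and that is only known after the whole input has been read.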
Issue #1868 created #created-1868
array:members() to include index position
Currently `array:members(["a", "b"])` returns
{'value': "a"},
{'value': "b"}
I suggest that it should instead return
{'key': 1, 'value': "a"},
{'key': 2, 'value': "b"}
The extra information is useful for any operation that wants to take account of positions as well as values. For example, rearranging an array into multiple columns. Using the names "key" and "value" also means that the data is suitable for converting an array to a map by means of `map:of-pairs`.
The function `array:of-members()` should change to accept `record('value', *)` (making the record type extensible) so that the key part is ignored if present.
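A small Python model of the suggestion, with lists standing in for arrays and dicts for records. The function names `members`, `of_members` and `of_pairs` are illustrative stand-ins for `array:members`, `array:of-members` and `map:of-pairs`; none of this is normative.

```python
def members(array):
    """Model of the proposed array:members: one record per member, with a 1-based key."""
    return [{"key": index, "value": value} for index, value in enumerate(array, start=1)]

def of_members(records):
    """Model of array:of-members with an extensible record type: extra fields are ignored."""
    return [record["value"] for record in records]

def of_pairs(records):
    """Model of map:of-pairs: build a map from key/value records."""
    return {record["key"]: record["value"] for record in records}

records = members(["a", "b"])
```

The round trip through `of_members` still works when the `key` field is present, and the same records feed `of_pairs` directly.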
Pull request #1867 created #created-1867
1341 Drop position from fold callbacks
Following up on issue 1341, we decided to drop the position argument from the 4 fold functions.
Most of the changes in this PR are dealing with the collateral damage - changes to "formal equivalents" of other functions that previously relied on fold-left having the position available to the callback function.
Issue #1866 created #created-1866
Ambiguities introduced by #1864
The grammar check done by RExification of XQuery and XPath 4.0 Grammars has detected a bunch of LALR(2) conflicts caused by the recent addition of `TypeSpecifier` to the `KeySpecifier` production.
In fact these are ambiguities between the following being used as a `QName` (via `EQName`, `TypeName`), or as a keyword:
array
attribute
comment
document-node
element
empty-sequence
enum
fn
function
item
map
namespace-node
node
processing-instruction
record
schema-attribute
schema-element
text
E.g. `element` in `$A?~element()` can be parsed as an element test, `element()`, or as a type name `element` followed by a `PositionalArgumentList`. This is similar to what the "reserved-function-names" constraint covers, but that does not apply here because there is no function name involved.
The `SequenceType` in `TypeSpecifier`, enclosed in extra parentheses, does not present a problem, so my proposal is to drop `ItemType` from `TypeSpecifier` and rewrite the production to
TypeSpecifier ::= '~' '(' SequenceType ')'
Issue #1865 created #created-1865
Callback functions, position argument: consistency
- In https://github.com/qt4cg/qtspecs/pull/1735#issuecomment-2715090198, it was decided to remove the position argument from `fn:fold-left` and `fn:fold-right`. → #1867
- As maps are ordered now, we should add the position argument to iterative map functions (e.g., `map:for-each`; basically all functions for which equivalent sequence and array functions exist).
Issue #1227 closed #closed-1227
150 PR resubmission for fn ranks
Issue #1456 closed #closed-1456
Filtering by type in lookup expressions
Issue #1864 closed #closed-1864
1456 Lookup expressions filtered by type
Issue #1740 closed #closed-1740
1725b Further elaboration of duplicates handling in maps
Issue #1735 closed #closed-1735
1341 Drop $position callback from many functions
Issue #1794 closed #closed-1794
Lookup: select all except
Issue #1778 closed #closed-1778
1456 Lookup expressions filtered by type
Pull request #1864 created #created-1864
1456 Lookup expressions filtered by type
Fix #1456
Technically identical to PR #1778, but reworked because it had become impossible to resolve the merge conflicts.
Issue #1863 created #created-1863
add \U \u L \u \E to replace() (case conversion)
Many systems using regular expressions support case conversion in the replacement strings.
For example,
sed -e 's/[aA]*/\L\u&/'
given AAA as input, produces Aaa.
It’s not 100% clear to me it’s worth adding, since an action function can do the same thing with more or less work, but for reference:
- \U turns the replaced text into upper case until \E, \L, or the end of the replacement string;
- \L turns the replaced text to lower case in the same way;
- \u and \l affect the single next character and operate independently of \U, \L, \E.
I wrote up some more precise spec text and can make a pull request; the case in the sed example above is common in text conversion projects but slightly tricky to get right with a function,
fn { upper-case(substring(., 1, 1)) || lower-case(substring(., 2)) }
This is simple, but consider `\2 \L\u\1\3` as a function, where `\1` may be empty.
Overall i don’t have strong feelings either way, except that supporting them may help people migrate from other systems or languages. \E feels uncomfortably procedural. In Perl and libpcre i think, \E also turns off \Q (which disables all metacharacters up until \E).
Like < and > in patterns, \L and friends can be emulated with some care, but that’s true of a lot of regular expression syntax, and one point of the shorthands (as i see it) is to move the feature towards being accessible by people with less of a programming background.
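To show what emulating these escapes with a callback looks like, here is a hedged Python sketch. It assumes one plausible reading of `\L\u`: lower-case the whole remaining replacement text, then upper-case its first character, wherever that character comes from. The pattern `(\w*)-(\w+)-(\w*)` and the function names are invented for illustration only.

```python
import re

def lower_then_upper_first(text: str) -> str:
    """Emulate \\L\\u applied to text: lower-case everything, then upper-case the first character."""
    lowered = text.lower()
    return lowered[:1].upper() + lowered[1:]  # slicing keeps this safe for empty text

def sed_example(text: str) -> str:
    """Rough equivalent of sed -e 's/[aA]*/\\L\\u&/' (first match only, as without the g flag)."""
    return re.sub(r"[aA]*", lambda m: lower_then_upper_first(m.group(0)), text, count=1)

def reorder(text: str) -> str:
    """Rough equivalent of the replacement string '\\2 \\L\\u\\1\\3' for a hypothetical pattern."""
    def action(match: re.Match) -> str:
        g1, g2, g3 = (match.group(i) or "" for i in (1, 2, 3))
        # If group 1 is empty, the upper-cased character comes from group 3.
        return g2 + " " + lower_then_upper_first(g1 + g3)

    return re.sub(r"(\w*)-(\w+)-(\w*)", action, text)
```

The empty-group case is the part that is easy to get wrong when each group is cased separately instead of casing the concatenation.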
Issue #1862 created #created-1862
Records: consider order
I think we should make the order of record entries part of instance checks and coercion rules:
- It will be less confusing for users if records have a well-defined order (similar to objects in OOL), in particular if records are serialized.
- It will be much easier for implementations to access record entries by their internal index if the order is statically known. There will still be opportunities for optimizing lookups in arbitrary maps (index-based access has generally become easier with maps being ordered).
Issue #1861 created #created-1861
xsl:next-match with-all-params
Problem
The `<xsl:next-match>` instruction is useful when writing local templates to customize the behavior of an imported XSLT. Unfortunately, there is a limitation due to the fact that `<xsl:next-match>` does not pass along parameters unless the parameters are defined as tunneling or the parameters are explicitly coded using `<xsl:with-param>`.
The fact that `<xsl:next-match>` does not automatically pass along parameters can be surprising or lead to cumbersome workarounds, and limits how `<xsl:next-match>` can be used when writing local templates to customize the behavior of imported XSLT.
- In situations where parameters are defined in an imported XSLT it might not be feasible to change parameters to tunneling.
- In situations where a variety of parameters might be in scope when a template that uses `<xsl:next-match>` is invoked, currently each parameter needs to be explicitly coded using `<xsl:param>` and `<xsl:with-param>` in `<xsl:next-match>`, even though the parameters might not be relevant to the purpose or logic of the template. This may lead to fragile and less maintainable code and increases the cognitive load for developers, especially when working with complex, multi-layered stylesheets.
This proposal aims to simplify the use of the `<xsl:next-match>` instruction while being backwards compatible.
Proposal
- Add an option to `<xsl:next-match>` to enable passing along all parameters. This option might take the form of a new optional attribute on `<xsl:next-match>` named `with-all-params` (this name is similar to the existing element name `<xsl:with-param>`) that takes a yes/no (or Boolean) value and defaults to no (false).
- An instruction `<xsl:next-match with-all-params="no"/>` would operate the same as `<xsl:next-match/>` currently does.
- An instruction `<xsl:next-match with-all-params="yes"/>` would operate the same as `<xsl:next-match/>` currently does with the difference that all parameters that were in scope when the current template was invoked will remain in scope for the next matching template.
- An instruction `<xsl:next-match with-all-params="yes">` that contains `<xsl:with-param>` should operate the same as described in the preceding paragraph with the difference that parameters defined by `<xsl:with-param>` will also be in scope for the next matching template.
- If a parameter defined by `<xsl:with-param>` within `<xsl:next-match with-all-params="yes">` has the same name as a parameter that was in scope when the current template was invoked, then the effective value of that parameter should be the value defined by `<xsl:with-param>`. This will allow a template to override parameters when necessary.
To summarize, `<xsl:next-match with-all-params="yes">` should invoke the next matching template and automatically pass along all parameters that were in scope when the current template was invoked, and optionally allow using `<xsl:with-param>` to set additional parameters or modify parameter values.
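The precedence rules in the proposal amount to a dictionary merge. The following Python sketch models only that rule; `effective_params` is a hypothetical helper, tunnel parameters are deliberately left out, and nothing here reflects an actual XSLT processor API.

```python
def effective_params(in_scope, with_param, with_all_params=False):
    """Compute the parameters passed on to the next matching template.

    in_scope:        parameters received by the current template
    with_param:      parameters set explicitly via xsl:with-param
    with_all_params: model of the proposed with-all-params="yes" attribute
    """
    if not with_all_params:
        # Current behaviour: only explicitly coded parameters are passed.
        return dict(with_param)
    # Proposed behaviour: forward everything; explicit values win on a name clash.
    return {**in_scope, **with_param}

received = {"a": "a", "b": "b"}
explicit = {"b": "buzz", "c": "c"}
```

With `with_all_params=True` the result matches the `after.xsl` example that follows: `a` is forwarded, `b` is overridden, and `c` is added.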
Example
Given this input document:
<!-- input.xml -->
<section>
<p>hello</p>
</section>
This stylesheet `import.xsl` provides a set of base templates. The template matching element "p" uses `<xsl:next-match/>` in its current (default) operation.
<!-- import.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="yes">
<xsl:output indent="yes"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="section">
<section>
<xsl:apply-templates>
<xsl:with-param name="a" select="'a'"/>
<xsl:with-param name="b" select="'b'"/>
</xsl:apply-templates>
</section>
</xsl:template>
<xsl:template match="p">
<xsl:param name="a"/>
<xsl:param name="b"/>
<xsl:param name="c"/>
<p>a {$a}</p>
<p>b {$b}</p>
<p>c {$c}</p>
<xsl:next-match/>
</xsl:template>
</xsl:stylesheet>
This is the output of the above stylesheet `import.xsl` and the input document:
<section>
<p>a a</p>
<p>b b</p>
<p>c </p>
<p>hello</p>
</section>
This stylesheet `before.xsl` imports the stylesheet `import.xsl` and defines a template to customize how `<p>` elements are processed. The parameter `$a` needs to be intercepted and forwarded even though this template is not doing anything with `$a`. The parameter `$b` is overridden, and the parameter `$c` is added within `<xsl:next-match>`.
<!-- before.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:import href="import.xsl"/>
<xsl:template match="p">
<xsl:param name="a"/>
<p>customization</p>
<xsl:next-match>
<xsl:with-param name="a" select="$a"/>
<xsl:with-param name="b" select="'buzz'"/>
<xsl:with-param name="c" select="'c'"/>
</xsl:next-match>
</xsl:template>
</xsl:stylesheet>
This stylesheet `after.xsl` does the same thing as the previous stylesheet but uses `with-all-params="yes"`. The template does not need to intercept and forward the parameter `$a` because this is handled automatically by `with-all-params="yes"`. The parameter `$b` is overridden, and the parameter `$c` is added within `<xsl:next-match>` in the same way as the previous stylesheet. Although this is a small example in which the parameter `$a` is the only savings, the benefit of `with-all-params="yes"` can be significant in scenarios where there are more parameters.
<!-- after.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="4.0">
<xsl:import href="import.xsl"/>
<xsl:template match="p">
<p>customization</p>
<xsl:next-match with-all-params="yes">
<xsl:with-param name="b" select="'buzz'"/>
<xsl:with-param name="c" select="'c'"/>
</xsl:next-match>
</xsl:template>
</xsl:stylesheet>
The two stylesheets above (`before.xsl` and `after.xsl`) should produce the same output.
<!-- output.xml -->
<section>
<p>customization</p>
<p>a a</p>
<p>b buzz</p>
<p>c c</p>
<p>hello</p>
</section>
Issue #1801 closed #closed-1801
1798 Function fn:function-identity
Issue #1860 created #created-1860
fn:parse-xml: DTDs, external resources
The text doesn’t say much about what DTD validation means. Is my assumption correct that it boils down to a `SAXParserFactory.setValidating` call in Java?
What about DTDs in general? Given the following snippets (using the default `false` for DTD validation)…
<!-- xml.dtd -->
<!ENTITY arrow "→">
parse-xml(`
<!DOCTYPE xml SYSTEM 'xml.dtd'>
<xml>&arrow;</xml>`
)
…should the result be `<xml/>`, `<xml>→</xml>`, or an error? In other words, should the (potentially external) `xml.dtd` resource be resolved and interpreted?
Maybe we should introduce an additional `DTD` option (or options?) to control the loading of external DTDs and the handling of entities, for example:
http://apache.org/xml/features/nonvalidating/load-external-dtd
http://xml.org/sax/features/external-general-entities
http://xml.org/sax/features/external-parameter-entities
Thoughts are welcome.
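As a point of comparison, two of the feature URIs listed above are standard SAX features and are exposed by Python's `xml.sax` module as named constants. The sketch below only shows how a host parser can be configured with them; `make_restricted_parser` is a hypothetical helper, and how such switches would map onto `fn:parse-xml` options is an open question here, not something this snippet answers.

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes

def make_restricted_parser():
    """Create a SAX parser that does not load external general or parameter entities."""
    parser = xml.sax.make_parser()
    # http://xml.org/sax/features/external-general-entities
    parser.setFeature(feature_external_ges, False)
    # http://xml.org/sax/features/external-parameter-entities
    parser.setFeature(feature_external_pes, False)
    return parser

parser = make_restricted_parser()
```

The Apache-specific `load-external-dtd` feature has no counterpart in this module, which illustrates why a portable option set would need to be defined independently of any one parser.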
Issue #1859 created #created-1859
Question on `fn:chain` and `err:FOAP0001`
Per #1280, `fn:apply` has been changed to allow the number of arguments to be greater than the arity of the function.
`fn:chain` is defined in terms of `fn:apply`, and it also refers to the error code `err:FOAP0001` which belongs to `fn:apply`.
However the condition for the error differs between the two:
- `fn:chain`: An error [err:FOAP0001] is raised if the arity of any function $f in $functions is different from the number of members in the array that is passed to fn:apply.
- `fn:apply`: A dynamic error is raised if the arity of the function $function is greater than the size of the array $arguments ([err:FOAP0001]).
Also in B Error Codes, `err:FOAP0001` still asks for the arity to exactly match the number of arguments:
err:FOAP0001, Wrong number of arguments.
Raised when fn:apply is called and the arity of the supplied function is not the same as the number of members in the supplied array.
Should not the description of `fn:chain` and the error summary be adapted to the changed behaviour of `fn:apply`?
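The two conditions can be contrasted with a short Python model. It follows the `fn:apply` wording quoted above (an error only when the arity exceeds the number of supplied arguments); dropping the surplus arguments is an assumption made for this sketch, and `apply` and `WrongNumberOfArguments` are hypothetical names standing in for `fn:apply` and `err:FOAP0001`.

```python
import inspect

class WrongNumberOfArguments(Exception):
    """Stand-in for err:FOAP0001."""

def apply(function, arguments):
    """Call function with the leading members of arguments.

    Raises only if the function needs more arguments than were supplied;
    surplus arguments are ignored (assumed reading of the relaxed rule).
    """
    arity = len(inspect.signature(function).parameters)
    if arity > len(arguments):
        raise WrongNumberOfArguments(
            f"function has arity {arity} but only {len(arguments)} argument(s) were supplied"
        )
    return function(*arguments[:arity])

def add(a, b):
    return a + b
```

Under the older "must match exactly" wording still used for `fn:chain` and the error summary, the call `apply(add, [1, 2, 3])` would fail instead of returning a result.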
Issue #1853 closed #closed-1853
1845 Revised design of methods to use . rather than $this
Issue #1845 closed #closed-1845
Should we add additional syntactic sugar for use with %method functions?
Issue #1820 closed #closed-1820
Delta markers in collapsed TOC
Issue #1838 closed #closed-1838
1820 Attempt to add change markup in collapsed ToC
Issue #1796 closed #closed-1796
Allow fn:invisible-xml to return a function that returns an item()
Issue #1839 closed #closed-1839
Relax the return type of the Invisible XML parsing function
Issue #1849 closed #closed-1849
Reduce the indentation in the ToC
Issue #1850 closed #closed-1850
Actions from meeting 111
Issue #1771 closed #closed-1771
fn:deep-equal: map order
Issue #1855 closed #closed-1855
1771 Add option for deep-equal to consider map order
Issue #1847 closed #closed-1847
%method functions: explicit self reference?
Pull request #1858 created #created-1858
Initial xsl:record
An initial draft of the xsl:record instruction, for discussion
Issue #1656 closed #closed-1656
Ordered Maps: Updates
Issue #1829 closed #closed-1829
Problems with new arrow expression syntax
Issue #1854 closed #closed-1854
Can someone direct me to the motivating use case of objects?
Issue #1857 created #created-1857
fn:parse-xml: `xinclude`
We should allow XInclude processing to be enabled/disabled, as it can potentially lead to memory leaks.
Issue #1835 closed #closed-1835
add zero-width assertions to regular expressions
Pull request #1856 created #created-1856
998 Add boundary and lookahead/behind assertions
Incorporates and supersedes #1835
Issue #1836 closed #closed-1836
unparsed-text-lines() - line endings
Pull request #1855 created #created-1855
1771 Add option for deep-equal to consider map order
Adds an option for deep-equal to treat order of entries in a map as significant.
Fix #1771
Issue #1854 created #created-1854
Can someone direct me to the motivating use case of objects?
There's a LOT of conversations about "this" and methods and really quite complex syntax, but (despite writing OO software for about 30+ years), I can't think of a motivating use case in the context of XSLT/XQuery.
- in imperative languages with mutable state...yes
- in very large code bases requiring some abstraction/encapsulation - well maybe...there are probably easier ways to do this without 'objects'.
this is the sort of canonical example people use (to illustrate simple technical points)
$rect := {'x': 10, 'y': 7, 'area': fn(){?x * ?y}}
but actually I would write
$rect := {'x': 10, 'y': 7, 'area': 10 * 7}
(and similarly write a data constructor for the record in that manner)
i.e.
fn($x,$y){'x': $x, 'y': $y, 'area': $x * $y}
I'm not sure at the moment it's worth the effort.
Pull request #1853 created #created-1853
1845 Revised design of methods to use . rather than $this
Proposal is that in methods, the containing map should be bound to the context item rather than to the special variable $this, so fields of that map are referenced as `?x` rather than `$this?x`.
Issue #1852 created #created-1852
fn:values-except: Return atomic values that occur in A but not in B
`fn:distinct-values` can be used to perform a `union` on atomic values:
(: returns 1 to 5 :)
let $one := 1 to 4, $two := 2 to 5
return distinct-values(($one, $two))
`fn:duplicate-values` can be used for `intersect`:
(: returns 2 to 4 :)
let $one := 1 to 4, $two := 2 to 5
return duplicate-values(($one, $two))
A (roughly) equivalent alternative is `$one[. = $two]`.
I think we should add an equivalent for `except` (it requires 2 arguments instead of 1):
fn:values-except(
$values as xs:anyAtomicType*,
$exclude as xs:anyAtomicType*,
$collation as xs:string? := fn:default-collation()
) as xs:anyAtomicType*
An example:
(: returns 1 :)
let $one := 1 to 4, $two := 2 to 5
return values-except($one, $two)
In principle, this function can also be written as `$one[not(. = $two)]`, but a dedicated function will be easier to understand for users and easier to optimize for processors.
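The three operations can be modelled in a few lines of Python. This is a sketch of the idea only: Python equality replaces collation-aware comparison, the function names are illustrative, and `values_except` follows the `$one[not(. = $two)]` formulation, so it keeps input order and does not remove repeats.

```python
def values_union(one, two):
    """Model of distinct-values(($one, $two)): first occurrences, in order."""
    return list(dict.fromkeys([*one, *two]))

def values_intersect(one, two):
    """Model of $one[. = $two]: items of one that also occur in two."""
    lookup = set(two)
    return [value for value in one if value in lookup]

def values_except(values, exclude):
    """Model of the proposed fn:values-except: items of values not occurring in exclude."""
    excluded = set(exclude)  # one hash lookup per item instead of a nested scan
    return [value for value in values if value not in excluded]

one, two = range(1, 5), range(2, 6)
```

The set-based lookup hints at the optimization argument: a dedicated function can build an index of `$exclude` once instead of comparing every pair of items.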
Issue #1247 closed #closed-1247
`??type(T)` in lookup expressions - shortcuts
Issue #1851 created #created-1851
Questions on `fn:atomic-type-annotation`
These questions came up while working on `fn:atomic-type-annotation`:
- what is the `variety` of `xs:anySimpleType`?
- should not `constructor` be absent in an `fn:schema-type-record` describing `xs:QName`?
Here are the detailed observations:
fn:schema-type-record?variety
Consider the following query:
(
<x>42</x>
=> fn:atomic-type-annotation()
)
?base-type()
?base-type()
?variety
My interpretation is as follows:
- the `x` element node is atomized to a value of type `xs:untypedAtomic`,
- so `fn:atomic-type-annotation` returns the information for `xs:untypedAtomic`,
- the base type of that is `xs:anyAtomicType`,
- the base type of that is `xs:anySimpleType`,
- per XML schema 1.1, 3.16.7.1 xs:anySimpleType, the `{variety}` of `xs:anySimpleType` is absent,
- so `variety` should be absent in an `fn:schema-type-record` describing `xs:anySimpleType`,
- the result thus should be an empty sequence.
According to the current spec, `variety` must always be present with a value of type `enum("atomic", "list", "union", "empty", "simple", "element-only", "mixed")`, but also correspond to the `{variety}` of the simple type in the XSD component model.
Should not `variety` be optional, and omitted for `xs:anySimpleType`?
fn:schema-type-record?constructor
The spec says this about `constructor`:
The field is absent for complex types and for the abstract types xs:anyAtomicType, xs:anySimpleType, and xs:NOTATION. It is also absent for all namespace-sensitive types, that is, types derived from xs:QName or xs:NOTATION.
The formulation does not include `xs:QName`, but should not its constructor be absent for the same reasons as for the types derived from it?
Pull request #1850 created #created-1850
Actions from meeting 111
[ ] QT4CG-111-01: MK to review the editorial comments on PR #1837 and then merge the PR.
Done (along with a couple of other minor corrections noted in passing)
[ ] QT4CG-111-02: MK to fix the typo $in as xs:double+ and 1.3. 1.4 that middle “.” should be a “,”
Already done before the PR was merged
[ ] QT4CG-111-03: MK to add a %method example that uses the arrow syntax.
Done (though the example isn't especially convincing).
Also added another couple of examples and notes in passing.
Pull request #1849 created #created-1849
Reduce the indentation in the ToC
Issue #1848 created #created-1848
Define regular expressions using XSD 1.1 as baseline
Issue #1800 closed #closed-1800
The `=?>` lookup arrow expression operator is weird, difficult to use, difficult to understand, difficult to read and unnatural
Issue #1817 closed #closed-1817
1800 Methods
Issue #1843 closed #closed-1843
XQFO: TOC texts
Issue #1847 created #created-1847
%method functions: explicit self reference?
This is a discussion issue; I am torn and would be interested in feedback:
With the just added `%method` annotation, basically two things happen:
- An implicit `$this` parameter is prepended to the remaining parameters of a function.
- The current map will be bound to the first parameter by the lookup operator.
The inner workings of the example in the spec were not entirely obvious in today’s meeting…
let $area := %method fn() { $this?x * $this?y }
return $area({ 'x': 3, 'y': 4 })
…and I am wondering if we are not more flexible by making the self-referencing parameter explicit. This way, it would be up to the user to decide how the parameter is called…
let $number := { 'value': 3, 'inc': %method fn($self) { $self?value + 1 } }
return $number?inc()
…the focus function syntax could be used alternatively…
let $number := { 'value': 3, 'inc': %method fn { ?value + 1 } }
return $number?inc()
…and it would allow for a stricter typing (`$this as map(*)` is not very specific), and thus for better error reporting:
declare record coord(
x as xs:double,
y as xs:double,
product := %method fn($coord as coord) { $coord?x * $coord?y }
);
coord(3, 4)?product()
Obviously, it would cause new issues:
- `%method fn() {}` would need to be made illegal
- The type of the first argument would need to be `map(*)` or a subtype of it.
- Users may be led to write…
let $map := { 'fn': %method fn($a, $b) { $a * $b } }
return $map?fn(2, 3)
On the other hand, the existence of the `%method` annotation should indicate that this function type differs from others.
If we stick with the invisible `$this` parameter, I wonder what `function-arity(%method fn() {})` is supposed to return?
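The explicit-self variant discussed above can be modelled in Python, where the lookup step binds the containing map to the first parameter. Everything here is a hypothetical model: `Method` stands in for the `%method` annotation and `lookup` for the `?` operator; it does not describe how any processor implements the feature.

```python
from functools import partial

class Method:
    """Marker wrapper standing in for the %method annotation."""
    def __init__(self, function):
        self.function = function

def lookup(mapping, key):
    """Model of $map?key: bind the containing map as the first argument of a method."""
    value = mapping[key]
    if isinstance(value, Method):
        return partial(value.function, mapping)
    return value

number = {"value": 3, "inc": Method(lambda self: self["value"] + 1)}
result = lookup(number, "inc")()  # the caller supplies no arguments

# The pitfall from the list above: a two-parameter method called with two arguments
# receives three values once the map is bound, so the call fails.
product_map = {"fn": Method(lambda a, b: a * b)}
```

The declared function has one parameter while the call site passes none, which is the same mismatch that makes the `function-arity` question awkward for the implicit `$this` design.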
Issue #1846 created #created-1846
%method functions, dynamic function calls
With #1817, the `%method` annotation was introduced for functions. It is interpreted by the lookup operator:
let $number := { 'value': -3, 'abs': %method fn() { abs($this?value) } }
return $number?abs()
I think we should extend this mechanism to dynamic function calls, as many people use the constructs interchangeably:
return $number('abs')()
I agree that the binding mechanism should not apply for `map:get` or any other map functions and iterations.
Issue #1830 closed #closed-1830
1829 Reintroduce restrictions on RHS of `=>`
Issue #1815 closed #closed-1815
Function annotations on function items
Issue #1828 closed #closed-1828
1815 Add more detail on annotations of function items
Issue #1834 closed #closed-1834
json-lines - refinement
Issue #1837 closed #closed-1837
1834 Additional clarification on JSON lines
Issue #583 closed #closed-583
(array|map):replace → *:substitute or *:change
Issue #1833 closed #closed-1833
583 Drop map:replace and array:replace
Issue #1816 closed #closed-1816
Programmatic partial application
Issue #1825 closed #closed-1825
1816 New function fn:partial-apply
Issue #1818 closed #closed-1818
Grammar problem introduced by #1802
Issue #1826 closed #closed-1826
Fix grammar bug #1818
Issue #1823 closed #closed-1823
Clearer top-level section headings in F+O
Issue #1824 closed #closed-1824
1823 Revise top-level headings in F+O spec
Issue #1845 created #created-1845
Should we add additional syntactic sugar for use with %method functions?
During meeting 111, DN was arguing for additional syntactic sugar when his connection to the call ended abruptly. This issue is to make sure we come back to those discussions.
Specifically, should we allow `^x` as an abbreviation for `$this?x`?
Issue #1813 closed #closed-1813
Reorganise top-level sections in XDM
Issue #1814 closed #closed-1814
1813 Reorganise the XDM spec at top level
Issue #1811 closed #closed-1811
Add note concerning non-XML characters in character maps
Issue #1812 closed #closed-1812
1811 Add note regarding non-XML chars in xsl:output-character
Issue #1844 created #created-1844
Drop mapping arrow operator
To reduce the number of new operators, I suggest removing the mapping arrow operator `=!>`, in favor of the recently added `->` operator (which now allows us to arbitrarily create chains for single items and sequences).
Related: https://github.com/qt4cg/qtspecs/issues/1685
Issue #1843 created #created-1843
XQFO: TOC texts
The XQFO TOC is overly verbose, and inconsistent nevertheless. With the addition of arrows and symbols, many headers stretch across several lines.
If no one objects, I will remove all the redundant "Functions ..." strings:
Current:
- Introduction
- Functions on nodes and node sequences
- Errors and diagnostics
- Functions and operators on numerics
- Functions on strings
- Functions that manipulate URIs
- Functions and operators on Boolean values
- Functions and operators on durations
- Functions and operators on dates and times
- Functions related to QNames ...
Proposed:
- Introduction
- Nodes
- Errors and diagnostics
- Numerics
- Strings
- URIs
- Boolean values
- Durations
- Dates and times
- QNames ...
Issue #1842 closed #closed-1842
This is a test of the emergency broadcast system. This is only a test.
Issue #1842 created #created-1842
This is a test of the emergency broadcast system. This is only a test.
Had this been a real emergency, we would have fled in terror and you would not have been informed.
Issue #1840 closed #closed-1840
GH action remove-label-on-reopen.yml
Issue #1841 closed #closed-1841
Action to remove label on reopen
Pull request #1841 created #created-1841
Action to remove label on reopen
Close #1840
Pull request #1840 created #created-1840
GH action remove-label-on-reopen.yml
In response to the mailing list post by @ndw: https://lists.w3.org/Archives/Public/public-xslt-40/2025Feb/0024.html
This is untested, but it might at least serve as an inspiration how to avoid the unwanted tag in an automated manner.
Pull request #1839 created #created-1839
Relax the return type of the Invisible XML parsing function
Fix #1796
This change does not appear to change any test results. (In other words, none of our tests checked that the return type was explicitly a document node.)
Pull request #1838 created #created-1838
1820 Attempt to add change markup in collapsed ToC
Fix #1820
This PR updates the styling so that a small "Δ" is added to the expand arrow when there are changes or additions in the concealed subsections. It's smaller and not blue. I could argue that this is on purpose so that the marking is different and perhaps more subtle. But the truth is, it was just easier to add the Δ without any markup that would make it larger or blue.
I've opted to conceal the Δ when the ToC is "open" on the grounds that you can see what is or isn't marked new on the revealed subsections.
Issue #1827 closed #closed-1827
XPath TOC: For and Let Expressions: whitespace
Issue #1831 closed #closed-1831
1827 Fix excess whitespace in TOC
Pull request #1837 created #created-1837
1834 Additional clarification on JSON lines
Fix #1834
Issue #1836 created #created-1836
unparsed-text-lines() - line endings
The description of the unparsed-text-lines function contradicts itself regarding line endings.
First it says that the function is equivalent to calling `unparsed-text()` and applying `tokenize(., '\n')` to the result.
Then it says that it accepts x0A, x0D, or x0D0A as line endings.
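The difference between the two descriptions shows up on any input that mixes line endings. The Python below is only an analogy, not the specified behaviour: the helper names are invented for illustration, and str.split and re.split stand in for tokenize.

```python
import re

def lines_by_newline_only(text: str) -> list[str]:
    # Analogue of tokenize(., '\n'): only x0A counts as a separator.
    return text.split("\n")

def lines_by_any_ending(text: str) -> list[str]:
    # Analogue of accepting x0D0A, x0A, or x0D. The two-character form
    # comes first so that it is not treated as two separate line endings.
    return re.split(r"\r\n|\n|\r", text)

sample = "a\r\nb\rc\nd"
print(lines_by_newline_only(sample))  # ['a\r', 'b\rc', 'd']
print(lines_by_any_ending(sample))    # ['a', 'b', 'c', 'd']
```

With the first reading, carriage returns survive inside the returned lines; with the second they do not, so the two statements cannot both hold.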
Pull request #1835 created #created-1835
add zero-width assertions to regular expressions
Proposal for issues !998 and !1006 to add zero-width assertions - lookahead, lookbehind, and word boundary.
Word boundaries use the already-defined \w and \W from XML Schema.
The syntax for lookahead and lookbehind assertions supports the two most common variants, one using < and > and the other using (*positive_lookahead:expr), which is at least amenable to Web searches, and doesn’t need escaping in XSLT or XQuery.
Note that word boundary < \b \B > assertions can be rewritten in terms of lookahead and lookbehind assertions.
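As a rough illustration of that rewriting, the sketch below uses Python's re module, whose \w is not the XML Schema definition, so it only approximates the proposal. The BOUNDARY pattern and the helper name are assumptions made for this example.

```python
import re

# \b written out with lookarounds: a word character on exactly one side.
BOUNDARY = r"(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))"

def boundary_positions(pattern: str, text: str) -> list[int]:
    # Offsets at which the zero-width pattern matches.
    return [m.start() for m in re.finditer(pattern, text)]

text = "lookahead, and look-behind"
# Both forms report the same set of boundary offsets.
assert boundary_positions(r"\b", text) == boundary_positions(BOUNDARY, text)
```

The \B case is the negation: a word character on both sides or on neither.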
Perl has a more powerful form of \b and \B that can match grapheme clusters, the Unicode linebreaking algorithm, and more, but supporting that would require language and script based mechanisms; if the graphemes() function is added, it would be worth considering. For now, I made it an error to write \b{...} so that the support could be added later if wanted, and also so that copying regular expressions into XPath would raise an error for the unsupported feature.
I will reopen !998 - if this is accepted I can produce test cases. Of course, I’m also happy to edit/rewrite etc. The syntax is widely supported, although \K is, I think, not in libpcre (but libpcre has looser restrictions on negative backward assertions).
Issue #1834 created #created-1834
json-lines - refinement
Some suggestions regarding support for json-lines:
(a) The json-lines spec has no official standing. It might therefore be a good idea if we summarize its essentials, just in case it disappears off the web.
(b) The spec makes the final newline optional. Our test cases assume no final newline. We should probably mandate this for interoperability.
(c) We should tell people how to read files in json-lines format - specifically unparsed-text-lines() ! parse-json()
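Point (c) amounts to "split into lines, then parse each line". A minimal Python sketch of that pipeline follows; it is an analogy for unparsed-text-lines() ! parse-json(), the function name is hypothetical, and tolerating one optional final newline is an assumption taken from point (b).

```python
import json
import re

def parse_json_lines(text: str) -> list:
    # Split on the usual line endings; JSON strings cannot contain raw
    # newlines, so every line holds exactly one JSON value.
    lines = re.split(r"\r\n|\n|\r", text)
    # An optional final newline leaves one empty trailing string; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return [json.loads(line) for line in lines]

doc = '{"id": 1}\n{"id": 2}\n'
print(parse_json_lines(doc))  # [{'id': 1}, {'id': 2}]
```

The same call without the trailing newline returns the same result, which is the interoperability question raised in (b).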
Pull request #1833 created #created-1833
583 Drop map:replace and array:replace
Fix #583
Issue #1832 created #created-1832
Associativity of Operators, especially "||" (Appendix A.5)
The associativity of the || operator is given as "left-to-right" - it should surely be "either" (like comma, "or", and "union").
Other aspects of this table are questionable.
- The operator ?[] for filtering a map or array should probably be included.
- Arguably => and ? should be omitted because the RHS is not actually an expression, though it's true that if A => B => C is allowed, then it means (A => B) => C.
- + and * are associative, it's only in conjunction with other operators that they aren't.
Pull request #1831 created #created-1831
1827 Fix excess whitespace in TOC
Fix #1827
Pull request #1830 created #created-1830
1829 Reintroduce restrictions on RHS of `=>`
Partial reversion of PR #1763
Issue #1829 created #created-1829
Problems with new arrow expression syntax
I'm hitting problems with implementing the changes in PR #1763
The problem is that the => can now be followed by either a static function call or a dynamic function call, and I think we need unbounded lookahead to distinguish them.
Consider
3 => function-lookup(xs:QName('fn:abs'), 1)()
at first sight the arrow appears to be followed by a static function call, function-lookup(xs:QName('fn:abs'), 1). But treating it as such causes a parsing error when we get to the () - what we actually have here is a dynamic function call that starts with a static function call.
I propose that we revert to allowing a dynamic function call only in the form
a => x ( argument-list )
where x is a variable reference, a parenthesized expression, an inline function expression, or a map or array constructor.
Pull request #1828 created #created-1828
1815 Add more detail on annotations of function items
Fix #1815
Issue #1827 created #created-1827
XPath TOC: For and Let Expressions: whitespace
The table of contents for XPath, section 4.12, "For and Let Expressions", contains spurious whitespace. The whitespace appears to be present in the HTML, but it is not there in the source XML. In the actual section heading, there are two <a> elements before and after the heading text, each having as content a single space character.
The problem is also there in the equivalent section heading "FLWOR Expressions" in the XQuery spec.
In the "xpath-assembled" document, the heading appears as
<head>
<phrase role="xpath">For and Let Expressions</phrase>
</head>
It seems to be the phrase element that's causing the trouble: or more likely, the whitespace text nodes that surround it.
Pull request #1826 created #created-1826
Fix grammar bug #1818
Fix #1818
Pull request #1825 created #created-1825
1816 New function fn:partial-apply
Fix #1816
Pull request #1824 created #created-1824
1823 Revise top-level headings in F+O spec
Revises the headings for consistency and brevity, to make the ToC easier to navigate at a glance
Fix #1823
Issue #1823 created #created-1823
Clearer top-level section headings in F+O
The improved rendition of the table of contents makes it apparent that the top-level sections headings in F+O are inconsistent and unnecessarily verbose.
Issue #1821 closed #closed-1821
Generated appendices in XDM
Issue #1822 closed #closed-1822
1821 Fix the generated appendixes in the Data Model
Pull request #1822 created #created-1822
1821 Fix the generated appendixes in the Data Model
Fix #1821
Issue #1821 created #created-1821
Generated appendices in XDM
The last four appendices in XDM are stylesheet-generated, and their TOC entries are added "by hand", and as a result they are incorrectly rendered.
I suggest using the same process for these appendices as other specs use: they should have a skeletal presence in the XML master, with a processing instruction to direct the stylesheet to expand the content; there is then no need for special machinery in the stylesheet to generate the TOC.
Issue #1808 closed #closed-1808
Add pipeline operator to list of tokens using '<' and '>' characters
Issue #1820 created #created-1820
Delta markers in collapsed TOC
When the TOC is shown in collapsed mode, it would be nice to promote the Δ change markers to the level where they become visible.
Pull request #1819 created #created-1819
451 Multiple schemas in XSLT
Fix #451
Issue #1818 created #created-1818
Grammar problem introduced by #1802
The recent merge of #1802 has incorrectly changed production ArrowExpr from
ArrowExpr ::= UnaryExpr ( SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget )
to
ArrowExpr ::= UnaryExpr ( SequenceArrowTarget MappingArrowTarget LookupArrowTarget )*
It has changed a g:choice operator to a g:zeroOrMore operator, while it should have added a g:zeroOrMore around the g:choice.
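The practical effect of the two productions can be shown with a toy model. In the Python sketch below each token is reduced to one letter (U for UnaryExpr, S, M and L for the three arrow targets) and regular expressions stand in for the grammar; this encoding is an assumption for illustration, not part of the specification.

```python
import re

# One letter per token: U = UnaryExpr, S/M/L = the three arrow targets.
INTENDED = re.compile(r"U(?:S|M|L)*")  # zeroOrMore wrapped around the choice
BROKEN = re.compile(r"U(?:SML)*")      # zeroOrMore in place of the choice

def accepts(grammar: re.Pattern, tokens: str) -> bool:
    # True when the whole token string is derivable from the production.
    return grammar.fullmatch(tokens) is not None

print(accepts(INTENDED, "USS"))  # True: two sequence arrows in a row
print(accepts(BROKEN, "USS"))    # False: only S M L triples are allowed
```

Under the broken production a single arrow target is rejected, and only the fixed sequence of all three targets, repeated, is accepted.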
Issue #1716 closed #closed-1716
Variable lookahead needed for `ArrowTarget`
Issue #1763 closed #closed-1763
1716 Generalize syntax of arrow expressions
Issue #1789 closed #closed-1789
Terminology: "singleton map"
Issue #1791 closed #closed-1791
1789 Fix singleton terminology
QT4 CG meeting 110 draft minutes #minutes—02-18
Draft minutes published.
Issue #1769 closed #closed-1769
Add links from processing model diagrams
Issue #1788 closed #closed-1788
Drop reference to maps being unordered
Issue #1790 closed #closed-1790
1788 Replace statement that maps are unordered
Issue #1785 closed #closed-1785
XQuery 4.0 grammar: `ArrowExpr` target, `ReverseAxis`
Issue #1802 closed #closed-1802
1785 Fix two simple grammar bugs
Issue #1803 closed #closed-1803
Drop "(Non-Normative)" from table of contents
Issue #1804 closed #closed-1804
Drop "(Non-Normative)" from ToC
Issue #1805 closed #closed-1805
Drop middle dots from term references in F&O
Issue #1806 closed #closed-1806
1805 Drop middle dots from termref rendition in F+O
Issue #1807 closed #closed-1807
Two exceptions or three?
Issue #1809 closed #closed-1809
1807 Two exceptions to the rule, not three
Issue #1631 closed #closed-1631
xsl:apply-templates (without select) should allow inline content
Issue #1810 closed #closed-1810
1808 Add -> to list of tokens using lt and gt characters
Pull request #1817 created #created-1817
1800 Methods
Fix #1800
Issue #1816 created #created-1816
Programmatic partial application
We don't have a programmatic way of doing partial function application; in particular, it's very difficult to do a partial application supplying the first argument of a function item without knowing statically how many other arguments there are.
I suggest extending fn:apply so that the second argument can be a map with integer keys; it can supply some or all of the arguments to a function, and arguments that aren't supplied are retained in the returned partially-applied function item. For the example cited where only the first argument is to be supplied, supplying an array of length one would be equivalent.
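The idea of supplying arguments through a map with integer keys can be sketched in Python. Everything here is illustrative: partial_apply is a hypothetical helper, positions are 1-based to match XPath conventions, and the arity is read with inspect rather than known statically.

```python
import inspect

def partial_apply(func, supplied: dict):
    """Bind the 1-based argument positions in `supplied`; leave the rest open."""
    arity = len(inspect.signature(func).parameters)
    open_positions = [p for p in range(1, arity + 1) if p not in supplied]

    def remainder(*args):
        if len(args) != len(open_positions):
            raise TypeError(
                f"expected {len(open_positions)} arguments, got {len(args)}"
            )
        # Merge the bound arguments with the ones supplied now, by position.
        merged = dict(supplied)
        merged.update(zip(open_positions, args))
        return func(*(merged[p] for p in range(1, arity + 1)))

    return remainder

def volume(width, height, depth):
    return width * height * depth

with_width = partial_apply(volume, {1: 2})
print(with_width(3, 4))  # 24
```

The caller binds the first argument without stating how many others exist, which is the case the issue describes as difficult today.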
Issue #1815 created #created-1815
Function annotations on function items
We say very little about function annotations on function items.
For example,
- we don't say that the function item constructed by a named function reference inherits the function annotations of the function declaration
- we don't say whether a function item constructed by partial application (whether static or dynamic) has any function annotations, and if so what they are.
- we don't mention them in function-lookup().
Pull request #1814 created #created-1814
1813 Reorganise the XDM spec at top level
Fix #1813
The diff version is probably not too useful because a lot of material has moved around. Very little text has actually changed, and none of it substantively.
QT4 CG meeting 110 draft agenda #agenda-02-18
Draft agenda published.
Issue #1813 created #created-1813
Reorganise top-level sections in XDM
The improved presentation of the TOC for our specs makes it rather obvious that the structure of the XDM spec has become unbalanced. Most of the spec is about node trees; information about atomic values, maps, functions etc is hard to find, and sometimes appears in strange places such as "Terminology".
In addition, the way the spec is assembled from multiple entities serves little purpose. It makes it harder for editors to find the text that needs to be edited and to locate markup errors introduced in the course of editing.
Pull request #1812 created #created-1812
1811 Add note regarding non-XML chars in xsl:output-character
Fix #1811
Issue #1811 created #created-1811
Add note concerning non-XML characters in character maps
We have relaxed the rules for using non-XML characters in strings. It would be useful to explain how to take advantage of this in character maps.
Pull request #1810 created #created-1810
1808 Add -> to list of tokens using lt and gt characters
Pull request #1809 created #created-1809
1807 Two exceptions to the rule, not three
Fix #1807
Issue #1808 created #created-1808
Add pipeline operator to list of tokens using '<' and '>' characters
Add the operator -> to the list of tokens in XPath A3.3.
Issue #1807 created #created-1807
Two exceptions or three?
XPath 4.5.2.7 on function identity says "There are two exceptions to this rule:" and then lists three.
Pull request #1806 created #created-1806
1805 Drop middle dots from termref rendition in F+O
Brings F+O into line with the other specs.
Fix #1805
Issue #1805 created #created-1805
Drop middle dots from term references in F&O
The F&O spec renders termref links between middle dots. None of the other specs use this convention.
Pull request #1804 created #created-1804
Drop "(Non-Normative)" from ToC
Fix #1803
Issue #1803 created #created-1803
Drop "(Non-Normative)" from table of contents
Proposal - drop the phrase "(Non-Normative)" from section titles in the table of contents (but not in the body of the document).
In several of the specs this phrase disrupts the indentation and formatting of the ToC, and it adds very little value.
Pull request #1802 created #created-1802
1785 Fix two simple grammar bugs
Fix #1785
Pull request #1801 created #created-1801
1798 Function fn:function-identity
The function fn:function-identity as already described and discussed in #1798
Issue #1800 created #created-1800
The `=?>` lookup arrow expression operator is weird, difficult to use, difficult to understand, difficult to read and unnatural
The XPath 4.0 language now includes a way for a function defined as a member of a map to easily access other members (siblings) that belong to the same map instance. Special syntax, the =?> operator, was introduced to call such a function. As a whole this is a huge step forward providing the user with a new, powerful mechanism to conveniently express relationships and calculations over several member-values of a map instance.
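For readers unfamiliar with the mechanism, the following Python sketch models it under one assumption: that the call looks up the named member and passes the owning map as the first argument. The helper call_method is hypothetical, and the values mirror the rectangle example given later in this issue.

```python
def call_method(owner: dict, name: str, *args):
    # Look up the member function, then pass the owning map as its
    # first argument so the function can reach its sibling members.
    return owner[name](owner, *args)

rectangle = {
    "width": 20,
    "height": 12,
    "area": lambda this: this["width"] * this["height"],
}

print(call_method(rectangle, "area"))  # 240
```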
I am raising this issue with the goal of further improving and simplifying for the user the way to define and call a member function of a map/record, giving it a convenient way to access the values of other members of the instance of the map, on which the call has been issued.
In my work, I have been trying to define a number of functions that must belong to a map/record and that should be able to access other members of the same map/record to which these functions belong.
The experience was far from satisfying and here I describe the main problems I encountered when trying to use the =?> operator, and some obvious suggestions how we can further simplify the syntax for calling any member function of a map or record.
1. Problems trying to use the =?> operator
Here are the main problems I ran into.
Problem 1. The =?> operator was:
- weird-looking;
- difficult to use;
- difficult to understand;
- difficult to read;
- feeling unnatural.
It would be much better if we didn't have to use any special operator at all in order to call a member function "myFunction" of a map $m by simply: $m?myFunction(<tuple of any arguments defined in the signature of the function>)
Problem 2. There is no example, in the sections that describe the record type (3.2.8.3), showing a record member-function that accesses the values of other members of the same instance of the record. Thus, the new feature is effectively hidden for people who want to work with records. We need such an example for a record, so that we don't forget that any record is also a map and possesses all functionality a map has to offer. And a statement to this effect must be added to the description of records.
Problem 3. This syntax is overcomplicated and difficult to use and remember, resulting in unnecessarily long and complex expressions:
let $rectangle := {
"width": 20,
"height": 12,
"area": fn($this) { $this?width * $this?height }
}
return $rectangle =?> area()
It would be significantly better to use a much simplified syntax such as:
let $rectangle := {
"width": 20,
"height": 12,
"area": fn() { ?width * ?height }
}
return $rectangle ? area()
Recognizing that ?name is already used since XPath 3.1 as Unary Lookup Operator, and to avoid the unlikely case of collision, when a member function accesses other members of the map-owner-instance that happen to have identically the same names as expected constituents of the current context item (upon which the function is applied), we can introduce a special character to denote the current map-owner-instance, thus the above example could look like this:
let $rectangle := {
"width": 20,
"height": 12,
"area": fn() { ^width * ^height }
}
return $rectangle ? area()
Solutions
Solution for Problem 1 above (weirdness of the =?> operator):
Do not introduce any special operator. Just use ? to invoke the member-function.
Solution for Problem 2 above (lack of example of a record having a member-function that accesses other members of the same map-owner-instance). Obviously, provide such an example. Also reiterate there that all features and functionality of a map continue to be available for records.
Solution for Problem 3 above (overcomplicated syntax):
- Get rid of the =?> operator. Use ? for all references to member-functions.
- Don't use any special variable like $this. For example, the current example in the documentation: "area": fn($this) { $this?width * $this?height } should instead be: "area": fn() { ^width * ^height }
- Use the ^ character to denote owner-map-instance membership. Thus ^width means: "The member named "width" of the map instance upon which the current function was invoked".
Conclusion
I will issue a PR with the solutions, provided there are no substantial comments highlighting problems with this proposal.
Issue #1799 created #created-1799
"well-formed HTML document"?
There is an apparent ambiguity in the XPath Functions specification as to whether fn:parse-html raises dynamic error err:FODC0011 when html-version is set to one of the HTML5 versions and the content of $html is not well-formed, given the general expectation that HTML5 parsers can always parse an input string regardless of syntactic validity.
The HTML5 standard, for its part, has actually always allowed the parser to be aborted upon encountering a parse error (though no browser does this), so the function definition would seem to require that parse-html("<p>Hello</p>", { "method": "html", "html-version": 5 }) invariably raises an error, given that the input is invalid (missing opening <html> tag, etc.).
I don't think this is the intended behavior; my suggestion is to either have it be explicitly implementation-defined as to which parse errors cause err:FODC0011 to be raised, or require that it is never raised for HTML5.
Issue #1238 closed #closed-1238
XSLT on-no-match="shallow-copy-all" - revised rules
Issue #1798 created #created-1798
Getting the value of the new identity-(DM)property of a function. `fn:function-identity`
The current set of Functions on Functions (https://qt4cg.org/specifications/xpath-functions-40/Overview.html#functions-on-functions) was recently updated with a new function to produce all annotations for a given function: fn:function-annotations. However, we are still missing the ability to reference another important, newly added property of a function, defined both in DM and in XPath: the function identity.
fn:function-identity
Summary Returns the identity of the function item.
Signature
fn:function-identity( $function as fn(*) ) as xs:string
Properties This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Notes
- This function can be useful in any scenario, where the evaluation of a function call requires repeated evaluation of this same, or other functions. Often in such cases, the algorithm needs access to a general structure, containing the cached results of executing possibly many different functions applied on specific arguments-tuples.
What unique key is needed under which to group all invocations of a specific function and then the mapping between their function-call arguments and the result of the call? Remarkably, the identity of a function fits exactly the requirements (uniqueness / one per function) for such a key.
As we already have a function-identity property in the Data Model for each function-item, it is straightforward to provide this identity, and the fn:function-identity does exactly that.
- The function identity, by definition, is generated upon the creation of a function and has meaning throughout of the life of that function. It is not meaningful to store this value across different executions, because the identity given to a function in execution1 will generally be different from the identity, given to it in execution2. However, the definitions of system functions (functions defined in the specifications under the system namespaces - with standard prefixes: xs, fn, map, array, math, err, output) can be assigned permanent identities in the official Specs documents, requiring every implementation to use exactly this published identity value for the official, system functions, thus achieving efficiency and convenience during debugging.
- The function identity, being a string, can be used as a key in a map, thus making it possible to map a particular function to a sequence of items. It becomes possible to allow function items as map-keys by extending the definition of same-keys with: "If both keys are function items: $f1 and $f2, then they are the same if and only if: function-identity($f1) eq function-identity($f2)"
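The caching scenario from the first note can be sketched as follows. This is a Python analogy only: function_identity is a stand-in built on id(), which, like the proposed identity, is meaningful only within a single execution, and ResultCache is a hypothetical name.

```python
def function_identity(func) -> str:
    # Stand-in for the proposed function: id() is unique only while the
    # object is alive, so the value means nothing across executions.
    return str(id(func))

class ResultCache:
    def __init__(self):
        # identity -> (function, {argument tuple -> result})
        self._by_identity = {}
        self.misses = 0

    def call(self, func, *args):
        # Keeping a reference to func prevents its id() from being reused.
        _, results = self._by_identity.setdefault(
            function_identity(func), (func, {})
        )
        if args not in results:
            self.misses += 1
            results[args] = func(*args)
        return results[args]

def square(x):
    return x * x

cache = ResultCache()
cache.call(square, 4)
cache.call(square, 4)
print(cache.misses)  # 1
```

One cache serves many functions because the identity string groups the results per function, which is the role the note assigns to the key.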
Issue #1797 created #created-1797
elements-to-maps: separate function to construct a plan
I propose separating out the uniform=true option of elements-to-maps() into a separate function. This function analyses the data and produces a conversion plan, which can be supplied to the "layouts" option (perhaps renamed) of the main function.
The benefits are:
- The plan can be tweaked after it is created by manual adjustment, for example if the user wants to use "empty-plus" layout wherever the system's choice would be "empty", or to take account of anticipated future changes in the structure.
- The plan can be used to process documents that did not exist at the time it was created, thus ensuring that future documents are all converted in the same way, and avoiding the overhead of rebuilding the plan each time.
- The plan can be created from a small sample of the documents to be converted.
- The plan can be created from a large collection of documents including documents that don't need to be converted but which may contain structural elements that are not revealed by the documents that need converting now.
- The user can examine the plan to see what it is doing, which is useful for diagnostics.
We should define the format of the plan (a map from element names to layouts) so that it can conveniently be serialized as a JSON document.
Issue #1796 created #created-1796
Allow fn:invisible-xml to return a function that returns an item()
Our current fn:invisible-xml function returns a document node. That makes perfect sense when held up against the Invisible XML specification. But I wonder if we should leave the door open to some extensibility. I can imagine, for example, an implementation of an Invisible XML processor that has the ability to return a map or even a CSV structure instead of XML. (XML is the required, standard result in 1.0 but implementors have been known to offer user options to produce other serializations and one area of potential change in the future is other serialization formats.)
Pro: more extensible. Con: less type information about the result.
Issue #1795 created #created-1795
XSLT templates: Matching values in a map by key
The simplest coding pattern for template rule processing for JSON structures would be to take a structure like this:
[
{"name": "John", "address": { .... }, "job-history": [ { .... }, {....} ]},
{"name": "Jane", "address": { .... }, "job-history": [ { .... }, {....} ]}
]
and to process it using template rules of the form:
<xsl:template match="record(name, address, job-history)">
<xsl:apply-templates select="?*"/>
</xsl:template>
<xsl:template match="(pattern matching key 'name')">...</xsl:template>
<xsl:template match="(pattern matching key 'address')">...</xsl:template>
<xsl:template match="(pattern matching key 'job-history')">...</xsl:template>
We have nearly all the ingredients in place for this. In particular, we can ensure that the select="?*" selects values that are labelled with the relevant key, making it technically possible to match values according to that key: select="?*" might select an xs:string value "John", but the string is labelled with the property key="name", so it can in principle match a template rule designed to process the "name" value.
The only piece that's missing is how to write the match patterns. We can write match=".[label()?key = 'name']", but that's hopelessly long-winded.
I propose that we use the syntax match="?name" to match a value that is labelled with the key "name". This feels intuitive and natural, and 99% of users won't trouble with the complex underlying semantics.
We can extend this by borrowing other parts of the Lookup expression syntax, for example match="?('X', 'Y', 'Z')" to match several keys.
I would also suggest promoting the operators "union", "intersect" and "except" so they can be used to combine any patterns (not just node patterns) so this could be written match="?X | ?Y | ?Z", or we could write match="?* except ?X". But note that this would create an expectation that users can also write select="?* except ?X" in an XPath expression; and that's quite hard to achieve: see also #1794.
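The dispatch this proposal aims for, where each value is routed to a rule chosen by the key it is labelled with, can be approximated in Python. The sketch is an analogy under stated assumptions: apply_templates and the rule table are hypothetical, and a plain dictionary lookup stands in for pattern matching and priorities.

```python
def apply_templates(record: dict, rules: dict, default=lambda key, value: None):
    # Visit every entry (the analogue of select="?*") and hand each value
    # to the rule registered for its key (the analogue of match="?name").
    return [rules.get(key, default)(key, value) for key, value in record.items()]

rules = {
    "name": lambda key, value: value.upper(),
    "job-history": lambda key, value: len(value),
}

person = {"name": "John", "address": {}, "job-history": [{}, {}]}
print(apply_templates(person, rules))  # ['JOHN', None, 2]
```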
Issue #1794 created #created-1794
Lookup: select all except
In lookup expressions we have ?* to select all entries, and ?X to select a specific entry. There is frequently a requirement to select all entries with specific exceptions.
One way of doing this is $map => map:remove('X')?*
Another is to do $map?pairs::*[?key != 'X']?value
A third option is $map?[?key != 'X']?*
Or $map => map:filter(fn($k, $v){$k != 'X'})?*
Or for key $k value $v in $map where $k != 'X' return $v
None of these feels particularly user-friendly.
A possible syntax might be $map?-X or more generally "?" "-" KeySpecifier to select all entries that are not selected by the KeySpecifier. For example this would allow $map?-('X', 'Y') to exclude X and Y.
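The intended result of the proposed syntax can be stated in a few lines of Python. This is a sketch of the semantics as read from the description above; lookup_all_except is a hypothetical name, and a dict stands in for an XDM map.

```python
def lookup_all_except(mapping: dict, *excluded_keys) -> list:
    # Values of every entry whose key is not excluded, in entry order.
    return [value for key, value in mapping.items() if key not in excluded_keys]

m = {"X": 1, "Y": 2, "Z": 3}
print(lookup_all_except(m, "X"))       # [2, 3]
print(lookup_all_except(m, "X", "Y"))  # [3]
```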
Issue #1782 closed #closed-1782
1776 Add lookup patterns using ? and ??
Issue #1781 closed #closed-1781
XSLT: drop section 23 (Processing JSON Data) and Appendix B
Issue #1792 closed #closed-1792
Schema validation errors on function catalog for EXPath binary spec
Issue #1793 closed #closed-1793
1792 Make function-catalog file schema-valid
Pull request #1793 created #created-1793
1792 Make function-catalog file schema-valid
Fix #1792
Issue #1792 created #created-1792
Schema validation errors on function catalog for EXPath binary spec
I'm seeing schema validation errors after rebasing, it looks like PR #1765 introduced lines like
<fos:changes issue="1751">
when the @issue attribute should be on the child fos:change element.
Perhaps the validation done by the build has improved.
I'll fix this in a separate PR to be emergency-applied.
QT4 CG meeting 109 draft minutes #minutes—02-11
Draft minutes published.
Issue #1779 closed #closed-1779
XPath 4.0 EBNF grammar
Issue #1783 closed #closed-1783
1779 Make CharRef XQuery-only
Issue #1752 closed #closed-1752
Return type of fn:partition()
Issue #1761 closed #closed-1761
1752 Correct return type of fn:partition()
Issue #1751 closed #closed-1751
bin:encode-string - should the result have a BOM?
Issue #1765 closed #closed-1765
1751 Clarify BOM handling
Issue #1770 closed #closed-1770
Union patterns in XSLT
Issue #1772 closed #closed-1772
1770 Default priority of rules with a union pattern
Issue #402 closed #closed-402
XSLT patterns: intersect and except
Issue #1773 closed #closed-1773
402 Change the semantics of intersect and except in patterns
Issue #1784 closed #closed-1784
1781 Drop obsolete material from XSLT spec
Issue #755 closed #closed-755
with expression; chaining and concatenation
Issue #877 closed #closed-877
Inconsistency in XQFO comparator functions/operators with recursive rules
Issue #1729 closed #closed-1729
Grammar problems introduced by #1721
Issue #1767 closed #closed-1767
1729/1737 Fix grammar for "declare record"
Pull request #1791 created #created-1791
1789 Fix singleton terminology
Replaces "singleton map" with "single-entry map" and "singleton array" with "single-member array"; the term "singleton" now always means count()=1, not size()=1.
Fix #1789
Pull request #1790 created #created-1790
1788 Replace statement that maps are unordered
Fix #1788
Issue #1789 created #created-1789
Terminology: "singleton map"
We often use the term "singleton map" to mean a map containing a single entry (key-value pair).
But in XQ 4.14.3.1 we use the same term to mean "a sequence containing a single map".
Issue #1788 created #created-1788
Drop reference to maps being unordered
In F&O 17.5.1.5 elements-to-maps record layout, mapping rules, delete
Because the child elements are converted to a map, their order is not retained.
Substitute a rule that the entries in the map will correspond with "order of first appearance".
QT4 CG meeting 109 draft agenda #agenda-02-11
Draft agenda published.
Issue #1787 created #created-1787
Sorted maps revisited
Now that we have ordered maps established, I'd like to make another attempt to introduce sorted maps - that is, maps whose ordering is by key value. The entries in such a map would be sorted by key, but there's no attempt to maintain sort order in subsequent put() operations.
We introduce map:sort($m) essentially as a convenient shorthand for map:of-pairs(sort(map:pairs($m), fn{?key})).
And then we introduce something like map:get-range($from, $to) which returns the keys (or pairs, or entries) whose keys are in a given range -- which the implementation can optimize if it knows the map has been sorted.
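A small Python sketch shows why a sorted map makes range retrieval cheap. The names sort_map and get_range are hypothetical, the range is assumed to be inclusive at both ends, and the map is passed explicitly, none of which is settled by the text above.

```python
from bisect import bisect_left, bisect_right

def sort_map(m: dict) -> dict:
    # dicts keep insertion order, so inserting in key order gives a map
    # whose entry order is its key order.
    return dict(sorted(m.items()))

def get_range(sorted_m: dict, low, high) -> dict:
    # Binary search finds both ends of the range. A real implementation
    # would keep an index instead of rebuilding the key list per call.
    keys = list(sorted_m)
    start = bisect_left(keys, low)
    stop = bisect_right(keys, high)
    return {k: sorted_m[k] for k in keys[start:stop]}

m = sort_map({"c": 3, "a": 1, "d": 4, "b": 2})
print(get_range(m, "b", "c"))  # {'b': 2, 'c': 3}
```

As in the proposal, nothing here maintains the ordering after later insertions; the map is sorted once, at the point of the call.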
Issue #1786 created #created-1786
A case study for XSLT transformation of JSON: the transpiler
One of the design aims of XSLT 4.0 is that it should be easier to transform JSON. Back in 2016 I published a paper at XML Prague (https://www.saxonica.com/papers/xmlprague-2016mhk.pdf) with the rather disappointing result that for a couple of non-trivial JSON transformation tasks, the easiest solution was to convert the JSON to XML, transform the XML, and then convert it back. In many ways it was that discovery that motivated the whole XSLT 4.0 project. So I want to review to what extent we have solved that problem, and what remains to be done. In particular, I have recently raised a number of open issues related to how we transform JSON-derived trees of maps and arrays using template rules, and I'm not sure we can resolve those issues without testing the proposals against real use cases.
I'm proposing to take as a case study the Java-to-C# transpiler which we described in a 2021 paper at https://www.saxonica.com/papers/markupuk-2021mhk.pdf. This is a real XSLT application in daily use. It invokes the (open source) JavaParser to emit an XML representation of Java source code, it performs various transformations of that XML, and then finally spits out equivalent C# source code. My basic question is: suppose the JavaParser had chosen to emit JSON instead of XML (as it might perfectly reasonably have chosen to do). Would we be able to write the transpiler in XSLT 4.0 to work entirely within the JSON space, avoiding all use of XML?
I chose this case study for several reasons:
- It's entirely plausible that the input might have been JSON rather than XML
- The application relies very heavily (and successfully) on rule-based processing: if we didn't have template rules, then it would be dominated by large xsl:choose statements with hundreds of branches.
- At around 5000 lines of XSLT, it's large enough to be non-trivial, yet small enough to be tractable as a case study.
I looked at a couple of other candidates, and found they were things that could be readily done in XSLT 3.0 without any enhancements. For example we have production XSLT 3.0 code that takes a JSON data feed from our online shop at saxonica.com and uses it to update our sales database and to generate license keys. The JSON is voluminous but the structure is simple, and the constructs in XSLT 3.0 for handling maps and arrays are entirely up to the job. The transpiler differs in that the JSON has a much more interesting recursive structure, making rule-based transformation a natural fit to the task.
I'm not proposing to actually produce a complete replacement of the current transpiler, only to explore the task of doing so in enough detail to get some useful insights. I propose to use this issue tracker to capture my working notes as the study proceeds, but if there are recommendations affecting the 4.0 specs (as seems likely), then I will extract those into separate issues. Perhaps at the end of the process I will write up the case study as a conference paper.
My rough plan is as follows:
- Explore conversion of the current XML output by JavaParser to JSON using the new elements-to-maps() function. We have a number of open issues on the usability of this function and it will be interesting to see whether we encounter similar difficulties to those that have already been raised, and whether the suggested solutions are appropriate.
- Convert the xml-to-java stylesheet to work on this JSON input. This stylesheet is not actually a working part of the transpiler, rather it's something we built as a stepping stone; before attempting to convert the XML syntax tree to C#, we felt it would be instructive to write code that converted it back to Java. This is an 820-line stylesheet and it should be feasible to convert it completely.
- The transpiler currently produces, as an intermediate output, a "digest" file containing summary information about all the classes and methods found in the Java code, and their subtyping/override relationships. We then have a process that augments this digest with attributes that are needed by the C# generation, for example which methods to label with "virtual" or "override" modifiers. I propose to experiment with producing (and transforming) this digest in JSON rather than XML format.
- Examine the XSLT code that generates C# output to look for features that appear to be tricky to convert, for example anything that uses the parent or ancestor axis, and study to what extent we now have the capability in XSLT 4.0 to handle those situations.
Using this format (a GitHub issue) to record progress carries a risk that there will be comments that take things off at a tangent. Please help by resisting that temptation: if there are interesting issues raised in your mind, please take those up as separate issues.
Issue #1785 created #created-1785
XQuery 4.0 grammar: `ArrowExpr` target, `ReverseAxis`
While testing the parser generated from the specification grammar, I encountered two issues.
1. `ArrowExpr` target must be optional
The current definition in the specification is as follows:
ArrowExpr ::= UnaryExpr (SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget)
However, the target part must at least be optional, or better zero-or-more:
ArrowExpr ::= UnaryExpr (SequenceArrowTarget | MappingArrowTarget | LookupArrowTarget)*
Otherwise arrow targets are expected almost everywhere. Making it zero-or-more allows parsing of
a => b() => c()
which would not be possible without extra parentheses if it was optional.
2. Missing `preceding-sibling` in `ReverseAxis`
The `ReverseAxis` production currently appears as:
ReverseAxis ::= ( "ancestor"
| "ancestor-or-self"
| "parent"
| "preceding"
| "preceding-or-self"
| "preceding-sibling-or-self" ) "::"
It is missing the `preceding-sibling` axis.
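The first point above can be illustrated with a minimal Python sketch. It is not the specification grammar: it assumes a toy language in which an operand is a bare name and an arrow target is `=> name()`, and the function names `tokenize` and `parses` are hypothetical. It only shows how the repetition on the target group (exactly one, optional, or zero-or-more) changes which inputs are accepted.

```python
import re

# Toy tokens: "=>", a call target such as "b()", or a bare name such as "a".
TOKEN = re.compile(r"\s*(=>|[A-Za-z][\w-]*\(\)|[A-Za-z][\w-]*)")


def tokenize(text):
    """Split the toy input into tokens; raise ValueError on anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parses(text, repetition):
    """Check text against Operand Target{repetition}.

    repetition is 'one' (target required once), 'optional' (?), or 'star' (*).
    """
    tokens = tokenize(text)
    if not tokens or tokens[0] == "=>" or tokens[0].endswith("()"):
        return False
    i, targets = 1, 0
    # Consume as many "=> name()" targets as are present.
    while i + 1 < len(tokens) and tokens[i] == "=>" and tokens[i + 1].endswith("()"):
        targets += 1
        i += 2
    if i != len(tokens):
        return False  # trailing tokens that are not a complete target
    if repetition == "one":
        return targets == 1
    if repetition == "optional":
        return targets <= 1
    return True  # 'star' accepts any number of targets
```

With the required form a plain operand such as `a` is rejected, and with the optional form the chain `a => b() => c()` is rejected; only the zero-or-more form accepts both.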
Pull request #1784 created #created-1784
1781 Drop obsolete material from XSLT spec
Drops material mainly deriving from when XSLT 3.0 had to work with both XPath 3.0 and 3.1. Includes non-normative exposition and some obsolete conformance statements.
Pull request #1783 created #created-1783
1779 Make CharRef XQuery-only
Fix #1779
Makes the CharRef token XQuery-only.
Pull request #1782 created #created-1782
1776 Add lookup patterns using ? and ??
Fix #1776
Issue #1026 closed #closed-1026
XSLT match patterns on pinned maps and arrays
Issue #1781 created #created-1781
XSLT: drop section 23 (Processing JSON Data) and Appendix B
Section 23 Processing JSON data at one time contained the specification of maps, before this moved into XPath 3.1. This has now gone, and what's left is pretty much content-free.
Appendix B contains a stylesheet for converting XML to JSON. It has some educational value, but not much, and I think it can go.
Issue #1780 created #created-1780
xsl:for-each optional variable introduction
I spend quite a lot of time writing
<xsl:for-each select="foo">
  <xsl:variable name="foo" select="." as="element(foo)"/>
  <xsl:for-each select="$foo/bar">
    <xsl:variable name="bar" select="." as="element(bar)"/>
    .... do some stuff with $foo and $bar
  </xsl:for-each>
</xsl:for-each>
I'd prefer to go (much like xquery)
<xsl:for-each name="foo" as="element(foo)" select="foo">
  <xsl:for-each name="bar" as="element(bar)" select="$foo/bar">
    .... do some stuff with $foo and $bar
  </xsl:for-each>
</xsl:for-each>
Issue #1779 created #created-1779
XPath 4.0 EBNF grammar
The grammar extraction and transformation in RExify XQuery 4.0 grammar has been extended to cover the XPath 4.0 specification document, resulting in an LALR(1) grammar for XPath 4.0 that is suitable for REx.
This update revealed a minor issue: section A.3.1 Terminal Symbols lists `CharRef`, which is unreferenced:
CharRef ::= [http://www.w3.org/TR/REC-xml#NT-CharRef]
/* xgc: xml-version */
Currently, the transformation process includes a rule that removes this production. The rules will be adjusted as the grammar evolves.
Pull request #1778 created #created-1778
1456 Lookup expressions filtered by type
Fix #1456
Allows selection of records by type within a JSON tree, for example `$json ?? ~record(first, last) ? last`.
I'm aware that the use of the tilde here is controversial but I think this kind of query is going to be very common; it needs something simple and I think people will get used to it. No-one has suggested anything that is obviously better, and I propose to also use `~` in other similar contexts, for example type patterns in XSLT, which will increase familiarity.
I suggest reading `~` as "of type".
Issue #1777 created #created-1777
Shallow copy in XSLT with maps and arrays
Currently the `xsl:copy` instruction, if applied to a map or array, does a deep copy, and ignores the content of the contained sequence constructor.
I propose that if the contained sequence constructor is non-empty then instead of ignoring it, we should use it to create the content of the new map or array. Specifically, for maps xsl:copy will behave essentially like xsl:map, and for arrays it will behave essentially like xsl:array.
This is an incompatibility with 3.1, but since a contained sequence constructor is currently totally useless in this situation, it doesn't seem likely to cause any trouble.
I also propose that rather than using the new built-in `on-no-match="shallow-copy-all"`, we should extend the semantics of `shallow-copy` to cover maps and arrays (as currently defined for `shallow-copy-all`). Again, there is an incompatibility, but the current rules are so unhelpful that it's unlikely people are relying on them.
I also propose that when apply-templates is applied to a map or array, it should be automatically pinned if it is not pinned already. This means that match patterns can be used with a lot more context to match the deep contents of the map or array and override the processing of the built-in templates.
And I propose that when apply-templates is applied to a map or array and there is no `select` attribute, it should "do the right thing" by applying templates to the map or array contents, rather than using the useless default of `child::node()`.
Issue #1776 created #created-1776
Using `?` and `??` in XSLT patterns
I propose that the pattern `P1 ? P2`, where `P1` and `P2` are patterns, should match any labelled item $L provided that $L matches P2, and `$L?..` (that is, `($L => label())?parent`) matches `P1`.
Similarly, the pattern `P1 ?? P2`, where `P1` and `P2` are patterns, should match any labelled item $L provided that $L matches P2, and `$L?...` (that is, `($L => label())?ancestors()`) matches `P1`.
Note that neither the syntax nor the semantics are directly related to the lookup operator in XPath. In particular, P2 is a pattern, not a KeySpecifier. But there is a strong analogy, both with the use of `?` and `??` in XPath expressions, and with the use of `/` and `//` in patterns.
Issue #1775 created #created-1775
Navigation in JSON trees
I propose that the parse-json function should create a pinned tree, so that upwards navigation to parent and ancestor j-nodes becomes possible.
I propose introducing the key specifier `..`, with `$M?..` being a shorthand for `($M => label())?parent`, giving a convenient and familiar way to navigate from a j-node to its parent in a pinned tree. For example, `$M?..?name` gives the value of the `name` property in the immediately containing map.
Similarly, I propose introducing the key specifier `...` to navigate to ancestors, so `$M?...` becomes a shorthand for `($M => label())?ancestors()`, and `$M?...?name` returns the `name` property of all containing maps.
For symmetry I suggest we also provide `...` as an abbreviated axis step, short for `ancestor::node()`.
I'd like to find a better name for "pinned". Perhaps "tracked" better captures that what it does is to track downward navigation steps and make them reversible.
I'd also like to introduce the terms j-tree and j-node. A j-tree is a map or array, recursively expanded to include its entries or members. A j-node is a value in a j-tree. Perhaps confine the usage to maps and arrays that have been pinned/tracked.
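The idea of a pinned (or tracked) tree can be sketched in Python. This is only an illustration under stated assumptions: the class name `JNode` and its members are hypothetical, and the sketch models tracking by wrapping each value reached through a downward step with a reference to the node it was reached from. It is not how any XPath processor is required to implement it.

```python
class JNode:
    """A value in a parsed JSON tree that remembers how it was reached."""

    def __init__(self, value, parent=None, key=None):
        self.value = value    # the underlying dict, list, or atomic value
        self.parent = parent  # the containing JNode, or None at the root
        self.key = key        # the map key or array index used to get here

    def get(self, key):
        """Downward step: look up a key or index and track the step taken."""
        return JNode(self.value[key], parent=self, key=key)

    def ancestors(self):
        """Upward navigation: yield the parent, grandparent, and so on."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


doc = JNode({"name": "outer", "child": {"name": "inner", "items": [1, 2]}})
items = doc.get("child").get("items")

# Analogue of $M?..?name: the name property of the immediately containing map.
parent_name = items.parent.value["name"]

# Analogue of $M?...?name: the name property of all containing maps.
ancestor_names = [a.value["name"] for a in items.ancestors()]
```

Because the parent link is recorded at lookup time, the same underlying value can have different parents depending on the path used to reach it, which is the sense in which "tracked" describes the behaviour.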
Issue #1774 created #created-1774
Nomenclature: relabelling
The term relabelling - used when we are down-casting, for example when an `xs:integer` is supplied and the required type is `xs:unsignedByte` - is easily confused with the concept of a label, being a set of properties that can be associated with any item in XDM 4.0, and which is accessible through the `fn:label` function.
I suggest we rename relabelling as rebadging.
Apart from anything else, this has the virtue that my spell-checker won't auto-correct it...
Issue #1713 closed #closed-1713
Patchy exposition of XSLT type pattern syntax
Pull request #1773 created #created-1773
402 Change the semantics of intersect and except in patterns
Fixes a bug in the 3.0 spec whereby the `intersect` and `except` operators in a pattern have counter-intuitive semantics.
Fix #402
Pull request #1772 created #created-1772
1770 Default priority of rules with a union pattern
Scraps the increasingly-complicated rules for handling priority of rules with a union pattern.
Fix #1770
Issue #1771 created #created-1771
fn:deep-equal: map order
It may not come as a big surprise: A first feature request we received for ordered maps was to be able to take the order into account when comparing maps.
I would propose to add an `ordered-map` option to `fn:deep-equal`, which defaults to false:
(: returns false :)
deep-equal(
  { 1: 'one', 2: 'two' },
  { 2: 'two', 1: 'one' },
  { 'ordered-map': true() }
)
It should be simple to use and easy to implement.
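A rough Python analogue of the proposed option is sketched below. It assumes Python dicts (which preserve insertion order) stand in for ordered maps and lists stand in for arrays; the function name `deep_equal` and the `ordered_map` keyword are hypothetical stand-ins for `fn:deep-equal` and its option, and the sketch ignores everything else `fn:deep-equal` does (collations, node comparison, and so on).

```python
def deep_equal(a, b, ordered_map=False):
    """Compare nested dicts/lists; optionally require equal key order in maps."""
    if isinstance(a, dict) and isinstance(b, dict):
        if a.keys() != b.keys():  # same set of keys, order ignored
            return False
        if ordered_map and list(a) != list(b):  # same keys in the same order
            return False
        return all(deep_equal(a[k], b[k], ordered_map) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            deep_equal(x, y, ordered_map) for x, y in zip(a, b)
        )
    return a == b


first = {1: "one", 2: "two"}
second = {2: "two", 1: "one"}

unordered_result = deep_equal(first, second)                   # order ignored
ordered_result = deep_equal(first, second, ordered_map=True)   # order compared
```

The option is passed down through the recursion, so nested maps are compared with the same ordering rule as the outermost ones.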
Issue #1770 created #created-1770
Union patterns in XSLT
The original XSLT 1.0 rule for union patterns such as `match="A|B"` said that the default priority was calculated as if there were two separate template rules with `match="A"` and `match="B"`. This became more complicated with the introduction of `xsl:next-match` in XSLT 2.0 - what should happen if the item matches both branches? It became more complicated again in XSLT 3.0 with the introduction of `on-multiple-match` - is it a multiple match if an item matches both branches? And in 4.0 it's complicated further by the introduction of constructs like `match="element(A|B)"`, which is deemed equivalent to `match="A|B"`.
I would like to break this cycle with a backwards-incompatible change. The default priority of a union pattern should be the numeric maximum of the default priorities of its branches; the treatment as being somewhat-equivalent to two separate template rules should go. We should encourage implementations to issue a compatibility warning if a union pattern appears with no explicit priority, and with multiple branches having different default priority.
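The proposed rule is small enough to sketch in Python. The function names are hypothetical, and the caller is assumed to supply the default priority of each branch (for example 0 for a simple name test such as `A` and 0.5 for a multi-step pattern such as `A/B`); the sketch does not compute priorities from pattern syntax.

```python
def union_default_priority(branch_priorities):
    """Proposed rule: the numeric maximum of the branches' default priorities."""
    return max(branch_priorities)


def needs_compatibility_warning(branch_priorities, explicit_priority=None):
    """Warn when no explicit priority is given and the branches disagree."""
    return explicit_priority is None and len(set(branch_priorities)) > 1


# A hypothetical union of a name test and a multi-step pattern: branches 0 and 0.5.
priority = union_default_priority([0, 0.5])
warn = needs_compatibility_warning([0, 0.5])
```

A union whose branches all share one default priority behaves the same under the old and new rules, which is why the warning is limited to the mixed case.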
Pull request #1769 created #created-1769
Add links from processing model diagrams
Completes action QT4CG-108-02
I’ve added link targets where necessary. I didn’t try to link closer than that paragraph level, partly because I think that’s the context the reader needs, but also partly because we don’t copy ID values from all elements.
There’s no definition of DM4.
(Review of the link targets and comments on what (if anything) the remaining boxes and labels should link to most appreciated.)
Issue #1768 closed #closed-1768
Inline SVG images
Pull request #1768 created #created-1768
Inline SVG images
In order for links to work in the browser, the SVG has to be inline, not loaded from a separate file. For self-document links, I guess this makes sense.
This is a tools-only change.
Pull request #1767 created #created-1767
1729/1737 Fix grammar for "declare record"
Fix #1729
- The syntax should be "declare record", not "declare type record".
- All the declarations using annotations should allow multiple annotations.
- Added a note about refactoring the grammar to avoid unbounded lookahead.
Pull request #1766 created #created-1766
1715 Drop array bound checking
Fix #1715
Drops array bound checking from `array:get`, arrays-as-functions, and array lookup. Returns () instead of an error FOAY0001 when the index is out of bounds. This brings arrays and maps into closer alignment.
Drops the `$fallback` argument of `array:get()`.
Adds a new function `array:get-if-present()` which replicates the old behaviour of `array:get()`.
Functions such as `array:put`, `array:replace`, `array:insert-before`, `array:head`, `array:tail` continue to perform bound checking.
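The change in out-of-bounds behaviour can be modelled in Python as follows. This is a sketch only: the function names are hypothetical, Python lists stand in for arrays (with 1-based indexing as in XPath), an empty tuple stands in for the empty sequence, and `IndexError` stands in for error FOAY0001. It does not attempt to model `array:get-if-present()` or the dropped `$fallback` argument.

```python
def array_get_checked(members, index):
    """Bound-checked lookup: an out-of-range index is an error (FOAY0001)."""
    if not 1 <= index <= len(members):
        raise IndexError("FOAY0001: array index out of bounds")
    return members[index - 1]


def array_get_unchecked(members, index):
    """Lookup as described in this PR: out of range yields the empty sequence."""
    if not 1 <= index <= len(members):
        return ()
    return members[index - 1]


def map_get(entries, key):
    """Map lookup for comparison: an absent key yields the empty sequence."""
    return entries.get(key, ())
```

With the unchecked form, a missing array member and a missing map entry are both reported the same way, which is the alignment the PR description refers to.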
Issue #1738 closed #closed-1738
Formatting of lists within notes
Pull request #1765 created #created-1765
1751 Clarify BOM handling
Fix #1751
Clarifies BOM handling (and byte order generally) in `bin:encode-string` and `bin:decode-string`.
Also adds a note to `bin:octal` for the prevention of possible misunderstanding.
Issue #1758 closed #closed-1758
EXPath specification validation problems
Issue #1759 closed #closed-1759
Fix validation issues in the EXPath module function catalogs
Issue #1739 closed #closed-1739
Obsolete references to ordering mode
Issue #1741 closed #closed-1741
1739 drop references to ordering mode in the static context
QT4 CG meeting 108 draft minutes #minutes—02-04
Draft minutes published.
Issue #1757 closed #closed-1757
Build cleanup: remove the "by hand" diffs
Issue #1760 closed #closed-1760
Remove hand-generated diffs from the builds
Issue #1743 closed #closed-1743
1738 Formatting of Notes in F&O
Issue #1733 closed #closed-1733
ACTION QT4CG-088-04, reworking the processing model diagram
Issue #1746 closed #closed-1746
Replace processing model diagrams
Issue #1750 closed #closed-1750
EXPath Binary : copy-edits and minor enhancements
Issue #1753 closed #closed-1753
1750 Overhaul of EXPath binary spec
Issue #1571 closed #closed-1571
Discussion: On the implementability of the specs and helping implementors
Issue #1699 closed #closed-1699
XPath function to calculate edit distance between two strings
Issue #1682 closed #closed-1682
Type Promotion
Issue #1734 closed #closed-1734
1682 Type promotion and operator mapping
Issue #1764 closed #closed-1764
Remove the BOM from unparsed text input?
Issue #1764 created #created-1764
Remove the BOM from unparsed text input?
XML parsing handles the BOM for us, and we say something explicit about the BOM when parsing JSON, but we're silent about the BOM when loading unparsed text. I think the right answer is to discard the BOM and return the text that follows it...
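A minimal Python sketch of the suggested behaviour, assuming the resource has already been fetched as bytes and the encoding is known. The function name `unparsed_text_from_bytes` is hypothetical and is not part of any specification.

```python
def unparsed_text_from_bytes(data, encoding="utf-8"):
    """Decode the bytes and discard a single leading byte order mark, if any."""
    text = data.decode(encoding)
    # Python's plain "utf-8" codec keeps the BOM as U+FEFF, so strip it here.
    return text[1:] if text.startswith("\ufeff") else text


with_bom = unparsed_text_from_bytes(b"\xef\xbb\xbfhello")
without_bom = unparsed_text_from_bytes(b"hello")
```

Only a BOM at the very start is discarded; a U+FEFF character later in the text is left alone.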
Issue #1762 closed #closed-1762
Combining different kinds of arrow
Pull request #1763 created #created-1763
1716 Generalize syntax of arrow expressions
Fix #1716
QT4 CG meeting 108 draft agenda #agenda-02-04
Draft agenda published.
Issue #1762 created #created-1762
Combining different kinds of arrow
In the spec, under arrow expressions, we have this example:
(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn($a) { $a + 1 }() => sum()
That use of an inline function is pretty clumsy, and it would be nice to think we could eliminate it using the new `->` operator. But it ain't easy.
We can't do
(1 to 5) =!> xs:double() =!> math:sqrt() -> .+1 => sum()
because the precedence is wrong.
We can't do
(1 to 5) =!> (xs:double() => math:sqrt() -> .+1 ) => sum()
because we can't have a parenthesised construct on the RHS of the mapping arrow.
We can use the bang operator but the parentheses are awkward:
((1 to 5) ! (xs:double(.) => math:sqrt() -> (.+1) )) => sum()
If we changed the precedences we could allow
(1 to 5) ! xs:double(.) ! math:sqrt(.) ! (.+1) -> sum(.)
Which would require moving `->` so it has lower precedence than `!`. But this would disrupt its relationship with `=>`.
Pull request #1761 created #created-1761
1752 Correct return type of fn:partition()
Fix #1752
Pull request #1760 created #created-1760
Remove hand-generated diffs from the builds
Fix #1757
The PR build isn't going to be very informative, but I'll leave this one open in case anyone wants to review the source code diffs.
I did not attempt to remove the XML markup from the specs. Perhaps we should, but I think we'd want to manage that carefully to avoid an absolute mountain of merge conflicts.
Issue #1744 closed #closed-1744
Remove dead wood re: SVG diagrams from the XSLT build
Pull request #1759 created #created-1759
Fix validation issues in the EXPath module function catalogs
Fix #1758
Issue #1758 created #created-1758
EXPath specification validation problems
As @michaelhkay noted in email, the function catalogs for the EXPath specifications are not being validated.
That validation only occurs during test generation ¯_(ツ)_/¯
- Add an example to the EXPath file specification so that it's possible to run test generation
- Add test generation for EXPath file and binary to the build
- Fix the validation errors in the function catalog
Issue #1756 closed #closed-1756
Make DeltaXML diffs on the main build too
Issue #1757 created #created-1757
Build cleanup: remove the "by hand" diffs
Unless I'm mistaken, the 'by hand' diffs, the ones that are created from explicit diff markup added by the editors, have not been consistently maintained for some time.
We still have places that point to them, and I think this could be confusing.
I propose that we pull all of that machinery out and remove references to them.
Pull request #1756 created #created-1756
Make DeltaXML diffs on the main build too
This PR should build DeltaXML diffs of the EXPath specs...and when merged, should build them on the main build as well.
Issue #1755 closed #closed-1755
Attempt to make DeltaXML diffs for EXPath specs
Pull request #1755 created #created-1755
Attempt to make DeltaXML diffs for EXPath specs
Issue #1754 created #created-1754
Inverse functions to bin:hex, bin:bin, and bin:octal
In writing formal equivalents for the functions in the binary EXPath module, I found that while we have bin:bin() which turns a string of 0s and 1s into a binary value, we don't have any convenient way of doing the inverse. The same is true for octal. For hex we can cast to hexBinary and then cast to string, but that's a bit of a circumlocution.
I propose functions bin:to-bin, bin:to-octal and bin:to-hex that convert a binary value to a string of binary, octal, or hexadecimal digits respectively. Perhaps with an options parameter that allows a grouping separator and grouping size to be specified.
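As an illustration of what the binary and hexadecimal variants might return, here is a Python sketch. The function names, the `separator` and `group_size` parameters, the left-to-right grouping, and the upper-case hex digits are all assumptions made for the example, not part of the proposal. The octal variant is left out because the proposal does not say how bits would be grouped when the bit count is not a multiple of three.

```python
def _group(digits, separator, group_size):
    """Insert the separator between groups of digits, counting from the left."""
    if not separator:
        return digits
    return separator.join(
        digits[i:i + group_size] for i in range(0, len(digits), group_size)
    )


def to_bin(data, separator="", group_size=8):
    """Render each octet as eight binary digits."""
    return _group("".join(f"{byte:08b}" for byte in data), separator, group_size)


def to_hex(data, separator="", group_size=2):
    """Render each octet as two hexadecimal digits."""
    return _group(data.hex().upper(), separator, group_size)
```

For example, `to_hex(b"\xde\xad\xbe\xef", " ", 4)` groups the eight hex digits into two blocks of four.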
Pull request #1753 created #created-1753
1750 Overhaul of EXPath binary spec
Apart from general copy-editing, the main changes are:
- A lot more examples, presented in executable markup format (though they are not yet tested)
- Many functions now have formal equivalents (again, currently untested)
- Allow underscores and spaces in input to bin:hex, bin:octal, and bin:bin
- Use type xs:unsignedByte for octet arguments
- Use an enum() type for the octet-order argument
Fix #1750
Issue #1752 created #created-1752
Return type of fn:partition()
The return type of `fn:partition` should be `array(item()*)*` not `array(item())*`.
Issue #1751 created #created-1751
bin:encode-string - should the result have a BOM?
Test cases in the EXPath test suite using `bin:encode-string` with encoding=utf-16 include a BOM at the start of the output, but the spec says nothing about this. It's probably useful for some use case but a nuisance for others.
Issue #1750 created #created-1750
EXPath Binary : copy-edits and minor enhancements
Suggested minor enhancements:
- Allow underscores and whitespace in strings of binary, octal, or hex digits supplied as strings.
- Use type `xs:unsignedByte` rather than xs:integer for octet values.
- Use an enum type for params like "little-endian".
The following are some suggested copy-edits:
Abstract para 4 - link to XQuery 4.1. The last sentence of the para ("The signatures and summaries of functions defined in this document...") makes no sense.
1.1 para 1, twice, ".)" should be ").".
1.2 Mention that the coercion rules in 4.0 mean that wherever a function accepts xs:base64Binary, it also accepts xs:hexBinary (but we've changed the signature to allow either, anyway).
para 2. " if the result return"?
The Note is inelegantly worded.
1.3 I guess we should integrate the test suite into QT4.
1.5 para 2 "In accordance with current practice" eh?
2.1 Example would benefit from reformatting.
2.2 Example, similarly. Could use underscores in the long integers. "and the examples from above reverse"??
- "fn:fn:binary-resource" does not yet exist and is triple-barrelled.
- Avoid "apologetic quotes" in 'constants'. And elsewhere. If it doesn't work as plain English without quotes, then it needs to be a defined term.
4.1 and throughout, in Examples, use the F&O rendition rather than the right arrow. Also, add these functions to the example checking mechanism.
4.1 Notes, be more precise than "similarly". Define formal equivalent. Non-editorial enhancement: allow underscores in the string.
4.2 There must be a more elegant way of saying "(8-wise) (ASCII) binary digits ([01])". Allow underscores.
4.2 "a xs:base64Binary with no embedded data" - use the term "zero-length".
4.3 similarly. Allow underscores.
Function properties: I think all these functions are pure functions so it's a waste of space to say this explicitly for each function.
4.4, 4.5 Use `xs:unsignedByte` to represent octets now that we have implicit downcasting. (Changes error code [bin:octet-out-of-range] to XPTY0004).
5.6 "blank octets"?
7.1.2 "or assumed to be represented"
7.1.3 "Care should be taken" - what does this mean?
"Positive and negative infinities are supported" - who or what is doing the supporting?
Use underscore rather than space as separators between digits.
'quiet' NaN - avoid apologetic quotes.
7.4 - I find the note regarding signed/unsigned integers very confusing.
8.1: "bitwise or" - avoid apologetic quotes. For these three functions we should say what they do rather than assuming the reader will guess from the names. bin:shift could do with more precision. Perhaps the functions could be explained more formally by a mapping from a binary value to a sequence of booleans, then for example bin:and becomes something like for-each-pair(op:from-bits($a), op:from-bits($b), op('and') ) => op:to-bits().
8.5: avoid the notation `|$by|` for absolute value. Not all of us remember our schooldays. (And when I was at school, `by` meant `b × y`, and $ meant dollars.)
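The suggestion under 8.1, explaining the bitwise functions through a mapping from a binary value to a sequence of booleans, can be sketched in Python. The names `from_bits` and `to_bits` mirror the hypothetical `op:from-bits` and `op:to-bits` mentioned there, and rejecting arguments of different lengths is an assumption of this sketch.

```python
def from_bits(data):
    """Map a binary value to a list of booleans, most significant bit first."""
    return [bool((byte >> shift) & 1) for byte in data for shift in range(7, -1, -1)]


def to_bits(bits):
    """Inverse of from_bits: pack booleans back into octets, eight at a time."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | int(bit)
        out.append(byte)
    return bytes(out)


def bin_and(a, b):
    """Bitwise AND defined pairwise over the two boolean sequences."""
    if len(a) != len(b):
        raise ValueError("arguments must have the same length")
    return to_bits([x and y for x, y in zip(from_bits(a), from_bits(b))])
```

Defining `bin:or` and `bin:xor` the same way would only change the pairwise operator, which is the economy the copy-edit note is pointing at.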
Issue #1749 closed #closed-1749
Don't set the function finder position to 'fixed' on small devices
Pull request #1749 created #created-1749
Don't set the function finder position to 'fixed' on small devices
This is also related to 1747. It "fixes" the problem that @ChristianGruen reported where the function finder obscured content on mobile (narrow) devices. I've changed things so that it isn't at a fixed location on narrow devices. It still appears above the ToC, but it scrolls as normal.
Issue #1748 closed #closed-1748
Fix 'window.onload' bug in ToC JS
Pull request #1748 created #created-1748
Fix 'window.onload' bug in ToC JS
This fixes 1747 so I'm going to push it immediately.
I'm leaving the bug open because I'll also look at @ChristianGruen 's report that it is problematic on mobile.
Issue #1747 created #created-1747
Function finder is broken
The function finder in F&O (and elsewhere) is broken. I believe that it uses the ToC to find the link target and now that the ToC structure has changed, it's failing.
Issue #1745 closed #closed-1745
Implement expanding/collapsing ToC
Pull request #1746 created #created-1746
Replace processing model diagrams
Fix #1733
Pull request #1745 created #created-1745
Implement expanding/collapsing ToC
I'm just going to merge this one because
- The CG agreed they wanted this
- All of the changes are presentational, there are no technical changes
- The PR build won't work anyway
I did make a couple of executive decisions.
The use of "..." as the target to click on didn't seem like a practical affordance. It's not a common use of ellipsis and it looked too much like it simply meant that part of the title was elided. I went with right and down triangles instead. And I added the few lines of JS required to make them "turn".
I added a top-level expand/collapse that does all of the sections. I wasn't happy that with the new UI, there was no way to get an overview of the document by seeing all of the section titles.
I tinkered with the CSS. I'm not uniformly happy with it, especially with the treatment of long titles, but I think the aesthetic failings are infrequent.
We need to review accessibility before we try to publish as a CG Report.
Pull request #1744 created #created-1744
Remove dead wood re: SVG diagrams from the XSLT build
Completes action QT4CG-106-01
Pull request #1743 created #created-1743
1738 Formatting of Notes in F&O
- Improves the stylesheets and CSS so that Notes sections in the F&O spec are rendered with a single continuous green stripe, rather than a separate (and sometimes indented) stripe per paragraph or list item.
- Makes some other markup changes identified in passing, especially using `<char>` to mark up individual characters.
Issue #1742 created #created-1742
Maps constructed using streamed xsl:fork instruction should not be ordered
One of the techniques used in XSLT streaming is to build multiple outputs during a single streamed pass of the input, and the multiple outputs can be captured in different entries in a map (in different prongs of an `xsl:fork` instruction). The ordering of such a map should be implementation-dependent, in order to allow construction in parallel threads.
Furthermore, I think that an xsl:map instruction used in this way should probably not allow duplicate keys. In principle we could collect key/value pairs during the streamed processing and then resolve duplicates at the end, but it's extra complexity.
Pull request #1741 created #created-1741
1739 drop references to ordering mode in the static context
Fix #1739
Pull request #1740 created #created-1740
1725b Further elaboration of duplicates handling in maps
Actions QT4CG-107-02 and QT4CG-107-03.
The three functions map:build, map:of-pairs, and map:merge now all have the same options parameters, and avoid duplication in the specification. The xsl:map instruction is defined by reference to map:merge.
Although the action suggested specifying these functions to use the first key from a set of duplicates, I found this was not possible because of the way map:put is defined. They therefore use the last key from the set of duplicates.
Fix #1725
Issue #1739 created #created-1739
Obsolete references to ordering mode
The functions `fn:distinct-values` and `fn:duplicate-values` refer to the ordering mode in the static context, a concept that we have abolished.
Issue #1738 created #created-1738
Formatting of lists within notes
The formatting of lists within notes in F&O is weird: see for example the math:atan2 function.
Issue #1737 created #created-1737
Grammar problems introduced by #1732
Today's merge of #1732 has introduced two problems to the grammar as now shown in the spec:
- `ValueExpr` has changed from
ValueExpr ::= ValidateExpr | ExtensionExpr | SimpleMapExpr
to
ValueExpr ::= SimpleMapExpr
This disconnects `ValidateExpr` and `ExtensionExpr` from the rest of the grammar.
- `AnnotatedDecl` has been added without being referenced. Also it describes something that would look like
declare declare variable $x external
Issue #1722 closed #closed-1722
1717 define focus functions using pipeline operator
Issue #1717 closed #closed-1717
Define focus functions in terms of the pipeline operator
Issue #1736 created #created-1736
Add option retain-order=false when constructing maps
I would like to provide an option on functions that potentially create large maps, including
xsl:map
map:build
map:merge
map:of-pairs
parse-json
json-doc
If retain-order=false is specified, the user declares to the processor that they don't require the resulting map to be in any particular order. An implementation is of course free to ignore this and deliver an ordered map anyway, but if the implementation can save time or space by not retaining order then it is free to do so.
I propose to provide some data quantifying the potential benefits of this option. I realise that some optimisation hints provided in the past, for example the `unordered{}` expression, have been ineffective, but I think there is a difference here because changing maps to be ordered may result in a performance regression for people moving from 3.1 to 4.0.
QT4 CG meeting 107 draft minutes #minutes—01-28
Draft minutes published.
Issue #1719 closed #closed-1719
Purging dead build code
Issue #1731 closed #closed-1731
1719 drop shared spec from build
Issue #1725 closed #closed-1725
Position of duplicates in ordered maps
Issue #1727 closed #closed-1727
1725 Define more detailed rules for duplicates in maps
Issue #1485 closed #closed-1485
Record declarations in XSLT
Issue #1708 closed #closed-1708
1485 Add xsl:record-type declaration
Issue #76 closed #closed-76
non-deterministic time
Issue #747 closed #closed-747
QName literals
Issue #885 closed #closed-885
fn:uuid
Issue #981 closed #closed-981
Identify optional arguments in callback functions
Issue #1720 closed #closed-1720
Grammar overhaul
Issue #1732 closed #closed-1732
1720 grammar simplification
Issue #1069 closed #closed-1069
fn:ucd
Issue #1124 closed #closed-1124
Formatting XPath/XQuery: Preferences, Conventions
Issue #1252 closed #closed-1252
Add a new function `fn:html-doc`
Issue #1728 closed #closed-1728
Fix CSS for production tables
Pull request #1735 created #created-1735
1341 Drop $position callback from many functions
Responding to the discussion in #1341, this (somewhat experimental) PR explores the possibility of dropping the optional $position argument to the callback of many higher-order functions such as some(), every(), filter(), for-each(), fold-left(), fold-right(). Instead, it provides the option to wrap the input sequence in a call of numbered-items() which replaces each item in the input with an (item, position) pair.
I've done this only (so far) for higher-order sequence functions, but the intent is that the same could be done for arrays and (potentially) maps.
I left the position argument in place for a few functions where losing it seemed to cause genuine inconvenience:
- partition(), where the function wraps the supplied items into arrays, and you don't want to have to remove the positions afterwards
- subsequence-where(), where many use cases are likely to use positional information
- for-each-pair(), where there are two input sequences and it seems clumsy to associate position information with one or the other
The main benefit is that we provide one basic mechanism which is automatically available everywhere, which means we don't have to have debates about whether or not there is a use case for adding position information to (say) fold-left or scan-right.
A further benefit is that the functions defined for sequences automatically become available for arrays and maps. I haven't yet explored the impact on maps and arrays; I will wait first to see what the reaction is to this proposal.
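The wrapping approach can be illustrated with a short Python sketch. The function name `numbered_items` follows the PR, but the shape of each pair (a Python tuple of item and 1-based position) is an assumption of this example; the PR only says that each item is replaced with an (item, position) pair.

```python
def numbered_items(sequence):
    """Replace each item with an (item, position) pair, positions from 1."""
    return [(item, position) for position, item in enumerate(sequence, start=1)]


# The callback no longer needs a separate position argument: position travels
# with the item, so an ordinary one-argument predicate is enough.
even_positioned = [
    item for item, position in numbered_items(["a", "b", "c", "d", "e"])
    if position % 2 == 0
]
```

The same wrapper works unchanged in front of any function that takes a sequence, which is the "one basic mechanism" benefit described above.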
Issue #1730 closed #closed-1730
Consistency in default handling of map duplicates
Pull request #1734 created #created-1734
1682 Type promotion and operator mapping
Fix #1682
Moves the relevant parts of the operator mapping table into the sections for Arithmetic Expressions and Value Comparisons. Adds links to the op: functions in F&O.
Drops the Type Promotion appendix, moving the rules inline; and drops the term "type promotion"
Adjusts the specs for sum() and avg() so they are now defined directly in terms of pairwise addition of values.
Issue #1733 created #created-1733
ACTION QT4CG-088-04, reworking the processing model diagram
I have no idea what tool was used to create the current processing model diagram. We know it needs to be updated, but I've no particular skill with drawing programs, so I spent half an hour constructing a Graphviz diagram:
digraph Processing_Model {
subgraph clusterQT4 {
Exec [label="Execution\nEngine" ];
XDM [label="XPath Data\nModel"; shape="note" ];
AST [label="Abstract\nSyntax Tree" ];
Static [label="Static\nContext"; shape="box3d" ];
Dynamic [label="Dynamic\nContext"; shape="box3d" ];
Schema [label="Schema\nDefinitions"; shape="note" ];
XPath -> AST [label=" SQ1" ];
AST -> AST [label=" SQ5" ];
AST -> Exec [label=" DQ1" ];
Schema -> Static;
Static -> AST [label=" SQ4" ];
Static -> Dynamic [label=" DQ2" ];
Dynamic -> Exec [ dir="both"; label=" DQ5" ];
Exec -> XDM [ dir="both"; label=" DQ4" ];
}
XML [ shape="note" ];
PSVI [ shape="note" ];
XML -> PSVI [ label=" DM1" ];
PSVI -> XDM [ label=" DM2" ];
XML -> XDM [ label=" DM1" ];
Direct [ label=" Direct\nGeneration" ];
Direct -> XDM [ label=" DM3" ];
Host [label="Host\nEnvironment" ];
Host -> Schema [label=" SI1" ];
Host -> Static [label=" SQ2" ];
Host -> Dynamic [label=" DQ3" ];
Serialize [ shape="note" ];
XDM -> Serialize [ label=" DM4" ];
}
It looks something like this:
Is this worth pursuing, or is that just half an hour of my life I'll never get back?
QT4 CG meeting 107 draft agenda #agenda-01-28
Draft agenda published.
Pull request #1732 created #created-1732
1720 grammar simplification
This PR primarily affects the grammar file, simplifying it to remove most of the material that is only there to support the generation of a JavaCC parser (which has probably not been achievable since XPath/XQuery 2.0).
The section of the grammar that defines the binary operators starting with OrExpr
is now expressed using conventional production rules, rather than the precedence-based grammar previously used. This allows deletion of some convoluted code in grammar2spec.xsl.
The DTD for the grammar file is revised to exclude many constructs that are no longer used.
Many simple token definitions (especially those that consist of a simple constant string) have been inlined.
Fix #1720
Pull request #1731 created #created-1731
1719 drop shared spec from build
Removes tasks from the gradle build, and associated stylesheets, that are there only to construct the "shared XPath/XQuery specification" which is no longer used by the editors or made visible to readers.
Also fixes a couple of link errors/warnings in the build.
Fix #1719
Issue #1730 created #created-1730
Consistency in default handling of map duplicates
In 3.1:
- map:merge() defaults to duplicates = use-first
- xsl:map defaults to duplicates = reject
In 4.0:
- map:build defaults to duplicates = combine
- map:of-pairs defaults to duplicates = combine
Should we try to align the defaults?
Issue #1729 created #created-1729
Grammar problems introduced by #1721
Some productions of the XQuery 4.0 grammar were made obsolete by recent changes, but still occur in the document:
StringConstructorStart
StringInterpolationStart
StringInterpolationEnd
StringConstructorEnd
TagQName
EndTagQName
ProcessingInstructionStart
ProcessingInstructionEnd
DirCommentContentChar
DirCommentContentDashChar
Also, the replacement of declare record
by declare type record
has introduced a new ambiguity. For example, with the input
declare type A as xs:integer;
declare type record as (A);
it remains unclear whether the second line declares a type named "record", in which case
42 instance of record
or a type named "as", where
{'A': 42} instance of as
My proposal would be to return to declare record
. There are also 13 examples in the document using declare record
.
Issue #1723 closed #closed-1723
`ThenAction` left over after removal of `BracedActions`
Pull request #1728 created #created-1728
Fix CSS for production tables
This PR removes some extraneous space between the rows and columns in the production tables (cellspacing) and turns off the odd grey background on comments. (I don't think the grey background was helping any, but if you disagree...)
Issue #1721 closed #closed-1721
1713 Revise code for generating production rules
Pull request #1727 created #created-1727
1725 Define more detailed rules for duplicates in maps
Clarifies the rules for how duplicates are handled by map:merge, map:build, map:of-pairs, and xsl:map.
Introduces a callback option for map:merge that is compatible with map:build and map:of-pairs, to increase commonality between all four functions/instructions.
Fix #1725
Issue #1726 closed #closed-1726
1726 Control order when map input has duplicate keys
Pull request #1726 created #created-1726
1726 Control order when map input has duplicate keys
Issue #1725 created #created-1725
Position of duplicates in ordered maps
It became clear to me when writing test cases that the specs aren't entirely clear about what happens when building a map from an input sequence that contains duplicate keys. It says clearly what entry should be created for the duplicated key, but it doesn't say clearly where this entry should appear in the result.
There are four functions/instructions that this applies to: map:merge, map:build, map:of-pairs, and xsl:map.
I propose that in each case, the position of the entry for the duplicated key in the resulting map should correspond to the position of the first occurrence of that key in the input sequence. That is, "order of first appearance": the effect should be the same as if new entries are always created using a map:put() operation.
This might be slightly unexpected in the case of map:merge()
with the option duplicates=use-last
. It means the value will be that of the last duplicate, but its position will be that of the first duplicate. However, the other three functions/instructions achieve the effect of use-last with the callback on-duplicates=fn($a, $b){$b}
which only controls the value of the entry, and cannot be used to control its position, and I think it makes sense for map:merge
with duplicates=use-last
to behave in the same way.
Of course we could introduce a separate option to control the position of the combined entry, but I think that would be overkill. xsl:for-each-group
and distinct-values
both use the "order of first appearance" rule and this has never caused any problems. (group-by
in XQuery delivers groups in implementation-dependent order, however).
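The "order of first appearance" rule can be sketched in Python, whose dicts also keep a key's original position when its value is overwritten. This is only a model of the proposal: build_map and its on_duplicates parameter are names invented for the sketch, not the signatures of map:build or map:merge.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple

def build_map(
    pairs: Iterable[Tuple[Hashable, Any]],
    on_duplicates: Callable[[Any, Any], Any],
) -> Dict[Hashable, Any]:
    """Build an ordered map; a duplicated key keeps the position of its first occurrence."""
    result: Dict[Hashable, Any] = {}
    for key, value in pairs:
        if key in result:
            # Assigning to an existing key changes the value but not the position.
            result[key] = on_duplicates(result[key], value)
        else:
            result[key] = value
    return result

pairs = [("x", 1), ("y", 2), ("x", 3)]

# "use-last": the value comes from the last duplicate, the position from the first.
use_last = build_map(pairs, lambda a, b: b)
# "combine": both values are kept, again at the position of the first occurrence.
combined = build_map(pairs, lambda a, b: (a if isinstance(a, list) else [a]) + [b])
```

Here list(use_last.items()) puts "x" before "y" even though the surviving value 3 appeared last in the input.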
Issue #1724 created #created-1724
Allow @copy-namespaces on <xsl:mode>?
As part of an XSLT transformation I need to remove an unused (anywhere) namespace declaration and <xsl:mode on-no-match="shallow-copy"/>
doesn’t appear to accept a @copy-namespaces
attribute where I can tell it not to copy unused namespaces. The unused namespace was used on a single attribute on the input document, but I’m removing the attribute entirely as part of the transformation, so nothing will remain in the output document that uses the namespace in question. With <xsl:mode on-no-match="shallow-copy"/>
the namespace declaration is copied into the result, even though it is not used. If I use the old identity template and set the value of @copy-namespaces
on it to something falsy, I get the result I want, that is, no unneeded namespace declaration.
Insofar as <xsl:mode on-no-match="shallow-copy"/>
has come to fill the role formerly occupied by the identity template, would it be reasonable to allow it also to declare that unused namespaces should not be copied? If that request is reasonable, is it reasonable to think of it as a bug fix, rather than a new-feature request?
Issue #1723 created #created-1723
`ThenAction` left over after removal of `BracedActions`
Thanks for fixing the IfExpr
ambiguity.
In #1712, BracedAction
was introduced to replace the previous rules for braced actions. Of these, ThenAction
still appears in the EBNF summary, but it is no longer referenced.
Issue #1651 closed #closed-1651
Ordered Maps: maps that retain insertion order
Issue #1703 closed #closed-1703
1651 ordered maps
Issue #1709 closed #closed-1709
Extend diagram of item types to include record types etc
Pull request #1722 created #created-1722
1717 define focus functions using pipeline operator
Fix #1717
Provides a formal definition of focus functions making use of the new pipeline operator.
Pull request #1721 created #created-1721
1713 Revise code for generating production rules
The main change here is to change the way "scraps" are expanded: these are the local collections of production rules that appear inline within the spec. These are now driven by a single prodrecap
element naming the rule to be expanded, and the logic is now automated for deciding (a) which subsidiary production rules to include in the scrap, and (b) which occurrence of a production rule to use as the target for a hyperlinked reference to that rule, depending on where the reference appears.
Along with this there has been a fair bit of deletion of legacy code and general modernisation (e.g. using XSLT 2.0 and 3.0 constructs where appropriate).
Issue #1720 created #created-1720
Grammar overhaul
There is a lot of dead wood in the xpath-grammar.xml file. This issue is raised to capture some observations and suggestions about how it can be simplified.
- The DTD lists 19 attributes that can appear on g:token, and documents the meaning of 5 of them (very briefly). I suspect that many of the attributes are never used. Many of them were probably intended primarily for use by the JavaCC parser generator.
- As if that weren't enough, the grammar2spec stylesheet has logic that looks for additional attributes (an example is @alias-for) which are not even allowed by the DTD, let alone being in active use.
- The "if" logic to assign productions to different languages (xpath, xquery, XSLT patterns) is hard to maintain and could be automated: just search for productions that are reachable from the top-level production for each language. This could be done by a preprocessing stylesheet that generates a grammar file for each language.
- The switch into a precedence-based grammar for binary operators (g:exprProduction name="OperatorExpr") doesn't really help anyone. For generating production rules in the spec, it just complicates the generation logic. The same is true for anyone else writing applications that use the grammar as input. It doesn't really make life easier for maintainers of the grammar, because it means there is more to learn.
All the JavaCC machinery is still in the repo and I think it could probably go. Leaving stuff like that lying around makes things more difficult when you need to search filestore for references to things.
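The reachability search suggested in the third bullet could look roughly like this. It is a Python sketch over a toy grammar, not the preprocessing stylesheet itself; the dictionary layout and the production names in the example are assumptions made for illustration.

```python
from typing import Dict, List, Set

def reachable(grammar: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every production reachable from the given top-level production."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        name = stack.pop()
        # Skip productions already visited, and names that are not productions (tokens).
        if name in seen or name not in grammar:
            continue
        seen.add(name)
        stack.extend(grammar[name])
    return seen

# Toy grammar: each production maps to the names it references (hypothetical).
toy_grammar = {
    "XPath": ["Expr"],
    "Module": ["Prolog", "Expr"],
    "Prolog": [],
    "Expr": ["OrExpr"],
    "OrExpr": [],
}

xpath_rules = reachable(toy_grammar, "XPath")
xquery_rules = reachable(toy_grammar, "Module")
```

Running the search once per top-level production gives one rule set per language, which is what the hand-maintained "if" logic currently encodes.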
Issue #1719 created #created-1719
Purging dead build code
In the course of working on #1713 I've been exploring some dark corners of the build system. There's a lot of dead code. Some of it might come in useful in the future (e.g. code supporting XQuery Update) but most will be very hard to revive. For example there's a lot of grammar machinery which is there only to allow generation of a JavaCC parser.
The main purpose of this issue is to capture notes that might lead to some reduction of technical debt.
The gradle build is currently giving me
Warning: link-text-with-check was unable to make a link for $ref-id="doc-shared40-Prolog"
That message comes from xmlspec-override.xsl
. This stylesheet looks like dead code because it has lots of references to XPath30 and XQuery30. But it can't be completely dead if we're getting errors from it. It's imported from two places: xpath-functions-30.xsl in the F&O tree and shared.xsl in the xquery40 tree. The message comes from gradle task xquery_shared_html
. As far as I can see the build system is constructing an XPath specification, an XQuery specification, and a "shared" specification which is a union of the two. (It starts off "XQuery 4.0 and XPath 4.0 is an expression language that allows..."). Presumably this was intended to allow editors and WG members to review a single document rather than reviewing XPath and XQuery separately. But I don't think it's used today, I think we could kill it off.
shared.xsl is referenced only from build.gradle when building the shared specification.
xpath-functions-30.xsl doesn't appear to be referenced from anywhere, and it carries a comment saying
Created 17 Dec 2008 by MHK.
No longer used 16 Feb 2009?
In the short term I've deleted the code in xmlspec-override.xsl starting with the comment "Our inability to create a link for $ref-id may be a sign of something wrong, so...". This gets rid of the warning messages. In the longer term, subject to confirmation, I think we can delete the build targets associated with the "shared" language spec, and delete the stylesheets xmlspec-override.xsl, xpath-functions-30.xsl, and shared.xsl.
Issue #1718 created #created-1718
Ordered Maps: positions in callback functions
Now that maps have a defined order, we should add the position to HOF parameters in map functions (in alignment with sequence and array functions). Examples:
map:for-each(
$map as map(*),
$action as fn($key as xs:anyAtomicType, $value as item()*, $pos as xs:integer) as item()*
) as item()*
map:filter(
$map as map(*),
$predicate as fn($key as xs:anyAtomicType, $value as item()*, $pos as xs:integer) as xs:boolean?
) as map(*)
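To make the intent concrete, here is a small Python model of the two signatures above, using an insertion-ordered dict in place of an ordered map. The function names mirror map:for-each and map:filter, but the code is only a sketch of the proposed behaviour with 1-based positions.

```python
from typing import Any, Callable, Dict, Hashable, List

def map_for_each(
    m: Dict[Hashable, Any],
    action: Callable[[Hashable, Any, int], List[Any]],
) -> List[Any]:
    """Call action(key, value, position) for each entry and concatenate the results."""
    results: List[Any] = []
    for pos, (key, value) in enumerate(m.items(), start=1):
        results.extend(action(key, value, pos))
    return results

def map_filter(
    m: Dict[Hashable, Any],
    predicate: Callable[[Hashable, Any, int], bool],
) -> Dict[Hashable, Any]:
    """Keep the entries for which predicate(key, value, position) is true."""
    return {
        key: value
        for pos, (key, value) in enumerate(m.items(), start=1)
        if predicate(key, value, pos)
    }

entries = {"a": 10, "b": 20, "c": 30}
labels = map_for_each(entries, lambda k, v, pos: [f"{pos}:{k}"])
tail = map_filter(entries, lambda k, v, pos: pos > 1)
```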
QT4 CG meeting 106 draft minutes #minutes—01-21
Draft minutes published.
Issue #1706 closed #closed-1706
Ambiguous `if` syntax
Issue #1712 closed #closed-1712
1706 Drop "else if" and "else" clauses from braced conditionals
Issue #1685 closed #closed-1685
Pipeline Operator
Issue #1686 closed #closed-1686
1685 Pipeline Operator
Issue #1701 closed #closed-1701
Add dedication to MSM (action QT4CG-088-01)
Issue #1705 closed #closed-1705
fn:divide-decimals, fn:round: large precision values
Issue #1711 closed #closed-1711
1705 Say that max precision is implementation-defined
Issue #1710 closed #closed-1710
1709 Updated type diagrams
Issue #1606 closed #closed-1606
Drop named item types other than named record types
Issue #1494 closed #closed-1494
Records: Introduction?
Issue #1176 closed #closed-1176
Use fn:parse-uri to check whether a filepath is relative or absolute
Issue #1700 closed #closed-1700
Remove some dead .DS_Store files
Issue #1717 created #created-1717
Define focus functions in terms of the pipeline operator
Now that we have accepted the pipeline operator into the language, we can define the semantics of focus functions to take advantage of them, specifically, fn() { EXPR }
can be defined to be equivalent to fn($v) { $v -> EXPR }
where $v is an otherwise-unused variable name.
QT4 CG meeting 106 draft agenda #agenda-01-21
Draft agenda published.
Issue #1716 created #created-1716
Variable lookahead needed for `ArrowTarget`
The current grammar definition allows any QName
(via EQName
) as an ArrowStaticFunction
:
ArrowTarget
::= ArrowStaticFunction ArgumentList
| ArrowDynamicFunction PositionalArgumentList
ArrowStaticFunction
::= EQName
ArrowDynamicFunction
::= VarRef
| InlineFunctionExpr
| ParenthesizedExpr
This complicates the distinction of the static and dynamic variants of ArrowTarget
, as it cannot be done with a fixed number of lookahead tokens. E.g. in an expression starting like this
A => fn ( $A, $B, $C, (: ... :) $Z ) { } (
the distinction cannot be made before the left brace is seen. While constructing an LR parser, there is a shift-reduce conflict between shifting fn
as a keyword of an InlineFunctionExpr
, or reducing fn
to the QName
of EQName
.
This can easily be fixed by adding xgc: reserved-function-names
to ArrowStaticFunction
, which would also be consistent with other function calls in disallowing reserved function names:
ArrowStaticFunction
::= EQName
/* xgc: reserved-function-names */
But could not ArrowTarget
also be written like the following?
ArrowTarget
::= FunctionCall
| DynamicFunctionCall
In this case, the xgc: reserved-function-names
constraint would be inherited from FunctionCall
. It eliminates ArrowStaticFunction
and ArrowDynamicFunction
and at the same time lifts some restrictions imposed by the current ArrowTarget
. It does not cause any LALR(2) conflicts.
Issue #1715 created #created-1715
Array Lookups: partial removal of out-of-bounds checks
Various QT4 tests imply that the out-of-bounds check for arrays has been removed. An example:
<test-case name="UnaryLookup-005a">
<description>Integer subscript into an array: array index too low</description>
<created by="Michael Kay" on="2014-11-27"/>
<modified by="Michael Kay" on="2024-07-22" change="returns () in 4.0"/>
<dependency type="spec" value="XP40+ XQ40+"/>
<test>(['a', 'b'], ['c', 'd'])[ ?0 eq 'c']</test>
<result>
<assert-empty/>
</result>
</test-case>
I believe this is not reflected in the spec yet, or at least it includes examples that need to be updated:
[ "a", "b" ]?3 raises a dynamic error err:FOAY0001.
I guess that #832 would have been the PR with the relevant changes (we have already observed in another issue that some changes of this PR need to survive; see https://github.com/qt4cg/qtspecs/pull/1283#issuecomment-2568330191).
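The behaviour implied by the test case can be modelled in a few lines of Python. This is a sketch of the test's expectation only, not of the spec text; lookup and eq are helper names invented here, with sequences modelled as Python lists.

```python
from typing import Any, List

def lookup(array: List[Any], index: int) -> List[Any]:
    """Model array?index with 1-based positions; out of bounds yields an empty sequence."""
    if 1 <= index <= len(array):
        return [array[index - 1]]
    return []

def eq(left: List[Any], right: Any) -> bool:
    """Model 'eq' inside a predicate: an empty operand gives an empty result, treated as false."""
    return bool(left) and left[0] == right

arrays = [["a", "b"], ["c", "d"]]

# (['a', 'b'], ['c', 'd'])[ ?0 eq 'c' ] selects nothing, because ?0 is always empty.
selected_zero = [a for a in arrays if eq(lookup(a, 0), "c")]
# With a valid index the same predicate selects the second array.
selected_one = [a for a in arrays if eq(lookup(a, 1), "c")]
```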
Edit (2025-05-26): Outdated:
That leads me to the original reason for creating this issue:
- I think it’s a good idea to drop the range check for array lookups, and it would seem consistent to me to also drop it for dynamic function calls.
- As map/array lookups and dynamic function calls are often used interchangeably, $array?0 and $array(0) should behave identically.
- The FOAY0001 error would (and should) still be raised by the array functions, including array:get, array:put, array:remove, or array:insert-before.
Issue #1714 created #created-1714
sibling:: axis. Action Item QT4CG-097-03
This issue is a reflection of the following Action Item:
QT4CG-097-03: DN to proposal an axis for accessing the siblings of a node.
I have prepared a PDF file that contains the relevant updated sections from the "XPath 4.0" document:
- There are no deletions or conflicting changes.
- The additions to the text are highlighted in turquoise.
- The file that contains all relevant updated sections of the document is at: https://github.com/dnovatchev/qtspecs/blob/dn-siblings/sibling-axis.pdf
If the above doesn't work, please try: https://github.com/dnovatchev/MathPuzzles/blob/master/sibling-axis.pdf
Issue #1713 created #created-1713
Patchy exposition of XSLT type pattern syntax
In XSLT §5.4.2.2 Type Patterns, the exposition of the grammar is "patchy" - it includes some production rules such as FieldDeclaration
that are in the subtree of the main production rule (TypePattern
) without giving all the intermediate rules that connect this rule to the root.
It's easy enough to correct this by hand, but it would be nice to prevent this happening by automating the generation of these families of grammar rules, perhaps by including all rules in the subtree up to a depth of 3, say. It would also be nice to simply list the productions to be included without having to decide manually which of them should be the principal target of termref
references (by being marked with an ID).
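One way to automate the selection is a breadth-first walk from the principal rule with a depth limit. The Python sketch below shows the idea on a toy grammar; apart from TypePattern and FieldDeclaration, the rule names are placeholders, and the depth of 3 is just the figure floated above.

```python
from collections import deque
from typing import Dict, List

def rules_within_depth(
    grammar: Dict[str, List[str]], root: str, max_depth: int
) -> Dict[str, int]:
    """Map each rule reachable from root within max_depth steps to its depth."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        name = queue.popleft()
        if depth[name] == max_depth:
            continue  # do not expand rules that sit at the depth limit
        for child in grammar.get(name, []):
            # Only follow names that are productions and have not been seen yet.
            if child in grammar and child not in depth:
                depth[child] = depth[name] + 1
                queue.append(child)
    return depth

# Toy grammar: "Step1", "Step2" and "Detail" are placeholder rule names.
toy_grammar = {
    "TypePattern": ["Step1"],
    "Step1": ["Step2"],
    "Step2": ["FieldDeclaration"],
    "FieldDeclaration": ["Detail"],
    "Detail": [],
}

scrap = rules_within_depth(toy_grammar, "TypePattern", 3)
```

Because the walk is breadth-first, every intermediate rule between the root and an included rule is itself included, which avoids the "patchy" effect.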
Pull request #1712 created #created-1712
1706 Drop "else if" and "else" clauses from braced conditionals
Fix #1706
Pull request #1711 created #created-1711
1705 Say that max precision is implementation-defined
Applies to fn:round, fn:round-half-to-even, fn:divide-decimals
Fix #1705
Pull request #1710 created #created-1710
1709 Updated type diagrams
Added a few details to the type diagrams: user-defined array, map, and record types; enumeration types; untypedAtomic
Issue #1709 created #created-1709
Extend diagram of item types to include record types etc
I propose to extend the diagram of item types (common to DM and FO) to include more detail of the hierarchy below function types.
Issue #1617 closed #closed-1617
1606 Drop named item types, refine named record types, esp in XSLT
Pull request #1708 created #created-1708
1485 Add xsl:record-type declaration
Adds named record types to XSLT, with much the same spec as for XQuery, but some extra tweaks for handling visibility and overriding.
Fix #1485
Issue #1707 closed #closed-1707
Fix bug in build dependencies
Pull request #1707 created #created-1707
Fix bug in build dependencies
Changing xslt.xml
didn't actually cause the HTML for the XSLT specification to be rebuilt. 👎
Issue #1706 created #created-1706
Ambiguous `if` syntax
The optional else
in a braced if
expression introduces an ambiguity in the XQuery 4.0 grammar.
Here is an example of an ambiguous expression:
if (A) then if (B) {C} else if (D) {E} else if (F) {G} else {H}
It can be parsed like this
if (A) then if (B) {C} else {}
else if (D) {E} else if (F) {G} else {H}
but also like the following
if (A) then if (B) {C} else if (D) {E} else {}
else if (F) {G} else {H}
The corresponding part of the grammar is
IfExpr ::= 'if' '(' Expr ')' ( UnbracedActions | BracedActions )
UnbracedActions
::= 'then' ExprSingle 'else' ExprSingle
BracedActions
::= ThenAction ElseIfAction* ElseAction?
ThenAction
::= EnclosedExpr
ElseIfAction
::= 'else' 'if' '(' Expr ')' EnclosedExpr
ElseAction
::= 'else' EnclosedExpr
The ambiguity could be resolved by making the ElseAction
in BracedActions
mandatory, i.e.:
BracedActions
::= ThenAction ElseIfAction* ElseAction
Issue #1705 created #created-1705
fn:divide-decimals, fn:round: large precision values
We may need to specify what is going to happen if very large (positive and negative) precisions are specified:
divide-decimals(1, 1, 0x7FFFFFFF)
A simple implementation in Java to compute the quotient for this function returns an Overflow exception:
BigDecimal.ONE.divide(BigDecimal.ONE, 0x7FFFFFFF, RoundingMode.DOWN)
This also affects fn:round
: The query round(1, -0x80000000)
seems to behave unexpectedly in existing implementations.
In general, the computation gets very slow for large precision values, and it may not be simple to interrupt such low-level operations, so maybe (if it makes sense, I haven’t really thought about it) we could define precision limits.
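A precision limit could be enforced before any arithmetic starts. The Python sketch below uses exact rational arithmetic and rejects precisions beyond a cap; MAX_PRECISION is a made-up value, and truncating the quotient toward zero is an assumption of the sketch rather than a statement of what the spec requires.

```python
import math
from fractions import Fraction
from typing import Tuple

MAX_PRECISION = 10_000  # hypothetical implementation-defined limit

def divide_decimals(value, divisor, precision: int = 0) -> Tuple[Fraction, Fraction]:
    """Return (quotient, remainder) with the quotient truncated to `precision` decimal digits."""
    if abs(precision) > MAX_PRECISION:
        # Fail fast instead of starting a computation that may never finish.
        raise ValueError(f"precision {precision} exceeds the supported range")
    a, b = Fraction(value), Fraction(divisor)
    scale = Fraction(10) ** precision          # also works for negative precision
    quotient = Fraction(math.trunc(a / b * scale)) / scale
    remainder = a - quotient * b
    return quotient, remainder

quotient, remainder = divide_decimals(10, 3, 2)

try:
    divide_decimals(1, 1, 0x7FFFFFFF)
    rejected = False
except ValueError:
    rejected = True
```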
Issue #1704 created #created-1704
Ignore the byte order mark more completely/globally
Following on a discussion with @line-o on the XML.com Slack, I took a peek at the way we deal with the byte order mark in Functions and Operators. We seem to be explicit about it in a couple of JSON functions but not elsewhere. I think we should assert that the byte order mark is explicitly ignored in all of the input functions (json-*, parse-*, unparsed-* etc.)
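In implementation terms, ignoring the byte order mark amounts to dropping a leading U+FEFF before the input reaches the parser. A minimal Python sketch follows; the helper names are invented here, and json.loads merely stands in for whichever input function is being called.

```python
import json
from typing import Any

BOM = "\ufeff"  # U+FEFF, the byte order mark as it appears after decoding

def strip_bom(text: str) -> str:
    """Drop a single leading byte order mark, leaving all other text untouched."""
    return text[1:] if text.startswith(BOM) else text

def parse_json_ignoring_bom(text: str) -> Any:
    """Parse JSON text, treating a leading byte order mark as if it were absent."""
    return json.loads(strip_bom(text))

with_bom = parse_json_ignoring_bom(BOM + '{"a": 1}')
without_bom = parse_json_ignoring_bom('{"a": 1}')
```

Only a mark at the very start is removed; a U+FEFF elsewhere in the text is ordinary content and is left alone.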
Issue #1136 closed #closed-1136
Defining names for parameters on typed function tests
Issue #1696 closed #closed-1696
1136 Optional names in typed function types
Issue #1688 closed #closed-1688
In rendered HTML, link to definition is missing its link text
Pull request #1703 created #created-1703
1651 ordered maps
Reopened pull request introducing ordered maps.
Fix #1651.
Issue #1609 closed #closed-1609
1651 Ordered Maps
QT4 CG meeting 105 draft minutes #minutes—01-14
Draft minutes published.
Issue #1632 closed #closed-1632
Add xsl:map/@select
Issue #1694 closed #closed-1694
1632 Add xsl:map/@select
Issue #1684 closed #closed-1684
[XSLT] Composite merge keys
Issue #1689 closed #closed-1689
1684 Composite merge keys; current-merge-key-array function
Issue #1680 closed #closed-1680
Ambiguous `switch` syntax
Issue #1692 closed #closed-1692
1680 Fix switch syntax ambiguity
Issue #1672 closed #closed-1672
array:values, map:values: Alternatives
Issue #1687 closed #closed-1687
1672 array:values, map:values: Alternatives
Issue #1006 closed #closed-1006
regular expression addition - word boundaries
Issue #490 closed #closed-490
Control over schema validation in parse-xml(), doc(), etc.
Issue #108 closed #closed-108
Template match using values of [tunnel] parameters
Issue #1284 closed #closed-1284
Build issue: Unsupported specref to [streamability-fn-distinct-ordered-nodes]
Issue #1695 closed #closed-1695
1284 Define streamability of distinct-ordered-nodes
Issue #1693 closed #closed-1693
1683 Extend xpath-functions schema with CSV components
Issue #1690 closed #closed-1690
1688 In "implementation-defined" appendix, fix absent generated link
Issue #1702 created #created-1702
Node Updates: Functions
In #1225, I have summarized some thoughts on generalizing updates for both nodes and structured items (maps/arrays).
XQuery Update is complex, as updates are in general, so we may still decide that it is too ambitious to introduce update features in the core language. If we want to give it a try, we could offer functions that are based on XQUF, but that only perform one update operation at a time on a given input. This way, we could ignore the sophisticated Pending Update List semantics, which is only important when multiple updating expressions are specified and need to be checked and brought into order.
A function set that provides an equivalent functionality for all XQUF update operations could look as follows (the presented functions are valid XQuery Update code):
declare namespace update = 'http://www.w3.org/TR/xquery-update';
declare function update:delete(
$node as node(),
$path as fn(node()) as node()*
) as node() {
copy $c := $node
modify delete node $path($c)
return $c
};
declare function update:rename(
$node as node(),
$path as fn(node()) as node()*,
$name as (xs:QName | xs:NCName | fn(node(), xs:integer) as (xs:QName | xs:NCName))
) as node() {
copy $c := $node
modify (
for $target at $pos in $path($c)
let $result := if($name instance of fn(*)) {
$name($target, $pos)
} else {
$name
}
return rename node $target as $result
)
return $c
};
declare function update:replace(
$node as node(),
$path as fn(node()) as node()*,
$contents as (node() | xs:anyAtomicType | fn(node(), xs:integer) as node()*)*,
$options as record(value? as xs:boolean)? := {}
) as node() {
copy $c := $node
modify (
for $target at $pos in $path($c)
let $result := (
for $content in $contents
return if($content instance of fn(*)) {
$content($target, $pos)
} else {
$content
}
)
return if($options?value) {
replace value of node $target with $result
} else {
replace node $target with $result
}
)
return $c
};
declare function update:insert(
$node as node(),
$path as fn(node()) as node()*,
$contents as (node() | xs:anyAtomicType | fn(node(), xs:integer) as (node() | xs:anyAtomicType))*,
$options as record(position? as enum('last', 'first', 'before', 'after'))? := {}
) as node() {
copy $c := $node
modify (
for $target at $pos in $path($c)
let $result := (
for $content in $contents
return if($content instance of fn(*)) {
$content($target, $pos)
} else {
$content
}
)
return switch($options?position) {
case 'before' return insert node $result before $target
case 'after' return insert node $result after $target
case 'first' return insert node $result as first into $target
default return insert node $result as last into $target
}
)
return $c
};
Here are some exemplary function calls:
let $node := <xml><e/><e/></xml>
return (
(: deletes all <e/> child nodes :)
update:delete($node, fn { e }),
(: renames the <e/> child nodes to <f/> :)
update:rename($node, fn { e }, 'f'),
(: replaces the <e/> child nodes with <replaced/> :)
update:replace($node, fn { e }, <replaced/>),
(: replaces the string value of the <e/> child nodes with 'text' :)
update:replace($node, fn { e }, 'text', { 'value': true() }),
(: inserts a 'text' text node into the <e/> child nodes :)
update:insert($node, fn { e }, 'text'),
(: inserts 'text1' and 'text2' text nodes into the <e/> child nodes :)
update:insert($node, fn { e }, fn($node, $pos) { 'text' || $pos }),
(: inserts an <x/> element after each <e/> child node :)
update:insert($node, fn { e }, <x/>, { 'position': 'after' })
)
Multiple update operations can easily be chained:
(: rename <e/> child nodes to <f/>, insert 'x' text nodes :)
<xml><e/><e/></xml>
=> update:rename(fn { e }, 'f')
=> update:insert(fn { f }, 'x')
Ideally, we could offer a similar function set (or maybe even the same) for maps and arrays in a next step (see #77). The map/array syntax would be similar for deletions…
let $data := { 'a': [ 1, 2, 3 ] }
return update:delete($data, fn { ?a?2 })
…but it certainly gets trickier for other operations.
If some of you believe that the presented approach is something that we should pursue, I will be happy to add details. As an alternative, we could pursue the XQUF light approach that I have sketched in #1225, based on the existing XQUF update keywords.
Yet another solution could be to stick with what we have, but add map/array update features to XQUF.
Pull request #1701 created #created-1701
Add dedication to MSM (action QT4CG-088-01)
I've had this action on my plate for a while. Having written a dedication, there's a follow-up question of where to put it. Having it in only one specification isn't wrong, but it seems slightly odd given that MSM contributed to them all. In the end, I decided to put a full dedication in the XPath specification and link to it from the others.
My rationale for the XPath spec is that it's probably one that everyone reads. Another possibility was the Data Model as it's "foundational" but I think it's less read than XPath.
The published PR won't be right because there are tooling changes required. I've attached a couple of screenshots, one of the full dedication in XPath:
And another of the link from the other specs (from XSLT, I think, but they're all the same).
Pull request #1700 created #created-1700
Remove some dead .DS_Store files
I'm not sure how these got checked in...
QT4 CG meeting 105 draft agenda #agenda-01-14
Draft agenda published.
Issue #1699 created #created-1699
XPath function to calculate edit distance between two strings
I propose a new XPath function to calculate the edit distance between two strings. It could use a specific algorithm, for example fn:levenshtein-distance(s1,s2)
.
The function could also be designed more generically, like fn:edit-distance(s1, s2, algorithm)
where algorithm could be levenshtein, hamming, lcs ... (see edit distance).
Use Case: Schematron Quick Fix when checking glossentry
elements against terms defined in glossary
. "Your term is not defined in Glossary, did you mean ...".
Thanks, Frank
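For reference, the Levenshtein variant is short to implement. The Python sketch below shows the standard dynamic-programming algorithm and how it could drive a "did you mean" suggestion; the function names and the sample glossary are illustrative only and say nothing about how the proposed XPath function would be specified.

```python
from typing import List

def levenshtein_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(s2) + 1))  # distances from the empty prefix of s1
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]

def did_you_mean(term: str, glossary: List[str]) -> str:
    """Return the glossary term closest to the given term."""
    return min(glossary, key=lambda entry: levenshtein_distance(term, entry))

suggestion = did_you_mean("stylsheet", ["stylesheet", "schema", "sequence"])
```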
Issue #1407 closed #closed-1407
Improve the spec prose and table of content layout for types
Issue #1698 created #created-1698
Allow select attribute for xsl:call-template instruction
The lack of the following feature is something that bothers me from time to time. I hope this is the right place here for my proposal. And even though I did some search -- I am not sure if something similar was discussed before ...
I propose to allow a select
attribute for xsl:call-template
instructions. When the select
attribute is set, then the named template is called for each selected item as context item.
When the empty sequence is selected, the template is not invoked.
When the select
attribute is omitted, then the instruction works as before (Invoked once and "[...] does not change focus [...]").
For extension instructions from named templates: May work the same with a prefixed attribute (e.g. xsl:select
).
Benefits I see:
- Change the context without xsl:apply-templates
- Avoid template parameter with current() as default value (annoying when you have nested named template calls)
- Avoid xsl:for-each workaround where context just must be adjusted for a single item (no such parameter available, see before)
- Save an xsl:for-each instruction with this shorter form
- Harmonize xsl:call-template with xsl:apply-templates concept a little bit
Simple example:
<xsl:template match="elem">
<xsl:call-template name="t:make-something" select="child-elem"/>
<!-- ... or as extension instruction: -->
<t:make-something xsl:select="child-elem"/>
</xsl:template>
<xsl:template name="t:make-something">
<xsl:context-item use="required" as="element(child-elem)"/>
<!-- ... -->
</xsl:template>
The call of t:make-something
above is equivalent to:
<xsl:template match="elem">
  <xsl:for-each select="child-elem">
    <xsl:call-template name="t:make-something"/>
  </xsl:for-each>
</xsl:template>
Issue #1675 closed #closed-1675
CSV parsing
Issue #1677 closed #closed-1677
1675 Fixes for CSV parsing
Issue #1673 closed #closed-1673
1407 TOC structure for types
Issue #1681 closed #closed-1681
Δ in the table of contents
Issue #1691 closed #closed-1691
1681 - Delta marker in TOC
Issue #1697 created #created-1697
Add documentary names to callback function signatures
If PR #1696 is accepted we can add documentary names to the parameters of callback function signatures, for example fn:filter can become
fn:filter(
  $input as item()*,
  $predicate as fn($item as item(), $position as xs:integer) as xs:boolean?
) as item()*
and we can (if we need to) use the parameter names in the prose.
Pull request #1696 created #created-1696
1136 Optional names in typed function types
Fix #1136
Pull request #1695 created #created-1695
1284 Define streamability of distinct-ordered-nodes
Fix #1284
Issue #1610 closed #closed-1610
Some cross references are incorrect
Issue #1683 closed #closed-1683
There are validity errors in the function catalog related to csv elements
Pull request #1694 created #created-1694
1632 Add xsl:map/@select
Fix #1632
Pull request #1693 created #created-1693
1683 Extend xpath-functions schema with CSV components
This was an unsuccessful attempt to fix issue #1683, but the change is still worth making. It extends the aggregated schema for the XPath functions namespace to include definitions for the result of the csv-to-xml function.
Pull request #1692 created #created-1692
1680 Fix switch syntax ambiguity
Fix #1680 (as suggested in the issue)
Pull request #1691 created #created-1691
1681 - Delta marker in TOC
Fix #1681
Pull request #1690 created #created-1690
1688 In "implementation-defined" appendix, fix absent generated link
For F&O the automatically-generated appendix of implementation-defined items should link each such item to the nearest containing section that has a head child as well as an id attribute.
Pull request #1689 created #created-1689
1684 Composite merge keys; current-merge-key-array function
Acknowledges that as a result of changes to xsl:sort, xsl:merge now accepts composite merge keys; introduces the current-merge-key-array() function to handle them.
Fix #1684
Issue #1688 created #created-1688
In rendered HTML, link to definition is missing its link text
https://qt4cg.org/specifications/xpath-functions-40/Overview.html#impl-def Item 6 contains a sentence that renders as "See ." In the raw HTML, there is a link <a href="#dt-nondeterministic-wrt-ordering"></a> with no link text.
I thought I might make this issue a little more substantive by reporting a second typo or broken link, but I can't find a second one at the moment. :)
Pull request #1687 created #created-1687
1672 array:values, map:values: Alternatives
Issue: #1672
Pull request #1686 created #created-1686
1685 Pipeline Operator
Issue: #1685
The PR introduces the pipeline operator ->. If we decide to add it, we could drop =!> in a second step and update various examples in the text.
Issue #1685 created #created-1685
Pipeline Operator
This is an attempt to find a solution for the discussion in #755, which was originally about defining an expression to bind the context value. It serves as a summary for an upcoming PR.
We have two operators in the language that can be used for pipelining:
1. With the simple map operator !, single items of an input can be bound to the context value.
2. With the arrow operator =>, an input can be bound as first argument in a function call.
The current restrictions are: A) There is no way to bind a sequence with 0 or more than 1 items to the context value. B) We can only bind the input to the first function argument.
In addition, we have introduced the mapping arrow expression =!> to bind single items of an input to the first function argument.
We could generalize and simplify the situation by introducing a dedicated and very basic pipeline operator: A -> B evaluates A to a value, which is bound to the context value before evaluating B.
With the operator, restriction A) would be resolved. Restriction B) would be tackled indirectly, as -> and ! can often be combined. For example, the following examples from the specification could be simplified…
(: current vs. simplified syntax :)
$s => tokenize() =!> fn { `"{.}"` }()
$s -> tokenize(.) ! `"{.}"`
(: current vs. simplified syntax :)
(1 to 5) =!> xs:double() =!> math:sqrt() =!> fn($a) { $a + 1 }() => sum()
(1 to 5) ! xs:double(.) ! math:sqrt(.) ! (. + 1) -> sum(.)
…and we could drop =!> in favor of the new operator.
An equivalent representation for the focus function fn { E } would be fn($c) { $c -> E }.
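The difference between the three operators can be modelled outside XPath. The following Python sketch is an illustration only and not part of the proposal: sequences are modelled as lists, the context value is passed as an explicit argument, and the helper names simple_map, arrow and pipeline are hypothetical.

```python
import math

def simple_map(seq, body):
    """Model of A ! B: body is evaluated once per item, with that item as context."""
    out = []
    for item in seq:
        out.extend(body(item))
    return out

def arrow(seq, func, *args):
    """Model of A => f(args): the whole sequence becomes the first argument of f."""
    return func(seq, *args)

def pipeline(seq, body):
    """Model of the proposed A -> B: the whole sequence is the context, body runs once."""
    return body(seq)

# Model of: (1 to 5) ! xs:double(.) ! math:sqrt(.) ! (. + 1) -> sum(.)
items = list(range(1, 6))
doubles = simple_map(items, lambda x: [float(x)])
roots = simple_map(doubles, lambda x: [math.sqrt(x)])
plus_one = simple_map(roots, lambda x: [x + 1])
total = pipeline(plus_one, lambda ctx: [sum(ctx)])

# Restriction A): with ! an empty input never evaluates the body,
# whereas -> still evaluates it once with the empty sequence as context.
mapped_empty = simple_map([], lambda x: ["evaluated"])
piped_empty = pipeline([], lambda ctx: [len(ctx)])
```

In this model, the per-item steps use simple_map and only the final aggregation needs pipeline, which mirrors how the simplified example combines ! and ->.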
Issue #1684 created #created-1684
[XSLT] Composite merge keys
The changes in PR #1674 to allow composite sort keys automatically propagate to xsl:merge, because the semantics of xsl:merge-key are defined entirely by reference to xsl:sort.
No immediate problem, except (1) we should acknowledge the fact and point out that composite merge keys are now allowed, and (2) we need to consider the effect on the current-merge-key() function. Its result is the sequence-concatenation of the merge keys for multiple merge sources. The spec says:
the [current merge key] will be a single atomic item if there is a single merge key, or a sequence of atomic items if there are multiple merge keys.
Actually I think that's already wrong, because it forgets that an individual merge key may be an empty sequence. If that happens then the current-merge-key() function is somewhat useless. I suggest we simply document the fact: if there are multiple merge sources generating multiple merge keys and they are not all singletons, then the sequence concatenation of the merge keys may not be especially useful.
We could provide a variant current-merge-key-array() that returns an array of sort key values, one for each xsl:merge-key element, each one being a sequence of atomic items.
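Why the flat concatenation loses information can be seen in a small model. This Python sketch is an illustration under assumptions, not the specified behaviour: each merge key value is modelled as a list of atomic values (one list per xsl:merge-key element), and both function names are stand-ins for the XSLT functions discussed above.

```python
def current_merge_key(keys):
    """Model of the existing function: sequence-concatenation of all key values."""
    flat = []
    for key in keys:
        flat.extend(key)
    return flat

def current_merge_key_array(keys):
    """Model of the suggested variant: one member per merge key, each a sequence."""
    return [list(key) for key in keys]

# Two different composite keys, each with one empty component.
keys_a = [["Smith"], [], ["London"]]
keys_b = [["Smith"], ["London"], []]

flat_a = current_merge_key(keys_a)
flat_b = current_merge_key(keys_b)
array_a = current_merge_key_array(keys_a)
array_b = current_merge_key_array(keys_b)
```

The flat results are identical for both inputs, so the position of the empty key cannot be recovered, while the array form keeps the two cases distinct.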
QT4 CG meeting 104 draft minutes #minutes—01-07
Draft minutes published.
Issue #1261 closed #closed-1261
Add decimal-divide function
Issue #1671 closed #closed-1671
1261 New fn:divide-decimals() function
Issue #1662 closed #closed-1662
xsl:sort - add composite sort keys
Issue #1674 closed #closed-1674
1662 Allow composite sort keys in xsl:sort
Issue #1621 closed #closed-1621
compare() with collations that do not support ordering
Issue #1676 closed #closed-1676
1621 Capabilities of Collations
Issue #1678 closed #closed-1678
Semantics of element(N, T) where T is a union type
Issue #1679 closed #closed-1679
1678 Define element(E,T) and attribute(A,T) in terms of "derives-from"
Issue #1670 closed #closed-1670
Action QT4CS-097-02: Enable xtermref links to XSD SCM property names
Issue #1667 closed #closed-1667
Invalid XML characters in JSON input
Issue #1669 closed #closed-1669
1667 Revise handling of non-XML characters in parse-json
Issue #1668 closed #closed-1668
Minor copy edits (no issue raised)
Issue #1649 closed #closed-1649
Result type of fn:function-annotations()
Issue #1666 closed #closed-1666
1649 result of function annotations
Issue #1650 closed #closed-1650
fn:node-kind, fn:type-of: Editorial
Issue #1665 closed #closed-1665
1650 Tidy up fn:type-of
Issue #1663 closed #closed-1663
Remove DTD/stylesheet distractions at the top of the schema
QT4 CG meeting 104 draft agenda #agenda-01-07
Draft agenda published.
Issue #1683 created #created-1683
There are validity errors in the function catalog related to csv elements
The build reports:
Processing file:/Volumes/Saxonica/src/qt4cg/qtspecs/specifications/xpath-functions-40/src/function-catalog.xml
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/Volumes/Saxonica/src/qt4cg/qtspecs/specifications/xpath-functions-40/src/function-catalog.xml using class net.sf.saxon.tree.tiny.TinyBuilder
Tree built in 215.684667ms
Tree size: 38655 nodes, 773635 characters, 7637 attributes
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 3 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-002
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 5 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-003
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 3 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-004
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 8 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-005
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 16 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-006
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 24 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-007
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 37 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-008
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 40 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-009
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Error on line 1 column 53 of generate-qt3-test-set.xsl:
XTTE1512 Cannot validate <Q{.../xpath-functions}csv>: no element declaration available
Error on line 36 column 7 of generate-qt3-test-set.xsl:
XTTE1512 One validation error was reported: Cannot validate <Q{.../xpath-functions}csv>:
no element declaration available
** Failure in parse-xml on fos:result of csv-to-xml-010
Execution time: 532.103792ms
Memory used: 98Mb
Issue #1682 created #created-1682
Type Promotion
The description of type promotion in Appendix B.1 has become outdated.
Firstly, the coercion rules no longer invoke type promotion; instead, they use a custom table of "implicit casts". So B.1 is wrong to say that type promotion is invoked by the coercion rules.
Secondly, for selecting an entry in the operator mapping table, I don't think type promotion comes into play.
- The rules for value comparisons do all the necessary type conversions of operands BEFORE invoking a search of the operator mapping table.
- The rules for arithmetic operators don't require any type promotion: for numerics, they invoke a function such as op:numeric-add, and it is the definition of this function (not the selection of the function in the mapping table) that invokes type promotion.
The statement in B.1 that "If the result type of an operator is listed as numeric, it means 'the first type in the ordered list (xs:integer, xs:decimal, xs:float, xs:double) into which all operands can be converted by [subtype substitution] and [type promotion]'" seems wrong in general: for example it doesn't cover integer div integer. The result type is actually defined by the rules of the selected function, e.g. op:numeric-divide, and not by the "result type" column of the operator mapping table. Perhaps it should say that the result type is a subtype of numeric, as defined by the particular function.
References to Type Promotion in F&O section 1.6 are also outdated.
The sum() and avg() functions invoke "numeric promotion" to convert all values in the input to a common type, but the rules for doing this aren't entirely clear. For example, the equivalent expression given for sum() doesn't do what the prose says: given sum() applied to a sequence (X as decimal, Y as decimal, Z as float), the prose says the result is float(X) + float(Y) + Z, whereas the equivalent expression gives float(decimal(X + Y)) + Z, which is not necessarily the same thing.
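The two readings of sum() can give different answers. The Python sketch below is a model under assumptions rather than a statement of the XPath rules: xs:decimal is modelled with decimal.Decimal, xs:float with IEEE binary32 emulated by rounding through the struct module, and the values of X, Y and Z are chosen purely for illustration.

```python
import struct
from decimal import Decimal

def f32(value):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]

def to_float(dec):
    """Model of casting an xs:decimal to xs:float."""
    return f32(float(dec))

def add_f32(a, b):
    """Model of xs:float addition: add, then round the result to binary32."""
    return f32(a + b)

x = Decimal("16777216.5")   # not representable in binary32
y = Decimal("-16777216")
z = f32(1.0)

# Reading in the prose: promote every item first, then add.
as_prose = add_f32(add_f32(to_float(x), to_float(y)), z)

# Reading in the equivalent expression: add the decimals exactly, then promote.
as_expression = add_f32(to_float(x + y), z)
```

Here the fractional part of X is lost when X is promoted on its own, but survives when X + Y is computed in decimal arithmetic first, so the two readings disagree.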
Issue #1681 created #created-1681
Δ in the table of contents
All the specs say in their first changes section:
Sections with significant changes are marked Δ in the table of contents.
However, these markers are present only in the F&O specification.
Issue #1680 created #created-1680
Ambiguous `switch` syntax
Unless I am overlooking some constraint preventing this, an ambiguity has been introduced to the XQuery 4.0 grammar by allowing the SwitchComparand to be omitted per #671/#678.
Here is an example of an ambiguous expression:
switch case A return switch case B return switch case C return D default return E default return F
It can be parsed along the lines of
switch
  case A return SWITCH
  case B return switch
    case C return D
    default return E
  default return F
but also like the following
switch
  case A return switch
    case B return SWITCH
    case C return D
    default return E
  default return F
Pull request #1679 created #created-1679
1678 Define element(E,T) and attribute(A,T) in terms of "derives-from"
Fix #1678
Issue #1678 created #created-1678
Semantics of element(N, T) where T is a union type
The semantics of element(N, T) say that to get a match, the type annotation A of the element must be derived from T by restriction. This means you will never get a match if T is a union type.
Furthermore, if T is a complex type, there is no match if the type annotation is a complex type derived by extension from T.
I think this is a simple error in the spec. It should say that derived-from(A, T) must be true. The derived-from() relationship handles union types and derivation by extension correctly.
We do use derived-from when specifying subtyping. This means that an element E can be an instance of element(E, xs:integer), and not be an instance of element(E, xs:numeric), even though element(E, xs:integer) is a subtype of element(E, xs:numeric).
The error seems to have crept in when the rules were redrafted for 4.0. Up to and including 3.1, the semantics of ElementTest and AttributeTest reference the derived-from() function.
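The difference between the two rules can be shown on a toy type hierarchy. This Python sketch is a simplified model under assumptions, not the normative definition: the tables cover only a handful of types, the my:base and my:extended names are hypothetical, and the restriction-only check is assumed to accept the case where A and T are the same type.

```python
# Toy schema: type name -> (base type, derivation method).
BASE = {
    "xs:integer": ("xs:decimal", "restriction"),
    "xs:decimal": ("xs:anyAtomicType", "restriction"),
    "my:extended": ("my:base", "extension"),   # hypothetical complex types
}

# Union types and their member types.
UNION_MEMBERS = {"xs:numeric": ["xs:double", "xs:float", "xs:decimal"]}

def derives_by_restriction(a, t):
    """Model of the wording criticised in the issue: only restriction steps count."""
    while a != t:
        if a not in BASE:
            return False
        a, method = BASE[a]
        if method != "restriction":
            return False
    return True

def derived_from(a, t):
    """Model of derived-from(A, T): also follows extension steps and union membership."""
    if a == t:
        return True
    if any(derived_from(a, member) for member in UNION_MEMBERS.get(t, [])):
        return True
    if a in BASE:
        return derived_from(BASE[a][0], t)
    return False
```

In this model, derives_by_restriction("xs:integer", "xs:numeric") is false while derived_from("xs:integer", "xs:numeric") is true, and the same split appears for the extension case with my:extended and my:base.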
Pull request #1677 created #created-1677
1675 Fixes for CSV parsing
Fix #1675
Pull request #1676 created #created-1676
1621 Capabilities of Collations
Fix #1621
This PR is largely editorial, except that it makes a substantive change to the fn:collation-available function.
Issue #1675 created #created-1675
CSV parsing
Pull request #1674 created #created-1674
1662 Allow composite sort keys in xsl:sort
Fix #1662
Pull request #1673 created #created-1673
1407 TOC structure for types
Addresses part of #1407:
- Improves the section headings and levels for the Types and Subtyping sections
- Level-4 headings (and level-5 if there were any) are no longer omitted from the F&O TOC.
There are other suggestions in #1407 regarding the spec prose that are not (yet) implemented.
Changing the CSS to adjust presentation of level-4 and level-5 headings in the TOC is way above my level of CSS competence; there's some very elaborate logic in this area, and anyone who wants to tackle it is welcome.
Issue #1672 created #created-1672
array:values, map:values: Alternatives
We still have array:values and map:values in the spec, even though the names were considered suboptimal: When retrieving values of struct(ure(d item))s, one would expect to get not a flat, but a structured result.
A while ago, the items key specifier was introduced to mimic the classical wildcard lookup syntax (making $A?* and $A?items::* equivalent), and I suggest renaming our functions to array:items and map:items:
$map?* ≍ map:items($map)
$array?* ≍ array:items($array)
Plan B could be to extend the second argument of map:get (and array:get) to also accept predicate functions…
map:get(
$map as map(*),
$key as (xs:anyAtomicType|fn(xs:anyAtomicType) as xs:boolean?)
) as item()*
…which would allow us to write:
$map?a
≍ $map => map:get('a')
≍ $map => map:get(fn { . = 'a' })
$map?(1 to 5)
≍ $map => map:get(fn { . = 1 to 5 })
$map?*
≍ $map => map:get(true#0)
(: and things like :)
$map => map:get(fn { . mod 2 = 1 })
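The Plan B idea can be modelled in a few lines. This Python sketch is an illustration under assumptions, not the proposed function: maps are modelled as dicts whose values are lists (sequences), map_get is a hypothetical stand-in for the extended map:get, and a Python callable plays the role of the predicate function.

```python
def map_get(mapping, key):
    """Model of the extended lookup: key is either an atomic value or a predicate on keys."""
    if callable(key):
        # Predicate form: concatenate the values of every entry whose key matches.
        result = []
        for k, value in mapping.items():
            if key(k):
                result.extend(value)
        return result
    # Atomic form: plain lookup, empty sequence if the key is absent.
    return list(mapping.get(key, []))

numbers = {1: ["one"], 2: ["two"], 3: ["three"], 6: ["six"]}

by_key = map_get(numbers, 2)                               # model of $map?2
by_range = map_get(numbers, lambda k: k in range(1, 6))    # model of $map?(1 to 5)
wildcard = map_get(numbers, lambda k: True)                # model of $map?*
odd_keys = map_get(numbers, lambda k: k % 2 == 1)          # model of fn { . mod 2 = 1 }
```

One design point the model makes visible: with a predicate, the result is the concatenation of several entries' values, so the boundaries between entries are lost in the same way as with the wildcard lookup.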
Pull request #1671 created #created-1671
1261 New fn:divide-decimals() function
Fix #1261
Pull request #1670 created #created-1670
Action QT4CS-097-02: Enable xtermref links to XSD SCM property names