@qt4cg statuses in 2024
This page displays status updates about the QT4 CG project from 2024.
See also recent statuses.
QT4 CG meeting 075 draft minutes #minutes-04-30
Draft minutes published.
Issue #1086 closed #closed-1086
array:values spec cleanup
Issue #1087 closed #closed-1087
1086 Editorial changes to array:values
Issue #1166 closed #closed-1166
Invalid option keys: the rule is unclear
Issue #1168 closed #closed-1168
1166 Clarify rule on invalid option keys
Issue #1173 closed #closed-1173
array:build, map:build: Positional access
Issue #1174 closed #closed-1174
1173 array:build, map:build: Positional access
Issue #1177 closed #closed-1177
1162 Positional variables are xs:integer not xs:positiveInteger
Issue #553 closed #closed-553
New function fn:substitute()
Issue #1179 created #created-1179
Editorial: `array:values`, `map:values`
Triggered by #1087:
- Align `map:values` with `array:values`.
- Revise notes.
- Rename the functions.
@michaelhkay You suggested that `content` might be a better name, rather than `item`, for retrieving the sequence-concatenation of values in map lookups in https://github.com/qt4cg/qtspecs/issues/1169#issuecomment-2074378446. Should we rename the functions to `array:contents`, `map:contents` or use `array:items` and `map:items`?
Issue #1178 closed #closed-1178
1146 Add inline change markup in the XPath/XQuery spec
Pull request #1178 created #created-1178
1146 Add inline change markup in the XPath/XQuery spec
There is more work to be done to ensure the change log entries are complete, but this is a good start.
Issue #1159 closed #closed-1159
Filter operator for arrays
Pull request #1177 created #created-1177
1162 Positional variables are xs:integer not xs:positiveInteger
Reverts a change that made the type xs:positiveInteger as the impact of the change was not fully explored, and the same change was not made elsewhere e.g. to the argument type of array:get() or the return type of fn:position().
Note: the use of xs:positiveInteger has been retained for row and column numbers in the CSV functions, and for codepoints in the fn:char() function.
Issue #1176 created #created-1176
Use fn:parse-uri to check whether a filepath is relative or absolute
I have a question about the new function `fn:parse-uri()`. A common use case is to check whether a file path is absolute or relative. For example, I want to check whether the file path `images/img1.png` is relative and can therefore be converted to an absolute file path using `resolve-uri()`. Or I want to check whether `$base` is absolute and can therefore be used as the second argument in `resolve-uri()`.
How would I use a `uri-structure-record` map determined as a result of `fn:parse-uri()` to decide whether it is a relative or absolute file path?
Greetings, Frank
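The question can be illustrated outside XPath. The following is a minimal Python sketch, not an answer in terms of `fn:parse-uri()`: it assumes that "absolute" means "carries a URI scheme", and the base `file:///home/frank/doc/` is a made-up value. Whether the corresponding test on a `uri-structure-record` is simply "the scheme entry is present" is an assumption, not something the issue states.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(reference: str) -> bool:
    """Return True when the reference carries a URI scheme such as file: or https:."""
    return urlsplit(reference).scheme != ""


def resolve_if_relative(reference: str, base: str) -> str:
    """Resolve a relative reference against an absolute base; return absolute ones unchanged."""
    if not is_absolute(base):
        raise ValueError("base must be absolute: " + base)
    if is_absolute(reference):
        return reference
    return urljoin(base, reference)


# Hypothetical base URI, used only for illustration.
print(is_absolute("images/img1.png"))
print(resolve_if_relative("images/img1.png", "file:///home/frank/doc/"))
```

Note that a scheme test of this kind treats a Windows path such as `C:/tmp/a.png` as absolute with scheme `c`, which is one reason the question is less trivial than it looks.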
Issue #1175 created #created-1175
XPath: Optional parameters in the definition of an inline function
This is a proposal to extend the definition of an inline-function item with the ability to specify a set of optional/keyword-value parameters, following the sequence of positional parameters of the function.
This is very similar to what we already have for static function definitions: https://qt4cg.org/specifications/xquery-40/xpath-40.html#dt-function-definition and https://qt4cg.org/specifications/xquery-40/xpath-40.html#id-static-functions
While a static function definition has the following parts:
- The function name, which is an expanded QName.
- A (possibly empty) list of required parameters, each having:
  - a parameter name (an expanded QName)
  - a required type (a sequence type)
- A (possibly empty) list of optional parameters, each having:
  - a parameter name (an expanded QName)
  - a required type (a sequence type)
  - a default value expression (an expression: see 4 Expressions)
- A return type (a sequence type)
- A (possibly empty) set of function annotations
- A body. The function body contains the logic that enables the function result to be computed from the supplied arguments and information in the static and dynamic context.
For an inline function definition we will have:
- A name of a variable to contain the function item being defined.
- A (possibly empty) list of required parameters, each having:
  - a parameter name (an expanded QName)
  - an optional type (a sequence type)
- A (possibly empty) list of optional parameters, each having:
  - a parameter name (an expanded QName)
  - an optional type (a sequence type)
  - a default value expression (an expression: see 4 Expressions)
- An optional return type (a sequence type)
- A body. The function body contains the logic that enables the function result to be computed from the supplied arguments and information in the static and dynamic context.
What is accomplished by introducing optional parameters?
The answer is the same as for optional parameters in a static function definition: increased brevity, conciseness and clarity.
let $myFun := fn($pos1, $pos2, $posK, $kw1 := expr1, $kw2 := expr2, ..., $kwN := exprN) { (: Some expression :)}
replaces what would otherwise be a set of `N! + 1` separate inline function definitions, each of which must be assigned to a separate variable.
Similarly to the static function calls, with this new feature a call to such an inline function must provide values for all positional arguments, followed by an optional set (meaning in any order) of assignments of values to specific keyword-valued (optional) arguments. The rules for an inline function call are similar to those for a call to a static function - the provided values for the positional arguments must precede all other provided values and the values for the optional arguments may be provided in any order.
Here is a short example of an inline function definition and calling it:
let $incr := fn($arg1, $increment := 1) {$arg1 + $increment }
return
(
$incr(5),
$incr(5, increment := 2),
$incr(5, increment := 3)
)
produces:
6, 7, 8
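For comparison, Python already offers the same combination of positional parameters followed by defaulted keyword parameters on anonymous functions. The sketch below mirrors the `$incr` example above; it is only an analogy and says nothing about how the proposed XPath syntax would be specified.

```python
# An anonymous function bound to a variable, with one required positional
# parameter and one optional parameter that defaults to 1.
incr = lambda arg1, increment=1: arg1 + increment

results = [
    incr(5),               # optional argument omitted, default used
    incr(5, increment=2),  # optional argument supplied by keyword
    incr(5, increment=3),
]
print(results)
```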
Pull request #1174 created #created-1174
1173 array:build, map:build: Positional access
Issue: #1173
Issue #1110 closed #closed-1110
New error codes
Issue #1173 created #created-1173
array:build, map:build: Positional access
`array:build` seems to be the only function (for iterating over ordered input) whose HOF parameter lacks a positional parameter:
array:build(
$input as item()*,
$action as function(item()) as item()* := fn:identity#1
) as array(*)
(: should be :)
array:build(
$input as item()*,
$action as function(item(), xs:integer) as item()* := fn:identity#1
) as array(*)
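The effect of the extra positional argument can be sketched in Python. This is an illustrative model only: `array_build` is a hypothetical stand-in, an array is modelled as a list, and positions are 1-based as in XPath.

```python
from typing import Any, Callable, Iterable, List


def array_build(items: Iterable[Any], action: Callable[[Any, int], Any]) -> List[Any]:
    """Build an array with one member per input item; the action also receives the 1-based position."""
    return [action(item, position) for position, item in enumerate(items, start=1)]


# Each member records the position of the item it was built from.
labelled = array_build(["a", "b", "c"], lambda item, position: (position, item))
print(labelled)
```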
Issue #1172 created #created-1172
Iterating maps: Positional access
With the new filter expression for maps, the context position is available. To be consistent, if we want to provide positional access for unordered data, we should…
- either provide this for map functions as well (`map:filter`, `map:for-each`), or
- don’t provide it for map predicates at all.
Issue #1171 created #created-1171
Predicates returning xs:boolean vs. xs:boolean?
In 4.13.4 Filter Expressions for Maps and Arrays, the type of the predicate expression `FILTER` is `xs:boolean?`.
To be consistent, we should either relax the types of the predicate/filter functions (for `fn:filter`, etc.) or stick with `xs:boolean`.
Issue #1170 created #created-1170
Editorial: fn:index-where; parentheses; …
1. fn:index-where, array:index-where
Returns the position in an input ...
→ positions
2. Redundant parens in function signatures
as (function(xs:anyAtomicType, item()*) as xs:boolean)
...
Issue #1169 created #created-1169
Maps & Arrays: Consistency & Terminology
After the introduction of #1094 and #1159, and before adding more map/array operations, I think it’s time to get more serious about consistency and terminology. The current drafts employ a variety of terms that are not clearly defined, or separated from each other. We now have at least…
items, members, pairs, keys, values, entries
…which are sometimes used for maps, for arrays, or for both data structures. A first attempt to clean up, while reducing the overall effort:
- A minor one: The modifier for lookups should be in singular form, analogous to node axes: `item`, `key`, `value`, `pair`.
- While I first advocated the orthogonality principle for axes in lookup expressions, I now think we should stick to the existing terminology. Otherwise, we would need to revise many other existing parts of the spec. My suggestion would be to:
  - introduce `member` for arrays
  - only allow `key`, `value` and `pair` for maps
  - allow `items` for both maps and arrays

  This would make it symmetric with a) the current terminology for maps and arrays, and b) enhanced `for` clauses, i.e. `for member $m` and `for key $k value $v`.
  The reverse approach would be to drop `for member $m` and to also allow `for key $k value $v` for arrays (with `for value` replacing `for member`). In addition, we could have `for pair`.
- With the introduction of the `item` axis, `map:values` and `array:values` should be renamed to `map:items` and `array:items`.
- I would suggest dropping `array:members` and `array:of-members`. The names don’t imply we’ll deal with records, and it’s not in line with `for member $m` either. If we want to keep these functions, we could rename them to `array:pairs` and `array:of-pairs` and add the integer positions as keys, and we should introduce and consistently use the term `pair` for maps and arrays.
Closely related: #826
Issue #1125 closed #closed-1125
1094 Enhanced lookup expressions
Issue #1094 closed #closed-1094
Axis steps in lookup expressions
QT4 CG meeting 074 draft minutes #minutes-04-23
Draft minutes published.
Issue #1135 closed #closed-1135
Definition of focus functions
Issue #1157 closed #closed-1157
1135 Correction to definition of focus functions
Issue #1163 closed #closed-1163
1159 Add filter expressions for maps and arrays
Issue #235 closed #closed-235
Add multiple=true() option to fn:parse-json and fn:json-doc
Issue #1155 closed #closed-1155
Glossary formatting
Issue #1164 closed #closed-1164
1155 Consistency of glossaries
Pull request #1168 created #created-1168
1166 Clarify rule on invalid option keys
An error is raised for an option key unless (a) it is listed in the specification, or (b) it is recognized by the implementation, or (c) it is a QName with a non-absent namespace.
Also clarified the rule about accepting an array in place of a sequence (I'm not sure whether this is something that we actually do, and certainly it doesn't happen unless the parameter explicitly allows an array.)
Fix #1166
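The three-way rule in this PR can be sketched as a small check. The Python below is a hedged model, not the spec text: `QName`, `invalid_option_keys`, and the sample option names are hypothetical, and an absent namespace is modelled as `None`.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional


@dataclass(frozen=True)
class QName:
    """Hypothetical stand-in for xs:QName; namespace=None models an absent namespace."""
    namespace: Optional[str]
    local: str


def invalid_option_keys(options: dict, specified: Iterable[Any],
                        recognized: Iterable[Any] = ()) -> List[Any]:
    """Return the keys that would raise an error under the rule described above."""
    allowed = set(specified) | set(recognized)
    invalid = []
    for key in options:
        # (c) a QName in a non-absent namespace is always tolerated
        namespaced = isinstance(key, QName) and key.namespace is not None
        # (a) listed in the specification, or (b) recognized by the implementation
        if key not in allowed and not namespaced:
            invalid.append(key)
    return invalid


opts = {
    "indent": True,                              # listed in the (hypothetical) spec
    "indnet": True,                              # misspelt, so rejected
    QName("http://example.com/ext", "x"): 1,     # namespaced extension, accepted
    QName(None, "y"): 2,                         # QName in no namespace, rejected
}
print(invalid_option_keys(opts, specified={"indent"}))
```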
Issue #1167 created #created-1167
Merge $collation into $options parameter of fn:deep-equal()
To avoid the ugly third parameter to deep-equal which will almost always be set to (), merge $collation into the $options parameter, whose type becomes `(map(*) | xs:string)?` for backwards compatibility.
The same idea is being applied to unparsed-text and can probably be done elsewhere.
Issue #1166 created #created-1166
Invalid option keys: the rule is unclear
PR #1059 introduced the rule:
If an option is not described in the specification, if it is not supported by the implementation and if its name is in no namespace, a type error [err:FORG0013] must be raised.
and this has proved its worth in finding quite a few errors in the test suite!
However, it's not entirely clear what it means.
Entries in the options map can have keys of any type. I suspect this rule is intended to apply to (a) keys of type xs:string, and (b) keys of type xs:QName where the namespace URI is absent.
Alternatively, perhaps it should apply to ALL option keys other than a QName in a non-null namespace?
QT4 CG meeting 074 draft agenda #agenda-04-23
Draft agenda published.
Issue #1165 created #created-1165
[Editorial] References to numeric codepoints in prose: consistency
A quick glance in F+O finds:
\n (newline, x0A)
π (x3C0)
the glyph ≂̸ which is expressed using the two codepoints #x2242 #x0338
A format token of ١ (Arabic-Indic digit one)
the format token ① (circled digit one, ①)
the actual Unicode character COMBINING DIARESIS (Unicode codepoint U+0308) or ̈
The Latin small letter dotless i (ı, U+0131, used in Turkish)
the Unicode replacement character (U+FFFD)
CRLF (U+000D, U+000A), LF (U+000A), or CR (U+000D)
comma "," (U+002C)
the Unicode quotation mark " (U+0022)
a single newline (U+000A) character
I feel we could do better...
Issue #700 closed #closed-700
Operators for array mapping and filtering
Pull request #1164 created #created-1164
1155 Consistency of glossaries
Use a common style for all glossaries.
Add a glossary to F+O.
Fix #1155
Pull request #1163 created #created-1163
1159 Add filter expressions for maps and arrays
Issue #1162 created #created-1162
Revert strict type for positional variables (xs:integer → xs:positiveInteger)
I feel that the decision to change `xs:integer` to `xs:positiveInteger` was a bit hasty (https://github.com/qt4cg/qtspecs/pull/1131#issuecomment-2051379262):
To be consistent, numerous other expressions and functions would need to be rewritten as well to use stricter types (arbitrary examples: the `count` clause; `$err:line-number` in the catch clause; the result type of `fn:string-to-codepoints`; positions in `array:get`, `fn:parse-integer`, etc.; the position parameter in HOF functions, and so on and on). We haven’t done so yet, and I seriously wonder what exactly we would win from the stricter types. In many cases, it would be reasonable to also define a strict upper limit, which is not possible with our types anyway.
Implementations may build heavily on the fact that `xs:integer` has been the default type for integer values in previous versions of the languages. For example, we use cached instances for most small integer values, or we rewrite constructs with `xs:integer` to other constructs accepting the same type.
For all these challenges, as always, technical solutions exist, but the question is if there aren’t more interesting things to focus on than on such corner cases. Queries like `count(1 to 1000000000000)` are not supported by all implementations either although they may appear trivial to the ordinary user (by coincidence, it’s supported by BaseX, but I doubt it has been used a lot).
In short, I would like us to revert the change in https://github.com/qt4cg/qtspecs/pull/1131/commits/bba6e4f1067e0ef0779688622a58320a5298d440 and stick with `xs:integer`. If people feel bad about it, I would suggest discussing strict types in a much broader and general way.
Issue #1161 created #created-1161
More changes to drop the requirement for document-uri() uniqueness
Issue #898 was about dropping the constraint that document-uri()s had to be unique, and PR #905 was adopted to resolve it. However, I see that the XPath specification still contains the following note:
Note:
This means that given a document node $N, the result of fn:doc(fn:document-uri($N)) is $N will always be true, unless fn:document-uri($N) is an empty sequence.
I don't believe that applies any longer, so it should be removed.
It's possible that we need to finesse the description of available documents as well. The current description was clearly written from the perspective that document URIs would be unique and there'd be a 1:1 mapping from URIs to documents.
QT4 CG meeting 073 draft minutes #minutes-04-16
Draft minutes published.
Issue #1160 created #created-1160
fn:is-collation-available
The new function fn:collation raises an error [err:FOCH0002] in the case when the requested collation is not supported. Or, if the `fallback` key's value is `true()`, then the implementation chooses "the most similar supported collation", which could be perceived as arbitrary and unexpected by the code developer.
This might be OK if the language has try/catch capabilities and `fallback="no"` is specified, but may not be the best outcome in a pure XPath evaluation.
A solution to this problem is to provide a function fn:is-collation-available that accepts the same argument (`$options` map) as `fn:collation`, and also could accept a string argument whose value is the URI of the collation. This function produces a boolean: `true()` means that the collation is available and can be constructed and used, `false()` otherwise.
Signature
fn:is-collation-available( $descriptor as xs:string | map(*) ) as xs:boolean
Issue #1140 closed #closed-1140
Use $target instead of $search for indexing functions
Issue #1141 closed #closed-1141
1140 Replace 'search' with 'target' for indexing functions
Issue #1147 closed #closed-1147
QT4CG-072-01 Clarify schema type terminology
Issue #1142 closed #closed-1142
fn:deep-equal: items-equal
Issue #1150 closed #closed-1150
1142 Drop restriction disallowing items-equal with unordered
Issue #1138 closed #closed-1138
format-number arguments
Issue #1151 closed #closed-1151
1138 Merge format and format-name params of format-number
Issue #1152 closed #closed-1152
1146 Inline change log
Issue #115 closed #closed-115
Lookup operator on arrays of maps
Issue #298 closed #closed-298
Abstract supertype for map and array
Issue #397 closed #closed-397
Type names
Issue #836 closed #closed-836
Add support for CSV 'dialect' features covered by the OKFN's Frictionless Data CSV spec in `fn:parse-csv` and related functions
Issue #1115 closed #closed-1115
XSLT - ability to call a function from xslt (not just xpath)
Issue #1154 closed #closed-1154
[xsl:item-type] error in sample
Issue #1156 closed #closed-1156
Fix error in XSLT example
Issue #1084 closed #closed-1084
Incorrect rendition of option defaults
Issue #1149 closed #closed-1149
1084 Add fos:default-description to support prose descriptions of defaults
Issue #1159 created #created-1159
Filter operator for arrays
I propose to provide `?[...]` as a filter operator for arrays.
For example,
let $array := [(1,2,3), (4,5,6,7)]
return $array?[count(.) = 4]
returns
[(4,5,6,7)]
I propose that the operator should work exactly like the familiar `[]` for sequences in its handling of numeric and boolean predicate values. So for example `$array?[2,1]` in the above example returns `[(4,5,6,7), (1,2,3)]`. The result is always an array (which may be a little surprising). This means that `$array?[3]` has the same effect as `[$array?3]` or `[$array(3)]`.
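The boolean-predicate case can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: an array is a list, each member is a tuple standing in for a sequence, and only boolean predicates are covered (numeric predicates are left out). `array_filter` is a hypothetical name.

```python
def array_filter(array, predicate):
    """Keep the members for which the predicate holds; the result is always an array (list)."""
    return [member for member in array if predicate(member)]


array = [(1, 2, 3), (4, 5, 6, 7)]

# Models $array?[count(.) = 4]
print(array_filter(array, lambda member: len(member) == 4))

# No member matches: the result is an empty array, not an empty sequence.
print(array_filter(array, lambda member: len(member) == 10))
```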
Issue #1158 created #created-1158
Simple mapping operator for arrays
I propose to provide `!!` as a simple mapping operator for arrays.
For example `[(1,2,3), (4,5,6)]!!count(.)` returns `[3, 3]`.
The expression on the LHS must be an array.
The expression on the RHS is evaluated once for every member of the array, with that member as the context value, with the context position set to the position of that member in the array, and with the context size set to the array size.
The result is returned as an array which will always be the same size as the input array.
Note in passing that this provides a solution (though perhaps a clumsy solution) to issue #755, in that the example expression `(0 to 4) ~ count(.)` can now be written as `[(0 to 4)]!!count(.)?*`
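The evaluation rule (one evaluation per member, with context value, position and size) can be sketched in Python. This is an illustrative model: `array_map` is a hypothetical name, and the focus is passed as three explicit arguments rather than as an implicit context.

```python
def array_map(array, rhs):
    """Evaluate rhs once per member with (member, position, size); return an array of equal size."""
    size = len(array)
    return [rhs(member, position, size) for position, member in enumerate(array, start=1)]


# Models [(1,2,3), (4,5,6)]!!count(.)
counts = array_map([(1, 2, 3), (4, 5, 6)], lambda member, position, size: len(member))
print(counts)

# The context position and size are available to the right-hand side as well.
focus = array_map([(1, 2, 3), (4, 5, 6)], lambda member, position, size: (position, size))
print(focus)
```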
Pull request #1157 created #created-1157
1135 Correction to definition of focus functions
Fix #1135
Pull request #1156 created #created-1156
Fix error in XSLT example
Fix #1154
QT4 CG meeting 073 draft agenda #agenda-04-16
Draft agenda published.
Issue #1155 created #created-1155
Glossary formatting
The format of the glossary for the data model spec differs needlessly from the other specifications. (Note, linking from the glossary entry to the place where the term is defined seems useful.)
Issue #1154 created #created-1154
[xsl:item-type] error in sample
See https://qt4cg.org/specifications/xslt-40/Overview-diff.html#named-item-types
First sample is
<xsl:item-type name="cx:complex" as="record(r as xs:double, i as xs:double)"/>
<xsl:variable name="i" as="cx:complex" select="cx:number(0, 1)"/>
`xsl:variable/@select` should probably be `cx:complex(0, 1)` instead of `cx:number`
Issue #1153 created #created-1153
XSLT: debugging template rule selection
The biggest headache when debugging XSLT stylesheets is working out which template rules have been invoked in response to an `xsl:apply-templates` instruction (I'm hitting this frustration right now with the qtspecs build stylesheets...). The `xsl:message` instruction is unhelpful here, because if the "wrong" template rule is firing, you don't know where to add the message. And the only other standardised debugging aids are `fn:trace()` and `xsl:assert`, which don't help either.
I propose an attribute on `xsl:apply-templates`, `xsl:apply-imports`, and `xsl:next-match`: trace=yes|no. If enabled, execution of the instruction causes a message to be output (as if by xsl:message) identifying the rule that is invoked, in an implementation-defined way. In the case that a built-in template rule is invoked, the message should indicate this, and any implicit apply-templates performed by the built-in rule should be evaluated as if it specified trace="yes". It is "recommended" that the message output should identify the stylesheet module, line number, match pattern, and mode, if the information is available, and should also include a representation of the item that is being processed by the instruction, for example the node kind and name.
Pull request #1152 created #created-1152
1146 Inline change log
This is a first cut at changes to introduce an inline change log - changes shown at the start of each affected section, with a flag in the TOC to indicate which sections have changed.
It is currently applied, for demonstration purposes, to changes made in the serialization spec.
More specifically:
- Added the `changes` and `change` elements to the DTD; `changes` is an optional element that follows `head` within any section
- Changed the XSLT stylesheets and CSS to render the `changes` element, and to add a flag to the TOC entry if a `changes` element is present
- Added specimen `changes` elements to the serialization spec
There's a lot more to be done:
- Generate an aggregated list of changes in an appendix
- Improve the CSS rendition
- Toggle change markings on and off; browse forward and backward through changed sections
- Add the `changes` data to the other specs
Pull request #1151 created #created-1151
1138 Merge format and format-name params of format-number
close #1138
Note, the proposal could do with further editorial work to use standard `options` markup to define the options available.
Pull request #1150 created #created-1150
1142 Drop restriction disallowing items-equal with unordered
Allows the use of an items-equal callback even when comparisons are unordered, despite the fact that this may have atrocious performance.
close #1142
Pull request #1149 created #created-1149
1084 Add fos:default-description to support prose descriptions of defaults
Close #1084
This won't render correctly in the PR, but hopefully the diff is clear enough to decide if this is the approach we want to take.
Pull request #1148 created #created-1148
1143 Coercion rules: handle choice types before atomization
Fix #1143
Pull request #1147 created #created-1147
QT4CG-072-01 Clarify schema type terminology
Responding to an action from the review of PR #1132, this editorial PR attempts to improve the definitions and usage of terms such as "schema type", "atomic type", "pure union type", "generalized atomic type".
Issue #796 closed #closed-796
allow explicit type expressions in XPath variable bindings
Issue #1131 closed #closed-1131
796,231 - Extend XPath for and let expressions
Issue #1146 created #created-1146
Identifying 4.0 Changes
The list of changes in an appendix is (a) difficult to maintain (with a tendency to cause Git conflicts) and (b) remote from the places in the spec where the changes actually arise. At the same time, automated diff markup tends to give a lot of unwanted detail, highlighting changes that are purely editorial.
I propose that we try out an alternative approach. Each section/subsection with significant changes should start with an info box listing the changes, headed "Changes in 4.0". This should be rendered with a distinct colour or border to make it recognisable, and it should be possible to toggle whether the changes are shown or hidden. Changes that represent an incompatibility should be specially marked, perhaps with a device such as a warning triangle. A Δ marker (or colour highlighting) could appear in the table of contents against any section that has a `changes` entry.
Internally, the changes should be identified with custom markup: I suggest an optional `<changes>` element immediately after `<head>`, with a sequence of `<change>` children, each of which should contain administrative metadata (such as a link to the issue and/or PR) as well as user-readable text.
For changes to F+O functions, corresponding elements should be added to the FOS catalog schema; this should replace (or generate) the current "History" section.
Issue #1145 closed #closed-1145
Array Decomposition
Issue #1144 closed #closed-1144
Sequence Decomposition
Issue #1145 created #created-1145
Array Decomposition
This proposal allows arrays to be decomposed and assigned to separate variables in a single declaration within a for or let expression binding.
Given an array such as `[1, 2, 3]`, the values within that array cannot easily be extracted. With the current version of XPath and XQuery, they need to be assigned to a temporary variable first. For example:
let $result := get-camera-point()
let $x := $result?(1)
let $y := $result?(2)
let $z := $result?(3)
return "(" || $x || "," || $y || "," || $z || ")"
This proposal would allow this to be written more concisely as:
let [$x, $y, $z] := get-camera-point()
return "(" || $x || "," || $y || "," || $z || ")"
These are equivalent in this proposal, except that `$result` is not a statically known variable binding in the array decomposition let clause.
Note: The older syntax in XPath-NG was:
let $[x, y, z] := get-camera-point() return "(" || $x || "," || $y || "," || $z || ")"
For each variable declaration at index `N` in the array decomposition, with `$expr` being the result of the for/let expression, `$expr?(N)` is the value bound to the variable declaration as a new variable binding. If the value does not exist, an `err:FOAY0001` (array index out of bounds) error will be raised.
An array decomposition can be used in any for or let clause binding to decompose the items in an array. If the type of the for or let clause binding expression is not a sequence, an `err:XPTY0004` error is raised.
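The binding rule can be modelled in Python as follows. This is a sketch only: `decompose_array` is a hypothetical helper, an array is a list, bindings are returned as a dictionary, and the out-of-bounds case is modelled with `IndexError` standing in for `err:FOAY0001`.

```python
def decompose_array(array, names):
    """Bind the N-th name to the N-th member (1-based); fail when a member is missing."""
    bindings = {}
    for index, name in enumerate(names, start=1):
        if index > len(array):
            # Stands in for err:FOAY0001 (array index out of bounds).
            raise IndexError(f"FOAY0001: no member at index {index}")
        bindings[name] = array[index - 1]
    return bindings


# Models: let [$x, $y, $z] := [1, 2, 3]
print(decompose_array([1, 2, 3], ["x", "y", "z"]))
```

Members beyond the last declared variable are simply not bound in this model, because each variable only looks up its own index.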
Assigning the rest of an array
It can be useful to only extract part of an array (e.g. the heading of a table), and store the rest of the items in another variable. For example:
let $(heading as array(xs:string), rows as array(xs:string)...) :=
load-csv("test.csv")
If there are no items remaining in the array the result is an empty array.
Influences
Tuple decomposition is found in various languages such as Python, Scala, and C#. These languages also have support for tuple types.
Python has support for specifying that a variable is assigned the remaining values in the tuple.
Use Cases
There are many cases where fixed size sequences may be used such as points, complex and rational numbers, sin/cos, and mul/div. This makes extracting data from these simpler, and may also be used to aid readability by assigning descriptive names to each of the items in the sequence.
Examples
Extracting values from an array:
declare function sincos($angle as xs:double?) {
[ math:sin($angle), math:cos($angle) ]
};
let $angle := math:pi()
let [$sin, $cos] := sincos($angle)
return $sin || "," || $cos
Issue #1144 created #created-1144
Sequence Decomposition
This proposal allows sequences to be decomposed and assigned to separate variables in a single declaration within a for or let expression binding.
Given a sequence such as `(1, 2, 3)`, the values within that sequence cannot easily be extracted. With the current version of XPath and XQuery, they need to be assigned to a temporary variable first. For example:
let $result := get-camera-point()
let $x := $result[1]
let $y := $result[2]
let $z := $result[3]
return "(" || $x || "," || $y || "," || $z || ")"
This proposal would allow this to be written more concisely as:
let ($x, $y, $z) := get-camera-point()
return "(" || $x || "," || $y || "," || $z || ")"
These are equivalent in this proposal, except that `$result` is not a statically known variable binding in the sequence decomposition let clause.
Note: The older syntax in XPath-NG was:
let $(x, y, z) := get-camera-point() return "(" || $x || "," || $y || "," || $z || ")"
For each variable declaration at index `N` in the sequence decomposition, with `$expr` being the result of the for/let expression, `$expr[N]` is the value bound to the variable declaration as a new variable binding. If the value does not exist, an empty sequence is bound to the variable.
A sequence decomposition can be used in any for or let clause binding to decompose the items in a sequence. If the type of the for or let clause binding expression is not a sequence, an `err:XPTY0004` error is raised.
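Unlike the array case, a missing item here yields an empty sequence rather than an error. A minimal Python model, under the assumption that every XPath value is represented as a tuple (a single item is a one-element tuple, the empty sequence is `()`); `decompose_sequence` is a hypothetical helper:

```python
def decompose_sequence(sequence, names):
    """Bind the N-th name to the N-th item; bind the empty sequence when the item is missing."""
    items = list(sequence)
    return {
        name: (items[index],) if index < len(items) else ()
        for index, name in enumerate(names)
    }


# Models: let ($x, $y, $z) := (1, 2)
print(decompose_sequence((1, 2), ["x", "y", "z"]))
```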
Assigning the rest of a sequence
It can be useful to only extract part of a sequence or array (e.g. the heading of a table), and store the rest of the items in another variable. For example:
let $(heading, rows ...) := fn:parse-csv("test.csv")
If there are no items remaining in the sequence the result is an empty sequence.
Influences
Tuple decomposition is found in various languages such as Python, Scala, and C#. These languages also have support for tuple types.
Python has support for specifying that a variable is assigned the remaining values in the tuple.
Use Cases
There are many cases where fixed size sequences may be used such as points, complex and rational numbers, sin/cos, and mul/div. This makes extracting data from these simpler, and may also be used to aid readability by assigning descriptive names to each of the items in the sequence.
Examples
Extracting values from a sequence:
declare function sincos($angle as xs:double?) {
math:sin($angle), math:cos($angle)
};
let $angle := math:pi()
let ($sin, $cos) := sincos($angle)
return $sin || "," || $cos
Issue #983 closed #closed-983
fn:reduce (or fn:fold without initial value)
Issue #1143 created #created-1143
Coercion Rules for Choice Item Types
The proposal that we accepted for choice item types (PR #1132) invokes atomization only if the choice type is a generalised atomic type, that is, if all alternatives in the choice are atomic.
This makes it tricky to take advantage of choice types for extending existing functions in a backwards-compatible way. For example, we might want to change the second argument of fn:unparsed-text from `$encoding as xs:string` to `$options as (xs:string | map(*))`. But under the current rules, this means the supplied value of the $encoding argument will no longer be atomized.
I propose to change this by effectively promoting rule 3 to appear before rule 2. Rule 2 is the atomization rule, and rule 3 is the new rule:
If R is a [choice item type] that is not a [generalized atomic type], then the following rules are applied with R set to each of the alternatives in the choice item type, in order, until an alternative is found that does not result in a type error; a type error is raised only if all alternatives fail.
The phrase in italics is deleted.
The effect is that if the required type is `(xs:string | map(*))` then we first try converting the supplied argument as if the required type were `xs:string` (including atomization), and if that fails we try converting it as if the required type were `map(*)`.
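The "try each alternative in order" behaviour can be sketched in Python. Everything here is a simplified model: `Node`, `as_string`, `as_map` and `coerce` are hypothetical names, a node is an object with a typed value, a map is a `dict`, and a failed conversion is a `TypeError` standing in for the XPath type error.

```python
class Node:
    """Hypothetical stand-in for an XML node whose typed value is a string."""
    def __init__(self, typed_value):
        self.typed_value = typed_value


def as_string(value):
    """Coerce to xs:string, atomizing nodes first; anything else is a type error."""
    atom = value.typed_value if isinstance(value, Node) else value
    if not isinstance(atom, str):
        raise TypeError("not an xs:string")
    return atom


def as_map(value):
    """Coerce to map(*); no atomization applies."""
    if not isinstance(value, dict):
        raise TypeError("not a map")
    return value


def coerce(value, alternatives):
    """Apply each alternative in order; fail only if every alternative fails."""
    for convert in alternatives:
        try:
            return convert(value)
        except TypeError:
            continue
    raise TypeError("no alternative of the choice type accepts the value")


choice = [as_string, as_map]  # models (xs:string | map(*))
print(coerce(Node("utf-8"), choice))          # node is atomized to a string
print(coerce({"encoding": "utf-8"}, choice))  # falls through to the map alternative
```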
Issue #231 closed #closed-231
for expression: "at" keyword
Issue #1139 closed #closed-1139
let clause: function coercion
Issue #788 closed #closed-788
New function fn:annotate()
Issue #1105 closed #closed-1105
Casting to numerical type from strings with underscores
Issue #67 closed #closed-67
Allow optional parameters and keyword arguments on map and sequence variadic functions.
Issue #132 closed #closed-132
Clarify if redirects should be followed
Issue #613 closed #closed-613
Allow "union" as synonym for "|" everywhere
Issue #666 closed #closed-666
Polyfill function implementations
Issue #713 closed #closed-713
Annotations: Editorial notes
Issue #834 closed #closed-834
Add creation function for `csv-row-record` type
Issue #1142 created #created-1142
fn:deep-equal: items-equal
The current spec says about items-equal that…
If this option is present then the ordered option MUST be true and the unordered-elements option MUST be an empty sequence.
It doesn’t say what is going to happen if ordered is false or if unordered-elements is non-empty.
My preference would be to allow all combinations; we could then do things like:
deep-equal(
(1, 2, 3),
(3.1, 2.1, 1.1),
{ 'ordered': false(), 'items-equal': fn($a, $b) { xs:integer($a) = xs:integer($b) } }
)
It may imply O(n²), but it’s very simple to formulate other XPath expressions with the same complexity, such as $huge1[. = $huge2].
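To illustrate where the O(n²) cost comes from, here is a rough Python model of an unordered comparison driven by a callback. The function name unordered_equal is hypothetical, and the greedy matching is only sound if the callback behaves like an equivalence relation; a general callback would need proper bipartite matching.

```python
def unordered_equal(seq1, seq2, items_equal):
    """Return True if the two sequences match item for item, ignoring order."""
    if len(seq1) != len(seq2):
        return False
    remaining = list(seq2)
    for a in seq1:
        # Linear scan of the unmatched items: this inner loop is the O(n^2) part.
        for i, b in enumerate(remaining):
            if items_equal(a, b):
                del remaining[i]
                break
        else:
            # No unmatched item in seq2 is equal to a.
            return False
    return True


# Model of the example above: compare after truncating both values to integers.
same_integer = lambda a, b: int(a) == int(b)
result = unordered_equal([1, 2, 3], [3.1, 2.1, 1.1], same_integer)
```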
Pull request #1141 created #created-1141
1140 Replace 'search' with 'target' for indexing functions
Close #1140
Having created the issue, per my outstanding action, I thought I'd take a quick look to see how extensive the change would be. AFAICT (though I confess to not looking exceedingly carefully), only two functions are affected. Here, for your consideration, is a PR that resolves the issue.
Issue #1140 created #created-1140
Use $target instead of $search for indexing functions
Back in February, when we discussed array:index-of, DN observed that the argument name $search could be interpreted as performing some sort of action. The alternative $target was proposed instead as being more "noun like".
QT4 CG meeting 072 draft minutes #minutes-04-09
Draft minutes published.
Issue #1093 closed #closed-1093
1091 Add fn:collation function
Issue #1091 closed #closed-1091
Convenience function to construct a collation URI
Issue #99 closed #closed-99
Functions that determine equality of two sequences or equality of two arrays
Issue #1063 closed #closed-1063
deep-equal() - option to compare functions liberally
Issue #1120 closed #closed-1120
99v2 deep equal with callback
Issue #122 closed #closed-122
Support general union sequence types
Issue #1132 closed #closed-1132
122 Choice item types (generalizing local union types)
Issue #1112 closed #closed-1112
1110-partial New error codes
Issue #1118 closed #closed-1118
Use new map{} syntax in adaptive output method
Issue #1123 closed #closed-1123
1118 Drop the "map" keyword in adaptive serialization output
Issue #1128 closed #closed-1128
1020 Further notes on the consequences of function coercion
Issue #1133 closed #closed-1133
fn:filter why predicate as map(*)
Issue #1134 closed #closed-1134
1133 Correct map:filter callback signature
Issue #1139 created #created-1139
let clause: function coercion
@michaelhkay I should be careful (as I regularly miss changes and additions in the 4.0 drafts), but it seems that the application of the function coercion rules for typed let clauses (https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-binding-rules) is not mentioned in the list of substantive changes at the end of the document. Would it be useful to add it, or does this change fall into a different category?
QT4 CG meeting 072 draft agenda #agenda-04-09
Draft agenda published.
Issue #1138 created #created-1138
format-number arguments
Given the availability of choice item types, I propose that we merge the format and format-name parameters of fn:format-number into a single parameter of type (xs:string | xs:QName | map(*)). This seems a better design when parameters are mutually exclusive and perform a related role.
Pull request #1137 created #created-1137
161 Variadic functions
Fix #161
This proposal attempts to do the minimum necessary to allow the variable-arity nature of the fn:concat function to be reproduced for other functions including user-defined functions. The idea is that fn:concat should no longer be treated as a special case.
The proposal is deliberately less ambitious than some of the ideas discussed in the referenced issue. It's generally easier to get something into the language if we take smaller steps.
For an overview see section 4.5.3 of the XQuery spec.
Issue #1136 created #created-1136
Defining names for parameters on typed function tests
When defining the type of a higher-order function parameter, you cannot currently specify the names of the parameters of that higher-order function.
Allowing this can be useful for various reasons:
- documenting the parameter names in the function signature -- this makes it clear looking at the function in an IDE, etc. what the parameters are;
- making the specs clearer by referring to the parameters by name;
- allowing a processor to provide better error messages by referring to the parameter names, e.g. when there is a type conversion error;
- allowing a user to reference the parameter by name if we enable this to resolve named keyword arguments (which is currently being discussed in #1114).
Thus, you could declare e.g. index-where like this:
declare function fn:index-where(
  $input as item()*,
  $predicate as function(
    $item as item(),
    $position as xs:integer
  ) as xs:boolean
) as xs:integer* {
  (: ... :)
};
Issue #1135 created #created-1135
Definition of focus functions
§5.4.2.6 states
The expression function { EXPR } (or fn { EXPR }) is a syntactic shorthand for the expression function($Z as item()) as item() { $Z ! (EXPR) }, where $Z is a variable name that is otherwise unused. Note that the function body (EXPR) is evaluated with a [fixed focus]: the context position and context size will always be 1 (one).
This is no longer true since the generalization of the context item to the context value. EXPR is evaluated once with the entire sequence $Z as the context value; it is not evaluated once for each item in $Z.
We have no direct way of expressing this in the absence of a resolution to issue #755.
Pull request #1134 created #created-1134
1133 Correct map:filter callback signature
Fix #1133
Issue #1133 created #created-1133
fn:filter why predicate as map(*)
map:filter is defined like this:
map:filter(
  $map as map(*),
  $predicate as function(xs:anyAtomicType, item()*) as map(*)
) as map(*)
Why is the predicate function not returning an xs:boolean?
Pull request #1132 created #created-1132
122 Choice item types (generalizing local union types)
Fix #122.
Allows the new item type syntax (A | B), replacing union(A, B); the alternatives are no longer restricted to be atomic types. The choice item type is a generalized atomic type if and only if all the alternatives are generalized atomic types.
Note that #122 also proposed unions of sequence types. While that is also viable, I found that unions of item types handled pretty well all practical use cases, and it seems excessive to offer both. Unions of item types proved (a) more useful (b) easier to combine with the existing feature of local union types, and (c) easier to handle in the coercion rules. Providing both is also tricky to handle in the grammar.
Issue #1044 closed #closed-1044
CSV row delimiter - allowed values
Issue #1104 closed #closed-1104
TypeTest expressions
Pull request #1131 created #created-1131
796,231 - Extend XPath for and let expressions
Fix #796 Fix #231
Extends "for" and "let" expressions in XPath to allow a larger subset of XQuery FLWOR expression syntax. Specifically:
- Allow positional variables (at $pos)
- Allow type declarations (as type)
- Allow let/for clauses to be mixed without an intervening return.
Issue #1122 closed #closed-1122
Rendering xspecref
Issue #1130 closed #closed-1130
Fix xspecref to production
Pull request #1130 created #created-1130
Fix xspecref to production
Fix #1122
This is (apparently) the first use of an xspecref to a production.
Issue #1129 closed #closed-1129
Fix Norm's affiliation
Pull request #1129 created #created-1129
Fix Norm's affiliation
Someone preparing a QT4 status talk for a conference observed that my affiliation on the data model spec was out-of-date.
I'm just going to merge this one without any fanfare.
Pull request #1128 created #created-1128
1020 Further notes on the consequences of function coercion
Adds further notes and examples explaining the consequences of function coercion, especially when applied to maps and arrays. The new notes make it clear that test case MapTest-058 is incorrect; a map, once coerced to a function, cannot be used as a map.
Issue #1127 created #created-1127
Binary resources
We have some functions that accept binary input (parse-html, parse-csv) and others that don't (parse-xml, parse-json). There seems to be no obvious justification for the inconsistency.
Related to this:
(a) we have no functions to convert (encode/decode) between binary and string given an encoding
(b) we have no function to read a binary resource from a URI
Both of these are available in the EXPath bin library but should perhaps be promoted to the main spec.
Issue #1039 closed #closed-1039
Allow dynamic collations in XQuery "order by" and "group by"
Issue #1092 closed #closed-1092
1039 Add notes referring to fn:collation-key
Issue #1100 closed #closed-1100
99 fn:equal() function to compare sequences and arrays
Issue #1113 closed #closed-1113
Misleading rendering BiDi text in parse-integer example
Issue #1126 closed #closed-1126
1060 Minor fixes
Pull request #1126 created #created-1126
1060 Minor fixes
Whitespace, variable names, tests
Issue #1121 closed #closed-1121
1060 Formatting
Pull request #1125 created #created-1125
1094 Enhanced lookup expressions
Fix #1094
Lookup expressions (both deep and shallow) are enhanced in two ways:
(a) the syntax is extended to provide options that avoid flattening the result. For example $V?pairs::K delivers the result as a sequence of key-value pairs.
(b) a new KeySpecifier format is provided to filter the results by type. For example $V??type(record(first, last)) selects all items in the recursive content that are of type record(first, last). This replaces the previous syntax $V??*::record(first, last) which caused ambiguities with occurrence indicators.
Issue #859 closed #closed-859
Syntax problem with type-qualified wildcards in lookup expressions
Issue #1106 closed #closed-1106
859 lookup syntax problems
Issue #1124 created #created-1124
Formatting XPath/XQuery: Preferences, Conventions
In #1060, the formatting of code examples in the spec was unified. This issue is about discussing the formatting rules and (ideally) to define conventions for newly added code. If we don’t manage to define rules, the existing specs should provide enough examples for all syntactical constructs to be inspired by.
To start with, one suggestion in yesterday’s meeting was to choose a more compact presentation. Empty maps, empty arrays, and functions with an empty body are currently formatted as follows:
map { }, { },
array { }, [ ],
function { }, fn { }, fn($x) { }
We could remove the inner whitespace:
map {}, {},
array {}, [],
function {}, fn {}, fn($x) {}
Pull request #1123 created #created-1123
1118 Drop the "map" keyword in adaptive serialization output
Close #1118
Issue #1122 created #created-1122
Rendering xspecref
xspecref markup in the serialization spec is being rendered incorrectly. See for example
<xspecref spec="XP40" ref="doc-xpath40-MapConstructor"/>
in section 10, which renders as
A [map item] is serialized using the syntax of a [Section ] ^XP40 without ...
where the link to the referenced section works correctly, but the section title is not displayed.
Pull request #1121 created #created-1121
1060 Formatting
Minor editorial fixes (examples, typos). I’ll merge this after a while if no one objects.
Pull request #1120 created #created-1120
99v2 deep equal with callback
A second attempt to address issue #99
Replaces PR #1100
Fix #99 Fix #1063
In response to comments during the review of #1100, this PR abandons the proposed fn:equal() function and instead adds a callback option to fn:deep-equal. This can potentially be used to compare any pair of items (including maps and arrays) if desired.
Issue #1119 created #created-1119
Declare namespace bindings in XPath
We have dropped the proposed "with" expression, which was in the spec but never reviewed by the WG.
We need to reconsider the requirement: do we need some kind of construct to declare namespace prefixes, and perhaps other parts of the static context, in XPath?
One thought here is that when XPath expressions are issued from a host language such as Javascript or Python, the typical pattern is to have lots of small independent XPath expressions within a program. It doesn't make sense for each such expression to have its own boilerplate to establish the context, which will usually be the same for each expression. Rather it makes sense for the XPath invocation API to supply a reference to a context object which is set up once and reused. However, this doesn't mean that there's no room for XPath syntax to establish the context. For example, one might envisage a program doing
XPath engine = new XPath();
engine.setStaticContext("declare namespace abc='http://abc.uri'; pqr = 'http://pqr.uri'");
engine.evaluate("//x/@y");
so the syntax for creating the static context would be decoupled from the expression syntax.
Issue #711 closed #closed-711
Using annotations for navigation of JSON trees
Issue #1070 closed #closed-1070
Concise syntax for map construction
Issue #1071 closed #closed-1071
1070 Bare Brace map constructor syntax
Issue #1118 created #created-1118
Use new map{} syntax in adaptive output method
Should the adaptive output method be changed to use the new bare-brace syntax when serializing maps, dropping the map keyword?
Issue #1019 closed #closed-1019
XQFO: Unknown option parameters
Issue #1059 closed #closed-1059
1019 XQFO: Unknown option parameters
Issue #1077 closed #closed-1077
Correct the status of new language features
Issue #1074 closed #closed-1074
Confirm status of provisional functions
Issue #1097 closed #closed-1097
566-partial Fix colon issue in URI parsing
Issue #1107 closed #closed-1107
Grammar discrepancy on fn:pin() examples
Issue #1060 closed #closed-1060
Formatting XPath/XQuery
Issue #1078 closed #closed-1078
1060 Formatting XPath/XQuery
Issue #1076 closed #closed-1076
1075 Drop 'with' expressions
Issue #1075 closed #closed-1075
Drop "with" expressions
Issue #1109 closed #closed-1109
Discrepancies in fn:hash() published examples
Pull request #1117 created #created-1117
1116 Add options param to unparsed-text
Reverts the change to unparsed-text and unparsed-text-lines so they no longer normalise line endings by default.
Instead an options parameter is added to select this as a non-default behaviour.
At the same time, we add an option to control whether the function is deterministic (that is, returns the same content if called repeatedly with the same URI). In 3.1 the spec stated that implementations might provide an option to do this, but did not provide an interoperable way of setting this option. For compatibility, the default is still to be deterministic.
Fix #1116
Issue #1116 created #created-1116
unparsed-text() end-of-line normalization
I'm uncomfortable with the backwards-incompatible change we have made to unparsed-text() which now normalizes line endings.
I think it's unlikely that there are many users who care about the difference between LF and CRLF line endings and want to preserve that difference; but I think it's very likely that there are users who have written application code that expects the line ending to be CRLF, where the application will break if the line ending changes.
I would be more comfortable with the change if there were an option setting to revert to the 3.1 behaviour. But my preference would be to add the option and keep the default compatible with 3.1.
Also, for users who want to treat the file as a sequence of lines, we already introduced unparsed-text-lines() so they don't have to worry about different representations of line endings.
QT4 CG meeting 071 draft agenda #agenda-03-26
Draft agenda published.
Issue #1115 created #created-1115
XSLT - ability to call a function from xslt (not just xpath)
I don't think this is possible in 3.0 and I don't think it's been suggested yet (what's the easiest way to find out?)... if it has, then close.
I often wonder in passing why I ever use
<xsl:call-template.../>
when functions exist. I never use the data context inside a named template... it feels dangerous.
So the only reason I do it is because I can embed literal XML elements, e.g.
<xsl:call-template name='foo'>
<xsl:with-param name='barElement' as='element(barElement)'>
<barElement/>
</xsl:with-param>
</xsl:call-template>
but if I could do
<xsl:call-function name='foo'>
<xsl:with-param name='barElement' as='element(barElement)'>
<barElement/>
</xsl:with-param>
</xsl:call-function>
then I would always use functions in preference to named templates.
motivation
- language simplification (though you'd have to keep named templates for legacy)
- you explicitly remove the data context (which I think is error prone in practice).
- if people used functions in preference to named templates then you would be able to use more of your code directly from XPath expressions.
Issue #1114 created #created-1114
Partial function application: Keywords and placeholders
The test suite contains test cases – FunctionCall-414 … FunctionCall-417 – for partially applied functions with keywords and placeholders:
<test-case name="FunctionCall-414" covers-40="keywords">
<description>Use of keyword arguments with placeholders on user-defined function</description>
<created by="Michael Kay" on="2023-03-13"/>
<modified by="Michael Kay" on="2023-12-13" change="do what the description says"/>
<dependency type="spec" value="XQ40+"/>
<test><![CDATA[
declare function local:diff ($s as xs:integer, $t as xs:integer) as xs:integer {
$s - $t
};
local:diff(s := 12, t := ?)(8)
]]></test>
<result>
<assert-eq>4</assert-eq>
</result>
</test-case>
...
I didn’t find information on this feature combination in the spec; is it already covered? If yes, is it also possible to partially apply function items with keywords?…
declare function local:f($s, $t) { $s - $t };
local:f#2(s := 12, t := ?)(8),
local:f(?, ?)(s := 12, t := ?)(8)
...
If 2x yes, I can try to add some more test cases (for example, I assume that $f(t := 12, ?) is illegal, as arguments without keywords probably need to be placed first).
Issue #1113 created #created-1113
Misleading rendering BiDi text in parse-integer example
In the function index, one example of fn:parse-integer() uses parameters containing Arabic letters. This leads to a misleading display: the 1st and 2nd parameters look switched because the browser renders both of them from left to right. It is this example:
<fos:test>
<fos:expression><eg>translate('٢٠٢٣', '٠١٢٣٤٥٦٧٨٩', '0123456789')
=> parse-integer()</eg></fos:expression>
<fos:result>2023</fos:result>
</fos:test>
This looks confusing. I don't know what the best fix would be. Maybe storing '٠١٢٣٤٥٦٧٨٩' in a variable would help and prevent the issue.
Pull request #1112 created #created-1112
1110-partial New error codes
Issue: #1110.
- fn:hash: I added FOHA0001 as error code.
- fn:op: I used XPTY0004 as error code, as the allowed operators could also be defined as string enumeration.
- XQuery, Map Test: Not included in this PR.
Issue #1111 created #created-1111
xsl:pipeline
In XSLT 3.0 it is not possible to write a multi-phase streaming transformation, where two or more phases each operate in streaming mode and the result of one phase is piped into the next. Such transformations can only be written as multiple stylesheets, coordinated by some calling application.
A non-streamed multiphase transformation typically uses variables for the intermediate results:
<xsl:variable name="temp1">
<xsl:apply-templates mode="phase1"/>
</xsl:variable>
<xsl:variable name="temp2">
<xsl:apply-templates select="$temp1" mode="phase2"/>
</xsl:variable>
<xsl:apply-templates select="$temp2"/>
This cannot be streamed because variables cannot hold streamed nodes.
The idea is to allow this to be written:
<xsl:pipeline streamable="yes">
<xsl:apply-templates mode="phase1"/>
<xsl:apply-templates select="." mode="phase2"/>
<xsl:apply-templates select="." mode="phase3"/>
</xsl:pipeline>
where each instruction in the pipeline takes as its context value the result of the previous instruction.
Even when no streaming is involved, the xsl:pipeline instruction brings usability benefits: it's much clearer to the reader what is going on.
(Triggered by a support request from a user wanting to make an existing pipelined transformation streamable; but the idea was considered and "postponed to v.next" during XSLT 3.0 development. The replacement of "context item" by "context value" removes one of the obstacles.)
Issue #1110 created #created-1110
New error codes
The XQFO spec includes various “[TODO: error code]” comments. Should we add error codes when finalizing PRs, or does a master plan exist to add them at the very end?
Issue #1109 created #created-1109
Discrepancies in fn:hash() published examples
In the third example:
hash("")
the expected result has a spurious trailing letter "o". That's trivial and I will fix it.
In the seventh example:
hash(serialize($doc), map{"algorithm": "sha-1"})
I am getting a completely different result, which I suspect is because I am getting a different result from serialize(). Perhaps the difference is something like a trailing newline, I don't know.
In my case the result of serialize($doc) is the 14-character string "<doc>abc</doc>", which I believe is correct, but I suspect there might be other results of serialize($doc) that would also be conformant with the spec.
Pull request #1108 created #created-1108
566-partial Describe a less aggressive %-encoding for fn:build-uri
My proposal seemed to meet with general approval, so here is my attempt to implement it in the spec.
Issue #1107 created #created-1107
Grammar discrepancy on fn:pin() examples
I'm probably getting the wrong end of the stick, but I can't see how the example for fn:pin():
pin(["a","b","c"])?1 => label()?parent => array:foot()
meets the current EBNF. (I know bits of this area may be in flux, so this may be just for the record.)
pin(["a","b","c"])?1 => label()
meets the production for ArrowExpr, and the RHS of an ArrowExpr is, in this case, an ArrowStaticFunction ArgumentList pair, which doesn't encompass the subsequent lookup.
But for LookupExpr to include the ?parent requires PostfixExpr as its first term, and PostfixExpr can only include ArrowExpr via PrimaryExpr/ParenthesizedExpr, i.e. with brackets.
Issue #1052 closed #closed-1052
parse-csv() - simplify output
Pull request #1106 created #created-1106
859 lookup syntax problems
Fix the syntax ambiguity identified in issue #859 by dropping the troublesome construct.
It is hoped something else will be introduced in its place.
Fix #859
Issue #1105 created #created-1105
Casting to numerical type from strings with underscores
The Digits production now permits underscores as separators in long numerical character sequences. However in casting to numerical types, either by operator or by function:
'12_345_678' as xs:integer
number('12.345_678')
am I correct that this should fail according to Casting from xs:string and xs:untypedAtomic, even though a static resolution/rewrite would be possible?
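For what it's worth, the XSD lexical space of xs:integer admits only an optional sign followed by digits, so a cast that follows it rejects underscores. A small Python model of such a cast is sketched below; cast_to_integer is a hypothetical name, and strip() only approximates XML whitespace collapsing. Note that Python's own int() accepts underscores in strings, so the explicit pattern check is what models the stricter rule.

```python
import re

# Lexical form of xs:integer: optional sign followed by one or more digits.
XS_INTEGER = re.compile(r"[+-]?[0-9]+")


def cast_to_integer(text: str) -> int:
    """Model of casting a string to xs:integer (hypothetical helper)."""
    collapsed = text.strip()  # approximation of whitespace collapsing
    if not XS_INTEGER.fullmatch(collapsed):
        # Underscore separators are valid in literals only, not in cast input.
        raise ValueError(f"invalid lexical form for xs:integer: {text!r}")
    return int(collapsed)
```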
Issue #1104 created #created-1104
TypeTest expressions
Our current status quo text allows the result of a lookup expression to be filtered by type:
[[1,2], [3,4], 5, 6]?*::array(*)?1
and issue #859 points out that this doesn't work because of a syntax ambiguity involving occurrence indicators ('?' is both an occurrence indicator and a lookup operator).
This issue addresses that problem by re-examining the requirements, and pulling in a number of other issues at the same time.
In path expressions we have a shorthand syntax for selecting nodes, called the node test, and the proposed syntax ::array(*) was modelled on this. Node tests have a considerable overlap with types, but there are limitations. For example the self axis is often used to turn a node test into a general predicate, but [self::XX] can only be used to test elements, not attributes. However, the popularity of node tests and the self axis illustrates the need for a concise filtering operation.
Of course it's always possible to write [. instance of array(*)] but this gets extremely verbose.
In XSLT 3.0, template rules matching maps and arrays could only be written as match=".[. instance of array(*)]", which gets really ugly, so we have proposed an alternative in 4.0. Specifically, you can match any type using match="type(ItemType)", and for many types such as arrays and maps you can abbreviate this to, for example match="array(*)". But this feels clumsy because the type() wrapper is sometimes needed and sometimes not.
I would like to propose an expression that has concise syntax, whose effect is equivalent to . instance of T. I propose to use the ~ symbol. This is available as both a binary and unary operator, so we can define a binary form $z ~ T which is syntactic shorthand for $z instance of T, and a unary form ~T which is shorthand for . ~ T.
First, in the case of lookup expressions, we can now write:
[[1,2], [3,4], 5, 6]?*[~array(*)]?1
TypeTests will often be used within predicates in this way, and of course the usage is completely general.
Here's an example used for array:filter: array:filter($array, fn{~xs:integer+}) which selects all members of the array comprising one or more integers.
In XSLT 4.0 the syntax ~T replaces the current TypePattern, giving a much more uniform way of matching items by type.
In XPath and XSLT conditionals the construct can be used as an equivalent to XQuery's TypeswitchExpr:
<xsl:choose>
<xsl:when test="~xs:integer">...</xsl:when>
<xsl:when test="~xs:string">...</xsl:when>
...
</xsl:choose>
The choice of tilde for this operator is motivated by:
- There are not many symbols available
- Tilde has many different uses in mathematics and computing, some of which represent a boolean test applied to a value (for example testing whether it is similar to another value or whether it matches some pattern), which is not dissimilar to this proposed usage
- The alliteration between "tilde" and "type" has some mnemonic value (cf. the use of @ for the attribute axis).
Issue #1103 created #created-1103
CSV Parsing - handling line ending normalization
During discussion of PR #1066 there was much debate about how best to handle normalization of (typically CRLF) line endings.
Perhaps it's very unlikely that CRLF line endings will make it as far as the parse-csv() function, because they will already have been normalized for example by unparsed-text(). But data can also be read in other ways, for example bin:read-binary() or sql:query() extension functions, or passed in as a string-valued parameter to a transformation.
Perhaps we should have a separate mechanism for normalizing line endings in any data, independent of CSV parsing? (But perhaps it's important to retain CRLF in quoted strings?)
Perhaps CSV parsing should normalise CRLF unconditionally, without needing to set a special option for it?
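As a point of reference, a separate normalization step of the kind floated above could be as small as the following Python sketch. The function name is hypothetical, and it normalizes unconditionally, so it does not address the open question of whether CRLF inside quoted fields should be retained.

```python
def normalize_line_endings(text: str) -> str:
    """Replace CRLF and bare CR with LF (hypothetical helper)."""
    # Handle CRLF first so the pair becomes one LF, then any remaining bare CR.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```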
Issue #1101 closed #closed-1101
XQuery: Normalize line endings
Issue #1089 closed #closed-1089
Rounding when casting string to date/time or duration types
Issue #1090 closed #closed-1090
1089 Add rounding rules for casting string to duration etc
Issue #1079 closed #closed-1079
Editorial: XSLT, Applying Template Rules, Examples
Issue #1083 closed #closed-1083
1079 Change book used in example
Issue #1050 closed #closed-1050
Potential (low-risk) Ambiguities in XPath EBNF
Issue #1081 closed #closed-1081
1050 Fix ItemType grammar ambiguity
Issue #1080 closed #closed-1080
1036 Rephrase the rules for number-parser with liberal JSON
Issue #1036 closed #closed-1036
parse-json: liberal parsing
Issue #1102 closed #closed-1102
Fix broken idref to escaped-crlf in test generation
Pull request #1102 created #created-1102
Fix broken idref to escaped-crlf in test generation
It appears that escaped-crlf-3 might have been intended. @michaelhkay ?
Issue #1073 closed #closed-1073
XQFO (editorial)
Issue #757 closed #closed-757
Function families
Issue #463 closed #closed-463
fn:parts() - extract the parts of a (not-really) atomic value
Issue #448 closed #closed-448
Support extended dateTime formats of ISO-8601:2019?
Issue #283 closed #closed-283
Enumeration types
Issue #218 closed #closed-218
Function library for maps with composite keys: and thoughts on encapsulation
Issue #119 closed #closed-119
Allow a map's key value to be any sequence
Issue #33 closed #closed-33
JSON Parsing & Serialization: Numbers
Issue #883 closed #closed-883
Improve return type for fn:load-xquery-module()
QT4 CG meeting 070 draft minutes #minutes-03-19
Draft minutes published.
Issue #1072 closed #closed-1072
883 Return type of load-xquery-module
Issue #1066 closed #closed-1066
1052 Simplify the results of parse-csv
Issue #1101 created #created-1101
XQuery: Normalize line endings
Various tests, such as line-ending-Q002, check whether line endings are normalized when parsing the input:
<test-case name="line-ending-Q002">
<description>Normalization of line endings in XQuery</description>
<created by="Michael Kay" on="2011-11-24"/>
<dependency type="spec" value="XQ10+"/>
<test>deep-equal(string-to-codepoints('
'), (10))</test>
<result>
<assert-true/>
</result>
</test-case>
I cannot find a corresponding note in the current XQuery 4 draft. Should we add it?
I would welcome this normalization. I assume that no one over the last decades has missed carriage return in XML?
Issue #1099 closed #closed-1099
Build fixes
Pull request #1100 created #created-1100
99 fn:equal() function to compare sequences and arrays
Fix issue #99
Introduces a function fn:equal() that compares two arbitrary values (sequences, maps, arrays, etc), with a callback for comparing "leaf" items in the structure.
Pull request #1099 created #created-1099
Build fixes
I had a brain cramp when I wrote the build.gradle file for this repository. This PR fixes that.
It also adds a nobreak attribute to the code element. The intent, not yet implemented, is that you can say
<code nobreak="true">some long, but not unreasonably long expression</code>
and the stylesheet will prevent a line break in the middle of the expression.
Pull request #1098 created #created-1098
566-partial Editorial improvements for parse-uri
- Add a note clarifying that the fragment identifier should be (1) URI decoded and (2) ignored if it's the empty string.
- Reworked a bit of the description in order to avoid an ambiguity in how ///abc should be parsed. (The current spec can be satisfied either by parsing it as //(/abc)() or //()/abc and only the former is intended.)
Pull request #1097 created #created-1097
566-partial Fix colon issue in URI parsing
In the course of reviewing the tests for fn:parse-uri, I discovered (or perhaps more correctly, @ChristianGruen discovered) that the rules for matching Windows drive letters are inconsistent. This PR fixes that inconsistency.
Issue #1096 created #created-1096
Effect of atomization on array:index-of()
What is the expected result of the expression:
array:index-of( [[1,2], [3,4]], [3,4] )
It seems that the second argument is atomised (because its declared type is atomic), but the first argument is not.
So both members of the array have count=1, whereas $search has count=2, so nothing matches, so the result is ().
Now, what if we write:
array:index-of( [[1,2], (3,4)], [3,4] )
This time it seems that the second member of the array matches, so the result is 2.
This doesn't feel right. One solution would be to say that each member of the array is itself atomised. But that seems to lead to other surprises with other examples of nested arrays.
An alternative would be to atomize neither argument (which would mean changing the function signature). But then we would need to use a different comparison operation.
We seem to be back where we started -- I was unhappy about introducing this function because of the difficulty of defining a good comparison operation for it to use.
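The two results described above can be reproduced with a small Python model, under stated assumptions: Python lists stand in for arrays, tuples for sequences, and index_of, atomize, and as_sequence are hypothetical names. Only the search argument is atomized; each array member is compared as-is.

```python
def atomize(value):
    """Flatten arrays (lists) and sequences (tuples) into a flat list of atomic items."""
    if isinstance(value, (list, tuple)):
        flat = []
        for item in value:
            flat.extend(atomize(item))
        return flat
    return [value]


def as_sequence(member):
    """View an array member as a sequence of items without atomizing it."""
    # A tuple is already a sequence; anything else (including an array) is one item.
    return list(member) if isinstance(member, tuple) else [member]


def index_of(array, search):
    """Model of array:index-of where only the search argument is atomized."""
    target = atomize(search)
    return [
        position
        for position, member in enumerate(array, start=1)
        if as_sequence(member) == target
    ]


nested_arrays = index_of([[1, 2], [3, 4]], [3, 4])    # members are single array items
mixed_members = index_of([[1, 2], (3, 4)], [3, 4])    # second member is a 2-item sequence
```

In this model the first call finds nothing because every member has count 1 while the atomized search value has count 2, and the second call matches position 2.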
QT4 CG meeting 070 draft agenda #agenda-03-19
Draft agenda published.
Issue #1095 closed #closed-1095
Collation: caseblind → Standardize or replace with `html-ascii-case-insensitive`?
Issue #1095 created #created-1095
Collation: caseblind → Standardize or replace with `html-ascii-case-insensitive`?
Various test cases use the artificial http://www.w3.org/2010/09/qt-fots-catalog/collation/caseblind
collation. It seems that most (all) of them could also be written with the http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive
.
Could we replace the tests with the standardized collation, or should we rather try to standardize the caseblind variant?
Issue #1094 created #created-1094
Axis steps in lookup expressions
This issue picks up where issue #341, issue #350, issue #596, issue #960 etc left off - an attempt to find better syntax and semantics for navigation within JTrees (by which I mean trees of maps and arrays). The problems we are addressing are well aired in those previous issues. There are new opportunities for improving navigation within pinned trees, where upwards navigation becomes possible.
Firstly I propose that the existing constructs ?*
, ?key
, and ?1
be treated as abbreviations for ?content::*
, ?content::key
, and ?content::1
respectively. The content axis delivers a flattened sequence of items.
Then I propose we introduce an entry
axis. ?entry::*
, ?entry::key
, and ?entry::1
deliver their results as a sequence of key value pairs, in the style of map:pairs()
. Arrays for this purpose are treated as maps with integer keys. For example if $A
is [(1,2), (3,4)]
then $A?entry::*
delivers (map{'key':1, 'value':(1,2)}, map{'key':2, 'value':(3,4)})
.
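As a rough illustration of the difference between the two proposed axes, the following Python sketch models an array as a list of members, each member being a tuple. The function names `content_axis` and `entry_axis` are hypothetical; they only mimic ?content::* and ?entry::* for a single, non-recursive lookup.

```python
def content_axis(members):
    """Mimics ?content::* : the flattened sequence-concatenation of all members."""
    return [item for member in members for item in member]


def entry_axis(members):
    """Mimics ?entry::* : one key-value pair per member, keyed by 1-based position."""
    return [{"key": i, "value": member} for i, member in enumerate(members, start=1)]


A = [(1, 2), (3, 4)]  # stands in for the array [(1,2), (3,4)]

print(content_axis(A))
print(entry_axis(A))
```

The content axis loses the member boundaries, while the entry axis keeps each member intact together with its position.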
This applies equally to the deep lookup operator. $A??entry::*
returns all the key-value pairs within the JTree rooted at $A, recursively.
We could also consider a value
axis which delivers a sequence of arrays containing the values, losing the associated keys.
If values are labelled, as a result of being found by navigating a pinned JTree, then upwards navigation is also possible. For an item in a pinned tree,
- containing-entry::* delivers the containing entry as a key-value pair. Duplicates are eliminated.
- owner::* delivers the immediately containing map or array as identified by the label
- ownership::* delivers the transitive closure of the owner::* axis.
- peer::* delivers owner::*/entry::*
- following-member::* delivers the subarray of the containing array that follows the current entry
- preceding-member::* delivers the subarray of the containing array that precedes the current entry
Of course, improved names for these concepts are welcomed!
In these examples I have used *
to select everything on the relevant axis. This can always be replaced by a key specifier K that selects the item only if it is labelled with a key K. So for example ownership::address selects the containing maps and arrays that are themselves in a map entry with key "address".
I think we also need a convenient way to filter the selection by type (see issue #859 for a problem with the current syntax). I propose
??content::[record(longitude, latitude)]
to select all items in the recursive content that match type record(longitude, latitude)
Similarly
??entry::[array(xs:integer)+]
to select all entries where the value is an array of integers.
Finally, responding to issue #341, I propose that lookup operators should be error free: rather than reporting errors, they should return nothing.
Pull request #1093 created #created-1093
1091 Add fn:collation function
Fix #1091
Pull request #1092 created #created-1092
1039 Add notes referring to fn:collation-key
Fix #1039
Rather than adding a new feature to the language, we add notes to "order by" and "group by" explaining how the requirement can be met using the fn:collation-key() function.
Issue #334 closed #closed-334
Transient properties: a new approach to deep selection and update in maps and arrays
Issue #86 closed #closed-86
Fallback for named timezones
Issue #64 closed #closed-64
Specify optional parameters to create bounded variadic functions
Issue #56 closed #closed-56
Allow item-type to be matched within its definition scope
Issue #1091 created #created-1091
Convenience function to construct a collation URI
I propose a convenience function to construct a collation URI: for example
collation({'lang':'fr'})
returns a collation URI suitable for French.
If the property names supplied are those that are defined for UCA collation names, the result will be the corresponding UCA collation URI; alternatively, implementation-defined property names can be included.
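A minimal sketch of the idea in Python, assuming the result is the UCA base URI followed by a query part of semicolon-separated keyword=value pairs. The helper name `collation` mirrors the proposed function, but the parameter ordering (insertion order) and the absence of any validation are assumptions of this sketch; implementation-defined property names are not modelled.

```python
UCA_BASE = "http://www.w3.org/2013/collation/UCA"


def collation(options):
    """Build a UCA collation URI from a dict of property names and values."""
    if not options:
        return UCA_BASE
    # Assumption: parameters are emitted in the order supplied, separated by ';'.
    query = ";".join(f"{key}={value}" for key, value in options.items())
    return f"{UCA_BASE}?{query}"


print(collation({"lang": "fr"}))
print(collation({"lang": "de", "strength": "primary"}))
```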
Pull request #1090 created #created-1090
1089 Add rounding rules for casting string to duration etc
Fix #1089
Issue #1089 created #created-1089
Rounding when casting string to date/time or duration types
F+O section 21.2 describes rules for rounding when strings are cast to xs:decimal. The same rules should apply when casting to a dateTime, time, or duration, in the case where the number of digits in the fractional seconds part exceeds the precision supported by the implementation.
Issue #1082 closed #closed-1082
Inconsistency in underscore in numeric literal grammar
Issue #1088 closed #closed-1088
1082 Fix numeric literal grammar
Pull request #1088 created #created-1088
1082 Fix numeric literal grammar
Fix #1082
Also adds some notes and examples
Pull request #1087 created #created-1087
1086 Editorial changes to array:values
Fix #1086
Issue #1086 created #created-1086
array:values spec cleanup
The rules for array:values say:
The function concatenates the members of $array and returns them as a sequence. The values are returned in their original order. Arrays contained within members are returned unchanged.
The effect of the function is equivalent to $array?*.
This is all a bit too vague.
- the values are not concatenated, at least not in the sense of concat()
- it doesn't return the members, it returns their sequence concatenation
- the phrase "arrays contained within members" is unclear. The examples reveal that this rule is intended to include arrays that ARE members.
- the equivalent expression $array?* allows $array to be things that array:values() doesn't allow (like an empty sequence).
More subtly, the introduction to section 19.1 says:
All functionality on arrays is defined in terms of two primitives:
The function [array:members] decomposes an array to a sequence of value records. The function [array:of-members] composes an array from a sequence of value records.
and the spec for array:values doesn't conform to this guideline.
Try:
The function returns the sequence-concatenation of the members of $array, retaining order. More formally, the effect of the function is equivalent to the expression array:members($array)?value
.
and add to the notes:
Unlike array:flatten, the function does not apply recursively to nested arrays.
Issue #1085 created #created-1085
Parameters to fn:sort
An interesting suggestion made in passing in the thread discussing fn:ranks(). It would be possible to combine the collation
argument and the ascending/descending
argument of fn:sort
into a single argument, whose value is an optional "ascending|descending" keyword followed by an optional collation URI (whitespace-separated, presumably).
This might seem a little bizarre at first sight, but having a list of collation URIs followed by a list of sort key functions followed by a list of ascending/descending keywords is also a little bizarre, and it would have two advantages - it would make better use of the second argument which is currently nearly always set to ()
, and it would put the two parts of the order specification (the collation and its direction) in closer proximity. After all, they are used in combination to decide whether one value precedes or follows another.
I'm not 100% convinced by the idea, but it seems worth considering. What do people think?
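To make the suggestion concrete, here is a hedged Python sketch of how such a combined argument might be split into its two parts. The function name `parse_order_spec`, the default direction of "ascending", and the error raised for extra tokens are all assumptions of this sketch, not part of the suggestion.

```python
def parse_order_spec(spec):
    """Split 'ascending|descending' plus an optional collation URI, whitespace-separated."""
    tokens = spec.split()
    direction = "ascending"  # assumption: ascending when no keyword is given
    if tokens and tokens[0] in ("ascending", "descending"):
        direction = tokens.pop(0)
    if len(tokens) > 1:
        raise ValueError(f"unexpected extra tokens in order specification: {tokens[1:]}")
    collation_uri = tokens[0] if tokens else None
    return direction, collation_uri


print(parse_order_spec("descending http://example.com/collation"))
print(parse_order_spec(""))
```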
Issue #1084 created #created-1084
Incorrect rendition of option defaults
In the F&O spec, when rendering the default value of an option, code font is being used for narrative prose: see for example defaults for the delivery-format
and base-output-uri
options of fn:transform
Pull request #1083 created #created-1083
1079 Change book used in example
Changed the example to a book by a reputable author.
Fix #1079
Issue #1082 created #created-1082
Inconsistency in underscore in numeric literal grammar
Numeric Literals describes permitting underscores to be used as separators in sequences of digits within long numbers. The first interpretation rule says underscores are first stripped out.
But the grammar provided appears to me to be inconsistent.
IntegerLiteral ::= Digits
DecimalLiteral ::= ("." Digits) | (Digits "." [0-9]*)
DoubleLiteral ::= (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits
Digits ::= DecDigit ((DecDigit | "_")* DecDigit)?
DecDigit ::= [0-9]
Digits
permits underscores in the grammar, which works in the integer portion of the numeric literal, but when the value exceeds 1 the fractional part is described as [0-9]*
. If underscore stripping is a 'pre-parsing' step, then Digits
need not mention it at all.
On the other hand if the grammar is defining the sequence of characters that are permitted, then the fractional section in the grammar should also permit underscores, which it plainly does not in the presence of an integer part. (The test seconds-010
uses such in an expansion of π.)
An alternative formulation that I think does describe underscores in fractional parts might be:
DecimalLiteral ::= ("." Digits) | (Digits "." Digits?)
and similarly for DoubleLiteral
. I know this isn't a game-changer, but for those generating grammars, consistency certainly helps.
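The inconsistency can be checked mechanically. The sketch below transcribes the Digits and DecimalLiteral productions into Python regular expressions, once as drafted and once in the alternative formulation; the variable names are hypothetical and only DecimalLiteral is covered.

```python
import re

# Digits ::= DecDigit ((DecDigit | "_")* DecDigit)?
DIGITS = r"[0-9](?:[0-9_]*[0-9])?"

# DecimalLiteral ::= ("." Digits) | (Digits "." [0-9]*)
DECIMAL_AS_DRAFTED = re.compile(rf"\.{DIGITS}|{DIGITS}\.[0-9]*")

# DecimalLiteral ::= ("." Digits) | (Digits "." Digits?)
DECIMAL_ALTERNATIVE = re.compile(rf"\.{DIGITS}|{DIGITS}\.(?:{DIGITS})?")


def matches(pattern, text):
    """True if the whole of text is matched by the production."""
    return pattern.fullmatch(text) is not None


for literal in ("1_000.5", ".5_5", "3.141_592"):
    print(literal, matches(DECIMAL_AS_DRAFTED, literal), matches(DECIMAL_ALTERNATIVE, literal))
```

Only the last literal differs: an underscore in the fractional part after an integer part is rejected by the drafted production and accepted by the alternative.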
Pull request #1081 created #created-1081
1050 Fix ItemType grammar ambiguity
Two alternatives in the grammar were both EQNames, distinguished semantically. This ambiguity in the grammar is now fixed (no living XPath expressions are harmed by this change).
Fix #1050
Pull request #1080 created #created-1080
1036 Rephrase the rules for number-parser with liberal JSON
Rephrasing as suggested in the issue.
Fix #1036
Issue #1079 created #created-1079
Editorial: XSLT, Applying Template Rules, Examples
This guy is all around. Do we really need him in our specs as well? 😏
https://qt4cg.org/pr/1078/xslt-40/Overview.html#applying-templates
"Title": "How to Win Elections",
"Authors": [ "...
Pull request #1078 created #created-1078
1060 Formatting XPath/XQuery
Editorial; closes #1060.
This PR attempts to unify the presentation of XPath and XQuery code. It’s not complete, but it should definitely improve the status quo.
The chosen formatting and indentation rules can certainly be discussed. My major objective was consistency: I selected rules that were used frequently enough in the given documents.
Apart from the presentation stuff, this PR fixes various minor bugs in the rules and examples.
Pull request #1077 created #created-1077
Correct the status of new language features
This PR corrects the status of certain language features that the change appendix in the spec incorrectly describes as having not been accepted by the WG. The features in question are:
The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as
@price/@value
, even though dynamic evaluation is defined to return an empty sequence rather than an error.
This change has in fact been discussed and accepted by the group. See PRs #603 and #884.
Record types are added as a new kind of ItemType, constraining the value space of maps.
Record types have become a fundamental feature of much of our work, with many additional capabilities relying on them. They became an official part of the spec with the closure of issue #172.
Local union types are added as a new kind of ItemType, constraining the value space of atomic values.
Enumeration types are added as a new kind of
ItemType
, constraining the value space of strings.
Local union types and enumeration types became an official part of the spec with the acceptance of PR #691
The lookup operator
?
can now be followed by a string literal, for cases where map keys are strings other than NCNames.
These changes were endorsed by acceptance of PR #926.
The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double, instead the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.
We still have open issues regarding comparison, conversion, and promotion of numeric values. See for example issue #986. So we may yet decide to roll back these changes. For practical purposes it's sensible to treat the current text as status quo, since so many individual changes have been made that unwinding can only be treated as a new issue.
A
for member
clause is added to FLWOR expressions to allow iteration over an array.
The current specification of for member
results from the acceptance of PR #752.
Pull request #1076 created #created-1076
1075 Drop 'with' expressions
Proposes dropping "with" expressions from the spec.
Fix #1075
Issue #1075 created #created-1075
Drop "with" expressions
The current draft includes a proposal for a "with" expression to establish the namespace context for an XPath expression. This has never been reviewed or accepted by the group. See §4.1 of the language specifications.
I propose to raise a PR that drops this feature, in order to force discussion as to whether we want it in its current form, or to replace it with something better, or to drop it entirely.
Pull request #1074 created #created-1074
Confirm status of provisional functions
The purpose of this PR is to bring the current F&O draft specification into a state where it is confirmed as the current status quo accepted by the CG.
The following functions (mainly new, some amended) that have been present in the draft for some while, but with caveats about their status, are confirmed as part of the status quo:
- fn:slice
- fn:format-number
- fn:stack-trace
- map:filter
- map:replace
- array:replace
The following functions are dropped (for the time being):
- fn:json
- map:substitute
The actual PR essentially changes text that alludes to the status of these functions, it does not change the actual specifications.
The current state of qt4tests in relation to these functions is:
- fn:slice: OK
- fn:format-number: missing tests for recent changes
- fn:stack-trace: no tests
- map:filter: OK
- map:replace: no tests
- array:replace: no tests
- fn:json: no tests
- map:substitute: no tests
Pull request #1073 created #created-1073
XQFO (editorial)
Examples fixed.
Pull request #1072 created #created-1072
883 Return type of load-xquery-module
Use a record type for the return type of the function.
Fix #883
Pull request #1071 created #created-1071
1070 Bare Brace map constructor syntax
Makes the keyword "map" in map constructors optional.
Fix #1070
QT4 CG meeting 069 draft minutes #minutes-03-12
Draft minutes published.
Issue #220 closed #closed-220
Encapsulation
Issue #262 closed #closed-262
Navigation in deep-structured arrays
Issue #274 closed #closed-274
What would it take/would it be possible to build a module repository for QT?
Issue #295 closed #closed-295
Extend support for self-reference in record types
Issue #314 closed #closed-314
Basic Operations on Maps and Arrays
Issue #825 closed #closed-825
array:members-at
Issue #829 closed #closed-829
fn:boolean: EBV support for more item types
Issue #960 closed #closed-960
Should ??KS flatten the results
Issue #961 closed #closed-961
Simulating Objects: Performance
Issue #1037 closed #closed-1037
fn:json-to-xml: 'number-parser' option
Issue #1058 closed #closed-1058
1037 fn:json-to-xml: 'number-parser' option
QT4 CG meeting 069 draft agenda #agenda-03-12
Draft agenda published.
Issue #1070 created #created-1070
Concise syntax for map construction
It has been suggested that we should allow a "bare braces" syntax for map construction. This would reduce visual clutter especially when defining options arguments, as in
serialize($result, map{"method": "adaptive", "indent": true()})
I believe there are no syntactic obstacles to dropping the "map" keyword. The main reason it is there is that there was competition for the construct with people doing so-called scripting extensions who wanted "bare braces" to represent blocks of statements.
Allowing {"method": "adaptive"}
would align with JSON.
But I think we should go a step further and drop the quotes:
{method: "adaptive"}
except that we could allow a string literal if the key isn't an NCName, as with record type syntax.
Could we do this and still allow computed or non-string keys? I don't think we need to, the existing syntax remains available.
So I propose we allow:
serialize($result, {method: "adaptive", indent: true()})
While we're about it, is there any enthusiasm for allowing
serialize($result, {method: "adaptive", indent: ✅})
(: U+2705 :)
Issue #596 closed #closed-596
Pinned values: Transforming Trees
Issue #1069 created #created-1069
fn:ucd
This issue floats the idea of a new function, fn:ucd
(for Unicode character database).
The working signature would be fn:ucd($codepoint as xs:positiveInteger) as map(*)?
. In the returned map, each entry would have a key that is a Unicode property name (full or abbreviated) and a value that reflects the property of $codepoint
in the Unicode character database.
What would users get? Access to a deep store of character data not otherwise (easily) available, such as name, name alias, bidirectional properties, age (when it entered Unicode), breaks (word, sentence, grapheme), scripts, and dozens of other properties. See Unicode TR 44. Although many properties are of specialized interest, I think most people would find at least a few of these properties of significance.
Can't we already do this with regular expressions? Well, no. Category escapes in XPath regular expressions, e.g., \p{Lm}
, are based upon general categories, but the properties mentioned above cut across these general categories. For example, general category Pd
dash is not coterminous with the property Dash
. Property Quotation_Mark
crosses many subcategories of P
Punctuation.
A few use cases:
Extrapolation
string-to-codepoints('ɑϞ') ! ('U+' || dec-to-hex(.) || ': ' || ucd(.)('Name'))
would return ('U+251: LATIN SMALL LETTER ALPHA', 'U+3DE: GREEK LETTER KOPPA')
Filtering
if (some $i in string-to-codepoints($text) satisfies ucd($i)('Soft_Dotted')) then...
Annotating
<div class="{myfunc:most-frequent((string-to-codepoints($text) ! ucd(.)('Script')))}">
<xsl:value-of select="$text"/>
</div>
might produce
<div class="Syriac">ܡܠܟܘܬܐ ܕܫܡܝܐ ܐܝܬܝܗ̇. ܠܐ ܚܫܘܫܘܬܐ ܕܢܦܫܐ܆ ܥܡ ܝܕܥܬܐ ܕܫܪܪܐ ܕܗܠܝܢ ܕܐܝܬܝܗܝܢ</div>
And so forth. I can think of dozens of different types of operations where fn:ucd
might be significant.
Thoughts?
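For comparison, Python's standard unicodedata module exposes a small subset of the Unicode character database, which is enough to mimic the "Extrapolation" use case. The function `ucd` below is a hypothetical stand-in that returns only two properties; it is not a model of the full proposal.

```python
import unicodedata


def ucd(codepoint):
    """Return a small map of Unicode properties for a codepoint (subset only)."""
    ch = chr(codepoint)
    return {
        "Name": unicodedata.name(ch, ""),                # empty string if the character is unnamed
        "General_Category": unicodedata.category(ch),
    }


def describe(text):
    """One 'U+HEX: NAME' string per character, like the XPath example."""
    return [f"U+{ord(ch):X}: {ucd(ord(ch))['Name']}" for ch in text]


print(describe("\u0251\u03de"))
# ['U+251: LATIN SMALL LETTER ALPHA', 'U+3DE: GREEK LETTER KOPPA']
```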
Pull request #1068 created #created-1068
73 fn:graphemes
First draft of fn:graphemes
, in response to discussion at #73 .
A battery of tests will be submitted as a PR to the qt4tests repository.
Issue #1067 closed #closed-1067
fn:deep-equal: significant children
Issue #1067 created #created-1067
fn:deep-equal: significant children
The current rules of fn:deep-equal are:
e. Let significant-children($parent) be the sequence of nodes obtained by applying the following steps to the children of $parent, in turn:
i. Comment nodes are discarded if the option comments is false.
ii. Processing instruction nodes are discarded if the option processing-instructions is false.
iii. Adjacent text nodes are merged.
…
…the sequence significant-children($i1) is deep-equal to the sequence significant-children($i2).
If my interpretation is correct, the following expression is now expected to return true
…
deep-equal(
<e>A<!---->B</e>,
<e>AB</e>
)
…and we need to update various test cases (e.g. K2-SeqDeepEqualFunc-22).
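The interpretation can be traced with a small Python model of steps i to iii. Children are represented as hypothetical (kind, value) tuples, and both options are assumed to default to false; the real function compares nodes, not tuples.

```python
def significant_children(children, comments=False, processing_instructions=False):
    """Apply steps i-iii to a list of (kind, value) child tuples."""
    kept = []
    for kind, value in children:
        if kind == "comment" and not comments:
            continue  # step i: discard comment nodes
        if kind == "pi" and not processing_instructions:
            continue  # step ii: discard processing instruction nodes
        if kind == "text" and kept and kept[-1][0] == "text":
            # step iii: merge with the text node that is now adjacent
            kept[-1] = ("text", kept[-1][1] + value)
        else:
            kept.append((kind, value))
    return kept


first = [("text", "A"), ("comment", ""), ("text", "B")]   # children of <e>A<!---->B</e>
second = [("text", "AB")]                                   # children of <e>AB</e>

print(significant_children(first) == significant_children(second))
print(significant_children(first, comments=True) == significant_children(second, comments=True))
```

With comments ignored, discarding the comment leaves two adjacent text nodes that merge into "AB", so the two child sequences compare equal; with comments retained they differ.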
Pull request #1066 created #created-1066
1052 Simplify the results of parse-csv
Changes parse-csv to deliver the results in a simpler format:
(a) the result structure is less deeply nested: one record with four entries
(b) the actual data is delivered as a sequence of arrays of strings, closely aligned with the result of csv-to-arrays
The rules in the spec have also been rearranged to reflect this, so the rules are now organised according to the values delivered for each of these four fields.
The examples in the spec are changed to reflect the new output format; in addition they have been editorially reorganized so each example is more self-contained, avoiding the need for extensive scrolling to find the values of variables referenced in each example.
Fix issue #1052
Issue #1065 created #created-1065
fn:format-number: further notes
This issue summarizes suggestions for fn:format-number
from the QT4 Meeting 068 that have not yet been incorporated into the current draft:
- The Unicode Common Locale Data Repository (CLDR) should be referenced; it has recommendations for all of the languages in Unicode and some variants.
- We could consider introducing an options map so that we can just add more things later (such as e.g. an option for using the default decimal format for parsing the picture string, see https://github.com/qt4cg/qtspecs/issues/1048#issuecomment-1978869499).
Issue #919 closed #closed-919
Should predicate callbacks use EBV?
Issue #944 closed #closed-944
Coercion rules: implicit types
Issue #1047 closed #closed-1047
Incorrect note for `fn:some` and `fn:every`
Issue #1064 closed #closed-1064
340-editorial fn:format-number
Pull request #1064 created #created-1064
340-editorial fn:format-number
…an addendum to the editorial change I made yesterday; will be merged in a minute.
Issue #1054 closed #closed-1054
Spec fn:message #id using old name fn:log
Issue #1057 closed #closed-1057
1054 Spec fn:message #id using old name fn:log
Issue #1063 created #created-1063
deep-equal() - option to compare functions liberally
We have changed deep-equal() so it no longer automatically treats functions as not equal.
However, it is still in practice infeasible to use deep-equal() for comparison of test results that include function items because it is not in general possible to supply an expected result that compares true to the function item actually returned. Since comparison of test results is an important use case for deep-equal, this is a serious limitation. It affects our own process that checks that output from examples in the spec is correct: the examples for parse-csv, for example, are artificially adjusted to make test comparison feasible by eliminating the function items in the result, which reduces the pedagogical value of the examples.
There should be an option such as strict-function-comparison=true|false
. If set to false, then the function properties such as name, arity, and signature are compared, but the function body is ignored and assumed equal.
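A hedged Python sketch of what such an option might mean in practice. The `FunctionItem` class, its fields, and the use of object identity for the strict case are assumptions of this sketch; the issue only states that name, arity, and signature would be compared and the body ignored.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FunctionItem:
    """Hypothetical model of a function item."""
    name: Optional[str]
    arity: int
    signature: str
    body: Callable


def functions_equal(a, b, strict_function_comparison=True):
    """Compare two function items, optionally ignoring the function body."""
    if strict_function_comparison:
        # Assumption: in strict mode only the very same function item compares equal.
        return a is b
    return (a.name, a.arity, a.signature) == (b.name, b.arity, b.signature)


actual = FunctionItem(None, 1, "fn(xs:integer) as xs:string", lambda col: "aaa")
expected = FunctionItem(None, 1, "fn(xs:integer) as xs:string", lambda col: "")

print(functions_equal(actual, expected))
print(functions_equal(actual, expected, strict_function_comparison=False))
```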
QT4 CG meeting 068 draft minutes #minutes-03-05
Draft minutes published.
Issue #1053 closed #closed-1053
1047 Default predicate for some#1 and every#1
Issue #1046 closed #closed-1046
1038 take-while predicate no longer uses EBV
Issue #413 closed #closed-413
New function: parse-csv()
Issue #1017 closed #closed-1017
Change csv-to-xml() to produce an XHTML table
Issue #1043 closed #closed-1043
CSV parsing - "blank" rows
Issue #1051 closed #closed-1051
1043 Clarification of CSV edge cases
Issue #340 closed #closed-340
fn:format-number: Specifying decimal format
Issue #1049 closed #closed-1049
340-partial fn:format-number: Specifying decimal format
Issue #1061 closed #closed-1061
discussion - language pragmas
Pull request #1062 created #created-1062
150bis - revised proposal for fn:ranks
This proposal is an amended/alternative proposal for the fn:ranks function, taking into account the work done on the original issue #150 and the PR #1027 and the comments raised. Acknowledgements to the original author for the idea and for a lot of good work on examples etc.
It amends the previous proposal as follows:
(a) the signature and the semantics are aligned with fn:sort. This adds some functionality (multiple sort keys, ascending/descending) and also removes some complexity (two different collations for comparing input items and result items)
(b) the style of exposition is changed editorially for consistency with other functions
Issue #1061 created #created-1061
discussion - language pragmas
The motivation is introducing breaking changes into the language that may have value, but not enough value to justify a breaking change.
Haskell uses language pragmas for this, and actually most (well, a lot of) Haskell code does not use the base specification, and quite common constructs (GADTs, multi param type classes) require extensions.
Haskell devs are used to this; it may require some referring to the top of the file to change the pragmas, but it's a working solution to introducing optional things that may be breaking changes.
Thoughts?
Benefits
- allows breaking changes to be introduced
Costs
- developers may have to refer to the language pragma to correctly understand the code
- implementation explosion: extensions may not be independent and may interact, causing an explosion of combinations of extensions (though I think it's reasonable for an implementation to just implement combinations that are practical).
I'm biased, I want to introduce breaking changes, but am thwarted by the versioning strategy.
Issue #1060 created #created-1060
Formatting XPath/XQuery
I got reminded today that the specification documents are kind of “wild”, because all code snippets use a different formatting:
- The indentation is inconsistent (the tendency seems to be 2 spaces, in accordance with the function signatures). Repeatedly, indentations are used that don’t seem to follow any conventions at all.
- Sometimes, function, map and array keywords are followed by a space, sometimes not. My preference would be not to be too stingy; we have enough space.
- Sometimes, the return keyword starts in a new line, sometimes it’s attached to the previous line.
This is certainly something we cannot finalize too early, but I think we shouldn’t be too erratic in an official document, even though it’s “just code”.
Related: #1000.
Pull request #1059 created #created-1059
1019 XQFO: Unknown option parameters
Issue: #1019
Pull request #1058 created #created-1058
1037 fn:json-to-xml: 'number-parser' option
Issue: #1037
Pull request #1057 created #created-1057
1054 Spec fn:message #id using old name fn:log
Issue: #1054
Issue #1056 closed #closed-1056
Simplifying match templates
Issue #1056 created #created-1056
Simplifying match templates
I like match templates a lot, I think they are a USP for XSLT, but I find using them quite clumsy e.g.
- priority rules are quite subtle (I couldn't tell you what they are now, and I tend to make them explicit)
- because each match sits in a different template they tend to sort of drift around in the spaghetti of the code
- they don't naturally extend to nested local matches....everything exists at the top level.
If you compare this with mainstream functional match expressions then they are quite syntactically different, and I think the mainstream syntax is probably a bit simpler (and much more familiar). (I can see this potentially extending to lots of subsequent things but I'll keep it to the headline.)
I think something like
<xsl:template mode="foo" as="xs:string">
<xsl:match select="Foo">
<xsl:sequence select="'this is a foo'"/>
</xsl:match>
<xsl:match select="Bar">
<xsl:sequence select="'this is a bar'"/>
</xsl:match>
<xsl:match>
<xsl:sequence select="'this is something else'"/>
</xsl:match>
</xsl:template>
- templates are matched in sequence (as is the norm), no opaque priority rules
- if nothing is matched then nothing is returned...I have effectively a catchall match above.
- everything is cohesive, the template contains all matches....no secret ones hidden at the bottom of the file.
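The "matched in sequence" behaviour described in these bullets is ordinary first-match-wins dispatch. The Python sketch below mimics it with a list of (pattern, result) pairs, where a pattern of None plays the role of the catch-all xsl:match; matching by element name only is a simplifying assumption, since real patterns are far richer.

```python
def apply_template(node_name, matches):
    """Return the result of the first matching entry, or None if nothing matches."""
    for pattern, result in matches:
        if pattern is None or pattern == node_name:
            return result
    return None  # nothing matched, so nothing is returned


FOO_MODE = [
    ("Foo", "this is a foo"),
    ("Bar", "this is a bar"),
    (None, "this is something else"),  # catch-all, like an xsl:match with no select
]

print(apply_template("Foo", FOO_MODE))
print(apply_template("Baz", FOO_MODE))
print(apply_template("Baz", FOO_MODE[:2]))
```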
there's lots of holes here,
- how does this interact with existing match templates?
- are they a different syntax for the same thing?
- how do they work with includes and imports?
my guesses are...they ARE just different syntax for the existing infrastructure...because thats the smallest change. and then the other questions are answered by how the above syntax maps into "priority" but tbh, as I barely know how the current priority rules work, I can't really give a sensible guess.
tbh, if this is just different syntax then secret matches CAN exist elsewhere in the spaghetti, but at least the programmer does have a construct to not do that, rather than the default contract to lack cohesion from the outset.
Issue #1055 created #created-1055
xsl:variable/@as - simplifying the language - attempt 2
I've thought about it.
The key issue I had, which genuinely caused me years of confusion (I didn't understand it so I ignored it, and dealt with it by typing random XSLT code), is this:
<xsl:variable name="presentationMediaElement" as="element(urn:presentationMedia)">
<presentationMedia/>
</xsl:variable>
if I don't declare the "as" then it does something different and confusing (it assumes it's a document element I think, though I NEVER want it to do this).
so for stylesheets declared as version "4.0"+, can we make the default interpretation that it's an element?
Does this break backwards compatibility with v1? tbh, the code is already incompatible because the equivalent 1.0 code requires node-set; it's already broken, so I suggest making the fix simple to understand.
why is this so irksome to me? because for me it's incredibly confusing
it's confusing because (and I didn't express this well the last time), it makes a type declaration have inconsistent behaviours.
In languages with OO (is it reynolds?) type systems this also happens BUT in an OO type system an expression has a type that can be cast to a subtype and a subtype is very special because everything that is true of the supertype (in the constrained type logic) is true of the subtype (you can express this in terms of set/class membership in a universe if that's how you think about these things).
but in this case, this isn't the case....the two interpretations are disjoint, this isn't a cast.
So the concrete proposal is to uniquely define the semantics of
<xsl:variable name="presentationMediaElement">
<presentationMedia/>
</xsl:variable>
to be
<xsl:variable name="presentationMediaElement" as="element(urn:presentationMedia)">
<presentationMedia/>
</xsl:variable>
from 4.0 onwards.
(ironically, personally i will probably still put the "as" clause in, but if i were trying to learn the language today I'd understand this on day 1, not day 1000).
P.S.
I have a suspicion I still dont fully understand it, but i'm sure someone will point that out in due course.
Issue #1054 created #created-1054
Spec fn:message #id using old name fn:log
- https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-message
- https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-log
Issue #1018 closed #closed-1018
Output of parse-csv()
Pull request #1053 created #created-1053
1047 Default predicate for some#1 and every#1
Changes the default predicate for fn:some#1
and fn:every#1
to be fn:boolean#1
, which takes the EBV of the items in the input sequence. The previous use of fn:identity#1
caused some unexpected behaviour.
Issue #1052 created #created-1052
parse-csv() - simplify output
Currently parse-csv produces a structure like this:
map {
  "columns": map {
    "names": map{"one":1, "two":2},
    "fields": ("one", "two")
  },
  "rows": (
    map{
      "fields": ("aaa", "bbb"),
      "field": fn($col){$this?fields[$col]}
    },
    map{
      "fields": ("ccc", "ddd"),
      "field": fn($col){$this?fields[$col]}
    }
  )
}
There are a number of ways this could be improved.
- The structure is needlessly different from the return value of maps-to-arrays(). Users will get confused between the two representations, and will find it hard to switch from one to the other. For example, one delivers rows as sequences, the other delivers rows as arrays.
- There are too many levels in the structure; the expressions to select within it are unnecessarily complicated, and users will get poor diagnostics when they get it wrong.
- In this example (with two columns) for each row there is one map, one sequence, one function, and two strings - five values in all. The output of csv-to-arrays has only three objects (one array and two strings). However hard an optimized implementation tries to reduce the overhead, the space occupied by a million-row parsed CSV is likely to be larger than needed.
- The use of sequences rather than arrays means that no JSON-serialization of the structure is possible.
I propose using a flatter structure, like this pseudo-code sketch:
map {
  "column-index": map{"one":1, "two":2},
  "columns": ["one", "two"],
  "rows": (
    ["aaa", "bbb"],
    ["ccc", "ddd"]
  ),
  "get": fn($row, $col){$this?rows[$row]($col)},
  "size": fn(){count($this?rows)}
}
This doesn't meet all the objections outlined above; for example it represents rows as a sequence of arrays, which is consistent with csv-to-arrays, but not JSON-serializable. But I think it's a considerable improvement.
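As a rough illustration of the flatter shape (sketched in Python rather than XQuery, purely to show the data layout), the structure might be modelled as follows. The names parse_csv_flat, get_field and size are hypothetical, the first row is assumed to hold the column names, and Python's csv module stands in for the real parsing rules.

```python
import csv
import io


def parse_csv_flat(text):
    """Build a flat structure: column names, a name-to-position index, and rows.

    Hypothetical helper; the first row is assumed to hold the column names.
    """
    records = list(csv.reader(io.StringIO(text, newline="")))
    header, rows = records[0], records[1:]
    return {
        # 1-based positions, mirroring the "column-index" entry of the proposal
        "column-index": {name: pos for pos, name in enumerate(header, start=1)},
        "columns": header,
        "rows": rows,
    }


def get_field(parsed, row, col):
    """Return the field at a 1-based row and column, like the proposed "get" entry."""
    return parsed["rows"][row - 1][col - 1]


def size(parsed):
    """Return the number of data rows, like the proposed "size" entry."""
    return len(parsed["rows"])


parsed = parse_csv_flat("one,two\naaa,bbb\nccc,ddd\n")
print(parsed["columns"])        # column names in document order
print(get_field(parsed, 2, 1))  # first field of the second data row
print(size(parsed))             # number of data rows
```

Each row here costs one list plus its strings, which is the point of the proposal: no per-row map or accessor function is needed.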
Pull request #1051 created #created-1051
1043 Clarification of CSV edge cases
Gives a more precise definition of blank rows and empty fields, and generally adds detail on how edge cases should be handled.
Fix #1043.
Issue #1038 closed #closed-1038
Backwards incompatibility caused by use of EBV in callback functions
Issue #1050 created #created-1050
Potential (low-risk) Ambiguities in XPath EBNF
After demonstrating iXML XPath grammar production at the meeting of 27th February, it seemed worth recording some of the ambiguity issues encountered, if only so others might be aware of possible pitfalls.
Please note that the Lexical Structure notes in the spec do resolve these ambiguities, by extra-grammatical interpretations, most notably the choice of longest conforming match, but for grammar/parsers which don't specify or support this, such as InvisibleXML, ambiguities might arise, though there may be ameliorating changes to the resulting grammar that will resolve them. I am not advocating changes to the specification EBNF but merely noting where such problems might occur from my implementation experience, and potentially suggesting some workarounds.
Here are a couple of cases:
TypeName / AtomicOrUnionType
The rule for ItemType is roughly
ItemType ::= ... | TypeName | ... | AtomicOrUnionType | ...
where both TypeName and AtomicOrUnionType resolve solely to the EQName production. The grammar interpretation notes suggest (I think) that it binds to TypeName if such exists in the current static context, which is an extra-grammatical concept, but I may be mistaken.
StringTemplate
The productions for StringTemplate are:
[106] StringTemplate ::= "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
[107] StringTemplateFixedPart ::= ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")*
[108] StringTemplateVariablePart ::= EnclosedExpr
where it relies on longest match semantics to avoid ambiguity. (If this were not the case, a potential infinity of empty StringTemplateFixedPart productions could be satisfied, or any sequential partitions of a sequence of characters.)
An alternative (recursive and more cumbersome) formulation, which avoids the ambiguity is (in an iXML grammar for compactness):
StringTemplate: -"`", StringTemplateContent?, -"`".
-StringTemplateContent: StringTemplateFixedPart |
StringTemplateVariablePart |
StringTemplateVariablePart, StringTemplateContent |
StringTemplateFixedPart, StringTemplateVariablePart, StringTemplateContent?.
StringTemplateFixedPart: ("{{"; "}}"; "``"; ~["`{}"])+.
StringTemplateVariablePart remains unchanged. (iXML doesn't support character set subtraction, so ~["`{}"] (any character except...) is used for the Char - .... term.) By allowing a fixed part only to be followed by a variable part, this effectively permits the content either to be empty, or a sequence of parts such that StringTemplateVariablePart terms can be consecutive, but not StringTemplateFixedPart terms, and it seems to work effectively, at least in my iXML parser.
Reactions, corrections, remarks, praise and brickbats welcome. I'll document any more as I find them. John
Pull request #1049 created #created-1049
340-partial fn:format-number: Specifying decimal format
The PR introduces an additional $format argument to fn:format-number, which allows you to override decimal formats with custom properties.
Next, we may need to clarify if the current specification already allows processors to provide custom decimal formats (https://github.com/qt4cg/qtspecs/issues/340#issuecomment-1968856655). It’s not part of this PR.
Issue #1048 created #created-1048
fn:format-number: relax restrictions on exponent-separator (possibly minus-sign, percent, per-mille)
The current rules for decimal formats are too restrictive (i.e., too much focused on Anglo-Saxon formatting rules). The most prominent case is the Arabic exponent-separator „character“, which consists of two characters: عر (https://www.localeplanet.com/icu/ar/). The exponent separator of other locales is not restricted to a single character either. For example, se-NO uses ·10^.
When we include the ICU library in the analysis, we also find minus-sign, percent and per-mille properties that are longer than 1 character. Examples:
- The minus-sign character for he consists of 200e and 002d (200e is the Left-to-Right Mark).
- The Arabic percent character consists of 066a and 061c (061c is the “Arabic Letter Mark”).
- The per-mille property of en-US-posix is 0/00.
Issue #1047 created #created-1047
Incorrect note for `fn:some` and `fn:every`
fn:some and fn:every state (non-normatively):
"If the second argument is omitted or an empty sequence, the first argument must be a sequence of xs:boolean values.".
I don't think this note is correct. If the default predicate identity#1 is used, it is coerced to the required type function(item()) as xs:boolean, so the effective predicate is fn($x as item()) as xs:boolean {identity($x)}. This atomises the result of calling identity($x) and casts the result to xs:boolean. Therefore expressions such as some([true()]) and some(<a>true</a>) return true, not an error.
Pull request #1046 created #created-1046
1038 take-while predicate no longer uses EBV
See issue #1038, which pointed out compatibility problems with using EBV for callback predicates, as proposed in issue #919.
In specifying fn:take-while we anticipated acceptance of the proposal to use EBV for predicate callbacks; now that we have decided not to make that change, this PR brings take-while into alignment with other functions using a predicate callback.
Issue #1016 closed #closed-1016
Editorial comments on fn:parse-csv()
Issue #1042 closed #closed-1042
1016 Editorial cleanup - csv-to-arrays
Issue #236 closed #closed-236
map:build: sequence of keys
Issue #1041 closed #closed-1041
236 map:build: sequence of keys
Issue #988 closed #closed-988
960 Pinned and labeled values
QT4 CG meeting 067 draft minutes #minutes-02-27
Draft minutes published.
Issue #485 closed #closed-485
Predeclared namespaces in XQuery
Issue #1040 closed #closed-1040
485 Predeclared namespaces in XQuery: output
Issue #1029 closed #closed-1029
Make argument of fn:void optional
Issue #1032 closed #closed-1032
1029 Make argument of fn:void optional
Issue #1033 closed #closed-1033
QT4CG-066-01 Add note that whitespace and comments in regexen are lexical constructs
Issue #356 closed #closed-356
array:leaves
Issue #843 closed #closed-843
Standard, array & map functions: Equivalencies
Issue #872 closed #closed-872
Symmetry: fn:items-at → fn:get
Issue #990 closed #closed-990
Transitive closure on non-nodes
Issue #1007 closed #closed-1007
How to invert a predicate function
Issue #1030 closed #closed-1030
allow pattern matches in axis expression
Issue #1034 closed #closed-1034
QT4CG-066-xx Add note regarding absence of drop-while / skip-while
Issue #1024 closed #closed-1024
Precedence of `otherwise` operator
Issue #1031 closed #closed-1031
1024 Change precedence of 'otherwise' operator
Issue #1003 closed #closed-1003
919 Use EBV in boolean callbacks
Issue #1045 created #created-1045
Functions to manage namespace usage
Prior to saving XML generated in XQuery I often tweak the namespace usage. This makes the XML lighter and clearer for the casual reader and is sometimes mandated by users and systems. I think providing builtin solutions for these cases would ease these tasks.
Common cases are:
- Remove unused prefixes. Example: the function presented at https://stackoverflow.com/questions/23002655/xquery-how-to-remove-unused-namespace-in-xml-node
- Make a namespace the default wherever it is used. Example: functx:change-element-ns-deep($nodes,$targetns,"") See http://www.xqueryfunctions.com/xq/functx_change-element-ns-deep.html
- Remove the use of all/some namespaces. Example: BaseX https://docs.basex.org/wiki/Utility_Module#util:strip-namespaces
A somewhat related issue https://github.com/qt4cg/qtspecs/issues/266
QT4 CG meeting 067 draft agenda #agenda-02-27
Draft agenda published.
Issue #1044 created #created-1044
CSV row delimiter - allowed values
Section 15.4.2.1 says:
The row delimiter defaults to matching any of CRLF, LF, or CR. Valid values for the row delimiter are a single Unicode character, or one of CRLF, LF, or CR, that has not been marked for use as the column delimiter. Implementations must raise [err:FOCV0002] if the row-delimiter option is set to a multi-character string other than CRLF.
- It's not entirely clear to me what this is saying. Are alternative row delimiters other than newline delimiters allowed (row-delimiter:('|','/'))?
- The statement in this section doesn't align with the error conditions appearing in the actual function specs, which say: "A dynamic error [err:FOCV0002] occurs if one or more of the values for field-delimiter or quote-character are specified and are not a single character." - no mention here of the row-delimiter.
Issue #1043 created #created-1043
CSV parsing - "blank" rows
The CSV parsing specification states "A blank row is represented as an empty array.".
- It's not clear what "blank" means here. Does it depend on the whitespace-trimming option?
- It would be more logical to return an array containing a single zero-length string, since any other line containing no field delimiter is considered to contain one field.
- Alternatively, it might make sense to ignore the row entirely.
Pull request #1042 created #created-1042
1016 Editorial cleanup - csv-to-arrays
The changes here are almost entirely editorial, reordering material, removing duplication and changing some of the language for consistency with the rest of the spec. There is one substantive change - the function csv-to-simple-rows is renamed csv-to-arrays.
Fix #1016
Pull request #1041 created #created-1041
236 map:build: sequence of keys
Issue: #236
Pull request #1040 created #created-1040
485 Predeclared namespaces in XQuery: output
Issue: #485
Issue #1039 created #created-1039
Allow dynamic collations in XQuery "order by" and "group by"
I propose that in "order by" and "group by" clauses, the keyword "collation" should be followed by an expression rather than by a URILiteral.
The only problem this causes is if the expression depends on variables in the tuple stream, because obviously the collation must be selected for the tuple stream as a whole, not for each individual tuple.
We can solve this problem by amending the rules for the scope of variables bound in FLWOR expressions (§4.15.1 rule 1) so that collation expressions are excluded from the scope; or perhaps (it might be simpler) to make it a static error if the collation expression refers to a variable bound in the containing FLWOR expression.
If the syntax allows a general expression then a simple "quoted-string" will be interpreted as a StringLiteral rather than a URILiteral. As far as I'm aware the two things are syntactically and semantically identical so this isn't a problem.
Issue #1038 created #created-1038
Backwards incompatibility caused by use of EBV in callback functions
Changing fn:filter and similar functions that existed in 3.1 to use the EBV of the callback function's result has introduced a backwards incompatibility. In 3.1, the function conversion rules were used to convert the callback function's result to xs:boolean. This involves atomization. If the callback returned the untyped node <a>false</a>, this is atomised as false(), but its EBV is true().
Revealed by test case fn:filter-006.
I think it's unlikely to happen much in practice, but it's a bit nasty. Perhaps we shouldn't make the change to use EBV for functions that existed in 3.1?
Perhaps we should even consider reverting the change entirely. It's not exactly essential.
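The difference between the two rules can be made concrete with a small model. The following is only a sketch in Python, not XQuery: UntypedElement, convert_to_boolean and effective_boolean_value are hypothetical names, and only single-item results (a boolean or an untyped element) are modelled.

```python
from dataclasses import dataclass


@dataclass
class UntypedElement:
    """Stand-in for an untyped element node such as <a>false</a>."""
    text: str


def convert_to_boolean(item):
    """Model of the 3.1 rule: atomize the item, then cast it to xs:boolean."""
    if isinstance(item, bool):
        return item
    if isinstance(item, UntypedElement):
        lexical = item.text.strip()
        if lexical in ("true", "1"):
            return True
        if lexical in ("false", "0"):
            return False
        raise ValueError(f"cannot cast {lexical!r} to xs:boolean")
    raise TypeError("this sketch only models booleans and untyped elements")


def effective_boolean_value(item):
    """Model of EBV for a single item: a node is always true."""
    if isinstance(item, UntypedElement):
        return True
    if isinstance(item, bool):
        return item
    raise TypeError("this sketch only models booleans and untyped elements")


callback_result = UntypedElement("false")    # models <a>false</a>
print(convert_to_boolean(callback_result))       # atomize-and-cast rule
print(effective_boolean_value(callback_result))  # EBV rule
```

The two functions disagree on exactly the input described in the issue, which is why switching an existing function from one rule to the other is observable.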
Issue #1037 created #created-1037
fn:json-to-xml: 'number-parser' option
A function supplied via the number-parser
option of fn:json-to-xml
is now allowed to return zero or more one items (see #973). Analogous to the action
argument of fn:replace
, the result should be converted to a string by invoking fn:string
on the result. An example:
json-to-xml('-1', map { 'number-parser': abs#1 })
→ <fn:number>1</fn:number>
json-to-xml('1', map { 'number-parser': fn { true#0 } })
→ err:FOTY0013
No change is required for fn:parse-json
.
Issue #1036 created #created-1036
parse-json: liberal parsing
I noticed that some new test cases (fn-parse-json-712, fn-parse-json-716, possibly others) rely on specific liberal parsing rules.
@michaelhkay Do you think that it could make sense to formalize some of those rules, or should we rather fix the test cases?
Issue #1005 closed #closed-1005
regular expressions - whitespace
Issue #709 closed #closed-709
(Un)Checked Evaluation
Issue #459 closed #closed-459
Eager and lazy evaluation
Issue #135 closed #closed-135
Arrays' counterparts for functions on sequences, and vice versa
Issue #94 closed #closed-94
Functions that determine if a given sequence is a subsequence of another sequence
Issue #43 closed #closed-43
Support standard and user-defined composite values using item type definitions
Issue #1001 closed #closed-1001
fn:subsequence-where: equivalent `fn:slice` expression
Issue #1020 closed #closed-1020
When to apply the coercion rules
Issue #1035 created #created-1035
Add default values for parameters in constructor functions for records
We have added implicit constructor functions for named record types; we should allow the parameters in these functions to take explicit default values.
For example
declare item type my:complex as record(r as xs:double, i as xs:double := 0)
At the same time we might consider introducing fixed values, for example
declare item type my:rectangle as record(height, width, area ::= function($rect){$rect?height * $rect?width})
in which (a) the area function must NOT be supplied as an argument to the constructor function call, and (b) a map in which the area field is different from this fixed value is not a valid instance of the my:rectangle record type.
Pull request #1034 created #created-1034
QT4CG-066-xx Add note regarding absence of drop-while / skip-while
In response to comments noted in the minutes of meeting 066, and made in writing against PR #1008, add a note justifying the absence of drop-while or skip-while functions.
Pull request #1033 created #created-1033
QT4CG-066-01 Add note that whitespace and comments in regexen are lexical constructs
Adds a note explaining why whitespace and comments are not explicit in the regex grammar; see action QT4CG-066-01
Pull request #1032 created #created-1032
1029 Make argument of fn:void optional
Allows the use of fn:void#0 when required.
Fix #1029
Pull request #1031 created #created-1031
1024 Change precedence of 'otherwise' operator
Changes the precedence of the otherwise operator so that @price otherwise @cost * 2 now means @price otherwise (@cost * 2) rather than (@price otherwise @cost) * 2.
Fix #1024
QT4 CG meeting 066 draft minutes #minutes-02-20
Draft minutes published.
Issue #999 closed #closed-999
regular expression addition - comments
Issue #1022 closed #closed-1022
999 Allow comments in regular expressions
Issue #1028 closed #closed-1028
960(partial) Recognize alternative representation of JSON null
Issue #617 closed #closed-617
Implicit constructor functions for record types and union types
Issue #953 closed #closed-953
617 Define record constructors
Issue #1002 closed #closed-1002
Reinstate subsequence-before
Issue #1008 closed #closed-1008
1002 Add fn:take-while function (replacing subsequence-before)
Issue #655 closed #closed-655
fn:sort-with: Comparators
Issue #795 closed #closed-795
655 fn:sort-with
Issue #1023 closed #closed-1023
1020 explain consequences of function coercion
Issue #1025 closed #closed-1025
1001 Fix incorrect operator precedence in subsequence-where
Issue #1030 created #created-1030
allow pattern matches in axis expression
There's a danger that this already exists, and that I don't know about it, but I don't think it does.
Consider this SO question.
https://stackoverflow.com/questions/78027093/selecting-preceding-cousins-inclusing-siblings
The questioner is writing this
/root/level1/level2[@id='6']/preceding::level2[parent::level1[parent::root]][1]
eeek... look at the nasty nested predicates
when he/she wants to write this
/root/level1/level2[@id='6']/preceding::(root/level1/level2)[1]
There is an answer on the question which sort of shows how horrific the problem is in general.
(It's a problem that crops up quite a lot for me.)
Issue #1029 created #created-1029
Make argument of fn:void optional
If you want to supply a function that always returns an empty sequence, fn:void#0 would be useful; but currently there is no arity-zero variant.
Example: map:build(...., combine:=fn:void#0) returns a map in which any key that occurs more than once in the input is mapped to an empty sequence.
The first argument of fn:void should default to an empty sequence.
QT4 CG meeting 066 draft agenda #agenda-02-20
Draft agenda published.
Pull request #1028 created #created-1028
960(partial) Recognize alternative representation of JSON null
Defines an option in parse-json and json-doc to define a representation for JSON null, defaulting to () as currently used. Selecting a different value may be useful because it bypasses the problem that the ? and ?? operators flatten the results, causing () to be elided.
Suggests use of the QName fn:null as an alternative representation; and changes the JSON serialization method to recognize this QName as a representation of null.
Pull request #1027 created #created-1027
150 fn:ranks
As proposed and discussed here: https://github.com/qt4cg/qtspecs/issues/150
Issue #1026 created #created-1026
XSLT match patterns on pinned maps and arrays
Given that <xsl:apply-templates select="pin(.)??course?code"/>
will select items that are labeled with their position in the containing tree of maps and arrays, it should be possible to match the selected items with a match pattern of the form
match="?course?code"
that operates in a similar way to patterns such as course/code in XML.
Perhaps the pinning of the map should be done automatically by the xsl:apply-templates instruction.
Pull request #1025 created #created-1025
1001 Fix incorrect operator precedence in subsequence-where
Fixes the "equivalent expression" to subsequence-where.
Fix issue #1001
Issue #1024 created #created-1024
Precedence of `otherwise` operator
I made a mistake when specifying subsequence-where, caused by misunderstanding the precedence of the otherwise operator: see issue #1001.
In the expression
let $start := index-where($input, $from)[1]
otherwise count($input) + 1
I failed to realise that otherwise binds more tightly than +.
I'm opening the issue to solicit views as to whether we have got this right.
One might take the view that the closest thing to otherwise in other familiar languages is the ternary conditional operator, which has lower precedence than anything else including and and or; but then, its first operand is a boolean expression while it's relatively unlikely that the operands of otherwise will be boolean. I'm therefore thinking that it might be best to put it between 'eq' and '||', so
$a eq $b otherwise $c || $d
parses as
$a eq ($b otherwise ($c || $d))
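To see how the grouping depends on where otherwise sits in the precedence table, here is a minimal precedence-climbing sketch in Python. It is an illustration only: the numeric levels in OLD and NEW are made-up relative rankings, every operator is treated as left-associative (in XPath eq is not), and operands are single tokens.

```python
def parse(tokens, prec):
    """Parse a flat list of operand/operator tokens into a parenthesised string.

    prec maps each operator to a number; a higher number binds more tightly.
    """
    pos = 0

    def parse_expr(min_prec):
        nonlocal pos
        left = tokens[pos]  # an operand
        pos += 1
        while pos < len(tokens) and prec[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Operators at the same level group to the left.
            right = parse_expr(prec[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return parse_expr(0)


# Hypothetical rankings: OLD has otherwise binding tighter than arithmetic,
# NEW puts it between eq and || as suggested in the issue.
OLD = {"eq": 1, "||": 2, "+": 3, "*": 4, "otherwise": 5}
NEW = {"eq": 1, "otherwise": 2, "||": 3, "+": 4, "*": 5}

print(parse("$start otherwise $count + 1".split(), OLD))
print(parse("$start otherwise $count + 1".split(), NEW))
print(parse("$a eq $b otherwise $c || $d".split(), NEW))
```

Under OLD the first expression groups as an addition applied to the result of otherwise, which is the mistake described in the issue; under NEW the addition is evaluated first.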
Issue #827 closed #closed-827
map:empty, map:exists ← array:empty, array:exists
Issue #779 closed #closed-779
Hash/checksum function
Issue #978 closed #closed-978
948 Reflected the comments of the CG on the specification of scan-left and scan-right
QT4 CG meeting 065 draft minutes #minutes-02-13
Draft minutes published.
Issue #720 closed #closed-720
From Records to Objects
Issue #985 closed #closed-985
720 Add lookup arrow expressions (method invocations)
Issue #949 closed #closed-949
Partial Function Applications: Allow return of function name
Issue #972 closed #closed-972
949 Partial Function Applications: Allow return of function name
Issue #42 closed #closed-42
Relax type incompatibility in order by clause (impl. dep. instead of XPST0004)
Issue #55 closed #closed-55
Provide an XML version of the stack trace
Issue #79 closed #closed-79
fn:deep-normalize-space($e as node())
Issue #989 closed #closed-989
character sequence constructor 'a' to 'z'
Issue #994 closed #closed-994
Invoking maps & arrays: allow sequences?
Issue #1009 closed #closed-1009
QT4CG-064-03, QT4CG-064-04: Examples, Return type of `fallback`
Issue #1010 closed #closed-1010
1009 Examples, Return type of parse-json:fallback
Issue #916 closed #closed-916
720 Allow methods in maps with access to $this
Pull request #1023 created #created-1023
1020 explain consequences of function coercion
Adds explanatory material to explain my interpretation of the spec and the consequences on backwards compatibility. No change to the spec is proposed. (To review the PR, I suggest reading the change markings in the XQuery spec.)
Fix issue #1020
Pull request #1022 created #created-1022
999 Allow comments in regular expressions
Fix #999
Issue #1021 created #created-1021
Extend `fn:doc`, `fn:collection` and `fn:uri-collection` with options maps
fn:doc, fn:collection and fn:uri-collection currently expect only a single argument, a URI.
There is no way of adding additional parameters to those functions.
Several implementations of XPath have worked around that limitation by
- passing parameters via a query string as part of the URI:
  - see https://www.saxonica.com/documentation10/index.html#!sourcedocs/collections
  - exist-db's implementation of uri-collection works similarly
- creating custom functions in other namespaces to add an options map as a second parameter:
  - saxon:doc in Saxon https://www.saxonica.com/documentation10/index.html#!changes/extensions/9.7-9.8
  - fetch:doc in BaseX https://docs.basex.org/wiki/Fetch_Module#fetch:doc
While both approaches do work well, they do fall flat in terms of interoperability and discoverability.
A script written for Saxon leveraging saxon:doc will not work on BaseX and vice versa, even though they offer options with some overlap.
And a developer looking at the language specification will not discover that these options even exist.
I would like to add a second signature to the above functions with an options map as a second argument.
fn:doc($href as xs:string?) as document-node()?
fn:doc($href as xs:string?, $options as map(xs:string, *)? := ()) as document-node()?
NOTE: Looking at the other two functions below I believe the first parameter should be defined as $href as xs:string? := ()
fn:collection( $uri as xs:string? := ()) as item()*
fn:collection( $uri as xs:string? := (), $options as map(xs:string, *)? := ()) as item()*
fn:uri-collection( $uri as xs:string? := ()) as xs:anyURI*
fn:uri-collection( $uri as xs:string? := (), $options as map(xs:string, *)? := ()) as xs:anyURI*
Since a lot of those options depend on the current runtime most of them will be "free" options. This will also help us get to a specification quickly and circumvent long infighting about some very specific details.
I do see, however, a good chance of specifying a small set of options that would work across implementations.
Possible standard options
For fn:doc
- validation: whether and how to validate the input files against a schema
- whitespace: (strip-space, stripws) what to do with whitespace in the input document
- parser: could be used to define a different parser (for HTML documents)
For collection and uri-collection I see the following:
- recurse: traverse collection trees down into their subcollections
- stable: this is already vaguely mentioned in the spec and would benefit from a clearer specification
- type: (aka media-type or content-type) while the allowed values will be implementation defined, the key should be standardised
This would bring the above functions to follow a pattern developers are already familiar with (see fn:serialize and others)
Thanks for initial input by @ChristianGruen, Liam Quin and @michaelhkay
QT4 CG meeting 065 draft agenda #agenda-02-13
Draft agenda published.
Issue #1020 created #created-1020
When to apply the coercion rules
The rules for function calling say that the coercion rules are applied to the values supplied as function arguments; they are also applied in other circumstances such as when binding values to variables. The coercion rules are applied (as far as the spec is concerned) whether or not the supplied value already matches the required type.
Saxon has always attempted to optimise this process: if the supplied value is already an instance of the required type, no coercion takes place.
I have discovered at least one case where this assumption is incorrect: the coercion rules are not idempotent in the case where the supplied value matches the required type. This case concerns function coercion, exemplified by the new test case FunctionCall-058: if the expected type of a callback parameter is function(xs:integer) as xs:boolean, and the supplied value for the callback is a function that accepts xs:decimal, then the coercion rules say that a call to the supplied function that supplies an xs:decimal must be rejected as a type error even though the supplied function accepts it.
Note that this means we have introduced a rather subtle backwards incompatibility. In XQuery 3.1, coercion was not applied to variable bindings, so the following would work (the supplied function matches the declared type of the variable):
declare variable $f as function(xs:integer) as xs:boolean
:= function($x as item()) as xs:boolean {string($x)};
return $f("banana");
(see new tests VarDecl065/066)
In 4.0 I believe this is supposed to throw a type error, because the supplied function is wrapped in a wrapper function that checks that the supplied argument is an integer.
We have extended the coercion rules considerably in 4.0, and we need to be confident that there are no other similar cases where the coercion rules are no longer idempotent.
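The wrapping effect can be sketched outside XQuery. The Python model below is only an analogy under stated assumptions: coerce_function and accepts_anything are hypothetical names, Python's int stands in for xs:integer, and only the argument check (not the result check) is modelled.

```python
def coerce_function(fn, required_arg_type):
    """Wrap fn so that every call is checked against the declared parameter type.

    This models function coercion: the wrapper enforces the required type
    even when the wrapped function itself would accept the argument.
    """
    def wrapper(arg):
        if not isinstance(arg, required_arg_type):
            raise TypeError(
                f"expected {required_arg_type.__name__}, got {type(arg).__name__}"
            )
        return fn(arg)

    return wrapper


def accepts_anything(x):
    """A callback that is happy with any argument."""
    return len(str(x)) > 0


f = coerce_function(accepts_anything, int)

print(f(42))  # passes the wrapper's check and reaches the callback
try:
    f("banana")  # the callback would accept this, but the wrapper rejects it
except TypeError as err:
    print("type error:", err)
```

Skipping the wrapper when the supplied function already matches the required type changes which calls fail, which is why the optimisation described above is not safe.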
Issue #1019 created #created-1019
XQFO: Unknown option parameters
The current option parameter conventions are:
- It is not an error if the options map contains options with names other than those described in this specification. Implementations MAY attach an ·implementation-defined· meaning to such entries, and MAY define errors that arise if such entries are present with invalid values. Implementations MUST ignore such entries unless they have a specific ·implementation-defined· meaning. Implementations that define additional options in this way SHOULD use values of type xs:QName as the option names, using an appropriate namespace.
The obvious consequence is that wrongly typed or unsupported options are not reported as such:
serialize($node, map { 'format': 'html' })
I think we should still allow proprietary options, but raise errors when an option is neither defined in the specification nor supported by the given implementation. On the one hand, this will help users to spot typos (e.g., byte-order-mask instead of byte-order-mark). On the other hand, options that are supported by one implementation will be rejected by another, which feels reasonable to me, as options usually change either the result or the way the input is treated.
If we believe that this change is too disruptive, we could tolerate entries with xs:QName keys.
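A minimal sketch of the stricter check, written in Python for illustration: check_options is a hypothetical helper, SPEC_OPTIONS is a small illustrative subset of option names rather than the full list, and the vendor set is an assumption about how an implementation might register its own options.

```python
def check_options(options, spec_defined, vendor_defined=frozenset()):
    """Reject option keys that are neither spec-defined nor vendor-supported."""
    unknown = sorted(
        key for key in options
        if key not in spec_defined and key not in vendor_defined
    )
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))
    return options


# Illustrative subset only, not the complete set of serialization options.
SPEC_OPTIONS = frozenset({"method", "indent", "byte-order-mark"})

check_options({"method": "html", "byte-order-mark": True}, SPEC_OPTIONS)

try:
    check_options({"byte-order-mask": True}, SPEC_OPTIONS)  # typo in the key
except ValueError as err:
    print(err)
```

With the current "ignore unknown entries" rule the misspelled key would be dropped silently; with the check it is reported.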
Issue #1018 created #created-1018
Output of parse-csv()
I propose making some simplifications to the output of parse-csv() to make it more amenable to processing.
- Represent each row as a map, rather than as a structure with a data field and an accessor function. Note that implementations worried about memory usage can devise a custom map implementation optimised for the case where many maps have the same regular structure. (cf recent thread about Javascript "shapes")
- The key for a field in this map should be an integer if (i) column-names is set to false, or (ii) the column in question does not have a unique header name; in other cases it should be the name from the header.
- Replace the top-level columns record with a simple array of field names. It's easy enough to map names to positions using index-of.
I also propose changing the name to csv-to-maps for consistency with csv-to-table and csv-to-arrays.
We should advocate use of csv-to-arrays where data is to be accessed positionally, and csv-to-maps where it is to be accessed by column names, and optimise the design accordingly.
Looking at a use case, the first example (§15.4.7.1) would be unnecessary if, as proposed, we change csv-to-xml to generate XHTML directly. But if it were needed, it would change from
let $csv := fn:parse-csv(`name,city{$crlf}Bob,Berlin`)
return <table>
<thead>{
for $column in $csv?columns?fields
return <th>{ $column }</th>
}</thead>
<tbody>{
for $row in $csv?rows return <tr>
{ for $field in $row?fields return <td>{ $field }</td> }
</tr>
}</tbody>
</table>
to
let $csv := fn:parse-csv(`name,city{$crlf}Bob,Berlin`)
return <table>
<thead>{
for $column in $csv?columns
return <th>{ $column }</th>
}</thead>
<tbody>{
for $row in $csv?rows return <tr>
{ for $column in $csv?columns return <td>{ $row?$column }</td> }
</tr>
}</tbody>
</table>
Issue #1017 created #created-1017
Change csv-to-xml() to produce an XHTML table
I propose (a) renaming csv-to-xml as csv-to-table, and (b) changing the output to be an XHTML table. Specifically, instead of outputting
<csv xmlns="http://www.w3.org/2005/xpath-functions">
<columns>
<column>name</column>
<column>city</column>
</columns>
<rows>
<row>
<field column="name">Bob</field>
<field column="city">Berlin</field>
</row>
<row>
<field column="name">Alice</field>
<field column="city">Aachen</field>
</row>
</rows>
</csv>
it should output:
<table xmlns="http://www.w3.org/1999/xhtml">
<thead>
<tr>
<th>name</th>
<th>city</th>
</tr>
</thead>
<tbody>
<tr>
<td title="name">Bob</td>
<td title="city">Berlin</td>
</tr>
<tr>
<td title="name">Alice</td>
<td title="city">Aachen</td>
</tr>
</tbody>
</table>
Benefits:
(a) the data is just as easy to manipulate or transform as the current output
(b) it can be copied directly into HTML transformation output if required
(c) it is familiar to users
(d) we don't have to write, test, and document a schema
(e) there may well be libraries that can perform further transformations on the structure, for example conversion to other table representations, extraction to spreadsheet formats, etc.
Issue #1016 created #created-1016
Editorial comments on fn:parse-csv()
(a) The spec says:
The first argument is CSV data, as defined in ..., in the form of a sequence of xs:string values.
But in fact, the argument is a single (optional) xs:string value, not a sequence.
(b) The spec says:
If $csv is the empty sequence, implementations must return a parsed-csv-structure-record whose rows entry is the empty sequence.
If $csv is the empty sequence, but column name extraction has been requested, or explicit column names have been supplied, then the parsed-csv-structure-record returned by implementations must have a rows entry whose value is the empty sequence.
The second paragraph seems to add nothing to the first.
(c) And the phrase "implementations must return XXX" is unidiomatic. The normal form of words is "the function returns XXX".
(d) The grammar of the sentence "Handling of delimiters, and whitespace trimming, are handled using..." is inelegant.
(e) References to the record type names (such as parsed-csv-structure-record) should be hyperlinked.
Pull request #1015 created #created-1015
1013 [XSLT] Clarify effect of accumulator capture on non-element nodes
Adds a sentence saying that when an accumulator rule with capture="yes" matches a non-element node, the capture attribute has no effect.
Fix #1013
Issue #1014 created #created-1014
Predicates, sequences of numbers: Feedback
Feedback on #996:
Successful result: early exit
If the EBV is computed, and if the first item is a node, the remaining items are ignored:
'OK'[ <a/>, 1, 'x' ] → 'OK'
I would suggest doing the same for predicates that start with a number:
- If a comparison is successful, an implementation should be allowed to skip the remaining comparisons.
- If $seq starts with a number, E[$seq] and E[position() = $seq] will become equivalent:
$seq[1 to 100, 'x']
$seq[position() = (1 to 100, 'x')]
Obviously, an error still needs to be raised if a comparison leads to a type error.
Error Codes
If we don’t equate E[$seq] and E[position() = $seq], it would be useful to stick with FORG0006 (instead of XPTY0004) [1]:
- It would be confusing to get FORG0006 for E['x', 1] and XPTY0004 for E[1, 'x'].
- If a processor uses unified implementations for EBV and predicate checks, it leads to additional effort just because the error code differs.
If we use different error codes, existing tests need to be revised ([2], maybe others).
[1] https://github.com/qt4cg/qtspecs/pull/996/files#diff-b37a92a9eb3ab9ba48a00de9627a1124466b9c86ecb2b4989d04be3942c597a6R8240
[2] https://github.com/qt4cg/qt4tests/blob/70e52c690a26bbeee0641af14ccb319a2cc98081/prod/Predicate.xml#L1158-L1165
Issue #1012 closed #closed-1012
Fix some incorrect examples in the F&O spec
Issue #1013 created #created-1013
[XSLT] Need to say what happens when a capturing accumulator rule matches a non-element node
The option capture="yes" has been added to xsl:accumulator-rule; its purpose is to indicate that the entire subtree under an element is to be captured during streamed processing of the document, and is made available as an in-memory tree once the element end tag has been processed.
We need to say what happens when such a rule matches a node other than an element. I think it makes sense for the capture="yes" to be ignored, optionally with a warning.
Pull request #1012 created #created-1012
Fix some incorrect examples in the F&O spec
No issue raised; the errors are revealed by the generated QT4 tests.
Issue #1011 created #created-1011
fn:transform() improvements
- The spec talks about how to invoke an XSLT 1.0, 2.0, or 3.0 processor, but not a 4.0 processor.
- There is no way of supplying a source document in a way that allows streaming. Saxon has added a source-location parameter for this purpose; this should be in the standard.
- If the stylesheet is to read streamed input, then there also needs to be control over whether and how it does schema validation.
- When calling from XSLT, the best default for base-output-uri is probably the value of current-output-uri(). The default is currently implementation-defined, but we should recommend this possibility.
- The post-process option was added with the aspiration that it would enable secondary result documents (xsl:result-document output) to be written directly (e.g. to filestore) as a side-effect. However, it fails to achieve this. There should probably be an option to request this even though we cannot define its semantics precisely.
Pull request #1010 created #created-1010
1009 Examples, Return type of parse-json:fallback
Issue: #1009
I’ve used this PR to fix some other buggy examples in the XQFO spec.
Issue #1009 created #created-1009
QT4CG-064-03, QT4CG-064-04: Examples, Return type of `fallback`
Thanks for the attentive inspection of #975.
The type xs:untypedAtomic for the fallback function of fn:parse-json indeed made no sense: JSON escape sequences can never be converted to numbers. The return type will be xs:anyAtomicType instead of item() (the result will be converted to a string).
I’ll revise the rules and add some examples.
Issue #977 closed #closed-977
Ignore this, it's just a test
Pull request #1008 created #created-1008
1002 Add fn:take-while function (replacing subsequence-before)
Adds function fn:take-while, replacing/reinstating previously proposed items-before() and subsequence-before().
Fix #1002
Issue #1007 created #created-1007
How to invert a predicate function
It's nice to be able to write
index-where($in, contains(?, 'e'))
to select all the items that contain an 'e'.
What should we write in order to select all the items that do not contain an 'e'? All formulations seem a bit clumsy in comparison:
index-where($in, fn{not(contains(., 'e'))})
index-where($in, chain((contains(?, 'e'), not#1)))
Perhaps this is a sufficiently common requirement that it would be helpful to allow
index-where($in, inverse(contains(?, 'e')))
where inverse($predicate) is defined as fn($it, $pos){not($predicate($it, $pos))}
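The proposed combinator is easy to model outside XPath. The following Python sketch mirrors the definition given above; index_where here is a simplified stand-in for fn:index-where (1-based positions), and both function names are illustrative only.

```python
def inverse(predicate):
    """Return a predicate that negates predicate(item, pos)."""
    def negated(item, pos):
        return not predicate(item, pos)
    return negated


def index_where(items, predicate):
    """Simplified stand-in for fn:index-where: 1-based positions of matching items."""
    return [pos for pos, item in enumerate(items, start=1) if predicate(item, pos)]


def contains_e(item, pos):
    # The position is accepted but unused, like an arity-2 callback.
    return "e" in item


words = ["one", "two", "three", "four"]
with_e = index_where(words, contains_e)
without_e = index_where(words, inverse(contains_e))
print(with_e, without_e)
```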
Issue #1006 created #created-1006
regular expression addition - word boundaries
Could we provide support for the regex character sequence \b for matching word boundaries?
It’s already supported by some processors via vendor-specific flags, and would be very helpful even if it didn’t cover the full Unicode range.
Issue #1005 created #created-1005
regular expressions - whitespace
There is some confusion about the rationale for defining the multi-character escape for whitespaces in a recent discussion on Slack:
- \s is limited to [#x20\t\n\r]
- In contrast, \w covers [#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}], i.e., considers the full Unicode range
Do we know the reason?
I assume it’s both too late and out of scope to change that in our specs, but maybe we can improve the XQFO spec and…
- mention why \s does not include \p{Zs} or \p{Z}
- add an example for looking up non-breaking spaces… for example:
matches(
string-join(('my', 'pleasure'), char(0xA0)),
'\p{Z}'
)
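The difference can be reproduced outside XPath with a small Python sketch. Python's own \s is broader than the XSD definition, so the sketch spells out the four XSD whitespace characters explicitly and uses the Unicode general category as a stand-in for \p{Z}; both helpers are illustrative assumptions, not spec text.

```python
import re
import unicodedata

# The four characters that \s covers in XSD regular expressions.
XSD_WHITESPACE = re.compile(r"[ \t\n\r]")


def contains_separator(text):
    """Rough equivalent of matching \\p{Z}: any character in category Zs, Zl or Zp."""
    return any(unicodedata.category(ch).startswith("Z") for ch in text)


# "my" and "pleasure" joined by a non-breaking space (U+00A0).
text = "\u00A0".join(["my", "pleasure"])

matches_xsd_whitespace = XSD_WHITESPACE.search(text) is not None
matches_separator = contains_separator(text)
print(matches_xsd_whitespace, matches_separator)
```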
Issue #1004 closed #closed-1004
fn:char updated as agreed 2024-02-06: drop the form char('#x20')
Pull request #1004 created #created-1004
fn:char updated as agreed 2024-02-06: drop the form char('#x20')
The forms char('#32') and char('#x20') are dropped.
Issue #963 closed #closed-963
Errors in forming function items (continued)
Issue #888 closed #closed-888
Reclassify XPDY0002 as a type error
Issue #992 closed #closed-992
888, 963: Error handling for unsatisfied context dependencies
Pull request #1003 created #created-1003
919 Use EBV in boolean callbacks
Changes functions with a $predicate callback function to use the effective boolean value of the result, mainly to allow things like index-where(*, fn{self::x}).
Fix #919
QT4 CG meeting 064 draft minutes #minutes-02-06
Draft minutes published.
Issue #187 closed #closed-187
Add a 'while' clause to FLWOR expressions
Issue #943 closed #closed-943
187 Add FLWOR expression while clause
Issue #260 closed #closed-260
array:index-of
Issue #968 closed #closed-968
260 array:index-of
Issue #969 closed #closed-969
843-partial Standard, array & map functions: Equivalencies
Issue #973 closed #closed-973
fn:parse-json, fn:json-to-xml: `number-parser`, `fallback`
Issue #975 closed #closed-975
973 fn:parse-json, fn:json-to-xml: number-parser, fallback
Issue #984 closed #closed-984
959-partial Add fn:seconds function
Issue #993 closed #closed-993
989 (partial) Allow char() to take integer argument
Issue #830 closed #closed-830
Revise appendix D.4 of F+O: Illustrative user-written functions
Issue #997 closed #closed-997
830 Drop F+O appendix D.4
Issue #816 closed #closed-816
Predicates: Support for numeric sequences
Issue #996 closed #closed-996
816 Allow a predicate in a filter expression to be a sequence of numbers
Issue #995 closed #closed-995
937 revised in light of CG feedback
Issue #628 closed #closed-628
distinct-values and duplicate-values: order of results
Issue #987 closed #closed-987
628 Define result order for distinct-values and duplicate-values
Issue #911 closed #closed-911
Type "Promotion" in the coercion rules
Issue #980 closed #closed-980
911 Coercion to allow double to decimal etc
Issue #966 closed #closed-966
Rewrite spec of deep lookup operator: edits
Issue #979 closed #closed-979
966 Minor fixes to deep lookup
Issue #964 closed #closed-964
fn:has-attributes
Issue #970 closed #closed-970
XQFO: Context item → value
Issue #971 closed #closed-971
970 XQFO: Context item → value
Issue #1002 created #created-1002
Reinstate subsequence-before
There's a question on StackOverflow today:
https://stackoverflow.com/questions/77944304/
that makes me think dropping subsequence-before might have been a mistake (the replacement, subsequence-where, doesn't allow the end condition to be exclusive).
The question is how to find all the consecutive list elements that follow a given para element. That would be solved with subsequence-before(following-sibling::*, fn{not(self::list)}). Doing it with subsequence-where is much harder - you need to drop the final element in the result if it is not a list element, while also taking into account that the result might be empty.
I would like to propose reinstating subsequence-before; or perhaps inverting the predicate and naming it subsequence-while(), so it becomes subsequence-while(following-sibling::*, fn{self::list}) assuming we accept the proposal in issue #919 to allow a callback predicate to use EBV.
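The "take items while a condition holds" semantics requested here is what Python's itertools.takewhile provides. The sketch below models the following siblings as a list of element names (an assumption made for brevity) and shows that the end condition is exclusive and that an empty result needs no special handling.

```python
from itertools import takewhile


def leading_lists(sibling_names):
    """Return the run of consecutive 'list' names at the start of the input."""
    return list(takewhile(lambda name: name == "list", sibling_names))


# Element names of the siblings that follow a given para element.
run = leading_lists(["list", "list", "para", "list"])
empty = leading_lists(["para", "list"])
print(run, empty)
```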
Issue #1001 created #created-1001
fn:subsequence-where: equivalent `fn:slice` expression
I probably should have tagged #940 with »Request Changes«, as I believe the equivalent expression with fn:slice needs to be fixed (or removed if it turns out to be too quirky): https://github.com/qt4cg/qtspecs/pull/940#issuecomment-1919399348.
QT4 CG meeting 064 draft agenda #agenda-02-06
Draft agenda published.
Issue #940 closed #closed-940
878 Add subsequence-where function
Issue #1000 created #created-1000
XQFO Code in the Rules sections
In #978, it’s being discussed what is the best language for presenting code in the Rules sections of the XQFO specification. Currently, XPath is used for compact equivalencies, for example…
(: array:size :)
count(array:members($array))
(: fn:remove :)
$input[not(position() = $positions)].
...while XQuery is used for more complex expressions, including function declarations, or when the XPath representation would be syntactically more complex. Examples:
(: fn:deep-equal :)
declare function equal-strings(
$string1 as xs:string,
$string2 as xs:string,
$collation as xs:string,
$options as map(*)
) as xs:boolean {
let $n1 := if ($options?whitespace = "normalize")
then normalize-unicode(?, $options?normalization-form)
else identity#1
let $n2 := if ($options?normalize-space)
then normalize-space#1
else identity#1
return compare($n1($n2($string1)), $n1($n2($string2)), $collation) eq 0
}
(: fn:index-where :)
for $item at $pos in $input
where $predicate($item, $pos)
return $pos
(: …flatten, fold-left, while-do, others :)
Finally, we have many cases in which XPath/XQuery code is omitted, either because the presented feature is basic enough, because the equivalent code would get too complicated, or (e.g., for fn:doc) because it does not provide means to express the feature.
We should strive for consistency and decide which language(s) the majority of us believes is the best choice…
- XPath & XQuery (what we currently have)
- XPath only
- XPath, XQuery and XSLT (whatever seems most appropriate)
- Other pseudocode
- Don’t use pseudocode at all if it is too complex to be represented with moderately simple XPath code
Issue #999 created #created-999
regular expression addition - comments
The original Perl regular expression syntax allows comments with the x flag. They use # to introduce comments up to a newline.
Maybe we could support XPath-style comments in regular expressions, such as (:#.......#:) when the x flag is present?
Today i use (?:comment: stuff here )? but this requires that "stuff here" can be compiled into a regular expression!
Issue #998 created #created-998
regular expression addition - lookbehind assertions and lookahead assertions
look-ahead assertions are i think the most useful things not found in qt regular expressions, and also look-behind.
This lets you do things like
replace( ., '
/ ( [^/]+ ) (*positive_lookahead: /)
', '...', 'x')
replacing components between /..../ but not consuming the trailing /, so that /a/b/c/d/ comes out as /../../../../
Perl uses (?=pattern), (*pla:pattern), (*positive_lookahead:pattern) (?!pattern), (*nla:pattern), (*negative_lookahead:pattern) to match only if the current position is (or is not) followed by a match to pattern,
and (?<=pattern), \K, (*plb:pattern), (*positive_lookbehind:pattern) (?<!pattern), (*nlb:pattern), (*negative_lookbehind:pattern) for zero-width look-behind assertions.
Note, libpcre (and older Perl versions) restrict lookbehind assertions to fixed length. You can write (?<=dog|cat) food to match " food" preceded by "dog" or "cat", but you cannot write (?<=dogs?|cats?) barking.
\C is also forbidden, as are capturing subgroups. But the facility is still very useful, and reduces the need for repeated substitutions.
I propose adding only the first form in each case, not the newer "*" forms, which are less widely supported.
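Python's re module supports the first form of each assertion, so it can serve as a quick illustration of the behaviour described above. The replacement string "/.." is an assumption chosen to reproduce the /../../../../ result mentioned in the example; Python's fixed-width lookbehind restriction is similar to the libpcre one.

```python
import re

# Replace each path component, but only look ahead at the trailing "/"
# so that it is not consumed and can start the next match.
collapsed = re.sub(r"/[^/]+(?=/)", "/..", "/a/b/c/d/")

# Fixed-width lookbehind: alternatives of equal length are accepted.
fixed = re.search(r"(?<=dog|cat) food", "cat food")

# Variable-width lookbehind is rejected when the pattern is compiled.
try:
    re.compile(r"(?<=dogs?|cats?) barking")
    variable_width_accepted = True
except re.error:
    variable_width_accepted = False

print(collapsed, fixed.group(0) if fixed else None, variable_width_accepted)
```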
Pull request #997 created #created-997
830 Drop F+O appendix D.4
This PR drops the non-normative appendix D.4, which contained illustrative user-written functions. It was a very patchy and disorganised collection of functions which were primarily there because someone had proposed adding a function to the standard library and the WG had turned down the suggestion on the grounds that users could easily write the function themselves. It's not worth the effort of rewriting the appendix to take 4.0 enhancements into account.
Fix #830
Pull request #996 created #created-996
816 Allow a predicate in a filter expression to be a sequence of numbers
Fix #816
Pull request #995 created #created-995
937 revised in light of CG feedback
This revises #937 (catalyzed by #779) in light of the CG discussion that approved the PR.
- Output is now xs:hexBinary?
- Second parameter $algorithm replaced with an option map. In the specs I avoided fos:values/fos:value, because this would disallow case/space normalization, and it would effectively disallow any implementation-defined algorithms not on the list of three algorithms, and trigger the dynamic error described in rule 6 in the option parameter conventions.
- Options map has only one option. It doesn't make sense to provide an option changing the kind of output.
- I think this is the first example of an options map where the fos:meaning has rich text (paragraphs, unordered lists). It builds and renders fine locally.
- Extra note on the output format.
- Examples are expressed as chained functions, to illustrate how to get the customary string values.
Issue #994 created #created-994
Invoking maps & arrays: allow sequences?
Can’t we support integer sequences as arguments in dynamic function calls on maps and arrays?
The following query is already valid…
[ 3, 4, 5 ] ? (2, 1)
…but the number of users who are able to decode this syntax is very limited. It would be easier to allow:
[ 3, 4, 5 ](2, 1)
I would expect the results to be returned in the supplied order (4 and 3).
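The expected behaviour can be sketched in Python as a lookup that accepts a sequence of 1-based positions and returns the members in the order requested. The function name lookup and the choice of IndexError for out-of-range positions are assumptions of this sketch, not part of the proposal.

```python
def lookup(array, positions):
    """Return the members at the given 1-based positions, in the supplied order."""
    results = []
    for pos in positions:
        if not 1 <= pos <= len(array):
            raise IndexError(f"position {pos} is out of range 1..{len(array)}")
        results.append(array[pos - 1])
    return results


selected = lookup([3, 4, 5], [2, 1])
print(selected)
```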
Pull request #993 created #created-993
989 (partial) Allow char() to take integer argument
Addresses the use case in issue #989. (But leave the issue open for now).
Discussion point: should we drop the options char("#32")
and char("#x20")
as they now seem redundant?
Pull request #992 created #created-992
888, 963: Error handling for unsatisfied context dependencies
Fix #963 by providing more detail on the expected error handling for partial function application.
Fix #888 by making XPDY0002 a type error rather than a dynamic error.
Issue #991 created #created-991
Invisible-xml - missing details
The spec for invisible-xml doesn't say whether the parsing function returns an element node or a document node.
It should also say, for completeness, that the parsing function is "nondeterministic with respect to node identity" - that is, if you parse the same input twice, it's undefined whether you get the same node twice, or different nodes.
Issue #990 created #created-990
Transitive closure on non-nodes
In PR #988 I inadvertently used the transitive-closure function to process non-nodes; that's not supported by the current specification of the function.
The only difficulty in extending it is how to define a suitable identity comparator so we know when to terminate. Probably this should be done using a callback, defaulting to op('is'). In the use case of PR #988, the comparator could be supplied as false#0 - the step function is acyclic, so we can treat all items reached as distinct.
Issue #989 created #created-989
character sequence constructor 'a' to 'z'
Although you can write 'a' to 'z' as
(string-to-codepoints('a') to string-to-codepoints('z')) ! codepoints-to-string(.)
i’m not sure this is easily discoverable.
Note, 'a' to 'z' is obviously dependent on the current collation - if you're using EBCDIC may the dogs help you. I’m assuming, however, any two Unicode characters could appear in the string literals, and it’d be an error to have more than one character in either string.
This means 'ċ' to 'ŗ' would be an error, not equivalent to 'c' to 'r' (taking the first character of each string), since those are not precomposed forms.
Mostly, i tend to write this because of other languages - e.g. Perl has 'a' .. 'ÿ' or whatever, as does Ruby.
ICU in Python has UnicodeSet('[[:Ll:]&[:Latin:]]') which is powerful but grokhard (& here is intersect i think).
Although 'a' to 'z' is probably what i've seen & used most often, 'a' to 'f' and '0' to '9' are also obvious candidates.
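For comparison, the codepoint-based reading of 'a' to 'z' can be sketched in Python. The helper name char_range is hypothetical, and the single-character check mirrors the assumption stated above that more than one character in either string would be an error.

```python
def char_range(first, last):
    """Return the characters from first to last (inclusive), ordered by codepoint."""
    if len(first) != 1 or len(last) != 1:
        raise ValueError("both arguments must be single characters")
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]


hex_letters = char_range("a", "f")
digits = "".join(char_range("0", "9"))
print(hex_letters, digits)
```

A decomposed sequence such as "c" followed by U+0307 has length 2 in Python, so it is rejected by the length check, in line with the error case described above.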
Pull request #988 created #created-988
960 Pinned and labeled values
This PR introduces the concepts of pinned and labelled values and the way in which they can be used to obtain additional information about the results of a deep lookup or map/array navigation operation. The changes at this stage are confined to a new section in the data model spec introducing the concept of labeled items, and a new section in the XPath language spec showing how these are used when navigating maps and arrays. This is a first step; if the WG approves of the general approach, there will be a lot more detail to add in due course.
The PR addresses a number of open issues:
- Issue #960 - flattening of results from $map??KS
- Issue #711 - using annotations for navigation of JSON trees
- Issue #596 - pinned values: transforming trees
- Issue #350 - CompPath (composite objects path) expressions
- Issue #334 - Transient properties: selection and update in maps and arrays
- Issue #262 - navigation in deep-structured arrays
- Issue #108 - template match using values of tunnel parameters
It does not claim to resolve them all, but I believe it provides the groundwork for doing so.
Pull request #987 created #created-987
628 Define result order for distinct-values and duplicate-values
Fix #628
Issue #986 created #created-986
Numeric Comparisons
We've been trying to change the semantics of numeric comparison without breaking existing applications. As a result, the current status quo is very messy. Let's review where we are.
The eq/lt operators, given mixed operand types, convert decimal operands to double and compare as double. No change from 3.1. This comparison is not transitive in edge cases. The = and < operators are defined in terms of eq and lt.
Map key comparisons compare as "infinite precision decimal". No change from 3.1. This comparison is now exposed as fn:atomic-equal().
deep-equal() refers to atomic-equal(), which is a change in behaviour from 3.1.
distinct-values() refers to deep-equal(), which is a change in behaviour - deliberate, because it needs to be transitive.
index-of() refers to eq. No change from 3.1.
compare() has been newly introduced; like atomic-equal() it uses infinite precision decimal for comparison.
sort() uses compare(). This is a change from 3.1; again needed because transitivity is important.
min() and max() use compare(). This is a change from 3.1.
The new highest() and lowest() functions use sort().
XSLT for-each-group refers to distinct-values().
XSLT xsl:sort currently refers to numeric-compare() and will presumably change to use compare().
XSLT xsl:merge refers to xsl:sort
XQuery "group by" refers to deep-equal()
XQuery "order by" refers to compare()
So:
- Nearly everything now uses decimal comparison where the operands are of mixed type
- There are many different ways that we say this - it's often indirect. There are only two comparison methods, but you have to follow a chain of references to work out which one applies.
- The two exceptions that still do comparison the 3.1 way (converting both operands to xs:double) are (a) the eq/lt/=/< operators, and (b) the index-of and array:index-of functions.
There are definitely things that now break. For example I was working on tests yesterday with assertions in the form deep-equal(nodes/number(.), (8.2, 5.4, 6.5)) - that is, comparing doubles to decimals. The nodes actually contain the strings "8.2", "5.4", "6.5". The test was failing because converting the string "8.2" to a double and then converting the double to a decimal does not produce the decimal value 8.2.
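The failure can be reproduced with Python's float and decimal types, which behave like xs:double and xs:decimal for this purpose. The sketch only illustrates the two comparison strategies; it is not an implementation of deep-equal or atomic-equal.

```python
from decimal import Decimal

as_double = float("8.2")       # like number("8.2"), an xs:double
as_decimal = Decimal("8.2")    # like the xs:decimal literal 8.2

# 3.1-style comparison: promote the decimal to double, then compare.
equal_as_double = as_double == float(as_decimal)

# Exact comparison: convert the double to its precise decimal value first.
exact_value = Decimal(as_double)
equal_as_decimal = exact_value == as_decimal

print(equal_as_double, equal_as_decimal)
print(exact_value)
```

The exact decimal expansion of the double is slightly below 8.2, which is why the precise comparison reports the two values as different.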
This mixed bag really doesn't seem acceptable. What options do we have?
- Be bold: make everything uniformly use transitive comparisons, and accept that some user code will break.
- Be timid: use transitive comparisons only where it really matters (distinct-values, grouping, sorting) and use promotion to double everywhere else.
- Compromise: introduce a compatibility mode, or a context option that allows users to control the behaviour, or another set of comparison operators.
Any other ideas?
Pull request #985 created #created-985
720 Add lookup arrow expressions (method invocations)
Fix #720.
Replaces #916.
Issue #948 closed #closed-948
fn:scan-left and fn:scan-right - produce accumulation of results
Pull request #984 created #created-984
959-partial Add fn:seconds function
Issue: #959
Issue #983 created #created-983
fn:reduce (or fn:fold without initial value)
Various languages (Kotlin, F#, Haskell, Rust, Scala, others) offer two functions for what we summarize as folds: one that accepts an initial value and another one that consumes the first item of the input as initial value. The first function is usually called fold, the latter is called reduce, but some languages (like JavaScript) pack the functionality into a single function.
We have the same options:
- We could tweak fn:fold-left and fn:fold-right in a way that the zero argument is ignored if it’s not explicitly supplied:
fold-left(1 to 5, action := op('*'))
However, the behavior would then differ from fold-left(1 to 5, (), action := op('*')), which is something we tried to avoid in more recent functions (there are functions like fn:name, though, that behave similarly: fn:name(()) and fn:name() do something different).
- We could also introduce a separate function fn:reduce (or 2 variants, resp. 4 if we include arrays):
fn:reduce(
$input as item()*,
$action as function(item()*, item()) as item()*
) as item()*
This would allow us to do reduce(1 to 5, op('*')), and for some people, a reduce function will be more familiar than a fold.
On Wikipedia, there’s a good summary on folds in various languages.
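Python is one of the languages that packs both variants into a single function: functools.reduce takes an optional initial value. The sketch below shows the two call forms and the one case where they differ, an empty input.

```python
from functools import reduce
from operator import mul

numbers = range(1, 6)  # 1 to 5

# "reduce" style: the first item of the input serves as the initial value.
product_without_zero = reduce(mul, numbers)

# "fold" style: an explicit initial value is supplied.
product_with_zero = reduce(mul, numbers, 1)

# With an empty input, only the fold style has a defined result.
empty_with_zero = reduce(mul, [], 1)
try:
    reduce(mul, [])
    empty_without_zero_fails = False
except TypeError:
    empty_without_zero_fails = True

print(product_without_zero, product_with_zero, empty_with_zero, empty_without_zero_fails)
```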
Issue #982 created #created-982
Add position argument to scan-left and scan-right
We have added an optional position argument to nearly all callback functions that are invoked once for each item in a sequence. This argument is omitted from the new scan-left and scan-right functions. It should be added for consistency.
One of the proposed use cases for scan-left and scan-right is for debugging calls on fold-left and fold-right. This use case requires that the callback functions in the two cases are compatible.
Background note: the optional position argument is modelled on Javascript, where it is permitted in all the common higher-order functions such as filter, forEach, reduce and reduceRight (which are the Javascript equivalent of fold-left and fold-right). Javascript doesn't appear to offer an equivalent of scan-left and scan-right.
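The debugging use case depends on both functions accepting the same callback. A minimal Python sketch, assuming a callback of the form action(accumulator, item, position) with 1-based positions, shows that the last value of the scan is the result of the fold; fold_left and scan_left are simplified models, not the specified functions.

```python
def fold_left(items, zero, action):
    """Fold from the left; action receives (accumulator, item, 1-based position)."""
    acc = zero
    for pos, item in enumerate(items, start=1):
        acc = action(acc, item, pos)
    return acc


def scan_left(items, zero, action):
    """Like fold_left, but return every intermediate accumulator, starting with zero."""
    results = [zero]
    for pos, item in enumerate(items, start=1):
        results.append(action(results[-1], item, pos))
    return results


def weighted_sum(acc, item, pos):
    # Uses the position argument, so it only works if both functions supply it.
    return acc + item * pos


total = fold_left([10, 20, 30], 0, weighted_sum)
trace = scan_left([10, 20, 30], 0, weighted_sum)
print(total, trace)
```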
Issue #981 created #created-981
Identify optional arguments in callback functions
It was pointed out today that it is not obvious, looking at a function signature like
fn:filter(
$input as item()*,
$predicate as function(item(), xs:integer) as xs:boolean
) as item()*
that the second argument of the $predicate function is optional.
At least in the documentation, it would be useful to capture this in some way. Being "optional" here means that it makes sense, semantically, to supply an arity-1 function, in which case the caller will not supply the second argument.
Perhaps it would also be useful to go beyond documentation, and attach some syntax and semantics to it. Specifically, if the signature of the callback function indicates that the first N arguments are required, then supplying a function item of arity less than N will result in a type error.
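The "optional second argument" convention can be modelled in Python by inspecting the arity of the supplied callback. This is a sketch of the calling convention only; filter_items and call_with_optional_position are hypothetical names, and raising TypeError for other arities is an assumption.

```python
import inspect


def call_with_optional_position(callback, item, pos):
    """Call callback(item) or callback(item, pos), depending on its arity."""
    arity = len(inspect.signature(callback).parameters)
    if arity == 1:
        return callback(item)
    if arity == 2:
        return callback(item, pos)
    raise TypeError(f"callback must accept 1 or 2 arguments, not {arity}")


def filter_items(items, predicate):
    """Simplified model of fn:filter with an optional 1-based position argument."""
    return [
        item
        for pos, item in enumerate(items, start=1)
        if call_with_optional_position(predicate, item, pos)
    ]


by_value = filter_items(["a", "b", "c", "d"], lambda item: item != "b")
by_position = filter_items(["a", "b", "c", "d"], lambda item, pos: pos > 2)
print(by_value, by_position)
```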
Pull request #980 created #created-980
911 Coercion to allow double to decimal etc
Fix #911. The coercion rules are changed to allow implicit casts among numeric types, for example a double can be supplied where the required type is decimal. (The term "promotion" is now used only for operators, where the two operands must be converted to a common type.)
Pull request #979 created #created-979
966 Minor fixes to deep lookup
Fix #966
Apply suggestions in Christian Grün's comments on PR #927.
Pull request #978 created #created-978
948 Reflected the comments of the CG on the specification of scan-left and scan-right
Reflected the comments of the CG on the specification of scan-left and scan-right
Pull request #977 created #created-977
Ignore this, it's just a test
I'm trying to work out why sometimes PR succeeds when it contains markup errors...
Issue #976 closed #closed-976
Fix markup errors with fos:notes
Pull request #976 created #created-976
Fix markup errors with fos:notes
QT4 CG meeting 063 draft minutes #minutes-01-30
Draft minutes published.
Issue #957 closed #closed-957
948 Added fn:scan-left and fn:scan-right
Issue #965 closed #closed-965
XQFO: minor edits and bug fixes
Pull request #975 created #created-975
973 fn:parse-json, fn:json-to-xml: number-parser, fallback
Issue: #973.
In addition, I fixed the description for the fallback option of fn:parse-json, as it seemed incomplete to me:
The function is called when the JSON input contains a special character (as defined under the escape option) that is valid according to the JSON grammar, whether the special character is represented in the input directly or as an escape sequence.
Issue #974 closed #closed-974
Rules for context-dependent function references in XSLT (e.g. regex-group#1)
QT4 CG meeting 063 draft agenda #agenda-01-30
Draft agenda published.
Issue #974 created #created-974
Rules for context-dependent function references in XSLT (e.g. regex-group#1)
I'm not sure where we are on this one.
Does regex-group#1 capture the "current matching substrings" component of the dynamic context?
XSLT 4.0 test case analyze-string-101 suggests that it does, and that this represents a change from 3.0 -- there are separate versions of the test with different expected results for the two cases.
But §5.4 of the XSLT 4.0 spec still ends with the sentence:
This rule does not extend to the XSLT extensions to the dynamic context defined in this section. If a dynamic function call is made that depends on the XSLT part of the dynamic context (for example, regex-group#1(2)), then the relevant components of the context are cleared as described in the table above.
I suspect this sentence should have been deleted, but need to track down the history.
Issue #973 created #created-973
fn:parse-json, fn:json-to-xml: `number-parser`, `fallback`
- See #33: number-parser option needs to be added to fn:json-to-xml.
- Similar to fn:replace, I would suggest using function(xs:untypedAtomic) as item()? as signature for both the fallback and the number-parser option, and (for fallback) to invoke fn:string#1 on the result. This way, explicit casts in the code become obsolete. Queries like the following one…
parse-json(
'-123',
map { 'number-parser': fn($n) { $n => number() => abs() } }
)
…can then be simplified to parse-json('-123', abs#1).
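Python's json module has a comparable hook mechanism, which may help to picture the number-parser option: the parse_int and parse_float arguments of json.loads receive the lexical form of each number as a string. This is only an analogy; the Python hooks do not coerce their input the way the proposed function(xs:untypedAtomic) signature would.

```python
import json
from decimal import Decimal

# The hook receives the lexical form "-123" and must convert it itself.
absolute = json.loads("-123", parse_int=lambda lexical: abs(int(lexical)))

# A hook can also preserve precision that a double would lose.
precise = json.loads("[1.10, 2.5]", parse_float=Decimal)

print(absolute, precise)
```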
Pull request #972 created #created-972
949 Partial Function Applications: Allow return of function name
Issue: #949. The major change is the rule for the name property of partial function applications for static function calls.
In addition, I have unified the presentation of the different function item expressions.
Affects test cases like xqhof40 (which need to be fixed with or without this PR).
Pull request #971 created #created-971
970 XQFO: Context item → value
Issue: #970
Issue #970 created #created-970
XQFO: Context item → value
Resulting from #129 (related: #755): Many rules in the XQFO spec still refer to the context item, which is currently defined as follows:
When the context value is a single item, it can also be referred to as the context item; when it is a single node, it can also be referred to as the context node.
We need to make clear what’s going to happen if the context value is not a single item:
- In many cases, this can simply be done by replacing “context item” with “context value”.
- In some cases (e.g. for fn:string#0), we should specify that an error is raised (usually XPTY0004) if the input is not a single item.
Issue #878 closed #closed-878
Proposed extension to subsequence
Pull request #969 created #created-969
843-partial Standard, array & map functions: Equivalencies
Issue: #843
Maybe we should keep the issue open after merging this PR.
Pull request #968 created #created-968
260 array:index-of
Issue: #260
Issue #967 created #created-967
XPath Appendix I: Comparisons
Adopted from https://github.com/qt4cg/qtspecs/issues/260#issuecomment-1908033129
It would be useful if all functions that perform comparison included a cross-reference to the new XPath Appendix I; and I note that appendix doesn't seem to mention index-of.
@michaelhkay I’ve taken the liberty of assigning this to you, as I wasn’t sure what this is about.
Issue #874 closed #closed-874
878 Proposed extension to subsequence
Issue #966 created #created-966
Rewrite spec of deep lookup operator: edits
@michaelhkay I've created a little new issue to keep track of the 3 minor suggestions that I made in the PR that we merged today: https://github.com/qt4cg/qtspecs/pull/927
Pull request #965 created #created-965
XQFO: minor edits and bug fixes
Issue #818 closed #closed-818
Foxpath integration
Issue #693 closed #closed-693
QT4 Tests without counterpart in the specs
Issue #639 closed #closed-639
fn:void: Naming, Arguments
Issue #937 closed #closed-937
779 hash function
Issue #946 closed #closed-946
fn:iterate-while → fn:while-do, fn:do-until
Issue #962 closed #closed-962
946 fn:iterate-while → fn:while-do, fn:do-until
Issue #951 closed #closed-951
Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-with-id
Issue #958 closed #closed-958
951 Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-id
Issue #945 closed #closed-945
Module import: apparent contradiction
Issue #952 closed #closed-952
945 module import contradiction
Issue #950 closed #closed-950
Minor edits (examples, rules)
Issue #939 closed #closed-939
Remove fn:numeric-compare
Issue #941 closed #closed-941
939 Remove fn:numeric-compare
Issue #936 closed #closed-936
877 revised rules for op:binary-less-than
Issue #861 closed #closed-861
Precise meaning of $E??KS
Issue #927 closed #closed-927
861 Rewrite spec of deep lookup operator
QT4 CG meeting 062 draft minutes #minutes-01-23
Draft minutes published.
Issue #964 created #created-964
fn:has-attributes
Trivial (motivated by a user request):
As there is an fn:has-children
function, it seems surprising that there is no fn:has-attributes
function.
I would suggest…
- adding this function to the spec, or
- indicating in a note (for
fn:has-children
) why this function is missing.
Issue #963 created #created-963
Errors in forming function items (continued)
In #894, the following rule was added to the definition of Named Function References:
An error is raised if the identified function depends on components of the static or dynamic context that are not present, or that have unsuitable values. […]
DC0001 is raised for the call fn:id#1 if the context item is not a node in a tree that is rooted at a document node.
We should be consistent and add this rule to Partial Function Applications and Inline Function Expressions as well. Perhaps such rules could be defined just once for all affected function item constructors in the parent section?
Also, the error code doesn’t seem to be properly defined in the spec: it shows [ERROR errorref DC0001 NOT FOUND] (maybe that’s intentional at this editing stage).
QT4 CG meeting 062 draft agenda #agenda-01-23
Draft agenda published.
Pull request #962 created #created-962
946 fn:iterate-while → fn:while-do, fn:do-until
My first thought was to name the second function fn:do-while, but fn:do-until with an inverted predicate seemed more appropriate to me.
Issue: #946
Issue #961 created #created-961
Simulating Objects: Performance
Related to #953, #917 and #916, I wonder whether we are aware enough of the essential differences when we think of objects in a functional language:
- Mutable objects are extremely efficient, as an update is a simple main-memory value change.
- Immutable data structures need to be fully copied if a single value changes. As a result, the update of a map with, let’s say, 1 string and 50 functions would be a new map with 1 string and 50 functions. Even with efficient immutable map implementations that we have, I doubt that it makes sense to create full copies with 1+50 entries, of which only 1 string will be different.
- Imagine a FLWOR expression that creates 1000 of such maps, with possibly 1 value that’s different in each instance. We don’t need 1000 copies of 50 functions; the memory consumption would be much smaller if we only stored relevant values.
This thread is not about premature optimization; I just want to be sure we think about the obstacles when using maps for objects. Maybe the solutions are already on the horizon; maybe we could tackle some of the concerns with the definition of default values…
declare record person(
name as xs:string,
title := (),
full := fn { string-join((?title, ?name), ' ') }
);
…and maps with type annotations. If we don’t materialize defaults, the embedded annotation would indeed need to affect functions like map:get, as questioned by Michael in https://github.com/qt4cg/qtspecs/pull/953#issuecomment-1896078605.
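The idea of not materializing defaults can be illustrated outside XQuery. The following Python sketch is only an analogy, not a proposal for the spec: it assumes a hypothetical `DEFAULTS` table and `make_person` helper, and uses `collections.ChainMap` so that each instance stores only the fields that differ, while lookups fall back to one shared set of defaults.

```python
from collections import ChainMap

# Hypothetical "method" stored once; it receives the record it is called on.
def full(person):
    parts = (person["title"], person["name"])
    return " ".join(p for p in parts if p)

# Hypothetical defaults, shared by every record instead of being copied.
DEFAULTS = {"title": None, "full": full}

def make_person(**fields):
    # Only explicitly supplied fields are stored per instance;
    # missing keys are resolved against DEFAULTS at lookup time.
    return ChainMap(fields, DEFAULTS)

people = [make_person(name=f"Person {i}") for i in range(1000)]
titled = make_person(name="Ada", title="Dr")

print(people[0]["full"](people[0]))  # uses the shared default for "title"
print(titled["full"](titled))        # uses the instance's own "title"
```

In this sketch the 1000 instances each hold a single entry of their own; whether an XQuery processor could apply a comparable strategy to annotated maps is exactly the open question raised above.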
Issue #960 created #created-960
Should ??KS flatten the results
Currently the result of ??KS
(like ?KS
) is flattened. So if you do ??dimensions
and the value of each dimensions
entry is a sequence of zero or more numbers, the result munges them all together into a single sequence (dropping any empty values in the process).
Should we change this, for example to return a sequence of arrays, or an array of sequences?
This makes life a bit more difficult in the simple case where all the values are singletons -- and notably, when constructing a path such as ??A??B??C
-- but it makes it possible to handle the more general case where they aren't all singletons.
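The information loss described here can be sketched in Python. This is an illustration under stated assumptions, not the spec's definition of the operator: maps are modelled as `dict`, arrays as `list`, sequences as `tuple`, and `deep_lookup` is a hypothetical helper that returns one entry per matching key.

```python
def deep_lookup(value, key):
    """Yield the value stored under `key` at any depth, one entry per match."""
    if isinstance(value, dict):
        for k, v in value.items():
            if k == key:
                yield v
            # Keep searching inside the value, whether or not the key matched.
            yield from deep_lookup(v, key)
    elif isinstance(value, (list, tuple)):
        for member in value:
            yield from deep_lookup(member, key)

data = {
    "box": {"dimensions": (2, 3)},
    "point": {"dimensions": ()},
    "line": {"dimensions": (7,)},
}

# One entry per match: the grouping and the empty value are preserved.
per_match = list(deep_lookup(data, "dimensions"))

# Flattened, as the current rules do: the boundaries between matches are lost.
flattened = [n for match in per_match for n in match]

print(per_match)  # [(2, 3), (), (7,)]
print(flattened)  # [2, 3, 7]
```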
Issue #959 created #created-959
Milliseconds ↔ xs:dayTimeDuration, Unix time ↔ xs:dateTime
We should extend the constructor functions to convert integers (milliseconds, and Unix time counted from 1970-01-01T00:00:00Z) to xs:dayTimeDuration and xs:dateTime instances…
xs:dateTime(12345),
xs:dayTimeDuration(12345)
…and it should be possible to convert the values back to integers.
Related: https://docs.basex.org/wiki/Conversion_Module#Dates_and_Durations
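As a point of comparison, the same round trip can be sketched in Python. The function names are hypothetical, and the sketch assumes that both integers are millisecond counts (the issue does not say whether Unix time is meant in seconds or milliseconds).

```python
from datetime import datetime, timedelta, timezone

# Assumption: Unix time is counted in milliseconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)

def ms_to_duration(ms):
    return timedelta(milliseconds=ms)

def duration_to_ms(duration):
    # Floor division of two timedeltas yields an integer.
    return duration // ONE_MS

def ms_to_datetime(ms):
    return EPOCH + timedelta(milliseconds=ms)

def datetime_to_ms(moment):
    return (moment - EPOCH) // ONE_MS

print(ms_to_duration(12345))   # 0:00:12.345000
print(ms_to_datetime(12345))   # 1970-01-01 00:00:12.345000+00:00
```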
Pull request #958 created #created-958
951 Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-id
Editorial (#951): Reverts the changes made in #901 for 4 context-dependent functions.
Pull request #957 created #created-957
948 Added fn:scan-left and fn:scan-right
As discussed at https://github.com/qt4cg/qtspecs/issues/948
Pull request #956 created #created-956
850-partial Editorial improvements to parse-html()
Related to issue #850, but doesn't close it entirely.
Issue #955 created #created-955
Options parameters as record types
In the new parse-html() function, the content of the options parameter is described using a record type. This differs from other functions, which describe the type as map(*) and have a statement that "the option parameter conventions apply".
Ideally we should use record types for all options parameters. However we need to check carefully that this does not affect edge-case compatibility, for example implicit conversions or the acceptability of extensions. If we can't do that then we should bring parse-html() into line with other functions such as parse-json().
Issue #954 created #created-954
Establish a default value for the XSLT fixed-namespaces attribute
The newly-defined fixed-namespaces attribute on the xsl:stylesheet
element is a huge positive step towards improving programmer's productivity by removing the need to provide up to 9 namespace declarations in every stylesheet module, thus reducing unnecessary cluttering, simplifying and slimming the code and increasing its readability.
It seems like an accidental omission that the current text doesn't specify a default value for this attribute. If there is a well-chosen default value, this would even further decrease the requirements for the programmer to engage in such a non-problem-solving activity as entering memorized strings, and would prevent errors such as either not providing the correct values for the namespace-uris or forgetting to specify this new attribute.
One obvious candidate for a default value of the fixed-namespaces attribute is #standard
, which means that without having to press even a single additional key, the XSLT programmer gets all standard namespaces automatically bound to the well-known prefixes:
xsl
xml
xs
xsi
fn
math
map
array
err
Proposal:
Please, augment the current text by specifying that the default value for the fixed-namespaces attribute is #standard
Pull request #953 created #created-953
617 Define record constructors
Fix #617
Note that this is a first step. Noticeably we can't yet use these constructor functions to create records that have methods. It's nevertheless a big step forward.
Pull request #952 created #created-952
945 module import contradiction
Some editorial clarifications regarding XQuery module import and schema import.
Fix #945
Issue #928 closed #closed-928
Minor edits through ch. 15
Issue #947 closed #closed-947
Reorganise F+O chapter 15 [editorial]
Issue #530 closed #closed-530
Escaping of forward slash in JSON output method
Issue #942 closed #closed-942
530 Fix typo, escape-solidus not escape-uri-attributes
Issue #880 closed #closed-880
872 Symmetry: fn:items-at → fn:get
QT4 CG meeting 061 draft minutes #minutes-01-16
Draft minutes published.
Issue #930 closed #closed-930
Obsolete comment under fn:deep-equal()
Issue #933 closed #closed-933
930 drop obsolete note about comments and PIs
Issue #932 closed #closed-932
931 Add rules for duration precision
Issue #931 closed #closed-931
Precision of duration arithmetic
Issue #737 closed #closed-737
295: Boost the capability of recursive record types
Issue #951 created #created-951
Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-with-id
Since #895, absent optional arguments and empty sequences in built-in functions are treated identically. Exceptions are functions that already had different rules for such cases (e.g., fn:node(())
always returns an empty sequence, no matter what the context is).
I noticed we should also exclude fn:lang
, fn:id
, fn:idref
, and fn:element-with-id
: Otherwise, a compiler won’t be able to statically assess if a function call is dependent on the context.
Pull request #950 created #created-950
Minor edits (examples, rules)
- Examples were fixed.
- The equivalent expression for map:values was changed to $map?*.
Issue #949 created #created-949
Partial Function Applications: Allow return of function name
Without wanting to revive #889, an important observation I picked up is that named function references and “partially applied functions without applications” can be considered identical. That is, there should be no reason to distinguish between count#1
and count(?)
.
However, partially applied functions are currently defined to lose the reference to the original function and its arity (and some QT4 tests ensure that this is the case: https://github.com/qt4cg/qt4tests/blob/8649941e0e695ff8fb4cb27c52e99590cc88126f/misc/HigherOrderFunctions.xml#L1933).
From a user perspective, I see no reason why the two cases should be treated differently, and I would argue that we should either treat them identically or (at least) allow implementations to treat them identically, i.e., allowing an implementation to return count
for function-name(count(?))
.
Issue #948 created #created-948
fn:scan-left and fn:scan-right - produce accumulation of results
In XPath 4.0 so far we still don't have a convenient way to express the functionality of producing a series of accumulated (accrued) results when applying a folding function over a collection (sequence, array, ...) of items. The general use-case for this is the task to produce a sequence of running totals when applying an operation over a sequence of data points: produce the partial sums of loan payments over fixed periods, produce the compounded amounts of a deposit with fixed interest rate over years, ..., etc.
Two functions (shamelessly borrowed from Haskell):
- fn:scan-left
- fn:scan-right
fn:scan-left
This function has a similar signature to that of fn:fold-left
and produces the same final result, however it produces the complete (ordered) sequence of all partial results from every new value the accumulator gets during the evaluation of fn:fold-left
.
Signature
fn:scan-left($input as item()*,
$zero as item()*,
$action as function(item()*, item()) as item()*
) as array(*)*
Properties
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Rules
The function is equivalent to the following implementation in XPath (return clause added for completeness):
let $scan-left-inner := function($seq as item()*,
                                 $zero as item()*,
                                 $fun as function(item()*, item()) as item()*,
                                 $self as function(*)
                                ) as array(*)*
{
  (: each partial result is wrapped in an array so that it survives as one item :)
  let $result := [$zero]
  return
    if (empty($seq)) then $result
    else
    (
      $result, $self(tail($seq), $fun($zero, head($seq)), $fun, $self)
    )
},
$scan-left := function($seq as item()*,
                       $zero as item()*,
                       $fun as function(item()*, item()) as item()*
                      ) as array(*)*
{
  $scan-left-inner($seq, $zero, $fun, $scan-left-inner)
}
return
  $scan-left(1 to 10, 0, op('+'))
Examples:
$scan-left(1 to 10, 0, op('+'))
produces:
[0]
[1]
[3]
[6]
[10]
[15]
[21]
[28]
[36]
[45]
[55]
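For readers more familiar with imperative code, the same accumulation can be sketched in Python. This is an illustration of the proposed semantics, not part of the proposal; `scan_left` is a hypothetical name, and the array wrapping is omitted because Python lists do not flatten their members.

```python
import operator

def scan_left(items, zero, action):
    """Return the initial value followed by every intermediate fold-left result."""
    results = [zero]
    accumulator = zero
    for item in items:
        accumulator = action(accumulator, item)
        results.append(accumulator)
    return results

running_totals = scan_left(range(1, 11), 0, operator.add)
print(running_totals)  # [0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
```

The last entry equals the result of the corresponding left fold, and an empty input yields only the initial value.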
fn:scan-right
This function has a similar signature to that of fn:fold-right
and produces the same final result, however it produces the complete (ordered) sequence of all partial results from every new value the accumulator gets during the evaluation of fn:fold-right
.
Signature
fn:scan-right($input as item()*,
$zero as item()*,
$action as function(item(), item()*) as item()*
) as array(*)*
Properties
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Rules
The function is equivalent to the following implementation in XPath (return clause added for completeness):
let $scan-right-inner := function($seq as item()*,
                                  $zero as item()*,
                                  $f as function(item(), item()*) as item()*,
                                  $self as function(*)
                                 ) as array(*)*
{
  if (empty($seq)) then [$zero]
  else
    let $rightResult := $self(tail($seq), $zero, $f, $self)
    return
      (: ?* unwraps the previous partial result before passing it to $f :)
      ([$f(head($seq), head($rightResult)?*)], $rightResult)
},
$scan-right := function($seq as item()*,
                        $zero as item()*,
                        $f as function(item(), item()*) as item()*
                       ) as array(*)*
{
  $scan-right-inner($seq, $zero, $f, $scan-right-inner)
}
return
  $scan-right(1 to 10, 0, op('+'))
Examples:
$scan-right(1 to 10, 0, op('+'))
produces:
[55]
[54]
[52]
[49]
[45]
[40]
[34]
[27]
[19]
[10]
[0]
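The right-to-left variant can be sketched in Python in the same way. Again this is only an illustration with a hypothetical `scan_right` name; the action receives the current item first and the accumulated value second, as in fn:fold-right.

```python
import operator

def scan_right(items, zero, action):
    """Return every intermediate fold-right result, ending with the initial value."""
    results = [zero]
    for item in reversed(list(items)):
        # results[-1] is the partial result for everything to the right of item.
        results.append(action(item, results[-1]))
    results.reverse()
    return results

suffix_totals = scan_right(range(1, 11), 0, operator.add)
print(suffix_totals)  # [55, 54, 52, 49, 45, 40, 34, 27, 19, 10, 0]
```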
QT4 CG meeting 061 draft agenda #agenda-01-16
Draft agenda published.
Issue #899 closed #closed-899
Simplifying the language - types have behaviour.
Pull request #947 created #created-947
Reorganise F+O chapter 15 [editorial]
This PR reorganizes the subsections of F+O chapter 15 (the XML/HTML/JSON/CSV/IXML chapter). There is no change to content apart from a couple of introductory sentences. I'm planning to do some fine-grained work on the content in due course, but to make that easier to review it seems best to do the top-level reorganisation first. I'm therefore hoping that this PR will go through quickly "on the nod" so I can use it as a baseline for the detail changes.
Issue #946 created #created-946
fn:iterate-while → fn:while-do, fn:do-until
First feedback shows that fn:iterate-while is helpful, but the name needs to be improved: it implies that the first iteration occurs before the invocation of the test (which could sometimes be helpful, too).
I suggest renaming the function to fn:while-do (“do” is commonly used when while loops are specified), and adding fn:do-until.
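The difference between the two proposed functions is the point at which the test runs. A minimal Python sketch, with hypothetical names and argument order (the issue does not fix the signatures):

```python
def while_do(value, predicate, action):
    """Test first: apply `action` as long as `predicate` holds."""
    while predicate(value):
        value = action(value)
    return value

def do_until(value, action, predicate):
    """Act first: apply `action` at least once, then stop once `predicate` holds."""
    while True:
        value = action(value)
        if predicate(value):
            return value

def double(x):
    return x * 2

print(while_do(1, lambda x: x < 100, double))    # 128
print(do_until(1, double, lambda x: x >= 100))   # 128

# The functions differ when the condition is already met at the start:
print(while_do(500, lambda x: x < 100, double))  # 500, no iteration
print(do_until(500, double, lambda x: x >= 100)) # 1000, one iteration
```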
Issue #945 created #created-945
Module import: apparent contradiction
XQuery 5.12 paragraph 2 says:
If a module A imports module B, the static context of module A will contain the [in-scope schema definitions]... of module B.
Paragraph 10 says:
A [module import] imports only functions, variable declarations, and item type declarations; it does not import other objects from the imported modules, such as [in-scope schema definitions] or [statically known namespaces].
They can't both be right, surely?
Issue #944 created #created-944
Coercion rules: implicit types
Since 2.0, the coercion rules (formerly function conversion rules) have allowed implicit conversion from decimal to double, decimal to float, and float to double on function calls; other conversions such as double to decimal or float to decimal are not allowed. This has never made very much sense because in some implementations, decimal to float is a lossy conversion whereas float to decimal is not.
One option would be to allow conversion from any numeric type to any other.
The main caveat here is that I don't think it makes sense to allow a double such as 1.5e0 to be supplied where the required type is xs:integer. We have introduced new conversions that make it possible to supply a decimal where an integer is expected, but only if the decimal is in the value space of integer.
A possible formulation would be:
If the required type is a numeric type (that is, xs:decimal, xs:double, xs:float, or any type derived from these), and if the supplied value is a numeric value, then the supplied value is cast to the required primitive type, and if the result is in the value space of the actual required type it is then relabelled as an instance of the actual required type (if not, the conversion fails).
This means that supplying 1.0e0 for an argument expecting xs:integer (or xs:positiveInteger, etc) would work (it would cast to xs:decimal and then relabel as xs:integer), but supplying 1.1e0 would fail.
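The "cast to the primitive type, then relabel if the value fits" rule can be sketched in Python for the xs:integer case. This is an illustration of the formulation above under simplifying assumptions: `coerce_to_integer` is a hypothetical name, Python `float` stands in for xs:double, and `decimal.Decimal` for xs:decimal.

```python
from decimal import Decimal

def coerce_to_integer(value):
    """Cast a numeric value to decimal, then relabel it as an integer if it is integral."""
    if isinstance(value, bool) or not isinstance(value, (int, float, Decimal)):
        raise TypeError(f"{value!r} is not a numeric value")
    as_decimal = Decimal(value)  # exact conversion, also for floats
    # NaN and infinities have no decimal value; fractional values do not
    # lie in the value space of integer. Both cases make the coercion fail.
    if not as_decimal.is_finite() or as_decimal != as_decimal.to_integral_value():
        raise ValueError(f"{value!r} is not in the value space of integer")
    return int(as_decimal)

print(coerce_to_integer(1.0))             # 1
print(coerce_to_integer(Decimal("3.0")))  # 3
```

Supplying `1.1` raises `ValueError` in this sketch, mirroring the failure described for 1.1e0.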
Pull request #943 created #created-943
187 Add FLWOR expression while clause
Fix #187
Pull request #942 created #created-942
530 Fix typo, escape-solidus not escape-uri-attributes
Fix #530.
On half a dozen occasions, escape-uri-attributes
is used where escape-solidus
is clearly intended.
Issue #886 closed #closed-886
Binary map keys
Pull request #941 created #created-941
939 Remove fn:numeric-compare
Issue: #939
Pull request #940 created #created-940
878 Add subsequence-where function
Supersedes PR #874
Following discussion of PR #874 which proposed an extended subsequence() function with options to define the start and end position by predicates, this new PR proposes instead a subsequence-where() function that allows the start and end position to be defined by predicates, leaving the existing subsequence() function unchanged.
The items-before/after/starting-where/ending-where quartet are dropped.
The new function is inclusive at both ends. To start at the item after the one that matches the start condition, apply tail() to the result. To finish before the item that matches the end condition, apply trunk() to the result.
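The inclusive behaviour can be sketched in Python. This is an illustration, not the PR's definition: `subsequence_where` and its parameter names are hypothetical, and the sketch assumes that the end condition is also tested on the item that matched the start condition.

```python
def subsequence_where(items, start=None, end=None):
    """Return the items from the first match of `start` through the first
    subsequent match of `end`, inclusive at both ends."""
    result = []
    started = start is None  # without a start condition, begin at the first item
    for item in items:
        if not started and start(item):
            started = True
        if started:
            result.append(item)
            if end is not None and end(item):
                break
    return result

selected = subsequence_where([1, 2, 3, 4, 5, 6], lambda x: x >= 3, lambda x: x >= 5)
print(selected)       # [3, 4, 5]
print(selected[1:])   # [4, 5]  (the effect of tail(): exclusive start)
print(selected[:-1])  # [3, 4]  (the effect of trunk(): exclusive end)
```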
Issue #893 closed #closed-893
fn:compare: Support for arbitrary atomic types
Issue #918 closed #closed-918
Minor cx through chap. 14
Issue #939 created #created-939
Remove fn:numeric-compare
Related action: QT4CG-060-04
@michaelhkay I proposed to merge fn:numeric-compare
into fn:compare
in #866; your response was:
Folding fn:numeric-compare into fn:compare is more feasible, but you've then got one function that does two different jobs; there's no type safety to ensure that the arguments have compatible types, and you need ad-hoc rules to say which combinations of arguments are valid and which aren't. The merit of two separate functions is that each is a total function over the domain implied by its signature.
Do you think the concerns are still relevant, or should we tackle this?
Issue #938 created #created-938
Canonical serialization
This issue picks up suggestions from #779 regarding canonical serialization, and solicits input from the community group on whether such a function is desirable, and what such a function might look like.
In the context of #779, the idea was that two XML documents with different physical representations, but semantically equivalent, could be serialized to a canonical form, with a hash value applied to each confirming identity. Of course, with canonical operation, a simple string comparison would be sufficient, absent any hashing.
XML Signature was suggested as one approach, with some hesitation. I would like to suggest, instead, that we look to implement Canonical XML Version 1.1 (herein CX1.1), perhaps with map options that calibrate how CX1.1 is implemented. I have no experience using CX1.1, so user input is welcome.
Another point of discussion is whether this merits a new function, e.g., fn:canonical-serialize, or should be built upon fn:serialize. A problem with the latter option is that such an approach makes no sense unless the method option is specified as xml. Another approach would be to go deeper, into the serialization spec, and expand the xml method to offer a canonical option.
I believe that this function would be extremely useful. When preparing test suites, output could be saved as secondary documents as canonical XML, and any subsequent regression tests could adjust comparanda to canonical XML, and very precise node-wise comparisons could be made.
I look forward to everyone's input.
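The comparison workflow described above can be tried out in Python, whose standard library offers `xml.etree.ElementTree.canonicalize`. Note the assumption: that function implements Canonical XML 2.0, not the Canonical XML 1.1 suggested in this issue, so the sketch only illustrates the idea of comparing or hashing canonical forms; `canonical_hash` is a hypothetical helper.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

def canonical_hash(xml_text):
    """Hash the canonical form (C14N 2.0 here, an assumption) of an XML document."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two physical representations of the same document: different attribute
# order, quoting, whitespace inside the start tag, and empty-element syntax.
doc_a = '<doc b="2" a="1"><item/></doc>'
doc_b = "<doc  a='1'   b='2'><item></item></doc>"

print(doc_a == doc_b)                                   # False
print(canonicalize(doc_a) == canonicalize(doc_b))       # True
print(canonical_hash(doc_a) == canonical_hash(doc_b))   # True
```

As noted above, once both documents are in canonical form a plain string comparison is sufficient; the hash is only a compact fingerprint of that form.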
Pull request #937 created #created-937
779 hash function
First draft of hash function, proposed in #779.
Error message left as to-do item; guidance from editors appreciated.
I opted to leave out wrapper/cryptographic functionality, such as salting, and to demonstrate via example how it could be done by a developer on their own. In my opinion what we need here is a simple atomic function that can be incorporated into other molecular functions.
I may tinker with the prose description up to CG discussion, so comments are welcome.
Pull request #936 created #created-936
877 revised rules for op:binary-less-than
Rule 3 for op:binary-less-than
was a bit of a mess (see #877), and needed to be expressed as a recursive operation.
My proposed revision depends on phraseology drawn from fn:decode-from-uri, fn:deep-equal, and 5.3.2 Unicode Codepoint Collation (here slightly adjusted from unordered to ordered list).
Issue #876 closed #closed-876
Placement of fn:in-scope-namespaces(), fn:in-scope-prefixes(), fn:namespace-uri-for-prefix()
Issue #909 closed #closed-909
893 fn:compare: Support for arbitrary atomic types
Issue #860 closed #closed-860
Unary Lookup when the context value is a sequence
Issue #926 closed #closed-926
860 Editorial rearrangement of spec for shallow lookup
Issue #780 closed #closed-780
format-number() etc incompatibility
Issue #925 closed #closed-925
780 Document incompatibility in format-number etc
Issue #935 closed #closed-935
Fix the fo test catalog
Pull request #935 created #created-935
Fix the fo test catalog
A duplicate name was introduced.
Issue #648 closed #closed-648
Schema for FN namespace should block extension and substitution
Issue #924 closed #closed-924
648 Disallow user modifications to schema for FN namespace
Issue #913 closed #closed-913
XQFO: under/unused variable apparatus
Issue #923 closed #closed-923
913-new-examples-for-local-name-etc
Issue #915 closed #closed-915
[Editorial] Incorrect terminology: function implementation is now function body
Issue #922 closed #closed-922
915 function body terminology
Issue #914 closed #closed-914
XQFO minor edits
Issue #912 closed #closed-912
XQFO: Minor edits
Issue #906 closed #closed-906
fn:deep-equal: unordered → ordered
Issue #907 closed #closed-907
906 fn:deep-equal: unordered → ordered
Issue #898 closed #closed-898
Drop the requirement for document-uri() uniqueness
Issue #905 closed #closed-905
898 - relax the constraints on document-uri
Issue #821 closed #closed-821
Annotations: Make default namespace explicit
Issue #904 closed #closed-904
821 Annotations: Make default namespace explicit
Issue #895 closed #closed-895
Parameters with default values: allow empty sequences
Issue #901 closed #closed-901
895 Parameters with default values: allow empty sequences
Issue #934 created #created-934
String comparison in deep-equal
The code showing how strings should be compared in deep-equal has gone awry: it doesn't match the prose. In equal-strings(), the lines
let $n1 := if ($options?whitespace = "normalize")
then normalize-unicode(?, $options?normalization-form)
else identity#1
let $n2 := if ($options?normalize-space)
then normalize-space#1
else identity#1
should read:
let $n1 := if ($options?whitespace = 'normalize')
then normalize-space#1
else identity#1
let $n2 := if ($options?normalization-form)
then normalize-unicode(?, $options?normalization-form)
else identity#1
Actually, the whole thing can now be expressed more concisely using fn:chain:
declare function equal-strings(
$string1 as xs:string,
$string2 as xs:string,
$collation as xs:string,
$options as map(*)
) as xs:boolean {
let $norm := fn:chain(?,
(normalize-space#1[$options?whitespace = "normalize"],
normalize-unicode(?, $options?normalization-form)[$options?normalization-form]))
return compare($norm($string1), $norm($string2), $collation) eq 0
}
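The intended pairing of options and normalization steps can be sketched in Python. This is an illustration under simplifying assumptions, not the spec's code: the collation is ignored (plain codepoint equality is used), the options are passed as a `dict`, and the whitespace step approximates fn:normalize-space with a regular expression over XML whitespace characters.

```python
import re
import unicodedata
from functools import reduce

def equal_strings(string1, string2, options):
    """Compare two strings after applying the normalization steps the options request."""
    steps = []
    if options.get("whitespace") == "normalize":
        # Collapse runs of XML whitespace and trim both ends.
        steps.append(lambda s: re.sub(r"[ \t\r\n]+", " ", s).strip(" "))
    form = options.get("normalization-form")
    if form:
        steps.append(lambda s: unicodedata.normalize(form, s))

    def norm(s):
        # Apply the selected steps in order, like a function chain.
        return reduce(lambda acc, step: step(acc), steps, s)

    return norm(string1) == norm(string2)

precomposed = "caf\u00e9  au lait"   # é as one codepoint, two spaces
decomposed = " cafe\u0301 au lait"   # e + combining accent, leading space

both = {"whitespace": "normalize", "normalization-form": "NFC"}
print(equal_strings(precomposed, decomposed, both))                         # True
print(equal_strings(precomposed, decomposed, {"whitespace": "normalize"}))  # False
```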
Pull request #933 created #created-933
930 drop obsolete note about comments and PIs
Fix #930
The note is obsolete because adjacent text nodes are now combined after stripping comments and PIs.
Pull request #932 created #created-932
931 Add rules for duration precision
Fix #931
Adds rules for the precision of durations and operations on durations, analogous to the existing rules for dates/times.
Issue #931 created #created-931
Precision of duration arithmetic
We specify that dates/times are manipulated at least to millisecond precision, but we have no similar statement for durations.
See https://stackoverflow.com/questions/77752844
michael.hor257k points out:
The 2.0 specification states: "The result is obtained by casting $arg to an xs:dayTimeDuration ... and then computing the seconds component as described in 10.3.2.3 Canonical representation." And then: "The canonical representation of xs:dayTimeDuration restricts ... the value of the seconds component to xs:decimal valued from 0.0 to 59.999... ", with reference to XML Schema Part 2: Datatypes which mandates "a minimum fractional second precision of milliseconds or three decimal digits". None of this appears in the 3.0 spec, though the examples still show a decimal digit being extracted.
Issue #929 closed #closed-929
map:values() - Would it be better to return an array?
Issue #930 created #created-930
Obsolete comment under fn:deep-equal()
The notes for fn:deep-equal() include the paragraph:
By default, the contents of comments and processing instructions are significant only if these nodes appear directly as items in the two sequences being compared. The content of a comment or processing instruction that appears as a descendant of an item in one of the sequences being compared does not affect the result. However, the presence of a comment or processing instruction, if it causes a text node to be split into two text nodes, may affect the result.
This is no longer true: we fixed it so that adjacent text nodes are merged after stripping comments and PIs.
Issue #929 created #created-929
map:values() - Would it be better to return an array?
The new function map:values()
returns the values present in a map, flattened into a sequence.
This loses information if the values are not all singletons.
Would it be better to return an array?
That is, to return array:build(map:pairs($map), fn{?value})
Pull request #928 created #created-928
Minor edits through ch. 15
Light edits here for consistency, clarity. I didn't touch the CSV prose much, knowing it is subject to major revisions.
Pull request #927 created #created-927
861 Rewrite spec of deep lookup operator
Fix #861
This is a complete rewrite of the spec for deep-lookup, hopefully clarifying some edge cases and fixing bugs, but not intended to introduce any major changes.
Pull request #926 created #created-926
860 Editorial rearrangement of spec for shallow lookup
Rearranges the spec for lookup expressions so that unary lookup is now defined in terms of postfix lookup, not the other way around; this simplifies the rules when the context value is not a singleton, or when the key specifier expression is context-dependent.
Fix #860
Pull request #925 created #created-925
780 Document incompatibility in format-number etc
Fix #780
Changes the XSLT and F+O specs to document a minor incompatibility arising from the change to functions such as format-number()
to accept an argument of type union(xs:string, xs:QName)
rather than xs:string
.
In addition, in XSLT, all such functions now accept union(xs:string, xs:QName)
rather than union(xs:QName, xs:string)
. This is primarily to make them all consistent.
Pull request #924 created #created-924
648 Disallow user modifications to schema for FN namespace
Fix #648
Issue #889 closed #closed-889
Rename "Named Function Reference"
Pull request #923 created #created-923
913-new-examples-for-local-name-etc
I have created new (executable) examples for functions name, local-name, namespace-uri, node-name, count, number.
There are of course many other functions that would benefit from the same treatment.
Fix #913
Pull request #922 created #created-922
915 function body terminology
Fix #915
Pull request #921 created #created-921
920 Allow xsl:break and xsl:next-iteration within branch of xsl:switch
Allow xsl:break and xsl:next-iteration within branch of xsl:switch
Fix #920