@qt4cg statuses in 2024
This page displays status updates about the QT4 CG project from 2024.
See also recent statuses.
Pull request #1669 created #created-1669
1667 Revise handling of non-XML characters in parse-json
Fix #1667
Issue #1659 closed #closed-1659
List-valued options represented as arrays
Pull request #1668 created #created-1668
Minor copy edits (no issue raised)
Various small copy edits.
Also adds summary descriptions to record definitions in the function catalog, which requires a schema and stylesheet change.
Issue #1667 created #created-1667
Invalid XML characters in JSON input
We have changed the data model (§2.8.4) (see PR #546) so that implementations may allow characters that are not valid XML characters.
We have not explored the impact of this change on parse-json(), which is one of the obvious places where non-XML characters may arise. For example, JSON allows unescaped C1 control characters.
(Note, however, that the data model explicitly bans unpaired surrogates, and I think that rule should apply to parse-json() even though the JSON grammar allows them.)
Hopefully it only requires clarification notes to be added to the spec, and not any substantive change.
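As a rough illustration of where such characters can arise, here is a minimal Python sketch. It is a stand-in, not the parse-json() rules: the helper names is_xml10_char and non_xml_chars are hypothetical, and using the XML 1.0 Char production as the test for "valid XML character" is an assumption (XML 1.1 treats some control characters differently).

```python
import json

def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def non_xml_chars(json_text: str) -> list:
    """Parse a JSON string value and list its characters that are not XML 1.0 Chars."""
    value = json.loads(json_text)
    return [f"U+{ord(ch):04X}" for ch in value if not is_xml10_char(ch)]

print(non_xml_chars(r'"a\u0001b"'))       # ['U+0001']  (escaped control character)
print(non_xml_chars(r'"\ud800"'))         # ['U+D800']  (unpaired surrogate)
print(non_xml_chars(r'"\ud83d\ude00"'))   # []          (properly paired surrogates)
```

Python's JSON parser accepts the unpaired surrogate, which is the kind of input the note about surrogates is concerned with.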
Issue #1657 closed #closed-1657
1624 Add note explaining nodetest subtyping
Issue #1664 closed #closed-1664
1659 option conventions sequences as arrays
Pull request #1666 created #created-1666
1649 result of function annotations
Brings the spec of fn:function-annotations into line with the test cases and examples
Fix #1649
Pull request #1665 created #created-1665
1650 Tidy up fn:type-of
Drop fn:node-kind from the function catalog so it disappears from the function finder
Correct one example of fn:type-of and add some more examples.
Fix #1650
Pull request #1664 created #created-1664
1659 option conventions sequences as arrays
Pull request #1663 created #created-1663
Remove DTD/stylesheet distractions at the top of the schema
We decided to close #374 without further action. This PR just cleans up the relevant schema file by removing the DTD and stylesheet related comment(s).
Issue #1662 created #created-1662
xsl:sort - add composite sort keys
The fn:sort function supports composite sort keys (where the sort key is a sequence, it's treated as a sequence of sort keys in major to minor order).
We could add the same feature for xsl:sort, perhaps driven by the attribute composite="yes" for compatibility with grouping keys.
Issue #1661 created #created-1661
QName arguments: also allow strings
In #747, a syntax for QName literals was proposed (Q"prefix:local"). Concerns were raised that the new syntax could be mixed up with the existing Q{URI}local syntax, and that too many syntax variants are confusing.
In attribute and element constructors, it is already possible to supply prefix:local-name and Q{uri}local strings for names:
element { 'xml:name' } {},
element { 'Q{uri}name' } {}
With option declarations, it is possible to write:
declare option output:cdata-section-elements 'xml';
<xml>text</xml>
…and it is counterintuitive that serialize($xml, { 'cdata-section-elements': 'xml' }) is not legal.
I wonder whether we cannot simply allow both QNames and strings for arguments (and arguments in options) in the existing standard functions. There are fewer cases than I had suspected:
Function | Argument
--- | ---
fn:error | $code
fn:format-number | $options: format-name (can also be xs:NCName)
fn:deep-equal | $options: unordered-elements
fn:serialize | $options: cdata-section-elements, suppress-indentation
fn:function-lookup | $name
fn:load-xquery-module | $options: variables, vendor-options
fn:transform | $options: initial-function, initial-mode, initial-template, ...
fn:elements-to-maps | $options: layouts
fn:schema-type | $name
For some options, like the method option of fn:serialize, we already allow both strings and QNames.
QT4 CG meeting 103 draft minutes #minutes-12-17
Draft minutes published.
Issue #1660 created #created-1660
Further suggestions for fn:path
Several good suggestions for fn:path made at today's review:
(a) For namespaces, an option to identify elements by the result of the name() function - that is, using the actual prefix of each element, rather than a prefix obtained from an externally supplied map.
(b) The ability to get a path to a node from a supplied ancestor rather than from the root. (Defaulting perhaps to a path from the context node, though that interacts awkwardly with the default for the target node itself.)
(c) Some more advice and guidance, especially relating to the different use cases depending on whether the path is for use by software or for (diagnostic?) use by human developers.
Issue #1619 closed #closed-1619
XSLT: keys as maps
Issue #1622 closed #closed-1622
1619 Specify XSLT map-for-key function
Issue #332 closed #closed-332
Add a namespace uris option to fn:path
Issue #1620 closed #closed-1620
332 Add options for fn:path
Issue #1627 closed #closed-1627
Drop validate() and valid() functions from schema-type-record
Issue #1633 closed #closed-1633
1627 Tweaks to schema type functions
Issue #374 closed #closed-374
Can't view the XSD for XSLT in the browser
Issue #523 closed #closed-523
Dealing with component name conflicts with library packages
Issue #1655 closed #closed-1655
JSON maps
Issue #1634 closed #closed-1634
Decimal formats in XPath/XQuery static context: updates needed
Issue #1638 closed #closed-1638
1634 Update description of decimal properties in the static context
Issue #1652 closed #closed-1652
Use function/xfunction markup
Issue #1653 closed #closed-1653
1652 Use function markup
Issue #1659 created #created-1659
List-valued options represented as arrays
The specification of the option parameter conventions contains this:
In cases where an option is list-valued, by convention the function should accept either a sequence or an array: but this rule applies only if the specification of the option explicitly accepts either. Accepting a sequence is convenient if the value is generated programmatically using an XPath expression; while accepting an array allows the options to be held in an external file in JSON format, to be read using a call on the fn:json-doc function.
In particular, it says
...this rule applies only if the specification of the option explicitly accepts either...
However I could not find any option that explicitly makes use of "accepts either" in the specification. In the tests, I found two cases where arrays are passed as option values: numberformat-510 (here the option isn't even list-valued), and serialize-xml-106a.
I am wondering whether the above paragraph might be superfluous. The text preceding it says that option values are coerced to the required type, and that implies converting an array to its member sequence, doesn't it?
Issue #1658 created #created-1658
fn:elements-to-maps: `empty`, normalize space ?
The current rules say:
If empty($EE/(* | text())) (that is, if there are no child elements or text nodes) then: […] empty
If empty($EE/text()[normalize-space()]) (that is, there are no text node children other than whitespace), then: […] list
Maybe it would be consistent to add [normalize-space()] to the condition of the empty layout?
QT4 CG meeting 103 draft agenda #agenda-12-17
Draft agenda published.
Issue #1550 closed #closed-1550
More requirements for type information
Pull request #1657 created #created-1657
1624 Add note explaining nodetest subtyping
Fix #1624 by adding a note explaining the problem.
Issue #1656 created #created-1656
Ordered Maps: Updates
If maps are updated, insertion/deletion order may be an issue, even more if maps will be ordered by default (#1651).
This topic needs to be discussed in more depth before we take any actions.
Personally, I think we should focus on XML updates first (related: #1225).
Issue #1457 closed #closed-1457
Common name for maps & arrays
Issue #1588 closed #closed-1588
Move the Streamability chapter?
Issue #1592 closed #closed-1592
fn:elements-to-maps: Observations
Issue #1654 closed #closed-1654
Type annotations on maps and arrays
Issue #1655 created #created-1655
JSON maps
Now that we are discussing different types of maps, there could also be JSON maps. Then each map would have a property json, that can be false or true. If it is false, it is an ordinary map like now. If it is true, it is a "JSON map".
A JSON map can only have string keys, and all map functions would enforce that constraint by casting the key to string.
Parse-json and json-doc would return a JSON map. As would the bare brace {} constructor for compatibility with Javascript.
For example
let $json := parse-json('{"1": 234}')
return map:put($json, 1, 456)
would return a JSON map {"1": 456}.
let $json := parse-json('{"1": 234}')
return map:contains($json, 1)
would return true.
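The key-casting behaviour described here can be mimicked in Python with a minimal sketch. The class JsonMap and the helper put are hypothetical names invented for illustration; Python's str() only approximates a cast to xs:string, and the constructor does not cast keys that are already present.

```python
import json

class JsonMap(dict):
    """Hypothetical stand-in for the proposed "JSON map": keys are cast to strings."""

    def __setitem__(self, key, value):
        super().__setitem__(str(key), value)

    def __getitem__(self, key):
        return super().__getitem__(str(key))

    def __contains__(self, key):
        return super().__contains__(str(key))

def put(m: JsonMap, key, value) -> JsonMap:
    """Like map:put, return a new map and leave the original untouched."""
    result = JsonMap(m)
    result[key] = value
    return result

data = JsonMap(json.loads('{"1": 234}'))
updated = put(data, 1, 456)
print(updated)     # {'1': 456}
print(1 in data)   # True
```

The integer key 1 and the string key "1" address the same entry, which is the constraint the issue asks map functions to enforce.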
Issue #1654 created #created-1654
Type annotations on maps and arrays
Currently maps and arrays have very little type safety. You can say that your function expects array(xs:string), but that involves testing what the array actually contains, and there's nothing to stop you then appending an integer to the array.
I would like to explore the possibility of having arrays and maps annotated with a type (either always, or optionally), and for this type to constrain operations such as array:append() and map:put().
@dnovatchev has suggested that ordered maps and unordered maps should be different types, and I think it would be difficult to do that unless we move to structural typing. It's also more consistent with typing of atomic values and nodes - though it raises a question about sequences, where the type is purely descriptive.
This would also have implications for records: presumably a map could be annotated with a record type, and this too could constrain the operations available on the value.
This is a rather big change and I put it forward fairly tentatively, but I'm interested to hear people's views.
Pull request #1653 created #created-1653
1652 Use function markup
Replace <code> with <function> tags where appropriate.
Fix #1652
Issue #1652 created #created-1652
Use function/xfunction markup
PR #1616 introduced improved support for the function and xfunction tags.
We should now change the documents to take advantage of this.
QT4 CG meeting 102 draft minutes #minutes-12-10
Draft minutes published.
Issue #1616 closed #closed-1616
A little cleanup; support function/xfunction globally
Issue #1636 closed #closed-1636
Initial conversion of EXPath Binary/File
Issue #1103 closed #closed-1103
CSV Parsing - handling line ending normalization
Issue #1643 closed #closed-1643
1103 Normalize line endings in CSV prior to parsing
Issue #1637 closed #closed-1637
Obsolete note in fn:function-lookup
Issue #1642 closed #closed-1642
1637 Add/Amend notes to fn:function-lookup
Issue #1554 closed #closed-1554
XQFO: Formal Specification
Issue #1641 closed #closed-1641
1554-change-formal-specification-heading
Issue #1639 closed #closed-1639
Rules for schema-aware elements-to-maps are incomplete
Issue #1640 closed #closed-1640
1639 Add missing rule for elements-to-maps
Issue #1628 closed #closed-1628
XQuery version number
Issue #1629 closed #closed-1629
1628 Clarify rules for XQuery version declaration
Issue #1651 created #created-1651
Ordered Maps: maps that retain insertion order
Currently, XDM maps are “unordered”: An implementation is allowed to organize entries in a way that optimizes lookup, not order. The entries do not have a predictable order unless they are explicitly sorted.
There are cases in which it is helpful if the “insertion order” is preserved – i.e., the order in which new map entries are added to a map. While the insertion order is not relevant if a map is exclusively used for lookups, it may be beneficial if the input includes deliberately sorted key/value pairs, such as (often) in JSON data, configurations or key/value sequences.
I created this issue because there was some confusion in #564, and on Slack, about this map flavor and “sorted maps”, which are discussed in issue #564: Sorted maps hold all map entries sorted by the key, using a comparator or (in its basic variant) fn:data#1.
PR #1609 attempts to solve both requirements at once.
Issue #1650 created #created-1650
fn:node-kind, fn:type-of: Editorial
- fn:node-kind is still listed in the function index
- fn:type-of: type-of($e//doc/child::node()) → type-of($e/child::node())
QT4 CG meeting 102 draft agenda #agenda-12-10
Draft agenda published.
Issue #1635 closed #closed-1635
Abbreviate suffixes on cross-spec links
Issue #1649 created #created-1649
Result type of fn:function-annotations()
The function signature and the prose rules say
The result is a sequence of maps, each being an instance of record(key as xs:QName, value as xs:anyAtomicType*)
But one of the examples returns a singleton map in which the QName is the key and the value is the associated value. The test cases also follow that pattern.
Issue #1648 created #created-1648
fn:elements-to-maps: Types
Copied from https://github.com/qt4cg/qtspecs/issues/1592#issuecomment-2493270899:
With regard to types, I would propose to introduce a separate option:
elements-to-maps(
<value>42</value>,
{ 'types': { 'value': 'number' } }
)
→ { "value": 42 }
I have a preference for strings, as we can prefix them with @. Next, the representation could be identical to the result, which I believe is more intuitive:
elements-to-maps(
<value count='3'/>,
{ 'types': { '@count': 'number' } }
)
→ { "value": { "@count": 3 } }
Of course, we could also have two options (element-types, attribute-types).
Issue #1647 created #created-1647
fn:elements-to-maps: Explicit Layouts
If a user chooses a custom layout, it should always be applied, or (if inappropriate, by all means) an error message should be raised.
The rationale: I think that the current fallback behavior is flawed. Explicit settings should never be overridden by implicit choices.
Issue #1646 created #created-1646
fn:elements-to-maps: Robustness
Copied from https://github.com/qt4cg/qtspecs/issues/1592#issuecomment-2493187896 and https://github.com/qt4cg/qtspecs/issues/1592#issuecomment-2495502890:
[USER2] More user feedback:
It’s confusing that the following function calls lead to completely different outputs:
elements-to-maps(
<person>
<name>Akila</name>
<age>34</age>
</person>
)
{"person":{"name":"Akila","age":"34"}}
elements-to-maps(
<person>
<name>Akila</name>
<name>Jaha</name>
<age>34</age>
</person>
)
{"person":[{"name":"Akila"},{"name":"Jaha"},{"age":"34"}]}
The initial feedback I gathered so far is that the function works fine if the input is regular and uniform, but as soon as there are slight deviations, it can get wild. Here are some plain examples how a small change to the input results in fairly different output:
<xml>
<info>X</info>
<address>A</address><address>B</address>
</xml>
→ { "xml": ["A", "B"] }
<xml>
<info>X</info>
<address>A</address>
<address>B</address>
</xml>
→ { "xml": [{ "info": "X" }, { "address": "A" }, { "address": "B" }] }
<xml id='id0'>
<address>A</address>
<address>B</address>
</xml>
→ { "xml": { "@id": "id0", "address": ["A", "B"] } }
Possible solutions:
- Enable uniform by default (performance considerations should not outweigh usability concerns)
- Change the rules for record from all-different(*!node-name()) to not(all-equal(*!node-name()))
- Editorial changes: Stress in the introduction that robustness is a secondary requirement.
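To see how the two candidate conditions differ on the person example above, here is a minimal Python sketch. It only compares the two predicates on child element names; it does not implement layout selection, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

def child_names(xml_text: str) -> list:
    """Names of the child elements, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]

def all_different(names: list) -> bool:
    """True if no name occurs more than once."""
    return len(set(names)) == len(names)

def all_equal(names: list) -> bool:
    """True if every name is the same (or there are no names)."""
    return len(set(names)) <= 1

person = "<person><name>Akila</name><name>Jaha</name><age>34</age></person>"
names = child_names(person)
print(names)                  # ['name', 'name', 'age']
print(all_different(names))   # False: the current condition is not met
print(not all_equal(names))   # True: the proposed condition is met
```

A single repeated child is enough to fail the current condition, while the proposed one fails only when all children share one name.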
Issue #1645 created #created-1645
fn:elements-to-maps: Debugging
Copied from https://github.com/qt4cg/qtspecs/issues/1592#issuecomment-2493180757, slightly revised:
For regular data, it is convenient to have heuristics that choose layouts automatically. For slightly irregular data that needs manual revisions, it can get messy:
[USER1] User feedback:
I have no idea which layout is used for my XML data. A function would be helpful that does not return the transformed data, but the layouts used for the transformation.
We could…
- offer an extra function,
- add a debug option to trace layout information, or
- add an option to include layouts in the output:
<p><a>A</a><b>B</b><c/></p> => elements-to-maps({ 'debug': true() })
{
"p(record)": {
"a(simple)": "A",
"b(simple)": "B",
"c(empty)": ""
}
}
Issue #1644 created #created-1644
fn:elements-to-maps: Mixed Content
Even though the function may not be used primarily for mixed content, we should make it easier to convert such XML input to maps/JSON.
If I understand correctly, the safest solution to retrieve a consistent result currently is:
fn:elements-to-maps(
$mixed-content,
{ "disable-layouts": ("empty", "empty-plus",
"simple", "simple-plus", "list",
"list-plus", "record", "sequence")
}
)
Maybe we can simplify this? If we had an inclusive option, it could possibly be:
fn:elements-to-maps(
$mixed-content,
{ "enable-layouts": "mixed" }
)
We could also consider xml:space='preserve' attributes and apply mixed to all descendant nodes (but it shouldn't be the only solution).
Related: #1592
Pull request #1643 created #created-1643
1103 Normalize line endings in CSV prior to parsing
Fix #1103
Simplifies the spec by doing line-ending normalization unconditionally prior to CSV parsing. CRLF sequences are no longer retained within quoted fields.
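The effect of normalizing before parsing can be sketched in Python as follows. This is an illustration under assumptions: Python's csv module stands in for the CSV parser, and the normalization shown (CRLF and lone CR both become LF) is assumed rather than quoted from the PR.

```python
import csv
import io
import re

def normalize_line_endings(text: str) -> str:
    """Replace CRLF pairs and lone CR characters with LF."""
    return re.sub(r"\r\n?", "\n", text)

raw = 'id,note\r\n1,"line one\r\nline two"\r\n'
normalized = normalize_line_endings(raw)

# newline="" stops StringIO from translating line endings itself.
rows = list(csv.reader(io.StringIO(normalized, newline="")))
print(rows)   # [['id', 'note'], ['1', 'line one\nline two']]
```

Because normalization happens first, the CRLF inside the quoted field arrives at the parser as a plain LF.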
Pull request #1642 created #created-1642
1637 Add/Amend notes to fn:function-lookup
Fix #1637
Pull request #1641 created #created-1641
1554-change-formal-specification-heading
Fix #1554
Changes the heading "formal specification" to "formal equivalent", and expands on the explanatory text.
Pull request #1640 created #created-1640
1639 Add missing rule for elements-to-maps
Fix #1639
Issue #1639 created #created-1639
Rules for schema-aware elements-to-maps are incomplete
The rules for schema-aware layout selection produce no answer in the case where the element has a simple type but empty layout and simple layout are both disabled.
I propose to fall back to "mixed" layout in this case.
Issue #1630 closed #closed-1630
Two minor corrections of `minus-sign` spec
Pull request #1638 created #created-1638
1634 Update description of decimal properties in the static context
Fix #1634
Supersedes #1630
Issue #1637 created #created-1637
Obsolete note in fn:function-lookup
It says:
Equally, these specifications do not define any mechanism for creating context-dependent functions other than the built-in context-dependent functions, but neither do they rule out the existence of such functions.
which is no longer true (user-defined functions can take context-dependent default arguments)
Issue #1462 closed #closed-1462
fn:deep-equal: default option
Pull request #1636 created #created-1636
Initial conversion of EXPath Binary/File
Issue #1602 closed #closed-1602
Additional Operations on Arrays - redundant/spurious text
Pull request #1635 created #created-1635
Abbreviate suffixes on cross-spec links
Stylesheet change for cross-spec links (xspecref, xnt, xtermref) to drop the redundant "40" version suffix - for example the suffix becomes DM rather than DM40, since the vast majority of links will point to the 4.0 version of the document.
Issue #1634 created #created-1634
Decimal formats in XPath/XQuery static context: updates needed
See also #1630.
The description of the decimal format properties in the XPath and XQuery static context needs to be updated to align with changes defining the options map of the format-number function.
The XSLT xsl:decimal-format element also needs to be checked for consistency, though most of the required changes have been made.
Pull request #1633 created #created-1633
1627 Tweaks to schema type functions
Fix #1627
Minor adjustments to the rules for fn:schema-type, fn:atomic-type-annotation, and fn:node-type-annotation based on implementation and testing experience.
Issue #1632 created #created-1632
Add xsl:map/@select
I was surprised to discover that the xsl:map instruction does not allow a select attribute.
For many use cases it might make the instruction equivalent to xsl:sequence:
<xsl:map select="map:build(.....)"/>
<xsl:map select="{'a': 1, 'b': 2}"/>
but it still has documentary value; and there are other cases where it's not merely cosmetic:
<xsl:map select="$map1, $map2 => map:remove('extra'), {'extra': 17}"/>
There's no change to the semantics; the value of the select attribute is handled just like the value of the sequence constructor.
Issue #1625 closed #closed-1625
Editorial: misplaced notes for absolute/relative path expressions.
Issue #1626 closed #closed-1626
1625 Editorial changes to notes on path expressions
Issue #1631 created #created-1631
xsl:apply-templates (without select) should allow inline content
The XSLT 3.0 specification is rather loose on this matter and seems to allow xsl:apply-templates without a select attribute to contain the elements to process inline.
It seems like a rather nice feature and may allow tricks that are quite difficult to do right now.
(copy of https://github.com/w3c/qtspecs/issues/31 )
QT4 CG meeting 101 draft minutes #minutes-12-03
Draft minutes published.
Issue #1596 closed #closed-1596
1592 Rework rules for selecting a layout
Issue #1615 closed #closed-1615
Drop the terms "module context" and "expression context"
Issue #1623 closed #closed-1623
1615 Editorial rearrangement of "context" sections
Issue #1614 closed #closed-1614
Fix xfunction refs in XSLT
Issue #1605 closed #closed-1605
Change csv-to-xml() to return a document node, not an element node
Issue #1613 closed #closed-1613
1605 csv-to-xml to return document node rather than element
Issue #1194 closed #closed-1194
New function fn:query()
Issue #1608 closed #closed-1608
fn:compare depends on implicit timezone
Issue #1611 closed #closed-1611
1608 add dependency to fn compare
Pull request #1630 created #created-1630
Two minor corrections of `minus-sign` spec
Per #1250, minus-sign is now a string rather than a single character.
This change:
- corrects that in one place where it was still said to be a character
- changes the formulation from "represent" to "mark" a negative number.
Sorry for opening a branch in this repo. I did this accidentally, omitting the fork that I originally wanted to create.
Pull request #1629 created #created-1629
1628 Clarify rules for XQuery version declaration
Fix #1628
Hopefully the new rules are clearer. They were motivated by a couple of test cases using weird version numbers such as "4.00".
Issue #1628 created #created-1628
XQuery version number
In XQuery 1.0 and 3.0 the version number was simply a string.
XQuery version 3.1 specified that
An XQuery version number consists of two integers separated by a dot.
In 4.0 we have taken this rather literally, and have spelled out the consequences in a note:
The version numbers 4.01 and 4.1 are equivalent: both have a major number of 4 and a minor number of 1. Version 4.10 by the same reasoning has a higher minor number than version 4.2.
This is completely counter-intuitive.
I propose that we eliminate the confusion by requiring the version number to consist of two single-digit integers separated by a dot.
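The consequences spelled out in the note can be checked with a minimal Python sketch. The function parse_version is a hypothetical helper that reads the version as two integers separated by a dot, following the rule quoted from 3.1.

```python
def parse_version(text: str) -> tuple:
    """Read "major.minor" as a pair of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)

print(parse_version("4.01") == parse_version("4.1"))   # True: both are (4, 1)
print(parse_version("4.10") > parse_version("4.2"))    # True: minor 10 > minor 2
print(float("4.10") > float("4.2"))                    # False: the decimal reading disagrees
```

With single-digit components, the integer-pair reading and the decimal reading can no longer diverge.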
QT4 CG meeting 101 draft agenda #agenda-12-03
Draft agenda published.
Issue #1627 created #created-1627
Drop validate() and valid() functions from schema-type-record
The functions schema-type(), atomic-type-annotation(), and node-type-annotation() return a schema-type-record in which two of the fields are function items validate() and valid().
I've come to the conclusion that these are difficult to specify, difficult to implement, and difficult to test, and that the benefit of providing them is not great. I propose to drop them. They can always be added back in later.
Pull request #1626 created #created-1626
1625 Editorial changes to notes on path expressions
Fix #1625
Purely editorial.
Issue #1625 created #created-1625
Editorial: misplaced notes for absolute/relative path expressions.
The note regarding leading-lone-slash ambiguity in 4.6.2 (relative path expressions) properly belongs in 4.6.1 (absolute path expressions)
Issue #1624 created #created-1624
document-node(a|b) is the same type as document-node(a)|document-node(b)
document-node(a|b) is the same type as document-node(a) | document-node(b) but the current subtyping rules don't say this.
Revealed by test case misc/subtyping-076
Note that this problem existed before we introduced document-node(X); the same is true of the expansion:
document-node(element(a|b)) is the same type as document-node(element(a)) | document-node(element(b))
Issue #1603 closed #closed-1603
1602 Editorial update to "other operations" on maps and arrays
Pull request #1623 created #created-1623
1615 Editorial rearrangement of "context" sections
This PR is purely editorial.
It drops some rarely-used and imprecise terminology like "module context", and clarifies the description of the role of the static and dynamic context.
In XQuery, it pulls together the material from §2.3.5 (the "Serialization" section of the processing model) and Appendix C.1 (the "static context" appendix) into a new section 5.22 Output Declarations.
Note that much of this material differs between XPath and XQuery, so please review both.
Fix #1615
Pull request #1622 created #created-1622
1619 Specify XSLT map-for-key function
Fix #1619
Specifies an XSLT function map-for-key that converts a key to a map.
Refines the semantics of fn:key() to align with maps in edge cases.
Issue #1621 created #created-1621
compare() with collations that do not support ordering
Many functions that rely on equality-comparison of strings, for example deep-equal() and the eq operator, invoke compare(A, B, Collation). But we say that some collations only support equality comparison, not ordering. Presumably (we don't actually say), compare(A, B, Collation) will fail if the collation does not support ordering; but if it fails, then equality comparisons will fail as well.
It's not obvious what we should do about this. The simplest fix is probably to say that all collations must support ordering as well as equality comparison. Or we could have a fourth result value from compare() to say "values not equal, but their ordering is not defined".
Pull request #1620 created #created-1620
332 Add options for fn:path
Fix #332
Issue #1619 created #created-1619
XSLT: keys as maps
I propose an XSLT function map-for-key('keyname', $root) which returns a map $M having the property that map:get($M, $key) returns the value of key('keyname', $key, $root).
This enables XSLT keys to be exploited in new ways: for example it becomes easy to merge the indexes for multiple documents, or to iterate over all the keys in a document.
These benefits can already be obtained by scrapping keys entirely and building maps instead; but keys do have some benefits (like remaining implicitly associated with particular documents, and being "more declarative") and if you've got a legacy application that makes extensive use of keys, this function gives you a bridging capability.
There are a few edge cases that will need ironing out, for example keys allow matching using a collation, which maps don't. (And the spec of xsl:key, now I come to think of it, says nothing about comparing date/time values in different timezones; I don't expect anyone has ever tried.)
Issue #1618 created #created-1618
Adaptive serialization: doubles
We should make the serialization spec more liberal when it comes to the representation of double values. The prescribed output format is format-number(?, '0.0##########################e0'), which is very strict and often confusing when maps and arrays are serialized. Maps resulting from JSON conversions often contain doubles without users noticing it (related: #1583).
We should make the behavior implementation-dependent or align it with the serialization of JSON data (without losing its additional features to e.g. serialize function items or sequences). Backward compliance shouldn’t be an important issue, as the method was mainly introduced for debugging purposes.
Pull request #1617 created #created-1617
1606 Drop named item types, refine named record types, esp in XSLT
Fix #1606 Fix #1506 Fix #1485
This PR drops the general concept of declaring named item types in XQuery and XSLT, and focuses on declaring named record types. The rules for named record types are tidied up editorially in XQuery (for example there is a clearer distinction between the syntax production RecordType and the concept of a record type, which can be declared either using that syntax, or otherwise). In XSLT the <xsl:item-type> declaration is dropped and an <xsl:record-type> declaration is introduced.
Pull request #1616 created #created-1616
A little cleanup; support function/xfunction globally
This is another PR related to #1610
@michaelhkay suggested that it would be nice to be able to use <function> consistently. This PR attempts to implement that. (It also implements <xfunction> which appears to have been an attempt to do this in the XSLT spec.)
Markup of the form <function>prefix:name#arity</function> will attempt to find the definition of prefix:name in the F&O and XSLT specifications. It will make an appropriate link. If no prefix is provided fn: is assumed and the #arity part is optional.
If someone can pull this PR locally (instructions below) and kick the tires (excuse me, "tyres") I'd appreciate it. I've done a little spot checking, but I can't say I've been comprehensive.
If we agree to merge this, it will then be possible to clean up markup in some places. For example, it appears that the F&O spec relies on special processing of <code> rather than <function>. We should never have done that!
(There's no point looking at the formatted version of this PR, it's all build changes that won't be reflected there.)
Issue #1615 created #created-1615
Drop the terms "module context" and "expression context"
It's not at all clear what these terms are supposed to mean; they are rarely used, and when they are used, they only cause confusion.
For example, the sentence "The names of public variables and public functions must be unique within the [module contexts] of a query" doesn't bear scrutiny. (Can variables have the same names as functions? Yes they can. Can they have the same names as private variables and functions in the same module? No they can't.)
The term "expression context" can probably be usefully replaced in most places by "the static context of an expression".
I think that the idea behind "module context" is that a large part of the static context for expressions is the same for all expressions within a module. But I think that when we use the term, there is usually a better way of saying what we mean.
Pull request #1614 created #created-1614
Fix xfunction refs in XSLT
I changed xfunction refs so that they point to the right URI for 40 functions.
(Partial fix for #1610 )
Pull request #1613 created #created-1613
1605 csv-to-xml to return document node rather than element
Fix #1605
Issue #1612 closed #closed-1612
Drop diagnostic message from stylesheet
Pull request #1612 created #created-1612
Drop diagnostic message from stylesheet
Debugging output was accidentally left in place.
Pull request #1611 created #created-1611
1608 add dependency to fn compare
Fix #1608
QT4 CG meeting 100 draft minutes #minutes-11-26
Draft minutes published.
Issue #1503 closed #closed-1503
$err:map in XSLT
Issue #1505 closed #closed-1505
1503 Add err:map, err:stack-trace, err:additional to XSLT
Issue #1527 closed #closed-1527
Rendition of record definitions in F&O spec
Issue #1586 closed #closed-1586
1527 Move record types into separate sections
Issue #1598 closed #closed-1598
$err:stack-trace: string, please
Issue #1599 closed #closed-1599
1598 $err:stack-trace: string, please
Issue #1593 closed #closed-1593
Item type syntax document-node(*)
Issue #1604 closed #closed-1604
1593 Allow `document-node(NameTestUnion)`
Issue #1570 closed #closed-1570
1550 Replace node-kind() with new type-of() function
Issue #1590 closed #closed-1590
What is the status of fn:current-mode() in XSLT?
Issue #1607 closed #closed-1607
1590 Drop draft current-mode function from catalog
Issue #1516 closed #closed-1516
Test failures in app-spec-examples
Issue #1601 closed #closed-1601
1516(B) Fix problems with testing examples
Issue #1594 closed #closed-1594
typos: dependant and repeated word
Issue #1600 closed #closed-1600
1594 typos: dependant and repeated word
Issue #1595 closed #closed-1595
Editorial: wording in https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-concat misses "be" in "and the arguments can sequences of strings "
Issue #1597 closed #closed-1597
1595 Editorial
Issue #1610 created #created-1610
Some cross references are incorrect
If you look, for example in XSLT, at references to the "FO40" spec, they actually attempt to link to the w3.org location where the spec would have been if it were a REC. Not sure what the fix is, but...
Pull request #1609 created #created-1609
1651 Ordered Maps
Fix #564
Introduces ordered maps: specifically, sorted maps which return entries in order sorted by key, and fifo maps which return entries in the order of insertion.
Although this has been on the TODO-list for a long time and has many useful applications, raising a PR at this stage is particularly motivated by comments on the elements-to-maps() function pointing out that having a predictable order of properties in serialized JSON can be very useful, and that many existing XML-to-JSON converters achieve this. This gives the opportunity, for example, to parse JSON into a representation that retains input order, delete and/or add some properties, and then serialize the JSON with the order retained.
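As a rough illustration of the parse, edit, and serialize workflow described above, here is a minimal Python sketch. It is not XQuery and not the proposed API: it relies on Python's `dict` keeping insertion order, which loosely resembles the proposed fifo maps, and uses `sort_keys=True` as a stand-in for a sorted map. The sample document and the property names `id` and `added` are invented for the example.

```python
import json

def edit_preserving_order(text: str) -> str:
    """Parse JSON, drop one property, add another, and serialize in input order."""
    doc = json.loads(text)      # dict keeps the order in which keys were read
    doc.pop("id", None)         # delete a property (hypothetical name)
    doc["added"] = True         # a new key is appended after the existing ones
    return json.dumps(doc)

source = '{"name": "x", "id": 1, "tags": []}'

# Insertion ("fifo") order: surviving properties stay where they were.
fifo_result = edit_preserving_order(source)

# Sorted order: properties are emitted by key, regardless of input order.
sorted_result = json.dumps(json.loads(fifo_result), sort_keys=True)

print(fifo_result)
print(sorted_result)
```

The point of the sketch is only that the surviving properties keep their input positions after the edit, which is the behaviour the PR aims to make available for maps.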
QT4 CG meeting 100 draft agenda #agenda-11-26
Draft agenda published.
Issue #1608 created #created-1608
fn:compare depends on implicit timezone
The properties of fn:compare state that it is context-dependent on collations, but fail to say that it also depends on implicit timezone.
Pull request #1607 created #created-1607
1590 Drop draft current-mode function from catalog
Fix #1590.
A draft spec for this function is in the function-catalog, but it has never been referenced in the published spec and the draft is incomplete.
This PR has no impact on the published specs, only on processes that access the function catalog.
Issue #1606 created #created-1606
Drop named item types other than named record types
We started with named item types, they were mainly intended for defining records, but allowed any type. Then we realised that records required extra capability, especially recursive definitions and constructors, so we introduced `declare record` for that case. This begs the question as to whether the general `declare type` is still useful enough to merit inclusion. I suspect that if we had `declare record` and didn't have `declare type`, no one would be clamouring for it.
It's not as if all our work on this feature is done. There's still a fair bit to do on the XSLT side, as well as issues like #1520, and I don't think we've really sorted all the issues relating visibility of types to visibility of variables and functions using those types.
Issue #1605 created #created-1605
Change csv-to-xml() to return a document node, not an element node
Generally, functions that construct a new node tree return a document node rather than an element node. This is friendlier, because it means for example that path expressions starting with "/" can be used. (An exception is analyze-string, which it's too late to change).
I propose to bring `csv-to-xml` into line.
Pull request #1604 created #created-1604
1593 Allow `document-node(NameTestUnion)`
Fix #1593
Pull request #1603 created #created-1603
1602 Editorial update to "other operations" on maps and arrays
Updates and aligns the "Other Operations" sections for maps and arrays.
Issue #1602 created #created-1602
Additional Operations on Arrays - redundant/spurious text
F&O Sections 18.3.1 (Singleton Arrays) and 18.3.2 (Value Maps) are almost identical to each other, and neither seems to bear much relationship to the section heading. The material is non-normative so this is a purely editorial issue.
Pull request #1601 created #created-1601
1516(B) Fix problems with testing examples
Fix #1516
- Fixes some examples in the spec where the expected results were apparently incorrect
- Introduces a mechanism for giving a test assertion for an example that is separate from the published result, for example where alternative results are possible
- Marks some tests as schema-aware so a non-schema-aware processor won't attempt to run them
All the tests for features that are implemented in Saxon now run successfully.
Pull request #1600 created #created-1600
1594 typos: dependant and repeated word
Issue: #1594
Pull request #1599 created #created-1599
1598 $err:stack-trace: string, please
Issue: #1598
Issue #1598 created #created-1598
$err:stack-trace: string, please
One unfortunate thing about `$err:stack-trace` is that it is difficult to serialize, for example as json: `serialize($err:map, { 'method': 'json' })` does not work anymore.
I think we should not focus on optimization concerns, but rather return a plain string. If an implementation wants to optimize it further, it shouldn’t be that hard to internally represent it as a lazy string that is generated only when requested.
Pull request #1597 created #created-1597
1595 Editorial
Issue: #1595
Pull request #1596 created #created-1596
1592 Rework rules for selecting a layout
I've reworked the rules for selecting a layout. There's probably more to be done, but this is a start - feedback welcome. I'm marking this "revise" for the moment because I haven't finished it yet. There's no deliberate changing of the spec apart from fixing errors.
Issue #1595 created #created-1595
Editorial: wording in https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-concat misses "be" in "and the arguments can sequences of strings "
I think there is a slight wording/grammar issue in https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-concat saying
The function can now take any number of arguments (previously it had to be two or more), and the arguments can sequences of strings rather than single strings.
It is probably meant to say
The function can now take any number of arguments (previously it had to be two or more), and the arguments can be sequences of strings rather than single strings.
Issue #1594 created #created-1594
typos: dependant and repeated word
I spotted two typos in the XSLT 3 spec, and I'm reporting them here because they are in the XSLT 4 spec as well, in https://raw.githubusercontent.com/qt4cg/qtspecs/refs/heads/master/specifications/xslt-40/src/xslt.xml
- dependant, in "versions of the packages on which this package is dependant."
- "the the ", in "appears in the the initializing expression"
Issue #1593 created #created-1593
Item type syntax document-node(*)
I propose that the syntax `document-node(NameTest)` be allowed as a synonym for `document-node(element(NameTest))` to match a document node that has exactly one element node child matching `NameTest` (possibly with comment and processing-instruction siblings).
The rationale is that `document-node()` is often used in situations where a well-formed document is required, and people are more likely to use the more precise type `document-node(*)` if it can be expressed with less verbosity.
For example, the return type of `parse-xml()` can then be `document-node(*)`, and the return type of `parse-html()` can be `document-node(*:xhtml)`. I propose also that `csv-to-xml()` be brought into line by returning `document-node(fn:csv)`.
There are a number of places in F&O where we currently accept or return `document-node()` and could be more specific by changing this to `document-node(*)`.
The verbosity is especially apparent when we want to use the type `(document-node(element(X)) | element(X))`, which can now be abbreviated to `(document-node(X) | element(X))`, which reads much more clearly.
There are also a number of places where we currently require `element(*)` -- for example the first argument of `elements-to-maps` -- where it would be user-friendly to change this to `(document-node(*) | element(*))` (with the semantics that supplying a document node has the same effect as supplying the outermost element of the document).
We should also clarify that `document-node(element())` does not match a document node having text node children - something that is not currently stated very explicitly.
QT4 CG meeting 099 draft minutes #minutes-11-19
Draft minutes published.
Issue #528 closed #closed-528
fn:elements-to-maps (before: Review of the fn:json() function)
Issue #1575 closed #closed-1575
528bis element to map
Issue #1491 closed #closed-1491
Empty record?
Issue #1577 closed #closed-1577
1491 Empty record types
Issue #1585 closed #closed-1585
Update RELAX NG grammar for XSLT
Issue #767 closed #closed-767
parse-html(): case of SVG element names
Issue #1582 closed #closed-1582
767 Fix reference to HTML5 spec
Issue #69 closed #closed-69
fn:document, fn:function-available: default arguments
Issue #1581 closed #closed-1581
69 Add default for current-merge-group $source
Issue #1580 closed #closed-1580
1462 Change default for deep-equal options
Issue #1493 closed #closed-1493
fn:xml-to-json: Amendments
Issue #1578 closed #closed-1578
1493 Expand the rules for handling numbers in xml-to-json
Issue #1574 closed #closed-1574
Grammar productions missing spec conditionality
Issue #1576 closed #closed-1576
1574 Mark some productions as XQuery only
Issue #1592 created #created-1592
fn:elements-to-maps: Observations
This is a placeholder for feedback on the recently added `fn:elements-to-maps` function.
Adopted from https://github.com/qt4cg/qtspecs/pull/529#issuecomment-1765060154 (and as also suggested by @dnovatchev), some rules still refer to JSON. I think we should refer to the XDM, XML or maps instead. Examples:
- mapping XML to ~~JSON~~ a map
- ~~JSON~~ Map equivalent (13x) → adjust syntax
- their ~~JSON~~ map equivalents
- …etc
Issues that have not fully been discussed: https://github.com/qt4cg/qtspecs/pull/529#issuecomment-1765761565
- https://github.com/qt4cg/qt4tests/issues/181: empty-plus shouldn't require that attributes exist.
- https://github.com/qt4cg/qt4tests/issues/180: "list" incorrectly states that it doesn't apply where the INNER element has attributes.
…more to come.
Issue #1349 closed #closed-1349
Nothing
Issue #421 closed #closed-421
Make sure the build system syntax checks the syntax of examples
Issue #92 closed #closed-92
Simplify rule for attribute values on Extension Instructions used to invoke named templates
Issue #1552 closed #closed-1552
fn:siblings() on parentless nodes
Issue #1573 closed #closed-1573
1552 Change fn:siblings to include self in all cases
Issue #1591 created #created-1591
Implausible filter expressions
I propose to classify `E[P]` as an implausible expression if the only possible value of P that has an effective boolean value is the empty sequence.
An example might be `$uris[parse-uri(.)]`. The result of `parse-uri` is either a map or an empty sequence, so computing the EBV will give either false or an error.
Classifying an expression as implausible licenses the processor to reject it as a static error.
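To make the reasoning concrete, here is a small Python sketch of the effective-boolean-value rules as I understand them from XPath 3.1, with sequences modelled as Python lists and maps as dicts. Node items are left out for brevity. The helper `is_implausible` is hypothetical: it checks sample values rather than static types, so it only approximates what a processor would do.

```python
def effective_boolean_value(seq):
    """EBV of a sequence modelled as a Python list (node items are omitted here)."""
    if len(seq) == 0:
        return False
    if len(seq) == 1:
        item = seq[0]
        if isinstance(item, bool):
            return item
        if isinstance(item, str):
            return len(item) > 0
        if isinstance(item, (int, float)):
            return not (item == 0 or item != item)  # zero and NaN are false
    # Maps, arrays, functions, and longer atomic sequences have no EBV.
    raise TypeError("FORG0006: no effective boolean value")

def is_implausible(possible_results):
    """True if no non-empty possible predicate result has an EBV (hypothetical helper)."""
    for seq in possible_results:
        if len(seq) == 0:
            continue
        try:
            effective_boolean_value(seq)
            return False
        except TypeError:
            pass
    return True

# A predicate like parse-uri(.) yields either an empty sequence or a map.
parse_uri_results = [[], [{"scheme": "http"}]]
print(is_implausible(parse_uri_results))
```

Under these assumptions the predicate can only ever evaluate to false or fail, which is the situation the issue proposes to flag.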
Issue #1590 created #created-1590
What is the status of fn:current-mode() in XSLT?
It appears in the function catalog but not in the specification.
Issue #1589 closed #closed-1589
Implement an instruction/function finder in XSLT
Pull request #1589 created #created-1589
Implement an instruction/function finder in XSLT
See also #1588
Issue #1588 created #created-1588
Move the Streamability chapter?
This is a minor thing, but it annoys me every single time. Open up the XSLT spec and search for any instruction and the first hit in the ToC is always 19.8.4.x "Streamability of [instruction]" which has never been the reason I was looking for the instruction.
Seems we could move 19 to 27, putting it before Serialization or we could tinker with the markup so that 19.8.4.x didn't appear in the ToC.
Pull request #1587 created #created-1587
557 Add fn:unparsed-binary function
Adds the function fn:binary-resource
Also fixes some inconsistencies in the handling of static/executable base URI in other resource access functions.
Fix #557
Issue #1579 closed #closed-1579
Allow $key in map:contains to be empty
Pull request #1586 created #created-1586
1527 Move record types into separate sections
Changes the rendition of record type definitions so each is now defined in a section of its own, extracted from the function catalog into the narrative spec by means of a processing instruction, following the precedent of function definitions. Record type definitions can therefore be cross-referenced using a specref, but they are automatically cross-referenced if named in a function signature.
Fix #1527
QT4 CG meeting 099 draft agenda #agenda-11-19
Draft agenda published.
Pull request #1585 created #created-1585
Update RELAX NG grammar for XSLT
This PR updates the RELAX NG grammar to be (more) consistent with the XSD grammar (and consequently more correct for XSLT 4.0)
- Relaxes the definition of QNames so that prefixes declared with `fixed-namespaces` can be supported
- Adds the `fixed-namespaces` attribute to `xsl:stylesheet`, `xsl:transform`, and `xsl:package`
- Adds the `xsl:switch` element
- Adds the `xsl:array` and `xsl:array-member` elements
- Adds `shallow-copy-all` to the possible values for `on-no-match`
- Adds `separator` to `xsl:apply-templates`
- Removes `select` attribute from `xsl:copy`
- Adds `as` attribute to `xsl:sequence`
- Updated the declarations for `xsl:accumulator-rule`, `xsl:array`, `xsl:array-member`, `xsl:attribute`, `xsl:catch`, `xsl:comment`, `xsl:map-entry`, `xsl:matching-substring`, `xsl:message`, `xsl:namespace`, `xsl:non-matching-substring`, `xsl:on-empty`, `xsl:on-non-empty`, `xsl:otherwise`, `xsl:param`, `xsl:processing-instruction`, `xsl:sequence`, `xsl:sort`, `xsl:value-of`, `xsl:variable`, `xsl:when`, and `xsl:with-param` so that they accept either a `select` attribute or a sequence constructor
More changes may also be required. A comprehensive comparison of the RNC and XSD schemas is needed; see #1584.
Issue #1584 created #created-1584
Review the XML Schema and RELAX NG schemas for XSLT 4.0 for compatibility
It's likely that we've allowed them to drift.
Issue #1583 created #created-1583
JSON: Parsing and serializing numbers, often undesired E notation
If JSON numbers are converted to XML and serialized as JSON, it is confusing to end up with an E notation for large numbers. An example:
'100000000000000000000'
=> parse-json()
=> serialize(map { 'method': 'json' })
Obviously, lossless roundtripping is not possible (`1e20` is a valid JSON number, so we cannot distinguish it from `100000000000000000000`), but as the E notation is much less common than integers, maybe we could try to return more numbers in their integer representation if the result would be equivalent?
Related: #1445
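The effect can be reproduced outside XQuery. The following Python sketch forces JSON integers to be parsed as doubles (mimicking a parser that maps every JSON number to a double) and then compares the default serialization with a hypothetical `serialize_number` helper that prefers integer notation when the value is integral. The `1e21` cut-off is an assumption made for this sketch, not a rule taken from any specification.

```python
import json

def serialize_number(value: float) -> str:
    """Serialize a double, preferring integer notation when the value is integral."""
    # The 1e21 cut-off is an assumption made for this sketch.
    if value.is_integer() and abs(value) < 1e21:
        return str(int(value))
    return json.dumps(value)

# Mimic a parser that turns every JSON number into a double.
parsed = json.loads("100000000000000000000", parse_int=float)

default_output = json.dumps(parsed)        # uses E notation
integer_output = serialize_number(parsed)  # uses integer notation

print(default_output)
print(integer_output)
```

Note that `serialize_number(1e20)` produces the same integer string, which illustrates why the two input spellings cannot be told apart after parsing.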
Pull request #1582 created #created-1582
767 Fix reference to HTML5 spec
The reference to §13.2.9 of the WhatWG DOM spec should be a reference to §13.2.9 of the WhatWG HTML spec.
Fix #767
Issue #854 closed #closed-854
Need more discussion and explanation of deep-lookup operator
Issue #1392 closed #closed-1392
`element(a|b)` vs `(element(a)|element(b))`
Pull request #1581 created #created-1581
69 Add default for current-merge-group $source
I found that the two functions mentioned in the issue (document and function-available) had been updated as suggested. However I also checked all the XSLT-specific functions, and found that for current-merge-group(), the prose has been updated to say what happens if the argument is omitted, but the signature does not actually define a default. I have corrected this.
Fix #69
Issue #1035 closed #closed-1035
Add default values for parameters in constructor functions for records
Pull request #1580 created #created-1580
1462 Change default for deep-equal options
Issue #1579 created #created-1579
Allow $key in map:contains to be empty
I wonder if we could relax the 3rd line of the current signature...
map:contains(
$map as map(*),
$key as xs:anyAtomicType
) as xs:boolean
...to:
$key as xs:anyAtomicType?
It hit me tonight, as I was running a lengthy operation, and some nulls were passed to `key`. I expected the function to return false, not raise an error.
Pull request #1578 created #created-1578
1493 Expand the rules for handling numbers in xml-to-json
I have (a) added more explanation of why the conversion is needed, and (b) described the conversions more prescriptively.
Fix #1493
Pull request #1577 created #created-1577
1491 Empty record types
As well as allowing `record()` for an empty record type as proposed in #1491, this PR also allows named record declarations in XQuery to have no fields. In the course of implementing this I discovered there was old text in the F&O "Constructors" section which duplicated but had become out of sync with the XQuery spec, so much of this has been deleted and replaced with a cross-reference. Named record definitions for XSLT have not yet been defined, this is unfinished business.
Fix #1491
Pull request #1576 created #created-1576
1574 Mark some productions as XQuery only
Marks the relevant production rules as XQuery-only
Fix #1574
Issue #529 closed #closed-529
528 fn:elements-to-maps
Pull request #1575 created #created-1575
528bis element to map
Supersedes #529 Fix #528
Coming back to this after a long gap, I have redrafted the proposal. I've tried to take as many of the comments into account as possible, but I'm aware that I haven't responded to them all.
Issue #1574 created #created-1574
Grammar productions missing spec conditionality
The XPath grammar currently projects the following productions, which appear to be only relevant to XQuery, are not reachable from the XPath root, and either lack an `if="xquery40"` condition, or wrongly have `xpath40` in the condition:
`ExtendedFieldDeclaration`, `ParamListWithDefaults`, `ParamWithDefault`
In fact, the grammar doesn't allow default values for arguments of inline functions. Is this the case?
Issue #1546 closed #closed-1546
1538 Add XSLT support for json-lines
Issue #1538 closed #closed-1538
Add XSLT support for the new json-lines serialization option
Pull request #1573 created #created-1573
1552 Change fn:siblings to include self in all cases
Fix #1552
Issue #1572 closed #closed-1572
Fix markup error
Pull request #1572 created #created-1572
Fix markup error
Merging allowed a `changes` block to become split into two blocks, which isn't allowed.
QT4 CG meeting 098 draft minutes #minutes-11-12
Draft minutes published.
Issue #1449 closed #closed-1449
Discussion: include/import of files.
Issue #1454 closed #closed-1454
1449 Relax rules on multiple xsl:includes
Issue #1540 closed #closed-1540
XSLT: self-reference in global variables
Issue #1544 closed #closed-1544
Allow (some) self-references in global variables
Issue #1548 closed #closed-1548
Managing indentation parameters for serialization
Issue #1560 closed #closed-1560
1548 Clarify default for xsl:output/@indent
Issue #689 closed #closed-689
fn:stack-trace: replace with $err:stack-trace
Issue #1470 closed #closed-1470
689 fn:stack-trace: replace with $err:stack-trace
Issue #1555 closed #closed-1555
parse-json() - default for the `escape` option
Issue #1565 closed #closed-1565
1555 change default for parse json escape
Issue #1486 closed #closed-1486
Editorial corrections & cleanups
Issue #1556 closed #closed-1556
1486 Editorial corrections & cleanups
Issue #1567 closed #closed-1567
Missing change log entries
Issue #1569 closed #closed-1569
1567 Supply missing change metadata
Issue #1571 created #created-1571
Discussion: On the implementability of the specs and helping implementors
Functions and Operators
There are 4 classes of function here:
- functions that have to be implemented natively -- e.g. `fn:parse-html`;
- functions that are implemented in terms of native operations -- i.e. the `dm:*` and `op:*` functions;
- functions that can be implemented in XSLT or XQuery but can be done more efficiently natively;
- functions that can be implemented in XSLT or XQuery as efficiently as they can natively.
It could be useful to generate a function library of the form `namespace/function.xqy` and `namespace/function.xsl` that has the implementation of the functions that can be implemented in XSLT and XQuery. This would allow implementors to import/include those implementations into their processors/engines. -- This is more flexible than providing them all in a single file as implementors can include the functions they don't have implementations for without having to edit the files every time the spec changes.
Note: JavaScript supports polyfill files for new classes/functions so that engines that don't support those features can get a functioning implementation of that function/class.
Note: Many JavaScript engines implement various functions in JavaScript itself.
XPath and XQuery
We could make the EBNF available as a separate file in addition to the iXML grammar that has been discussed/worked on. This would help implementors on the lexer and parser at least. There's not much else we can do here as the language is custom.
XSLT
We have the XMLSchema and RelaxNG grammars to help with validation. Implementors could use these in their build systems to provide API bindings to the data model.
XDM
We could provide the XDM/XPath specific XMLSchema extensions as a separate XMLSchema definition to allow implementors to get access to the type information for these such as for `xs:numeric`.
Issue #1325 closed #closed-1325
Variadic System Functions: Principles?
Issue #1478 closed #closed-1478
Drop variadic functions
Issue #1535 closed #closed-1535
1478 Drop variadic functions
Issue #1463 closed #closed-1463
fn:element-number: Feedback
Issue #1543 closed #closed-1543
Drop fn:element-number
Issue #1534 closed #closed-1534
Allow xsl:result-document/@select
Issue #1549 closed #closed-1549
1534 Allow xsl:result-document/@select
Issue #1553 closed #closed-1553
Define positional predicates on axis steps more formally
Issue #1557 closed #closed-1557
1553 Expand explanation of predicates in axis steps
Issue #1522 closed #closed-1522
Ambiguity in XSLT Pattern Grammar
Issue #1558 closed #closed-1558
1522 Fix syntax ambiguity in patterns
Issue #1515 closed #closed-1515
higher order function group-by or gather-by for grouping
Issue #1559 closed #closed-1559
1515 Add cross-references to map:build
Issue #1561 closed #closed-1561
schema-for-xslt40 is invalid
Issue #1562 closed #closed-1562
1561 Correct the schema for XSLT 4.0
Issue #1563 closed #closed-1563
Errors in examples of new fn:schema-type function
Issue #1564 closed #closed-1564
1563 Fix fn:schema-type examples
Pull request #1570 created #created-1570
1550 Replace node-kind() with new type-of() function
Drops the newly-introduced `fn:node-kind()` function in favour of a more general function `fn:type-of()`.
Pull request #1569 created #created-1569
1567 Supply missing change metadata
Fix #1567
Issue #1568 created #created-1568
Define a Unicode case-insensitive collation
Unfinished business from issue #668.
Analogously to the current HTML case-insensitive collation (which case-normalises ASCII characters only), define a Unicode case-insensitive collation that case-normalizes all Unicode characters. It is basically equivalent to converting both strings to lower-case and then comparing using code-point collation. Although UCA collations allow for case-insensitivity, they combine this with lots of other baggage such as ignoring punctuation characters.
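The difference between the two collations can be sketched in a few lines of Python. This is only an approximation of the proposal: it follows the issue text literally by lower-casing both strings and comparing code points, and the function names are made up for the example. Python compares `str` values by code point, which stands in for the code-point collation.

```python
def compare_codepoints(a: str, b: str) -> int:
    """Three-way comparison by Unicode code point: -1, 0, or 1."""
    return (a > b) - (a < b)

def compare_unicode_ci(a: str, b: str) -> int:
    """Case-insensitive comparison that lower-cases all Unicode characters."""
    return compare_codepoints(a.lower(), b.lower())

def compare_html_ascii_ci(a: str, b: str) -> int:
    """Case-insensitive comparison that case-normalises ASCII letters only."""
    def ascii_lower(s: str) -> str:
        return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)
    return compare_codepoints(ascii_lower(a), ascii_lower(b))

upper = "\u00c9COLE"   # "ECOLE" with E-acute as its first letter, in upper case
lower = "\u00e9cole"   # the same word in lower case

print(compare_unicode_ci(upper, lower))     # equal under full Unicode lower-casing
print(compare_html_ascii_ci(upper, lower))  # differs: the accented letter is not folded
```

A real definition would need to settle details this sketch ignores, such as whether lower-casing or case folding is used and how normalization forms interact with the comparison.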
Issue #1567 created #created-1567
Missing change log entries
Some 4.0 changes have no entries in the change log.
Issue #1566 created #created-1566
EXPath Modules: Future
Would it be realistic to move the most important EXPath specifications (Binary, File, maybe other modules) to the W3 realm?
Nowadays, it has become close to impossible to get into contact with Florent Georges reliably, and we have no guarantee that https://expath.org/ remains online.
I would be ready to transform the File Module to a new format and to maintain it in the future.
QT4 CG meeting 098 draft agenda #agenda-11-12
Draft agenda published.
Pull request #1565 created #created-1565
1555 change default for parse json escape
Fix #1555
Pull request #1564 created #created-1564
1563 Fix fn:schema-type examples
Fix #1563
Issue #1563 created #created-1563
Errors in examples of new fn:schema-type function
`primitive-type` and `base-type` are functions.
The base type of `positiveInteger` is `nonNegativeInteger`.
Pull request #1562 created #created-1562
1561 Correct the schema for XSLT 4.0
Fix #1561
Test case catalog-005 now passes, showing that all the non-error stylesheets in the test suite are valid against the schema.
Issue #1561 created #created-1561
schema-for-xslt40 is invalid
The schema for XSLT 4.0 (included as a free-standing file and incorporated as an appendix) is not a valid schema.
(a) The union type for the `fixed-namespaces` attributes contains children of the form `<xs:simpleType ref=".."/>` which though intuitive, is not allowed.
(b) The first `xs:assert` in the definition of `xsl:for-each-group` has misplaced parentheses in the call on `count()`.
In addition, the schema needs some updates to catch up with the latest XSLT 4.0 syntax changes. The problems are revealed by XSLT 4.0 test case catalog-005.
Pull request #1560 created #created-1560
1548 Clarify default for xsl:output/@indent
Fix #1548
XSLT 3.0 specified no default for xsl:output/@indent in the case of the JSON and Adaptive output methods. This PR sets the default to "no".
I believe this is sufficient to close #1548.
Issue #1348 closed #closed-1348
Grammar rules: redundancies
Pull request #1559 created #created-1559
1515 Add cross-references to map:build
Purely editorial; adds cross-references to map:build (for example from XSLT and XQuery grouping) to make the function more visible.
Fix #1515
Pull request #1558 created #created-1558
1522 Fix syntax ambiguity in patterns
Fix #1522
Pull request #1557 created #created-1557
1553 Expand explanation of predicates in axis steps
Purely editorial.
Fix #1553
Pull request #1556 created #created-1556
1486 Editorial corrections & cleanups
Issue: #1486
Issue #1555 created #created-1555
parse-json() - default for the `escape` option
See https://github.com/w3c/qt3tests/issues/65, where it is pointed out that we have test cases that assume the default for the `escape` option of parse-json() is `false`, whereas the spec says it should be `true`.
The Saxon implementation (and presumably any other implementation that passes the tests) sets the default to `false`, and if we were arguing from first principles then I think I would argue this is a better choice.
We need either to change the tests or to change the spec. Since we can't change the 3.1 spec retrospectively, neither choice is particularly attractive.
Issue #1554 created #created-1554
XQFO: Formal Specification
In the XQFO, the “Formal Specification” sections present XPath/XQuery expressions that are equivalent to the introduced function.
In a previous meeting, it has been noted that “Equivalent Expression” may be a better term for these sections.
Would it be sufficient to simply rename the section, and adapt the wording in 1.5.5 Formal Specification?
Issue #1553 created #created-1553
Define positional predicates on axis steps more formally
The effect of positional predicates on axis steps (for example `preceding-sibling::*[1]`) is an area that causes users a lot of trouble. We could provide a more formal definition, and we could also provide more notes and examples.
In particular, we haven't expanded the notes and examples to explain what happens when you have a range of integers such as `preceding-sibling::*[1 to 3]`, which is now allowed. (Spoiler alert: if the siblings are A B C D E, you get C D E in that order).
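The spoiler can be checked with a small Python model. It assumes the usual reading of reverse axes: positions inside the predicate count backwards from the context node, while the step's result is returned in document order. The function name, the list representation of siblings, and the extra context node `F` are all invented for this sketch.

```python
def preceding_sibling_step(siblings, context_index, positions):
    """Model preceding-sibling::*[positions] for the node at context_index.

    siblings is a list in document order; positions is a set of 1-based
    positions counted in reverse document order (1 = nearest preceding sibling).
    """
    preceding = siblings[:context_index]
    count = len(preceding)
    # The node at document index i has reverse-axis position count - i.
    return [node for i, node in enumerate(preceding) if (count - i) in positions]

siblings = ["A", "B", "C", "D", "E", "F"]   # F is the (hypothetical) context node
context = siblings.index("F")

nearest = preceding_sibling_step(siblings, context, {1})
first_three = preceding_sibling_step(siblings, context, set(range(1, 4)))

print(nearest)
print(first_three)
```

Position 1 selects E, the nearest sibling, and the range 1 to 3 selects E, D, and C, which are then delivered in document order.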
Issue #1552 created #created-1552
fn:siblings() on parentless nodes
It feels rather odd that the result of `fn:siblings()` should include the argument node, except when it is parentless.
If fn:siblings were defined as `preceding-sibling::node() | self::node() | following-sibling::node()`, then the start node would be included even if it is parentless -- and even if it is an attribute or namespace node.
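A toy Python model shows the behaviour the union-based definition would give. The `Node` class is a hypothetical stand-in for an XDM node with only a name, a parent, and child nodes; attributes are modelled as nodes that have a parent but are not among its children.

```python
class Node:
    """Minimal tree node: a name, an optional parent, and a list of children."""
    def __init__(self, name, parent=None, is_child=True):
        self.name = name
        self.parent = parent
        self.children = []
        # Attribute-like nodes have a parent but are not in its children.
        if parent is not None and is_child:
            parent.children.append(self)

def siblings(node):
    """preceding-sibling | self | following-sibling, in document order."""
    if node.parent is None or node not in node.parent.children:
        return [node]          # parentless or attribute-like: only the node itself
    return list(node.parent.children)

root = Node("root")
a, b, c = Node("a", root), Node("b", root), Node("c", root)
attr = Node("id", root, is_child=False)

print([n.name for n in siblings(b)])
print([n.name for n in siblings(root)])
print([n.name for n in siblings(attr)])
```

In this model the start node is always part of the result, whether or not it has a parent.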
Issue #1551 closed #closed-1551
Correct return type for fn:siblings
Pull request #1551 created #created-1551
Correct return type for fn:siblings
As noted during the review at yesterday's meeting. (I was supposed to correct it before applying the PR, but pressed the wrong key...)
Issue #1550 created #created-1550
More requirements for type information
See original issue #148.
There have been requests for further type information beyond that supplied by the four new functions:
- `node-kind()`
- `atomic-type-annotation()`
- `node-type-annotation()`
- `schema-type()`
One of the requests was to be able to test if an item is a map, an array, some other function, a node, or an atomic value.
This could perhaps be done by broadening node-kind() to a function item-kind() that returns "map" for a map, "array" for an array, etc. We could also return the result in item-type syntax, say `map(*)` or `array(*)` or `function(*)`.
What else is needed?
Issue #1539 closed #closed-1539
New function: System’s default time zone for arbitrary date/time values.
Issue #1545 closed #closed-1545
1539 New civil-timezone function
Issue #1542 closed #closed-1542
Formalize definitions of axes
Issue #1547 closed #closed-1547
1542 Add "formal" definitions of non-primitive axes
QT4 CG meeting 097 draft minutes #minutes-11-05
Draft minutes published.
Issue #148 closed #closed-148
Get the type of a value
Issue #1523 closed #closed-1523
148 New functions to get type information
Issue #1541 closed #closed-1541
QT4CG-096-1 Add notes explaining EBNF notation
Pull request #1549 created #created-1549
1534 Allow xsl:result-document/@select
Fix #1534
QT4 CG meeting 097 draft agenda #agenda-11-05
Draft agenda published.
Issue #1548 created #created-1548
Managing indentation parameters for serialization
In the discussion of PR #1497 at meeting 096, some concern was expressed that the default value for indentation might be problematic in testing.
It was observed that all of the serialization parameter settings are defined by the host language, not by the serialization specification, but that did not resolve the concerns.
Can we/should we/would we mandate that indentation is disabled by default? (Is that not already the case in XQuery and XSLT?)
Pull request #1547 created #created-1547
1542 Add "formal" definitions of non-primitive axes
Fix #1542
Pull request #1546 created #created-1546
1538 Add XSLT support for json-lines
Fix #1538
I also did some editorial cleanup of the serialization spec, in particular parameters like `indent` now have the value `true` or `false`, while recognizing that some host languages may allow alternative representations such as yes/no or 1/0.
Pull request #1545 created #created-1545
1539 New civil-timezone function
Fix #1539
Pull request #1544 created #created-1544
Allow (some) self-references in global variables
Fix #1540
Pull request #1543 created #created-1543
Drop fn:element-number
Fix #1463
Issue #1542 created #created-1542
Formalize definitions of axes
It would be good if the definitions of the various axes were less informal.
The four axes child, parent, attribute, and namespace are defined directly in terms of data model accessors.
The remaining axes can be defined as follows, where axis names are used as functions:
self($node): $node
ancestor($node): transitive-closure($node, parent#1)
ancestor-or-self($node): ancestor($node) | $node
descendant($node): transitive-closure($node, child#1)
descendant-or-self($node): descendant($node) | $node
following($node): $node / ancestor-or-self() / following-sibling() / descendant-or-self()
following-or-self($node): following($node) | $node
following-sibling($node): parent() / child() [. >> $node]
following-sibling-or-self($node): following-sibling($node) | $node
preceding($node): $node / ancestor-or-self() / preceding-sibling() / descendant-or-self()
preceding-or-self($node): preceding($node) | $node
preceding-sibling($node): parent() / child() [. << $node]
preceding-sibling-or-self($node): preceding-sibling($node) | $node
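As a rough illustration of how the derived axes fall out of two primitive steps, here is a minimal Python sketch over a toy tree. The Node class, the helper names and the sample tree are all hypothetical; the sketch ignores attribute and namespace nodes, and it does not return results in document order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Toy tree node standing in for a data-model node (hypothetical)."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        kid = Node(name, parent=self)
        self.children.append(kid)
        return kid


def parent_axis(n):
    return [n.parent] if n.parent is not None else []


def child_axis(n):
    return list(n.children)


def transitive_closure(node, step):
    """Apply `step` repeatedly and collect every node reached (not `node` itself)."""
    seen, frontier = [], list(step(node))
    while frontier:
        current = frontier.pop(0)
        if current not in seen:          # identity comparison (eq=False)
            seen.append(current)
            frontier.extend(step(current))
    return seen


def ancestor(n):
    return transitive_closure(n, parent_axis)


def descendant(n):
    return transitive_closure(n, child_axis)


def ancestor_or_self(n):
    return ancestor(n) + [n]


def descendant_or_self(n):
    return [n] + descendant(n)


def following_sibling(n):
    sibs = child_axis(n.parent) if n.parent is not None else []
    idx = next((i for i, s in enumerate(sibs) if s is n), len(sibs))
    return sibs[idx + 1:]


def following(n):
    # $node / ancestor-or-self() / following-sibling() / descendant-or-self()
    out = []
    for a in ancestor_or_self(n):
        for s in following_sibling(a):
            out.extend(descendant_or_self(s))
    return out


# Sample tree: root(a(b, c), d(e))
root = Node("root")
a = root.add("a")
b = a.add("b")
c = a.add("c")
d = root.add("d")
e = d.add("e")

print(sorted(x.name for x in following(b)))
print(sorted(x.name for x in ancestor(b)))
```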
Pull request #1541 created #created-1541
QT4CG-096-1 Add notes explaining EBNF notation
Issue #1500 closed #closed-1500
Coupling of global variable-bound maps to character maps in XSLT
Issue #1530 closed #closed-1530
1500 New XSLT character-map() function
Issue #1540 created #created-1540
XSLT: self-reference in global variables
We should change the rules for XSLT global variables to align with the new rules for XQuery.
Specifically, drop the rule that a global variable is out of scope within its own definition, falling back on the existing circularity rules to disallow cases where the reference is genuinely circular.
The effect would be to allow a global variable to be bound to a recursive inline function, for example (nonsense example)
<xsl:variable name="tot" select="fn($x){if ($x=0) then 0 else $tot($x - 1) + 2}"/>
Issue #1471 closed #closed-1471
JSON Serialization: Sequences on Top Level
Issue #1497 closed #closed-1497
1471 JSON Serialization: json-lines
Issue #1539 created #created-1539
New function: System’s default time zone for arbitrary date/time values.
It is not easy to find out the correct timezone for a given UTC xs:dateTime in the current region. We have fn:implicit-timezone, but it only refers to the current date and time.
The following code works for at least BaseX and Saxon; it applies the system’s default time zone to a given xs:dateTime item:
let $dtm := xs:dateTime('2024-07-01T01:01:01Z')
let $ms := xs:integer(($dtm - xs:dateTime('1970-01-01T00:00:00Z')) div xs:dayTimeDuration('PT0.001S'))
let $tz := xs:dayTimeDuration('PT' ||
Q{java:java.time.ZonedDateTime}ofInstant(
Q{java:java.time.Instant}ofEpochMilli($ms),
Q{java:java.time.ZoneId}systemDefault()
)
=> Q{java:java.time.ZonedDateTime}getOffset()
=> Q{java:java.time.ZoneOffset}getTotalSeconds()|| 'S')
return adjust-dateTime-to-timezone($dtm, $tz)
It returns 2024-07-01T03:01:01+02:00 (MESZ, i.e. CEST) on systems located in Leipzig and nearby cities.
We could either introduce a function that
- returns an xs:dayTimeDuration timezone for a given xs:dateTime item (with the system’s default time zone applied), or
- converts an xs:dateTime item to the system’s default time zone at the given date/time.
Suggestions for good names are welcome.
See also https://xmlcom.slack.com/archives/C01GVC3JLHE/p1730216267200449:
Issue #1536 closed #closed-1536
document-uri of xslt transformation primary output
QT4 CG meeting 096 draft minutes #minutes-10-29
Draft minutes published.
Issue #1366 closed #closed-1366
In the EBNF, use explicit separator syntax
Issue #1498 closed #closed-1498
1366 Use ++ and ** operators in EBNF
Issue #868 closed #closed-868
fn:intersperse → fn:join, array:join($arrays, $separator)
Issue #1504 closed #closed-1504
868 fn:intersperse → fn:join, array:join($arrays, $separator)
Issue #1318 closed #closed-1318
Function Coercion: Records, Maps, Arrays
Issue #1501 closed #closed-1501
1318 Function Coercion: Records, Maps, Arrays
Issue #1495 closed #closed-1495
Drop "context value static type"
Issue #1496 closed #closed-1496
1495 Drop context value static type
Issue #1519 closed #closed-1519
Add `-or-self` variants of all relevant axes
Issue #1532 closed #closed-1532
1519 Add -or-self axes
Issue #1525 closed #closed-1525
Add more explanation on enumeration types
Issue #1529 closed #closed-1529
1525 Add notes on enumeration types
Issue #1499 closed #closed-1499
Editorial: reduce noise in serialization spec for unused options
Issue #1531 closed #closed-1531
1499 Deduplicate text relating to unused serialization parameters
Issue #1533 closed #closed-1533
Actions QT4CG-095-01 and -02 - follow-up on computed node constructors
Issue #1538 created #created-1538
Add XSLT support for the new json-lines serialization option
Add XSLT support for the new json-lines serialization option (PR #1497)
Issue #1537 created #created-1537
XSLT: local functions within an enclosing xsl:mode
I recently wrote a multi-phase transformation and tried out the new "enclosing modes" feature where the template rules for a mode are enclosed within the xsl:mode element. Worked very well, and really helps to give the stylesheet a more modular structure. But I found myself wanting to write "helper" functions within the xsl:mode definition.
I don't think it would be too difficult to add this feature. I imagine that such functions would be scoped to the enclosing xsl:mode, and would automatically have higher import precedence than anything outside the mode. There are probably a few complications, e.g. if the arity range overlaps a function with the same name declared outside the enclosed mode, but I would think it's manageable.
Issue #1536 created #created-1536
document-uri of xslt transformation primary output
When using XSLT transformations, it would be helpful in many cases to know the transformation target, that is the base-uri of the primary result document. This would allow, for example,
- to make the uri of secondary result documents relative to the primary base-uri to ensure that all output is in the same target directory
- to copy media files from transformation source directory into transformation target directory
- to generate a transformation report as secondary output, which informs and links to the primary result document
I am pretty sure that there are situations where the URI of the primary result document is undefined or unknown. The XPath function fn:document-uri is described as "Returns the URI of a resource where a document can be found, if available.". So maybe we could have a new function fn:primary-result-document-uri described as "Returns the URI of the primary result document, if available.".
Pull request #1535 created #created-1535
1478 Drop variadic functions
Fix #1478 Fix #1325
This PR drops variadic functions, reverting to the situation in previous versions where concat was in a special category of its own.
We decided (see the referenced issues) not to introduce further variadic system functions, mainly in the interests of extensibility, and the same arguments apply to user-defined functions.
It is not a great hardship to write f((x, y, z)) rather than f(x, y, z), and dropping the feature therefore removes a fair bit of complexity that has proved to be of rather limited value.
QT4 CG meeting 096 draft agenda #agenda-10-29
Draft agenda published.
Issue #1534 created #created-1534
Allow xsl:result-document/@select
Nearly all XSLT instructions that accept a sequence constructor also allow the input to be supplied using a select expression. xsl:result-document is an exception.
A particular use case for this is with a multi-phase transformation where you want to capture the result of the first phase of processing in a variable, and then output it before further processing, perhaps for diagnostics or perhaps because the processing pipeline branches.
Pull request #1533 created #created-1533
Actions QT4CG-095-01 and -02 - follow-up on computed node constructors
- Adds advice and guidance on avoiding reserved words
- Drops XQuery-specific material from the XPath spec
- Adds a paragraph to the incompatibilities appendix
Pull request #1532 created #created-1532
1519 Add -or-self axes
Fix #1519
Pull request #1531 created #created-1531
1499 Deduplicate text relating to unused serialization parameters
Fix #1499
Pull request #1530 created #created-1530
1500 New XSLT character-map() function
Fix #1500
Pull request #1529 created #created-1529
1525 Add notes on enumeration types
Fix #1525
Issue #1528 created #created-1528
Computed node constructors: observations
Observations/conclusions from the exchange on Slack regarding computed node constructors:
- We should highlight the breaking change in the appendix: J.1 Incompatibilities relative to XQuery 3.1.
- We should present a list of the keywords that are ambiguous and exclude other ones (e.g. count or value).
- Maybe we should generally discourage users from using the legacy NCName syntax, and remove corresponding examples in the spec, as further versions of the language may lead to new incompatibilities.
- Syntax errors in quoted element names should already be detected at parse time, for example by using pseudo quotes:
# currently
CompNodeName ::= StringLiteral | UnreservedName | ("{" Expr "}")
CompNodeNCName ::= StringLiteral | UnreservedNCName | ("{" Expr "}")
# proposed
CompNodeName ::= UnreservedName | ('"' UnreservedName '"') | ("'" UnreservedName "'") | ("{" Expr "}")
CompNodeNCName ::= UnreservedNCName | ('"' UnreservedNCName '"') | ("'" UnreservedNCName "'") | ("{" Expr "}")
A quick evaluation over appr. 8,000 XQuery files resulted in the following list of occurrences of possible incompatibilities (spread across appr. 400 files):
- 300x attribute type {
- 107x element option {
- 98x element record {
- 85x attribute count {
- 81x attribute value {
- 31x element value {
- 24x element type {
- 18x attribute text {
- 15x attribute values {
- 12x attribute namespace {
- 11x attribute default {
- 10x element text {
- 10x attribute key {
- 10x element key {
- 9x element group {
- 9x element count {
- 8x element to {
- 8x element item {
- 5x element div {
- 4x attribute collation {
- 4x attribute encoding {
- 4x attribute comment {
- 3x attribute context {
- 3x attribute case {
- 3x element map {
- 3x element empty {
- 2x element items {
- 2x attribute to {
- 1x attribute empty {
- 1x attribute item {
- 1x element element {
- 1x attribute where {
- 1x element values {
- 1x attribute if {
- 1x element in {
- 1x element comment {
- 1x attribute start {
- 1x attribute end {
- 1x element document {
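A survey of this kind could be approximated with a naive regular-expression scan. The Python sketch below is not the method used to obtain the figures above: the keyword subset and the function name are hypothetical, and the scan does not skip string literals or comments, so it can over-count.

```python
import re
from collections import Counter

# Hypothetical subset of the keywords of interest.
KEYWORDS = ("type", "option", "record", "count", "value", "text")

# Matches e.g. "attribute type {" or "element count{".
PATTERN = re.compile(
    r"\b(element|attribute)\s+(" + "|".join(KEYWORDS) + r")\s*\{"
)


def count_constructors(source: str) -> Counter:
    """Count computed constructors whose unquoted name is one of KEYWORDS."""
    return Counter(f"{kind} {name}" for kind, name in PATTERN.findall(source))


sample = (
    'attribute type { "a" }, element count { 1 }, '
    'attribute type {"b"}, element counter { 2 }'
)
for constructor, hits in count_constructors(sample).most_common():
    print(f"{hits}x {constructor} {{")
```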
Issue #1527 created #created-1527
Rendition of record definitions in F&O spec
We're making progress here but there are still things that need fixing.
Currently a record definition like uri-structure-record is expanded:
(a) into a full definition (with explanations of all the fields) at the point where a PI of the form <?record-description uri-structure-record?> appears. This may be either within a specific function entry in the function catalog, or in the narrative prose of xpath-functions.xml.
(b) into a concise definition (without explanations of fields) at the point where the type is referenced in a function signature.
A hyperlink to the record definition is created (i) where the type name appears in a function signature, and (ii) manually, using the generic link format <loc href="#uri-structure-record">uri-structure-record</loc>. However, the target of the link is the first concise definition (see (b) above) rather than the full definition.
There are also some limitations in the rendering of the definitions. The full definition does not indicate which fields are optional/required, though this information is available in the XML. Neither the full nor the concise definition appears to indicate whether the record is extensible.
Issue #1526 closed #closed-1526
Emergency fix to test generator stylesheet
Pull request #1526 created #created-1526
Emergency fix to test generator stylesheet
There's a parse-xml() call in the test generator stylesheet that's failing to process some of the test examples, for reasons that aren't entirely clear. This is causing the entire build to fail. This PR adds a try/catch around the parse-xml() call so that a failure only affects the individual tests, not the entire build.
Issue #1525 created #created-1525
Add more explanation on enumeration types
In 3.2.6 Enumeration Types we already have some explanation of how enumeration types work, and I think it's sound, but I think a few more words about the consequences might be useful.
We note correctly that 'red' instance of enum('red', 'green', 'blue') is false.
We should also note that let $red as enum('red', 'green', 'blue', 'yellow') := "red" return $red instance of enum('red', 'green', 'blue') is true; and indeed that a string S that is cast or coerced to any enumeration type that permits S is an instance of every enumeration type that permits S. This is a conscious design decision that has both advantages and disadvantages, so we should explain the consequences carefully.
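The consequence described here can be pictured with a toy model in Python. It assumes, purely for illustration, that each atomic value carries a flag saying whether it has been cast or coerced to some enumeration type; the Atomic class and both function names are hypothetical and are not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atomic:
    value: str
    is_enum_value: bool = False  # hypothetical stand-in for the type annotation


def coerce_to_enum(item: Atomic, permitted: tuple) -> Atomic:
    """Coerce a string to an enumeration type, failing if it is not permitted."""
    if item.value not in permitted:
        raise TypeError(f"{item.value!r} is not permitted by enum{permitted}")
    return Atomic(item.value, is_enum_value=True)


def instance_of_enum(item: Atomic, permitted: tuple) -> bool:
    """An enum-annotated string is an instance of every enum that permits it."""
    return item.is_enum_value and item.value in permitted


RGB = ("red", "green", "blue")
RGBY = ("red", "green", "blue", "yellow")

plain = Atomic("red")                 # like the string literal 'red'
red = coerce_to_enum(plain, RGBY)     # like: let $red as enum(...) := "red"

print(instance_of_enum(plain, RGB))   # plain xs:string, not an enum value
print(instance_of_enum(red, RGB))     # coerced via RGBY, still matches RGB
```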
Issue #1524 closed #closed-1524
Coercing records: error codes
QT4 CG meeting 095 draft minutes #minutes-10-22
Draft minutes published.
Issue #1524 created #created-1524
Coercing records: error codes
By analysing the coercion rules for complex data structures (see the discussion in #1501), I wondered which error codes, apart from XPTY0004, may arise from the coercion of records.
If we attempt to coerce a function item to a string, we get FOTY0013:
let $r as xs:string := true#0
return $r
I would expect to also get FOTY0013 if a map value, which is a function item, is coerced to a string. Is this correct? If not, how does this case differ from the first one?
let $r as record(a as xs:string) := { 'a': true#0 }
return $r
Issue #1518 closed #closed-1518
Add to changes metadata
Issue #1517 closed #closed-1517
1516(A) Fix failing F&O examples
Issue #1509 closed #closed-1509
XQuery import schema (location hints)
Issue #1510 closed #closed-1510
1509 Drop obsolete/redundant text about "import schema" location hints
Issue #1507 closed #closed-1507
Formal spec of fn:parse-integer
Issue #1508 closed #closed-1508
1507 Make format-integer spec legible
Issue #1357 closed #closed-1357
Rendering of new vs. updated features
Issue #1521 closed #closed-1521
Update the changed/new marks in the ToC
Issue #1345 closed #closed-1345
Bare brace ambiguity resolution in practice
Issue #1511 closed #closed-1511
1345 Re-allow bare-brace map constructors everywhere
Issue #1179 closed #closed-1179
Editorial: `array:values`, `map:values`
Issue #1169 closed #closed-1169
Maps & Arrays: Consistency & Terminology
Issue #1114 closed #closed-1114
Partial function application: Keywords and placeholders
Issue #1065 closed #closed-1065
fn:format-number: further notes
Issue #735 closed #closed-735
Local functions in XSLT
Issue #573 closed #closed-573
Node construction functions
Issue #1512 closed #closed-1512
Disallow reserved names in computed processing-instruction and namespace node constructors
Issue #1513 closed #closed-1513
1512 Disallow reserved names in namespace and PI constructors
Issue #1458 closed #closed-1458
Arguments that have a default value but don't accept ()
Issue #1502 closed #closed-1502
1458 Arguments that have a default value but don't accept ()
Pull request #1523 created #created-1523
148 New functions to get type information
Provides four new functions:
- node-kind
- schema-type
- atomic-type-annotation
- node-type-annotation
to return type information using a new record structure schema-record-type.
Fix #148 Partial Fix for #1271
Issue #1522 created #created-1522
Ambiguity in XSLT Pattern Grammar
I've tested the grammar against the ~500 distinct patterns in the stylesheets of the attr/match test sets (which is the largest collection I can find). As far as I can tell, there is one ambiguity inherent in the current grammar, which is not covered by notes, in that the pattern id() (and similarly element-with-id(), key() and root()) can parse (in reduced form) in two ways:
<Pattern40 xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
<RootedPath>
<FunctionCallP>
<OuterFunctionName>id</OuterFunctionName>
<ArgumentListP/>
</FunctionCallP>
</RootedPath>
</Pattern40>
<Pattern40 xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
<PostfixExprP>
<FunctionCallP>
<OuterFunctionName>id</OuterFunctionName>
<ArgumentListP/>
</FunctionCallP>
</PostfixExprP>
</Pattern40>
that is, there are two paths, via RootedPath and via RelativePathExprP/StepExprP/PostfixExprP, to get from PathExprP to FunctionCallP.
Pull request #1521 created #created-1521
Update the changed/new marks in the ToC
Fix #1357
This is a purely cosmetic PR. I think @ChristianGruen is right that it is unfortunate that we've lost the distinction between "new" and "only updated" functions. I'm trying to make that work again, in a way that's more visually distinct.
I have:
- Use Δ for changed sections.
- Use Δ➕ for new functions. (I'm making no effort to determine if non-function sections are new or changed; I doubt that it's either worth the effort or likely to be correct. One could argue that ➕ alone is sufficient, but I liked the consistency this way. Even new functions do change between drafts.)
- Per @michaelhkay, a function is "new" if it doesn't appear in F&O 3.1.
- I found it distracting that the delta symbol preceded the ToC entry. That's purely aesthetic, I guess, but I've put them at the end.
You can see the results here: https://qt4cgtest.nwalsh.com/branch/iss-1357/xpath-functions-40/Overview.html
Issue #1520 created #created-1520
Type declarations of cyclically dependent modules
The specification of Item Type Declarations has this restriction:
The declaration of an item type (whether locally declared in a module or imported from a public declaration in an imported module) must precede any use of the item type name: that is, the name only becomes available in the static context of constructs that lexically follow the relevant item type declaration or module import. A consequence of this rule is that cyclic and self-referential definitions are not allowed.
But modules explicitly are allowed to have cyclic dependencies:
Implementations must resolve cycles in the import graph, either at the level of target namespace URIs or at the level of location URIs, and ensure that each module is imported only once.
Does that mean that the modules below are valid? They depend on each other, but the types that they define do not have a cyclic dependency. In particular, the use of each type lexically follows the relevant module import, as asked for.
If this case must be supported, the usefulness of the above restriction for XQuery processors is somewhat limited; IMO, it compromises this suggestion for handling declared types:
Named item types have been designed so that a reference to an item type name can be expanded (that is, replaced by its definition) as soon as the reference is encountered during query parsing.
(: a.xqm :)
module namespace a = 'a';
import module namespace b = 'b' at 'b.xqm';
declare type a:t1 as b:t2;
declare type a:t2 as xs:integer;
(: b.xqm :)
module namespace b = 'b';
import module namespace a = 'a' at 'a.xqm';
declare type b:t1 as a:t2;
declare type b:t2 as xs:integer;
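To make the distinction between module-level and type-level cycles concrete, here is a small Python sketch that models the two modules above as dictionaries. The data layout and function names are hypothetical, and the sketch deliberately ignores the lexical-ordering rule; it only shows that the import graph can be cyclic while every type name still expands to a built-in type.

```python
# Hypothetical model of a.xqm and b.xqm: each module lists its imports
# and maps local type names to the type they are declared as.
MODULES = {
    "a": {"imports": ["b"], "types": {"t1": "b:t2", "t2": "xs:integer"}},
    "b": {"imports": ["a"], "types": {"t1": "a:t2", "t2": "xs:integer"}},
}


def mutually_importing(modules):
    """True if some pair of modules import each other."""
    return any(
        name in modules[dep]["imports"]
        for name, module in modules.items()
        for dep in module["imports"]
    )


def expand(qname, modules, in_progress=()):
    """Replace a declared type name by its definition until a built-in is reached."""
    prefix, local = qname.split(":")
    if prefix not in modules:          # built-in type such as xs:integer
        return qname
    if qname in in_progress:
        chain = " -> ".join(in_progress + (qname,))
        raise ValueError("cyclic type definition: " + chain)
    target = modules[prefix]["types"][local]
    return expand(target, modules, in_progress + (qname,))


print(mutually_importing(MODULES))   # the import graph is cyclic
print(expand("a:t1", MODULES))       # but the type definitions are not
print(expand("b:t1", MODULES))
```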
Issue #1519 created #created-1519
Add `-or-self` variants of all relevant axes
Add the axes preceding-or-self, following-or-self, preceding-sibling-or-self, following-sibling-or-self, with the obvious meanings.
A minor convenience avoiding the frequent need to write things like (. | preceding-sibling::*)[@x][last()]; especially useful because preceding-sibling-or-self returns results starting at the context node whereas (. | preceding-sibling) reverses the order.
QT4 CG meeting 095 draft agenda #agenda-10-21
Draft agenda published.
Pull request #1518 created #created-1518
Add to changes metadata
No issue raised.
Adds links from change metadata in the spec to issue and PR numbers in GitHub in many cases where these were previously missing.
Pull request #1517 created #created-1517
1516(A) Fix failing F&O examples
Corrects many of the errors in tests identified in issue #1516
Issue #1516 created #created-1516
Test failures in app-spec-examples
I'm getting the following test failures (excluding ones where the Saxon implementation is known to be incomplete).
fo-test-fn-csv-to-xml-004 to -010
These are failing because the results are formatted with whitespace which the example expression does not actually generate.
fo-test-fn-format-number-005
Error on line 3 at column 16 (in expression on line 1116) of ... qt4tests/app/fo-spec-examples.xml FODF1310 format-number picture: Digit sign must not appear after a zero-digit sign in the integer part of a sub-picture
The test is format-number(12345, '0,###^0', { 'percent': '%:pc' }) Result: "14pc"
which seems to be a complete aberration.
fo-test-fn-format-number-008, -009, -010
fail saying decimal-format 'de' is not defined. The dependency is documented in narrative prose, but is not codified so that the test generator knows about it.
fo-test-fn-function-annotations-*
failing because the examples use a non-existent function xs:QName#2. Probably fn:QName#2 is intended.
fo-test-fn-highest-005
Error XPST0142 Keyword key does not match the name of any declared parameter of function fn:highest
Possible Saxon bug ??
fo-test-fn-hours-from-dateTime-007
Example is incorrect - it calls year-from-dateTime not hours-from-dateTime
fo-test-map-build-007
Results are not deep-equal to the stated result; I haven't established why. Possible whitespace issue.
fo-test-map-pairs-001
Getting "assert-permutation failed" from the Saxon test driver. Possible problem with the test driver?
Issue #1336 closed #closed-1336
Editorial: fos record descriptions within xmlspec prose
Issue #1515 created #created-1515
higher order function group-by or gather-by for grouping
XPath 4 offers a new function fn:partition: "Partitions a sequence of items into a sequence of non-empty arrays containing the same items, starting a new partition when a supplied condition is true". It looks like an equivalent to XSLT xsl:for-each-group with @group-starting-with.
I would appreciate another new HOF as an equivalent to XSLT xsl:for-each-group with @group-by, that is, one that partitions a sequence of items into a sequence of non-empty arrays containing the same items, where all items in a partition give the same value when a function f is applied. This is called gather-by in Mathematica.
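The requested behaviour can be sketched in a few lines of Python. The name gather_by is borrowed from the wording above and is hypothetical; the sketch assumes that the computed keys are hashable and that groups should appear in order of first occurrence, neither of which is stated in the request.

```python
def gather_by(items, key):
    """Group items into lists so that all items in a list share the same key value."""
    groups = {}  # dicts preserve insertion order, so groups keep first-occurrence order
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())


print(gather_by(range(1, 8), lambda n: n % 3))
print(gather_by(["apple", "avocado", "banana", "blueberry", "cherry"], lambda s: s[0]))
```

Unlike the partition-style grouping, items with equal keys end up together even when they are not adjacent in the input.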
Issue #1514 created #created-1514
Editorial: optional position argument in function signature for for-each and other HOF
The change in 4.0 for the higher order function for-each is that "the $action callback function accepts an optional position argument". But the function signature is
fn:for-each(
$input as item()*,
$action as fn(item(), xs:integer) as item()*
) as item()*
I read $action as fn(item(), xs:integer) as item()* as a function with two mandatory parameters item() and xs:integer. Since the second (position) argument should be optional, I would expect:
fn:for-each(
$input as item()*,
$action as fn(item(), xs:integer?) as item()*
) as item()*
There are several HOFs with this new optional position argument which seems to be mandatory in the function signature.
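One way to read "optional position argument" is that the caller always has two values available and a callback declaring only one parameter simply does not receive the second. The Python sketch below illustrates that reading only; it inspects the callback's arity at run time, which is an assumption of this sketch and not a description of how the specification defines it. The function name for_each mirrors the signature above, and the result is a plain list with no sequence flattening.

```python
import inspect


def for_each(items, action):
    """Call `action` with (item) or (item, position), depending on its arity."""
    arity = len(inspect.signature(action).parameters)
    if arity not in (1, 2):
        raise TypeError("action must accept one or two arguments")
    results = []
    for position, item in enumerate(items, start=1):  # positions are 1-based
        if arity == 1:
            results.append(action(item))
        else:
            results.append(action(item, position))
    return results


print(for_each(["a", "b"], lambda s: s.upper()))
print(for_each(["a", "b"], lambda s, i: f"{i}:{s}"))
```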
Pull request #1513 created #created-1513
1512 Disallow reserved names in namespace and PI constructors
Fix #1512
Issue #1512 created #created-1512
Disallow reserved names in computed processing-instruction and namespace node constructors
The new rules banning unquoted reserved names in computed element and attribute constructors should apply equally to processing instruction and namespace node constructors.
Pull request #1511 created #created-1511
1345 Re-allow bare-brace map constructors everywhere
Fix #1345
Having changed computed element/attribute constructors to require reserved names to be quoted, we can now reintroduce bare-brace map constructors without ambiguity.
Pull request #1510 created #created-1510
1509 Drop obsolete/redundant text about "import schema" location hints
Fix #1509
Issue #1509 created #created-1509
XQuery import schema (location hints)
In XQuery §5.11, the paragraph starting "The first [URILiteral] in a schema import..." contains obsolete information about the handling of location hints. It should refer instead to the new and more complete treatment given four paragraphs later at "The [URILiterals] that follow the at keyword are optional location hints..."
Pull request #1508 created #created-1508
1507 Make format-integer spec legible
Fix #1507
Make the "formal spec" of fn:parse-integer legible (and portable between XPath and XQuery) by avoiding use of XML character references in the code.
Issue #1507 created #created-1507
Formal spec of fn:parse-integer
In the formal spec of fn:parse-integer, it is not visually clear what the second argument of translate() is in
let $preprocessed-value := translate($value, "_
", "")
We should spell out the characters by using codepoints-to-string() or char().
Issue #1506 created #created-1506
Type declarations: Constructor functions?
I’ve tagged this as a discussion issue:
Would it make sense to declare constructor functions for the new XQuery type declarations, similar to what we now have for records, or do we believe that’s over the top?
It might improve typing in complex code, and it would allow us to write things like:
declare type world:continent as enum('Africa', 'America', 'Asia', 'Australia', 'Europe');
world:continent('Africa')
For the type above, an implicit constructor function would be created that would simply coerce its argument to the declared type:
declare function world:continent($value) as world:continent { $value };
One current drawback is that this only works for prefixed types (see #657), whereas it’s possible to define types without prefix, and reference them in local type declarations:
declare type continent as enum('Africa', 'America', 'Asia', 'Australia', 'Europe');
let $c as continent := 'Africa'
return $c
Pull request #1505 created #created-1505
1503 Add err:map, err:stack-trace, err:additional to XSLT
Fix #1503
Pull request #1504 created #created-1504
868 fn:intersperse → fn:join, array:join($arrays, $separator)
Issue: #868
Issue #1503 created #created-1503
$err:map in XSLT
XSLT needs to be brought into line with XQuery in terms of the variables available in a catch clause, in particular $err:map (see PR #493)
Pull request #1502 created #created-1502
1458 Arguments that have a default value but don't accept ()
Issue: #1458
Issue #1330 closed #closed-1330
$fallback argument of map:get() and array:get() should allow () to be supplied
Pull request #1501 created #created-1501
1318 Function Coercion: Records, Maps, Arrays
Issue: #1318
Coercion rules added for maps and arrays.
Issue #1500 created #created-1500
Coupling of global variable-bound maps to character maps in XSLT
In an application I am writing now, the xsl:output-characters I am writing in my xsl:character-map are of interest elsewhere in the XSLT complex that is slowly emerging.
The exercise makes me realize that character maps can be interesting in their own right. We give xsl:character-maps names, and include them within each other, because they group meaningfully related character-string pairs. Such sets are the sort of thing one might want to have more closely coupled to the XSLT apparatus. For example, someone might create an xsl:character-map to deal with Unicode characters in a particular script. And those characters are of interest in their own right, and the character selection might engage with other processes that need to interact with those characters.
What if we were to extend @use-character-maps to allow character maps to draw from other preexisting maps? The list of eqNames in @use-character-maps would be resolved first against names of character maps. For any eqName that is not the name of a character map, the processor would search for a global variable or global parameter by that name. Any referenced global variable/parameter must be empty or a map. Every key must be castable as a character, and every value must be a string. Failure on any of these points would raise an error.
Here is an example of hypothetical XSLT code, to illustrate how the innovation might be productively useful, producing two different character maps, each of which might be appropriate for one type of serialization or another:
<xsl:item-type name="letters:grc" as="record(transliteration as xs:string, name as xs:string)"/>
<xsl:variable name="master-map" as="map(*)">
<xsl:map>
<xsl:map-entry key="'α'" select="letters:grc('a', 'alpha')"/>
<xsl:map-entry key="'β'" select="letters:grc('b', 'beta')"/>
</xsl:map>
</xsl:variable>
<xsl:variable name="serialization-transliteration-map" as="map(xs:string, xs:string)">
<xsl:map>
<xsl:for-each select="map:keys($master-map)">
<xsl:map-entry key="." select="$master-map(current())?transliteration"/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
<xsl:variable name="serialization-name-map" as="map(xs:string, xs:string)">
<xsl:map>
<xsl:for-each select="map:keys($master-map)">
<xsl:map-entry key="." select="$master-map(current())?name"/>
</xsl:for-each>
</xsl:map>
</xsl:variable>
<xsl:character-map name="transliteration" use-character-maps="serialization-transliteration-map"/>
<xsl:character-map name="names" use-character-maps="serialization-name-map"/>
In other words, if an xsl:character-map is just a map, why not give it access to other XSLT structures that are maps?
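The proposed resolution rule for @use-character-maps (character maps first, then global variables or parameters, with validation of keys and values) can be sketched in Python as follows. All function names and the dictionary-based representation are hypothetical; "castable as a character" is approximated here as "a string of exactly one code point".

```python
def as_character_map(value):
    """Validate a global variable's value for use as a character map."""
    if value is None:                      # empty: contributes nothing
        return {}
    if not isinstance(value, dict):
        raise TypeError("referenced variable must be empty or a map")
    for key, replacement in value.items():
        if not (isinstance(key, str) and len(key) == 1):
            raise ValueError(f"key {key!r} is not a single character")
        if not isinstance(replacement, str):
            raise ValueError(f"value for {key!r} is not a string")
    return dict(value)


def resolve_use_character_maps(names, character_maps, global_variables):
    """Resolve each name against character maps first, then global variables."""
    merged = {}
    for name in names:
        if name in character_maps:
            merged.update(character_maps[name])
        elif name in global_variables:
            merged.update(as_character_map(global_variables[name]))
        else:
            raise LookupError(f"no character map or global variable named {name!r}")
    return merged


def serialize(text, character_map):
    """Replace each mapped character by its string; leave the rest unchanged."""
    return "".join(character_map.get(ch, ch) for ch in text)


variables = {"serialization-transliteration-map": {"α": "a", "β": "b"}}
cmap = resolve_use_character_maps(["serialization-transliteration-map"], {}, variables)
print(serialize("αβγ", cmap))
```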
Issue #1499 created #created-1499
Editorial: reduce noise in serialization spec for unused options
I suggest that instead of the repetitive use of paragraphs like "The json-lines serialization parameter is not applicable to the XML output method.", we should have a general statement that serialization parameters are not applicable unless otherwise specified; perhaps accompanied by a chart showing which parameters apply to which methods.
Pull request #1498 created #created-1498
1366 Use ++ and ** operators in EBNF
Fix #1366
Issue #1487 closed #closed-1487
xsl:array - don't allow content to be supplied in array form
QT4 CG meeting 094 draft minutes #minutes-10-15
Draft minutes published.
Issue #1472 closed #closed-1472
1471 JSON Serialization: Sequences on Top Level
Pull request #1497 created #created-1497
1471 JSON Serialization: json-lines
Closes #1471 and #1472.
Pull request #1496 created #created-1496
1495 Drop context value static type
Fix #1495
Also corrects one or two broken cross-spec links.
Issue #1488 closed #closed-1488
1487 in xsl:array, drop option to construct arrays from arrays
Issue #1495 created #created-1495
Drop "context value static type"
Drop the "context value static type" from the definition of the static context.
In the absence of a specification for static typing, the feature is unused, and we have dropped similar features such as statically-known document and collection types.
Issue #1394 closed #closed-1394
XSLT Default priority for `element(p:*)` etc
Issue #1442 closed #closed-1442
1394 Add new default priority rules
Issue #1378 closed #closed-1378
1375 bugs in pattern syntax
Issue #1375 closed #closed-1375
XSLT: names of functions in pattern
Issue #1467 closed #closed-1467
Modest editorial corrections to XSLT specs through 2.7
Issue #1483 closed #closed-1483
Type `none`
Issue #1489 closed #closed-1489
1483 return type of fn:error
Issue #1308 closed #closed-1308
fn:apply argument names
Issue #1490 closed #closed-1490
1308 In fn:apply, Correct $array to $arguments
Issue #1312 closed #closed-1312
Productions missing ws:explicit
Issue #1492 closed #closed-1492
1312 Add ws:explicit annotations
Issue #1183 closed #closed-1183
transient() - a function to make functions nondeterministic
Issue #1305 closed #closed-1305
Almost all functions in FO that must process multiple string items, can have as a parameter only a single collation
Issue #1473 closed #closed-1473
fn:identity: make it variadic
QT4 CG meeting 094 draft agenda #agenda-10-15
Draft agenda published.
Issue #1469 closed #closed-1469
Function finder
Issue #1494 created #created-1494
Records: Introduction?
It has been reported to me that the XQuery specification provides a nice and compact introduction on maps and arrays, but that there is no comparable introduction for records yet. The “Changes” section on 3.2.8.3 Record Type gives a hint: It contains the sentence “Record types are added as a new kind of ItemType, constraining the value space of maps.”.
In addition, maybe we could rename the section Named Record Types to “Record Declaration”, analogous to “Variable Declaration”, “Context Value Declaration” and “Function Declaration(s)”. “Item Type Declarations” could be renamed to “Type Declaration”.
Issue #1493 created #created-1493
fn:xml-to-json: Amendments
Maybe I was too quick in waving through #1476 as I believe that the current version is a bit sketchy. It says:
An element $E named number is processed by copying the string value of $E to the output, making any changes that are necessary to ensure that the result is a valid JSON number. Such changes include:
- Removing leading and trailing whitespace.
- Removing a leading plus sign.
- Removing redundant leading zero digits.
- Adding a zero digit before or after a decimal point that is not preceded and followed by a digit.
- For input like X, we cannot ensure that it will be a valid JSON number, so I assume that the changed string needs to be validated before being output?
- “Removing redundant leading zero digits” may not consider negative numbers like -01.
- Does “Such changes include” imply that the list may not be comprehensive?
In many cases, the numbers to be output will be the result of an earlier json-to-xml conversion. If it is generated with XPath numbers, we shouldn’t encounter plus signs, redundant leading zeros etc. either, so I would suggest getting rid of all post-processing. We could simply say:
An element $E named number results in the output of the string value of the element if it is a valid JSON number. Otherwise, [FOJS0006] is raised.
If we do want to tweak the string value, we should provide a complete set of rules; maybe something like…
The string value of an element $E named number is modified by:
- removing leading and trailing whitespace,
- removing a single leading plus sign,
- removing redundant leading zero digits, which are optionally preceded by a leading minus sign,
- adding a zero digit before a decimal point that is not preceded by a digit, and
- adding a zero digit after a decimal point that is not followed by a digit.
If the result is a valid JSON number, it is output. Otherwise, [FOJS0006] is raised.
…but that’s still fuzzy (for example, it lacks the explanation of what redundant zero digits are).
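The stricter rule set proposed above can be sketched in a few lines. The following Python is only an illustration under stated assumptions: the name normalize_number is hypothetical, whitespace is taken to mean space, tab, CR and LF, "redundant leading zero digits" is read as zeros in the integer part that are followed by another digit, and a ValueError stands in for [FOJS0006].

```python
import re

# JSON number grammar per RFC 8259: optional minus, integer part without
# redundant leading zeros, optional fraction, optional exponent.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def normalize_number(text: str) -> str:
    """Apply the proposed tweaks, then validate (hypothetical helper)."""
    s = text.strip(" \t\r\n")               # remove leading and trailing whitespace
    if s.startswith("+"):
        s = s[1:]                           # remove a single leading plus sign
    sign = ""
    if s.startswith("-"):
        sign, s = "-", s[1:]                # set an optional minus sign aside
    s = re.sub(r"^0+(?=[0-9])", "", s)      # drop zeros that precede another digit
    if s.startswith("."):
        s = "0" + s                         # add a zero before a bare decimal point
    s = re.sub(r"\.(?![0-9])", ".0", s, count=1)  # add a zero after a bare decimal point
    result = sign + s
    if not JSON_NUMBER.fullmatch(result):
        # Stand-in for raising [FOJS0006] (assumption for this sketch)
        raise ValueError(f"FOJS0006: not a valid JSON number: {text!r}")
    return result
```

Validating after the tweaks, rather than before, means that input such as X is rejected while " +01.5 " and -01 are repaired. Whether repair is desirable at all is exactly the question the issue raises.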
Pull request #1492 created #created-1492
1312 Add ws:explicit annotations
Also updates the list of tokens using angle-brackets.
Fix #1312
Issue #1491 created #created-1491
Empty record?
Even if the use cases may be limited: Shouldn’t we allow empty record types and declarations for the sake of completeness?
(: check for empty map :)
if($map instance of record()) then ...
declare record empty();
let $empty as empty := {}
return $empty
Pull request #1490 created #created-1490
1308 In fn:apply, Correct $array to $arguments
Fix #1308
Pull request #1489 created #created-1489
1483 return type of fn:error
Fix #1483
Pull request #1488 created #created-1488
1487 in xsl:array, drop option to construct arrays from arrays
Issue #1487 created #created-1487
xsl:array - don't allow content to be supplied in array form
We reviewed and accepted a revised specification for xsl:array, but there was some unease about one of the options: specifically the ability to supply the content in the form of a sequence of arrays, which is then converted to an array of sequences.
Further work on implementation and on writing test cases inclines me to drop this option. There are several reasons:
- It is error-prone. When constructing nested arrays, there is a tendency to use xsl:array/xsl:array rather than xsl:array/xsl:array-member/xsl:array, and rather than leading to an error, this leads to incorrect results which can be hard to diagnose.
- The specification relies on converting an array to a sequence using $array?*, but this is lossy, for example the array [(1,2),3] is converted to the sequence (1,2,3). This provides a further source of potential confusion when users get it wrong.
- Use of xsl:array-member seems to handle all the requirements and result in more readable code, and it's best to focus on providing a smaller number of different ways of achieving the same effect.
The proposal is to drop option 2(e) of section 22.1.
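The lossiness mentioned in the second reason can be illustrated outside XSLT. This Python sketch is only an analogy: it models an array as a list whose members are tuples (an assumption made for illustration), and members_to_sequence is a hypothetical stand-in for $array?*.

```python
from itertools import chain

# Model an array as a list of members; each member is a sequence,
# represented here as a tuple (an assumption made for illustration).
nested = [(1, 2), (3,)]        # the array [(1,2),3]
flat = [(1,), (2,), (3,)]      # the array [1,2,3]


def members_to_sequence(array):
    """Rough analogue of $array?*: concatenate all members into one sequence."""
    return tuple(chain.from_iterable(array))


print(members_to_sequence(nested))  # (1, 2, 3)
print(members_to_sequence(flat))    # (1, 2, 3)
```

Two different arrays produce the same sequence, so the member boundaries cannot be recovered once the conversion has happened.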
Issue #1450 closed #closed-1450
Syntax of computed element and attribute constructors
Issue #1480 closed #closed-1480
1450 Disallow reserved names in element/attribute constructors
QT4 CG meeting 093 draft minutes #minutes-10-08
Draft minutes published.
Issue #1486 created #created-1486
Editorial corrections & cleanups
XQFO:
- [x] fn:hours-from-dateTime: Wrong example: year-from-dateTime( xs:time("12:30:00") ) → https://github.com/qt4cg/qtspecs/pull/1517/files#diff-7625c07ae8131ff65c3caa677b188ed2b9b66237312d11c05a2fa2838c6f5c67R9794
- [x] fn:void: “Formal Specification” should be dropped. As it is ·implementation-dependent· whether the supplied argument is evaluated or ignored, the empty sequence may not be equivalent.
- [x] fn:parse-uri: Ampersands in 4th example (...&sort=relevance) must be escaped.
- [x] fn:format-number: format-number(0.14, '01%', {'percent': '%:pc')
- [x] Unify equivalent expressions
- [x] fn:civil-timezone: $dateTime → $value
- [x] fo-test-fn-siblings-002
XPath:
- [x] Add changes section for =?> (https://github.com/qt4cg/qtspecs/pull/985)
XQuery:
- [x] declare record p:person {$first as xs:string, $last as xs:string, *);
- [x] declare type app:invoice as map("xs:string", element(inv:paid-invoice))
- [x] declare type app:overdue-invoices as map("xs:date", app:invoice*)
General:
- [x] Format code (see #1124)
Issue #1474 closed #closed-1474
xml-to-json: strip leading plus signs
Issue #1476 closed #closed-1476
1474 xml-to-json: ensure numbers are JSON conformant
Issue #1477 closed #closed-1477
1475 Stylesheet change to mark optional fields with '?'
Issue #1475 closed #closed-1475
In rendered named record types, optional fields are not so marked
Issue #1448 closed #closed-1448
Operations on the dateTime family of types
Issue #1481 closed #closed-1481
1448 Component extraction on gregorian types
Issue #1468 closed #closed-1468
Understanding the xsl:array constructor
Issue #1482 closed #closed-1482
1468 Revise the xsl:array instruction
Issue #1351 closed #closed-1351
declare item type → type
Issue #1277 closed #closed-1277
Declare named record types
Issue #1355 closed #closed-1355
1351 Add "declare record" in XQuery
Issue #1485 created #created-1485
Record declarations in XSLT
In PR #1355 we have added "declare record" syntax to XQuery.
We should now do the same for XSLT.
Issue #46 closed #closed-46
xsl:sequence: @as
Issue #1403 closed #closed-1403
Align AnyMapTest, AnyArrayTest and with ElementTest
Issue #1484 created #created-1484
Functions that expect a record type should make it extensible
In general a function (or other operation, e.g. an XSLT instruction) that expects a record type as input should make that record type extensible. For example, array:of-members should accept record(value as item()*, *) rather than record(value as item()*) as currently defined; similarly map:of-pairs should accept record(key, value, *).
Two reasons: (a) it avoids the user having to remove extraneous fields from records if they happen to be present, and (b) it avoids the system having to check whether extraneous fields are present.
For example, it now becomes legal (and perhaps sometimes useful) to write array:of-members(map:pairs($map)); currently this fails because the result of map:pairs includes key fields which array:of-members does not permit.
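The difference between the two record types can be sketched as a small membership check. This Python is an analogy only: matches_record is a hypothetical helper, maps are modelled as dicts, and field types are deliberately not checked.

```python
def matches_record(value: dict, fields: set, extensible: bool) -> bool:
    """Check a map (dict) against a record type given as a set of field names.

    Field types are ignored in this sketch; only field names are compared.
    """
    present = set(value)
    if not fields <= present:
        return False                        # a declared field is missing
    return extensible or present <= fields  # extra fields need an extensible type


# A key/value pair of the kind produced by map:pairs (illustrative data)
pair = {"key": "a", "value": (1, 2)}

print(matches_record(pair, {"value"}, extensible=False))  # False: record(value)
print(matches_record(pair, {"value"}, extensible=True))   # True: record(value, *)
```

With the extensible form, the extra key field no longer has to be stripped by the user or checked for by the system.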
QT4 CG meeting 093 draft agenda #agenda-10-08
Draft agenda published.
Issue #1483 created #created-1483
Type `none`
The function fn:error uses none as return type. This type is only defined in the outdated XQuery 1.0 and XPath 2.0 Formal Semantics specification:
- Do we want to live with the current state?
- Should we include a better definition in the current specifications?
- Should we simply use item()* as return type, even if the function never returns anything?
If we keep none, the test case generator should be revised for Keywords-fn-error-1.
Pull request #1482 created #created-1482
1468 Revise the xsl:array instruction
Attempts an improved (more intuitive) specification for the xsl:array instruction.
Fix #1468
Issue #1239 closed #closed-1239
XSLT xsl:next-match with select attribute
Issue #1273 closed #closed-1273
Generalize for-each-pair to work with any number of input sequences
Pull request #1481 created #created-1481
1448 Component extraction on gregorian types
Fix #1448
Pull request #1480 created #created-1480
1450 Disallow reserved names in element/attribute constructors
Fix #1450
Issue #1479 created #created-1479
Default element namespace in XQuery: interaction of 'fixed' and '##any'
In §5.14 we document two changes to "declare default element namespace":
- The [default namespace for elements and types] can now be declared to be fixed for a query module, meaning it is unaffected by a namespace declaration appearing on a direct element constructor. [citation missing - should be issue #65, PR #753]
- The [default namespace for elements and types] can be set to the value ##any, allowing unprefixed names in axis steps to match elements with a given local name in any namespace. [Issue #296 PR #1181]
We don't discuss how these two options interact. Are they completely orthogonal?
The description of ##any says its effect is that "an unprefixed name appearing in any other context where an element or type name is expected is treated as being in no namespace." This deserves a mention in the rules for direct and computed element constructors.
In §2.2.1 we have added the statement: The statically known namespaces may include a binding for the zero-length prefix; however, this is used only in limited circumstances because the rules for resolving unprefixed QNames depend on how such a name is used.
It's not clear how such a binding can come into existence. It can't be done using any of the prolog declarations that bind a prefix. And under the current rules, I don't think it can be done using a namespace declaration attribute on a fixed element constructor. Perhaps it was the intent that xmlns="abc" would do this if "fixed" is set, but that's not the case currently.
Section §4.12.1.2 Namespace Declaration Attributes could do with a "changes" section explaining what has changed.
In §4.12.1, Direct Element Constructors, we say "If the element name has no namespace prefix, the namespace binding for the zero-length prefix in the [statically known namespaces] is used; if there is no such binding, the element name will be in no namespace." That can't be right, there's no mention of the default element/type namespace at all.
In §4.12.3.1 Computed Element Constructors, the same problem arises for a bare unprefixed NCName, but for a dynamically-computed name, we use the default element/type namespace.
Issue #1478 created #created-1478
Drop variadic functions
We've made a decision -- for good reasons of maintaining extensibility -- not to make any of the functions in the system function library variadic.
I think this raises the question of whether the feature is worth retaining. And we already had the situation that many functions that were obvious candidates for variadicity (min, max, distinct-values, sum) couldn't use the feature because they already had additional optional arguments.
If it's not a good design principle for system-defined functions, then the same applies equally to user-defined functions.
We could drop the feature and just revert to defining concat() as a special case. It would remove a significant amount of complexity.
Pull request #1477 created #created-1477
1475 Stylesheet change to mark optional fields with '?'
Fix #1475
Pull request #1476 created #created-1476
1474 xml-to-json: ensure numbers are JSON conformant
Fix #1474
Describes the changes that might be needed to supplied numbers to ensure the output conforms with JSON syntax.
Independently, adds a note to appendix G.4 to affirm that changes to argument keywords have no backwards compatibility implication, satisfying action QT4CG-091-01.
Issue #1464 closed #closed-1464
Inconsistent spelling: implementer or implementor?
Issue #1466 closed #closed-1466
1464 Standardize on "implementer" spelling
Issue #1475 created #created-1475
In rendered named record types, optional fields are not so marked
For example in uri-structure-record as shown beneath the signatures of fn:build-uri and fn:parse-uri, the field names are not marked "?" to indicate the field is optional.
The underlying XML contains this information correctly.
Issue #1461 closed #closed-1461
### Errors in `misc/BuiltinKeywords`
Issue #1460 closed #closed-1460
1323b Function parameters names: $uri → $source
Issue #1323 closed #closed-1323
Function parameters names: $uri vs. $href
Issue #1474 created #created-1474
xml-to-json: strip leading plus signs
xml-to-json now retains the format of supplied numbers, except that it strips off redundant leading zeros to ensure they are valid JSON.
It also needs to strip off redundant leading plus signs for the same reason.
QT4 CG meeting 092 draft minutes #minutes-10-01
Draft minutes published.
Issue #1436 closed #closed-1436
1323 Function parameters names: $href → $uri
Issue #1445 closed #closed-1445
fn:xml-to-json: `number-formatter` option
Issue #1455 closed #closed-1455
1445 Drop number-formatter option, retain string value
Issue #1437 closed #closed-1437
1325 Variadic System Functions limited to `fn:concat`
Issue #1429 closed #closed-1429
1403 Align type tests
Issue #1465 closed #closed-1465
1461 Generate correct tests for functions involving named record types
Issue #1473 created #created-1473
fn:identity: make it variadic
…see https://github.com/qt4cg/qtspecs/pull/1437#issuecomment-2346599503.
Pull request #1472 created #created-1472
1471 JSON Serialization: Sequences on Top Level
Issue: #1471
Issue #1471 created #created-1471
JSON Serialization: Sequences on Top Level
Extracted from #576:
All serialization methods except for json allow sequences to be output on top level. We should also allow this for JSON data. This way, constructs like…
declare option output:item-separator '
';
declare option output:method 'text';
for $json in ({ 1: 2 }, { 3: 4 })
return serialize($json, { 'method': 'json' })
…can be simplified to:
declare option output:method 'json';
{ 1: 2 },
{ 3: 4 }
We should still not allow sequences within JSON structures. It would be inconsistent to have different output rules depending on the number of items in a sequence ((1) and [1] would not be serialized identically).
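As a rough analogue of the proposal, the following Python sketch serializes each item of a top-level sequence as its own JSON value and joins the results with a separator. The helper name serialize_sequence and the newline default are assumptions for illustration; the actual separator rules would be whatever the serialization specification ends up defining.

```python
import json


def serialize_sequence(items, item_separator="\n"):
    """Serialize each top-level item as JSON and join the results (sketch)."""
    return item_separator.join(json.dumps(item) for item in items)


# Two maps on top level, comparable to ({ 1: 2 }, { 3: 4 })
print(serialize_sequence([{1: 2}, {3: 4}]))
```

Only the top level is treated as a sequence here; nested values are still handed to the JSON serializer as single values, matching the restriction stated above.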
Pull request #1470 created #created-1470
689 fn:stack-trace: replace with $err:stack-trace
Issue: #689
Issue #1469 created #created-1469
Function finder
With the number of functions in the fn namespace now standing at 213, the use of a drop-down in the F+O "function finder" has become a little unwieldy. Are there ways it could be improved, for example by reducing the size of the list when you type the first character of the required function name?
Issue #1468 created #created-1468
Understanding the xsl:array constructor
In reading the XSLT specs on the xsl:array constructor (#406 adopted at meeting 28) I find myself somewhat confused. Either I need to be illumined, the specs need to be clarified, or some other adjustments are needed.
First, I'm still struck by the disparity between maps and arrays; there's a xsl:map-member but no xsl:array-member. That horse has been beaten many times, I know. But I need to record that the disparity is very noticeable. The disorientation is aggravated by the second example in the constructor overview:
<xsl:array use="?value">
<xsl:for-each-group select="0 to 19" group-adjacent=". idiv 4">
<xsl:map-entry key="'value'" select="current-group()"/>
</xsl:for-each-group>
</xsl:array>
A map entry within an array? This just strikes me as a hack, and is suggestive of a design flaw in the language. (I know what you're about to say in response; hold that thought...) That's exacerbated when I see example 2c invoking an array constructor within an array, only to get rid of the nested array to get at what you wanted in the first place, the items grouped into members.
Furthermore, the value of attribute @use, ?value, is cryptic. Yes, I get what's going on, but the first several times reading it I thought it was very abracadabrish. And that same feeling hit me for the other examples' values of @use: .() and ?*. The .() bit of syntax appears only twice in the specs (here), without explanation.
If @use is evaluated once for every item, then in example 2b...
<xsl:array use=".()">
<xsl:for-each-group select="0 to 19" group-adjacent=". idiv 4">
<xsl:sequence select="current-group#0"/>
</xsl:for-each-group>
</xsl:array>
...I would expect the expression .() to be applied to twenty items, because a sequence of five sequences of four items each is simply one sequence of twenty items, innit? Or are we allowed here and only here for a sequence constructor to create a sequence of sequences?
It seems that @use is striving to do something similar to @group-by in xsl:for-each-group. But it isn't. Understanding the goal is obfuscated by the vague name. Use what, to what end?
Overall, I feel that the array constructor is a 2nd-class citizen in the specs. I love using the map constructor in XSLT: it tells users very clearly what's going on. I don't look forward to using the array constructor in XSLT, because I think it does the opposite. But maybe I'm wrong.
I don't have a specific proposal to fix, because I understand some of the conceptual hurdles to a putative xsl:array-member. Still. I can't help but wish that that's what we had.
Pull request #1467 created #created-1467
Modest editorial corrections to XSLT specs through 2.7
- Two largish sections had duplicate prose nearby;
- Some punctuation rendered consistently;
- Some substantive insertions/edits for clarification, when it seemed clear to me.
Let me know if any of these changes are misfires.
Pull request #1466 created #created-1466
1464 Standardize on "implementer" spelling
Fix #1464
Pull request #1465 created #created-1465
1461 Generate correct tests for functions involving named record types
This PR addresses issue #1461 in a fairly narrow way, without attempting to tackle the deeper problem identified in issue #1336.
This involves expanding the fos:type entries for types such as uri-structure-record that are referenced in function signatures, resulting in some duplication with the fos:record-description entries that describe the same types in a different place; hopefully the resolution to #1336 will resolve that duplication.
Issue #1464 created #created-1464
Inconsistent spelling: implementer or implementor?
We use both spellings.
oxfordreference.com says: Although the variant spelling implementor predominated for much of the late 20th century, today implementer is considered standard.
Personally, being a late 20th century kind of guy, I prefer "implementor", but we should be consistent.
Issue #1463 created #created-1463
fn:element-number: Feedback
- All other XQFO functions that have . as the default value for a node parameter (fn:name, etc.) return an empty sequence if the passed argument is an empty sequence. Could we use the same rule for fn:element-number?
- The last example is incomplete; it must be $e//section...
- The last example should return 1.2 instead of 1.1. If the given result is correct, the equivalent XPath expression needs to be revised.
While it was a no-brainer to implement the function, its specification seems overwhelming to me. I eventually read all the notes, but at the end I was more confused than in the beginning ;·) It could be one of those functions that are more accessible to users than to implementors, though.
Still, maybe at least 1, 2 more examples could be provided? The last example goes into that direction, but it could be a bit cryptic for non-experts (e.g. it expects users to know that the result of $s/ancestor-or-self::section is returned in document order).
As this function seems to be mostly targeted to XSLT users, maybe we should offer XSLT examples. I assume that XQuery developers may rather be tempted to enumerate all children/descendants of a node, and write recursive code like…
declare function enumerate($element, $numbers) {
for $child at $pos in $element/*
let $n := $numbers || '.' || $pos
return ($n, enumerate($child, $n))
};
In general, if we were able to generalize the function in one way or another, that would be great.
Issue #1462 created #created-1462
fn:deep-equal: default option
The default value for the option parameter of fn:deep-equal is:
{ 'collation': fn:default-collation() }
Can we simply use {}, or is there a particular reason for mentioning the collation and no other option?
Issue #1461 created #created-1461
### Errors in `misc/BuiltinKeywords`
Test cases Keywords-fn-parse-html-1 and Keywords-fn-build-uri-1 contain invalid function signatures, caused I think by a bug in generate-keyword-test-set.xsl. Both seem to omit the SequenceType of one of the arguments:
fn:parse-html(html := ?, options := ?) instance of
function((xs:string | xs:hexBinary | xs:base64Binary)?, ) as document-node(element(*:html))?
fn:build-uri(parts := ?, options := ?) instance of function(, map(*)?) as xs:string
Examining generate-keyword-test-set.xsl shows:
<xsl:for-each select="arg">
<xsl:if test="position() != 1">, </xsl:if>
<xsl:text>{@type}</xsl:text>
</xsl:for-each>
but for these two function definitions, type information is indirected via a @type-ref attribute:
<fos:proto name="parse-html" return-type="document-node(element(*:html))?">
<fos:arg name="html" type="(xs:string | xs:hexBinary | xs:base64Binary)?"/>
<fos:arg name="options" type-ref="parse-html-options"
default="{
"method": "html",
"html-version": "5"
}"/>
</fos:proto>
and
<fos:proto name="build-uri" return-type="xs:string">
<fos:arg name="parts" type-ref="uri-structure-record"
example='{
"scheme": "https",
"host": "qt4cg.org",
"port": (),
"path": "/specifications/index.html"
}'/>
<fos:arg name="options" type="map(*)?" default="{}"/>
</fos:proto>
Originally posted by @johnlumley in https://github.com/qt4cg/qtspecs/issues/1451#issuecomment-2358082401
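The failure mode described above, an empty slot wherever an argument carries @type-ref instead of @type, can be reproduced and avoided in a few lines. This Python sketch is an illustration only: it uses a simplified, un-namespaced copy of the build-uri prototype, the function name signature is hypothetical, and emitting the referenced type name directly is an assumption rather than what the actual fix does.

```python
import xml.etree.ElementTree as ET

# Simplified, un-namespaced stand-in for the catalog entry (illustration only).
SAMPLE = """
<proto name="build-uri" return-type="xs:string">
  <arg name="parts" type-ref="uri-structure-record"/>
  <arg name="options" type="map(*)?"/>
</proto>
"""


def signature(proto):
    """Render a function type, falling back to @type-ref when @type is absent."""
    types = []
    for arg in proto.findall("arg"):
        arg_type = arg.get("type") or arg.get("type-ref")
        if arg_type is None:
            raise ValueError(f"argument {arg.get('name')!r} has no type information")
        types.append(arg_type)
    return f"function({', '.join(types)}) as {proto.get('return-type')}"


print(signature(ET.fromstring(SAMPLE)))
# function(uri-structure-record, map(*)?) as xs:string
```

Raising an error when neither attribute is present makes a gap in the catalog visible at generation time instead of producing an invalid test.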
Issue #1451 closed #closed-1451
Minor syntax errors in FO examples
Issue #1453 closed #closed-1453
Fix typo in load-xquery-module example
Issue #1235 closed #closed-1235
Function Identity: Treating function items with identical bodies
Issue #1439 closed #closed-1439
1235 Function Identity: Treating function items with identical bodies
Issue #1435 closed #closed-1435
1421 fn:unix-dateTime: Revisions
Issue #1421 closed #closed-1421
`fn:unix-time`: Revisions
Issue #1422 closed #closed-1422
`fn:hash`: Revision
Issue #1433 closed #closed-1433
1422 fn:hash: Revision
Issue #1427 closed #closed-1427
Add a function equivalent to xsl:number
Issue #1430 closed #closed-1430
1427 Add element-number function
Issue #1373 closed #closed-1373
XQFO: Editorial
Issue #1434 closed #closed-1434
1373 XQFO: Editorial
Issue #1322 closed #closed-1322
fn:collation-available (editorial)
Issue #1438 closed #closed-1438
1322 fn:collation-available (editorial)
Pull request #1460 created #created-1460
1323b Function parameters names: $uri → $source
Second attempt. Closes #1323 and #1436.
In addition, includes a change to fn:json-doc($uri), and fixes an editorial bug for the fn:escape-html-uri function.
Issue #1444 closed #closed-1444
Implement improvement to bibliography entry for IEEE 802.3
Issue #1446 closed #closed-1446
Limits on xs:dateTime
Issue #1447 closed #closed-1447
1446 Rephrase conformance rule on xs:dateTime limits
Issue #1459 created #created-1459
Function properties and arities (editorial)
I am trying to understand the semantics behind the Properties section in the XQFO spec. Here are some examples of existing functions:
fn:format-integer
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on default language. The three-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
fn:format-time
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on default calendar, and default language, and default place, and implicit timezone.
fn:index-of
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone. The three-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
fn:element-number
The zero-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-dependent·. The one-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·. The two-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·. The three-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
I am afraid I cannot give hints for improvement yet; I just noticed it is increasingly difficult to interpret. For the moment, I wonder…
- Are all of these current definitions correct?
- In many cases, if the user supplies an empty sequence as argument, the default value is used instead. This contrasts with the current property description that relies a lot on the function arities. For example, fn:index-of($a, $b) and fn:index-of($a, $b, ()) should have identical properties, but the current wording implies that the three-argument form of this function has different properties, e.g., it is said to rely in this case on the static base URI.
- Some functions with default values have a single property description (see fn:format-integer), others have multiple descriptions (see fn:format-time). Is there a particular rule behind that?
- Could we merge the descriptions for arities with identical properties?
- Maybe it will be easier not to consider the arities at all anymore?
Issue #1396 closed #closed-1396
Rendition of EBNF syntax scraps
Issue #1458 created #created-1458
Arguments that have a default value but don't accept ()
The following functions define an argument that has a default value, but don't allow () to be supplied explicitly:
lang
id
element-with-id
idref
parse-html
As determined by running the following query against the function catalog:
//*:arg[@default and not(ends-with(@type,'?') or ends-with(@type, '*'))]/(ancestor::*:function/(@prefix||':'||@name))
There are others which this query doesn't pick up, mainly where the defaultable argument is a function, including:
array:build map:build array:get map:get fn:highest fn:lowest
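The selection logic of the catalog query above can be restated procedurally. The following Python sketch runs the same test (a default is present and the type does not end in ? or *) over a made-up, un-namespaced miniature catalog; the element layout, the function names example-a and example-b, and the helper name are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, un-namespaced miniature of a function catalog (illustration only).
CATALOG = """
<catalog>
  <function prefix="fn" name="example-a">
    <arg name="value" type="xs:string?"/>
    <arg name="node" type="node()" default="."/>
  </function>
  <function prefix="fn" name="example-b">
    <arg name="input" type="item()*" default="()"/>
  </function>
</catalog>
"""


def defaulted_but_not_nullable(root):
    """List functions having an argument with a default whose type rejects ()."""
    hits = []
    for function in root.iter("function"):
        for arg in function.iter("arg"):
            arg_type = arg.get("type", "")
            if arg.get("default") is not None and not arg_type.endswith(("?", "*")):
                hits.append(f"{function.get('prefix')}:{function.get('name')}")
    return hits


print(defaulted_but_not_nullable(ET.fromstring(CATALOG)))  # ['fn:example-a']
```

Like the original query, this looks only at the final character of the type, which is why arguments whose defaultable type is a function type are not picked up.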
Issue #1224 closed #closed-1224
Attribute priority for xsl:accumulator-rule
Issue #409 closed #closed-409
XSLT: xsl:next-match and xsl:apply-imports interaction with on-multiple-match
Issue #296 closed #closed-296
Default namespace for elements; especially in the context of HTML
Issue #1457 created #created-1457
Common name for maps & arrays
Past thoughts:
- https://github.com/qt4cg/qtspecs/issues/826#issuecomment-1821359131
- https://github.com/qt4cg/qtspecs/issues/1456#issuecomment-2358257374 ff.
Copied from Michael:
It would be great if we could agree on a collective term for "maps and arrays". "Structured item" feels too generic to me. I've toyed with terms like "tabulation", "tabula", "composition", "dataset", "compendium", "aggregate". Perhaps "combo"? It's best to have a word that stands out from the crowd if we can't find one whose meaning is self-explanatory.
Issue #1381 closed #closed-1381
fn:parse-uri: Observations (cont.)
Issue #1387 closed #closed-1387
fn:build-uri: Observations
Issue #1456 created #created-1456
Filtering by type in lookup expressions
We have dropped the syntax ??type(T) for filtering the results of lookup expressions, because of problems with syntax ambiguity. This issue seeks an alternative.
Although selection by type also makes sense with shallow lookup, it is most relevant with deep lookup. The main need arises with intermediate steps of a path such as ?? X ?? Y which gives a dynamic error if X selects something that is not a map or array. This is consistent at one level with // X // Y, except that // X can never select something that isn't a node.
The main problem with filtering using an [. instance of record(p, q)] predicate is that it's very long-winded. For example, if we want to select only those members of a selected array that are sequences of a particular record type, without flattening everything else, we have to write something like ?? values::* ?[. instance of record(p, q)+] ? *, which is a bit of a nightmare.
Starting from the end goal, I would like to be able to write something close to ??record(first, last) to select all the items of this record type at any depth. We know that syntax doesn't work, because ??NCName is already taken. That's also true for ??items::record(first, last), unless we change the rules for what can appear after ::.
Also, there's another syntax hazard: what we want here is a SequenceType, not an ItemType, and that means that it can contain a trailing ? occurrence indicator, which is easily confused with the next lookup operator in a path.
Looking at it from all angles, I do feel the best solution is to prefix the record(first, last) with a marker character so that we know we've got a type filter here. Characters that might do the job include @, #, $, %, ^, ~. Of these, my preference remains ~, for three reasons:
(a) it's currently unused: overloading a different symbol is more likely to cause visual confusion
(b) one of the traditional uses of ~ is to indicate a "matches" or "is kind of like" relationship.
(c) there's a mnemonic association between "tilde" and "type" (compare "at" and "attribute")
Issue #1324 closed #closed-1324
Executable specifications
Issue #1335 closed #closed-1335
Data Model primitives for Maps and Arrays
Pull request #1455 created #created-1455
1445 Drop number-formatter option, retain string value
Fix #1445
Pull request #1454 created #created-1454
1449 Relax rules on multiple xsl:includes
Fix #1449
Pull request #1453 created #created-1453
Fix typo in load-xquery-module example
Fix #1451
Issue #1372 closed #closed-1372
Unknown option: FORG0013 → XPTY0004
Issue #1431 closed #closed-1431
1372 Unknown option: FORG0013 → XPTY0004
Issue #1379 closed #closed-1379
Circular dependencies: XQDY0054 vs. XPST0008 vs. optional errors
Issue #1432 closed #closed-1432
1379 Initializing expression: Allow self references
Issue #1364 closed #closed-1364
1314 Change to type() syntax to fix ambiguity
Issue #1314 closed #closed-1314
Ambiguity in XPath EBNF - Lookup with TypeQualifier vs DynamicFunctionCall
Issue #1414 closed #closed-1414
XSLT spec abstract, introduction
Issue #1389 closed #closed-1389
fn:while-do: Optional error: will not terminate
Issue #1440 closed #closed-1440
1387 Another tweak to build-uri
Issue #1452 created #created-1452
Links from the agendas/minutes to the dashboard don't redirect when the PR is no longer on the dashboard
Could they?
S.M.O.P. I suppose.
Issue #1451 created #created-1451
Minor syntax errors in FO examples
The example for fn-load-xquery-module contains I think an errant semicolon:
let $expr := "2 + 2"
let $module := `xquery version "4.0";
module namespace dyn="http://example.com/dyn";
declare %public variable $dyn:value := {$expr};`
let $exec := load-xquery-module("http://example.com/dyn",
{'content':$module});
let $variables := $exec?variables
return $variables(QName("http://example.com/dyn", "value"))
The semicolon after the third let is not permitted by the grammar - it only appears as a separator in Version, Module and Prolog. Removing that semicolon permits a parse under the current grammar.
Issue #832 closed #closed-832
77 Lookup returning path selection
Issue #1450 created #created-1450
Syntax of computed element and attribute constructors
The syntax of computed element and attribute constructors causes parsing problems, and problems in extending the grammar (for example it restricts the contexts in which bare-brace map constructors can appear).
In the syntax element|attribute _name_ { _content-expression_ } I propose that:
(a) it should be possible to use a StringTemplate in place of the name. This will normally just be the name enclosed in backticks but interpolated expressions are allowed.
(b) reserved names should be disallowed. This is a backwards-incompatible change. The reserved names are the non-delimiting terminal symbols listed in A.3.2. We should add a warning that additional names might be reserved in future versions, and advise use of backticks.
Issue #1449 created #created-1449
Discussion: include/import of files.
I don't especially do this.
a) because the environment that we run XSLT in (95% of the time) doesn't support it b) because it's too rigid to use (maybe I'm doing it wrong).
Motivation:
I have 'module' List.xsl, let's say, that models lists, and has a function (in pseudo XPath so you can see the types) (a module contains constructors and ideally all functions related to, here, a list)
function tryHead($xs as list()) as maybe()
I have 'module' Maybe.xsl, let's say, that models Maybe
function toList($xs as maybe()) as list()
So Maybe.xsl needs to know about List.xsl and List.xsl needs to know about Maybe.xsl
if I use xsl:include or xsl:import (in saxon) I get:
The stylesheet module includes/imports itself directly or indirectly
(which I'm happy reflects the correct behaviour given the spec - i.e. I don't think this is a bug in the implementation)
Given that I CAN write this cycle in a single file, without restrictions on the order of the constructs (i.e. this isn't a restriction of the language itself - this isn't always true in other languages), it seems less than ideal that I can't freely compose files in order to replicate the situation and allow decomposition into logical files.
(I expect this is a consequence of the rules around priority of templates, which I am broadly ignorant of, and to be honest, largely not directly concerned with - I don't use this to 'compose' templates, but to write 'function' libraries)
(MK has answered a question on Stack Overflow related to this, which solved the issue by reorganising the files, which is what I currently do, but it's not ideal that I cannot decompose and test code in isolation)
Is this worth resolving?
Many years ago I was a C programmer, and I remember a similar issue that was resolved with #define, whereby header files were imported something like this (excuse my made up C syntax)
#ifndef _Maybe_
#include "Maybe.h"
#endif
and then Maybe.h would define _Maybe_; this would mean, if you followed the idiom, that the file was included by the preprocessor at most once (I'm not suggesting this mechanism is directly applicable, just that this is a common issue elsewhere).
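The include-guard idiom boils down to remembering which files have already been pulled in. A minimal sketch in Python, using a hypothetical in-memory `MODULES` table that stands in for List.xsl and Maybe.xsl, shows how a list of already-loaded names lets two mutually dependent modules each be loaded exactly once. It illustrates the idiom only; it is not a suggestion for how xsl:include or xsl:import should behave.

```python
# Hypothetical dependency table: each module lists the modules it includes.
MODULES = {
    "List.xsl": ["Maybe.xsl"],
    "Maybe.xsl": ["List.xsl"],
}

def load(name, loaded=None):
    """Return the modules in load order, including each at most once."""
    if loaded is None:
        loaded = []
    if name in loaded:        # plays the role of "#ifndef": already included
        return loaded
    loaded.append(name)       # plays the role of "#define": mark before descending
    for dependency in MODULES[name]:
        load(dependency, loaded)
    return loaded

order = load("List.xsl")
```

Because a module is marked before its dependencies are visited, the cycle List.xsl → Maybe.xsl → List.xsl stops at the second visit instead of recursing forever.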
Issue #1448 created #created-1448
Operations on the dateTime family of types
Now that we support choice types in function signatures, we can easily generalize functions such as year-from-dateTime to work on all types in the dateTime family (specifically dateTime, date, time, gYear, gYearMonth, gMonth, gMonthDay, gDay). A request for a component that is not present in the value returns ().
We can apply this to all seven X-from-dateTime() functions.
Of course this makes the X-from-date() and X-from-time() versions redundant, but that's not a problem.
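As a rough analogy in Python, the generalized accessors could behave like the sketch below. The names `year_from` and `hours_from` are hypothetical, Python's `date`, `time` and `datetime` stand in for only three of the eight XSD types, and `None` plays the role of the empty sequence.

```python
from datetime import date, datetime, time

def year_from(value):
    """Return the year component, or None when the value has no year."""
    return getattr(value, "year", None)

def hours_from(value):
    """Return the hour component, or None when the value has no hour."""
    return getattr(value, "hour", None)

year_of_date = year_from(date(2024, 9, 10))    # a date carries a year
year_of_time = year_from(time(12, 30))         # a time does not
hour_of_date = hours_from(date(2024, 9, 10))   # a date carries no hour
```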
Pull request #1447 created #created-1447
1446 Rephrase conformance rule on xs:dateTime limits
Fix #1446
Issue #1446 created #created-1446
Limits on xs:dateTime
Section 9.1.1 says
All minimally conforming processors must support positive year values with a minimum of 4 digits (i.e., YYYY)
The word "minimum" here is ambiguous.
I suggest
All minimally conforming processors must support year values in the range 1 to 9999.
For bonus points, drop the use of the undefined term "minimally conforming", and link instead to something defined in section 1.2 Conformance.
Issue #1445 created #created-1445
fn:xml-to-json: `number-formatter` option
I have tried, but I am not happy with the number-formatter option of fn:xml-to-json. It results in a lot of special-casing at serialization time for a very special case. More generally, …
- The option would be much more helpful for outputting maps and arrays as JSON at serialization time, but it is not available as a standard serialization parameter.
- fn:xml-to-json represents numbers as text nodes, so numbers already have a string representation. If needed, they could easily be preprocessed before being serialized as JSON.
I imagine that the major use cases for the option will be:
- to avoid scientific representation for doubles, and
- to be able to serialize INF and NaN.
Couldn’t we simply allow numeric string values (i.e., the children of fn:number elements) to be output without changes?
'1000000' => json-to-xml() => xml-to-json() (: 1000000 instead of 1.0E6 :)
xml-to-json(<fn:number>INF</fn:number>) (: INF instead of error :)
We already have the number-parser option, which allows us to generate a string representation for numbers that will be suitable for being converted back with fn:xml-to-json. For other cases, the XML source can easily be updated before being passed to fn:xml-to-json.
If we really want to keep the option, …
- We should add examples for its usage in the spec.
- We need more tests.
- I would suggest using the function type (fn(xs:untypedAtomic) as xs:untypedAtomic)?, similar as for fn:replace and fn:json-to-xml (this would allow things like xml-to-json($input, { 'number-formatter': fn { . + 1 } })).
- We should also add it as a general serialization parameter to be able to write:
serialize(xs:double('NaN'), {
'method': 'json',
'number-formatter': fn($n) { if(is-NaN($n)) then '"NaN"' else $n }
})
If it’s too much hassle to introduce it as a general serialization parameter, we should at least provide it for fn:serialize.
Pull request #1444 created #created-1444
Implement improvement to bibliography entry for IEEE 802.3
Thank you, Wendell.
Issue #1443 closed #closed-1443
CSS tweaks for productions
Pull request #1443 created #created-1443
CSS tweaks for productions
- Make the productions a little easier to read by removing the border and background on code elements in production tables.
- Make the background color for production notes a little less ... intense.
Issue #1441 closed #closed-1441
1396 Improve presentation of grammar rules
Pull request #1442 created #created-1442
1394 Add new default priority rules
Adds rules for the default priority of new match pattern options such as element(p:*) and element(p|q).
Fix #1394
Pull request #1441 created #created-1441
1396 Improve presentation of grammar rules
Improves the presentation of grammar rules.
- Production numbers are dropped
- The width of the RHS column is increased so there is less line wrapping
- The summary EBNF is alphabetically sorted
Issue #1185 closed #closed-1185
1179 array:values, map:values → array:get, map:get
Pull request #1440 created #created-1440
1387 Another tweak to build-uri
In #1387 @ChristianGruen observes that the + seems to be a special case in the tel: scheme. In fact, I think we concluded that the special case is that you shouldn't encode the path segment of a non-hierarchical URI.
This PR attempts to implement that.
Issue #1269 closed #closed-1269
Could the labeling of grammar productions be improved?
Pull request #1439 created #created-1439
1235 Function Identity: Treating function items with identical bodies
Closes #1235
Pull request #1438 created #created-1438
1322 fn:collation-available (editorial)
Closes #1322
Pull request #1437 created #created-1437
1325 Variadic System Functions limited to `fn:concat`
Closes #1325
Pull request #1436 created #created-1436
1323 Function parameters names: $href → $uri
Closes #1323
Pull request #1435 created #created-1435
1421 fn:unix-dateTime: Revisions
Closes #1421
Pull request #1434 created #created-1434
1373 XQFO: Editorial
Closes #1373
Pull request #1433 created #created-1433
1422 fn:hash: Revision
Closes #1422
Pull request #1432 created #created-1432
1379 Initializing expression: Allow self references
With function declarations, recursive functions can be declared:
declare function local:factorial($x) {
if($x > 1) then $x * local:factorial($x - 1) else $x
};
local:factorial(5)
We should allow the same for variable declarations:
declare variable $factorial := fn($x) {
if($x > 1) then $x * $factorial($x - 1) else $x
};
$factorial(5)
I believe it is sufficient to simplify the definition of initializing expressions and drop the exception “other than the variable being declared”.
Related: https://www.w3.org/Bugs/Public/show_bug.cgi?id=15791 (the static dependency check was given up before due to fn:function-lookup).
Closes #1379
Pull request #1431 created #created-1431
1372 Unknown option: FORG0013 → XPTY0004
Closes #1372
QT4 CG meeting 089 draft minutes #minutes-09-10
Draft minutes published.
Issue #1209 closed #closed-1209
1183 Add transient mode and the transient{} expression
Issue #1426 closed #closed-1426
Byte ordering of CRC-32 hash result
Issue #1428 closed #closed-1428
1426 Add notes on endianness of CRC-32
Issue #1360 closed #closed-1360
1348 Some grammar simplifications
Issue #1391 closed #closed-1391
Annotations: duplicate names
Issue #1393 closed #closed-1393
1391 Change function-annotations to return a sequence
Issue #1411 closed #closed-1411
uri-structure-record gives type of path-segments property as xs:string?, not xs:string*
Issue #1412 closed #closed-1412
Fix typo in uri-structure-record
Issue #1413 closed #closed-1413
Dispose of action QT4CG-080-05, add absolute to parse-uri
Issue #1408 closed #closed-1408
The description of XPTY0117 still refers to the "function conversion rules"
Issue #1417 closed #closed-1417
1408 Fix reference to "function conversion rules" in XPTY0117
Issue #1415 closed #closed-1415
xsl:item-type needs to be added to the list of XSLT declaration components
Issue #1418 closed #closed-1418
1415 Add to lists of XSLT declarations and instructions
Issue #1419 closed #closed-1419
1337bis Replace a few remaining occurrences of "atomic value"
Issue #1423 closed #closed-1423
1387b Clarify parse-uri/build-uri encoding rules, and remove options
Issue #1388 closed #closed-1388
1387 Clarify the encoding rules
Issue #1385 closed #closed-1385
Quantifier expressions: optional positional argument
Issue #1424 closed #closed-1424
Typo in XQuery spec: expresssions
Issue #1425 closed #closed-1425
1424 Fix typo
Pull request #1430 created #created-1430
1427 Add element-number function
Fix #1427
Pull request #1429 created #created-1429
1403 Align type tests
fixes #1403
Allow Array and Map tests to omit the asterisk to match any array or map.
array() <> array(*)
map() <> map(*)
Pull request #1428 created #created-1428
1426 Add notes on endianness of CRC-32
Fix #1426 (partially; it would also be good to have a better citation to IEEE 802.3)
Issue #1427 created #created-1427
Add a function equivalent to xsl:number
I propose adding a function to perform node numbering in a manner analogous to xsl:number (but without the formatting aspects, which can be handled using format-integer, and without multi-level numbering).
I envisage a function along the following lines:
node-number($node as element(),
$from as (document-node()|element()),
$count as fn($node as element()) as xs:boolean?)
as xs:integer
The function returns the number of element nodes that satisfy all the following conditions:
- they are descendants of $from
- they are preceding-or-self nodes of $node
- they satisfy the $count predicate
$node defaults to the context node. $from defaults to the parent of $node. $count defaults to a function that returns true for an element that has the same name as $node, false otherwise.
With no arguments, node-number() applied to (say) a <p> element returns the number of preceding-sibling <p> elements plus one. [Not quite. Under this definition, it would also count <p> elements that are descendants of a preceding-sibling element]
The rationale for this proposal is (a) to make the core functionality of xsl:number available in environments other than XSLT, and (b) within XSLT, to make it available in contexts such as a predicate of a match pattern where it is currently difficult or impossible to invoke xsl:number except by wrapping it in a user-defined variable or function.
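The three conditions can be tried out with a small Python sketch built on the standard library's ElementTree. The function name `node_number`, its mandatory arguments and the sample document are all assumptions made for illustration; the defaults described above are not modelled.

```python
import xml.etree.ElementTree as ET

def node_number(node, from_, count):
    """Count descendants of from_ that are preceding-or-self of node and satisfy count."""
    # Map each element under from_ to its parent so ancestors can be found.
    parent_of = {child: parent for parent in from_.iter() for child in parent}
    ancestors = set()
    current = node
    while current in parent_of:
        current = parent_of[current]
        ancestors.add(current)

    total = 0
    for element in from_.iter():              # document order, from_ itself first
        if element is from_ or element in ancestors:
            continue                          # from_ is not its own descendant; ancestors are not on the preceding axis
        if count(element):
            total += 1
        if element is node:                   # stop once node itself has been visited
            break
    return total

doc = ET.fromstring('<sec><p>a</p><div><p>b</p></div><p id="t">c</p></sec>')
target = doc.find("p[@id='t']")
same_name = lambda e: e.tag == target.tag
result = node_number(target, doc, same_name)
```

Here `result` is 3 rather than 2, because the `<p>` nested inside the preceding `<div>` is counted as well. That is the behaviour the bracketed remark points out.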
Issue #1426 created #created-1426
Byte ordering of CRC-32 hash result
Test case hash-str-034 takes a 150,000-character string consisting of 50,000 repetitions of the string "ABC", and computes its CRC-32 hash value.
On Java the result comes back as decimal 105475755, which we translate to 0x06496EAB
On C# the result comes back as a 4-byte array AB 6E 49 06. My guess is we need to reverse the byte order, but I'd like to understand why...
I guess we need to say something about byte ordering, but what?
I tried to find the definitive specification. We simply point to IEEE 802.3 without a specific URI or section or version number. 802.3 actually appears to be a large family of standards. Can we do better than that?
I'm interested to know whether the specification actually defines the result to be a 32-bit (unsigned?) integer rather than an array of four bytes. If that's the case, then our spec needs to say how the 32-bit integer is converted to a sequence of 4 octets, which is what fn:hash actually returns. The C# example shows that there's more than one way of doing it.
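The two results quoted above are the same 32-bit value written out in opposite byte orders, which a few lines of Python make visible. The sketch assumes nothing about what IEEE 802.3 mandates. It only shows the conversion step that the spec would need to pin down, and that one common library (Python's `zlib`) returns an integer rather than a byte sequence.

```python
import zlib

value = 105475755                       # the decimal result reported from Java
big = value.to_bytes(4, "big")          # most significant byte first
little = value.to_bytes(4, "little")    # least significant byte first

# zlib.crc32 returns an unsigned 32-bit integer, so the caller chooses the byte order.
check = zlib.crc32(b"123456789")
```

Printing `big.hex()` and `little.hex()` gives `06496eab` and `ab6e4906`, matching the Java and C# results respectively.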
Pull request #1425 created #created-1425
1424 Fix typo
Close #1424
Thank you, Amanda.
Issue #1424 created #created-1424
Typo in XQuery spec: expresssions
The XQuery 3.1 spec and XQuery 4.0 draft spec both have an extra "s" in the word expressions, showing it as expresssions. The typo occurs in the C.1 Static Context Components section.
I know this repo is only for 4.0 work. Should I make a separate issue in the https://github.com/w3c/qtspecs repo?
Pull request #1423 created #created-1423
1387b Clarify parse-uri/build-uri encoding rules, and remove options
This is a slightly more radical alternative to #1388
In addition to clarifying the encoding rules (maintaining slightly different rules for path segments, query parameters, and fragment identifiers), it removes the path-separator and query-separator options.
Issue #1422 created #created-1422
`fn:hash`: Revision
We should make the algorithm argument in the function signature of fn:hash explicit and change the signature from…
fn:hash(
$value as (xs:string | xs:hexBinary | xs:base64Binary)?,
$options as map(*)? := {}
) as xs:hexBinary?
…to:
fn:hash(
$value as (xs:string | xs:hexBinary | xs:base64Binary)?,
$algorithm as xs:string? := 'MD5',
$options as map(*)? := {}
) as xs:hexBinary?
It will happen often enough that the function is used for algorithms other than MD5:
(: OLD :) hash("ABC", { "algorithm": "CRC-32" })
(: NEW :) hash("ABC", "CRC-32")
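The call shape of the proposed signature can be mimicked in Python as below. `hash_value` is a hypothetical stand-in, not the fn:hash specification. Encoding strings as UTF-8 and emitting the CRC-32 result most significant byte first are both assumptions of this sketch.

```python
import hashlib
import zlib

def hash_value(value, algorithm="MD5", options=None):
    """Hypothetical analogue of the proposed three-argument fn:hash."""
    # Assumption: strings are hashed as their UTF-8 encoding.
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    name = algorithm.upper()
    if name == "CRC-32":
        # Assumption: the 32-bit result is emitted most significant byte first.
        return zlib.crc32(data).to_bytes(4, "big")
    if name == "MD5":
        return hashlib.md5(data).digest()
    raise ValueError(f"unsupported algorithm: {algorithm}")

default_digest = hash_value("ABC")           # MD5 by default
crc_digest = hash_value("ABC", "CRC-32")     # algorithm as a positional argument
```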
Issue #1421 created #created-1421
`fn:unix-time`: Revisions
In yesterday’s meeting, fn:unix-time was added to the spec. It was proposed to…
- drop support for negative values and
- change the function name to fn:unix-dateTime.
I sympathize with both suggestions. What do others think?
Issue #1420 closed #closed-1420
Markup fix
Pull request #1420 created #created-1420
Markup fix
There's a markup error in the hash function that breaks the automatic test build. This PR fixes that.
(I'm just going to merge this straight away.)
Pull request #1419 created #created-1419
1337bis Replace a few remaining occurrences of "atomic value"
Replace a few remaining occurrences of "atomic value" with "atomic item".
Pull request #1418 created #created-1418
1415 Add to lists of XSLT declarations and instructions
Fix #1415
Adds xsl:item-type to the list of declarations. Adds xsl:array, xsl:map, and xsl:map-entry to the list of instructions.
Pull request #1417 created #created-1417
1408 Fix reference to "function conversion rules" in XPTY0117
Fix #1408
Issue #1416 created #created-1416
Key-value pairs: built-in record type `pair`
A built-in record type should be defined for key-value pair maps, which are currently defined as follows…
A key-value pair map is a map containing two entries, one (with the key "key") containing the key part of a key value pair, the other (with the key "value") containing the value part of a key value pair.
…as we can expect them to be used more often in the future. Would it be possible/make sense to assign the record type to the xs namespace (xs:pair)?
In addition, we should…
- use it consistently in the text (XQFO, deep lookups, etc.),
- replace existing record(key as xs:anyAtomicType, value as item()*) references with xs:pair, and
- rename $input or $map arguments to $pair.
Examples for function signatures to be updated:
map:of-pairs(
$pairs as xs:pair*,
$combine as fn(item()*, item()*) as item()* := fn:op(',')
) as map(*)
map:pair(
$key as xs:anyAtomicType,
$value as item()*
) as xs:pair
map:pairs(
$map as map(*)
) as xs:pair*
We could add numerous other record types, but maybe we can handle those in a separate issue and keep this one focused on key/value pairs.
Issue #1343 closed #closed-1343
Drop the static typing feature
Issue #1344 closed #closed-1344
1343 Drop the static typing feature
Issue #1337 closed #closed-1337
Atomic value → atomic item
Issue #1361 closed #closed-1361
1337 Atomic value becomes atomic item
Issue #959 closed #closed-959
Milliseconds ↔ xs:dayTimeDuration, Unix time ↔ xs:dateTime
Issue #1358 closed #closed-1358
959 fn:unix-dateTime
Issue #1321 closed #closed-1321
Leading lone slash
Issue #1367 closed #closed-1367
1321 leading lone slash
Issue #1401 closed #closed-1401
Casting from duration to string
Issue #1409 closed #closed-1409
1401 Rewrite of F+O section 20, Casting
Issue #1368 closed #closed-1368
Further improvements to BuiltInKeywords test needed
Issue #1316 closed #closed-1316
XPath: type declarations in quantified expressions
Issue #1384 closed #closed-1384
1316 Type declarations in quantified expressions
Issue #1228 closed #closed-1228
– Adding the BLAKE3 hashing algorithm to fn:hash
Issue #1193 closed #closed-1193
Parsing Functions: Empty input
Issue #1231 closed #closed-1231
1193 Parsing Functions: Empty input
Issue #1339 closed #closed-1339
Drop unordered mode
Issue #1342 closed #closed-1342
1339 Deprecate ordering mode declaration
Issue #1350 closed #closed-1350
unparsed-text-available() signature
Issue #1352 closed #closed-1352
1350 Fix signature for unparsed-text-available
Issue #1347 closed #closed-1347
The escape-solidus option should apply to xml-to-json
Issue #1353 closed #closed-1353
1347 Add escape-solidus option to xml-to-json function
Issue #1346 closed #closed-1346
Typos in fn:format-number
Issue #1359 closed #closed-1359
1346 Fix minor typos in format-number
Issue #1369 closed #closed-1369
fn:round: `rounding-mode` → `mode`?
Issue #1370 closed #closed-1370
1369 fn:round: rounding-mode → mode
Issue #1320 closed #closed-1320
fn:parse-uri: Observations
Issue #1380 closed #closed-1380
1320 Attempt to resolve a bug in parse-uri
Issue #1374 closed #closed-1374
Duplicate keys in map constructor
Issue #1383 closed #closed-1383
1374 - allow static error for duplicate keys
Issue #1382 closed #closed-1382
Missing XSLT error code
Issue #1386 closed #closed-1386
1382 add error code XTSE4040
Issue #1390 closed #closed-1390
1368 built in keywords improvements
Issue #1398 closed #closed-1398
1397 Add missing change log entry for constructor functions
Issue #1397 closed #closed-1397
Missing change log entry for zero-arity constructor functions
Issue #1395 closed #closed-1395
Choice item types: subtyping
Issue #1400 closed #closed-1400
1395 Revise rules for subtyping of choice item types
Issue #1402 closed #closed-1402
Update schema for XSLT 4.0 to include agreed syntax changes
Issue #1405 closed #closed-1405
1404 Change fn:invisible-xml grammar parameter to xs:string?
Issue #1404 closed #closed-1404
Invisible-xml function: why is $grammar of type item()?
Issue #917 closed #closed-917
Better support for typed maps
Issue #1371 closed #closed-1371
(type)switch: braces after `case` keyword
Issue #1399 closed #closed-1399
XSLT fixed-namespaces - contradictory statements
Issue #1406 closed #closed-1406
1399 clarify fixed-namespaces spec
Issue #1415 created #created-1415
xsl:item-type needs to be added to the list of XSLT declaration components
https://qt4cg.org/specifications/xslt-40/Overview.html#dt-declaration needs xsl:item-type adding
Pull request #1414 created #created-1414
XSLT spec abstract, introduction
As currently written, preliminary parts of the XSLT specs are out of sync with other specs, or have inconsistencies.
- The section called Abstract mostly describes what's different in 4.0. That despite a section shortly later specially devoted to the topic.
- The section called Abstract does not lead with its best foot forward: the language does a lot more than XML-to-XML transforms.
- The section called "What is XSLT?" never answers the question posed in the header.
- The section called "What is XSLT?" begins with a sentence that both repeats earlier material and is not germane to the header.
In the attached PR, I have proposed moving several paragraphs. I have not intentionally edited them. I have also taken a stab at defining XSLT. I have tried to calibrate the Abstract so that it is sparing, in the spirit of the XQFO abstract.
I don't think that what I've put in here is any way finalized. But I hope it is a step in the right direction, and a good basis for further refinement.
Pull request #1413 created #created-1413
Dispose of action QT4CG-080-05, add absolute to parse-uri
This PR disposes of my action to add absolute to the output of fn:parse-uri.
The rules that define an absolute URI: "has a scheme", "does not have a fragment identifier", "is a hierarchical scheme" don't fit neatly into a single place, so there are two different places where the setting is considered.
Because of this, I think it's simpler to set absolute to true when the URI is absolute, but not to false when it isn't. That could be rectified with a bit more prose.
Pull request #1412 created #created-1412
Fix typo in uri-structure-record
Fix #1411
Issue #1411 created #created-1411
uri-structure-record gives type of path-segments property as xs:string?, not xs:string*
The current spec defines uri-structure-record as having a path-segments? as xs:string? property, which doesn't make sense and also contradicts the example given later:
The unescaped form is easily accessible from path-segments:
("", "path", "to", "a/b")
It should be xs:string*.
Issue #1410 closed #closed-1410
Ignore this PR
Pull request #1410 created #created-1410
Ignore this PR
Just testing
Pull request #1409 created #created-1409
1401 Rewrite of F+O section 20, Casting
Fix #1401
The main changes are:
- The three derived types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration are no longer treated as primitive for the purpose of this section. They are now treated as derived types, but given special status where necessary as "quasi-primitive".
- In places where the F+O rules give the same result as the canonical representation in XSD 1.1, we now defer to XSD 1.1 rather than replicating the rules. Many of the rules originate with XPath 2.0, which was published before XSD 1.1, but which anticipated some of the changes in XSD 1.1, for example the use of a seven-component model for dates/times, and a two-component model for durations. XPath 3.0/3.1 failed to take advantage of the resulting opportunity for rationalisation.
- Generally the language is a bit less terse, with more notes and examples
- The rules have more to say about the type annotation of the result. In some places the spec appeared to imply that the type annotation on the result must be the target type; in others it appeared to imply that the type annotation must be unchanged from the source (for example 19.1.1 "If ST is xs:string or a type derived from xs:string, TV is SV." [presumably with unchanged type annotation]). The spec is now hopefully clearer that the result TV MUST be an instance of TT and MAY be an instance of some other type derived from TT, especially in the case where the value is unchanged.
Issue #1408 created #created-1408
The description of XPTY0117 still refers to the "function conversion rules"
The description of XPTY0117 still refers to the "function conversion rules" rather than the "coercion rules".
Issue #1407 created #created-1407
Improve the spec prose and table of content layout for types
This has been discussed on Slack. The key points are:
- The terminology in the table of content is mixed (Union Types vs Function Tests), focused on the grammar not the organisation.
- The test sections (Function Test, etc.) lack definitions and overviews of the item types like is done with the other types, such as choice item types.
I also think the other sections like the Subtype Relationships are largely fine as they are as they are dealing with specific rules around subtypes, coercion, etc.
So I propose:
- Adding an overview of what the function types/tests, etc. are like is done with the union types, etc.
- Adding a note to the maps overview similar to "A map is a function of the form function (xs:string) as item()* that can be passed a key name and returns the value of that key entry."
- Adding a note to the arrays overview similar to "An array is a function of the form function (xs:integer) as item()* that returns the items of the array at the specified index, or raises a FO.... error if the index is out of bounds."
I also propose renaming and reorganizing parts of the table of contents as follows (omitting unaffected sections):
- 3 Types
- 3.1 Sequence Types
- 3.2 Item Types
- 3.2.1 General item types
- 3.2.2 Atomic Types
- 3.2.3 Union Types
- 3.2.4 Namespace-sensitive Types
- 3.2.5 Δ Choice Item Types
- 3.2.6 Δ Enumeration Types
- 3.2.7 Δ Node Types
- 3.2.8 Δ Function Types ~ NOTE: Covers the current 3.2.8.1 Δ Function Test section.
- 3.2.9 Maps and Records
- 3.2.10 Array Types
And to make the heading names in the subtype section consistent:
Pull request #1406 created #created-1406
1399 clarify fixed-namespaces spec
Fix #1399.
The only change is to remove a sentence from <note>...</note> markup.
Pull request #1405 created #created-1405
1404 Change fn:invisible-xml grammar parameter to xs:string?
Issue #1404 created #created-1404
Invisible-xml function: why is $grammar of type item()?
In the invisible-xml function, the $grammar argument has type item()?. There's no evident reason why it isn't xs:string?; there's no clue in the specification or example what it might mean to supply something that isn't a string (or an empty sequence).
Note that declaring it as item() rather than xs:string? suppresses atomization of the supplied value.
Issue #1403 created #created-1403
Align AnyMapTest and AnyArrayTest with ElementTest
I always wondered why ElementTests for any element can be written as element(), which is equivalent to element(*), but as map() and as array() are not permitted and always have to include the asterisk (map(*)).
This is still the case even in XPath4 and I think we should have a look, if this can be omitted. It does not add any value IMHO, but maybe I am overlooking something important.
https://qt4cg.org/specifications/xquery-40/xquery-40.html#doc-xquery40-ElementTest https://qt4cg.org/specifications/xquery-40/xquery-40.html#doc-xquery40-ArrayTest https://qt4cg.org/specifications/xquery-40/xquery-40.html#doc-xquery40-MapTest
The "fix" would be as simple as
[255] AnyMapTest ::= "map" "(" "*"? ")"
[265] AnyArrayTest ::= "array" "(" "*"? ")"
-- update after comment from @michaelhkay --
The same does not apply to FunctionTest, as "as function()" specifies a zero-arity function whereas "as function(*)" matches functions of any arity.
Pull request #1402 created #created-1402
Update schema for XSLT 4.0 to include agreed syntax changes
Updates the schema for XSLT 4.0 stylesheets to accommodate changes recently agreed:
- Support for xsl:function/@variadic
- Support for extensions to decimal-format properties, for example percent="%:pc".
Issue #1401 created #created-1401
Casting from duration to string
We've just been examining the rules for casting from duration / dayTimeDuration / yearMonthDuration to string (F+O section 20.1.1) and they are fairly impenetrable: I think we could do better.
- We use the phrase "canonical representation as defined in XSD 1.1 part 2"; a more specific reference would be helpful. Mentioning the function ·durationCanonicalMap· would take readers straight to the right place.
- The actual XSD spec has one or two typos, and uses some notations that might not be immediately obvious to every reader, for example |m| to mean abs(m).
I think the best answer to this is to add non-normative examples, which are badly missing from this whole section.
This would probably require splitting of 20.1.1 into subsections. Alternatively, we could add examples to the string() function, which would have the benefit that we have validation machinery to check that the examples under a specific function are correct; we could point to these examples from 20.1.1.
Pull request #1400 created #created-1400
1395 Revise rules for subtyping of choice item types
Corrects/clarifies the rules for determining whether A is a subtype of B when either or both is a choice item type.
Fix #1395
Issue #1399 created #created-1399
XSLT fixed-namespaces - contradictory statements
We say:
Any one of the strings xsl, xml, xs, xsi, fn, math, map, array, err. This has the effect of binding that particular namespace prefix to the [reserved namespace] with which it is conventionally associated, whether or not the [native namespace bindings] contain a binding for this prefix.
Note: ...
If the namespace prefix is explicitly bound to a different namespace, for example xmlns:math="java:java.util.Math", then that binding takes precedence.
Given
<xsl:stylesheet fixed-namespaces="math" xmlns:math="java:java.util.Math">
The normative statement implies that "math" is bound to http://www.w3.org/2005/xpath-functions/math, while the note implies it is bound to "java:java.util.Math" -- which surely makes more sense. I suspect the intended reading of the normative sentence is "whether or not the [native namespace bindings] contain this binding"; but it needs clarifying/correcting.
Pull request #1398 created #created-1398
1397 Add missing change log entry for constructor functions
Fix #1397
Issue #1397 created #created-1397
Missing change log entry for zero-arity constructor functions
We've changed constructor functions to allow arity zero, but there is no change log entry for this.
Issue #658, PR #662
Issue #1396 created #created-1396
Rendition of EBNF syntax scraps
In the EBNF syntax tabulation, the column width available for the RHS of productions is rather limited, leading to excessive line wrapping, especially where the entire syntax is displayed in Appendix A.
I haven't been entirely successful in sorting this out, but record my experiments here.
(a) In the CSS stylesheet qtspecs.css line 169, change the padding from 30px to 10px
(b) remove all the non-breaking spaces that are inserted after the production number and around the ::= operator. (Search for ::= in xmlspec-2016.xsl)
(c) get rid of the tbody element that surrounds each tr by putting the productions in a prodgroup, achieved by adding a prodgroup element around the line <prodrecap id="BNF-Grammar-prods" ref="BNF-Grammar-prods"/> (line 43 of ebnf.xml)
Unfortunately this doesn't seem to allocate extra width to the columns, it merely creates extra whitespace to the right of the table. I can't work out why that is happening.
Another step would be to remove the final right-hand column containing comments, instead moving the comments to be under the RHS of the production. But the logic here is complex...
Finally, a pragmatic move would be to reduce the length of the longest production names, such as FunctionSignatureWithDefaults, which bloat the width of the LH column.
Issue #1315 closed #closed-1315
12 div-3
Issue #1395 created #created-1395
Choice item types: subtyping
We say: [with the example changed for clarity]
3.3.2.2 Choice Item Types
[1] If B is a [choice item type], then A ⊆ B is true if A ⊆ M is true for some item type M among the alternatives of B.
[2] If A is a [choice item type], then A ⊆ B is true if M ⊆ B is true for every item type M among the alternatives of A.
Note:
Because an [enumeration type] is defined as a choice type of singleton enumerations, these rules have the consequence, for example, that enum("P", "Q") is a subtype of enum("P", "Q", "R").
Now, the first condition [1] doesn't hold: enum("P", "Q")
is not a subtype of enum("P")
, nor of enum("Q")
, nor of enum("R")
.
But the second condition [2] does hold: enum("P")
and enum("Q")
are both subtypes of enum("P", "Q", "R")
, under rule [1].
So the rules are far from clear when both A and B are choice item types.
I think it probably needs a combined rule.
(i) if both A and B are choice item types, then A ⊆ B is true if every item type a among the alternatives of A satisfies a ⊆ b for some item type b among the alternatives of B.
The existing two rules are really just special cases of that for singleton choices.
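The combined rule (i) can be sketched in Python. This is only an illustrative model, not spec text: an item type is represented as a frozenset of its alternatives, a singleton enumeration is a one-element set, and the hypothetical helper atomic_subtype stands in for the ⊆ relation between non-choice types (assumed here to be plain equality).

```python
def atomic_subtype(a: str, b: str) -> bool:
    """Stand-in for A ⊆ B on non-choice item types (assumption: equality only)."""
    return a == b


def choice_subtype(a: frozenset, b: frozenset) -> bool:
    """Combined rule (i): every alternative of A is a subtype of some alternative of B."""
    return all(any(atomic_subtype(x, y) for y in b) for x in a)


def enum(*values: str) -> frozenset:
    """Model enum("P", "Q") as a choice of singleton enumerations."""
    return frozenset(values)


print(choice_subtype(enum("P", "Q"), enum("P", "Q", "R")))  # True
print(choice_subtype(enum("P", "Q", "R"), enum("P", "Q")))  # False
```

With a single alternative on either side, the same function reduces to the existing rules [1] and [2].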
Issue #1394 created #created-1394
XSLT Default priority for `element(p:*)` etc
There are new possibilities for node tests used as XSLT patterns, for example element(p:*)
, element(*:local)
, element(p:*, T)
, element(*:local, T)
. These need to have default priorities assigned.
Pull request #1393 created #created-1393
1391 Change function-annotations to return a sequence
Fix #1391
- Changes the data model to clarify exactly what the annotations of a function item are
- Clarifies the XQuery description of what effect annotations in the query prolog have
- Changes the fn:function-annotations function to return a sequence of key value pairs in which there can be duplicate keys.
Issue #1392 created #created-1392
`element(a|b)` vs `(element(a)|element(b))`
We have introduced two syntax extensions which achieve the same effect:
element(a|b)
vs (element(a)|element(b))
Do we actually want both?
If we do, I would suggest that we define element(a|b)
as a shorthand for (element(a)|element(b))
.
Are they actually equivalent? I think it's fairly clear that they match the same items. It's less clear what the subtyping rules have to say. I think each is a subtype of the other, which means they are substitutable for each other in a function signature, but it needs a lot of digging in the spec to demonstrate this. Are they equivalent from the point of view of coercion rules? I think they probably are, because no coercion actually takes place when the required type is a node type.
So I think there's no technical bug here, just a lack of clarity. It would be easier (assuming we want to retain the syntax at all) if we defined element(a|b)
as a shorthand for (element(a)|element(b))
.
Minor editorial issues:
- In 3.2.7.2, example 5 is incorrect:
element(xhtml:*|svg:*|mathml|*)
The last | should be a colon (:).
- In the changes section of 3.2.7.2, the
element(A|B)
syntax is not mentioned.
Issue #1391 created #created-1391
Annotations: duplicate names
The function fn:function-annotations
returns a map from QNames to values, which rather assumes that a function cannot have two annotations with the same name. But there is nothing in either the data model or the XQuery language spec that imposes this restriction.
Pull request #1390 created #created-1390
1368 built in keywords improvements
Rewrite of stylesheet for generating keyword tests
Issue #1389 created #created-1389
fn:while-do: Optional error: will not terminate
In many cases, a compiler may detect that a fn:while-do
or fn:do-until
function call will not terminate. Trivial examples:
while-do((), true#0, identity#1)
while-do((), exists#1, fn($c) { $c, $c })
We should add a fixed error code that an implementation sʜᴏᴜʟᴅ (or ᴍᴀʏ?) raise when it encounters such a case. The error can be raised statically (e.g., as the result of type checks) or dynamically (when the input is not known at compile time). We could reuse XQDY0054
.
In principle, the error code could also be used to reject non-terminating recursive functions:
declare function local:rec() { local:rec() };
local:rec()
declare function local:duplicate($input) {
if(exists($input)) then $input else local:duplicate(($input, $input))
};
local:duplicate(())
let $oh := fn($my) { $my($my) }
return $oh($oh)
(: if we legalize this, see #1379 :)
declare variable $rec := fn() { $rec };
$rec()
Pull request #1388 created #created-1388
1387 Clarify the encoding rules
I decided to put this in a separate PR so that it can be reviewed in isolation. I don't expect it to be controversial, exactly, but it may require adjustment.
Issue #1387 created #created-1387
fn:build-uri: Observations
@ndw I’ll again start with a single test case, fn-build-uri-from-parse-020:
<test-case name="fn-build-uri-from-parse-020">
<description>Builds an example from the specification</description>
<created by="Norm Tovey-Walsh" on="2023-03-10"/>
<test>fn:build-uri(map {
"uri": "ldap://[2001:db8::7]/c=GB?objectClass?one",
"scheme": "ldap",
"authority": "[2001:db8::7]",
"host": "[2001:db8::7]",
"path": "/c=GB",
"query": "objectClass?one",
"query-parameters": map {
"": "objectClass?one"
},
"path-segments": ("", "c=GB")
})</test>
<result>
<assert-eq>"ldap://[2001:db8::7]/c=GB?objectClass?one"</assert-eq>
</result>
</test-case>
I would expect ldap://[2001:db8::7]/c=GB?=objectClass%3Fone
as result, as the spec says…
If the
query-parameters
key exists in the map, its value must be a map. A sequence of strings is constructed from the values in the map. For each key and each value associated with that key in turn:
- If the key is the empty string, the string constructed is the value encoded with
encode-for-uri
.
…and encode-for-uri('?')
gives us %3F
. How do you handle that in your code?
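The encoding step in question can be approximated in Python. This is a rough sketch, not the spec's algorithm: the helper name encode_for_uri is mine, and it assumes that urllib.parse.quote with an empty safe set behaves like fn:encode-for-uri for this input (both leave only letters, digits and "-", "_", ".", "~" unescaped).

```python
from urllib.parse import quote


def encode_for_uri(value: str) -> str:
    """Approximate fn:encode-for-uri: escape everything except unreserved characters."""
    # safe="" stops quote() from leaving "/" unescaped; letters, digits and "-._~" are kept.
    return quote(value, safe="")


print(encode_for_uri("?"))                # %3F
print(encode_for_uri("objectClass?one"))  # objectClass%3Fone
```

Under that reading, the "?" inside the query-parameters value cannot survive unescaped, which is why the expected result of the test case looks inconsistent with the quoted rule.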
Pull request #1386 created #created-1386
1382 add error code XTSE4040
Fix #1382
Issue #1385 created #created-1385
Quantifier expressions: optional positional argument
This has been requested multiple times before (albeit not in this repository): What about adding a positional argument to quantifier expressions? It is simple to implement, and it would allow us to do things like…
every $item at $pos in $input satisfies $item = $pos * 2
…which would be equivalent to…
every($input, fn($item, $pos) { $item = $pos * 2 })
…and much better to read than e.g.…
not((for $item at $pos in $input return $item = $pos * 2) = false())
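For readers more familiar with other languages, the intended semantics can be sketched in Python. The function name every_with_position is hypothetical, and positions are 1-based to match XPath; the comparison uses Python equality rather than XPath general comparison.

```python
def every_with_position(items, predicate):
    """True if predicate(item, pos) holds for every item; positions are 1-based as in XPath."""
    return all(predicate(item, pos) for pos, item in enumerate(items, start=1))


print(every_with_position([2, 4, 6], lambda item, pos: item == pos * 2))  # True
print(every_with_position([2, 5, 6], lambda item, pos: item == pos * 2))  # False
```

As with every over an empty sequence, the result for an empty input is true.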
Pull request #1384 created #created-1384
1316 Type declarations in quantified expressions
Fix #1316
- Quantified expressions now allow a type declaration in both XQuery and XPath
- A type declaration now induces coercion of the supplied value to the required type
Pull request #1383 created #created-1383
1374 - allow static error for duplicate keys
Fix #1374
Issue #1382 created #created-1382
Missing XSLT error code
§2.7.4 says:
If R has an as attribute, the SequenceType S declared by R must be a subtype of T, according to the relationship subtype(S, T) defined in [].
There is no allocated error code, I propose to allocate XTSE4040.
Issue #1381 created #created-1381
fn:parse-uri: Observations (cont.)
The rules for fn:parse-uri
say:
If the fragment is the empty string, it is discarded and the fragment is the empty sequence.
However, the result of fn-parse-uri-043 contains an empty fragment string:
<test-case name="fn-parse-uri-043">
<description>Parses a URI ending in "#"</description>
<created by="Michael Kay" on="2023-10-27"/>
<test>fn:parse-uri("https://qt4cg.org/specifications/xpath-functions-40/Overview.html#")</test>
<result>
<assert-deep-eq>map {
"authority": "qt4cg.org",
"fragment": "",
"host": "qt4cg.org",
"path": "/specifications/xpath-functions-40/Overview.html",
"path-segments": ("", "specifications", "xpath-functions-40", "Overview.html"),
"scheme": "https",
"hierarchical": true(),
"uri": "https://qt4cg.org/specifications/xpath-functions-40/Overview.html#"
}</assert-deep-eq>
</result>
</test-case>
I assume the tests should be updated?
With regard to the spec, I would suggest adding the normalization rule to query parameters:
If the query is the empty string, it is discarded and the query is the empty sequence.
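The proposed normalization can be sketched in Python. This is only an illustration of the rule, not of fn:parse-uri itself: split_with_normalization is a hypothetical helper built on urllib.parse.urlsplit, and it returns only a few of the keys that the real function produces.

```python
from urllib.parse import urlsplit


def split_with_normalization(uri: str) -> dict:
    """Split a URI and drop empty query/fragment components (hypothetical helper)."""
    parts = urlsplit(uri)
    result = {"uri": uri, "scheme": parts.scheme, "path": parts.path}
    # Only non-empty components are kept; an empty string is treated like an absent one.
    if parts.query:
        result["query"] = parts.query
    if parts.fragment:
        result["fragment"] = parts.fragment
    return result


parsed = split_with_normalization(
    "https://qt4cg.org/specifications/xpath-functions-40/Overview.html#"
)
print("fragment" in parsed)  # False
```

With this rule applied to both components, a trailing "#" or "?" leaves no entry in the result, while the original string is still available under "uri".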
Pull request #1380 created #created-1380
1320 Attempt to resolve a bug in parse-uri
Fix #1320
(A speculative PR in the hope that my proposed resolution is acceptable.)
Issue #1379 created #created-1379
Circular dependencies: XQDY0054 vs. XPST0008 vs. optional errors
It remains a challenge (for me) to understand how the specification expects implementations to handle circular dependencies. Sorry in advance for mixing up the specification, tests and the behavior of Saxon (it basically demonstrates my confusion):
Error Codes
The definition of err:XPST0008
is:
It is a static error if an expression refers to an element name, attribute name, schema type name, namespace prefix, or variable name that is not defined in the static context, except for an ElementName in an ElementTest or an AttributeName in an AttributeTest.
The definition of err:XQDY0054
is:
It is a dynamic error if a cycle is encountered in the definition of a module’s dynamic context components, for example because of a cycle in variable declarations.
Queries
The following test case K-InternalVariablesWith-15b expects XPST0008
as error:
declare variable $var1 := $var1;
true()
I would expect XQDY0054
as error, and true()
to be a valid alternative, as $var1
is indeed defined in the static context. Think of a similar query (for which I would as well expect XQDY0054
or true()
to be correct results):
declare variable $var1 := $var2;
declare variable $var2 := $var1;
true()
Saxon outputs the following error for the first query…
[XPST0008] Circular definition of global variable: $var1 uses $var1.
…which is confusing in itself, as the error code and the message do not really match. The output of the second query is consistent:
[XQDY0054] Circular definition of global variable: $var2 uses $var1, which uses $var2.
Another test case that expects XQDY0054
is K-InternalVariablesWith-17a:
declare variable $var := local:func1();
declare function local:func1() { local:func2($var) };
declare function local:func2($arg2) { 1 };
true()
Again, I would expect true()
to be a valid alternative. In K-InternalVariablesWith-17
(the original XQuery 1.0 test), the rationale for reporting an error was…
A prolog variable having a circular dependency, by having a variable reference in a call site argument. This is an error even though the variable isn't used, because implementations cannot skip reporting static errors.
…but we now have a dynamic error.
Suggestions
- XPST0008 should only be raised if no declaration exists for that variable in the query prolog.
- XQDY0054 should be optional for cases in which the affected code is never evaluated.
I would like to hear some feedback on this, and I will be happy to move the corresponding questions to qt4tests
and/or the Saxon bug tracker if we believe that the spec is comprehensive enough.
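To make the two circular queries above concrete, here is a small Python sketch of cycle detection over global variable declarations. It is not how any particular processor works: the dependency dictionary (variable name to the names it references) is a hypothetical representation, and the sketch says nothing about which error code should be reported.

```python
def find_cycle(dependencies: dict) -> list:
    """Return one dependency cycle as a list of variable names, or [] if there is none."""
    visiting, done = [], set()

    def visit(name):
        if name in done:
            return []
        if name in visiting:
            # Found a back edge: the cycle is the tail of the current path.
            return visiting[visiting.index(name):] + [name]
        visiting.append(name)
        for ref in dependencies.get(name, []):
            cycle = visit(ref)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return []

    for name in dependencies:
        cycle = visit(name)
        if cycle:
            return cycle
    return []


print(find_cycle({"var1": ["var1"]}))                    # ['var1', 'var1']
print(find_cycle({"var1": ["var2"], "var2": ["var1"]}))  # ['var1', 'var2', 'var1']
print(find_cycle({"var1": ["var2"], "var2": []}))        # []
```

Both the self-reference and the two-variable loop are found by the same traversal, which is one reason to expect the same error code for them.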
Issue #1376 closed #closed-1376
User defined operators
Pull request #1378 created #created-1378
1375 bugs in pattern syntax
- Changes the pattern syntax to allow any EQName to be used for function calls; the constraints on which functions can be called are defined outside the actual grammar.
- Reinstate root() as a permitted function
- Allow doc()/id() without requiring parentheses around the second function call
- Stylesheet changes to point external spec references at the 4.0 specs rather than 3.0.
Fix #1375
Issue #1377 closed #closed-1377
Fix inconsistent definition of return-type-ref-occurs
Pull request #1377 created #created-1377
Fix inconsistent definition of return-type-ref-occurs
The DTD defines it as an IDREF type, which it clearly isn't.
Issue #1376 created #created-1376
User defined operators
This came up on Stack Overflow and is probably worth a few comments.
The motivation was actually to implement something called applicative style programming, but the essence of it is that it requires operators.
The simplest (to me) mechanism is the one used in Haskell (to turn a function into an operator) whereby
Prelude> let concatPrint x y = putStrLn $ (++) x y
Prelude> concatPrint "a" "b"
ab
Prelude> "a" `concatPrint` "b"
ab
The first line defines a function, the second calls it, and the final expression uses the function as an operator by wrapping it in backticks.
I don't believe anything 'clever' happens; it's simply a mechanical rewriting of the expression before it's parsed. As Haskell is curried it generalises naturally to n parameters; in XPath you'd probably restrict it to two (and maybe one) parameters.
Issue #1375 created #created-1375
XSLT: names of functions in pattern
The spec says
In a [FunctionCallP], the EQName used for the function name must have local part doc, id, element-with-id, key, or root, and must use the [standard function namespace] either explicitly or implicitly.
But the grammar does not allow an EQName, it only allows:
[10] FunctionCallP ::= OuterFunctionName ....
[11] OuterFunctionName ::= "doc" | "id" | "element-with-id" | "key" | URIQualifiedName
For example the grammar does not allow fn:doc('a:xml')
when fn is bound to the conventional namespace.
Also: the XSLT 3.0 grammar allows the root()
pattern, but the 4.0 grammar does not.
Issue #1374 created #created-1374
Duplicate keys in map constructor
We say (concerning map constructors)
If two or more entries have the [same key value] then a dynamic error is raised [err:XQDY0137].
I think we should say:
if every MapKeyExpr is a Literal then this error SHOULD be reported as a static error.
Issue #1365 closed #closed-1365
Occurrence indicator meaning "exactly one"
Issue #1373 created #created-1373
XQFO: Editorial
- Wrong example:
hash("password123" || $salt, "sha-1234567")
- The following test lacks a
use="v-deep-equal-at"
attribute:
deep-equal(
$at//name[@first="Bob"],
$at//name[@last="Barker"],
options := { 'items-equal': op('is') }
)
- graphemes($crlf): Result should be char('\r') || char('\n') instead of $crlf (otherwise, test doesn’t work)
- fix fn:unix-time example: 1969-12-31T23:59:59.999Z (thanks @johnlumley)
- TODO for fn:replicate can be dropped
- fn:parse-uri: Remove example for FOUR0002.
…to be continued
Issue #1372 created #created-1372
Unknown option: FORG0013 → XPTY0004
I would suggest dropping https://qt4cg.org/specifications/xpath-functions-40/Overview.html#ERRFORG0013 and replacing it with a type error, XPTY0004
. This way, it can be performed during the type checks (which may also be the best solution if we define options as records).
I’ll be glad to create the PR for the spec and the tests.
Issue #1371 created #created-1371
(type)switch: braces after `case` keyword
As suggested by Pieter Lamers in https://app.slack.com/client/T011VK9115Z/C01GVC3JLHE, we could alternatively allow curly braces in case
clauses:
switch($n) {
case 1 { 'one' }
case 2 { 'two' }
default { 'dunno' }
}
This would be in alignment with the new syntax alternative for if
expressions: if($n) { 'one' } else { 'two' }
.
Pull request #1370 created #created-1370
1369 fn:round: rounding-mode → mode
Closes #1369
…no need to update any tests.
Issue #1369 created #created-1369
fn:round: `rounding-mode` → `mode`?
We could rename the new $rounding-mode
parameter of fn:round
to $mode
(analogous to $precision
, which isn’t $rounding-precision
).
If it sounds reasonable, it’s very probable I’d be able to take charge of this.
Issue #1368 created #created-1368
Further improvements to BuiltInKeywords test needed
- The test generated for fn:seconds() is incorrect (can't handle required type xs:decimal?)
- An incorrect argument is supplied for fn:pin()
Pull request #1367 created #created-1367
1321 leading lone slash
Fix #1321
Fixed as suggested in the issue. I have added a list of tokens that can appear at the start of a RelativePathExpr, produced by analyzing the grammar using a custom stylesheet leading-tokens.xsl which is available for reuse, but not integrated into the build.
Issue #1366 created #created-1366
In the EBNF, use explicit separator syntax
We could pick up the ++
and **
operators from Invisible XML for defining "a sequence of [one|zero] or more Xs separated by Ys".
So for example
Annotation ::= "%" EQName ("(" AnnotationValue ("," AnnotationValue)* ")")?
changes to
Annotation ::= "%" EQName ("(" (AnnotationValue ++ ",") ")")?
and
TypedFunctionTest ::= ("function" | "fn") "(" (SequenceType ("," SequenceType)*)? ")" "as" SequenceType
becomes
TypedFunctionTest ::= ("function" | "fn") "(" (SequenceType ** ",") ")" "as" SequenceType
Note that there is an impact on anyone who uses the machine-readable XML version of our grammar. However, the current version could easily be generated by preprocessing.
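The preprocessing step mentioned above could look roughly like the following Python sketch. It is deliberately limited and the names are mine: it only handles a parenthesized group whose operand is a single symbol name and whose separator is a quoted literal, which covers the two examples in this issue but not nested operands.

```python
import re

# Matches a parenthesized "(X ++ sep)" or "(X ** sep)" where X is a single symbol name
# and sep is a quoted literal. Nested or multi-symbol operands are not handled.
SEPARATED = re.compile(r'\(\s*(\w+)\s*(\+\+|\*\*)\s*("[^"]*")\s*\)')


def expand_separators(production: str) -> str:
    """Rewrite ++ and ** separator syntax into the plain EBNF used today."""

    def expand(match: re.Match) -> str:
        symbol, operator, separator = match.groups()
        one_or_more = f"{symbol} ({separator} {symbol})*"
        return one_or_more if operator == "++" else f"({one_or_more})?"

    return SEPARATED.sub(expand, production)


print(expand_separators('Annotation ::= "%" EQName ("(" (AnnotationValue ++ ",") ")")?'))
```

Applied to the two rewritten productions in this issue, the sketch gives back the current forms of Annotation and TypedFunctionTest.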
Issue #1365 created #created-1365
Occurrence indicator meaning "exactly one"
There are several reasons it would be good to have an occurrence indicator meaning "exactly one".
- It's generally a good idea if there is explicit syntax that's equivalent to the default: it enables you to write code that is more explicit about the fact that you have chosen the default option; it makes code generation easier.
- Sometimes you need to write an ItemType in parentheses to avoid a following ? or * being interpreted as an occurrence indicator. Having an explicit occurrence indicator for "exactly one" would avoid this need.
- It makes it clear that you are writing a SequenceType, not an ItemType.
I don't expect this to be used very often, but I think it would be useful to allow it. I propose using U+00B9, superscript 1.
For example, as="xs:integer¹"
We can now write:
function($a, $b) as xs:integer¹ *
to mean a sequence of functions each returning a single integer; while writing
array(xs:integer¹ )
just serves to remind the reader that the array entries must be single integers.
Issue #1014 closed #closed-1014
Predicates, sequences of numbers: Feedback
Pull request #1364 created #created-1364
1314 Change to type() syntax to fix ambiguity
Fix #1314
Issue #1363 created #created-1363
map:get and array:get
I find the callback/fallback arguments to map:get
and array:get
rather unsatisfactory. They complicate the specification, and the use cases and examples are tenuous. I don't think they offer a great deal of convenience over alternative ways of achieving the same effect.
I would like to propose scrapping these arguments, reverting to the 3.1 specification, and adding a new pair of functions map:try-get()
and array:try-get()
with a return type of record(found as xs:boolean, value? as item()*)
.
The specification for map:try-get($map, $key) is
if (map:contains($map, $key))
then {"found":true(), "value":map:get($map, $key)}
else {"found":false()}
The specification for array:try-get($array, $index) is
if ($index = 1 to array:size($array))
then {"found":true(), "value":array:get($array, $index)}
else {"found":false()}
Though it would probably be better to define it the other way around, that is define *:get
in terms of *:try-get
.
The name try-get
comes from C#. I'm not immensely enthusiastic about it, but it will get some name recognition. I would probably prefer test-get
.
One of the aims, of course, is to enable you to find whether a value exists and get the value in a single call to the map (so the key only gets hashed once, for example). This benefit will only materialise if ?found
and ?value
are implemented without requiring another full map lookup, that is if access to fixed fields in simple record types is optimised. That's a challenge we leave to implementors. We're not quite there yet in Saxon - we do have a map implementation in which the fields occupy fixed slot positions, but the static inferencing to reference fields by slot number isn't quite there yet.
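The shape of the proposed functions, and the single-lookup aim, can be sketched in Python. The names map_try_get and array_try_get are hypothetical stand-ins for map:try-get and array:try-get, and a Python dict is only an analogy for an XDM map.

```python
def map_try_get(mapping: dict, key) -> dict:
    """Return {"found": True, "value": v} if key is present, else {"found": False}."""
    sentinel = object()
    value = mapping.get(key, sentinel)  # single lookup, so the key is hashed once
    if value is sentinel:
        return {"found": False}
    return {"found": True, "value": value}


def array_try_get(array: list, index: int) -> dict:
    """Same idea for arrays, using 1-based positions as in array:get."""
    if 1 <= index <= len(array):
        return {"found": True, "value": array[index - 1]}
    return {"found": False}


print(map_try_get({"a": None}, "a"))   # {'found': True, 'value': None}
print(map_try_get({"a": None}, "b"))   # {'found': False}
print(array_try_get(["x", "y"], 3))    # {'found': False}
```

The first call shows the case that motivates the record: a key that is present with an "empty" value is still reported as found.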
Issue #1362 closed #closed-1362
Update version of DeltaXML
Pull request #1362 created #created-1362
Update version of DeltaXML
This (should!) fix the diffs
Pull request #1361 created #created-1361
1337 Atomic value becomes atomic item
Fix #1337
I was half-expecting this change to have repercussions, but in practice it seems very clean.
There are probably more places that should change, e.g. use "xs:boolean item" rather than "xs:boolean value", but we can deal with them as we come to them.
Pull request #1360 created #created-1360
1348 Some grammar simplifications
Fix #1348
Makes some simplifications to the grammar rules, including some of those suggested.
I've avoided removing productions that are referenced by name in the prose of the spec; these names provide a useful handle to relate the syntax to the semantics.
I would love to introduce BNF for a comma-separated list. Invisible XML uses
XX ** ","
for zero or more occurrences of XX, separated by ",", and
XX ++ ","
for one or more occurrences of XX, separated by ",", and
Issue #1356 closed #closed-1356
Names of private functions in XQuery library modules
Pull request #1359 created #created-1359
1346 Fix minor typos in format-number
Fix #1346
Pull request #1358 created #created-1358
959 fn:unix-dateTime
Closes #959
Issue #1357 created #created-1357
Rendering of new vs. updated features
In previous versions of the XQFO drafts, it was possible to differentiate between NEW and UPDATED functions. It would be nice if this distinction could be preserved.
Issue #1356 created #created-1356
Names of private functions in XQuery library modules
Is there any good reason for the rule that the name of a private function or variable declared in a library module must be in the target namespace of the module? I think it makes sense to allow:
- private functions to be in any non-reserved namespace
- private variables to be in any non-reserved namespace or in no namespace
The reason I ask is that I'm looking at what the rule should be for item type declarations.
Issue #1354 closed #closed-1354
1202 Change Log for F&O spec
Issue #1202 closed #closed-1202
XQFO: Rendering of new/updated functions
Pull request #1355 created #created-1355
1351 Add "declare record" in XQuery
This PR does the following:
- Renames MapTest, ArrayTest, RecordTest, FunctionTest to MapType, etc (suggested in #1351 and elsewhere)
- Changes the XQuery item type declaration syntax to "declare type NNNN" rather than "declare item type" or "declare item-type", and fixes examples accordingly (also in #1351)
- Provides a separate syntax for declaring record types (as proposed in issue #1277). This declaration creates both a named item type and a constructor function for items of that type, with provision for default values. Only this kind of record type can be recursive.
I haven't yet taken on board the suggestion of changing the data model so that records are maps with a type annotation. I'm not totally opposed to the idea, but it's a significant change that needs more exploration, in particular the impact on all the map-related functions.
Fix #1351 Fix #1277
Follow-up work is needed to make equivalent changes for XSLT.
Pull request #1354 created #created-1354
1202 Change Log for F&O spec
Fix #1202
Since the changes are purely editorial but are likely to cause conflicts with technical changes, I would request an expedited review.
Pull request #1353 created #created-1353
1347 Add escape-solidus option to xml-to-json function
Fix #1347
Pull request #1352 created #created-1352
1350 Fix signature for unparsed-text-available
Fix #1350
Issue #1351 created #created-1351
declare item type → type
3.2.8.4 Recursive Record Tests presents the following declaration:
declare item type my:list as record(value as item()*, next? as my:list);
It should probably be declare item-type
.
Personally, I would prefer to just use declare type
(I guess we do not plan to introduce “sequence type” declarations?).
Issue #1350 created #created-1350
unparsed-text-available() signature
The $options parameter of unparsed-text-available() should have the same item type as in unparsed-text(), that is either a string or a map.
Issue #1349 created #created-1349
Nothing
In the Data Model we are missing an important concept - the concept of Nothing.
It has been believed that the type item()*
is sufficient to express all results that can be produced or expressed in the evaluation of an XPath expression.
In reality, there are XPath expressions whose result cannot be expressed unambiguously. Consider:
array:values([ (), 1, (2 to 4), [ 5 ] ])
According to the F&O 4.0 specification, the result must be:
(1, 2, 3, 4, [ 5 ])
And here the value of the first array member - ()
- is not returned.
Thus, in the general case, array:size($ar)
is not equal to count(array:values($ar))
,
and array:size($ar) - count(array:values($ar))
can be any non-negative integer.
At present there isn't a way in XPath to represent the lack of value. The empty sequence ()
is a value and thus using it to represent the lack of value is wrong. Also, if the lack of value is represented by ()
then this is completely lost/destroyed/vanishes when concatenated with other sequences.
The expression:
deep-equal([ ], [ () ])
evaluates to false()
, because the first array-argument has no members (contains nothing), and the second array-argument has one member - the value ()
.
Thus, ()
is not nothing.
It is time for us to be able to represent nothing in an explicit way.
We need a type nothing
, or maybe xs:nothing
that tells us that an expression may not evaluate to any value, even not to ()
.
Then we will be able to express correctly the type of a lookup expression as item()* | nothing
Issue #1348 created #created-1348
Grammar rules: redundancies
I’m a humble user of our grammar rules, and I’m definitely not an expert when it comes to their definition (my main obstacle is that changes to the XQuery grammar rules need to be compatible with the XPath and possibly XSLT grammars).
What made it difficult for me to read them in the past were the numerous redundancies (with some of them attached). Is this just “history”, or are there particular reasons for preserving or even enforcing redundance? I noticed that, sometimes, symbol names are used in the prose, but I failed to detect any reasonable pattern.
Do we believe it would be helpful to clean up the grammar rules, or does it rather feel out of scope?
# currently
ParamWithDefault ::= "$" EQName TypeDeclaration? (":=" StandaloneExpr)?
Param ::= "$" EQName TypeDeclaration?
# could be
ParamWithDefault ::= Param (":=" StandaloneExpr)?
Param ::= "$" EQName TypeDeclaration?
# currently
SchemaAttributeTest ::= "schema-attribute" "(" AttributeDeclaration ")"
AttributeDeclaration ::= AttributeName
AttributeName ::= EQName
# could be
SchemaAttributeTest ::= "schema-attribute" "(" EQName ")"
# currently
WindowVars ::= ("$" CurrentItem)? PositionalVar? ("previous" "$" PreviousItem)? ("next" "$" NextItem)?
CurrentItem ::= EQName
PreviousItem ::= EQName
NextItem ::= EQName
LetBinding ::= "$" VarName TypeDeclaration? ":=" StandaloneExpr
PositionalVar ::= "at" "$" VarName
VarName ::= EQName
VarRef ::= "$" VarName
# could be
WindowVars ::= ("$" EQName)? PositionalVar? ("previous" Var)? ("next" Var)?
LetBinding ::= Var TypeDeclaration? ":=" StandaloneExpr
PositionalVar ::= "at" Var
VarRef ::= Var
Var ::= "$" EQName
# currently
MapConstructorEntry ::= MapKeyExpr ":" MapValueExpr
MapKeyExpr ::= ExprSingle
MapValueExpr ::= StandaloneExpr
# could be
MapConstructorEntry ::= ExprSingle ":" StandaloneExpr
# currently
ForwardAxis ::= ("child" "::") | ("descendant" "::") | ("attribute" "::") | ("self" "::") |
("descendant-or-self" "::") | ("following-sibling" "::") | ("following" "::")
# could be
ForwardAxis ::= ("attribute" | "child" | "descendant" | "descendant-or-self" |
"following" | "following-sibling" | "self") "::"
# currently
CompNamespaceConstructor ::= "namespace" (Prefix | EnclosedPrefixExpr) EnclosedURIExpr
Prefix ::= NCName
EnclosedPrefixExpr ::= EnclosedExpr
# could be
CompNamespaceConstructor ::= "namespace" (NCName | EnclosedExpr) EnclosedURIExpr
# currently
Argument ::= StandaloneExpr | ArgumentPlaceholder
ArgumentPlaceholder ::= "?"
# could be
Argument ::= StandaloneExpr | "?"
…and so on.
PS: If we tweak the grammar, I would propose to rename ExprSingle
to SingleExpr
.
Issue #1347 created #created-1347
The escape-solidus option should apply to xml-to-json
Recently, we added an escape-solidus
option to fn:serialize()
. That option should apply to fn:xml-to-json()
as well.
It's not immediately obvious to me if there are other serialize options that should apply as well.
Issue #1346 created #created-1346
Typos in fn:format-number
In the rules section of fn:format-number, in the para starting "In the table", the word "rendition" is in the wrong font (use term
markup).
In the options table, the notation xs:string (: matching '.(:.*)?' :)
is flawed because a comment cannot contain the characters (:
. Simplest answer to this is to just write xs:string matching '.(:.*)?'
- it's intended for the human reader after all, not for machine execution.
Issue #1345 created #created-1345
Bare brace ambiguity resolution in practice
I have applied the new bare brace grammar rules to two existing projects, and (if I’m correct) none of the following constructs can be parsed anymore (with the order reflecting my sense of urgency):
- Simple maps:
$data ! {}
- Sequence arrow:
{} => map:size()
- Predicates:
{ 'x': 1 }[map:keys(.) = 'x']
- Return clause:
let $x := ... return {}
- Paths:
<a/>/{}
- Instance of:
{} instance of record(x)
- If expression:
if($x) then { $x: $y } else {}
I think we should try hard to tweak the grammar for some of these.
I didn’t manage to construct an ambiguous example with return
and {}
; could someone help me?
Related: #1309
Pull request #1344 created #created-1344
1343 Drop the static typing feature
Fix #1343
Issue #1343 created #created-1343
Drop the static typing feature
Since XQuery 3.0, the effect of the static typing feature has been almost entirely implementation defined. Since there is no interoperability when this feature is in effect, there seems very little point leaving it as an optional feature in the spec. In any case, I think user experience of processors that attempt to implement the static typing feature has been rather negative, and most of the processors that implemented it have been left to languish and are unlikely to be upgraded to 4.0.
QT4 CG meeting 087 draft minutes #minutes-07-23
Draft minutes published.
Issue #1329 closed #closed-1329
load-xquery-module supplying content
Issue #1333 closed #closed-1333
1329 Add content option to load-xquery-module
Issue #1309 closed #closed-1309
Dangling else syntax ambiguity
Issue #1327 closed #closed-1327
1309 bare brace ambiguities
Issue #1331 closed #closed-1331
1324 Introduce markup for executable specs
Issue #1317 closed #closed-1317
Record Test Subtype Relationship
Issue #1332 closed #closed-1332
1317 Fix the record subtyping rules
Issue #1263 closed #closed-1263
1224 Add xsl:accumulator-rule/@priority attribute
Issue #1326 closed #closed-1326
Misleading summary for concat() - "string value"
Issue #1328 closed #closed-1328
1326 wording improvements for concat and string-join
Pull request #1342 created #created-1342
1339 Deprecate ordering mode declaration
The "declare ordering mode" declaration, and the ordered{}
and unordered{}
declarations are retained for compatibility, but are deprecated and no longer have any effect.
Fix #1339
Issue #1341 created #created-1341
Remove the `$position` argument from the `$action` function passed to folds
The $position
argument, passed to the $action
-function-argument of the folds is unnecessary and artificial:
- The addition of this argument resulted from automatically adding this position-aware $action-function-argument to all functions processing sequences and producing results based on their values.
- In doing so, no further analysis was made on the specifics of the fold functions.
- Though this issue has been raised again and again for months, no single and meaningful use-case has been provided.
- This addition departs from the original meaning of folds as well established by the developer community. Whether one needs a sum, min, max, average, product, or all / none / some / any, in all these folds the position of the individual items does not matter.
- One bad consequence of this change is that it makes it more difficult for the reader to grasp the meaning of a particular fold-function, and even to wonder if the spec talks about the same fold functions that the reader thought he knew well.
- This change results in unnecessarily complex documentation and testing.
- Users have expressed their dismay over the resulting complexity. To quote @benibela: "Too many variables make the code hard too read. And the implementation becomes slow, when it has to handle too many arguments. Especially with function coercion adding further type checks" . And @michaelhkay himself: "I'm inclined to propose dropping the position argument for both fold and scan. It complicates the specification and the use cases are unconvincing. I believe it has been incorrectly specified (for fold-left, the first time $action is called, the value supplied for $pos is 2, whereas for fold-right it is count($input)-1; and the "Error conditions" section talks of $action being applied to 2 arguments). For the -right forms in particular, the semantics are mind-bending enough without introducing this complication."
- It is very easy to make an accidental mistake and pass a 2-argument $action function when a 3-argument function was meant (or the other way around).
- The giant think-tank of Microsoft gives us a good example of a better solution. They never added position-aware overloads to any fold or fold-related methods of the Enumerable class. None of the following methods has a position-aware action-function argument:
-
This is not accidental, as Microsoft did add overloads to other Enumerable methods that take position-aware $action function arguments:
- Last and probably most important: @michaelhkay gave us a general, elegant and very readable way of expressing any fold that needs positional information as a 2-step operation, where the 1st step creates a map with entries {"position": $input[$pos]} and the 2nd step is a fold operation that has only a non-position-aware $action function argument and takes this map as input.
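The 2-step approach in the last point can be sketched outside XPath. The following Python is only an illustration under stated assumptions: `fold_left`, `with_positions` and `weighted_sum` are hypothetical names, and the entry layout (a dict with `position` and `item` keys) is a stand-in for the map described above, not the form proposed in the issue.

```python
from functools import reduce


def fold_left(items, zero, action):
    """Plain left fold: action receives (accumulator, item) and nothing else."""
    return reduce(action, items, zero)


def with_positions(items):
    """Step 1: tag each item with its 1-based position (assumed entry layout)."""
    return [
        {"position": pos, "item": item}
        for pos, item in enumerate(items, start=1)
    ]


def weighted_sum(items):
    """Step 2: an ordinary two-argument fold over the tagged entries."""
    return fold_left(
        with_positions(items),
        0,
        lambda acc, entry: acc + entry["position"] * entry["item"],
    )
```

Here `weighted_sum([10, 20, 30])` evaluates to 140 (1*10 + 2*20 + 3*30), and the fold itself never needs a third, positional argument.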
Proposed solution:
- Leave the folds unchanged - in order to preserve their original, established meaning and avoid introducing inadequate complexity.
- If deemed really necessary, define separate functions that can take a position-aware $action function as a parameter.
Issue #1340 created #created-1340
Namespace nodes and the namespace axis
It would be nice to bring XSLT, XPath, and XQuery into line here.
The current state of play seems to be:
XQuery: the namespace axis is not supported. Namespace nodes can be constructed, but they exist only as detached orphans; they can never be attached to a parent element.
XPath: the namespace axis is deprecated and support is optional. There is no mechanism for constructing namespace nodes.
XSLT: the namespace axis is mandatory. Namespace nodes can be constructed and can be attached to elements.
I believe that the only reason for the differences is that XQuery implementors were concerned that it would be difficult to implement namespace nodes efficiently. I think XSLT has clearly demonstrated that this concern is unjustified.
However, there are implementation complexities, primarily around the fact that namespace nodes have identity and parentage, so if a namespace is declared on a root element, then every element in the document has a namespace node for this namespace, and these have distinct identity. To implement this efficiently, the implementation has to instantiate namespace nodes lazily on demand, and then has to ensure that if the "same" namespace node is instantiated again, it has the same "identity".
I suggest a solution along the following lines, applied to all three languages:
(a) the namespace axis is supported and delivers namespace nodes
(b) operations that depend on the ordering, identity, or parentage of namespace nodes are deprecated and implementation-defined.
(c) the data model says that the in-scope namespaces of an element are in the form of a (prefix, URI) map. The semantics of the namespace axis are described in terms of constructing transient namespace nodes from this map.
Issue #1339 created #created-1339
Drop unordered mode
Is there any evidence that unordered mode is useful, or that any implementations actually take note of it (by delivering a different result if unordered mode is set)?
If not, could we drop it? I would suggest continuing to recognize the syntax, marking it deprecated, and saying it has no effect.
It's not doing a great deal of harm, but there are a lot of places where our examples in the spec assume ordered mode, but the examples don't explicitly call out this assumption. Similarly, a great many QT3 test cases would fail on an implementation that sets unordered mode by default.
Issue #1338 created #created-1338
Arrays and maps: Members, entries, values, contents, pairs, …
With version 4.0, we are adding a lot of promising and powerful new map and array features. This is a big step forward, compared to the obvious limitations of 3.1.
Some aspects of the 3.1 design have made it difficult (or impossible) to fully adjust arrays and maps, but (in my opinion) the old overall concept was impressively consistent – and it is definitely a big challenge to achieve a 4.0 design that is not too fragmented.
To me, this becomes particularly evident in the case of arrays. The following example sums up the items of all members of an array. For the cumbersome 3.1 solution…
for $pos in 1 to array:size($array)
return sum($array($pos))
…we now have several (roughly?) equivalent options to do this; for example…
1. for member $m in $array return sum($m)
2. array:members($array) ! sum(?value)
3. $array?entry::* ! sum(?value)
4. $array?value::* ! sum(.)
…which is great – but the downside is that we have introduced a terminological jungle. The examples above could imply that:
- for 1., an array member is a sequence (which it indeed is);
- for 2., an array member is a map;
- for 3., an array has entries (but there is no array:entries);
- for 4., an array has values (which is true, but array:value returns a different structure).
Next, with the current proposals, $array:content::1 gives us the sequence-concatenated version of the first member of an array. Similar observations can be made with maps: map:entries($map) returns singleton maps, whereas $map?entry::* is actually equivalent to map:pairs.
The fundamental obstacles are clear and have already been discussed a lot, but I think that with each new concept, we should try really hard not to blur terminology, and work with terms that users can assign to the underlying concepts without too much guessing or trial’n’error.
My general suggestions would be to…
- align the new lookup terminology and the builtin functions, and
- omit, rename or drop builtin functions that do not rely on the existing or arising terminology.
My concrete proposals:
1. As we already have map:pairs, $map-or-array?entry::* should become $map-or-array?pair::*, and we should add an array:pairs function, and probably array:of-pairs (see #832). We shouldn’t do it the other way round and rename map:pairs to map:entries, as the existing map:entry function returns a singleton map.
2. If we keep calling the sequence-concatenated result “content”, we should include it in the definition of sequence-concatenation. In addition, (array|map):values should be renamed to (array|map):contents (see #1179).
3. Due to the existence of array:value::*, we should make clear what an “array value” is and how it relates to an “array member”, and we should add map:values and array:values for equivalent results.
4. Due to the existence of array:key::*, we should add an array:keys function (which returns a dense integer range). 1 to array:size($array) could then be written as array:keys($array).
5. As we have map:entries and map:merge, we could add equivalent array:entries and array:merge functions.
6. I would suggest dropping array:members/array:of-members in favor of either array:split/array:join, array:pairs/array:of-pairs (see 1.) or array:entries/array:merge (see 5.). I really believe that an “array member” should not be a map; an “array pair” or “array entry” certainly could.
One might question whether we should really introduce map terminology for arrays. I think we have no other choice if we want to treat maps and arrays identically with lookup key specifiers, and it may help us later on to treat both data structures as similarly as possible.
Issue #1337 created #created-1337
Atomic value → atomic item
I liked @michaelhkay’s proposal in https://github.com/qt4cg/qtspecs/issues/826#issuecomment-1821359131:
• The term "atomic item" (or just atom?) replaces "atomic value".
I often used “atomic items” in the past (although it’s not a defined term at the moment) as it seemed more intuitive to me.
Issue #1336 created #created-1336
Editorial: fos record descriptions within xmlspec prose
In the F+O spec, the uri-structure-record appearing in section 6.6, and various other similar record descriptions, are defined using the fos namespace markup in xpath-functions.xml. Normally the fos XML vocabulary is confined to function-catalog.xml, and is converted to the usual xmlspec vocabulary by the merge-function-specs stylesheet.
Using this vocabulary directly within an xmlspec document means that the document doesn't validate against its DTD, and that the fos islands aren't validated against the fos.xsd schema.
A better approach here might be to use XInclude to insert text from a separate, schema-validated, document.
Issue #31 closed #closed-31
Extend FLWOR expressions to maps
Issue #1160 closed #closed-1160
fn:is-collation-available
Issue #1334 closed #closed-1334
map:build parameter keywords
Issue #1335 created #created-1335
Data Model primitives for Maps and Arrays
In principle we ought to be able to define all operations on maps and arrays in terms of the primitives defined in the Data Model spec.
Currently the data model defines the primitives as dm:map-entries, dm:array-size, and dm:array-get.
This is a workable set for retrieval functions, though it's not necessarily an ideal set. But what is missing is any primitive for map and array construction.
I think we need to regard the empty array and empty map as given, and then define array:append and map:put as primitives.
Since dm:map-entries() isn't the same as the user-visible map:entries() it might be a good idea to rename it.
Another way of defining the primitives would be to make iteration primitive, so the primitives become
dm:for-each-map-entry($map, fn($key, $value))
and
dm:for-each-array-member($array, fn($position, $value))
This has some merit in that (a) maps and arrays are treated symmetrically, and (b) there are only 2 primitives rather than 3.
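To see how far two iteration primitives go, here is a rough Python sketch in which the retrieval operations are derived from them. It is not taken from the data model spec: the Python names only mirror the dm: names suggested above, and representing maps as dicts and arrays as lists is an assumption made for illustration.

```python
def for_each_map_entry(mapping, action):
    """Primitive 1: call action(key, value) once per map entry."""
    for key, value in mapping.items():
        action(key, value)


def for_each_array_member(array, action):
    """Primitive 2: call action(position, value) once per member, 1-based."""
    for position, value in enumerate(array, start=1):
        action(position, value)


def array_size(array):
    """Derived: count the calls made by the array primitive."""
    count = 0

    def bump(_position, _value):
        nonlocal count
        count += 1

    for_each_array_member(array, bump)
    return count


def array_get(array, index):
    """Derived: keep the member whose position equals index."""
    found = []

    def pick(position, value):
        if position == index:
            found.append(value)

    for_each_array_member(array, pick)
    if not found:
        raise IndexError(index)
    return found[0]


def map_keys(mapping):
    """Derived: collect the keys seen by the map primitive."""
    keys = []
    for_each_map_entry(mapping, lambda key, _value: keys.append(key))
    return keys
```

Construction is still not covered by this sketch, which matches the observation above that primitives such as array:append and map:put would be needed as well.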
Issue #1334 created #created-1334
map:build parameter keywords
Currently $keys, $value. Both should be plural.
Pull request #1333 created #created-1333
1329 Add content option to load-xquery-module
Fix #1329
Pull request #1332 created #created-1332
1317 Fix the record subtyping rules
Fix as proposed in the issue.
Fix #1317
Pull request #1331 created #created-1331
1324 Introduce markup for executable specs
This PR does the following:
- Makes schema changes for the function catalog to allow an executable specification of a function to be marked up as such.
- Uses this markup initially for functions in the array namespace.
- Makes stylesheet changes to render this markup in the published spec.
- Commits an XSLT stylesheet (not yet integrated into the build system) that runs against the function catalog to produce an XQuery module whose effect is to declare functions based on the "executable specifications" and run the published executable examples against these functions, checking that they produce the expected result. The "success" output of this query (which runs with no source document) is an XML document containing an empty element <result/>. (If it has content, this will relate to tests that failed).
To get this to work, I had to tweak a couple of functions (array:sort and array:get) where Saxon has not yet implemented the required functionality. Although the test query is using the implementations from the spec, not those from Saxon, these implementations make calls on other functions where the Saxon implementation is used. For example array:fold-left calls fn:fold-left and this currently uses the Saxon implementation of fn:fold-left.
The query binds a dummy namespace to the array prefix to avoid problems with reserved namespaces. There are a couple of "core" functions that have no executable specification -- notably array:of-members() -- and the generated query contains an implementation of these that maps the function in the dummy array namespace to the function in the true array namespace.
This is phase 1. Most of the machinery is in place. It now needs to be applied to executable specifications of functions in other namespaces.
The stylesheet has no Saxon dependencies, but it does include a couple of template rules to exclude specific functions/tests that Saxon does not currently implement. The stylesheet could be made portable across implementations by moving these exception cases to an overriding stylesheet module.
I did find a few examples of supposedly executable code that needed fixing; hopefully these will show up in the diff version.
Issue #1330 created #created-1330
$fallback argument of map:get() and array:get() should allow () to be supplied
As a general rule, if an argument is optional then it should accept an empty sequence.
Issue #1329 created #created-1329
load-xquery-module supplying content
I propose to provide an additional option content as xs:string for fn:load-xquery-module. The effect is to supply the content of the XQuery library module as a string. If supplied, the location-hints option is ignored.
Use case: I'm writing code that attempts to test the XQuery examples in the specification. This requires some kind of capability for dynamic XQuery execution, and this seems the simplest way of doing it.
Pull request #1328 created #created-1328
1326 wording improvements for concat and string-join
Very minor editorial improvements.
Fix #1326
Pull request #1327 created #created-1327
1309 bare brace ambiguities
Fix #1309
The proposal restructures the grammar so a bare brace map constructor (that is, one without the "map" keyword) can be used only where this causes no ambiguity, for example as a function argument.
Issue #1326 created #created-1326
Misleading summary for concat() - "string value"
The summary of fn:concat says that it concatenates the string values of its arguments. This isn't strictly correct: if a node is supplied as an argument, the node is atomized and the typed value is cast to a string, which doesn't necessarily give you the same result. For example (a) if the node is an attribute whose type is list, the result is the elements of the list without space separation, and (b) if the node is an element with element-only content, concat() fails although string() would succeed.
In 3.1, supplying a list-valued attribute would fail because the required type of each argument was xs:anyAtomicType?. It now succeeds because the required type is xs:anyAtomicType*, but perhaps it doesn't have the desired effect?
We should clarify these points with notes and examples. Also affects the || operator.
Issue #1325 created #created-1325
Variadic System Functions: Principles?
In the current spec, the following functions are variadic:
fn:concat
fn:codepoints-to-string
fn:distinct-ordered-nodes
The advantage of the variadic representation is that a user can omit additional parentheses; the drawback is that the function cannot be enhanced with parameters later on. From my point of view, fn:distinct-ordered-nodes might be a candidate for that in the future (on the other hand, it will only be used by a very small user group one way or the other).
Other candidates for variadicity could be (among others):
fn:count
fn:exists, fn:empty
fn:head, fn:tail, fn:trunk, fn:foot
fn:one-or-more, fn:exactly-one, fn:zero-or-one
fn:innermost, fn:outermost, fn:unordered, fn:reverse
fn:boolean, fn:not
fn:identity
fn:data, fn:has-children (the 0-arity case would need to be preserved due to its special semantics)
Note that function calls like fn:exactly-one(E1, E2) can be reasonable in practice, as E1 may return an empty sequence.
I wonder if we can define a principle for deciding which functions should be variadic?
QT4 CG meeting 086 draft minutes #minutes-07-16
Draft minutes published.
Issue #1324 created #created-1324
Executable specifications
For a number of functions we have provided executable specifications: that is, we have provided an XQuery function declaration that claims to be a conformant implementation of the function being specified.
We should add machinery to ensure that these reference implementations are correct: that is, that they compile, that they correctly run the examples in the spec, and that they can be used to run the tests for the relevant function in the test suite.
There are one or two cases where we have been doing this in an ad-hoc way by having additional test cases in the test suite that use the reference implementation (or a copy of it!) in place of the real function; but this is clumsy and the process should be automated.
It can be tied in with the mechanism that currently generates test cases from the examples in the spec.
We should probably handle several levels:
(a) There are cases where the reference implementation can be a simple XPath expression: for example the function fn:local-name($x) delivers fn:local-name-from-QName(fn:node-name($x)).
(b) In other cases a full XQuery function declaration is needed, especially where it makes use of a supporting helper function.
Ideally the reference implementation should use 3.1 syntax only (to make it easily testable on 3.1 implementations); but not if this sacrifices clarity, given that the primary audience is the human reader.
Issue #1244 closed #closed-1244
566-partial Rewrite parse-uri
Issue #1272 closed #closed-1272
Add xsl:value-of/@as attribute
Issue #1323 created #created-1323
Function parameters names: $uri vs. $href
I was asked about the difference between the function parameter names $href and $uri…
- fn:doc($href) (“Retrieves a document using a URI supplied as an xs:string…”); fn:unparsed-text($href); others
- fn:collection($uri), fn:uri-collection($uri); others
…and I couldn’t give a convincing answer. At least in BaseX, both fn:doc and fn:collection can be used interchangeably to address single resources (but things change when the target contains multiple resources).
Would it be reasonable to rename $href to $uri, or are there reasons why we need to distinguish between the two?
If we rename the parameter, fn:resolve-uri may be the only function using $href (maybe even here, $uri could work?).
Issue #1322 created #created-1322
fn:collation-available (editorial)
Minor observations:
- As the function input is a plain URI, I would propose to rename $collation to $uri (see fn:collection and other functions).
- I believe we should make fn:collation-available($uri, ()) and fn:collation-available($uri) equivalent. Could we change the default value of $usage to ()?
- It would be helpful to have (successful and failing) examples for the $usage argument added to the specification. Maybe things get easier once test cases exist.
Issue #1321 created #created-1321
Leading lone slash
PR #1313 clarifies how tokenization of direct constructors works.
As a result we should rephrase the section on "leading lone slash" in A.1.2. This currently says (inter alia)
the < token could be either an operator or the start of a [DirectConstructor]
But we now speak of terminals rather than tokens, and a DirectConstructor is a terminal, whereas the < character that appears at the start of a DirectConstructor is not a terminal.
The revised formulation affects the interpretation of examples like / < 5 (test cases PathExpr-5 and PathExpr-8) which I believe are now valid in XP40 and XQ40 (< as a character can appear at the start of a RelativePathExpr, but < as a token/terminal can't).
The rule currently talks of a "token that can appear at the start of a RelativePathExpr" without enumerating the tokens/terminals that can do so. It would be helpful to both implementors and users (and test authors) if we could enumerate them - I believe we can construct a list by conducting a suitable query against the grammar.
Issue #1266 closed #closed-1266
1158 Add array mapping operator
Issue #1320 created #created-1320
fn:parse-uri: Observations
@ndw I decided to only give feedback on the first test that fails; maybe it makes things easier. Next, I chose this repository (instead of qt4tests), as you are the better person to judge if bugs are to be fixed in the tests or the spec.
The test case fn-parse-uri-012 uses fn:parse-uri("file:///c:/path/to/file") and returns:
{
"uri": "file:///c:/path/to/file",
"scheme": "file",
"hierarchical": true(),
"path": "/c:/path/to/file",
"filepath": "c:/path/to/file",
"path-segments": ("", "c:", "path", "to", "file")
}
Following the current rules, I would have expected filepath to be /c:/path/to/file:
If URI matches `^([a-zA-Z][A-Za-z0-9\+\-\.]*):(.*)$`:
• SCHEME: file
• STRING: ///c:/path/to/file
If the scheme is known to be file and the string matches "^/*(/[a-zA-Z][:|].*)$":
• STRING: /c:/path/to/file
If the scheme is file or the empty sequence, and filepath is the empty sequence, filepath is also the whole string:
• FILEPATH: /c:/path/to/file
Would you propose to revise the test or the specification?
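The three steps quoted above can be replayed mechanically. This Python sketch copies the two regular expressions verbatim from the rules and applies them in order. The function name `expected_filepath` is hypothetical, and the sketch models only these three steps, not the rest of fn:parse-uri.

```python
import re

# Both patterns are copied from the rules quoted above.
SCHEME_RE = re.compile(r"^([a-zA-Z][A-Za-z0-9\+\-\.]*):(.*)$")
DRIVE_RE = re.compile(r"^/*(/[a-zA-Z][:|].*)$")


def expected_filepath(uri):
    """Return (scheme, string, filepath) after the three quoted steps."""
    scheme, string = None, uri

    # Step 1: split off the scheme.
    scheme_match = SCHEME_RE.match(uri)
    if scheme_match:
        scheme, string = scheme_match.group(1), scheme_match.group(2)

    # Step 2: for file URIs, collapse the leading slashes before a drive letter.
    if scheme == "file":
        drive_match = DRIVE_RE.match(string)
        if drive_match:
            string = drive_match.group(1)

    # Step 3: for file or no scheme, filepath is the whole remaining string.
    filepath = string if scheme in ("file", None) else None
    return scheme, string, filepath
```

For `file:///c:/path/to/file` this returns `("file", "/c:/path/to/file", "/c:/path/to/file")`, so the leading slash survives in filepath, unlike the `c:/path/to/file` value in the test result.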
QT4 CG meeting 085 draft minutes #minutes-07-09
Draft minutes published.
Issue #1158 closed #closed-1158
Simple mapping operator for arrays
Issue #1319 created #created-1319
Specification Documents: Editors and Contributors
I have noticed that the headers of the specification documents of previous versions of the languages contain more than one editor. In some EXPath documents, both editors and contributors are listed (with editors doing the majority of the work).
As various people have now been contributing to the new specifications and the documents for months or years, would it make sense (and would it conform to current W3 conventions) to name more than one person in the header? In either case, it should certainly be made clear that Michael Kay has contributed the vast majority of the content.
Issue #1306 closed #closed-1306
46 Add @as attribute to xsl:sequence
Issue #1262 closed #closed-1262
1160 Add collation-available() function
Issue #1311 closed #closed-1311
Tokenization and element constructors
Issue #1313 closed #closed-1313
1311 Tokenizing after <
Issue #566 closed #closed-566
fn:parse-uri, fn:build-uri: Feedback
Issue #1318 created #created-1318
Function Coercion: Records, Maps, Arrays
Conclusion (2024-09-11): As we plan to keep coercion rules for records, we should add rules for arrays and maps as well: If $v as xs:int+ is successful for (1, 2), $v as array(xs:int) should be successful for [ 1, 2 ].
I would like to question the coercion rule that encompasses record tests:
- If R is a RecordTest and J is a map, then J is converted to a new map as follows:
- The keys in the supplied map are unchanged.
- In any map entry whose key is an xs:string equal to the name of one of the field declarations in R, the corresponding value is converted to the required type defined by that field declaration, by applying the coercion rules recursively (but with XPath 1.0 compatibility mode treated as false).
I would like us to drop this rule. I believe that both the instance checks and the conversions of large maps can get very expensive. In addition, it may even require recursive rebuilds of map structures if a supplied record test includes nested record tests.
I think it’s completely fair to expect users to deliver maps in a way that matches record definitions, and it could even be counterintuitive if map updates take place as a consequence of a simple function call.
If we want to stick to that coercion rule, we should also define coercion rules for maps and arrays.
Issue #1317 created #created-1317
Record Test Subtype Relationship
For a subtype relationship between record tests, A ⊆ B, I am trying to understand the impact of the extensibility of A, where B is extensible, but I can't spot any.
The cases of extensible and non-extensible A are distinguished in 3.3.2.10 Record Tests, points 5 and 6. The only difference between those, apart from A's extensibility, is the first one saying
For every field that is declared in B but not in A, the declared type in B is item()*.
while the second one has
Every field that is declared in B with a type other than item()* is also declared in A.
So the first one asks for a type of item()* when a declaration is missing, and the second one allows a missing declaration only for a type of item()*. How can that not be the same?
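The two quoted conditions are contrapositives of each other, and that can be checked by brute force. The Python below is only a sketch: it models a record test as a dict from field name to declared type, and it looks at these two conditions only, not at the rest of the subtyping rules in points 5 and 6. All function names are hypothetical.

```python
from itertools import product

ANY = "item()*"


def rule_5(a_fields, b_fields):
    """Every field declared in B but not in A has declared type item()* in B."""
    return all(b_fields[name] == ANY for name in b_fields if name not in a_fields)


def rule_6(a_fields, b_fields):
    """Every field declared in B with a type other than item()* is also in A."""
    return all(name in a_fields for name in b_fields if b_fields[name] != ANY)


def all_records(names, types):
    """Enumerate records where each field is absent or has one of the types."""
    for choice in product([None, *types], repeat=len(names)):
        yield {n: t for n, t in zip(names, choice) if t is not None}


def rules_agree(names=("x", "y"), types=(ANY, "xs:string")):
    """Compare both conditions on every pair (A, B) of enumerated records."""
    records = list(all_records(names, types))
    return all(rule_5(a, b) == rule_6(a, b) for a in records for b in records)
```

`rules_agree()` returns True for the small universe enumerated here, which is consistent with the suspicion that the two formulations say the same thing.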
QT4 CG meeting 085 draft agenda #agenda-07-09
Draft agenda published.
Issue #1316 created #created-1316
XPath: type declarations in quantified expressions
We have added type declarations to XPath "for" and "let" expressions, but not to "some" and "every", where they remain XQuery-only. This seems a needless inconsistency.
Issue #1315 created #created-1315
12 div-3
Revisiting an old issue here: should 12 div-3 parse?
Under the new 4.0 tokenization rules, it certainly doesn't.
But under Michael Dyck's interpretation of the 3.1 rules, it does parse; and accordingly we made it parse in Saxon: see https://saxonica.plan.io/issues/2715 .
Michael D's reasoning is at
https://lists.w3.org/Archives/Public/public-xsl-query/2016Mar/0037.html
He argued that the longest token "consistent with the EBNF" is div (because div-3 is not consistent with the EBNF), and that the rule requiring a space between an NCName and a hyphen does not apply because div in this context is a keyword, not an NCName.
Specifically, in existing XPath processors, does it parse?
I'm going to defend the new rules in 4.0 here, in which tokenization is independent of syntactic context. I think that's a much clearer definition. But in the interests of full disclosure, the CG might like to note that this may be incompatible with the way some people have interpreted the 3.1 rules.
Incidentally, removing the tweak that makes Saxon able to parse 12 div-3 doesn't break any test cases.
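A toy context-independent tokenizer shows why the longest-match reading rejects 12 div-3. This Python sketch is a heavy simplification and not the A.3 algorithm: the three token classes and their patterns are assumptions chosen for the example (names are ASCII only), and because the classes start with disjoint characters, taking the first matching alternative is the same as taking the longest match.

```python
import re

# Simplified token classes (assumed for illustration, not the real grammar).
TOKEN_RE = re.compile(
    r"""
    (?P<number>\d+(?:\.\d+)?)             # integer or decimal literal
  | (?P<name>[A-Za-z_][A-Za-z0-9_.\-]*)   # NCName-like: may contain '-' and '.'
  | (?P<symbol>[-+*/()=<>])               # single-character operators
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Split text into (kind, value) pairs with no knowledge of syntax."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

With these patterns, `tokenize("12 div-3")` yields a number followed by the single name `div-3`, so no parser can see a `div` operator. `tokenize("12 div -3")` yields `12`, `div`, `-`, `3` as four tokens.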
QT4 CG meeting 084 draft minutes #minutes-07-02
Draft minutes published.
Issue #729 closed #closed-729
xsi:schemaLocation
Issue #1254 closed #closed-1254
729 Add rules for use of xsi:schemaLocation during validation
Issue #1161 closed #closed-1161
More changes to drop the requirement for document-uri() uniqueness
Issue #1265 closed #closed-1265
1161 Further revision of document-uri constraints
Issue #1289 closed #closed-1289
Delete XQuery Appendix J
Issue #1293 closed #closed-1293
1289 Delete XQuery Appendix J
Issue #1314 created #created-1314
Ambiguity in XPath EBNF - Lookup with TypeQualifier vs DynamicFunctionCall
In addition to https://github.com/qt4cg/qtspecs/issues/1050, a further ambiguity occurs in one of the deep lookup examples:
$tree ??$from ??type(record(to, distance))[?to=$to] ?distance
which can be simplified to
$tree ??type(foo)
where there is ambiguity between a LookupExpr with TypeQualifier and a DynamicFunctionCall on a function named type. That is, type should perhaps be one of the restrictions on function name to avoid this ambiguity.
Whether something more fundamental is needed on the productions around [74],[75] and [84]-[88] I'm not sure, but certainly type can appear either as a keyword for TypeQualifier (consuming the bracketed type) or a value of an NCName (with the bracketed section being a higher-level PositionalArgumentList), both being part of a KeySpecifier.
Pull request #1313 created #created-1313
1311 Tokenizing after <
Fix #1311
Defines precise rules for tokenization in the presence of direct element and PI constructors.
I chose to make the rules independent of the syntactic context.
Issue #1312 created #created-1312
Productions missing ws:explicit
Some lexical productions, including Digits, DecDigit, HexDigits, HexDigit, BinaryDigits, BinaryDigit, are missing the ws:explicit annotation.
Issue #1311 created #created-1311
Tokenization and element constructors
The new rules in Appendix A.3 on tokenization are, I believe, a great improvement on what went before. But I think there is one thing missing: they claim that the rules allow you to identify boundaries between tokens unambiguously independently of the syntactic context, but in the case of a token starting with <, this isn't true: to distinguish whether < represents a less-than operator (or <= operator) or whether it is the start of an element constructor, you need some context information.
Saxon's tokenization is still based on the principles outlined in the XPath 1.0 spec where tokens are disambiguated based on the immediately preceding and following tokens; this is becoming increasingly unviable. Most cases can be handled instead by moving the disambiguation into the parser rather than the tokenizer, but this relies on being able to find token boundaries without knowledge of context (as described in the 4.0 spec), which appears to be possible in all cases except <.
Essentially we need to add an exception to the rule: "If the current position is not the end of the input, then return the longest literal terminal or variable terminal that can be matched starting at the current position..."
I think the exception might be formulated as follows:
In XQuery, when the next character is < and this is immediately followed by an NCNameStart character (for example X) the next token could be either a less than operator, or a DirElemConstructor. The "longest terminal" rule cannot reliably distinguish these cases. Instead, the decision must take into account the syntactic context. A DirElemConstructor can only appear where the parser is expecting to read an expression, while the less-than operator can never appear where the parser is expecting an expression. This aspect of the syntactic context therefore needs to be communicated from the parser to the tokenizer.
Alternatively, the two cases might be distinguished by backtracking. The tokenizer could attempt to interpret the text following the < character as a DirElemConstructor, and revert to the alternative interpretation if this fails.
Note: this was not explained clearly in 3.1. Perhaps it was covered by the quixotic phrase "the longest token consistent with the EBNF".
Issue #1310 created #created-1310
add fn:match-groups() function
Issue #1309 created #created-1309
Dangling else syntax ambiguity
I think this is ambiguous, or at any rate, involves arbitrary lookahead:
if (a = b) then if (c = d) {23} else {}
You only get a successful parse if you associate the "else" with the first "if", but you can't do that until you know that there isn't going to be a second "else".
Again, it's the fact that an expression can now begin with a left curly brace that's the culprit.
Issue #1308 created #created-1308
fn:apply argument names
The narrative of fn:apply repeatedly refers to $array where $arguments is intended.
Issue #1303 closed #closed-1303
Recognize 'fn' as well as 'function' in signatures
Issue #1307 created #created-1307
For symmetry, add functions array:scan-left and array:scan-right
The fold-left and fold-right functions are defined both for sequences and arrays; symmetry demands that the same should apply to scan-left and scan-right.
Issue #1281 closed #closed-1281
invisible-xml() return type
Issue #1294 closed #closed-1294
46 Add xsl:item and xsl:sequence/@as
Pull request #1306 created #created-1306
46 Add @as attribute to xsl:sequence
Revised PR that drops the proposed xsl:item instruction
Issue #1287 closed #closed-1287
Missing error conditions for fn:parse-xml()
Issue #1288 closed #closed-1288
1287 Define parse-xml error conditions
Issue #1298 closed #closed-1298
In change markup, handle multiple issue or PR numbers
Issue #1305 created #created-1305
Almost all functions in FO that must process multiple string items, can have as a parameter only a single collation
The problem:
At present the only XPath 4 function (that I am aware of) that can process multiple strings and use multiple collations (a specific collation for a specific string comparison) is fn:sort.
Some very important functions, such as fn:deep-equal and fn:compare, can have only one collation as a parameter.
This means that when we are comparing sequences of items that contain multiple strings, each of which may need to be handled in a specific collation, we cannot provide all such collations to the comparing function (fn:deep-equal or fn:compare); we can provide only a single collation.
The end result is that all string comparisons will be done using that single collation and may not produce the correct result (the one that would be produced if each particular comparison was done with its particular collation).
Possible solutions.
It is difficult to provide a solution to this problem and the list below is open ended:
- Add a collation property to the type xs:string. Then we would specify the type as (xs:string, collation-name?).
- Make fn:deep-equal and fn:compare accept not a single collation but a sequence of collation names. In this case a pair of strings will be compared once for every collation that is specified. The idea is that the sequence of collations would be provided ordered by decreasing specificity. The first result that is produced at least twice in this process (something like voting) would be the result of the comparison. In case of a tie, the comparison done with the collation that is earliest (supposed to be more specific) will have higher priority.
- Leave this as it is at present, but add to the specification a warning to the user that specifying a single collation name may not be what they want.
- Remove from fn:sort the multiple-collations parameter and allow only a single collation.
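One possible reading of the voting idea in the second option can be sketched in Python. This is only an illustration under stated assumptions: collations are modelled as key functions, the name compare_by_vote is hypothetical, and the tie rule is interpreted as "if no outcome occurs twice, the earliest collation decides".

```python
def compare_by_vote(a, b, collations):
    """Compare two strings under several collations, most specific first.

    Each collation is modelled as a key function (an assumption made for
    this sketch). Returns -1, 0 or 1.
    """
    if not collations:
        raise ValueError("at least one collation is required")
    counts = {}
    outcomes = []
    for key in collations:
        ka, kb = key(a), key(b)
        outcome = (ka > kb) - (ka < kb)
        outcomes.append(outcome)
        counts[outcome] = counts.get(outcome, 0) + 1
        # The first outcome that has been produced twice wins the vote.
        if counts[outcome] == 2:
            return outcome
    # No outcome occurred twice: fall back to the earliest collation.
    return outcomes[0]

# Codepoint order says "a" > "A"; the two case-insensitive keys say equal.
voted = compare_by_vote("a", "A", [str, str.lower, str.casefold])
fallback = compare_by_vote("a", "A", [str, str.lower])
```

With three collations the two case-insensitive ones outvote codepoint order; with only two, the earlier one decides.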
QT4 CG meeting 083 draft minutes #minutes-06-25
Draft minutes published.
Issue #1096 closed #closed-1096
Effect of atomization on array:index-of()
Issue #1295 closed #closed-1295
1096 Redefine array:index-of to use deep-equal for comparisons
Issue #1285 closed #closed-1285
Appendix H of F&O should mention change for unrecognised option parameters
Issue #1286 closed #closed-1286
Updated list of incompatibilities in F+O
Issue #1291 closed #closed-1291
Change obsolete notes on rounding
Issue #1292 closed #closed-1292
Fix issue 1291 (rounding)
Issue #1253 closed #closed-1253
XSLT: add xsl:switch to list of instructions within which whitespace is ignored
Issue #1255 closed #closed-1255
1253 whitespace in xsl:switch
Issue #1282 closed #closed-1282
Revise fn:invisible-xml
Issue #1304 closed #closed-1304
Fixed typo in the example of scan-right
Issue #1290 closed #closed-1290
Fix keyword tests to treat "fn" = "function"
Issue #1302 closed #closed-1302
Stylesheet fix for 1298
Pull request #1304 created #created-1304
Fixed typo in the example of scan-right
Fixed a minor typo in one of the examples for fn:scan-right
Issue #1297 closed #closed-1297
Minor correction to fn:scan-right - typo
Pull request #1303 created #created-1303
Recognize 'fn' as well as 'function' in signatures
Changes to the stylesheet that generates tests for all parameter keywords in system functions. The changes recognize fn( in function signatures as indicating that the generated test case needs to supply a function as the argument value.
Pull request #1302 created #created-1302
Stylesheet fix for 1298
Changes stylesheet to handle case where a change log entry refers to multiple issues or PRs. See "Lookup Expressions" in XQuery for an example.
Issue #1301 closed #closed-1301
1298 links to multiple issues
Pull request #1301 created #created-1301
1298 links to multiple issues
Changes stylesheet so that where a change entry refers to multiple issues or PRs, the links to each one are rendered correctly. (For an example, see "Lookup Expressions" in XQuery.)
QT4 CG meeting 083 draft agenda #agenda-06-25
Draft agenda published.
Issue #1300 closed #closed-1300
Commit the updated tests to the tests repository
Pull request #1300 created #created-1300
Commit the updated tests to the tests repository
Yes, it looks like that worked.
Issue #1299 closed #closed-1299
Attempt to rework how the test repository is updated
Pull request #1299 created #created-1299
Attempt to rework how the test repository is updated
This test will only work if I merge it, so ...
Issue #1298 created #created-1298
In change markup, handle multiple issue or PR numbers
In change markup we sometimes use multiple issue or PR numbers, for example
<change issue="123 456" PR="789 799">
but the stylesheet is not currently recognizing this and generating multiple links.
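The intended behaviour can be sketched outside the stylesheet. The Python fragment below is only an illustration, not the project's stylesheet code: it splits the whitespace-separated issue and PR attributes and produces one link per number. The URL templates and the function name change_links are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical URL templates, assumed for illustration only.
ISSUE_URL = "https://github.com/qt4cg/qtspecs/issues/{}"
PR_URL = "https://github.com/qt4cg/qtspecs/pull/{}"

def change_links(change_xml):
    """Return one link per issue or PR number named on a <change> element."""
    change = ET.fromstring(change_xml)
    links = []
    # str.split() with no argument splits on any run of whitespace, so
    # "123 456" yields two numbers and a missing attribute yields none.
    for number in change.get("issue", "").split():
        links.append(ISSUE_URL.format(number))
    for number in change.get("PR", "").split():
        links.append(PR_URL.format(number))
    return links

links = change_links('<change issue="123 456" PR="789 799"/>')
```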
Pull request #1297 created #created-1297
Minor correction to fn:scan-right - typo
In the first example the expression was:
scan-right(1 to 5, 0, op('+'))
but must be:
scan-right(1 to 10, 0, op('+'))
Pull request #1296 created #created-1296
982 Rewrite of scan-left and scan-right
Fix #982
-
The "equivalent expression" is replaced with one that is much shorter and hopefully easier to understand, though hopelessly inefficient as an actual implementation.
-
The result no longer includes the zero value. This seems simpler, and is consistent with other expositions I have read, e.g. of the Scala functions.
-
The signature of scan-left and scan-right is now identical to fold-left and fold-right, which apart from having the virtue of consistency, makes it much easier to specify one in terms of the other. The change is that the callback function now allows a position argument.
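The behaviour described in these three points can be modelled with a short Python sketch. It is not the spec's equivalent expression: the function names are illustrative, Python lists stand in for XPath sequences, and the ordering of the scan_right result (leftmost position first) is an assumption.

```python
def scan_left(items, zero, action):
    """Successive fold-left results, excluding the initial zero value."""
    results = []
    acc = zero
    # Positions are 1-based, following the XPath convention.
    for position, item in enumerate(items, start=1):
        acc = action(acc, item, position)
        results.append(acc)
    return results

def scan_right(items, zero, action):
    """Successive fold-right results, excluding the initial zero value."""
    results = []
    acc = zero
    for position in range(len(items), 0, -1):
        acc = action(items[position - 1], acc, position)
        results.append(acc)
    # Results were collected right to left; report them left to right.
    results.reverse()
    return results

left = scan_left([1, 2, 3, 4, 5], 0, lambda acc, item, pos: acc + item)
right = scan_right([1, 2, 3, 4, 5], 0, lambda item, acc, pos: item + acc)
```

The last entry of left and the first entry of right are both the full fold, which is what makes it easy to specify a scan in terms of the corresponding fold.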
Issue #1048 closed #closed-1048
fn:format-number: relax restrictions on exponent-separator (possibly minus-sign, percent, per-mille)
Pull request #1295 created #created-1295
1096 Redefine array:index-of to use deep-equal for comparisons
Fix #1096
Using deep-equal for comparisons seems a reasonable default that avoids the atomization problem.
Note, I would personally be quite happy to drop the function as an alternative resolution.
Pull request #1294 created #created-1294
46 Add xsl:item and xsl:sequence/@as
Fix #46
Fix #1272
Pull request #1293 created #created-1293
1289 Delete XQuery Appendix J
Fix #1289
Pull request #1292 created #created-1292
Fix issue 1291 (rounding)
Delete an obsolete note.
Make the spec consistent with regard to the keyword "to-even" vs "half-to-even".
Fix #1291
Issue #1291 created #created-1291
Change obsolete notes on rounding
There are a couple of non-normative notes on rounding that don't reflect the latest changes to the spec.
In addition, the specification has been left inconsistent as to whether the option for rounding to even is written as "to-even" or "half-to-even". The latter spelling was intended.
Issue #1245 closed #closed-1245
fn:format-dateTime: Properties
Issue #1264 closed #closed-1264
1245 Correct properties of format-DT function family
Issue #1236 closed #closed-1236
QT4CG-078-01 fn:unparsed-text-lines, normalize newlines
Pull request #1290 created #created-1290
Fix keyword tests to treat "fn" = "function"
When generating keyword tests for higher-order functions, the parameter type now generally uses "fn" rather than "function", which causes the stylesheet to generate incorrect tests.
Issue #1289 created #created-1289
Delete XQuery Appendix J
Appendix J contains examples of XQuery applications. It already acknowledges that some of these could significantly benefit from using new features in versions 3.0 and 3.1. I think it has now served its purpose and it's time to remove it.
Pull request #1288 created #created-1288
1287 Define parse-xml error conditions
Fix #1287
Issue #1287 created #created-1287
Missing error conditions for fn:parse-xml()
I propose to define:
FODC0007 - DTD or schema validation was requested in parse-xml(), and the XML was found to be invalid.
FODC0008 - Invalid value for the schema-validation option of parse-xml() (for example, "Type XXX" where XXX is an invalid QName or a QName that does not refer to a known type).
QT4 CG meeting 082 draft minutes #minutes-06-18
Draft minutes published.
Issue #1274 closed #closed-1274
Further refinement of fn:round()
Issue #1275 closed #closed-1275
1274 Further rounding modes
Issue #1268 closed #closed-1268
QT4CG-077-03 Add note on document order across documents
Issue #1270 closed #closed-1270
QT4CG-081-01 Add cross reference from fn:round-half-to-even
Issue #1276 closed #closed-1276
QT4CG-081-03 parse-xml-[fragment]: $options should be optional
Issue #1278 closed #closed-1278
Line endings in unparsed-text-lines()
Issue #1279 closed #closed-1279
1278 - line endings in unparsed-text-lines
Issue #1267 closed #closed-1267
fn:apply() - contradiction in spec
Issue #1280 closed #closed-1280
1267 fn:apply contradictions
Pull request #1286 created #created-1286
Updated list of incompatibilities in F+O
Added an incompatibility regarding unrecognized option values.
Removed an incompatibility regarding normalisation of line endings in unparsed-text().
Fix #1285
Issue #1285 created #created-1285
Appendix H of F&O should mention change for unrecognised option parameters
We have made a change to the option parameter conventions so that unrecognised options are no longer ignored; they are now rejected. We should document this in Appendix H as a backwards incompatibility.
Issue #1284 created #created-1284
Build issue: Unsupported specref to [streamability-fn-distinct-ordered-nodes]
I think the problem is that there's no "streamability of fn:distinct-ordered-nodes" section. The expanded versions of the XSLT 4.0 specification all contain generated links to such a section, but the section itself does not exist.
Editorial oversight, or should that function not generate the specref for some reason?
Pull request #1283 created #created-1283
77b Update expressions
This PR is the result of splitting PR #832 into two parts; this part extracts update expressions into a separate proposal, for ease of review.
Pull request #1282 created #created-1282
Revise fn:invisible-xml
- Resolve QT4CG-080-04: NW to revise p:invisible-xml, fix #991
- Resolve QT4CG-081-04: NW to update the function signature of fn:invisible-xml
- Resolve QT4CG-081-04: NW to describe why the grammar option can be empty on fn:invisible-xml
Issue #1281 created #created-1281
invisible-xml() return type
It has been pointed out that the return type of fn:invisible-xml(), currently fn(xs:string) as item(), could be more precisely given as fn(xs:string) as document-node().
Pull request #1280 created #created-1280
1267 fn:apply contradictions
Fix #1267
Pull request #1279 created #created-1279
1278 - line endings in unparsed-text-lines
Fix #1278
Issue #1278 created #created-1278
Line endings in unparsed-text-lines()
The status quo text says:
The $options argument is interpreted in the same way as the $options argument of [fn:unparsed-text]. In particular, for backwards compatibility, the supplied argument may be either a string (the name of an encoding) or a map.
If the normalize-newlines option is set to true, then the single character U+000A (NEWLINE) , the single character U+000D (CARRIAGE RETURN) , and the character pair (U+000D (CARRIAGE RETURN) , U+000A (NEWLINE) ) are all recognized as line delimiters for the purpose of splitting the text into lines. If the option is set to false, then only the single character U+000A (NEWLINE) is recognized.
The result of the function is the same as the result of the expression:
(fn:unparsed-text($href, map:put($options, 'normalize-newlines', true())) => fn:tokenize('\n')) [not(position()=last() and .='')]
There's clearly an inconsistency here. In unparsed-text, the default for normalize-newlines is false. But the equivalent expression ignores the supplied value and uses normalize-newlines=true. The second paragraph says that setting normalize-newlines to false means that only NL is recognized, but the equivalent expression contradicts this.
I think the answer is to disallow the normalize-newlines option entirely, and recognize all three line endings as in 3.1.
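The proposed resolution (always recognize all three line endings, and drop a final empty line) can be modelled in a few lines of Python. This is a sketch of the intended semantics only; the function name text_lines is illustrative and the input is assumed to be already decoded text.

```python
import re

def text_lines(text):
    """Split text into lines, treating CRLF, CR and LF as line delimiters."""
    # "\r\n" is listed first so that a CRLF pair counts as one delimiter.
    lines = re.split(r"\r\n|\r|\n", text)
    # Mirrors the predicate [not(position()=last() and .='')]:
    # only a trailing empty string is discarded.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

mixed = text_lines("a\r\nb\rc\n")
```

Interior empty lines are preserved; only the empty string left after a final delimiter is removed.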
Issue #1277 created #created-1277
Declare named record types
Raised in response to action QT4CG-063-06.
Currently named item types are simply an alias - except when they declare a record type, in which case they have some magic properties by allowing the record type to be recursive, and by the fact that they implicitly create constructor functions.
We would also like to refine these constructor functions, for example by allowing default values for fields to be defined.
It might therefore make sense to define a separate construct for declaring named record types (in both XQuery and XSLT), and perhaps putting these in a separate category in the static context.
We might also consider introducing some built-in named record types, for example for key-value pairs, so that users can conveniently construct instances of these record types without explicitly declaring them. These would presumably be in the fn namespace.
Pull request #1276 created #created-1276
QT4CG-081-03 parse-xml-[fragment]: $options should be optional
Changes the function signature of parse-xml and parse-xml-fragment so the $options argument can be set to an empty sequence.
Pull request #1275 created #created-1275
1274 Further rounding modes
Fix #1274
Issue #1274 created #created-1274
Further refinement of fn:round()
I've been adding tests and an implementation of the changes to fn:round() which now allow control of midpoint rounding, and this generates some thoughts.
Firstly, we're using the rounding modes "floor" and "ceiling" with a different meaning from Java class RoundingMode, which may confuse some users. In our spec, these only affect handling of midpoint values, whereas in Java they affect all values, for example rounding 1.7 with rounding mode "floor" gives 1.0.
Secondly, the function library only offers fn:floor() and fn:ceiling() to an integer. There's no way, for example, of rounding 1.9997 to 1.999 (which happens to be what we do in format-time()).
So I propose that we extend the set of rounding modes to:
- floor - towards negative infinity
- ceiling - towards positive infinity
- toward-zero - towards zero (i.e. truncate)
- away-from-zero - away from zero
- half-to-floor - to nearest, or floor if at midpoint
- half-to-ceiling - to nearest, or ceiling if at midpoint
- half-toward-zero - to nearest, or toward zero if at midpoint
- half-away-from-zero - to nearest, or away from zero if at midpoint
- half-to-even - to nearest, or to even if at midpoint
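The nine proposed modes can be tried out with Python's decimal module. The sketch below is an illustration, not part of the proposal: the function name round_with_mode and the mapping from mode names to decimal rounding constants are assumptions, and the two modes with no direct counterpart (half-to-floor, half-to-ceiling) are derived from the sign of the value.

```python
from decimal import (Decimal, ROUND_FLOOR, ROUND_CEILING, ROUND_DOWN,
                     ROUND_UP, ROUND_HALF_DOWN, ROUND_HALF_UP,
                     ROUND_HALF_EVEN)

# Modes that correspond directly to a rounding constant in the decimal module.
_DIRECT = {
    "floor": ROUND_FLOOR,
    "ceiling": ROUND_CEILING,
    "toward-zero": ROUND_DOWN,
    "away-from-zero": ROUND_UP,
    "half-toward-zero": ROUND_HALF_DOWN,
    "half-away-from-zero": ROUND_HALF_UP,
    "half-to-even": ROUND_HALF_EVEN,
}

def round_with_mode(value, precision, mode):
    """Round a decimal string to `precision` fractional digits."""
    d = Decimal(value)
    quantum = Decimal(1).scaleb(-precision)  # e.g. precision 3 gives 0.001
    if mode in _DIRECT:
        return d.quantize(quantum, rounding=_DIRECT[mode])
    if mode == "half-to-floor":
        # At a midpoint, floor means toward zero for positive values
        # and away from zero for negative values.
        rounding = ROUND_HALF_DOWN if d >= 0 else ROUND_HALF_UP
        return d.quantize(quantum, rounding=rounding)
    if mode == "half-to-ceiling":
        rounding = ROUND_HALF_UP if d >= 0 else ROUND_HALF_DOWN
        return d.quantize(quantum, rounding=rounding)
    raise ValueError(f"unknown rounding mode: {mode}")

truncated = round_with_mode("1.9997", 3, "floor")
```

Values are passed as strings so that they are read as exact decimals rather than binary floating-point approximations.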
Issue #1273 created #created-1273
Generalize for-each-pair to work with any number of input sequences
Inspired by https://stackoverflow.com/questions/78614003
fn:for-each-pair is great when you want to select corresponding items from two input sequences. But what if there are more than two?
We could provide
for-corresponding-items($input-sequences as array(item()*), $action as function(item()*))
For example
for-corresponding-items([(1,2,3), ("a","b","c"), (true(), false(), true())],
   fn{ array{.} }
)
returns [1, "a", true()], [2, "b", false()], [3, "c", true()]
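For comparison, the same idea in Python is a thin wrapper around zip. This is a sketch of the proposed semantics only; for_corresponding_items is a hypothetical name, and stopping at the shortest input is an assumption carried over from how fn:for-each-pair treats inputs of unequal length.

```python
def for_corresponding_items(input_sequences, action):
    """Call action once per position with the items found at that position."""
    # zip stops at the shortest input sequence (assumed behaviour).
    return [action(items) for items in zip(*input_sequences)]

rows = for_corresponding_items(
    [(1, 2, 3), ("a", "b", "c"), (True, False, True)],
    lambda items: list(items),  # plays the role of fn{ array{.} }
)
```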
Issue #1272 created #created-1272
Add xsl:value-of/@as attribute
It has been suggested that we should add an @as attribute to xsl:value-of.
The intent is to use this when returning a function result:
<xsl:function name="f:incr">
  <xsl:param name="x" as="xs:integer"/>
  <xsl:value-of select="$x+1" as="xs:integer"/>
</xsl:function>
In the absence of the @as attribute, the instruction constructs a text node as now. If @as is added, the effect of the instruction is to evaluate the select expression and coerce the result to the specified type.
There are a number of questions of detail. What do we do about the @separator and @disable-output-escaping attributes? Do we allow as="text()" so that there is an explicit way of getting the default behavior?
More generally, will this actually make users' lives easier? It might read better than xsl:sequence in this situation, but it isn't any more discoverable. Users who write code by copy-and-paste will still write xsl:value-of without realising the significance of the @as attribute.
Issue #1271 created #created-1271
Schema validation in XPath
In PR #1257 we added an options parameter to parse-xml() which includes the ability to request schema validation. We should make it clear what happens if this option is used in a non-schema-aware processor.
XPath does not have the XQuery validate expression but it can now request validation by calling serialize() => parse-xml(). This doesn't seem very satisfactory - either XPath shouldn't have a validation capability, or it should do it properly.
This will also affect other functions such as doc() if we add options for schema validation.
Pull request #1270 created #created-1270
QT4CG-081-01 Add cross reference from fn:round-half-to-even
Issue #1248 closed #closed-1248
for member allowing empty
Issue #1241 closed #closed-1241
Node constructor vs. otherwise/map constructor
Issue #1259 closed #closed-1259
1241 Add constraint to resolve node constructor ambiguity
Issue #1246 closed #closed-1246
fn:json-to-xml: `number-parser` option
Issue #1258 closed #closed-1258
1246 Revert incompatibility in json-to-xml number formatting
Issue #305 closed #closed-305
parse-xml() and whitespace stripping
Issue #1257 closed #closed-1257
305 Add options parameter for parse-xml and parse-xml-fragment
Issue #1187 closed #closed-1187
Decimal rounding
Issue #1260 closed #closed-1260
1187 Add midpoint-rounding option to fn:round()
Issue #991 closed #closed-991
Invisible-xml - missing details
Issue #1256 closed #closed-1256
991 Fix editorial details in fn:invisible-xml
Issue #1250 closed #closed-1250
1048 Extended decimal format properties
Issue #1249 closed #closed-1249
31 Introduce "for key $k value $v in $map"
Issue #1119 closed #closed-1119
Declare namespace bindings in XPath
Issue #1055 closed #closed-1055
xsl:variable/@as - simplifying the language - attempt 2
Issue #955 closed #closed-955
Options parameters as record types
Issue #954 closed #closed-954
Establish a default value for the XSLT fixed-namespaces attribute
Issue #745 closed #closed-745
Support for inline (anonymous) xslt functions
Issue #379 closed #closed-379
Namespace handling in parse-html
Issue #1181 closed #closed-1181
296 Allow default-namespace=##any
Issue #266 closed #closed-266
Add an option on xsl:copy-of to copy a subtree with a change of namespace
Issue #168 closed #closed-168
XSLT Extension Instructions invoking Named Templates
Issue #111 closed #closed-111
FLWOR tracing
Issue #1013 closed #closed-1013
[XSLT] Need to say what happens when a capturing accumulator rule matches a non-element node
Issue #1015 closed #closed-1015
1013 [XSLT] Clarify effect of accumulator capture on non-element nodes
Issue #956 closed #closed-956
850-partial Editorial improvements to parse-html()
Issue #920 closed #closed-920
The rules for the "tail position" of a sequence constructor need to take account of xsl:switch
Issue #921 closed #closed-921
920 Allow xsl:break and xsl:next-iteration within branch of xsl:switch
Issue #1269 created #created-1269
Could the labeling of grammar productions be improved?
There was some discussion at meeting 081 about whether the labeling of grammar productions could be improved. The current numbering isn't stable, but may have useful implications. The productions appear in snippets, sometimes more than once in different places.
Could we do better? How?
Pull request #1268 created #created-1268
QT4CG-077-03 Add note on document order across documents
Issue #1267 created #created-1267
fn:apply() - contradiction in spec
The spec says:
The arity of the supplied function $function must be the same as the size of the array $array.
The effect of calling fn:apply($f, [$a, $b, $c, ...]) is the same as the effect of the dynamic function call $f($a, $b, $c, ....). For example, the function conversion rules are applied to the supplied arguments in the usual way.
These two rules appear contradictory. If the function conversion rules (should be: coercion rules) are applied in the usual way, then it is possible to supply excess arguments, which will be ignored.
If excess arguments can be supplied then the example apply($f, array:subarray([ "a", "b", "c", "d", "e", "f" ], 1, function-arity($f))) becomes meaningless, since the same effect can be achieved with apply($f, [ "a", "b", "c", "d", "e", "f" ]).
(Also, we could now write apply($f, [ "a", "b", "c", "d", "e", "f" ]?[1 to function-arity($f)]))
Pull request #1266 created #created-1266
1158 Add array mapping operator
Fix #1158
Pull request #1265 created #created-1265
1161 Further revision of document-uri constraints
Fix #1161
Pull request #1264 created #created-1264
1245 Correct properties of format-DT function family
Fix #1245
Note: since any of the last three arguments can now be present but set to (), the relevant context dependency becomes independent of arity, so the rules can be simplified.
Pull request #1263 created #created-1263
1224 Add xsl:accumulator-rule/@priority attribute
Fix #1224
Pull request #1262 created #created-1262
1160 Add collation-available() function
New function collation-available()
Existing function fn:collation() no longer fails if the constructed collation URI is unavailable.
QT4 CG meeting 081 draft agenda #agenda-06-11
Draft agenda published.
Issue #1261 created #created-1261
Add decimal-divide function
Introduce a decimal-divide function that:
(a) defines the precision of the required result, perhaps with rounding options
(b) returns both the quotient and the remainder in a single operation (as a map/record)
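A rough Python model of such a function is shown below. Everything in it is an assumption made for illustration: the name decimal_divide, the precision parameter counting fractional digits, the field names of the result, and the choice to truncate the quotient toward zero instead of offering rounding options.

```python
from decimal import Decimal

def decimal_divide(dividend, divisor, precision=0):
    """Return quotient (truncated to `precision` digits) and remainder."""
    a, b = Decimal(dividend), Decimal(divisor)
    # Scale the dividend up, divide with integer division (which truncates
    # toward zero), then scale the quotient back down.
    scaled = a.scaleb(precision) // b
    quotient = scaled.scaleb(-precision)
    # The remainder is defined so that quotient * divisor + remainder
    # equals the dividend.
    remainder = a - quotient * b
    return {"quotient": quotient, "remainder": remainder}

result = decimal_divide("10", "3", precision=2)
```

Returning both values together lets a caller check the identity quotient * divisor + remainder = dividend without a second division.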
Pull request #1260 created #created-1260
1187 Add midpoint-rounding option to fn:round()
Fix #1187
Pull request #1259 created #created-1259
1241 Add constraint to resolve node constructor ambiguity
Fix #1241
Pull request #1258 created #created-1258
1246 Revert incompatibility in json-to-xml number formatting
Fix #1246
Pull request #1257 created #created-1257
305 Add options parameter for parse-xml and parse-xml-fragment
Fix #305
Pull request #1256 created #created-1256
991 Fix editorial details in fn:invisible-xml
Fix #991
Pull request #1255 created #created-1255
1253 whitespace in xsl:switch
Fix #1253
Pull request #1254 created #created-1254
729 Add rules for use of xsi:schemaLocation during validation
Fix #729
Issue #1253 created #created-1253
XSLT: add xsl:switch to list of instructions within which whitespace is ignored
Add xsl:switch to the list in 3.13.1 rule 5.
QT4 CG meeting 080 draft minutes #minutes-06-04
Draft minutes published.
Issue #1252 created #created-1252
Add a new function `fn:html-doc`
Motivation
The current specification has functions to retrieve and parse XML (fn:doc#1) and JSON (fn:json-doc#1), but not HTML.
For convenience and consistency, I propose that fn:html-doc($href as xs:string?) as document-node()? be added to the spec.
Justification
Parsing an html document after retrieving it from a source is a very common use-case.
@cedporter raised the question of whether we also want fn:csv-doc#1, or whether, instead of adding a specialised function to retrieve and parse a document for each data format, we should extend fn:doc with an option to parse the retrieved document as being in a certain format (XML, JSON, HTML or CSV).
Issue #871 closed #closed-871
Action qt4 cg 027 01 next match
Issue #1216 closed #closed-1216
Detailed comments on math:e, sinh(), cosh(), tanh()
Issue #1230 closed #closed-1230
1216 Detailed comments on math:e, sinh(), cosh(), tanh()
Issue #1233 closed #closed-1233
517 Major edits to fn:chain, clarification only
Issue #814 closed #closed-814
XSLT: Rules for on-no-match="shallow-copy-all"
Issue #774 closed #closed-774
What should be percent-encoded in a URI?
Issue #1251 created #created-1251
Allow sequence constructor in extension instructions that are implemented with named templates
This is a follow-up to #168.
I would like to extend https://qt4cg.org/specifications/xslt-40/Overview.html#invoking-templates-with-extension-instructions and allow a sequence constructor inside an extension instruction. I'm using such instructions in my code and it would be nice to be able to rewrite them as pure XSLT 4.0 code. The sequence constructor could be mapped to a predefined parameter name. E.g.
<t:_>Hello world.</t:_>
would be translated to
<xsl:call-template name="t:_">
  <xsl:with-param name="xsl:input">Hello world.</xsl:with-param>
</xsl:call-template>
Pull request #1250 created #created-1250
1048 Extended decimal format properties
Pull request #1249 created #created-1249
31 Introduce "for key $k value $v in $map"
Implements ForClause / ForExpression iterating over entries in maps.
Note that the feature is described differently in XP and XQ - I have made some starter attempts to reconverge the two specs, but there is more to be done. (However, the change involved reordering sections, which will adversely affect the diff version).
Issue #1248 created #created-1248
for member allowing empty
The XQuery spec says:
The allowing empty option is available only when processing sequences, not when processing arrays.
It also says "This option is not available with 'for member'".
But then it gives an example:
for member $x allowing empty in []
Logically, on grounds of orthogonality, I think we should allow it. (However, there is a justification for NOT allowing it, namely that there's no "null" value to which we can bind the range variable in this case).
QT4 CG meeting 080 draft agenda #agenda-06-04
Draft agenda published.
Issue #1247 created #created-1247
`??type(T)` in lookup expressions - shortcuts
We've introduced the syntax ??type(T) in lookup expressions to allow selection of items of a particular type.
The most common usages for T are to select a record type or an array type. It would be useful to provide shortcut syntax for such cases: ??record(longitude, latitude, *) as a shortcut for ??type(record(longitude, latitude, *)), and ??array(xs:integer) as a shortcut for ??type(array(xs:integer)).
Note this aligns with syntax for XSLT pattern matching where for most item types (with the notable exception of atomic types) the type() wrapper can be omitted.
Issue #1246 created #created-1246
fn:json-to-xml: `number-parser` option
With #973, the number-parser option was added to fn:json-to-xml. It has been reported back to us that the current definition introduces a backward incompatibility:
json-to-xml('1234567')
…now returns:
<number xmlns="http://www.w3.org/2005/xpath-functions">1.234567E6</number>
Before, we got:
<number xmlns="http://www.w3.org/2005/xpath-functions">1234567</number>
I’m convinced that the number-parser option is an important addition for fn:json-to-xml – I’ve already seen its application in practice – but I think we’ll need to change the default, which is currently xs:double#1.
It seems sufficient to me to change xs:double#1 to identity#1.
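The effect of the default can be illustrated by analogy with Python's json module, whose parse_int hook plays a role similar to number-parser. This is only an analogy, not XPath behaviour: Python prints the double differently from the xs:double example above, but the point is the same, namely that converting to a double changes the lexical form while an identity-like parser preserves it.

```python
import json

# Parsing every JSON integer as a double changes how it is written back out.
as_double = json.loads("1234567", parse_int=float)

# An identity-like hook (str receives the original lexical form) preserves it.
as_written = json.loads("1234567", parse_int=str)

double_text = str(as_double)
```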
As a side note, 17.4.2 XML Representation of JSON still states that:
The fn:json-to-xml function creates an element whose string value is lexically the same as the JSON representation of the number.
…which means that the semantics of number-parser have not been incorporated in this section yet (my fault).
QT4 CG meeting 079 draft minutes #minutes-05-28
Draft minutes published.
Issue #1245 created #created-1245
fn:format-dateTime: Properties
The Properties of fn:format-dateTime say:
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on default calendar, and default language, and default place, and implicit timezone. The five-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on implicit timezone, and namespaces.
They fail to mention the new three- and four-argument forms.
The same applies to fn:format-date and fn:format-time.
Issue #1108 closed #closed-1108
566-partial Describe a less aggressive %-encoding for fn:build-uri
Issue #1232 closed #closed-1232
Rendition of RFC2119 terms
Issue #1237 closed #closed-1237
1232 consistent rendition of rfc2119 terms
Issue #894 closed #closed-894
Errors in forming function items
Issue #908 closed #closed-908
Function identity: documentation, nondeterminism
Issue #1000 closed #closed-1000
XQFO Code in the Rules sections
Issue #1242 closed #closed-1242
XSLT: system-property('xsl:xpath-version')
Issue #1243 closed #closed-1243
Change required result of system-property(...version)
Issue #1098 closed #closed-1098
566-partial Editorial improvements for parse-uri
Pull request #1244 created #created-1244
566-partial Rewrite parse-uri
What happened was, I discovered I'd messed with the code in two different branches.
In the course of trying to straighten that out, I came to the conclusion that the real difficulty in parsing URIs is what to do about all the special cases around file:. And that the approach I'd taken in the previous draft was unnecessarily complicated.
I think this is better.
QT4 CG meeting 079 draft agenda #agenda-05-28
Draft agenda published.
Pull request #1243 created #created-1243
Change required result of system-property(...version)
Fix #1242
Issue #1242 created #created-1242
XSLT: system-property('xsl:xpath-version')
In XSLT 4.0 it should be required that system-property('xsl:xpath-version') returns 4.0.
Issue #1241 created #created-1241
Node constructor vs. otherwise/map constructor
I wonder whether we should care that the following expression could be interpreted in two ways:
<🤔/> ! (element otherwise {})
The result could either be <otherwise/> or an empty map.
If we say it’s an edge case that we can ignore, it would be fine for me.
Before otherwise was introduced, we used ?: in our implementation, analogous to Kotlin’s Elvis operator. JavaScript now has ?? (which is no option for us).
Issue #1240 created #created-1240
$sequence-of-maps ? info()
We're increasingly using the design pattern where maps contain entries that are function items. If $map is a map and it has an entry info that is a zero-arity function, then $map ? info() invokes the function. This looks appealingly like a method call applying the method info() to the object $map, but that's not what is going on underneath. What really happens is that we evaluate ($map ? info), which yields a function item, and then we dynamically call this function.
Now what if $maps is a sequence of maps, each of which has an info field? This parses as ($maps ? info)(). $maps ? info returns a sequence of function items, and a dynamic function call can't be applied to a sequence of function items. Instead you have to write ($maps ? info)!.() which feels fairly bizarre.
Should we allow the LHS of a dynamic function call to be a sequence? On the whole, I don't tend to like operations to do implicit mapping over one of the arguments, but I feel like this might warrant an exception. The justification is that a dynamic function call is a postfix expression, and all the other postfix expressions accept a sequence on the LHS. Thoughts please.
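The two-step evaluation described here can be mimicked in Python, with dictionaries standing in for maps and a list standing in for the sequence. It is only an analogy to make the steps visible; none of the names come from the spec.

```python
maps = [
    {"info": lambda: "first"},
    {"info": lambda: "second"},
]

# Step 1: the lookup ($maps ? info) yields one function item per map.
functions = [m["info"] for m in maps]

# Step 2: each function has to be called separately, which is what
# ($maps ? info)!.() spells out in XPath. Allowing a sequence on the
# left-hand side of a dynamic call would fold this step into the call.
results = [f() for f in functions]
```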
Issue #1239 created #created-1239
XSLT xsl:next-match with select attribute
The ability of xsl:next-match to apply a sequence of template rules to the same item is limited by the fact that the item in question cannot be changed in any way. For example, if I want a template rule that matches an array to sort the array and then continue processing using the next rule for arrays, it's not possible; applying templates to the new array will sort it again, ad infinitum, while doing next-match is only possible on the original array, not the new sorted version.
I propose that xsl:next-match should have a select attribute. The effect is to process the selected items using the current mode, considering only the template rules that are lower in ranking order than the current template rule.
Alternatively or in addition, we could drop the rule that instructions that change the context item also clear the current mode. This would enable, for example, xsl:next-match to be used within xsl:for-each. However, it's possible that the results here could be confusing: we need to look at use cases.
Issue #1238 created #created-1238
XSLT on-no-match="shallow-copy-all" - revised rules
The work on deep lookup with modifiers enables an improved set of rules for processing trees of maps and arrays using a mode with on-no-match="shallow-copy-all"
.
Recall that the intent is that if the user writes no template rules at all in such a mode, the effect is to recursively copy the entire structure without change. But it should be as easy as possible for the user to add template rules to override this processing for a selected part of the structure.
With this in mind, the proposed built-in rules are as follows:
For an array with no additional information available, we split it up into array members in a way that makes it possible to override the processing for a specific array member:
<xsl:template match="array(*)">
<xsl:array use="?member">
<xsl:apply-templates select="for member $m at $pos in .
return {'array-member':true(), 'index': $pos, 'member': $m}"
mode="#current"/>
</xsl:array>
</xsl:template>
The field 'array-member' here is a dummy, provided simply to make it easier to match these records at the next level of processing.
For the array members, when represented in this way, the items in the array member are processed one-by-one to produce a new array member:
<xsl:template match="record(array-member as xs:boolean, index as xs:integer, member as item()*)">
<xsl:map-entry key="'member'">
<xsl:apply-templates select="for $item at $pos in ?member
return {'array-member-item':true(),
'index': ?index,
'member': ?member,
'item': $item,
'position': $pos }"
mode="#current"/>
</xsl:map-entry>
</xsl:template>
The new array members are delivered as singleton maps, in the form expected by the match="array(*)" template given above.
For the individual items within each array member, the default is simply to apply-templates to the item:
<xsl:template match="record(array-member-item as ..., index as ..., member as ..., item as item(), position as ...)">
<xsl:apply-templates select="?item" mode="#current"/>
</xsl:template>
Similarly for a map with no additional information, we reconstruct the map by applying templates to its individual entries:
<xsl:template match="map(*)">
<xsl:map on-duplicates="op(',')">
<xsl:apply-templates select="for entry ($k, $v) in .
return {'map-entry':true(), 'key': $k, 'value': $v}"
mode="#current"/>
</xsl:map>
</xsl:template>
The built in processing for a map entry represented in this way is to reconstruct the map entry by applying templates to its individual items:
<xsl:template match="record(map-entry as xs:boolean, key as xs:anyAtomicType, value as item()*)">
<xsl:map-entry key="?key">
<xsl:apply-templates select="for $item at $pos in ?value
return {'map-entry-item':true(),
'key': ?key,
'value': ?value,
'item': $item,
'position': $pos }"
mode="#current"/>
</xsl:map-entry>
</xsl:template>
For the individual items within each map entry, the default is simply to apply-templates to the item:
<xsl:template match="record(map-entry-item as ..., key as ..., value as ..., item as item(), position as ...)">
<xsl:apply-templates select="?item" mode="#current"/>
</xsl:template>
And of course the fallback processing for items not in the above list is to return them unchanged:
<xsl:template match="item()">
<xsl:sequence select="."/>
</xsl:template>
This allows user-written template rules to intervene at any of these levels, and to have access to contextual information about the item they are processing. For example to rename map entries with key "comment" to have key "note" instead, use:
<xsl:template match="record(map-entry, *)[?key = 'comment']">
<xsl:map-entry key="'note'" select="?value"/>
</xsl:template>
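The overall shape of these rules (copy everything recursively by default, but let a user rule intercept individual map entries) can be sketched in Python. This is only an analogy under stated assumptions: dicts and lists stand in for maps and arrays, and copy_all, entry_rule and rename_comment are hypothetical names, not part of the proposal.

```python
def copy_all(value, entry_rule=None):
    """Recursively copy nested dicts and lists (the default behaviour).

    entry_rule, if given, is called as entry_rule(key, value) for each map
    entry. It returns a (key, value) pair to use instead, or None to fall
    back to the default recursive copy.
    """
    if isinstance(value, dict):
        result = {}
        for key, val in value.items():
            override = entry_rule(key, val) if entry_rule else None
            if override is not None:
                new_key, new_val = override
                result[new_key] = new_val
            else:
                result[key] = copy_all(val, entry_rule)
        return result
    if isinstance(value, list):
        return [copy_all(member, entry_rule) for member in value]
    return value  # anything else is returned unchanged


def rename_comment(key, value):
    """Hypothetical user rule: rename 'comment' entries to 'note'."""
    if key == "comment":
        return ("note", value)
    return None


source = {"title": "x", "items": [{"comment": "hi", "n": 1}]}
copied = copy_all(source)
renamed = copy_all(source, rename_comment)
```

With no rule the structure is copied unchanged; with rename_comment only the matching entries differ.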
There's one more refinement I would like, which is to provide access to the selection path for each map and array entry. I think this can be done by ensuring that within the template, items are labeled so that the function call selection-path(?value) or selection-path(?item) delivers the required result.
Issue #1146 closed #closed-1146
Identifying 4.0 Changes
Pull request #1237 created #created-1237
1232 consistent rendition of rfc2119 terms
Fix #1232
Issue #1236 created #created-1236
QT4CG-078-01 fn:unparsed-text-lines, normalize newlines
I think we should offer normalize-newlines only for fn:unparsed-text:
- It should never affect the result of fn:unparsed-text-available.
- I doubt (after further consideration) that it’s really helpful for fn:unparsed-text-lines. The default should work just fine for nearly all users. If someone really needs to do more sophisticated string processing, it’s still possible to use fn:unparsed-text and modify the result.
I have committed some tests that (more or less) reflect the status quo: https://github.com/qt4cg/qt4tests/commit/68d40455f2404c379fdccddc1d524648ae4c8803
Issue #1189 closed #closed-1189
Function: distinct document order
Issue #1180 closed #closed-1180
fn:unparsed-text: `cache` option
Issue #1172 closed #closed-1172
Iterating maps: Positional access
Issue #1214 closed #closed-1214
fn:hash, CRC-32: Describe output
Issue #1235 created #created-1235
Function Identity: Treating function items with identical bodies
(One) requirement resulting from #908 (in particular, https://github.com/qt4cg/qtspecs/issues/908#issuecomment-1891524815):
The following examples reflect the status quo:
# | Function | Result
-- | -- | --
1 | deep-equal(<a/>, <a/>) | true
2 | let $f := fn { <a/> } return deep-equal($f, $f) | true
3 | deep-equal(fn { 1 }, fn { 1 }) | true or false
4 | deep-equal(fn { <a/> }, fn { <a/> }) | false
Notes
1. The result is true (regardless of the identity of the compared nodes).
2. The result is true (regardless of the identity of the nodes that result from evaluating the functions).
3. The result is allowed to be true or false, depending on the optimization strategies of a processor.
4. Only in this case, the result must currently be false.
We should allow deep-equal to return true for function items that have the same arguments and bodies. An implementation should be allowed to use the same internal representation for multiple occurrences of such functions.
Issue #812 closed #closed-812
Coercion Rules: Unifications
Issue #358 closed #closed-358
serialization indent whitespace
Issue #101 closed #closed-101
fn:serialize line breaks
Issue #1234 created #created-1234
Serialization Parameters: Indentation, Whitespace, Newlines
Summary of #358 and #101:
Parameter | Description | Values
--- | --- | ---
indent-unit | Character sequence to use for indentation. | Pattern: (\t\| +) Examples: ("\t", " ", " ")
indent-attributes | Indent multiple attributes (similar: HTML Tidy’s configuration option). Should be used with spaces as indentation characters. | "yes", "no"
line-ending | Newline character. | "\r\n", "\n", "\r", …
If no parameter is specified, the default of the implementation is used.
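To make the intent of indent-unit and line-ending concrete, here is a small Python sketch. It is only an illustration under assumptions: serialize_lines is a hypothetical helper, the defaults shown are arbitrary, and the validation simply applies the pattern from the table above (a single tab, or one or more spaces).

```python
import re

# Pattern from the table: a single tab, or one or more spaces.
INDENT_UNIT_PATTERN = re.compile(r"(\t| +)")


def serialize_lines(lines, indent_unit="  ", line_ending="\n"):
    """Join (depth, text) pairs into one string.

    Each line is prefixed with indent_unit repeated depth times, and the
    lines are separated by line_ending.
    """
    if not INDENT_UNIT_PATTERN.fullmatch(indent_unit):
        raise ValueError("indent unit must be a tab or one or more spaces")
    return line_ending.join(indent_unit * depth + text for depth, text in lines)


document = [(0, "<a>"), (1, "<b/>"), (0, "</a>")]
with_tabs = serialize_lines(document, indent_unit="\t", line_ending="\r\n")
with_spaces = serialize_lines(document, indent_unit="    ")
```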
Pull request #1233 created #created-1233
517 Major edits to fn:chain, clarification only
@dnovatchev For your review. The current version of fn:chain in the specs has unnecessary verbiage, is out of sync with other comparable spec entries, and has passages that cry out for clarity and concision. I do not think I have made any edits that change any substantive points. A summary:
- Summary simplified
- Rules: prose for the recursive process brought into conformity with other functions in the specs that describe recursive processes.
- Error conditions: two entries, and clarification of the conditions when they are triggered.
- Former note 1 deleted (excessively wordy, introduces unfamiliar or unnecessary terms/concepts).
- Former notes 3 and 4 moved to rules, distilled into more concise prose.
- Former note 5 (now 2) recast to convey what I think was your original intent, and allows the reader to more quickly compare fn:chain to two comparable methods.
Signature, properties, XPath rule, and examples are unchanged.
Issue #1192 closed #closed-1192
Allow "fn" as abbreviation for "function" in ItemType syntax
Issue #1232 created #created-1232
Rendition of RFC2119 terms
The specs use consistent markup for RFC 2119 terms (must, should, may) but the rendition differs between specs.
XQuery/XPath and XDM use bold text, XSLT and F&O use small caps, serialization uses bold caps.
I propose using small caps throughout.
QT4 CG meeting 078 draft minutes #minutes-05-21
Draft minutes published.
Issue #1229 closed #closed-1229
Rework record descriptions per ACTION QT4CG-070-01
Issue #1211 closed #closed-1211
QT4CG-076-01 Add examples of coercions
Issue #1208 closed #closed-1208
Reserved Function Names: item, empty-sequence
Issue #1212 closed #closed-1212
1208 correct details of formerly-reserved function names
Issue #1213 closed #closed-1213
1199 Add ellipsis markup for arguments in variadic functions
Issue #1199 closed #closed-1199
In F+O function signatures, show some indication that a function is variadic.
Issue #1207 closed #closed-1207
Array filter: Positional access
Issue #1217 closed #closed-1217
1207 Allow numeric predicates when filtering arrays
Issue #1219 closed #closed-1219
1218 Drop use of union(A,B) syntax
Issue #1218 closed #closed-1218
Residual references to union(A, B)
Issue #934 closed #closed-934
String comparison in deep-equal
Issue #1167 closed #closed-1167
Merge $collation into $options parameter of fn:deep-equal()
Issue #1191 closed #closed-1191
1167, 934 deep equal merge collations param
Issue #1197 closed #closed-1197
1192 Allow fn as abbreviation for function
Issue #1116 closed #closed-1116
unparsed-text() end-of-line normalization
Issue #1117 closed #closed-1117
1116 Add options param to unparsed-text
Issue #1223 closed #closed-1223
Minor: fixed URL
Issue #1222 closed #closed-1222
1214 hash examples
Issue #116 closed #closed-116
Clarify the fn:transform function() wrt multiple top-level elements
Issue #652 closed #closed-652
Defining a common function library for XPath, XSLT, and XQuery applications
Issue #1220 closed #closed-1220
73 copy&paste typo in fn:graphemes (combining diaeresis should be ZWJ)
Pull request #1231 created #created-1231
1193 Parsing Functions: Empty input
Issue: #1193 (covers only the obvious 4.0 inconsistencies)
Pull request #1230 created #created-1230
1216 Detailed comments on math:e, sinh(), cosh(), tanh()
Issue: #1216
Pull request #1229 created #created-1229
Rework record descriptions per ACTION QT4CG-070-01
I think this PR completes the work I started before and that we reviewed briefly some time ago.
Note: because this PR involves changes to the schemas and the stylesheets, the PR build is going to be...funky. I don't know if the diffs will be useful either. There are no (intentional) technical changes in this PR. You can preview the built result by looking at https://qt4cgtest.nwalsh.com/branch/action-qt4cg-70-01/xpath-functions-40/Overview.html
A few notes:
- Instead of just reusing fos:options, I added fos:record-description with a very similar content model.
- I couldn't completely remove some of the <fos:type> elements defined in the globals section at the top of the function-catalog.xml because it causes IDREF failures when that file is parsed.
- I tinkered a bit with the styling, and moved a bunch of inline table styles into CSS
QT4 CG meeting 078 draft agenda #agenda-05-21
Draft agenda published.
Issue #1221 closed #closed-1221
new function - fn:tail-recurse a function to allow users to hand roll their recursion and guarantee tail recursion.
Pull request #1228 created #created-1228
– Adding the BLAKE3 hashing algorithm to fn:hash
This is a resubmission of the original https://github.com/qt4cg/qtspecs/pull/1226. No new changes, this is fixing a pure git-technical issue.
Now the PR is submitted from a dedicated feature-branch and does not depend on any other branch
Pull request #1227 created #created-1227
150 PR resubmission for fn ranks
This is a resubmission of the original PR 1027 for function fn:ranks. No new changes, this is fixing a pure git-technical issue.
Now the PR is submitted from a dedicated feature-branch and not from master
Issue #1226 closed #closed-1226
Add the BLAKE3 hashing algorithm to fn:hash
Issue #1027 closed #closed-1027
150 fn:ranks
Pull request #1226 created #created-1226
Add the BLAKE3 hashing algorithm to fn:hash
This PR adds the BLAKE3 hashing algorithm as one of the hashing algorithms in fn:hash.
This comes from a different branch than the one that contains the PR for fn:ranks, thus both PRs must be active and independent of each other.
Issue #1225 created #created-1225
Generalization of Deep Updates
Note: This is a discussion issue, as I cannot contribute something substantial so far.
Observations
Our current development to support updates in the languages may come as a surprise to developers:
- The XQuery core specification (which includes X in its name) will include constructs for updating Maps and Arrays.
- To update XML, an implementation must support the XQuery Update (XQUF) specification.
I think we should…
1. either embed map/array updates in XQUF, or
2. support a modified subset of XQUF in our core specs (while remaining fully compatible with XQUF).
I believe 2. is more realistic. By providing a simplified syntax, we could tackle some of the shortcomings of XQUF, such as its verbosity, and seemingly unnecessary restrictions:
XQUF: Verbosity
The Transform expression (or Copy Modify expression, as it’s called in 3.0) has a cumbersome and wordy syntax for doing very trivial things:
copy $node := <a><b/></a>
modify delete node $node/b
return $node
The 3.0 Transform With syntax is a bit simpler, it utilizes the context item:
<a><b/></a> transform with {
delete node ./b
}
It resulted from the BaseX update syntax…
<a><b/></a> update {
delete node ./b
}
…which comes with an ambiguity that forbids its unchanged adoption: element update {} could be both an element constructor and an update statement. I think that dropping the curly braces (and, optionally, using parentheses) would resolve this issue.
XQUF: Restrictions
The XQUF syntax is very powerful, but it has some restrictions that require the use of FLWOR expressions when addressing multiple nodes. For example, the following statement is illegal…
replace //village with <village/>
…if the target is not a single node, which means that you have to write…
for $v in //village
return replace $v with <village/>
…or…
(: only supported in BaseX :)
//village ! (replace . with <village/>)
I’m pretty sure it would be safe to drop the restriction, which also exists for other update expressions, such as insert nodes NODES into SINGLE-NODE or rename node NODE as 'NAME' (delete nodes NODES is legal). Allowing multiple targets would greatly reduce the number of iterations required within update blocks in practice.
XQuery Update light
I think the new update syntax should meet the following requirements (among others):
- Compatible with the XQUF node semantics.
- Similar syntax for supported input types.
- Chaining of update operations.
First, we would need to decide on a syntax that would be applicable to both maps/arrays and nodes. We could:
1. Build on the proposal in #832, which introduces a new syntax for maps and arrays, and extend it for nodes:
update map INPUT-MAP { ... }
update array INPUT-ARRAY { ... }
update node INPUT-NODE { ... }
2. Build on XQUF 3.0:
INPUT-MAP transform with { ... }
INPUT-ARRAY transform with { ... }
INPUT-NODE transform with { ... }
3. Build on BaseX (allowing multiple input items and chains):
INPUT-MAPS update (...) update (...)
INPUT-ARRAYS update (...) update (...)
INPUT-NODES update (...) update (...)
Syntax 2. and 3. are challenging, as the type of the input can only be determined at runtime (and for XQUF it has to be determined statically whether an expression is updating or non-updating).
As we currently have a proposal for 1., I will stick to that syntax, but allow an optional plural form for map and array (inspired by XQUF), and use chains. Within the update block, we could now use the short syntax also for nodes without the node/nodes keywords:
update map $country-map {
delete ??entry:city
},
update maps $country-maps {
rename ?entry:village as 'city'
},
update node $country-node {
delete //city
},
update nodes $country-nodes {
insert <lakes/> into .,
insert <mountains/> into .
} {
insert <lake/> into //lakes
}
Semantics
- Note that for XQUF update expressions it makes a difference whether multiple expressions are defined with the same block or in a subsequent block – which is why I think chains are essential.
- Even though the syntax would be similar for node and map/array updates, the inherent semantics would differ a lot – which is something, however, users would not need to care about too much: Node updates would greatly rely on XQUF, whereas map/array updates would be based on the new proposal.
I’m looking forward to everyone’s opinions.
Issue #1224 created #created-1224
Attribute priority for xsl:accumulator-rule
I propose that XSLT xsl:accumulator-rule be allowed to take a priority attribute, to allow users to be more declarative in their accumulator rules. Even accumulators with two or three rules might require simple overshadowing: a default rule for the majority of nodes, with accommodation for certain exceptions. An explicitly declared priority rather than document order will allow users to better express their intentions, and processor-generated warnings about duplicate matches will be more meaningful.
Because the current rules stipulate that among multiple rules the last one in document order wins, I think that backward compatibility prevents us from using the default priority rules for templates (i.e., allotting -0.5, 0, 0.25 scores based on match pattern types). Rather, in this case, every accumulator rule is assumed to have priority 0, unless otherwise specified. If a node matches more than one rule of the same priority level, the last one wins. This simpler version of priority (assume zero, and if you know multiple matches will overlap, use @priority) is one that many developers have come to use for templates.
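The proposed selection rule (highest priority wins, default priority 0, ties broken by taking the last rule in document order) can be sketched as follows. This is a minimal Python model under stated assumptions: Rule and select_rule are hypothetical names, and nodes are represented by plain strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Rule:
    name: str
    matches: Callable[[str], bool]
    priority: float = 0.0  # assumed default when no priority is declared


def select_rule(rules: Sequence[Rule], node: str) -> Optional[Rule]:
    """Pick the matching rule with the highest priority.

    Rules are given in document order. Using >= means that, among rules
    of equal priority, the last one in document order wins.
    """
    best = None
    for rule in rules:
        if not rule.matches(node):
            continue
        if best is None or rule.priority >= best.priority:
            best = rule
    return best


rules = [
    Rule("default", lambda node: True),
    Rule("exception", lambda node: node == "note", priority=1),
    Rule("late-default", lambda node: True),
]

chosen_for_note = select_rule(rules, "note").name
chosen_for_para = select_rule(rules, "para").name
```

With no declared priorities this reduces to the existing "last rule wins" behaviour, which is the backward-compatibility point made above.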
Issue #1188 closed #closed-1188
fn:hash: Editorial
Pull request #1223 created #created-1223
Minor: fixed URL
Fixes link to Unicode TR29
Pull request #1222 created #created-1222
1214 hash examples
addresses #1214
Issue #1221 created #created-1221
new function - fn:tail-recurse a function to allow users to hand roll their recursion and guarantee tail recursion.
Motivation:
- I as a user of XPath want to write a recursive function.
- I write the function,
- I run it,
- it causes a stack overflow
So,
- I don't believe tail recursion detection is part of the spec, so an implementation may not implement it
- tail recursion detection, I suspect, is hard, and I suspect there are cases that are tail recursive that an implementation doesn't detect
Deep recursion is not an uncommon problem in other languages. In imperative languages I would simply implement the algorithm using a 'while' loop. Creating the 'body' of the while loop is my problem, but once I've done it, I KNOW that in all implementations my algorithm will be executed tail recursively (imperative code is FULL of loops; stack overflows are not an issue).
An example
I want to implement a power function in C#. I know how to write it recursively, but C# doesn't support tail recursion, so I have to turn it into a loop. I could do this in an ad hoc way using a while loop, but I could also write the loop once, and then ask the developer to pass in a function that defines the body of the loop.
In C# that function could have the following type (i.e. it takes a state and either returns a new state or null; a null would indicate the end of the 'loop'):
State? recurse(State state)
and the library function that executes it have the signature:
State TailRecurse<State>(Func<State,State?> f, State state)
a complete example of how this would appear in C# would be: (note C# has a nuance w.r.t. the higher kinded type '?' and so the signature of TailRecurse below is slightly weaker than the one above).
var toPower3 = TailRecursion.Power2(3);
var result = TailRecursion.TailRecurse(toPower3, (0, 2));
Console.WriteLine(result.Value.Item2);
class TailRecursion
{
// What we actually want is the commented-out signature below, but because of a quirk in C# around the '?' higher kinded type we have to write the signature symmetrically (I think).
//public static State TailRecurse<State>(Func<State, State?> f, State state)
public static State TailRecurse<State>(Func<State?,State?> f, State state)
{
while (true)
{
var result = f(state);
if (result == null)
{
return state;
}
state = result;
}
}
public static Func<(int,int)?,(int,int)?> Power2(int n)
{
return powerAndX =>
{
if (powerAndX.Value.Item1 > n)
{
return null;
}
return (powerAndX.Value.Item1 + 1, powerAndX.Value.Item2 * powerAndX.Value.Item2);
};
}
}
in XSLT I could write this function
<xsl:function name="kooks:pow" as="xs:integer">
<xsl:param name="x" as="xs:integer"/>
<xsl:param name="n" as="xs:integer"/>
<xsl:choose>
<xsl:when test="$n = 0">
<xsl:sequence select="1"/>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="$x * kooks:pow($x,$n - 1)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:function>
I intend this to be tail recursive (strictly, it would need an accumulator parameter so that the recursive call is the last operation), but my environment may not, for whatever reason, detect it (I would hope it does, but I could be doing something much more complex that IS tail recursive but that the environment simply doesn't see).
I can't write a loop in XPath etc.; it doesn't exist, so I can't escape like I do in C# or Scala. In F# (which also doesn't support while loops), I would have to write a function that I was sure F# detected as tail recursive and then pass the 'body' of the while loop as a function.
In XSLT this could look like this (basically the same as the C# example).
Here the C# signature
State TailRecurse<State>(Func<State,State?> f, State state)
has been translated by using item()* for State and array(*) for State?, where an empty array corresponds to null/none, and an array with 1 element corresponds to 'some' State.
This function IS tail recursive in a very simple way that I think all implementations would detect as such, and if they do, I can pass any function I like and be confident it is processed tail recursively. I can of course do this now; I use Saxon (even though I'm wrestling with it to detect tail recursion for some bizarre reason, which is probably my fault). I think it would ideally be a library function that (using a loop) allows non-tail-recursive environments to support tail recursion, or allows me to simply do the detection myself.
<xsl:function name="kooks:tailRecurse">
<xsl:param name="unfolder" as="function(item()*) as array(*)"/>
<xsl:param name="state" as="item()*"/>
<xsl:variable name="newState" select="$unfolder($state)"/>
<xsl:choose>
<!-- loop returns null/none - end of recursion -->
<xsl:when test="array:size($newState) = 0">
<xsl:sequence select="$state"/>
</xsl:when>
<!-- else, unpack the state and loop again -->
<xsl:otherwise>
<xsl:sequence select="kooks:tailRecurse($unfolder,array:get($newState,1))"/>
</xsl:otherwise>
</xsl:choose>
</xsl:function>
Environments that don't do tail recursion detection can simply implement the analogous code to the C# example in their implementation, i.e. map it to a while loop.
In both cases I think this is hopefully trivial for the implementor of the language.
Here's a complete example, with tailRecurse defined as above, that would guarantee (in an environment that detected it correctly) that any passed function is processed tail recursively.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:array="http://www.w3.org/2005/xpath-functions/array"
xmlns:map="http://www.w3.org/2005/xpath-functions/map"
exclude-result-prefixes="xs"
version="3.0"
xmlns:kooks="http://www.kookerella.com">
<!-- (state -> Maybe state) -> state -> state -->
<xsl:function name="kooks:tailRecurse">
<xsl:param name="unfolder" as="function(item()*) as array(*)"/>
<xsl:param name="state" as="item()*"/>
<xsl:variable name="newState" select="$unfolder($state)"/>
<xsl:choose>
<!-- loop returns null/none - end of recursion -->
<xsl:when test="array:size($newState) = 0">
<xsl:sequence select="$state"/>
</xsl:when>
<!-- else, unpack the state and loop again -->
<xsl:otherwise>
<xsl:sequence select="kooks:tailRecurse($unfolder,array:get($newState,1))"/>
</xsl:otherwise>
</xsl:choose>
</xsl:function>
<xsl:function name="kooks:powUnfolder" as="function(item()*) as array(*)">
<xsl:param name="x" as="xs:integer"/>
<xsl:param name="n" as="xs:integer"/>
<xsl:sequence select="function($state) {
if (map:get($state,'power') >= $n)
(: we're done so return null/none :)
then array {}
else
(: else calculate the next power and loop again :)
let $newState := map {
'power': map:get($state,'power') + 1,
'result': map:get($state,'result') * $x
}
return array { $newState }
}"/>
</xsl:function>
<xsl:template match="/">
<twoToThePower4>
<xsl:variable name="seed" select="map { 'power':0,'result':1 }"/>
<xsl:sequence select="map:get(kooks:tailRecurse(kooks:powUnfolder(2,4),$seed),'result')"/>
</twoToThePower4>
</xsl:template>
</xsl:stylesheet>
I suspect these lines are not obvious.
<xsl:variable name="seed" select="map { 'power':0,'result':1 }"/>
<xsl:sequence select="map:get(kooks:tailRecurse(kooks:powUnfolder(2,4),$seed),'result')"/>
The first line says anything to the power 0 is 1; the second line says I want the 'result' of 2 to the power 4.
Note it's VERY similar to xsl:iterate, but that requires a sequence to drive it; this is just general recursion.
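For comparison, the loop-based fallback described above can be sketched in Python. This is an illustration only, assuming None plays the role of the empty array (null/none); tail_recurse and power_step are hypothetical names that mirror kooks:tailRecurse and kooks:powUnfolder.

```python
from typing import Callable, Optional, TypeVar

State = TypeVar("State")


def tail_recurse(step: Callable[[State], Optional[State]], state: State) -> State:
    """Apply step repeatedly until it returns None, then return the last state.

    The loop uses constant stack space however many steps are taken.
    """
    while True:
        new_state = step(state)
        if new_state is None:
            return state
        state = new_state


def power_step(x: int, n: int):
    """Build the loop body: the state is a (power, result) pair."""
    def step(state):
        power, result = state
        if power >= n:
            return None  # done: signal the end of the loop
        return (power + 1, result * x)
    return step


# Seed: anything to the power 0 is 1.
final_power, final_result = tail_recurse(power_step(2, 4), (0, 1))
```

Here final_result is 2 to the power 4, and a very large number of steps does not exhaust the stack.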
Issue #73 closed #closed-73
Split a string by graphemes
Pull request #1220 created #created-1220
73 copy&paste typo in fn:graphemes (combining diaeresis should be ZWJ)
I wonder which font I need to use in order to see U+1F476 U+200D U+1F6D1 as a single grapheme. When I naively put the character sequence into an HTML page, the two glyphs will be rendered individually. How can we make sure that the grapheme will be rendered as intended for most people when they read the spec?
Pull request #1219 created #created-1219
1218 Drop use of union(A,B) syntax
Fix #1218
Issue #1218 created #created-1218
Residual references to union(A, B)
The XQuery spec has 10 uses of the obsolete syntax union(A, B), and the F&O spec has another 6. XSLT has 7, and XDM has two.
QT4 CG meeting 077 draft minutes #minutes-05-14
Draft minutes published.
Pull request #1217 created #created-1217
1207 Allow numeric predicates when filtering arrays
Fix #1207
Also a minor change: $V[23, "fred"] now throws FORG0006 rather than XPTY0004. This keeps it compatible with 3.1 (in case anyone is catching the errors), and is more uniform: it seems unreasonable for $V[23, "fred"] and $V["fred", 23] to throw different errors.
Issue #1216 created #created-1216
Detailed comments on math:e, sinh(), cosh(), tanh()
I should have made these comments before we accepted the proposal, but it's only minor details.
In the example given for math:e, the explanation of the example as a compound interest calculation seems a bit simplistic. There are all sorts of assumptions here about the initial investment, the frequency at which interest is calculated, etc. It might be better just to give the expression and the result and not attempt an interpretation.
According to IEEE 754-2008 table 9.1, sinh() can produce overflow or underflow, cosh() can produce overflow, and tanh() can produce underflow. We seem to be catering for exceptions that cannot occur?
The example results should perhaps be tagged as approximate to ensure that they pass automated testing.
We should perhaps be referencing IEEE 754-2019 (though I'm reluctant to purchase a copy...)
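The points about approximate results and overflow can be seen in any IEEE 754 double implementation. The following Python sketch uses the standard math module purely as an illustration; it says nothing about how an XPath processor must behave, and the tolerances chosen are arbitrary assumptions.

```python
import math


def sinh_from_exp(x):
    """sinh computed from its definition, for comparison with math.sinh."""
    return (math.exp(x) - math.exp(-x)) / 2


# Results from different formulas may differ in the last bits, so compare
# with a tolerance rather than expecting exact equality.
sinh_agrees = math.isclose(math.sinh(1.0), sinh_from_exp(1.0), rel_tol=1e-12)

# (1 + 1/n)^n approaches e as n grows, but only slowly.
n = 1_000_000
compound = (1 + 1 / n) ** n
compound_close_to_e = math.isclose(compound, math.e, rel_tol=1e-5)

# sinh overflows for large arguments; Python reports this as OverflowError.
try:
    math.sinh(1000.0)
    sinh_overflowed = False
except OverflowError:
    sinh_overflowed = True

# tanh is bounded by 1 in magnitude, so it cannot overflow.
tanh_large = math.tanh(1000.0)
```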
Issue #1195 closed #closed-1195
Hash Function: CRC-32
Issue #1206 closed #closed-1206
1195 Hash Function: CRC-32
Issue #1196 closed #closed-1196
Math Functions: `math:e`, `math:sinh`, `math:cosh`, `math:tanh`
Issue #1205 closed #closed-1205
1196 Math Functions: math:e, math:sinh, math:cosh, math:tanh
Issue #1204 closed #closed-1204
1203 Define out-of-range conditions in CSV get function
Issue #1203 closed #closed-1203
CSV parsing: in call of get($R, $Z), what if $R is out of range
Issue #1215 closed #closed-1215
Fix ID/IDREF typo
Pull request #1215 created #created-1215
Fix ID/IDREF typo
I'm not sure how this slipped past the PR build checks...
Issue #1214 created #created-1214
fn:hash, CRC-32: Describe output
In the QT4 CG Meeting 077 it was suggested that the binary output of the newly added CRC-32 algorithm should be further described.
Issue #1198 closed #closed-1198
1189 distinct document order
Issue #1068 closed #closed-1068
73 fn:graphemes
Issue #1200 closed #closed-1200
QT4CG-075-02 Define the term sequence concatenation
Issue #146 closed #closed-146
fn:apply with last two arguments (array, map) for the positional and keyword args in a func-call
Issue #162 closed #closed-162
Support unbounded variadic functions on map parameter keys
Issue #369 closed #closed-369
Namespaces for Functions
Issue #572 closed #closed-572
fn:evaluate-xpath() function
Issue #1190 closed #closed-1190
1188 XQFO hash
Pull request #1213 created #created-1213
1199 Add ellipsis markup for arguments in variadic functions
Fix #1199
Issue #1210 closed #closed-1210
An edge case with coercion in 1.0 compatibility mode
Pull request #1212 created #created-1212
1208 correct details of formerly-reserved function names
Fix #1208
Pull request #1211 created #created-1211
QT4CG-076-01 Add examples of coercions
Added examples of coercion rules in action, as requested.
Note to reviewers: the XPath spec contains additional paragraphs explaining the effect of the 1.0 compatibility rules.
Issue #1210 created #created-1210
An edge case with coercion in 1.0 compatibility mode
XPath 3.1 says
In a static function call, if [XPath 1.0 compatibility mode] is true and an argument of a static function is not of the expected type, then the following conversions are applied sequentially to the argument value V:
In 4.0, we have dropped the phrase "and [the supplied value] is not of the expected type". One effect of this omission is that when the expected type is xs:string? and the supplied value is (), we now apply the string() function, which means we supply "" rather than () as the coerced value.
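The edge case can be modelled with a few lines of Python. This is a deliberately simplified sketch that only covers the xs:string? case: None stands for the empty sequence, and all function names are hypothetical.

```python
def string_fn(value):
    """Simplified stand-in for string(): the empty sequence (None) becomes ''."""
    return "" if value is None else str(value)


def is_optional_string(value):
    """Stand-in for the type test 'instance of xs:string?'."""
    return value is None or isinstance(value, str)


def coerce_with_type_check(value):
    """3.1 wording: convert only if the value is not of the expected type."""
    return value if is_optional_string(value) else string_fn(value)


def coerce_without_type_check(value):
    """Wording without the type check: always convert."""
    return string_fn(value)


with_check = coerce_with_type_check(None)
without_check = coerce_without_type_check(None)
```

With the type check the empty sequence passes through unchanged; without it the function receives a zero-length string.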
Pull request #1209 created #created-1209
1183 Add transient mode and the transient{} expression
Fix #1183
QT4 CG meeting 077 draft agenda #agenda-05-14
Draft agenda published.
QT4 CG meeting 076 draft minutes #minutes-05-07
Draft minutes published.
Issue #1208 created #created-1208
Reserved Function Names: item, empty-sequence
The keywords item and empty-sequence have been dropped from the list of reserved function names:
- https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-reserved-fn-names
- https://www.w3.org/TR/2017/REC-xquery-31-20170321/#id-reserved-fn-names
A note exists only for the dropped map and array keywords.
Issue #1207 created #created-1207
Array filter: Positional access
I apologize for warming up an already accepted feature, but I have mixed feelings about the deviating rules for filters and array filters. People can do…
for $n in $numbers[1 to 5]
return $n + 1
…but they won’t be able to do…
for member $n in $numbers?[1 to 5]
return $n + 1
Of course you can always use array:subarray, array:slice, etc., but from the perspective of symmetry and usability, it’s just not obvious why positions are exclusively allowed for sequences. Even if we regard numeric predicates as a design error, we should rather deliberately repeat other people’s mistakes…
Pull request #1206 created #created-1206
1195 Hash Function: CRC-32
Issue: #1195
Pull request #1205 created #created-1205
1196 Math Functions: math:e, math:sinh, math:cosh, math:tanh
Issue: #1196
Pull request #1204 created #created-1204
1203 Define out-of-range conditions in CSV get function
Fix #1203
Issue #1203 created #created-1203
CSV parsing: in call of get($R, $Z), what if $R is out of range
In the description of the ?get($R, $Z) callback function, it is stated what happens if $Z doesn't identify an actual column (error when $Z is a string, return "" when it is an integer), but it's not stated what happens when $R (the row number) is out of range.
The existing text could also be improved because it talks of "the argument" to the function, when there are two.
The Saxon implementation currently returns "" and I propose to say that unless anyone objects.
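A small Python model of the behaviour described above may help reviewers. It is a sketch under assumptions: make_get and the sample data are hypothetical, indexes are 1-based, and the choice to check an unknown column name before the row number is made here only for illustration.

```python
from typing import Callable, List, Union

def make_get(header: List[str], rows: List[List[str]]) -> Callable[[int, Union[int, str]], str]:
    """Return a get(row, column) callback over already-parsed CSV data."""
    def get(r: int, z: Union[int, str]) -> str:
        if isinstance(z, str):
            if z not in header:
                # A column name that identifies no column is an error.
                raise KeyError(f"unknown column name: {z}")
            col = header.index(z) + 1
        else:
            col = z
        if r < 1 or r > len(rows):
            return ""  # proposed rule: an out-of-range row yields ""
        row = rows[r - 1]
        if col < 1 or col > len(row):
            return ""  # an out-of-range integer column yields ""
        return row[col - 1]
    return get

get = make_get(["name", "city"], [["Ann", "Oslo"], ["Bob", "Rome"]])
print(get(2, "city"), repr(get(5, 1)), repr(get(1, 9)))
```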
Issue #1202 created #created-1202
XQFO: Rendering of new/updated functions
It’s not always clear in the XQFO draft what has been updated when a function is marked as such. In the attached examples (there are several cases), either the Changed section is missing, or the renderer fails to detect that the function has not actually changed:
Issue #1201 closed #closed-1201
Update the Saxonica EE repo
Pull request #1201 created #created-1201
Update the Saxonica EE repo
This should be harmless and I would have thought unnecessary, but there were reports of trouble reading Saxon EE 12.2 from Maven. This PR just updates the maven EE repo URI. The old URI redirects to the new one, but maybe that's causing a problem somehow?
Issue #1162 closed #closed-1162
Revert strict type for positional variables (xs:integer → xs:positiveInteger)
Issue #1165 closed #closed-1165
[Editorial] References to numeric codepoints in prose: consistency
Pull request #1200 created #created-1200
QT4CG-075-02 Define the term sequence concatenation
Purely editorial
Issue #1199 created #created-1199
In F+O function signatures, show some indication that a function is variadic.
Currently the markup indicates that a function is variadic, but this is not flagged in the displayed signature.
Pull request #1198 created #created-1198
1189 distinct document order
Add distinct-document-order function.
(Somehow this PR also includes changes to XSLT examples consequent on allowing "fn" as a synonym for "function")
Pull request #1197 created #created-1197
1192 Allow fn as abbreviation for function
Brings function tests into line with inline function syntax
Issue #1196 created #created-1196
Math Functions: `math:e`, `math:sinh`, `math:cosh`, `math:tanh`
The functions math:e, math:sinh, math:cosh and math:tanh might have been missed in the past, and would be simple to add.
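To illustrate why these are simple to add, the following Python sketch defines the three hyperbolic functions from the exponential function and compares them with the standard library. The function names are local to the sketch, and the naive formulas ignore the accuracy and overflow concerns a real implementation would have to address.

```python
import math

def sinh(x: float) -> float:
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x: float) -> float:
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x: float) -> float:
    return sinh(x) / cosh(x)

x = 0.75
# Compare the textbook definitions with the library versions.
for mine, lib in ((sinh, math.sinh), (cosh, math.cosh), (tanh, math.tanh)):
    print(mine.__name__, math.isclose(mine(x), lib(x), rel_tol=1e-12))
print("e", math.isclose(math.exp(1), math.e))
```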
Issue #1195 created #created-1195
Hash Function: CRC-32
I discovered only now that fn:hash mentions that the function can also be used for computing cyclic redundancy checks.
Shouldn’t we add the basic CRC-32 algorithm to the list of mandatory algorithms?
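For reference, this is what a CRC-32 result looks like when computed with Python's zlib module. It is an assumption of this sketch that the "basic" CRC-32 meant here is the common variant implemented by zlib, and that the four-byte big-endian rendering is how such a result would be shown; crc32_hex is a hypothetical helper.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """CRC-32 of the input as eight upper-case hex digits (four bytes)."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")

# "123456789" is the conventional check string for CRC algorithms.
print(crc32_hex(b"123456789"))
```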
Issue #1194 created #created-1194
New function fn:query()
I propose a new function fn:query() to perform dynamic XQuery/XPath evaluation: a similar role to fn:transform() and xsl:evaluate.
I propose a design based on the design of fn:invisible-xml() - fn:query should take a query string as its argument, and return a function item that can be called to evaluate the query.
The fn:query() function will need an options map to supply significant aspects of the static context, for example the base URI. But I don't think we need to support everything. Public functions in the calling module should probably be made available automatically, in which case we don't need to support "import module".
The dynamic evaluation function will need an options map to supply significant aspects of the dynamic context, notably the context value and values of external parameters. The query result should be returned in "raw" (ie. unserialized) form.
Perhaps there should be an option language="xpath" to say that the "query" is actually an XPath expression; some implementations might find that easier to support, especially when the processor is itself an XPath processor.
(Motivation: Saxon has a pair of ancient extension functions saxon:compile-query and saxon:query and the design needs modernising, and bridging across to additional platforms. We might as well get it into the standard if we're doing that.)
Issue #1143 closed #closed-1143
Coercion Rules for Choice Item Types
Issue #1148 closed #closed-1148
1143 Coercion rules: handle choice types before atomization
Issue #1184 closed #closed-1184
1165-Use Unicode-style character references
Issue #1171 closed #closed-1171
Predicates returning xs:boolean vs. xs:boolean?
Issue #1182 closed #closed-1182
1171 Change predicate callbacks to allow empty return value
Issue #1170 closed #closed-1170
Editorial: fn:index-where; parentheses; …
Issue #1186 closed #closed-1186
1170 Editorial: fn:index-where
Issue #1193 created #created-1193
Parsing Functions: Empty input
I was asked why some of the parsing functions allow empty input and others don’t:
Function | Input
--- | ---
fn:parse-integer | xs:string
fn:parse-uri | xs:string
fn:parse-ietf-date | xs:string?
fn:parse-QName | xs:string
fn:parse-xml | xs:string?
fn:parse-xml-fragment | xs:string?
fn:parse-html | (xs:string \| xs:hexBinary \| xs:base64Binary)?
fn:parse-json | xs:string?
fn:parse-csv | xs:string?
I would assume there is no rationale behind this, and that we should always allow empty input.
Issue #1192 created #created-1192
Allow "fn" as abbreviation for "function" in ItemType syntax
The syntax for inline functions is, by design, similar to the syntax for function tests. So it's confusing that you can abbreviate "function" to "fn" in the first case but not the second.
Open question: what about "declare function" in the XQuery prolog? My instinct is to leave that as it is.
Pull request #1191 created #created-1191
1167, 934 deep equal merge collations param
Fix #934 String comparisons in fn:deep-equal Fix #1167 Merge $collation and $options params of fn:deep-equal
Pull request #1190 created #created-1190
1188 XQFO hash
Fixed description
Issue #1189 created #created-1189
Function: distinct document order
The nodes resulting from path traversals are normalized by restoring document order and removing duplicates. There are use cases where this operation needs to be enforced, for which people do things like…
$nodes/.
$nodes/self::node()
$nodes union ()
$nodes except ()
…which all look arcane to occasional readers, and are prone to be accidentally optimized away.
It would be sensible to have a helper function that makes this operation explicit, so that users have some clue that something intentional is going on. Some articles called this postprocessing step distinct document order, which is why we added util:ddo in the past – but I would love to find a both canonical and catchier name for it.
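The operation itself is small, as the following Python sketch shows. The Node class and its order field are hypothetical stand-ins for node identity and document order; the sketch only illustrates "remove duplicates, then sort into document order".

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class Node:
    order: int  # position in document order; also serves as identity here
    name: str

def distinct_document_order(nodes: Iterable[Node]) -> List[Node]:
    """Drop duplicate nodes, then restore document order."""
    seen = set()
    unique = []
    for node in nodes:
        if node.order not in seen:
            seen.add(node.order)
            unique.append(node)
    return sorted(unique, key=lambda node: node.order)

a, b, c = Node(1, "a"), Node(2, "b"), Node(3, "c")
print([n.name for n in distinct_document_order([c, a, c, b, a])])
```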
QT4 CG meeting 076 draft agenda #agenda-05-07
Draft agenda published.
Issue #1188 created #created-1188
fn:hash: Editorial
The summary of fn:hash says…
Returns a string representation of the results from a specified hash, checksum, or cyclic redundancy check function upon the input.
…but the latest version returns an xs:hexBinary value.
Issue #1187 created #created-1187
Decimal rounding
This is in response to a bug/feature request from a Saxon user: see https://saxonica.plan.io/issues/6408
Currently
(a) the round() function gives no control over rounding mode (towards zero, towards positive infinity, etc etc).
(b) decimal division leaves it entirely implementation-defined what the precision of the result should be, let alone what rounding is applied to attain that precision.
This can cause difficulties for users where, for example, financial accounting standards mandate a particular rounding mode.
Is this something we want to address?
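For comparison, Python's decimal module exposes both controls the issue describes as missing: a rounding mode per operation and an explicit precision for division. The sketch below only shows how much the results can differ; the divide helper and the sample values are chosen for illustration.

```python
from decimal import (Decimal, localcontext, ROUND_CEILING, ROUND_DOWN,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

CENT = Decimal("0.01")
amount = Decimal("2.345")

# (a) The same value rounded to two places under different rounding modes.
rounded = {
    mode: amount.quantize(CENT, rounding=mode)
    for mode in (ROUND_HALF_UP, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING)
}

def divide(a: Decimal, b: Decimal, digits: int, rounding: str) -> Decimal:
    """(b) Division with an explicit precision and rounding mode."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = rounding
        return a / b

for mode, value in rounded.items():
    print(mode, value)
print(divide(Decimal(1), Decimal(3), 5, ROUND_HALF_EVEN))
print(divide(Decimal(1), Decimal(3), 5, ROUND_CEILING))
```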
Pull request #1186 created #created-1186
1170 Editorial: fn:index-where
Issue: 1170
- Editorial, 2 characters added
- “2. Redundant parens in function signatures“ is obsolete (https://github.com/qt4cg/qtspecs/pull/1182/files#diff-7625c07ae8131ff65c3caa677b188ed2b9b66237312d11c05a2fa2838c6f5c67R21233)
Pull request #1185 created #created-1185
1179 array:values, map:values → array:get, map:get
Issue: #1179
Pull request #1184 created #created-1184
1165-Use Unicode-style character references
Currently only affects F+O.
New element <char>U+xxxx</char> is supported in the DTD and XSLT. It is automatically expanded to include the character name and glyph. I was hoping to get the Unicode character names directly from the Unicode database, but that's a bit unwieldy, so I created a little file (in the style directory) containing the names of the characters that we actually use.
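The kind of expansion described can be sketched in Python, which happens to ship the Unicode names. This is not the stylesheet used in the PR: expand_char_ref, the output layout and the "<unnamed>" fallback are all assumptions made for illustration.

```python
import re
import unicodedata

def expand_char_ref(ref: str) -> str:
    """Expand 'U+xxxx' to 'U+XXXX (CHARACTER NAME, glyph)'."""
    match = re.fullmatch(r"[Uu]\+([0-9A-Fa-f]{4,6})", ref)
    if match is None:
        raise ValueError(f"not a U+xxxx reference: {ref!r}")
    char = chr(int(match.group(1), 16))
    # Control characters such as U+000A have no name in the database.
    name = unicodedata.name(char, "<unnamed>")
    return f"U+{match.group(1).upper()} ({name}, {char})"

print(expand_char_ref("U+03C0"))
print(expand_char_ref("U+0131"))
```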
Issue #1183 created #created-1183
transient() - a function to make functions nondeterministic
I propose a function transient() which takes a function as input and returns a function that is functionally identical, but which relaxes the requirement for determinism.
For example, transient(current-dateTime#0) returns a function that returns a date and time, but is not required to return the same date and time on every call, while transient(doc#1) returns a function that dereferences an XML document URI, but is not required to return the same document every time it is called with the same URI.
A valid implementation, of course, could return the supplied function unchanged - the transient() function gives the implementation the freedom to relax the rules, but does not require it to do so.
transient() seems like a more friendly name than nondeterministic().
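One way to picture the proposal is to treat determinism as memoization, which is an assumption of this sketch rather than anything the proposal requires. Here a "deterministic" function is one wrapped in a cache, and transient() simply hands back the uncached function when there is one. All names other than transient are hypothetical.

```python
import functools
import itertools

def deterministic(fn):
    """Model determinism as caching: repeated calls return the first result."""
    return functools.lru_cache(maxsize=None)(fn)

def transient(fn):
    """Return the uncached function if there is one, else fn unchanged
    (returning the input unchanged is explicitly allowed by the proposal)."""
    return getattr(fn, "__wrapped__", fn)

_ticks = itertools.count(1)

@deterministic
def current_tick() -> int:
    # Stand-in for current-dateTime(): the underlying source keeps moving.
    return next(_ticks)

print(current_tick() == current_tick())   # stable within the "execution scope"
fresh = transient(current_tick)
print(fresh() != fresh())                 # free to differ on every call
```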
Pull request #1182 created #created-1182
1171 Change predicate callbacks to allow empty return value
Return type changes from xs:boolean to xs:boolean?
Pull request #1181 created #created-1181
296 Allow default-namespace=##any
Allows the default namespace for elements and types to have the special value "##any", which causes unprefixed QNames to match elements in any namespace. Use cases include:
- Casual ad-hoc XPath queries, where over-retrieval isn't a problem
- Use with HTML, where it can be unpredictable whether elements will be in a namespace, and where users are accustomed to browser behaviour with its "wilful violation" of the XPath 1.0 specification
- Any environment where multiple namespaces are in use for variants of what is essentially the same vocabulary of element names (for example, where the XML designer has made the mistake of versioning the namespace URI)
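Python's ElementTree offers a comparable wildcard, which gives a feel for the effect on the HTML use case. This is an analogy only: the "{*}name" form (Python 3.8 and later) is ElementTree's own syntax and says nothing about how ##any would be specified.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<root xmlns:h="http://www.w3.org/1999/xhtml">'
    "<p>no namespace</p><h:p>XHTML namespace</h:p>"
    "</root>"
)

# An unprefixed name matches only the no-namespace element...
plain = doc.findall("p")
# ...whereas the wildcard form matches p in any (or no) namespace.
any_namespace = doc.findall("{*}p")

print(len(plain), len(any_namespace))
```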
Issue #161 closed #closed-161
Support unbounded variadic functions on sequence parameters
Issue #1137 closed #closed-1137
161 Variadic functions
Issue #1180 created #created-1180
fn:unparsed-text: `cache` option
In #1117, we discussed the pros and cons of a deterministic option for fn:unparsed-text.
I would suggest making the function strictly nondeterministic, but adding a cache option:
- That way we don’t equate determinism with caching, which can in fact be different things. For example, an implementation might decide to do deterministic things at compile time, whereas it might be better to cache data only when it’s actually requested.
- The meaning of the option will be easier to explain to non-experts (…and avoid confusion among experts).
QT4 CG meeting 075 draft minutes #minutes-04-30
Draft minutes published.
Issue #1086 closed #closed-1086
array:values spec cleanup
Issue #1087 closed #closed-1087
1086 Editorial changes to array:values
Issue #1166 closed #closed-1166
Invalid option keys: the rule is unclear
Issue #1168 closed #closed-1168
1166 Clarify rule on invalid option keys
Issue #1173 closed #closed-1173
array:build, map:build: Positional access
Issue #1174 closed #closed-1174
1173 array:build, map:build: Positional access
Issue #1177 closed #closed-1177
1162 Positional variables are xs:integer not xs:positiveInteger
Issue #553 closed #closed-553
New function fn:substitute()
Issue #1179 created #created-1179
Editorial: `array:values`, `map:values`
Triggered by #1087:
- Align map:values with array:values
- Revise notes.
- Rename the functions.
@michaelhkay You suggested that content might be a better name – rather than item – for retrieving the sequence-concatenation of values in map lookups in https://github.com/qt4cg/qtspecs/issues/1169#issuecomment-2074378446. Should we rename the functions to array:contents, map:contents or use array:items and map:items?
Issue #1178 closed #closed-1178
1146 Add inline change markup in the XPath/XQuery spec
Pull request #1178 created #created-1178
1146 Add inline change markup in the XPath/XQuery spec
There is more work to be done to ensure the change log entries are complete, but this is a good start.
Issue #1159 closed #closed-1159
Filter operator for arrays
Pull request #1177 created #created-1177
1162 Positional variables are xs:integer not xs:positiveInteger
Reverts a change that made the type xs:positiveInteger as the impact of the change was not fully explored, and the same change was not made elsewhere e.g. to the argument type of array:get() or the return type of fn:position().
Note: the use of xs:positiveInteger has been retained for row and column numbers in the CSV functions, and for codepoints in the fn:char() function.
Issue #1176 created #created-1176
Use fn:parse-uri to check whether a filepath is relative or absolute
I have a question about the new function fn:parse-uri(). A common use case is to check whether a file path is absolute or relative. For example, I want to check whether the file path images/img1.png is relative and can therefore be converted to an absolute file path using resolve-uri(). Or I want to check whether $base is absolute and can therefore be used as the second argument in resolve-uri().
How would I use a uri-structure-record map determined as a result of fn:parse-uri() to decide whether it is a relative or absolute file path?
Greetings, Frank
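By way of analogy only, the same check in Python's urllib.parse comes down to looking at the scheme component of the parsed result. The sketch does not model fn:parse-uri or its record; is_absolute is a hypothetical helper, and inputs such as Windows drive paths would need extra care.

```python
from urllib.parse import urljoin, urlsplit

def is_absolute(uri: str) -> bool:
    """Treat a reference as absolute when it carries a scheme."""
    return urlsplit(uri).scheme != ""

base = "http://example.org/docs/"
path = "images/img1.png"

print(is_absolute(path), is_absolute(base))
if not is_absolute(path) and is_absolute(base):
    print(urljoin(base, path))
```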
Issue #1175 created #created-1175
XPath: Optional parameters in the definition of an inline function
This is a proposal to extend the definition of an inline-function item with the ability to specify a set of optional/keyword-value parameters, following the sequence of positional parameters of the function.
This is very similar to what we already have for static function definitions: https://qt4cg.org/specifications/xquery-40/xpath-40.html#dt-function-definition and https://qt4cg.org/specifications/xquery-40/xpath-40.html#id-static-functions
While a static function definition has the following parts:
- The function name, which is an expanded QName.
- A (possibly empty) list of required parameters, each having:
  - a parameter name (an expanded QName)
  - a required type (a sequence type)
- A (possibly empty) list of optional parameters, each having:
  - a parameter name (an expanded QName)
  - a required type (a sequence type)
  - a default value expression (an expression: see 4 Expressions)
- A return type (a sequence type)
- A (possibly empty) set of function annotations
- A body. The function body contains the logic that enables the function result to be computed from the supplied arguments and information in the static and dynamic context.
For an inline function definition we will have:
- A name of a variable to contain the function item being defined.
- A (possibly empty) list of required parameters, each having:
  - a parameter name (an expanded QName)
  - an optional type (a sequence type)
- A (possibly empty) list of optional parameters, each having:
  - a parameter name (an expanded QName)
  - an optional type (a sequence type)
  - a default value expression (an expression: see 4 Expressions)
- An optional return type (a sequence type)
- A body. The function body contains the logic that enables the function result to be computed from the supplied arguments and information in the static and dynamic context.
What is accomplished by introducing optional parameters?
The answer is the same as for the effect of having optional parameters in a static function definition: increased brevity, conciseness and clarity.
let $myFun := fn($pos1, $pos2, $posK, $kw1 := expr1, $kw2 := expr2, ..., $kwN := exprN) { (: Some expression :)}
replaces what would otherwise be a set of N! + 1 separate inline function definitions, each of which must be assigned to a separate variable.
Similarly to the static function calls, with this new feature a call to such an inline function must provide values for all positional arguments, followed by an optional set (meaning in any order) of assignments of values to specific keyword-valued (optional) arguments. The rules for an inline function call are similar to those for a call to a static function - the provided values for the positional arguments must precede all other provided values and the values for the optional arguments may be provided in any order.
Here is a short example of an inline function definition and calling it:
let $incr := fn($arg1, $increment := 1) {$arg1 + $increment }
return
(
$incr(5),
$incr(5, increment := 2),
$incr(5, increment := 3)
)
produces:
6, 7, 8
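Readers familiar with Python will recognise the pattern, since default-valued keyword parameters work the same way there. The following is only an analogy to the proposed syntax, with the names taken from the example above.

```python
def incr(arg1, increment=1):
    # increment is optional; callers may pass it by keyword.
    return arg1 + increment

results = [
    incr(5),
    incr(5, increment=2),
    incr(5, increment=3),
]
print(results)
```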
Pull request #1174 created #created-1174
1173 array:build, map:build: Positional access
Issue: #1173
Issue #1110 closed #closed-1110
New error codes
Issue #1173 created #created-1173
array:build, map:build: Positional access
array:build seems to be the only function (for iterating over ordered input) for which the HOF parameter lacks a positional parameter:
array:build(
$input as item()*,
$action as function(item()) as item()* := fn:identity#1
) as array(*)
(: should be :)
array:build(
$input as item()*,
$action as function(item(), xs:integer) as item()* := fn:identity#1
) as array(*)
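The effect of the extra argument can be modelled in Python, with a list standing in for the array and each action result becoming one member. array_build here is a hypothetical model, not the specified function, and positions are taken to be 1-based as elsewhere in the language.

```python
from typing import Any, Callable, Iterable, List

def array_build(
    items: Iterable[Any],
    action: Callable[[Any, int], Any] = lambda item, position: item,
) -> List[Any]:
    """Call action(item, position) for each item; collect results as members."""
    return [action(item, position) for position, item in enumerate(items, start=1)]

print(array_build(["a", "b", "c"]))
print(array_build(["a", "b", "c"], lambda item, position: (position, item)))
```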
Issue #1172 created #created-1172
Iterating maps: Positional access
With the new filter expression for maps, the context position is available. To be consistent, if we want to provide positional access for unordered data, we should…
- either provide this for map functions as well (map:filter, map:for-each), or
- don’t provide it for map predicates at all.
Issue #1171 created #created-1171
Predicates returning xs:boolean vs. xs:boolean?
In 4.13.4 Filter Expressions for Maps and Arrays, the type of the predicate expression FILTER is xs:boolean?.
To be consistent, we should either relax the types of the predicate/filter functions (for fn:filter, etc.) or stick with xs:boolean.
Issue #1170 created #created-1170
Editorial: fn:index-where; parentheses; …
1. fn:index-where, array:index-where
Returns the position in an input ...
→ positions
2. Redundant parens in function signatures
as (function(xs:anyAtomicType, item()*) as xs:boolean)
...
Issue #1169 created #created-1169
Maps & Arrays: Consistency & Terminology
After the introduction of #1094 and #1159, and before adding more map/array operations, I think it’s time to get more serious about consistency and terminology. The current drafts employ a variety of terms that are not clearly defined, or separated from each other. We now have at least…
items, members, pairs, keys, values, entries
…which are sometimes used for maps, for arrays, or for both data structures. A first attempt to clean up, with reducing the overall effort:
- A minor one: The modifier for lookups should be in singular form, analogous to node axes: item, key, value, pair.
- While I first advocated the orthogonality principle for axes in lookup expressions, I now think we should stick to the existing terminology. Otherwise, we would need to revise many other existing parts of the spec. My suggestion would be to:
  - introduce member for arrays
  - only allow key, value and pair for maps
  - allow items for both maps and arrays
  This would make it symmetric with a) the current terminology for maps and arrays, and b) enhanced for clauses, i.e. for member $m and for key $k value $v.
  The reverse approach would be to drop for member $m and to also allow for key $k value $v for arrays (with for value replacing for member). In addition, we could have for pair.
- With the introduction of the item axis, map:values and array:values should be renamed to map:items and array:items. → map:contents and array:contents, see #1179
- I would suggest dropping array:members and array:of-members. The names don’t imply we’ll deal with records, and it’s not in line with for member $m either. If we want to keep these functions, we could rename them to array:pairs and array:of-pairs and add the integer positions as keys, and we should introduce and consistently use the term pair for maps and arrays.
Closely related: #826
Issue #1125 closed #closed-1125
1094 Enhanced lookup expressions
Issue #1094 closed #closed-1094
Axis steps in lookup expressions
QT4 CG meeting 074 draft minutes #minutes-04-23
Draft minutes published.
Issue #1135 closed #closed-1135
Definition of focus functions
Issue #1157 closed #closed-1157
1135 Correction to definition of focus functions
Issue #1163 closed #closed-1163
1159 Add filter expressions for maps and arrays
Issue #235 closed #closed-235
Add multiple=true() option to fn:parse-json and fn:json-doc
Issue #1155 closed #closed-1155
Glossary formatting
Issue #1164 closed #closed-1164
1155 Consistency of glossaries
Pull request #1168 created #created-1168
1166 Clarify rule on invalid option keys
An error is raised for an option key unless (a) it is listed in the specification, or (b) it is recognized by the implementation, or (c) it is a QName with a non-absent namespace.
Also clarified the rule about accepting an array in place of a sequence (I'm not sure whether this is something that we actually do, and certainly it doesn't happen unless the parameter explicitly allows an array.)
Fix #1166
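The three-part rule reads naturally as a validation loop. The Python sketch below is an illustration under assumptions: QName, the two option sets and check_option_keys are hypothetical, and TypeError stands in for the type error [err:FORG0013] quoted in #1166.

```python
from typing import Any, Dict, NamedTuple, Optional

class QName(NamedTuple):
    namespace: Optional[str]  # None models an absent namespace
    local: str

SPEC_OPTIONS = {"liberal", "duplicates"}   # (a) listed in the specification
IMPL_OPTIONS = {"debug"}                   # (b) recognized by this implementation

def check_option_keys(options: Dict[Any, Any]) -> None:
    """Raise unless every key satisfies (a), (b) or (c)."""
    for key in options:
        if key in SPEC_OPTIONS or key in IMPL_OPTIONS:
            continue
        if isinstance(key, QName) and key.namespace:
            continue  # (c) a QName with a non-absent namespace
        raise TypeError(f"FORG0013: unrecognized option key {key!r}")

check_option_keys({"liberal": True, QName("http://example.com/ns", "x"): 1})
print("accepted")
```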
Issue #1167 created #created-1167
Merge $collation into $options parameter of fn:deep-equal()
To avoid the ugly third parameter to deep-equal which will almost always be set to (), merge $collation into the $options parameter, whose type becomes (map(*) | xs:string)? for backwards compatibility.
The same idea is being applied to unparsed-text and can probably be done elsewhere.
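A processor would presumably normalize the merged parameter before use. A minimal Python sketch of that step follows, assuming a "collation" key inside the options map; normalize_options is hypothetical, and None stands in for the empty sequence.

```python
from typing import Any, Dict, Optional, Union

def normalize_options(options: Optional[Union[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Accept (), a collation URI string, or an options map."""
    if options is None:
        return {}
    if isinstance(options, str):
        # Backwards compatibility: a bare string is the collation.
        return {"collation": options}
    if isinstance(options, dict):
        return dict(options)
    raise TypeError(f"expected a string or a map, got {type(options).__name__}")

print(normalize_options("http://www.w3.org/2005/xpath-functions/collation/codepoint"))
print(normalize_options({"ordered": False}))
```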
Issue #1166 created #created-1166
Invalid option keys: the rule is unclear
PR #1059 introduced the rule:
If an option is not described in the specification, if it is not supported by the implementation and if its name is in no namespace, a type error [err:FORG0013] must be raised.
and this has proved its worth in finding quite a few errors in the test suite!
However, it's not entirely clear what it means.
Entries in the options map can have keys of any type. I suspect this rule is intended to apply to (a) keys of type xs:string, and (b) strings of type xs:QName where the namespace URI is absent.
Alternatively, perhaps it should apply to ALL option keys other than a QName in a non-null namespace?
QT4 CG meeting 074 draft agenda #agenda-04-23
Draft agenda published.
Issue #1165 created #created-1165
[Editorial] References to numeric codepoints in prose: consistency
A quick glance in F+O finds:
\n (newline, x0A)
π (x3C0)
the glyph ≂̸ which is expressed using the two codepoints #x2242 #x0338
A format token of ١ (Arabic-Indic digit one)
the format token ① (circled digit one, ①)
the actual Unicode character COMBINING DIARESIS (Unicode codepoint U+0308) or ̈
The Latin small letter dotless i (ı, U+0131, used in Turkish)
the Unicode replacement character (U+FFFD)
CRLF (U+000D, U+000A), LF (U+000A), or CR (U+000D)
comma "," (U+002C)
the Unicode quotation mark " (U+0022)
a single newline (U+000A) character
I feel we could do better...
Issue #700 closed #closed-700
Operators for array mapping and filtering
Pull request #1164 created #created-1164
1155 Consistency of glossaries
Use a common style for all glossaries.
Add a glossary to F+O.
Fix #1155
Pull request #1163 created #created-1163
1159 Add filter expressions for maps and arrays
Issue #1162 created #created-1162
Revert strict type for positional variables (xs:integer → xs:positiveInteger)
I feel that the decision to change xs:integer to xs:positiveInteger was a bit hasty (https://github.com/qt4cg/qtspecs/pull/1131#issuecomment-2051379262):
To be consistent, numerous other expressions and functions would need to be rewritten as well to use stricter types (arbitrary examples: the count clause; $err:line-number in the catch clause; the result type of fn:string-to-codepoints; positions in array:get, fn:parse-integer, etc.; the position parameter in HOF functions, and so on and on). We haven’t done so yet, and I seriously wonder what exactly we would win from the stricter types. In many cases, it would be reasonable to also define a strict upper limit, which is not possible with our types anyway.
Implementations may build heavily on the fact that xs:integer has been the default type for integer values in previous versions of the languages. For example, we use cached instances for the most small integer values, or we rewrite constructs with xs:integer to other constructs accepting the same type.
For all these challenges, as always, technical solutions exist, but the question is if there aren’t more interesting things to focus on than on such corner cases. Queries like count(1 to 1000000000000) are not supported by all implementations either although they may appear trivial to the ordinary user (by coincidence, it’s supported by BaseX, but I doubt it has been used a lot).
In short, I would like us to revert the change in https://github.com/qt4cg/qtspecs/pull/1131/commits/bba6e4f1067e0ef0779688622a58320a5298d440 and stick with xs:integer. If people feel bad about it, I would suggest discussing strict types in a much broader and general way.
Issue #1161 created #created-1161
More changes to drop the requirement for document-uri() uniqueness
Issue #898 was about dropping this constraint that document-uri()s had to be unique and PR #905 was adopted to resolve it. However, I see that the XPath specification still contains the following note:
Note:
This means that given a document node $N, the result of fn:doc(fn:document-uri($N)) is $N will always be true, unless fn:document-uri($N) is an empty sequence.
I don't believe that applies any longer, so it should be removed.
It's possible that we need to finesse the description of available documents as well. The current description was clearly written from the perspective that document URIs would be unique and there'd be a 1:1 mapping from URIs to documents.
QT4 CG meeting 073 draft minutes #minutes-04-16
Draft minutes published.
Issue #1160 created #created-1160
fn:is-collation-available
The new function fn:collation raises an error [err:FOCH0002] in the case when the requested collation is not supported. Or, if the fallback key's value is true(), then the implementation chooses "the most similar supported collation" - which could be perceived as arbitrary and unexpected by the code developer.
This might be OK if the language has try/catch capabilities and fallback="no" is specified, but may not be the best outcome in a pure XPath evaluation.
A solution to this problem is to provide a function fn:is-collation-available that accepts the same argument ($options map) as fn:collation, and also could accept a string argument whose value is the URI of the collation. This function produces a boolean: true() meaning that the collation is available and can be constructed and used, and false() otherwise.
Signature
fn:is-collation-available( $descriptor as xs:string | map(*) ) as xs:boolean
Issue #1140 closed #closed-1140
Use $target instead of $search for indexing functions
Issue #1141 closed #closed-1141
1140 Replace 'search' with 'target' for indexing functions
Issue #1147 closed #closed-1147
QT4CG-072-01 Clarify schema type terminology
Issue #1142 closed #closed-1142
fn:deep-equal: items-equal
Issue #1150 closed #closed-1150
1142 Drop restriction disallowing items-equal with unordered
Issue #1138 closed #closed-1138
format-number arguments
Issue #1151 closed #closed-1151
1138 Merge format and format-name params of format-number
Issue #1152 closed #closed-1152
1146 Inline change log
Issue #115 closed #closed-115
Lookup operator on arrays of maps
Issue #298 closed #closed-298
Abstract supertype for map and array
Issue #397 closed #closed-397
Type names
Issue #836 closed #closed-836
Add support for CSV 'dialect' features covered by the OKFN's Frictionless Data CSV spec in `fn:parse-csv` and related functions
Issue #1115 closed #closed-1115
XSLT - ability to call a function from xslt (not just xpath)
Issue #1154 closed #closed-1154
[xsl:item-type] error in sample
Issue #1156 closed #closed-1156
Fix error in XSLT example
Issue #1084 closed #closed-1084
Incorrect rendition of option defaults
Issue #1149 closed #closed-1149
1084 Add fos:default-description to support prose descriptions of defaults
Issue #1159 created #created-1159
Filter operator for arrays
I propose to provide ?[...] as a filter operator for arrays.
For example,
let $array := [(1,2,3), (4,5,6,7)]
return $array?[count(.) = 4]
returns
[(4,5,6,7)]
I propose that the operator should work exactly like the familiar [] for sequences in its handling of numeric and boolean predicate values. So for example $array?[2,1] in the above example returns [(4,5,6,7), (1,2,3)]. The result is always an array (which may be a little surprising). This means that $array?[3] has the same effect as [$array?3] or [$array(3)].
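The boolean and single-number cases can be modelled in Python with a list of tuples standing in for the array. This sketch deliberately leaves out predicates that return several numbers, such as ?[2,1], and array_filter is a hypothetical name.

```python
from typing import Any, Callable, List, Tuple, Union

Member = Tuple[Any, ...]

def array_filter(
    array: List[Member],
    predicate: Callable[[Member, int], Union[bool, int]],
) -> List[Member]:
    """Keep members whose predicate is true, or whose position equals
    the number the predicate returns. The result is always an array."""
    kept = []
    for position, member in enumerate(array, start=1):
        value = predicate(member, position)
        # bool must be tested first because bool is a subclass of int.
        keep = value if isinstance(value, bool) else value == position
        if keep:
            kept.append(member)
    return kept

array = [(1, 2, 3), (4, 5, 6, 7)]
print(array_filter(array, lambda member, position: len(member) == 4))
print(array_filter(array, lambda member, position: 2))
```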
Issue #1158 created #created-1158
Simple mapping operator for arrays
I propose to provide !! as a simple mapping operator for arrays.
For example [(1,2,3), (4,5,6)]!!count(.) returns [3, 3].
The expression on the LHS must be an array.
The expression on the RHS is evaluated once for every member of the array, with that member as the context value, with the context position set to the position of that member in the array, and with the context size set to the array size.
The result is returned as an array which will always be the same size as the input array.
Note in passing that this provides a solution (though perhaps a clumsy solution) to issue #755, in that the example expression
(0 to 4) ~ count(.)
can now be written as [(0 to 4)]!!count(.)?*
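A Python model of the proposed operator, with a list of tuples standing in for the array, makes the focus explicit: the right-hand side sees the member, its position and the array size. array_map is a hypothetical name, and the final index on the last line plays the role of ?* only for this single-member case.

```python
from typing import Any, Callable, List, Tuple

Member = Tuple[Any, ...]

def array_map(array: List[Member], expr: Callable[[Member, int, int], Any]) -> List[Any]:
    """Evaluate expr once per member; the result has the same size as the input."""
    size = len(array)
    return [expr(member, position, size) for position, member in enumerate(array, start=1)]

count = lambda member, position, size: len(member)

print(array_map([(1, 2, 3), (4, 5, 6)], count))
# The #755 example: wrap the sequence 0 to 4 as a single member, then unwrap.
print(array_map([tuple(range(0, 5))], count)[0])
```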
Pull request #1157 created #created-1157
1135 Correction to definition of focus functions
Fix #1135
Pull request #1156 created #created-1156
Fix error in XSLT example
Fix #1154
QT4 CG meeting 073 draft agenda #agenda-04-16
Draft agenda published.
Issue #1155 created #created-1155
Glossary formatting
The format of the glossary for the data model spec differs needlessly from the other specifications. (Note, linking from the glossary entry to the place where the term is defined seems useful.)
Issue #1154 created #created-1154
[xsl:item-type] error in sample
Here: https://qt4cg.org/specifications/xslt-40/Overview-diff.html#named-item-types
First sample is
<xsl:item-type name="cx:complex" as="record(r as xs:double, i as xs:double)"/>
<xsl:variable name="i" as="cx:complex" select="cx:number(0, 1)"/>
xsl:variable/@select should probably be cx:complex(0, 1) instead of cx:number
Issue #1153 created #created-1153
XSLT: debugging template rule selection
The biggest headache when debugging XSLT stylesheets is working out which template rules have been invoked in response to an xsl:apply-templates instruction (I'm hitting this frustration right now with the qtspecs build stylesheets...). The xsl:message instruction is unhelpful here, because if the "wrong" template rule is firing, you don't know where to add the message. And the only other standardised debugging aids are fn:trace() and xsl:assert, which don't help either.
I propose an attribute on xsl:apply-templates, xsl:apply-imports, and xsl:next-match: trace=yes|no. If enabled, execution of the instruction causes a message to be output (as if by xsl:message) identifying the rule that is invoked, in an implementation-defined way. In the case that a built-in template rule is invoked, the message should indicate this, and any implicit apply-templates performed by the built-in rule should be evaluated as if it specified trace="yes". It is "recommended" that the message output should identify the stylesheet module, line number, match pattern, and mode, if the information is available, and should also include a representation of the item that is being processed by the instruction, for example the node kind and name.
Pull request #1152 created #created-1152
1146 Inline change log
This is a first cut at changes to introduce an inline change log - changes shown at the start of each affected section, with a flag in the TOC to indicate which sections have changed.
It is currently applied, for demonstration purposes, to changes made in the serialization spec.
More specifically:
- Added the changes and change elements to the DTD; changes is an optional element that follows head within any section
- Changed the XSLT stylesheets and CSS to render the changes element, and to add a flag to the TOC entry if a changes element is present
- Added specimen changes elements to the serialization spec
There's a lot more to be done:
- Generate an aggregated list of changes in an appendix
- Improve the CSS rendition
- Toggle change markings on and off; browse forward and backward through changed sections
- Add the changes data to the other specs
Pull request #1151 created #created-1151
1138 Merge format and format-name params of format-number
close #1138
Note, the proposal could do with further editorial work to use standard options markup to define the options available.
Pull request #1150 created #created-1150
1142 Drop restriction disallowing items-equal with unordered
Allows the use of an items-equal callback even when comparisons are unordered, despite the fact that this may have atrocious performance.
close #1142
Pull request #1149 created #created-1149
1084 Add fos:default-description to support prose descriptions of defaults
Close #1084
This won't render correctly in the PR, but hopefully the diff is clear enough to decide if this is the approach we want to take.
Pull request #1148 created #created-1148
1143 Coercion rules: handle choice types before atomization
Fix #1143
Pull request #1147 created #created-1147
QT4CG-072-01 Clarify schema type terminology
Responding to an action from the review of PR #1132, this editorial PR attempts to improve the definitions and usage of terms such as "schema type", "atomic type", "pure union type", "generalized atomic type".
Issue #796 closed #closed-796
allow explicit type expressions in XPath variable bindings
Issue #1131 closed #closed-1131
796,231 - Extend XPath for and let expressions
Issue #1146 created #created-1146
Identifying 4.0 Changes
The list of changes in an appendix is (a) difficult to maintain (with a tendency to cause Git conflicts) and (b) remote from the places in the spec where the changes actually arise. At the same time, automated diff markup tends to give a lot of unwanted detail, highlighting changes that are purely editorial.
I propose that we try out an alternative approach. Each section/subsection with significant changes should start with an info box listing the changes, headed "Changes in 4.0". This should be rendered with a distinct colour or border to make it recognisable, and it should be possible to toggle whether the changes are shown or hidden. Changes that represent an incompatibility should be specially marked, perhaps with a device such as a warning triangle. A Δ marker (or colour highlighting) could appear in the table of contents against any section that has a changes entry.
Internally, the changes should be identified with custom markup: I suggest an optional <changes> element immediately after <head>, with a sequence of <change> children, each of which should contain administrative metadata (such as a link to the issue and/or PR) as well as user-readable text.
For changes to F+O functions, corresponding elements should be added to the FOS catalog schema; this should replace (or generate) the current "History" section.
Issue #1145 closed #closed-1145
Array Decomposition
Issue #1144 closed #closed-1144
Sequence Decomposition
Issue #1145 created #created-1145
Array Decomposition
This proposal allows arrays to be decomposed and assigned to separate variables in a single declaration within a for or let expression binding.
Given an array such as [1, 2, 3], the values within that array cannot easily be extracted. With the current version of XPath and XQuery, they need to be assigned to a temporary variable first. For example:
let $result := get-camera-point()
let $x := $result?(1)
let $y := $result?(2)
let $z := $result?(3)
return "(" || $x || "," || $y || "," || $z || ")"
This proposal would allow this to be written more concisely as:
let [$x, $y, $z] := get-camera-point()
return "(" || $x || "," || $y || "," || $z || ")"
These are equivalent in this proposal, except that $result is not a statically known variable binding in the array decomposition let clause.
Note: The older syntax in XPath-NG was:
let $[x, y, z] := get-camera-point() return "(" || $x || "," || $y || "," || $z || ")"
For each variable declaration in the array decomposition at index N, with $expr being the result of the for/let expression, $expr?(N) is the value bound to the variable declaration as a new variable binding. If the value does not exist, an err:FOAY0001 (array index out of bounds) error will be raised.
An array decomposition can be used in any for or let clause binding to decompose the items in an array. If the type of the for or let clause binding expression is not an array, an err:XPTY0004 error is raised.
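The binding rule above can be pictured with a small Python model. This is only a sketch of the proposed semantics, not XQuery: the helper name decompose_array is hypothetical, a Python list stands in for an XPath array, IndexError stands in for err:FOAY0001, and silently ignoring surplus array members is an assumption that the proposal does not spell out.

```python
def decompose_array(names, array):
    """Bind names[N-1] to the N-th member of the array, modelling $expr?(N)."""
    bindings = {}
    for position, name in enumerate(names, start=1):
        if position > len(array):
            # Stand-in for err:FOAY0001 (array index out of bounds).
            raise IndexError(f"no member at position {position} for ${name}")
        bindings[name] = array[position - 1]
    return bindings


# Models: let [$x, $y, $z] := get-camera-point() return "(" || $x || ...
point = decompose_array(["x", "y", "z"], [1, 2, 3])
print("(" + ",".join(str(point[n]) for n in ("x", "y", "z")) + ")")
```

Unlike the sequence form, a missing position is an error here rather than an empty binding.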
Assigning the rest of an array
It can be useful to only extract part of an array (e.g. the heading of a table), and store the rest of the items in another variable. For example:
let $(heading as array(xs:string), rows as array(xs:string)...) :=
load-csv("test.csv")
If there are no items remaining in the array the result is an empty array.
Influences
Tuple decomposition is found in various languages such as Python, Scala, and C#. These languages also have support for tuple types.
Python has support for specifying that a variable is assigned the remaining values in the tuple.
Use Cases
There are many cases where fixed size sequences may be used such as points, complex and rational numbers, sin/cos, and mul/div. This makes extracting data from these simpler, and may also be used to aid readability by assigning descriptive names to each of the items in the sequence.
Examples
Extracting values from an array:
declare function sincos($angle as xs:double?) {
[ math:sin($angle), math:cos($angle) ]
};
let $angle := math:pi()
let [$sin, $cos] := sincos($angle)
return $sin || "," || $cos
Issue #1144 created #created-1144
Sequence Decomposition
This proposal allows sequences to be decomposed and assigned to separate variables in a single declaration within a for or let expression binding.
Given a sequence such as (1, 2, 3), the values within that sequence cannot easily be extracted. With the current version of XPath and XQuery, they need to be assigned to a temporary variable first. For example:
let $result := get-camera-point()
let $x := $result[1]
let $y := $result[2]
let $z := $result[3]
return "(" || $x || "," || $y || "," || $z || ")"
This proposal would allow this to be written more concisely as:
let ($x, $y, $z) := get-camera-point()
return "(" || $x || "," || $y || "," || $z || ")"
These are equivalent in this proposal, except that $result is not a statically known variable binding in the sequence decomposition let clause.
Note: The older syntax in XPath-NG was:
let $(x, y, z) := get-camera-point() return "(" || $x || "," || $y || "," || $z || ")"
For each variable declaration in the sequence decomposition at index N, with $expr being the result of the for/let expression, $expr[N] is the value bound to the variable declaration as a new variable binding. If the value does not exist, an empty sequence is bound to the variable.
A sequence decomposition can be used in any for or let clause binding to decompose the items in a sequence. If the type of the for or let clause binding expression is not a sequence, an err:XPTY0004 error is raised.
Assigning the rest of a sequence
It can be useful to only extract part of a sequence or array (e.g. the heading of a table), and store the rest of the items in another variable. For example:
let $(heading, rows ...) := fn:parse-csv("test.csv")
If there are no items remaining in the sequence the result is an empty sequence.
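As a rough illustration of the sequence rules (missing positions bind the empty sequence, and a trailing "rest" variable collects whatever is left), here is a Python model. It is a sketch under assumptions: decompose_sequence and its rest parameter are hypothetical names, and every XPath sequence, including a single item, is modelled as a Python list.

```python
def decompose_sequence(names, sequence, rest=None):
    """Model $expr[N] bindings; an optional rest variable takes the remainder."""
    items = list(sequence)
    # Slicing yields [] for a missing position, modelling the empty sequence.
    bindings = {name: items[i:i + 1] for i, name in enumerate(names)}
    if rest is not None:
        bindings[rest] = items[len(names):]
    return bindings


# Models: let ($x, $y, $z) := (1, 2)  -- $z is bound to the empty sequence.
print(decompose_sequence(["x", "y", "z"], [1, 2]))

# Models: let $(heading, rows ...) := ...  -- rows takes the remaining items.
print(decompose_sequence(["heading"], ["h", "r1", "r2"], rest="rows"))
```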
Influences
Tuple decomposition is found in various languages such as Python, Scala, and C#. These languages also have support for tuple types.
Python has support for specifying that a variable is assigned the remaining values in the tuple.
Use Cases
There are many cases where fixed size sequences may be used such as points, complex and rational numbers, sin/cos, and mul/div. This makes extracting data from these simpler, and may also be used to aid readability by assigning descriptive names to each of the items in the sequence.
Examples
Extracting values from a sequence:
declare function sincos($angle as xs:double?) {
math:sin($angle), math:cos($angle)
};
let $angle := math:pi()
let ($sin, $cos) := sincos($angle)
return $sin || "," || $cos
Issue #983 closed #closed-983
fn:reduce (or fn:fold without initial value)
Issue #1143 created #created-1143
Coercion Rules for Choice Item Types
The proposal that we accepted for choice item types (PR #1132) invokes atomization only if the choice type is a generalised atomic type, that is, if all alternatives in the choice are atomic.
This makes it tricky to take advantage of choice types for extending existing functions in a backwards-compatible way. For example, we might want to change the second argument of fn:unparsed-text from $encoding as xs:string to $options as (xs:string | map(*)). But under the current rules, this means the supplied value of the $encoding argument will no longer be atomized.
I propose to change this by effectively promoting rule 3 to appear before rule 2. Rule 2 is the atomization rule, and rule 3 is the new rule:
If R is a [choice item type] that is not a [generalized atomic type], then the following rules are applied with R set to each of the alternatives in the choice item type, in order, until an alternative is found that does not result in a type error; a type error is raised only if all alternatives fail.
The phrase in italics is deleted.
The effect is that if the required type is (xs:string | map(*)) then we first try converting the supplied argument as if the required type were xs:string (including atomization), and if that fails we try converting it as if the required type were map(*).
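The "try each alternative in order" behaviour described above can be sketched in Python. This is a toy model rather than the real coercion rules: the Node class, as_string, as_map and coerce_to_choice are all hypothetical names, atomization is reduced to "take the node's string value", and a Python dict stands in for map(*).

```python
class Node:
    """Toy stand-in for an XML node that has a string value."""

    def __init__(self, string_value):
        self.string_value = string_value


def as_string(value):
    # Atomize first: a node contributes its string value.
    if isinstance(value, Node):
        value = value.string_value
    if isinstance(value, str):
        return value
    raise TypeError("not coercible to xs:string")


def as_map(value):
    if isinstance(value, dict):
        return value
    raise TypeError("not coercible to map(*)")


def coerce_to_choice(value, alternatives):
    """Try each alternative in order; fail only if all of them fail."""
    for coerce in alternatives:
        try:
            return coerce(value)
        except TypeError:
            continue
    raise TypeError("no alternative of the choice type accepts the value")


STRING_OR_MAP = [as_string, as_map]  # models (xs:string | map(*))
print(coerce_to_choice(Node("utf-8"), STRING_OR_MAP))
print(coerce_to_choice({"encoding": "utf-8"}, STRING_OR_MAP))
```

In this model a node passed for the old $encoding argument is still atomized to a string, which is the backwards-compatibility point the issue is making.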
Issue #231 closed #closed-231
for expression: "at" keyword
Issue #1139 closed #closed-1139
let clause: function coercion
Issue #788 closed #closed-788
New function fn:annotate()
Issue #1105 closed #closed-1105
Casting to numerical type from strings with underscores
Issue #67 closed #closed-67
Allow optional parameters and keyword arguments on map and sequence variadic functions.
Issue #132 closed #closed-132
Clarify if redirects should be followed
Issue #613 closed #closed-613
Allow "union" as synonym for "|" everywhere
Issue #666 closed #closed-666
Polyfill function implementations
Issue #713 closed #closed-713
Annotations: Editorial notes
Issue #834 closed #closed-834
Add creation function for `csv-row-record` type
Issue #1142 created #created-1142
fn:deep-equal: items-equal
The current spec says about items-equal that…
If this option is present then the ordered option MUST be true and the unordered-elements option MUST be an empty sequence.
It doesn’t say what is going to happen if ordered is false or if unordered-elements is non-empty.
My preference would be to allow all combinations; we could then do things like:
deep-equal(
(1, 2, 3),
(3.1, 2.1, 1.1),
{ 'ordered': false(), 'items-equal': fn($a, $b) { xs:integer($a) = xs:integer($b) } }
)
It may imply O(n²), but it’s very simple to formulate other XPath expressions with the same complexity, such as $huge1[. = $huge2].
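To make the O(n²) cost concrete, here is a Python sketch of an unordered comparison driven by a callback. It is not the fn:deep-equal algorithm: unordered_equal is a hypothetical helper, plain lists stand in for sequences, and the greedy matching is only correct on the assumption that the callback behaves like an equivalence relation.

```python
def unordered_equal(seq1, seq2, items_equal):
    """True if seq2 is a permutation of seq1 under the items_equal callback."""
    if len(seq1) != len(seq2):
        return False
    unmatched = list(seq2)
    for a in seq1:
        # Linear scan per item: this nested loop is the O(n^2) part.
        for i, b in enumerate(unmatched):
            if items_equal(a, b):
                del unmatched[i]
                break
        else:
            return False
    return True


def same_integer_part(a, b):
    # Models xs:integer($a) = xs:integer($b) for non-negative numbers.
    return int(a) == int(b)


print(unordered_equal([1, 2, 3], [3.1, 2.1, 1.1], same_integer_part))
```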
Pull request #1141 created #created-1141
1140 Replace 'search' with 'target' for indexing functions
Close #1140
Having created the issue, per my outstanding action, I thought I'd take a quick look to see how extensive the change would be. AFAICT (though I confess to not looking exceedingly carefully), only two functions are affected. Here, for your consideration, is a PR that resolves the issue.
Issue #1140 created #created-1140
Use $target instead of $search for indexing functions
Back in February, when we discussed array:index-of, DN observed that the argument name $search could be interpreted as performing some sort of action. The alternative $target was proposed instead as being more "noun like".
QT4 CG meeting 072 draft minutes #minutes-04-09
Draft minutes published.
Issue #1093 closed #closed-1093
1091 Add fn:collation function
Issue #1091 closed #closed-1091
Convenience function to construct a collation URI
Issue #99 closed #closed-99
Functions that determine equality of two sequences or equality of two arrays
Issue #1063 closed #closed-1063
deep-equal() - option to compare functions liberally
Issue #1120 closed #closed-1120
99v2 deep equal with callback
Issue #122 closed #closed-122
Support general union sequence types
Issue #1132 closed #closed-1132
122 Choice item types (generalizing local union types)
Issue #1112 closed #closed-1112
1110-partial New error codes
Issue #1118 closed #closed-1118
Use new map{} syntax in adaptive output method
Issue #1123 closed #closed-1123
1118 Drop the "map" keyword in adaptive serialization output
Issue #1128 closed #closed-1128
1020 Further notes on the consequences of function coercion
Issue #1133 closed #closed-1133
fn:filter why predicate as map(*)
Issue #1134 closed #closed-1134
1133 Correct map:filter callback signature
Issue #1139 created #created-1139
let clause: function coercion
@michaelhkay I should be careful (as I regularly miss changes and additions in the 4.0 drafts), but it seems that the application of the function coercion rules for typed let clauses (https://qt4cg.org/specifications/xquery-40/xquery-40.html#id-binding-rules) is not mentioned in the list of substantive changes at the end of the document. Would it be useful to add it, or does this change fall into a different category?
QT4 CG meeting 072 draft agenda #agenda-04-09
Draft agenda published.
Issue #1138 created #created-1138
format-number arguments
Given the availability of choice item types, I propose that we merge the format and format-name parameters of fn:format-number into a single parameter of type (xs:string | xs:QName | map(*)). This seems a better design when parameters are mutually exclusive and perform a related role.
Pull request #1137 created #created-1137
161 Variadic functions
Fix #161
This proposal attempts to do the minimum necessary to allow the variable-arity nature of the fn:concat function to be reproduced for other functions including user-defined functions. The idea is that fn:concat should no longer be treated as a special case.
The proposal is deliberately less ambitious than some of the ideas discussed in the referenced issue. It's generally easier to get something into the language if we take smaller steps.
For an overview see section 4.5.3 of the XQuery spec.
Issue #1136 created #created-1136
Defining names for parameters on typed function tests
When defining the type of a higher-order function parameter, you cannot currently specify the names of the parameters of that higher-order function.
Allowing this can be useful for various reasons:
- documenting the parameter names in the function signature -- this makes it clear looking at the function in an IDE, etc. what the parameters are;
- making the specs clearer by referring to the parameters by name;
- allowing a processor to provide better error messages by referring to the parameter names, e.g. when there is a type conversion error;
- allowing a user to reference the parameter by name if we enable this to resolve named keyword arguments (which is currently being discussed in #1114).
Thus, you could declare e.g. index-where like this:
declare function fn:index-where(
$input as item()*,
$predicate as function(
$item as item(),
$position as xs:integer
) as xs:boolean
) as xs:integer* {
(: ... :)
};
Issue #1135 created #created-1135
Definition of focus functions
§5.4.2.6 states
The expression function { EXPR } (or fn { EXPR }) is a syntactic shorthand for the expression function($Z as item()) as item() { $Z ! (EXPR) }, where $Z is a variable name that is otherwise unused. Note that the function body (EXPR) is evaluated with a [fixed focus]: the context position and context size will always be 1 (one).
This is no longer true since the generalization of the context item to a context value. EXPR is evaluated once with the entire sequence $Z as the context value; it is not evaluated once for each item in $Z.
We have no direct way of expressing this in the absence of a resolution to issue #755.
Pull request #1134 created #created-1134
1133 Correct map:filter callback signature
Fix #1133
Issue #1133 created #created-1133
fn:filter why predicate as map(*)
fn:filter is defined like this:
map:filter(
$map as map(*),
$predicate as function(xs:anyAtomicType, item()*) as map(*)
) as map(*)
Why does the predicate function not return an xs:boolean?
Pull request #1132 created #created-1132
122 Choice item types (generalizing local union types)
Fix #122.
Allows the new item type syntax (A | B), replacing union(A, B); the alternatives are no longer restricted to be atomic types. The choice item type is a generalized atomic type if and only if all the alternatives are generalized atomic types.
Note that #122 also proposed unions of sequence types. While that is also viable, I found that unions of item types handled pretty well all practical use cases, and it seems excessive to offer both. Unions of item types proved (a) more useful (b) easier to combine with the existing feature of local union types, and (c) easier to handle in the coercion rules. Providing both is also tricky to handle in the grammar.
Issue #1044 closed #closed-1044
CSV row delimiter - allowed values
Issue #1104 closed #closed-1104
TypeTest expressions
Pull request #1131 created #created-1131
796,231 - Extend XPath for and let expressions
Fix #796 Fix #231
Extends "for" and "let" expressions in XPath to allow a larger subset of XQuery FLWOR expression syntax. Specifically:
- Allow positional variables (at $pos)
- Allow type declarations (as type)
- Allow let/for clauses to be mixed without an intervening return.
Issue #1122 closed #closed-1122
Rendering xspecref
Issue #1130 closed #closed-1130
Fix xspecref to production
Pull request #1130 created #created-1130
Fix xspecref to production
Fix #1122
This is (apparently) the first use of an xspecref to a production.
Issue #1129 closed #closed-1129
Fix Norm's affiliation
Pull request #1129 created #created-1129
Fix Norm's affiliation
Someone preparing a QT4 status talk for a conference observed that my affiliation on the data model spec was out-of-date.
I'm just going to merge this one without any fanfare.
Pull request #1128 created #created-1128
1020 Further notes on the consequences of function coercion
Adds further notes and examples explaining the consequences of function coercion, especially when applied to maps and arrays. The new notes make it clear that test case MapTest-058 is incorrect; a map, once coerced to a function, cannot be used as a map.
Issue #1127 created #created-1127
Binary resources
We have some functions that accept binary input (parse-html, parse-csv) and others that don't (parse-xml, parse-json). There seems to be no obvious justification for the inconsistency.
Related to this:
(a) we have no functions to convert (encode/decode) between binary and string given an encoding
(b) we have no function to read a binary resource from a URI
Both of these are available in the EXPath bin library but should perhaps be promoted to the main spec.
Issue #1039 closed #closed-1039
Allow dynamic collations in XQuery "order by" and "group by"
Issue #1092 closed #closed-1092
1039 Add notes referring to fn:collation-key
Issue #1100 closed #closed-1100
99 fn:equal() function to compare sequences and arrays
Issue #1113 closed #closed-1113
Misleading rendering BiDi text in parse-integer example
Issue #1126 closed #closed-1126
1060 Minor fixes
Pull request #1126 created #created-1126
1060 Minor fixes
Whitespace, variable names, tests
Issue #1121 closed #closed-1121
1060 Formatting
Pull request #1125 created #created-1125
1094 Enhanced lookup expressions
Fix #1094
Lookup expressions (both deep and shallow) are enhanced in two ways:
(a) the syntax is extended to provide options that avoid flattening the result. For example $V?pairs::K delivers the result as a sequence of key-value pairs.
(b) a new KeySpecifier format is provided to filter the results by type. For example $V??type(record(first, last)) selects all items in the recursive content that are of type record(first, last). This replaces the previous syntax $V??*::record(first, last) which caused ambiguities with occurrence indicators.
Issue #859 closed #closed-859
Syntax problem with type-qualified wildcards in lookup expressions
Issue #1106 closed #closed-1106
859 lookup syntax problems
Issue #1124 created #created-1124
Formatting XPath/XQuery: Preferences, Conventions
In #1060, the formatting of code examples in the spec was unified. This issue is about discussing the formatting rules and (ideally) to define conventions for newly added code. If we don’t manage to define rules, the existing specs should provide enough examples for all syntactical constructs to be inspired by.
To start with, one suggestion in yesterday’s meeting was to choose a more compact presentation. Empty maps, empty arrays, and functions with an empty body are currently formatted as follows:
map { }, { },
array { }, [ ],
function { }, fn { }, fn($x) { }
We could remove the inner whitespace:
map {}, {},
array {}, [],
function {}, fn {}, fn($x) {}
Pull request #1123 created #created-1123
1118 Drop the "map" keyword in adaptive serialization output
Close #1118
Issue #1122 created #created-1122
Rendering xspecref
xspecref markup in the serialization spec is being rendered incorrectly. See for example
<xspecref spec="XP40" ref="doc-xpath40-MapConstructor"/>
in section 10, which renders as
A [map item] is serialized using the syntax of a [Section ] ^XP40 without ...
where the link to the referenced section works correctly, but the section title is not displayed.
Pull request #1121 created #created-1121
1060 Formatting
Minor editorial fixes (examples, typos). I’ll merge this after a while if no one objects.
Pull request #1120 created #created-1120
99v2 deep equal with callback
A second attempt to address issue #99
Replaces PR #1100
Fix #99 Fix #1063
In response to comments during the review of #1100, this PR abandons the proposed fn:equal() function and instead adds a callback option to fn:deep-equal. This can potentially be used to compare any pair of items (including maps and arrays) if desired.
Issue #1119 created #created-1119
Declare namespace bindings in XPath
We have dropped the proposed "with" expression, which was in the spec but never reviewed by the WG.
We need to reconsider the requirement: do we need some kind of construct to declare namespace prefixes, and perhaps other parts of the static context, in XPath?
One thought here is that when XPath expressions are issued from a host language such as Javascript or Python, the typical pattern is to have lots of small independent XPath expressions within a program. It doesn't make sense for each such expression to have its own boilerplate to establish the context, which will usually be the same for each expression. Rather it makes sense for the XPath invocation API to supply a reference to a context object which is set up once and reused. However, this doesn't mean that there's no room for XPath syntax to establish the context. For example, one might envisage a program doing
XPath engine = new XPath();
engine.setStaticContext("declare namespace abc='http://abc.uri'; pqr = 'http://pqr.uri'");
engine.evaluate("//x/@y");
so the syntax for creating the static context would be decoupled from the expression syntax.
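The "set the context up once and reuse it" pattern already exists in some host-language APIs. As an illustration only, Python's standard xml.etree.ElementTree accepts a prefix-to-URI mapping alongside each path expression; it supports a limited XPath subset rather than full XPath, and the sample document and variable names below are made up for the example.

```python
import xml.etree.ElementTree as ET

# The shared "static context": namespace bindings declared once.
NAMESPACES = {"abc": "http://abc.uri", "pqr": "http://pqr.uri"}

doc = ET.fromstring(
    '<abc:root xmlns:abc="http://abc.uri" xmlns:pqr="http://pqr.uri">'
    '<abc:x y="1"/><pqr:x y="2"/>'
    "</abc:root>"
)

# Many small, independent path expressions reuse the same bindings.
abc_values = [e.get("y") for e in doc.findall(".//abc:x", NAMESPACES)]
pqr_values = [e.get("y") for e in doc.findall(".//pqr:x", NAMESPACES)]
print(abc_values, pqr_values)
```

Here the bindings are supplied as a host-language object rather than through any XPath-level declaration syntax, which is the decoupling the issue describes.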
Issue #711 closed #closed-711
Using annotations for navigation of JSON trees
Issue #1070 closed #closed-1070
Concise syntax for map construction
Issue #1071 closed #closed-1071
1070 Bare Brace map constructor syntax
Issue #1118 created #created-1118
Use new map{} syntax in adaptive output method
Should the adaptive output method be changed to use the new bare-brace syntax when serializing maps, dropping the map keyword?
Issue #1019 closed #closed-1019
XQFO: Unknown option parameters
Issue #1059 closed #closed-1059
1019 XQFO: Unknown option parameters
Issue #1077 closed #closed-1077
Correct the status of new language features
Issue #1074 closed #closed-1074
Confirm status of provisional functions
Issue #1097 closed #closed-1097
566-partial Fix colon issue in URI parsing
Issue #1107 closed #closed-1107
Grammar discrepancy on fn:pin() examples
Issue #1060 closed #closed-1060
Formatting XPath/XQuery
Issue #1078 closed #closed-1078
1060 Formatting XPath/XQuery
Issue #1076 closed #closed-1076
1075 Drop 'with' expressions
Issue #1075 closed #closed-1075
Drop "with" expressions
Issue #1109 closed #closed-1109
Discrepancies in fn:hash() published examples
Pull request #1117 created #created-1117
1116 Add options param to unparsed-text
Reverts the change to unparsed-text and unparsed-text-lines so they no longer normalise line endings by default.
Instead an options parameter is added to select this as a non-default behaviour.
At the same time, we add an option to control whether the function is deterministic (that is, returns the same content if called repeatedly with the same URI). In 3.1 the spec stated that implementations might provide an option to do this, but did not provide an interoperable way of setting this option. For compatibility, the default is still to be deterministic.
Fix #1116
Issue #1116 created #created-1116
unparsed-text() end-of-line normalization
I'm uncomfortable with the backwards-incompatible change we have made to unparsed-text() which now normalizes line endings.
I think it's unlikely that there are many users who care about the difference between LF and CRLF line endings and want to preserve that difference; but I think it's very likely that there are users who have written application code that expects the line ending to be CRLF, where the application will break if the line ending changes.
I would be more comfortable with the change if there were an option setting to revert to the 3.1 behaviour. But my preference would be to add the option and keep the default compatible with 3.1.
Also, for users who want to treat the file as a sequence of lines, we already introduced unparsed-text-lines() so they don't have to worry about different representations of line endings.
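The preferred design (keep the 3.1 behaviour by default, normalize only on request) can be sketched in Python. This is a model, not the specified functions: read_text, read_text_lines and the normalize_newlines flag are hypothetical names, and the input is passed as a string instead of being fetched from a URI.

```python
import re


def read_text(raw, normalize_newlines=False):
    """Return the text unchanged by default; optionally map CRLF and CR to LF."""
    if normalize_newlines:
        return re.sub(r"\r\n?", "\n", raw)
    return raw


def read_text_lines(raw):
    """Split on CRLF, CR or LF, dropping the empty string after a final newline."""
    lines = re.split(r"\r\n|\r|\n", raw)
    if lines and lines[-1] == "":
        lines.pop()
    return lines


sample = "a\r\nb\rc\n"
print(repr(read_text(sample)))
print(repr(read_text(sample, normalize_newlines=True)))
print(read_text_lines(sample))
```

Code that expects CRLF keeps working with the default, while callers who only care about lines never see the line-ending representation at all.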
QT4 CG meeting 071 draft agenda #agenda-03-26
Draft agenda published.
Issue #1115 created #created-1115
XSLT - ability to call a function from xslt (not just xpath)
I don't think this is possible in 3.0 and I don't think it's been suggested yet (what's the easiest way to find out?). If it has, then close.
I often wonder in passing why I ever use <xsl:call-template.../> when functions exist. I never use the data context inside a named template; it feels dangerous.
So the only reason I do it is that I can embed literal XML elements, e.g.
<xsl:call-template name='foo'>
<xsl:with-param name='barElement' as='element(barElement)'>
<barElement/>
</xsl:with-param>
</xsl:call-template>
but if I could do
<xsl:call-function name='foo'>
<xsl:with-param name='barElement' as='element(barElement)'>
<barElement/>
</xsl:with-param>
</xsl:call-function>
then I would always use functions in preference to named templates.
motivation
- language simplification (though you'd have to keep named templates for legacy)
- you explicitly remove the data context (which I think is error prone in practice).
- if people used functions in preference to named templates then you would be able to use more of your code directly from XPath expressions.
Issue #1114 created #created-1114
Partial function application: Keywords and placeholders
The test suite contains test cases – FunctionCall-414 … FunctionCall-417 – for partially applied functions with keywords and placeholders:
<test-case name="FunctionCall-414" covers-40="keywords">
<description>Use of keyword arguments with placeholders on user-defined function</description>
<created by="Michael Kay" on="2023-03-13"/>
<modified by="Michael Kay" on="2023-12-13" change="do what the description says"/>
<dependency type="spec" value="XQ40+"/>
<test><![CDATA[
declare function local:diff ($s as xs:integer, $t as xs:integer) as xs:integer {
$s - $t
};
local:diff(s := 12, t := ?)(8)
]]></test>
<result>
<assert-eq>4</assert-eq>
</result>
</test-case>
...
I didn’t find information on this feature combination in the spec; is it already covered? If yes, is it also possible to partially apply function items with keywords?…
declare function local:f($s, $t) { $s - $t };
local:f#2(s := 12, t := ?)(8),
local:f(?, ?)(s := 12, t := ?)(8)
...
If 2x yes, I can try to add some more test cases (for example, I assume that $f(t := 12, ?) is illegal, as arguments without keywords probably need to be placed first).
Issue #1113 created #created-1113
Misleading rendering BiDi text in parse-integer example
In the function index, one example of fn:parse-integer() uses arguments containing Arabic characters. This leads to a misleading display: the 1st and 2nd arguments look as if they were switched, because the browser renders both of them from left to right. It is this example:
<fos:test>
<fos:expression><eg>translate('٢٠٢٣', '٠١٢٣٤٥٦٧٨٩', '0123456789')
=> parse-integer()</eg></fos:expression>
<fos:result>2023</fos:result>
</fos:test>
This looks confusing. I don't know what will be the best fix. Maybe storing '٠١٢٣٤٥٦٧٨٩' into variable would help and prevent the issue.
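The suggestion of moving the digit strings into variables can be illustrated in Python, going one step further by writing the Arabic-Indic digits as escapes so that the source text contains no right-to-left characters at all. The variable names are made up for the example; str.maketrans and str.translate play the role of fn:translate, and int plays the role of fn:parse-integer.

```python
# Arabic-Indic digits U+0660..U+0669, written as escapes to avoid BiDi reordering.
ARABIC_INDIC_DIGITS = "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669"
ASCII_DIGITS = "0123456789"

to_ascii = str.maketrans(ARABIC_INDIC_DIGITS, ASCII_DIGITS)

year_in_arabic_indic = "\u0662\u0660\u0662\u0663"
year = int(year_in_arabic_indic.translate(to_ascii))
print(year)
```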
Pull request #1112 created #created-1112
1110-partial New error codes
Issue: #1110.
- fn:hash: I added FOHA0001 as error code.
- fn:op: I used XPTY0004 as error code, as the allowed operators could also be defined as string enumeration.
- XQuery, Map Test: Not included in this PR.
Issue #1111 created #created-1111
xsl:pipeline
In XSLT 3.0 it is not possible to write a multi-phase streaming transformation, where two or more phases each operate in streaming mode and the result of one phase is piped into the next. Such transformations can only be written as multiple stylesheets, coordinated by some calling application.
A non-streamed multiphase transformation typically uses variables for the intermediate results:
<xsl:variable name="temp1">
<xsl:apply-templates mode="phase1"/>
</xsl:variable>
<xsl:variable name="temp2">
<xsl:apply-templates select="$temp1" mode="phase2"/>
</xsl:variable>
<xsl:apply-templates select="$temp2"/>
This cannot be streamed because variables cannot hold streamed nodes.
The idea is to allow this to be written:
<xsl:pipeline streamable="yes">
<xsl:apply-templates mode="phase1"/>
<xsl:apply-templates select="." mode="phase2"/>
<xsl:apply-templates select="." mode="phase3"/>
</xsl:pipeline>
where each instruction in the pipeline takes as its context value the result of the previous instruction.
Even when no streaming is involved, the xsl:pipeline instruction brings usability benefits: it's much clearer to the reader what is going on.
(Triggered by a support request from a user wanting to make an existing pipelined transformation streamable; but the idea was considered and "postponed to v.next" during XSLT 3.0 development. The replacement of "context item" by "context value" removes one of the obstacles.)
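The difference between holding intermediate results in variables and piping one phase into the next can be loosely illustrated with Python generators, where each phase pulls items lazily from the previous one. This is an analogy only, not XSLT streaming: the pipeline helper and the three phase functions are hypothetical, and strings stand in for streamed nodes.

```python
from functools import reduce


def phase1(items):
    for item in items:
        yield item.strip()


def phase2(items):
    for item in items:
        if item:  # drop items that became empty in phase 1
            yield item.upper()


def phase3(items):
    for number, item in enumerate(items, start=1):
        yield f"{number}:{item}"


def pipeline(source, *phases):
    """Feed the output of each phase into the next without materialising it."""
    return reduce(lambda stream, phase: phase(stream), phases, source)


result = list(pipeline(iter([" a ", "", "b"]), phase1, phase2, phase3))
print(result)
```

No phase builds a complete intermediate result; items flow through all three phases one at a time, which is the property that variables holding temporary trees cannot provide.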
Issue #1110 created #created-1110
New error codes
The XQFO spec includes various “[TODO: error code]” comments. Should we add error codes when finalizing PRs, or does a master plan exist to add them at the very end?
Issue #1109 created #created-1109
Discrepancies in fn:hash() published examples
In the third example:
hash("")
the expected result has a spurious trailing letter "o". That's trivial and I will fix it.
In the seventh example:
hash(serialize($doc), map{"algorithm": "sha-1"})
I am getting a completely different result, which I suspect is because I am getting a different result from serialize(). Perhaps the difference is something like a trailing newline, I don't know.
In my case the result of serialize($doc) is the 14-character string "<doc>abc</doc>", which I believe is correct, but I suspect there might be other results of serialize($doc) that would also be conformant with the spec.
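To illustrate how sensitive the digest is to the serializer output, here is a small Python sketch. It does not reproduce the published example value; it only shows that a single trailing newline (one of the guesses above) yields a different SHA-1 digest. The helper name `sha1_hex` is an assumption made for this illustration.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """SHA-1 digest of the UTF-8 encoding of text, as upper-case hex."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()

serialized = "<doc>abc</doc>"      # the 14-character string from the issue
with_newline = serialized + "\n"   # a possible alternative serialization

digests_differ = sha1_hex(serialized) != sha1_hex(with_newline)
```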
Pull request #1108 created #created-1108
566-partial Describe a less aggressive %-encoding for fn:build-uri
My proposal seemed to meet with general approval, so here is my attempt to implement it in the spec.
Issue #1107 created #created-1107
Grammar discrepancy on fn:pin() examples
I'm probably getting the wrong end of the stick, but I can't see how the example for fn:pin():
pin(["a","b","c"])?1 => label()?parent => array:foot()
meets the current EBNF. (I know bits of this area may be in flux, so this may be just for the record.)
pin(["a","b","c"])?1 => label()
meets the production for ArrowExpr, and the RHS of an ArrowExpr is, in this case, an ArrowStaticFunction ArgumentList pair, which doesn't encompass the subsequent lookup.
But for LookupExpr to include the ?parent requires PostfixExpr as its first term, and PostfixExpr can only include ArrowExpr via PrimaryExpr/ParenthesizedExpr, i.e. with brackets.
Issue #1052 closed #closed-1052
parse-csv() - simplify output
Pull request #1106 created #created-1106
859 lookup syntax problems
Fix the syntax ambiguity identified in issue #859 by dropping the troublesome construct.
It is hoped something else will be introduced in its place.
Fix #859
Issue #1105 created #created-1105
Casting to numerical type from strings with underscores
The Digits production now permits underscores as separators in long numerical character sequences. However in casting to numerical types, either by operator or by function:
'12_345_678' as xs:integer
number('12.345_678')
am I correct that this should fail according to Casting from xs:string and xs:untypedAtomic, even though a static resolution/rewrite would be possible?
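The distinction being asked about can be sketched with two regular expressions: one for the Digits production used in literals, and one for the lexical form accepted when casting a string to xs:integer. This is an illustrative Python sketch, not spec text; the function names are hypothetical. Python's own `int()` is deliberately not used, since it accepts underscores itself.

```python
import re

# Digits ::= DecDigit ((DecDigit | "_")* DecDigit)?   (literal grammar)
DIGITS = re.compile(r"[0-9](?:[0-9_]*[0-9])?")
# Lexical form of xs:integer used by casting: optional sign, then digits only.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")

def is_integer_literal(text: str) -> bool:
    """True if text is a valid IntegerLiteral in the expression grammar."""
    return DIGITS.fullmatch(text) is not None

def is_castable_as_integer(text: str) -> bool:
    """True if the string value can be cast to xs:integer (whitespace trimmed)."""
    return INTEGER_LEXICAL.fullmatch(text.strip(" \t\r\n")) is not None
```

Under this reading, `12_345_678` is fine as a literal in source code, but the string `'12_345_678'` is not castable.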
Issue #1104 created #created-1104
TypeTest expressions
Our current status quo text allows the result of a lookup expression to be filtered by type:
[[1,2], [3,4], 5, 6]?*::array(*)?1
and issue #859 points out that this doesn't work because of a syntax ambiguity involving occurrence indicators ('?' is both an occurrence indicator and a lookup operator).
This issue addresses that problem by re-examining the requirements, and pulling in a number of other issues at the same time.
In path expressions we have a shorthand syntax for selecting nodes, called the node test, and the proposed syntax ::array(*) was modelled on this. Node tests have a considerable overlap with types, but there are limitations. For example the self axis is often used to turn a node test into a general predicate, but [self::XX] can only be used to test elements, not attributes. However, the popularity of node tests and the self axis illustrates the need for a concise filtering operation.
Of course it's always possible to write [. instance of array(*)] but this gets extremely verbose.
In XSLT 3.0, template rules matching maps and arrays could only be written as match=".[. instance of array(*)]", which gets really ugly, so we have proposed an alternative in 4.0. Specifically, you can match any type using match="type(ItemType)", and for many types such as arrays and maps you can abbreviate this to, for example, match="array(*)". But this feels clumsy because the type() wrapper is sometimes needed and sometimes not.
I would like to propose an expression that has concise syntax, whose effect is equivalent to . instance of T. I propose to use the ~ symbol. This is available as both a binary and unary operator, so we can define a binary form $z ~ T which is syntactic shorthand for $z instance of T, and a unary form ~T which is shorthand for . ~ T.
First, in the case of lookup expressions, we can now write:
[[1,2], [3,4], 5, 6]?*[~array(*)]?1
TypeTests will often be used within predicates in this way, and of course the usage is completely general.
Here's an example used for array:filter:
array:filter($array, fn{~xs:integer+})
which selects all members of the array comprising one or more integers.
In XSLT 4.0 the syntax ~T replaces the current TypePattern, giving a much more uniform way of matching items by type.
In XPath and XSLT conditionals the construct can be used as an equivalent to XQuery's TypeswitchExpr:
<xsl:choose>
<xsl:when test="~xs:integer">...</xsl:when>
<xsl:when test="~xs:string">...</xsl:when>
...
</xsl:choose>
The choice of tilde for this operator is motivated by:
- There are not many symbols available
- Tilde has many different uses in mathematics and computing, some of which represent a boolean test applied to a value (for example testing whether it is similar to another value or whether it matches some pattern), which is not dissimilar to this proposed usage
- The alliteration between "tilde" and "type" has some mnemonic value (cf. the use of @ for the attribute axis).
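As a rough analogy for the two filtering examples in this proposal, the following Python sketch models arrays as lists and sequence-valued members as tuples. The helper `is_integer_plus` is a hypothetical stand-in for the type test ~xs:integer+, and `isinstance` plays the role of "instance of".

```python
def is_integer_plus(member: tuple) -> bool:
    """Rough analogue of ~xs:integer+ : one or more integers (booleans excluded)."""
    return len(member) >= 1 and all(
        isinstance(x, int) and not isinstance(x, bool) for x in member
    )

# Analogue of [[1,2], [3,4], 5, 6]?*[~array(*)]?1
data = [[1, 2], [3, 4], 5, 6]
arrays_only = [item for item in data if isinstance(item, list)]
first_members = [item[0] for item in arrays_only]

# Analogue of array:filter($array, fn{~xs:integer+}); members are tuples here.
members = [(1, 2), ("a",), (), (7,)]
integer_members = [m for m in members if is_integer_plus(m)]
```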
Issue #1103 created #created-1103
CSV Parsing - handling line ending normalization
During discussion of PR #1066 there was much debate about how best to handle normalization of (typically CRLF) line endings.
Perhaps it's very unlikely that CRLF line endings will make it as far as the parse-csv() function, because they will already have been normalized for example by unparsed-text(). But data can also be read in other ways, for example bin:read-binary() or sql:query() extension functions, or passed in as a string-valued parameter to a transformation.
Perhaps we should have a separate mechanism for normalizing line endings in any data, independent of CSV parsing? (But perhaps it's important to retain CRLF in quoted strings?)
Perhaps CSV parsing should normalise CRLF unconditionally, without needing to set a special option for it?
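One way to picture a standalone normalization step is the Python sketch below. It is an illustration only: the function name and the `keep_quoted` switch are assumptions, and the quote handling is a simple toggle rather than a full CSV parser.

```python
def normalize_line_endings(text: str, keep_quoted: bool = False, quote: str = '"') -> str:
    """Replace CRLF and lone CR with LF.

    With keep_quoted=True, line endings inside quoted fields are left untouched.
    """
    out = []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == quote:
            # An escaped quote ("") toggles twice, so the state stays correct.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "\r" and not (keep_quoted and in_quotes):
            out.append("\n")
            if i + 1 < len(text) and text[i + 1] == "\n":
                i += 1  # swallow the LF of a CRLF pair
        else:
            out.append(ch)
        i += 1
    return "".join(out)

sample = 'a,b\r\n"x\r\ny",c\r\n'
unconditional = normalize_line_endings(sample)
quoted_preserved = normalize_line_endings(sample, keep_quoted=True)
```

The two results differ only inside the quoted field, which is exactly the case the parenthetical question above is about.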
Issue #1101 closed #closed-1101
XQuery: Normalize line endings
Issue #1089 closed #closed-1089
Rounding when casting string to date/time or duration types
Issue #1090 closed #closed-1090
1089 Add rounding rules for casting string to duration etc
Issue #1079 closed #closed-1079
Editorial: XSLT, Applying Template Rules, Examples
Issue #1083 closed #closed-1083
1079 Change book used in example
Issue #1050 closed #closed-1050
Potential (low-risk) Ambiguities in XPath EBNF
Issue #1081 closed #closed-1081
1050 Fix ItemType grammar ambiguity
Issue #1080 closed #closed-1080
1036 Rephrase the rules for number-parser with liberal JSON
Issue #1036 closed #closed-1036
parse-json: liberal parsing
Issue #1102 closed #closed-1102
Fix broken idref to escaped-crlf in test generation
Pull request #1102 created #created-1102
Fix broken idref to escaped-crlf in test generation
It appears that escaped-crlf-3 might have been intended. @michaelhkay ?
Issue #1073 closed #closed-1073
XQFO (editorial)
Issue #757 closed #closed-757
Function families
Issue #463 closed #closed-463
fn:parts() - extract the parts of a (not-really) atomic value
Issue #448 closed #closed-448
Support extended dateTime formats of ISO-8601:2019?
Issue #283 closed #closed-283
Enumeration types
Issue #218 closed #closed-218
Function library for maps with composite keys: and thoughts on encapsulation
Issue #33 closed #closed-33
JSON Parsing & Serialization: Numbers
Issue #883 closed #closed-883
Improve return type for fn:load-xquery-module()
QT4 CG meeting 070 draft minutes #minutes-03-19
Draft minutes published.
Issue #1072 closed #closed-1072
883 Return type of load-xquery-module
Issue #1066 closed #closed-1066
1052 Simplify the results of parse-csv
Issue #1101 created #created-1101
XQuery: Normalize line endings
Various tests, such as line-ending-Q002, check whether line endings are normalized when parsing the input:
<test-case name="line-ending-Q002">
<description>Normalization of line endings in XQuery</description>
<created by="Michael Kay" on="2011-11-24"/>
<dependency type="spec" value="XQ10+"/>
<test>deep-equal(string-to-codepoints('
'), (10))</test>
<result>
<assert-true/>
</result>
</test-case>
I cannot find a corresponding note in the current XQuery 4 draft. Should we add it?
I would welcome this normalization. I assume that no one over the last decades has missed carriage return in XML?
Issue #1099 closed #closed-1099
Build fixes
Pull request #1100 created #created-1100
99 fn:equal() function to compare sequences and arrays
Fix issue #99
Introduces a function fn:equal() that compares two arbitrary values (sequences, maps, arrays, etc), with a callback for comparing "leaf" items in the structure.
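The general shape of such a function, a structural walk that delegates leaf comparison to a callback, might look like the following Python sketch. It is not the proposed specification: maps are modelled as dicts, arrays as lists, sequences as tuples, and the name `equal` and its default callback are assumptions.

```python
from typing import Any, Callable

def equal(a: Any, b: Any,
          leaf_equal: Callable[[Any, Any], bool] = lambda x, y: x == y) -> bool:
    """Compare two nested structures, using leaf_equal for the leaf items."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            equal(a[k], b[k], leaf_equal) for k in a
        )
    for kind in (list, tuple):
        if isinstance(a, kind) and isinstance(b, kind):
            return len(a) == len(b) and all(
                equal(x, y, leaf_equal) for x, y in zip(a, b)
            )
    if isinstance(a, (dict, list, tuple)) or isinstance(b, (dict, list, tuple)):
        return False  # structures of different kinds never match
    return leaf_equal(a, b)

def caseblind(x, y):
    # Hypothetical leaf comparison: case-insensitive for strings.
    return x.lower() == y.lower()

left = {"a": ["X", ("y",)]}
right = {"a": ["x", ("Y",)]}
```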
Pull request #1099 created #created-1099
Build fixes
I had a brain cramp when I wrote the build.gradle file for this repository. This PR fixes that.
It also adds a nobreak attribute to the code element. The intent, not yet implemented, is that you can say
<code nobreak="true">some long, but not unreasonably long expression</code>
and the stylesheet will prevent a line break in the middle of the expression.
Pull request #1098 created #created-1098
566-partial Editorial improvements for parse-uri
- Add a note clarifying that the fragment identifier should be (1) URI decoded and (2) ignored if it's the empty string.
- Reworked a bit of the description in order to avoid an ambiguity in how ///abc should be parsed. (The current spec can be satisfied either by parsing it as //(/abc)() or //()/abc and only the former is intended.)
Pull request #1097 created #created-1097
566-partial Fix colon issue in URI parsing
In the course of reviewing the tests for fn:parse-uri, I discovered (or perhaps more correctly, @ChristianGruen discovered) that the rules for matching Windows drive letters are inconsistent. This PR fixes that inconsistency.
Issue #1096 created #created-1096
Effect of atomization on array:index-of()
What is the expected result of the expression:
array:index-of( [[1,2], [3,4]], [3,4] )
It seems that the second argument is atomised (because its declared type is atomic), but the first argument is not.
So both members of the array have count=1, whereas $search has count=2, so nothing matches, so the result is ().
Now, what if we write:
array:index-of( [[1,2], (3,4)], [3,4] )
This time it seems that the second member of the array matches, so the result is 2.
This doesn't feel right. One solution would be to say that each member of the array is itself atomised. But that seems to lead to other surprises with other examples of nested arrays.
An alternative would be to atomize neither argument (which would mean changing the function signature). But then we would need to use a different comparison operation.
We seem to be back where we started -- I was unhappy about introducing this function because of the difficulty of defining a good comparison operation for it to use.
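The reading described in this issue can be modelled in a few lines of Python. This is only a model of the interpretation given above, not of the spec: an array is a list of members, each member is a tuple (a sequence), a nested array appears as a list inside a member, and the `atomize_members` switch is a hypothetical knob for the "atomize each member" alternative.

```python
def atomize(value):
    """Flatten arrays (lists) and sequences (tuples) into a tuple of atoms."""
    if isinstance(value, (list, tuple)):
        return tuple(atom for item in value for atom in atomize(item))
    return (value,)

def index_of(array, search, atomize_members=False):
    """Positions (1-based) of members equal to the atomized search value."""
    target = atomize(search)
    hits = []
    for position, member in enumerate(array, start=1):
        candidate = atomize(member) if atomize_members else tuple(member)
        if candidate == target:
            hits.append(position)
    return hits

nested = [([1, 2],), ([3, 4],)]   # models [[1,2], [3,4]]
mixed = [([1, 2],), (3, 4)]       # models [[1,2], (3,4)]
search = [3, 4]                   # models the second argument [3,4]
```

In this model the first call finds nothing and the second finds position 2, matching the results described above; atomizing the members as well makes the first call succeed.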
QT4 CG meeting 070 draft agenda #agenda-03-19
Draft agenda published.
Issue #1095 closed #closed-1095
Collation: caseblind → Standardize or replace with `html-ascii-case-insensitive`?
Issue #1095 created #created-1095
Collation: caseblind → Standardize or replace with `html-ascii-case-insensitive`?
Various test cases use the artificial http://www.w3.org/2010/09/qt-fots-catalog/collation/caseblind collation. It seems that most (all) of them could also be written with the http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive collation.
Could we replace the tests with the standardized collation, or should we rather try to standardize the caseblind variant?
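For reference, the behaviour of an ASCII-only case-insensitive comparison can be sketched as follows in Python: only the letters A to Z are folded, and all other characters compare by codepoint. The helper names are assumptions for this illustration, and whether the caseblind test collation behaves identically for non-ASCII input is exactly the open question.

```python
# Map only ASCII upper-case letters to their lower-case counterparts.
ASCII_UPPER_TO_LOWER = {cp: cp + 32 for cp in range(ord("A"), ord("Z") + 1)}

def ascii_fold(text: str) -> str:
    """Lower-case A-Z only; every other character is left unchanged."""
    return text.translate(ASCII_UPPER_TO_LOWER)

def ascii_ci_equal(a: str, b: str) -> bool:
    """Equality under ASCII case-insensitive comparison."""
    return ascii_fold(a) == ascii_fold(b)
```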
Issue #1094 created #created-1094
Axis steps in lookup expressions
This issue picks up where issue #341, issue #350, issue #596, issue #960 etc left off - an attempt to find better syntax and semantics for navigation within JTrees (by which I mean trees of maps and arrays). The problems we are addressing are well aired in those previous issues. There are new opportunities for improving navigation within pinned trees, where upwards navigation becomes possible.
Firstly I propose that the existing constructs ?*, ?key, and ?1 be treated as abbreviations for ?content::*, ?content::key, and ?content::1 respectively. The content axis delivers a flattened sequence of items.
Then I propose we introduce an entry axis. ?entry::*, ?entry::key, and ?entry::1 deliver their results as a sequence of key-value pairs, in the style of map:pairs(). Arrays for this purpose are treated as maps with integer keys. For example if $A is [(1,2), (3,4)] then $A?entry::* delivers (map{'key':1, 'value':(1,2)}, map{'key':2, 'value':(3,4)}).
This applies equally to the deep lookup operator. $A??entry::* returns all the key-value pairs within the JTree rooted at $A, recursively.
We could also consider a value axis which delivers a sequence of arrays containing the values, losing the associated keys.
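A rough Python model of the proposed entry axis, for both the shallow and the deep lookup, is sketched below. It is an illustration under stated assumptions: maps are dicts, arrays are lists, sequence-valued members are tuples, and the function names `entries` and `deep_entries` are hypothetical.

```python
def entries(value):
    """Model of ?entry::* : key-value pairs of a map, or of an array with integer keys."""
    if isinstance(value, dict):
        pairs = value.items()
    elif isinstance(value, list):
        pairs = enumerate(value, start=1)
    else:
        return []  # atomic items have no entries
    return [{"key": key, "value": val} for key, val in pairs]

def deep_entries(value):
    """Model of ??entry::* : all entries in the tree rooted at value, recursively."""
    result = []
    for entry in entries(value):
        result.append(entry)
        val = entry["value"]
        items = val if isinstance(val, tuple) else (val,)
        for item in items:
            result.extend(deep_entries(item))
    return result

A = [(1, 2), (3, 4)]                     # models [(1,2), (3,4)]
shallow = entries(A)
deep_keys = [e["key"] for e in deep_entries({"a": [{"b": 1}]})]
```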
If values are labelled, as a result of being found by navigating a pinned JTree, then upwards navigation is also possible. For an item in a pinned tree,
- containing-entry::* delivers the containing entry as a key-value pair. Duplicates are eliminated.
- owner::* delivers the immediately containing map or array as identified by the label
- ownership::* delivers the transitive closure of the owner::* axis.
- peer::* delivers owner::*/entry::*
- following-member::* delivers the subarray of the containing array that follows the current entry
- preceding-member::* delivers the subarray of the containing array that precedes the current entry
Of course, improved names for these concepts are welcomed!
In these examples I have used *
to select everything on the relevant axis. This can always be replaced by a key specifier K that selects the item only if it is labelled with a key K. So for example ownership::address selects the containing maps and arrays that are themselves in a map entry with key "address".
I think we also need a convenient way to filter the selection by type (see issue #859 for a problem with the current syntax). I propose
??content::[record(longitude, latitude)]
to select all items in the recursive content that match type record(longitude, latitude)
Similarly
??entry::[array(xs:integer)+]
to select all entries where the value is an array of integers.
Finally, responding to issue #341, I propose that lookup operators should be error free: rather than reporting errors, they should return nothing.
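The "error free" behaviour proposed in the last paragraph could look like this in a Python model, where a missing key, an out-of-range index, or a non-map/non-array operand all yield an empty result instead of an error. The function name `lookup` and the tuple-as-sequence convention are assumptions for this sketch.

```python
def lookup(value, key):
    """Error-free lookup: returns a tuple (sequence), empty when nothing is selected."""
    if isinstance(value, dict):
        return (value[key],) if key in value else ()
    if isinstance(value, list) and isinstance(key, int) and not isinstance(key, bool):
        # Arrays are 1-based; out-of-range positions select nothing.
        return (value[key - 1],) if 1 <= key <= len(value) else ()
    return ()  # atomic operand or unsuitable key: nothing, not an error
```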
Pull request #1093 created #created-1093
1091 Add fn:collation function
Fix #1091
Pull request #1092 created #created-1092
1039 Add notes referring to fn:collation-key
Fix #1039
Rather than adding a new feature to the language, we add notes to "order by" and "group by" explaining how the requirement can be met using the fn:collation-key() function.
Issue #334 closed #closed-334
Transient properties: a new approach to deep selection and update in maps and arrays
Issue #86 closed #closed-86
Fallback for named timezones
Issue #64 closed #closed-64
Specify optional parameters to create bounded variadic functions
Issue #56 closed #closed-56
Allow item-type to be matched within its definition scope
Issue #1091 created #created-1091
Convenience function to construct a collation URI
I propose a convenience function to construct a collation URI: for example
collation({'lang':'fr'})
returns a collation URI suitable for French.
If the property names supplied are those that are defined for UCA collation names, the result will be the corresponding UCA collation URI; alternatively, implementation-defined property names can be included.
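For the UCA case, the construction amounts to appending keyword=value pairs, separated by semicolons, to the UCA collation base URI. A minimal Python sketch follows; the function name mirrors the proposal, but the handling of an empty map and the absence of any validation of property names are assumptions of this sketch.

```python
UCA_BASE = "http://www.w3.org/2013/collation/UCA"

def collation(options: dict) -> str:
    """Build a UCA collation URI from a map of collation properties."""
    if not options:
        return UCA_BASE  # assumption: no properties means the bare UCA URI
    query = ";".join(f"{name}={value}" for name, value in options.items())
    return f"{UCA_BASE}?{query}"

french = collation({"lang": "fr"})
```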
Pull request #1090 created #created-1090
1089 Add rounding rules for casting string to duration etc
Fix #1089
Issue #1089 created #created-1089
Rounding when casting string to date/time or duration types
F+O section 21.2 describes rules for rounding when strings are cast to xs:decimal. The same rules should apply when casting to a dateTime, time, or duration, in the case where the number of digits in the fractional seconds part exceeds the precision supported by the implementation.
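As an illustration of what rounding the fractional seconds might involve, here is a Python sketch using decimal arithmetic. The rounding mode (round-half-to-even) and the three-digit precision are assumptions chosen for the example, since the rounding algorithm is left to the implementation. Note that rounding can carry into the next whole second.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_seconds(lexical: str, digits: int) -> Decimal:
    """Round a seconds value to the given number of fractional digits."""
    quantum = Decimal(1).scaleb(-digits)  # e.g. digits=3 gives 0.001
    return Decimal(lexical).quantize(quantum, rounding=ROUND_HALF_EVEN)

rounded = round_seconds("12.3456789", 3)
carried = round_seconds("59.9999", 3)   # rounds up to 60 seconds
```

The second case shows that an implementation would also need to propagate the carry into the minutes component.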
Issue #1082 closed #closed-1082
Inconsistency in underscore in numeric literal grammar
Issue #1088 closed #closed-1088
1082 Fix numeric literal grammar
Pull request #1088 created #created-1088
1082 Fix numeric literal grammar
Fix #1082
Also adds some notes and examples
Pull request #1087 created #created-1087
1086 Editorial changes to array:values
Fix #1086
Issue #1086 created #created-1086
array:values spec cleanup
The rules for array:values say:
The function concatenates the members of $array and returns them as a sequence. The values are returned in their original order. Arrays contained within members are returned unchanged.
The effect of the function is equivalent to $array?*.
This is all a bit too vague.
- the values are not concatenated, at least not in the sense of concat()
- it doesn't return the members, it returns their sequence concatenation
- the phrase "arrays contained within members" is unclear. The examples reveal that this rule is intended to include arrays that ARE members.
- the equivalent expression $array?* allows $array to be things that array:values() doesn't allow (like an empty sequence).
More subtly, the introduction to section 19.1 says:
All functionality on arrays is defined in terms of two primitives:
The function [array:members] decomposes an array to a sequence of value records. The function [array:of-members] composes an array from a sequence of value records.
and the spec for array:values doesn't conform to this guideline.
Try:
The function returns the sequence-concatenation of the members of $array, retaining order. More formally, the effect of the function is equivalent to the expression array:members($array)?value.
and add to the notes:
Unlike array:flatten, the function does not apply recursively to nested arrays.
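The difference between the suggested wording and a recursive flatten can be sketched in Python. In this model an array is a list of members and each member is a tuple (a sequence), so the nested array [3,4] is written `[(3,), (4,)]`. The function names are hypothetical stand-ins for array:values and array:flatten.

```python
def array_values(array):
    """Sequence-concatenation of the members, in order; nested arrays are kept."""
    return tuple(item for member in array for item in member)

def array_flatten(items):
    """Recursive flattening: nested arrays are replaced by their members."""
    out = []
    for item in items:
        if isinstance(item, list):
            for member in item:
                out.extend(array_flatten(member))
        else:
            out.append(item)
    return tuple(out)

inner = [(3,), (4,)]                 # models [3, 4]
outer = [(1, 2), (inner,), ()]       # models [(1,2), [3,4], ()]
values = array_values(outer)
flattened = array_flatten((outer,))
```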
Issue #1085 created #created-1085
Parameters to fn:sort
An interesting suggestion made in passing in the thread discussing fn:ranks(). It would be possible to combine the collation argument and the ascending/descending argument of fn:sort into a single argument, whose value is an optional "ascending|descending" keyword followed by an optional collation URI (whitespace-separated, presumably).
This might seem a little bizarre at first sight, but having a list of collation URIs followed by a list of sort key functions followed by a list of ascending/descending keywords is also a little bizarre, and it would have two advantages - it would make better use of the second argument which is currently nearly always set to (), and it would put the two parts of the order specification (the collation and its direction) in closer proximity. After all, they are used in combination to decide whether one value precedes or follows another.
I'm not 100% convinced by the idea, but it seems worth considering. What do people think?
Issue #1084 created #created-1084
Incorrect rendition of option defaults
In the F&O spec, when rendering the default value of an option, code font is being used for narrative prose: see for example defaults for the delivery-format and base-output-uri options of fn:transform.
Pull request #1083 created #created-1083
1079 Change book used in example
Changed the example to a book by a reputable author.
Fix #1079
Issue #1082 created #created-1082
Inconsistency in underscore in numeric literal grammar
Numeric Literals describes permitting underscores to be used as separators in sequences of digits within long numbers. The first interpretation rule says underscores are first stripped out.
But the grammar provided appears to me to be inconsistent.
IntegerLiteral ::= Digits
DecimalLiteral ::= ("." Digits) | (Digits "." [0-9]*)
DoubleLiteral ::= (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits
Digits ::= DecDigit ((DecDigit | "_")* DecDigit)?
DecDigit ::= [0-9]
Digits permits underscores in the grammar, which works in the integer portion of the numeric literal, but when the value exceeds 1 the fractional part is described as [0-9]*. If underscore stripping is a 'pre-parsing' step, then Digits need not mention it at all.
On the other hand if the grammar is defining the sequence of characters that are permitted, then the fractional section in the grammar should also permit underscores, which it plainly does not in the presence of an integer part. (The test seconds-010 uses such in an expansion of π.)
An alternative formulation that I think does describe underscores in fractional parts might be:
DecimalLiteral ::= ("." Digits) | (Digits "." Digits?)
and similarly for DoubleLiteral. I know this isn't a game-changer, but for those generating grammars, consistency certainly helps.
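The inconsistency can be checked mechanically by transcribing both versions of DecimalLiteral into regular expressions. The Python below is an illustrative transcription of the productions quoted above, not part of the spec; the constant names are assumptions.

```python
import re

# Digits ::= DecDigit ((DecDigit | "_")* DecDigit)?
DIGITS = r"[0-9](?:[0-9_]*[0-9])?"

# Current: DecimalLiteral ::= ("." Digits) | (Digits "." [0-9]*)
CURRENT_DECIMAL = re.compile(rf"(?:\.{DIGITS})|(?:{DIGITS}\.[0-9]*)")

# Suggested: DecimalLiteral ::= ("." Digits) | (Digits "." Digits?)
PROPOSED_DECIMAL = re.compile(rf"(?:\.{DIGITS})|(?:{DIGITS}\.(?:{DIGITS})?)")

def accepted(pattern: re.Pattern, literal: str) -> bool:
    """True if the whole literal matches the given production."""
    return pattern.fullmatch(literal) is not None
```

With these patterns, `3.141_592` is rejected by the current production and accepted by the suggested one, while `.5_0` and `1_000.5` are accepted by both.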
Pull request #1081 created #created-1081
1050 Fix ItemType grammar ambiguity
Two alternatives in the grammar were both EQNames, distinguished semantically. This ambiguity in the grammar is now fixed (no living XPath expressions are harmed by this change).
Fix #1050
Pull request #1080 created #created-1080
1036 Rephrase the rules for number-parser with liberal JSON
Rephrasing as suggested in the issue.
Fix #1036
Issue #1079 created #created-1079
Editorial: XSLT, Applying Template Rules, Examples
This guy is all around. Do we really need him in our specs as well? 😏
https://qt4cg.org/pr/1078/xslt-40/Overview.html#applying-templates
"Title": "How to Win Elections",
"Authors": [ "...
Pull request #1078 created #created-1078
1060 Formatting XPath/XQuery
Editorial; closes #1060.
This PR attempts to unify the presentation of XPath and XQuery code. It’s not complete, but it should definitely improve the status quo.
The chosen formatting and indentation rules can certainly be discussed. My major objective was consistency: I selected rules that were used frequently enough in the given documents.
Apart from the presentation stuff, this PR fixes various minor bugs in the rules and examples.
Pull request #1077 created #created-1077
Correct the status of new language features
This PR corrects the status of certain language features that the change appendix in the spec incorrectly describes as having not been accepted by the WG. The features in question are:
The rules for reporting type errors during static analysis have been changed so that a processor has more freedom to report errors in respect of constructs that are evidently wrong, such as @price/@value, even though dynamic evaluation is defined to return an empty sequence rather than an error.
This change has in fact been discussed and accepted by the group. See PRs #603 and #884.
Record types are added as a new kind of ItemType, constraining the value space of maps.
Record types have become a fundamental feature of much of our work, with many additional capabilities relying on them. They became an official part of the spec with the closure of issue #172.
Local union types are added as a new kind of ItemType, constraining the value space of atomic values.
Enumeration types are added as a new kind of ItemType, constraining the value space of strings.
Local union types and enumeration types became an official part of the spec with the acceptance of PR #691.
The lookup operator ? can now be followed by a string literal, for cases where map keys are strings other than NCNames.
These changes were endorsed by acceptance of PR #926.
The rules for value comparisons when comparing values of different types (for example, decimal and double) have changed to be transitive. A decimal value is no longer converted to double, instead the double is converted to a decimal without loss of precision. This may affect compatibility in edge cases involving comparison of values that are numerically very close.
We still have open issues regarding comparison, conversion, and promotion of numeric values. See for example issue #986. So we may yet decide to roll back these changes. For practical purposes it's sensible to treat the current text as status quo, since so many individual changes have been made that unwinding can only be treated as a new issue.
A for member clause is added to FLWOR expressions to allow iteration over an array.
The current specification of for member results from the acceptance of PR #752.
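The point about transitive value comparisons in this list (converting the double to a decimal rather than the decimal to a double) can be demonstrated with Python's decimal module, whose conversion from float to Decimal is exact. This is an illustration of the arithmetic only; the variable names are chosen for the example.

```python
from decimal import Decimal

a = Decimal("0.1")                    # a decimal value
b = 0.1                               # a double value
c = Decimal("0.10000000000000001")    # a second, different decimal value

# Old rule: convert the decimal to double before comparing.
old_a_eq_b = float(a) == b
old_b_eq_c = b == float(c)
a_eq_c = a == c                       # the two decimals are plainly different

# New rule: convert the double to decimal without loss of precision.
new_a_eq_b = a == Decimal(b)
new_b_eq_c = Decimal(b) == c
```

Under the old rule a equals b and b equals c although a does not equal c, so equality is not transitive; under the new rule neither decimal equals the double, which restores transitivity at the cost of the edge-case incompatibility mentioned above.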
Pull request #1076 created #created-1076
1075 Drop 'with' expressions
Proposes dropping "with" expressions from the spec.
Fix #1075
Issue #1075 created #created-1075
Drop "with" expressions
The current draft includes a proposal for a "with" expression to establish the namespace context for an XPath expression. This has never been reviewed or accepted by the group. See §4.1 of the language specifications.
I propose to raise a PR that drops this feature, in order to force discussion as to whether we want it in its current form, or to replace it with something better, or to drop it entirely.
Pull request #1074 created #created-1074
Confirm status of provisional functions
The purpose of this PR is to bring the current F&O draft specification into a state where it is confirmed as the current status quo accepted by the CG.
The following functions (mainly new, some amended) that have been present in the draft for some while, but with caveats about their status, are confirmed as part of the status quo:
- fn:slice
- fn:format-number
- fn:stack-trace
- map:filter
- map:replace
- array:replace
The following functions are dropped (for the time being):
- fn:json
- map:substitute
The actual PR essentially changes text that alludes to the status of these functions, it does not change the actual specifications.
The current state of qt4tests in relation to these functions is:
- fn:slice - OK
- fn:format-number - missing tests for recent changes
- fn:stack-trace - no tests
- map:filter - OK
- map:replace - no tests
- array:replace - no tests
- fn:json - no tests
- map:substitute - no tests
Pull request #1073 created #created-1073
XQFO (editorial)
Examples fixed.
Pull request #1072 created #created-1072
883 Return type of load-xquery-module
Use a record type for the return type of the function.
Fix #883
Pull request #1071 created #created-1071
1070 Bare Brace map constructor syntax
Makes the keyword "map" in map constructors optional.
Fix #1070
QT4 CG meeting 069 draft minutes #minutes-03-12
Draft minutes published.
Issue #220 closed #closed-220
Encapsulation
Issue #262 closed #closed-262
Navigation in deep-structured arrays
Issue #274 closed #closed-274
What would it take/would it be possible to build a module repository for QT?
Issue #295 closed #closed-295
Extend support for self-reference in record types
Issue #314 closed #closed-314
Basic Operations on Maps and Arrays
Issue #825 closed #closed-825
array:members-at
Issue #829 closed #closed-829
fn:boolean: EBV support for more item types
Issue #960 closed #closed-960
Should ??KS flatten the results
Issue #961 closed #closed-961
Simulating Objects: Performance
Issue #1037 closed #closed-1037
fn:json-to-xml: 'number-parser' option
Issue #1058 closed #closed-1058
1037 fn:json-to-xml: 'number-parser' option
QT4 CG meeting 069 draft agenda #agenda-03-12
Draft agenda published.
Issue #1070 created #created-1070
Concise syntax for map construction
It has been suggested that we should allow a "bare braces" syntax for map construction. This would reduce visual clutter especially when defining options arguments, as in
serialize($result, map{"method": "adaptive", "indent": true()})
I believe there are no syntactic obstacles to dropping the "map" keyword. The main reason it is there was because there was competition for the construct with people doing so-called scripting extensions who wanted "bare braces" to represent blocks of statements.
Allowing {"method": "adaptive"}
would align with JSON.
But I think we should go a step further and drop the quotes:
{method: "adaptive"}
except that we could allow a string literal if the key isn't an NCName, as with record type syntax.
Could we do this and still allow computed or non-string keys? I don't think we need to, the existing syntax remains available.
So I propose we allow:
serialize($result, {method: "adaptive", indent: true()})
While we're about it, is there any enthusiasm for allowing
serialize($result, {method: "adaptive", indent: ✅})
(: U+2705 :)
Issue #596 closed #closed-596
Pinned values: Transforming Trees
Issue #1069 created #created-1069
fn:ucd
This issue floats the idea of a new function, fn:ucd (for Unicode character database).
The working signature would be fn:ucd($codepoint as xs:positiveInteger) as map(*)?. In the returned map, each entry would have a key that is a Unicode property name (full or abbreviated) and a value that reflects the property of $codepoint in the Unicode character database.
What would users get? Access to a deep store of character data not otherwise (easily) available, such as name, name alias, bidirectional properties, age (when it entered Unicode), breaks (word, sentence, grapheme), scripts, and dozens of other properties. See Unicode TR 44. Although many properties are of specialized interest, I think most people would find at least a few of these properties of significance.
Can't we already do this with regular expressions? Well, no. Category escapes in XPath regular expressions, e.g., \p{Lm}, are based upon general categories, but the properties mentioned above cut across these general categories. For example, general category Pd dash is not coterminous with the property Dash. Property Quotation_Mark crosses many subcategories of P Punctuation.
A few use cases:
Extrapolation
string-to-codepoints('ɑϞ') ! ('U+' || dec-to-hex(.) || ': ' || ucd(.)('Name'))
would return ('U+251: LATIN SMALL LETTER ALPHA', 'U+3DE: GREEK LETTER KOPPA')
Filtering
if (some $i in string-to-codepoints($text) satisfies ucd($i)('Soft_Dotted')) then...
Annotating
<div class="{myfunc:most-frequent((string-to-codepoints($text) ! ucd(.)('Script')))}">
<xsl:value-of select="$text"/>
</div>
might produce
<div class="Syriac">ܡܠܟܘܬܐ ܕܫܡܝܐ ܐܝܬܝܗ̇. ܠܐ ܚܫܘܫܘܬܐ ܕܢܦܫܐ܆ ܥܡ ܝܕܥܬܐ ܕܫܪܪܐ ܕܗܠܝܢ ܕܐܝܬܝܗܝܢ</div>
And so forth. I can think of dozens of different types of operations where fn:ucd might be significant.
Thoughts?
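For comparison, Python's standard unicodedata module exposes a small subset of the Unicode character database, which is enough to mimic the "Extrapolation" use case. The `ucd` function below is a hypothetical stand-in for the proposed function, limited to three properties; in this sketch, codepoints without a character name yield None.

```python
import unicodedata

def ucd(codepoint: int):
    """Return a small map of Unicode properties, or None if the character has no name."""
    ch = chr(codepoint)
    name = unicodedata.name(ch, None)
    if name is None:
        return None
    return {
        "Name": name,
        "General_Category": unicodedata.category(ch),
        "Bidi_Class": unicodedata.bidirectional(ch),
    }

# Analogue of: string-to-codepoints(...) ! ('U+' || hex || ': ' || ucd(.)('Name'))
codepoints = [0x0251, 0x03DE]
labels = [f"U+{cp:X}: {ucd(cp)['Name']}" for cp in codepoints]
```

Properties such as Script or Soft_Dotted are not available from this module, which is a fair indication of why a dedicated function with full database coverage is being floated.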
Pull request #1068 created #created-1068
73 fn:graphemes
First draft of fn:graphemes, in response to discussion at #73.
A battery of tests will be submitted as a PR to the qt4tests repository.
Issue #1067 closed #closed-1067
fn:deep-equal: significant children
Issue #1067 created #created-1067
fn:deep-equal: significant children
The current rules of fn:deep-equal are:
e. Let significant-children($parent) be the sequence of nodes obtained by applying the following steps to the children of $parent, in turn:
i. Comment nodes are discarded if the option comments is false.
ii. Processing instruction nodes are discarded if the option processing-instructions is false.
iii. Adjacent text nodes are merged.
…
…the sequence significant-children($i1) is deep-equal to the sequence significant-children($i2).
If my interpretation is correct, the following expression is now expected to return true…
deep-equal(
<e>A<!---->B</e>,
<e>AB</e>
)
…and we need to update various test cases (e.g. K2-SeqDeepEqualFunc-22).
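The interpretation above can be traced with a small Python model of the quoted rules. Nodes are represented as (kind, value) pairs, which is purely an assumption of this sketch, as is the function name.

```python
def significant_children(children, comments=False, processing_instructions=False):
    """Apply the quoted steps: drop comments and PIs unless enabled, merge adjacent text."""
    kept = []
    for kind, value in children:
        if kind == "comment" and not comments:
            continue
        if kind == "processing-instruction" and not processing_instructions:
            continue
        if kind == "text" and kept and kept[-1][0] == "text":
            # Step iii: text nodes that have become adjacent are merged.
            kept[-1] = ("text", kept[-1][1] + value)
        else:
            kept.append((kind, value))
    return kept

first = [("text", "A"), ("comment", ""), ("text", "B")]   # <e>A<!---->B</e>
second = [("text", "AB")]                                  # <e>AB</e>
same_by_default = significant_children(first) == significant_children(second)
```

With the comments option switched on, the comment node is retained and the two child sequences differ again.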
Pull request #1066 created #created-1066
1052 Simplify the results of parse-csv
Changes parse-csv to deliver the results in a simpler format:
(a) the result structure is less deeply nested: one record with four entries
(b) the actual data is delivered as a sequence of arrays of strings, closely aligned with the result of csv-to-arrays
The rules in the spec have also been rearranged to reflect this, so the rules are now organised according to the values delivered for each of these four fields.
The examples in the spec are changed to reflect the new output format; in addition they have been editorially reorganized so each example is more self-contained, avoiding the need for extensive scrolling to find the values of variables referenced in each example.
Fix issue #1052
Issue #1065 created #created-1065
fn:format-number: further notes
This issue summarizes suggestions for fn:format-number
from the QT4 Meeting 068 that have not yet been incorporated into the current draft:
- The Unicode Common Locale Data Repository (CLDR) should be referenced; it has recommendations for all of the languages in Unicode and some variants.
- We could consider introducing an options map so that we can add more things later (for example, an option for using the default decimal format for parsing the picture string, see https://github.com/qt4cg/qtspecs/issues/1048#issuecomment-1978869499).
Issue #919 closed #closed-919
Should predicate callbacks use EBV?
Issue #944 closed #closed-944
Coercion rules: implicit types
Issue #1047 closed #closed-1047
Incorrect note for `fn:some` and `fn:every`
Issue #1064 closed #closed-1064
340-editorial fn:format-number
Pull request #1064 created #created-1064
340-editorial fn:format-number
…an addendum to the editorial change I made yesterday; will be merged in a minute.
Issue #1054 closed #closed-1054
Spec fn:message #id using old name fn:log
Issue #1057 closed #closed-1057
1054 Spec fn:message #id using old name fn:log
Issue #1063 created #created-1063
deep-equal() - option to compare functions liberally
We have changed deep-equal() so it no longer automatically treats functions as not equal.
However, it is still in practice infeasible to use deep-equal() for comparison of test results that include function items because it is not in general possible to supply an expected result that compares true to the function item actually returned. Since comparison of test results is an important use case for deep-equal, this is a serious limitation. It affects our own process that checks that output from examples in the spec is correct: the examples for parse-csv, for example, are artificially adjusted to make test comparison feasible by eliminating the function items in the result, which reduces the pedagogical value of the examples.
There should be an option such as strict-function-comparison=true|false. If set to false, then the function properties such as name, arity, and signature are compared, but the function body is ignored and assumed equal.
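As an analogy only, the proposed liberal comparison might look like the following Python sketch. The function functions_equal and its strict flag are hypothetical names standing in for the suggested strict-function-comparison option; Python's inspect.signature is used here as a stand-in for comparing arity and signature.

```python
import inspect


def functions_equal(f, g, strict=True):
    """Strict mode demands the same function object; liberal mode ignores the body."""
    if strict:
        return f is g
    return (
        f.__name__ == g.__name__
        and inspect.signature(f) == inspect.signature(g)
    )


def make_field(values):
    # Each call creates a distinct function object with the same name and signature.
    def field(col: int) -> str:
        return values[col - 1]
    return field


expected = make_field(["aaa", "bbb"])
actual = make_field(["aaa", "bbb"])

print(functions_equal(expected, actual))                # False
print(functions_equal(expected, actual, strict=False))  # True
```

The liberal mode is what would allow an expected test result containing a function item to compare equal to the actual result.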
QT4 CG meeting 068 draft minutes #minutes-03-05
Draft minutes published.
Issue #1053 closed #closed-1053
1047 Default predicate for some#1 and every#1
Issue #1046 closed #closed-1046
1038 take-while predicate no longer uses EBV
Issue #413 closed #closed-413
New function: parse-csv()
Issue #1017 closed #closed-1017
Change csv-to-xml() to produce an XHTML table
Issue #1043 closed #closed-1043
CSV parsing - "blank" rows
Issue #1051 closed #closed-1051
1043 Clarification of CSV edge cases
Issue #340 closed #closed-340
fn:format-number: Specifying decimal format
Issue #1049 closed #closed-1049
340-partial fn:format-number: Specifying decimal format
Issue #1061 closed #closed-1061
discussion - language pragmas
Pull request #1062 created #created-1062
150bis revised proposal for fn:ranks
This proposal is an amended/alternative proposal for the fn:ranks function, taking into account the work done on the original issue #150 and the PR #1027 and the comments raised. Acknowledgements to the original author for the idea and for a lot of good work on examples etc.
It amends the previous proposal as follows:
(a) the signature and the semantics are aligned with fn:sort. This adds some functionality (multiple sort keys, ascending/descending) and also removes some complexity (two different collations for comparing input items and result items)
(b) the style of exposition is changed editorially for consistency with other functions
Issue #1061 created #created-1061
discussion - language pragmas
The motivation is introducing breaking changes into the language that may have value, but not enough value to justify a breaking change.
Haskell uses language pragmas for this, and actually most (well, a lot of) Haskell code does not use the base specification, and quite common constructs (GADTs, multi-parameter type classes) require extensions.
Haskell devs are used to this. It may require referring to the top of the file to change the pragmas, but it's a working solution for introducing optional things that may be breaking changes.
Thoughts?
Benefits
- allows breaking changes to be introduced
Costs
- developers may have to refer to the language pragma to correctly understand the code
- implementation explosion: extensions may not be independent and may interact, causing an explosion of combinations of extensions (though I think it's reasonable for an implementation to implement only the combinations that are practical).
I'm biased, I want to introduce breaking changes, but am thwarted by the versioning strategy.
Issue #1060 created #created-1060
Formatting XPath/XQuery
I got reminded today that the specification documents are kind of “wild”, because all code snippets use a different formatting:
- The indentation is inconsistent (the tendency seems to be 2 spaces, in accordance with the function signatures). Repeatedly, indentations are used that don’t seem to follow any conventions at all.
- Sometimes, function, map and array keywords are followed by a space, sometimes not. My preference would be not to be too stingy; we have enough space.
- Sometimes, the return keyword starts on a new line, sometimes it’s attached to the previous line.
This is certainly something we cannot finalize too early, but I think we shouldn’t be too erratic in an official document, even though it’s “just code”.
Related: #1000.
Pull request #1059 created #created-1059
1019 XQFO: Unknown option parameters
Issue: #1019
Pull request #1058 created #created-1058
1037 fn:json-to-xml: 'number-parser' option
Issue: #1037
Pull request #1057 created #created-1057
1054 Spec fn:message #id using old name fn:log
Issue: #1054
Issue #1056 closed #closed-1056
Simplifying match templates
Issue #1056 created #created-1056
Simplifying match templates
I like match templates a lot, I think they are a USP for XSLT, but I find using them quite clumsy e.g.
- priority rules are quite subtle (I couldn't tell you what they are not, and I tend to make them explicit)
- because each match sits in a different template they tend to sort of drift around in the spaghetti of the code
- they don't naturally extend to nested local matches....everything exists at the top level.
If you compare this with mainstream functional match expressions, they are quite different syntactically, and I think the mainstream syntax is probably a bit simpler (and much more familiar). (I can see this potentially extending to lots of subsequent things, but I'll keep it to the headline.)
I think something like
<xsl:template mode="foo" as="xs:string">
<xsl:match select="Foo">
<xsl:sequence select="'this is a foo'"/>
</xsl:match>
<xsl:match select="Bar">
<xsl:sequence select="'this is a bar'"/>
</xsl:match>
<xsl:match>
<xsl:sequence select="'this is something else'"/>
</xsl:match>
</xsl:template>
- templates are matched in sequence (as is the norm), no opaque priority rules
- if nothing is matched then nothing is returned...I have effectively a catchall match above.
- everything is cohesive, the template contains all matches....no secret ones hidden at the bottom of the file.
there's lots of holes here,
- how does this interact with existing match templates?
- are they a different syntax for the same thing?
- how do they work with includes and imports?
my guesses are... they ARE just different syntax for the existing infrastructure, because that's the smallest change. The other questions are then answered by how the above syntax maps onto "priority", but tbh, as I barely know how the current priority rules work, I can't really give a sensible guess.
tbh, if this is just different syntax then secret matches CAN exist elsewhere in the spaghetti, but at least the programmer does have a construct to not do that, rather than the default contract to lack cohesion from the outset.
Issue #1055 created #created-1055
xsl:variable/@as - simplifying the language - attempt 2
I've thought about it.
The key issue I had, which genuinely caused me years of confusion (I didn't understand it, so I ignored it and dealt with it by typing random XSLT code), is this:
<xsl:variable name="presentationMediaElement" as="element(urn:presentationMedia)">
<presentationMedia/>
</xsl:variable>
If I don't declare the "as", then it does something different and confusing (it assumes it's a document element, I think, though I NEVER want it to do this).
So for stylesheets declared as version "4.0"+, can we make the default interpretation that it's an element?
Does this break backwards compatibility with v1? tbh, the code is already incompatible because the equivalent 1.0 code requires node-set; it's already broken, so I suggest making the fix simple to understand.
Why is this so irksome to me? Because for me it's incredibly confusing.
It's confusing because (and I didn't express this well the last time) it makes a type declaration have inconsistent behaviours.
In languages with OO (is it Reynolds?) type systems this also happens, BUT in an OO type system an expression has a type that can be cast to a subtype, and a subtype is very special because everything that is true of the supertype (in the constrained type logic) is true of the subtype (you can express this in terms of set/class membership in a universe if that's how you think about these things).
But in this case, this isn't the case: the two interpretations are disjoint; this isn't a cast.
So the concrete proposal is to uniquely define the semantics of
<xsl:variable name="presentationMediaElement">
<presentationMedia/>
</xsl:variable>
to be
<xsl:variable name="presentationMediaElement" as="element(urn:presentationMedia)">
<presentationMedia/>
</xsl:variable>
from 4.0 onwards.
(Ironically, personally I will probably still put the "as" clause in, but if I were trying to learn the language today I'd understand this on day 1, not day 1000.)
P.S.
I have a suspicion I still don't fully understand it, but I'm sure someone will point that out in due course.
Issue #1054 created #created-1054
Spec fn:message #id using old name fn:log
- https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-message
- https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-log
Issue #1018 closed #closed-1018
Output of parse-csv()
Pull request #1053 created #created-1053
1047 Default predicate for some#1 and every#1
Changes the default predicate for fn:some#1 and fn:every#1 to be fn:boolean#1, which takes the EBV of the items in the input sequence. The previous use of fn:identity#1 caused some unexpected behaviour.
Issue #1052 created #created-1052
parse-csv() - simplify output
Currently parse-csv produces a structure like this:
map {
  "columns": map {
    "names": map{"one":1, "two":2},
    "fields": ("one", "two")
  },
  "rows": (
    map{
      "fields": ("aaa", "bbb"),
      "field": fn($col){$this?fields[$col]}
    },
    map{
      "fields": ("ccc", "ddd"),
      "field": fn($col){$this?fields[$col]}
    }
  )
}
There are a number of ways this could be improved.
- The structure is needlessly different from the return value of csv-to-arrays(). Users will get confused between the two representations, and will find it hard to switch from one to the other. For example, one delivers rows as sequences, the other delivers rows as arrays.
- There are too many levels in the structure; the expressions to select within it are unnecessarily complicated, and users will get poor diagnostics when they get it wrong.
- In this example (with two columns) for each row there is one map, one sequence, one function, and two strings - five values in all. The output of csv-to-arrays has only three objects (one array and two strings). However hard an optimized implementation tries to reduce the overhead, the space occupied by a million-row parsed CSV is likely to be larger than needed.
- The use of sequences rather than arrays means that no JSON-serialization of the structure is possible.
I propose using a flatter structure, like this pseudo-code sketch:
map {
  "column-index": map{"one":1, "two":2},
  "columns": ["one", "two"],
  "rows": (
    ["aaa", "bbb"],
    ["ccc", "ddd"]
  ),
  "get": fn($row, $col){$this?rows[$row]($col)},
  "size": fn(){count($this?rows)}
}
This doesn't meet all the objections outlined above; for example it represents rows as a sequence of arrays, which is consistent with csv-to-arrays, but not JSON-serializable. But I think it's a considerable improvement.
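To make the shape of the proposed structure concrete, here is a hedged Python model of it. The function parse_csv_sketch is a hypothetical helper, not part of any specification; it takes already-split header and row values and only mirrors the five entries of the pseudo-code sketch, using 1-based positions as XPath does.

```python
def parse_csv_sketch(header, rows):
    """Build a flat record: column-index, columns, rows, get, size."""
    result = {
        "column-index": {name: pos for pos, name in enumerate(header, start=1)},
        "columns": list(header),
        "rows": [list(row) for row in rows],
    }
    # Accessors close over the record itself, playing the role of $this.
    result["get"] = lambda row, col: result["rows"][row - 1][col - 1]
    result["size"] = lambda: len(result["rows"])
    return result


parsed = parse_csv_sketch(["one", "two"], [["aaa", "bbb"], ["ccc", "ddd"]])

print(parsed["get"](2, 1))                               # ccc
print(parsed["get"](1, parsed["column-index"]["two"]))  # bbb
print(parsed["size"]())                                  # 2
```

Each row costs one list plus its strings, with no per-row map or function, which is the saving the issue argues for.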
Pull request #1051 created #created-1051
1043 Clarification of CSV edge cases
Gives a more precise definition of blank rows and empty fields, and generally adds detail on how edge cases should be handled.
Fix #1043.
Issue #1038 closed #closed-1038
Backwards incompatibility caused by use of EBV in callback functions
Issue #1050 created #created-1050
Potential (low-risk) Ambiguities in XPath EBNF
After demonstrating iXML XPath grammar production at the meeting of 27th February, it seemed worth recording some of the ambiguity issues encountered, if only so others might be aware of possible pitfalls.
Please note that the Lexical Structure notes in the spec do resolve these ambiguities, by extra-grammatical interpretations, most notably the choice of longest conforming match, but for grammar/parsers which don't specify or support this, such as InvisibleXML, ambiguities might arise, though there may be ameliorating changes to the resulting grammar that will resolve them. I am not advocating changes to the specification EBNF but merely noting where such problems might occur from my implementation experience, and potentially suggesting some workarounds.
Here are a couple of cases:
TypeName / AtomicOrUnionType
The rule for ItemType is approximately:
ItemType ::= ... TypeName| .... | AtomicOrUnionType | ...
where both TypeName and AtomicOrUnionType resolve solely to the EQName production. The grammar interpretation notes suggest (I think) that it binds to TypeName if such exists in the current static context, which is an extra-grammatical concept, but I may be mistaken.
StringTemplate
The productions for StringTemplate are:
[106] StringTemplate ::= "`" (StringTemplateFixedPart | StringTemplateVariablePart)* "`"
[107] StringTemplateFixedPart ::= ((Char - ('{' | '}' | '`')) | "{{" | "}}" | "``")*
[108] StringTemplateVariablePart ::= EnclosedExpr
where it relies on longest-match semantics to avoid ambiguity. (If this were not the case, a potential infinity of empty StringTemplateFixedPart productions could be satisfied, or any sequential partitions of a sequence of characters.)
An alternative (recursive and more cumbersome) formulation, which avoids the ambiguity is (in an iXML grammar for compactness):
StringTemplate: -"`", StringTemplateContent?, -"`".
-StringTemplateContent: StringTemplateFixedPart |
StringTemplateVariablePart |
StringTemplateVariablePart, StringTemplateContent |
StringTemplateFixedPart, StringTemplateVariablePart, StringTemplateContent?.
StringTemplateFixedPart: ("{{"; "}}"; "``"; ~["`{}"])+.
StringTemplateVariablePart remains unchanged. (iXML doesn't support character set subtraction, so ~["`{}"] (any character except...) is used for the Char - .... term.) By allowing a fixed part only to be followed by a variable part, this effectively permits the content either to be empty, or a sequence of parts such that StringTemplateVariablePart terms can be consecutive, but not StringTemplateFixedPart terms, and it seems to work effectively, at least in my iXML parser.
Reactions, corrections, remarks, praise and brickbats welcome. I'll document any more as I find them. John
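The longest-match reading of the StringTemplate productions can be illustrated with a small Python tokenizer. This is a simplified sketch under stated assumptions: the variable part is approximated as braces with no nested braces inside (the real EnclosedExpr is a full expression), the surrounding backticks are assumed to be already stripped, and split_template is a hypothetical name. The fixed part uses a greedy one-or-more repetition, so it can never be empty and never splits into smaller partitions.

```python
import re

# Fixed part: ordinary characters or the escapes {{, }} and ``, one or more, greedy.
FIXED = r"(?:[^{}`]|\{\{|\}\}|``)+"
# Variable part: simplified stand-in for EnclosedExpr (no nested braces).
VARIABLE = r"\{[^{}]*\}"
TOKEN = re.compile(rf"(?P<fixed>{FIXED})|(?P<variable>{VARIABLE})")


def split_template(body):
    """Split template content into (kind, text) parts, longest match first."""
    parts = []
    pos = 0
    while pos < len(body):
        match = TOKEN.match(body, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        parts.append((match.lastgroup, match.group()))
        pos = match.end()
    return parts


print(split_template("Hello {$name}{$suffix}, {{literal}}"))
# [('fixed', 'Hello '), ('variable', '{$name}'), ('variable', '{$suffix}'), ('fixed', ', {{literal}}')]
```

As in the recursive iXML formulation, variable parts may be consecutive, while two fixed parts are never adjacent in the result.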
Pull request #1049 created #created-1049
340-partial fn:format-number: Specifying decimal format
The PR introduces an additional $format argument to fn:format-number, which allows you to override decimal formats with custom properties.
Next, we may need to clarify if the current specification already allows processors to provide custom decimal formats (https://github.com/qt4cg/qtspecs/issues/340#issuecomment-1968856655). It’s not part of this PR.
Issue #1048 created #created-1048
fn:format-number: relax restrictions on exponent-separator (possibly minus-sign, percent, per-mille)
The current rules for decimal formats are too restrictive (i.e., too much focused on Anglo-Saxon formatting rules). The most prominent case is the Arabic exponent-separator „character“, which consists of two characters: عر (https://www.localeplanet.com/icu/ar/). The exponent separator of other locales is not restricted to a single character either. For example, se-NO uses ·10^.
When we include the ICU library in the analysis, we also find minus-sign, percent and per-mille properties that are longer than 1 character. Examples:
- The minus-sign character for he consists of 200e and 002d (200e is the Left-to-Right Mark).
- The Arabic percent character consists of 066a and 061c (061c is the “Arabic Letter Mark”).
- The per-mille property of en-US-posix is 0/00.
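The three examples can be checked mechanically. The Python sketch below builds the values from the code points quoted in the list (the dictionary keys and the helper names describe and fits_single_character_rule are illustrative assumptions) and tests them against a single-character restriction.

```python
import unicodedata

# Property values assembled from the code points quoted above.
samples = {
    "minus-sign (he)": "\u200e\u002d",
    "percent (ar)": "\u066a\u061c",
    "per-mille (en-US-posix)": "0/00",
}


def describe(value):
    """List each character of the value as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in value]


def fits_single_character_rule(value):
    """True if the value satisfies a one-character restriction."""
    return len(value) == 1


for label, value in samples.items():
    print(label, describe(value), fits_single_character_rule(value))
```

None of the three values passes the single-character check, which is the restriction the issue proposes to relax.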
Issue #1047 created #created-1047
Incorrect note for `fn:some` and `fn:every`
fn:some and fn:every state (non-normatively):
"If the second argument is omitted or an empty sequence, the first argument must be a sequence of xs:boolean values.".
I don't think this note is correct. If the default predicate identity#1 is used, it is coerced to the required type function(item()) as xs:boolean, so the effective predicate is fn($x as item()) as xs:boolean {identity($x)}. This atomises the result of calling identity($x) and casts the result to xs:boolean. Therefore expressions such as some([true()]) and some(<a>true</a>) return true, not an error.
Pull request #1046 created #created-1046
1038 take-while predicate no longer uses EBV
See issue #1038, which pointed out compatibility problems with using EBV for callback predicates, as proposed in issue #919.
In specifying fn:take-while we anticipated acceptance of the proposal to use EBV for predicate callbacks; now that we have decided not to make that change, this PR brings take-while into alignment with other functions using a predicate callback.
Issue #1016 closed #closed-1016
Editorial comments on fn:parse-csv()
Issue #1042 closed #closed-1042
1016 Editorial cleanup - csv-to-arrays
Issue #236 closed #closed-236
map:build: sequence of keys
Issue #1041 closed #closed-1041
236 map:build: sequence of keys
Issue #988 closed #closed-988
960 Pinned and labeled values
QT4 CG meeting 067 draft minutes #minutes-02-27
Draft minutes published.
Issue #485 closed #closed-485
Predeclared namespaces in XQuery
Issue #1040 closed #closed-1040
485 Predeclared namespaces in XQuery: output
Issue #1029 closed #closed-1029
Make argument of fn:void optional
Issue #1032 closed #closed-1032
1029 Make argument of fn:void optional
Issue #1033 closed #closed-1033
QT4CG-066-01 Add note that whitespace and comments in regexen are lexical constructs
Issue #356 closed #closed-356
array:leaves
Issue #843 closed #closed-843
Standard, array & map functions: Equivalencies
Issue #872 closed #closed-872
Symmetry: fn:items-at → fn:get
Issue #990 closed #closed-990
Transitive closure on non-nodes
Issue #1007 closed #closed-1007
How to invert a predicate function
Issue #1030 closed #closed-1030
allow pattern matches in axis expression
Issue #1034 closed #closed-1034
QT4CG-066-xx Add note regarding absence of drop-while / skip-while
Issue #1024 closed #closed-1024
Precedence of `otherwise` operator
Issue #1031 closed #closed-1031
1024 Change precedence of 'otherwise' operator
Issue #1003 closed #closed-1003
919 Use EBV in boolean callbacks
Issue #1045 created #created-1045
Functions to manage namespace usage
Prior to saving XML generated in XQuery I often tweak the namespace usage. This makes the XML lighter and clearer for the casual reader and is sometimes mandated by users and systems. I think providing builtin solutions for these cases would ease these tasks.
Common cases are:
- Remove unused prefixes. Example: the function presented at https://stackoverflow.com/questions/23002655/xquery-how-to-remove-unused-namespace-in-xml-node
- Make a namespace the default wherever it is used. Example: functx:change-element-ns-deep($nodes,$targetns,"") See http://www.xqueryfunctions.com/xq/functx_change-element-ns-deep.html
- Remove the use of all/some namespaces. Example: BaseX https://docs.basex.org/wiki/Utility_Module#util:strip-namespaces
A somewhat related issue https://github.com/qt4cg/qtspecs/issues/266
QT4 CG meeting 067 draft agenda #agenda-02-27
Draft agenda published.
Issue #1044 created #created-1044
CSV row delimiter - allowed values
Section 15.4.2.1 says:
The row delimiter defaults to matching any of CRLF, LF, or CR. Valid values for the row delimiter are a single Unicode character, or one of CRLF, LF, or CR, that has not been marked for use as the column delimiter. Implementations must raise [err:FOCV0002] if the row-delimiter option is set to a multi-character string other than CRLF.
- It's not entirely clear to me what this is saying. Are alternative row delimiters other than newline delimiters allowed (row-delimiter:('|','/'))?
- The statement in this section doesn't align with the error conditions appearing in the actual function specs, which say: "A dynamic error [err:FOCV0002] occurs if one or more of the values for field-delimiter or quote-character are specified and are not a single character." - no mention here of the row-delimiter.
Issue #1043 created #created-1043
CSV parsing - "blank" rows
The CSV parsing specification states "A blank row is represented as an empty array.".
- It's not clear what "blank" means here. Does it depend on the whitespace-trimming option?
- It would be more logical to return an array containing a single zero-length string, since any other line containing no field delimiter is considered to contain one field.
- Alternatively, it might make sense to ignore the row entirely.
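The argument for a single zero-length field can be seen with a naive delimiter split. The Python sketch below is an illustration only: split_fields is a hypothetical helper that ignores quoting and trimming, and it is not the CSV parsing algorithm of the specification.

```python
def split_fields(row, delimiter=","):
    """Naive field split: a row with n delimiters always yields n + 1 fields."""
    return row.split(delimiter)


print(split_fields("abc"))  # ['abc']
print(split_fields(""))     # ['']
print(split_fields("a,"))   # ['a', '']
```

Under this model a row with no characters is just the zero-delimiter case and yields one empty field rather than an empty array.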
Pull request #1042 created #created-1042
1016 Editorial cleanup - csv-to-arrays
The changes here are almost entirely editorial, reordering material, removing duplication and changing some of the language for consistency with the rest of the spec. There is one substantive change - the function csv-to-simple-rows is renamed csv-to-arrays.
Fix #1016
Pull request #1041 created #created-1041
236 map:build: sequence of keys
Issue: #236
Pull request #1040 created #created-1040
485 Predeclared namespaces in XQuery: output
Issue: #485
Issue #1039 created #created-1039
Allow dynamic collations in XQuery "order by" and "group by"
I propose that in "order by" and "group by" clauses, the keyword "collation" should be followed by an expression rather than by a URILiteral.
The only problem this causes is if the expression depends on variables in the tuple stream, because obviously the collation must be selected for the tuple stream as a whole, not for each individual tuple.
We can solve this problem by amending the rules for the scope of variables bound in FLWOR expressions (§4.15.1 rule 1) so that collation expressions are excluded from the scope; or perhaps (it might be simpler) to make it a static error if the collation expression refers to a variable bound in the containing FLWOR expression.
If the syntax allows a general expression then a simple "quoted-string" will be interpreted as a StringLiteral rather than a URILiteral. As far as I'm aware the two things are syntactically and semantically identical so this isn't a problem.
Issue #1038 created #created-1038
Backwards incompatibility caused by use of EBV in callback functions
Changing fn:filter and similar functions that existed in 3.1 to use the EBV of the callback function's result has introduced a backwards incompatibility. In 3.1, the function conversion rules were used to convert the callback function's result to xs:boolean. This involves atomization. If the callback returned the untyped node <a>false</a>, this is atomised as false(), but its EBV is true().
Revealed by test case fn:filter-006.
I think it's unlikely to happen much in practice, but it's a bit nasty. Perhaps we shouldn't make the change to use EBV for functions that existed in 3.1?
Perhaps we should even consider reverting the change entirely. It's not exactly essential.
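The difference between the two conversions can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: an untyped element is represented as a hypothetical dictionary with a text entry, atomize_and_cast models atomization followed by a cast to xs:boolean, and ebv models only the rule that a sequence whose first item is a node is true.

```python
def atomize_and_cast(node):
    """3.1 behaviour: atomize the node, then cast the untyped value to xs:boolean."""
    lexical = node["text"].strip()
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError(f"invalid xs:boolean lexical form: {lexical!r}")


def ebv(sequence):
    """Effective boolean value, restricted to sequences of nodes:
    empty is false, anything starting with a node is true."""
    return len(sequence) > 0


callback_result = {"name": "a", "text": "false"}  # models <a>false</a>

print(atomize_and_cast(callback_result))  # False
print(ebv([callback_result]))             # True
```

A filter callback returning this node therefore rejects the item under the 3.1 rules but accepts it under EBV.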
Issue #1037 created #created-1037
fn:json-to-xml: 'number-parser' option
A function supplied via the number-parser option of fn:json-to-xml is now allowed to return zero or one items (see #973). Analogous to the action argument of fn:replace, the result should be converted to a string by invoking fn:string on the result. An example:
json-to-xml('-1', map { 'number-parser': abs#1 })
→ <fn:number>1</fn:number>
json-to-xml('1', map { 'number-parser': fn { true#0 } })
→ err:FOTY0013
No change is required for fn:parse-json.
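A loosely comparable mechanism exists in Python's standard json module, where the parse_int hook receives the lexical form of each integer. The sketch below is an analogy, not an implementation of fn:json-to-xml: number_parser_abs is a hypothetical callback, and the final str call stands in for the proposed fn:string conversion of the callback result.

```python
import json


def number_parser_abs(lexical):
    """Callback receiving the lexical form of a JSON integer."""
    return abs(int(lexical))


value = json.loads("-1", parse_int=number_parser_abs)

# Stringify the callback result, as the issue proposes doing with fn:string.
print(str(value))  # 1
```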
Issue #1036 created #created-1036
parse-json: liberal parsing
I noticed that some new test cases (fn-parse-json-712, fn-parse-json-716, possibly others) rely on specific liberal parsing rules.
@michaelhkay Do you think that it could make sense to formalize some of those rules, or should we rather fix the test cases?
Issue #1005 closed #closed-1005
regular expressions - whitespace
Issue #709 closed #closed-709
(Un)Checked Evaluation
Issue #459 closed #closed-459
Eager and lazy evaluation
Issue #135 closed #closed-135
Arrays' counterparts for functions on sequences, and vice versa
Issue #94 closed #closed-94
Functions that determine if a given sequence is a subsequence of another sequence
Issue #43 closed #closed-43
Support standard and user-defined composite values using item type definitions
Issue #1001 closed #closed-1001
fn:subsequence-where: equivalent `fn:slice` expression
Issue #1020 closed #closed-1020
When to apply the coercion rules
Issue #1035 created #created-1035
Add default values for parameters in constructor functions for records
We have added implicit constructor functions for named record types; we should allow the parameters in these functions to take explicit default values.
For example
declare item type my:complex as record(r as xs:double, i as xs:double := 0)
At the same time we might consider introducing fixed values, for example
declare item type my:rectangle as record(height, width, area ::= function($rect){$rect?height * $rect?width})
in which (a) the area function must NOT be supplied as an argument to the constructor function call, and (b) a map in which the area field is different from this fixed value is not a valid instance of the my:rectangle record type.
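For comparison, the same two ideas (a defaulted field and a fixed, non-suppliable field) can be expressed with Python dataclasses. This is an analogy under stated assumptions, not a rendering of the proposed XQuery semantics: the class names Complex and Rectangle are illustrative, and the fixed area is modeled as a read-only property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    r: float
    i: float = 0.0  # default value, like "i as xs:double := 0"


@dataclass(frozen=True)
class Rectangle:
    height: float
    width: float

    @property
    def area(self):
        # Fixed value: derived from the other fields, never supplied by the caller.
        return self.height * self.width


print(Complex(3.0))               # Complex(r=3.0, i=0.0)
print(Rectangle(2.0, 3.0).area)   # 6.0

try:
    Rectangle(2.0, 3.0, area=1.0)
except TypeError as err:
    print("rejected:", err)
```

The constructor accepts an omitted i and rejects an explicit area, which corresponds to points (a) and the default-value proposal above.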
Pull request #1034 created #created-1034
QT4CG-066-xx Add note regarding absence of drop-while / skip-while
In response to comments noted in the minutes of meeting 066, and made in writing against PR #1008, add a note justifying the absence of drop-while or skip-while functions.
Pull request #1033 created #created-1033
QT4CG-066-01 Add note that whitespace and comments in regexen are lexical constructs
Adds a note explaining why whitespace and comments are not explicit in the regex grammar; see action QT4CG-066-01
Pull request #1032 created #created-1032
1029 Make argument of fn:void optional
Allows the use of fn:void#0 when required.
Fix #1029
Pull request #1031 created #created-1031
1024 Change precedence of 'otherwise' operator
Changes the precedence of the otherwise operator so that @price otherwise @cost * 2 now means @price otherwise (@cost * 2) rather than (@price otherwise @cost) * 2.
Fix #1024
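The practical difference between the two groupings can be shown with a small Python model. It assumes a simplification: the empty sequence is represented by None, and otherwise is a hypothetical helper function returning its left operand unless that operand is empty.

```python
def otherwise(left, right):
    """Model of the otherwise operator: left unless it is empty (None here)."""
    return left if left is not None else right


price, cost = 10, 4

new_reading = otherwise(price, cost * 2)   # @price otherwise (@cost * 2)
old_reading = otherwise(price, cost) * 2   # (@price otherwise @cost) * 2

print(new_reading)  # 10
print(old_reading)  # 20

# With no price, both groupings agree.
print(otherwise(None, cost * 2), otherwise(None, cost) * 2)  # 8 8
```

The readings diverge only when the left operand is present, which is exactly the case where the old precedence silently doubled the price.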
QT4 CG meeting 066 draft minutes #minutes-02-20
Draft minutes published.
Issue #999 closed #closed-999
regular expression addition - comments
Issue #1022 closed #closed-1022
999 Allow comments in regular expressions
Issue #1028 closed #closed-1028
960(partial) Recognize alternative representation of JSON null
Issue #617 closed #closed-617
Implicit constructor functions for record types and union types
Issue #953 closed #closed-953
617 Define record constructors
Issue #1002 closed #closed-1002
Reinstate subsequence-before
Issue #1008 closed #closed-1008
1002 Add fn:take-while function (replacing subsequence-before)
Issue #655 closed #closed-655
fn:sort-with: Comparators
Issue #795 closed #closed-795
655 fn:sort-with
Issue #1023 closed #closed-1023
1020 explain consequences of function coercion
Issue #1025 closed #closed-1025
1001 Fix incorrect operator precedence in subsequence-where
Issue #1030 created #created-1030
allow pattern matches in axis expression
There's a danger that this already exists and that I don't know about it, but I don't think it does.
Consider this SO question.
https://stackoverflow.com/questions/78027093/selecting-preceding-cousins-inclusing-siblings
the questioner is writing this
/root/level1/level2[@id='6']/preceding::level2[parent::level1[parent::root]][1]
eeek...look at the nasty nested predicates
when he/she wants to write this
/root/level1/level2[@id='6']/preceding::(root/level/level2)[1]
there is an answer on the question which sort of shows how horrific the problem is in general.
(it's a problem that crops up quite a lot for me)
Issue #1029 created #created-1029
Make argument of fn:void optional
If you want to supply a function that always returns an empty sequence, fn:void#0 would be useful; but currently there is no arity-zero variant.
Example: map:build(...., combine:=fn:void#0) returns a map in which any key that occurs more than once in the input is mapped to an empty sequence.
The first argument of fn:void should default to an empty sequence.
QT4 CG meeting 066 draft agenda #agenda-02-20
Draft agenda published.
Pull request #1028 created #created-1028
960(partial) Recognize alternative representation of JSON null
Defines an option in parse-json and json-doc to define a representation for JSON null, defaulting to () as currently used. Selecting a different value may be useful because it bypasses the problem that the ? and ?? operators flatten the results, causing () to be elided.
Suggests use of the QName fn:null as an alternative representation; and changes the JSON serialization method to recognize this QName as a representation of null.
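The elision problem can be reproduced with a small Python model of a flattening lookup. The representation is an assumption made for illustration: each array member is a Python list standing for an XPath sequence, lookup_all is a hypothetical stand-in for a lookup that concatenates all members, and the string "fn:null" stands in for the proposed QName.

```python
def lookup_all(array):
    """Model of a flattening lookup: concatenate every member sequence."""
    flat = []
    for member in array:
        flat.extend(member)
    return flat


NULL = "fn:null"  # stand-in for the QName suggested in the PR

# The JSON array [1, null, 3] under the two representations of null.
null_as_empty = [[1], [], [3]]
null_as_marker = [[1], [NULL], [3]]

print(lookup_all(null_as_empty))   # [1, 3]
print(lookup_all(null_as_marker))  # [1, 'fn:null', 3]
```

With the default representation the null vanishes and positions shift; with a marker value the result keeps three items.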
Pull request #1027 created #created-1027
150 fn:ranks
As proposed and discussed here: https://github.com/qt4cg/qtspecs/issues/150
Issue #1026 created #created-1026
XSLT match patterns on pinned maps and arrays
Given that <xsl:apply-templates select="pin(.)??course?code"/> will select items that are labeled with their position in the containing tree of maps and arrays, it should be possible to match the selected items with a match pattern of the form match="?course?code" that operates in a similar way to patterns such as course/code in XML.
Perhaps the pinning of the map should be done automatically by the xsl:apply-templates instruction.
Pull request #1025 created #created-1025
1001 Fix incorrect operator precedence in subsequence-where
Fixes the "equivalent expression" to subsequence-where.
Fix issue #1001
Issue #1024 created #created-1024
Precedence of `otherwise` operator
I made a mistake when specifying subsequence-where, caused by misunderstanding the precedence of the otherwise operator: see issue #1001.
In the expression
let $start := index-where($input, $from)[1]
otherwise count($input) + 1
I failed to realise that otherwise binds more tightly than +.
I'm opening the issue to solicit views as to whether we have got this right.
One might take the view that the closest thing to otherwise in other familiar languages is the ternary conditional operator, which has lower precedence than anything else including 'and' and 'or'; but then, its first operand is a boolean expression while it's relatively unlikely that the operands of otherwise will be boolean. I'm therefore thinking that it might be best to put it between 'eq' and '||', so
$a eq $b otherwise $c || $d
parses as
$a eq ($b otherwise ($c || $d))
Issue #827 closed #closed-827
map:empty, map:exists ← array:empty, array:exists
Issue #779 closed #closed-779
Hash/checksum function
Issue #978 closed #closed-978
948 Reflected the comments of the CG on the specification of scan-left and scan-right
QT4 CG meeting 065 draft minutes #minutes-02-13
Draft minutes published.
Issue #720 closed #closed-720
From Records to Objects
Issue #985 closed #closed-985
720 Add lookup arrow expressions (method invocations)
Issue #949 closed #closed-949
Partial Function Applications: Allow return of function name
Issue #972 closed #closed-972
949 Partial Function Applications: Allow return of function name
Issue #42 closed #closed-42
Relax type incompatibility in order by clause (impl. dep. instead of XPST0004)
Issue #55 closed #closed-55
Provide an XML version of the stack trace
Issue #79 closed #closed-79
fn:deep-normalize-space($e as node())
Issue #989 closed #closed-989
character sequence constructor 'a' to 'z'
Issue #994 closed #closed-994
Invoking maps & arrays: allow sequences?
Issue #1009 closed #closed-1009
QT4CG-064-03, QT4CG-064-04: Examples, Return type of `fallback`
Issue #1010 closed #closed-1010
1009 Examples, Return type of parse-json:fallback
Issue #916 closed #closed-916
720 Allow methods in maps with access to $this
Pull request #1023 created #created-1023
1020 explain consequences of function coercion
Adds explanatory material to explain my interpretation of the spec and the consequences on backwards compatibility. No change to the spec is proposed. (To review the PR, I suggest reading the change markings in the XQuery spec.)
Fix issue #1020
Pull request #1022 created #created-1022
999 Allow comments in regular expressions
Fix #999
Issue #1021 created #created-1021
Extend `fn:doc`, `fn:collection` and `fn:uri-collection` with options maps
fn:doc
, fn:collection
and fn:uri-collection
currently expect only a single argument, a URI.
There is no way of adding additional parameters to those functions.
Several implementations of XPath have worked around that limitation by
- passing of parameters via query string as part of the URI:
  - see https://www.saxonica.com/documentation10/index.html#!sourcedocs/collections
  - exist-db's implementation of uri-collection works similarly
- creating custom functions in other namespaces to add an options map as a second parameter:
  - saxon:doc in Saxon https://www.saxonica.com/documentation10/index.html#!changes/extensions/9.7-9.8
  - fetch:doc in BaseX https://docs.basex.org/wiki/Fetch_Module#fetch:doc
While both approaches do work well, they do fall flat in terms of interoperability and discoverability.
A script written for Saxon leveraging saxon:doc
will not work on BaseX and vice versa, even though they offer options with some overlap.
And a developer looking at the language specification will not discover that these options even exist.
I would like to add a second signature to the above functions with an options map as a second argument.
fn:doc($href as xs:string?) as document-node()?
fn:doc($href as xs:string?, $options as map(xs:string, *)? := ()) as document-node()?
NOTE: Looking at the other two functions below I believe the first parameter should be defined as $href as xs:string? := ()
fn:collection( $uri as xs:string? := ()) as item()*
fn:collection( $uri as xs:string? := (), $options as map(xs:string, *)? := ()) as item()*
fn:uri-collection( $uri as xs:string? := ()) as xs:anyURI*
fn:uri-collection( $uri as xs:string? := (), $options as map(xs:string, *)? := ()) as xs:anyURI*
Since a lot of those options depend on the current runtime most of them will be "free" options. This will also help us get to a specification quickly and circumvent long infighting about some very specific details.
I do see, however, a good chance of specifying a small set of options that would work across implementations.
Possible standard options
For fn:doc
- validation: whether and how to validate the input files against a schema
- whitespace: (strip-space, stripws) what to do with whitespace in the input document
- parser: could be used to define a different parser (for HTML documents)
For collection and uri-collection I see the following:
- recurse: traverse collection trees down into their subcollections
- stable: this is already vaguely mentioned in the spec and would benefit from a clearer specification
- type: (aka media-type or content-type) while the allowed values will be implementation-defined, the key should be standardised
This would bring the above functions to follow a pattern developers are already familiar with (see fn:serialize
and others)
Thanks for initial input by @ChristianGruen, Liam Quin and @michaelhkay
QT4 CG meeting 065 draft agenda #agenda-02-13
Draft agenda published.
Issue #1020 created #created-1020
When to apply the coercion rules
The rules for function calling say that the coercion rules are applied to the values supplied as function arguments; they are also applied in other circumstances such as when binding values to variables. The coercion rules are applied (as far as the spec is concerned) whether or not the supplied value already matches the required type.
Saxon has always attempted to optimise this process: if the supplied value is already an instance of the required type, no coercion takes place.
I have discovered at least one case where this assumption is incorrect: the coercion rules are not idempotent in the case where the supplied value matches the required type. This case concerns function coercion, exemplified by the new test case FunctionCall-058: if the expected type of a callback parameter is function(xs:integer) as xs:boolean
, and the supplied value for the callback is a function that accepts xs:decimal
, then the coercion rules say that a call to the supplied function that supplies an xs:decimal
must be rejected as a type error even though the supplied function accepts it.
Note that this means we have introduced a rather subtle backwards incompatibility. In XQuery 3.1, coercion was not applied to variable bindings, so the following would work (the supplied function matches the declared type of the variable):
declare variable $f as function(xs:integer) as xs:boolean
:= function($x as item()) as xs:boolean {boolean($x)};
$f("banana")
(see new tests VarDecl065/066)
In 4.0 I believe this is supposed to throw a type error, because the supplied function is wrapped in a wrapper function that checks that the supplied argument is an integer.
We have extended the coercion rules considerably in 4.0, and we need to be confident that there are no other similar cases where the coercion rules are no longer idempotent.
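The wrapping behaviour described above can be modelled outside XQuery. The following Python sketch is an analogy under stated assumptions, not the specified algorithm: `coerce_function` and `accepts_anything` are hypothetical names, and Python's `isinstance` check stands in for the argument type check performed by the wrapper function.

```python
def coerce_function(func, required_arg_type):
    """Wrap func so that every call is checked against the declared parameter type.

    This mimics function coercion: the wrapper is created even though func
    itself would already accept any argument.
    """
    def wrapper(arg):
        if not isinstance(arg, required_arg_type):
            raise TypeError(
                f"expected {required_arg_type.__name__}, got {type(arg).__name__}"
            )
        return func(arg)
    return wrapper


def accepts_anything(x):
    # Plays the role of function($x as item()) as xs:boolean
    return bool(x)


# Plays the role of binding to a variable declared as function(xs:integer) as xs:boolean
f = coerce_function(accepts_anything, int)

print(accepts_anything("banana"))  # the unwrapped function accepts a string
print(f(3))                        # the wrapped function accepts an integer
try:
    f("banana")
except TypeError as err:
    print("type error:", err)      # the wrapper rejects what the original accepted
```

The sketch shows why skipping coercion when the value "already matches" changes observable behaviour: the unwrapped and wrapped functions disagree on the string argument.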
Issue #1019 created #created-1019
XQFO: Unknown option parameters
The current option parameter conventions are:
- It is not an error if the options map contains options with names other than those described in this specification. Implementations MAY attach an ·implementation-defined· meaning to such entries, and MAY define errors that arise if such entries are present with invalid values. Implementations MUST ignore such entries unless they have a specific ·implementation-defined· meaning. Implementations that define additional options in this way SHOULD use values of type
xs:QName
as the option names, using an appropriate namespace.
The obvious consequence is that wrongly typed or unsupported options are not reported as such:
serialize($node, map { 'format': 'html' })
I think we should still allow proprietary options, but raise errors when an option is neither defined in the specification nor supported by the given implementation. On the one hand, this will help users to spot typos (e.g., byte-order-mask instead of byte-order-mark). On the other hand, options that are supported by one implementation will be rejected by others, which feels reasonable to me, as options usually change either the result or the way the input is treated.
If we believe that this change is too disruptive, we could tolerate entries with xs:QName
keys.
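A minimal Python sketch of the proposed checking rule follows. It is an illustration under assumptions: `QName`, `check_options` and the `tolerate_qnames` switch are hypothetical names, and the sets of known option names are supplied by the caller rather than taken from any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QName:
    """Stand-in for an xs:QName key (hypothetical helper for this sketch)."""
    namespace: str
    local: str


def check_options(options, spec_defined, vendor_defined=frozenset(), tolerate_qnames=True):
    """Reject keys that are neither defined by the spec nor supported by the vendor.

    With tolerate_qnames=True, QName keys are silently accepted, which models
    the less disruptive variant mentioned in the issue.
    """
    for key in options:
        if key in spec_defined or key in vendor_defined:
            continue
        if tolerate_qnames and isinstance(key, QName):
            continue
        raise ValueError(f"unknown option: {key!r}")
    return options


known = {"byte-order-mark", "method", "indent"}
check_options({"byte-order-mark": True}, known)              # accepted
check_options({QName("http://example.com/ns", "x"): 1}, known)  # tolerated
try:
    check_options({"byte-order-mask": True}, known)          # typo is reported
except ValueError as err:
    print(err)
```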
Issue #1018 created #created-1018
Output of parse-csv()
I propose making some simplifications to the output of parse-csv() to make it more amenable to processing.
- Represent each row as a map, rather than as a structure with a data field and an accessor function. Note that implementations worried about memory usage can devise a custom map implementation optimised for the case where many maps have the same regular structure. (cf recent thread about Javascript "shapes")
- The key for a field in this map should be an integer if (i) column-names is set to false, or (ii) the column in question does not have a unique header name; in other cases it should be the name from the header.
- Replace the top-level
columns
record with a simple array of field names. It's easy enough to map names to positions using index-of.
I also propose changing the name to csv-to-maps
for consistency with csv-to-table
and csv-to-arrays
.
We should advocate use of csv-to-arrays where data is to be accessed positionally, and csv-to-maps where it is to be accessed by column names, and optimise the design accordingly.
Looking at a use case, the first example (§15.4.7.1) would be unnecessary if as proposed we change csv-to-xml to generate XHTML directly. But if it were needed, it would change from
let $csv := fn:parse-csv(`name,city{$crlf}Bob,Berlin`)
return <table>
<thead>{
for $column in $csv?columns?fields
return <th>{ $column }</th>
}</thead>
<tbody>{
for $row in $csv?rows return <tr>
{ for $field in $row?fields return <td>{ $field }</td> }
</tr>
}</tbody>
</table>
to
let $csv := fn:parse-csv(`name,city{$crlf}Bob,Berlin`)
return <table>
<thead>{
for $column in $csv?columns
return <th>{ $column }</th>
}</thead>
<tbody>{
for $row in $csv?rows return <tr>
{ for $column in $csv?columns return <td>{ $row?$column }</td> }
</tr>
}</tbody>
</table>
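The proposed row representation can be sketched in Python with the standard csv module. This is an illustration of the keying rule only, under assumptions: the function name `csv_to_maps` mirrors the proposed name but the implementation is hypothetical, integer keys are taken to be 1-based positions, and the result is a plain dict with `columns` and `rows` entries.

```python
import csv
import io


def csv_to_maps(text, column_names=True):
    """Return {'columns': [...], 'rows': [dict, ...]} for the given CSV text.

    A field is keyed by its header name when the header is present and unique;
    otherwise it is keyed by its 1-based position (an assumption of this sketch).
    """
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if column_names and rows:
        header, data = rows[0], rows[1:]
    else:
        header, data = [], rows

    def key(index):
        name = header[index] if index < len(header) else None
        if name is not None and header.count(name) == 1:
            return name
        return index + 1

    return {
        "columns": header,
        "rows": [{key(i): value for i, value in enumerate(row)} for row in data],
    }


parsed = csv_to_maps("name,city\r\nBob,Berlin")
for row in parsed["rows"]:
    # Mirrors $row?$column in the rewritten example above
    print([row[column] for column in parsed["columns"]])
```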
Issue #1017 created #created-1017
Change csv-to-xml() to produce an XHTML table
I propose (a) renaming csv-to-xml
as csv-to-table
, and (b) changing the output to be an XHTML table. Specifically, instead of outputting
<csv xmlns="http://www.w3.org/2005/xpath-functions">
<columns>
<column>name</column>
<column>city</column>
</columns>
<rows>
<row>
<field column="name">Bob</field>
<field column="city">Berlin</field>
</row>
<row>
<field column="name">Alice</field>
<field column="city">Aachen</field>
</row>
</rows>
</csv>
it should output:
<table xmlns="http://www.w3.org/1999/xhtml">
<thead>
<tr>
<th>name</th>
<th>city</th>
</tr>
</thead>
<tbody>
<tr>
<td title="name">Bob</td>
<td title="city">Berlin</td>
</tr>
<tr>
<td title="name">Alice</td>
<td title="city">Aachen</td>
</tr>
</tbody>
</table>
Benefits:
(a) the data is just as easy to manipulate or transform as the current output
(b) it can be copied directly into HTML transformation output if required
(c) it is familiar to users
(d) we don't have to write, test, and document a schema
(e) there may well be libraries that can perform further transformations on the structure, for example conversion to other table representations, extraction to spreadsheet formats, etc.
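To make the proposed output shape concrete, here is a hedged Python sketch that builds the same XHTML structure from CSV text using only the standard library. The function name `csv_to_table` mirrors the proposed name, but the implementation and the helper `q` are hypothetical; the first row is assumed to be a header row.

```python
import csv
import io
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def q(local):
    """Build a namespace-qualified tag name in ElementTree's {uri}local notation."""
    return f"{{{XHTML}}}{local}"


def csv_to_table(text):
    """Build an XHTML table element; each td carries its column name in @title."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    header, data = rows[0], rows[1:]  # assumes the first row holds column names

    table = ET.Element(q("table"))
    head_row = ET.SubElement(ET.SubElement(table, q("thead")), q("tr"))
    for name in header:
        ET.SubElement(head_row, q("th")).text = name

    tbody = ET.SubElement(table, q("tbody"))
    for row in data:
        tr = ET.SubElement(tbody, q("tr"))
        for name, value in zip(header, row):
            ET.SubElement(tr, q("td"), {"title": name}).text = value
    return table


table = csv_to_table("name,city\r\nBob,Berlin\r\nAlice,Aachen")
ns = {"h": XHTML}
# Select all cells of the "city" column, much as an XPath user would
print([td.text for td in table.findall("h:tbody/h:tr/h:td[@title='city']", ns)])
```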
Issue #1016 created #created-1016
Editorial comments on fn:parse-csv()
(a) The spec says:
The first argument is CSV data, as defined in ..., in the form of a sequence of xs:string values.
But in fact, the argument is a single (optional) xs:string value, not a sequence.
(b) The spec says:
If $csv is the empty sequence, implementations must return a parsed-csv-structure-record whose rows entry is the empty sequence.
If $csv is the empty sequence, but column name extraction has been requested, or explicit column names have been supplied, then the parsed-csv-structure-record returned by implementations must have a rows entry whose value is the empty sequence.
The second paragraph seems to add nothing to the first.
(c) And the phrase "implementations must return XXX" is unidiomatic. The normal form of words is "the function returns XXX".
(d) The grammar of the sentence "Handling of delimiters, and whitespace trimming, are handled using..." is inelegant.
(e) References to the record type names (such as parsed-csv-structure-record) should be hyperlinked.
Pull request #1015 created #created-1015
1013 [XSLT] Clarify effect of accumulator capture on non-element nodes
Adds a sentence saying that when an accumulator rule with capture="yes" matches a non-element node, the capture attribute has no effect.
Fix #1013
Issue #1014 created #created-1014
Predicates, sequences of numbers: Feedback
Feedback on #996:
Successful result: early exit
If the EBV is computed, and if the first item is a node, the remaining items are ignored:
'OK'[ <a/>, 1, 'x' ] → 'OK'
I would suggest doing the same for predicates that start with a number:
- If a comparison is successful, an implementation should be allowed to skip the remaining comparisons.
- If $seq starts with a number, E[$seq] and E[position() = $seq] will become equivalent:
$seq[1 to 100, 'x']
$seq[position() = (1 to 100, 'x')]
Obviously, an error still needs to be raised if a comparison leads to a type error.
Error Codes
If we don’t equate E[$seq] and E[position() = $seq], it would be useful to stick with FORG0006 (instead of XPTY0004) [1]:
- It would be confusing to get FORG0006 for E['x', 1] and XPTY0004 for E[1, 'x'].
- If a processor uses unified implementations for EBV and predicate checks, it leads to additional effort just because the error code differs.
If we use different error codes, existing tests need to be revised ([2], maybe others).
[1] https://github.com/qt4cg/qtspecs/pull/996/files#diff-b37a92a9eb3ab9ba48a00de9627a1124466b9c86ecb2b4989d04be3942c597a6R8240
[2] https://github.com/qt4cg/qt4tests/blob/70e52c690a26bbeee0641af14ccb319a2cc98081/prod/Predicate.xml#L1158-L1165
Issue #1012 closed #closed-1012
Fix some incorrect examples in the F&O spec
Issue #1013 created #created-1013
[XSLT] Need to say what happens when a capturing accumulator rule matches a non-element node
The option capture="yes" has been added to xsl:accumulator-rule
; its purpose is to indicate that the entire subtree under an element is to be captured during streamed processing of the document, and is made available as an in-memory tree once the element end tag has been processed.
We need to say what happens when such a rule matches a node other than an element. I think it makes sense for the capture="yes" to be ignored, optionally with a warning.
Pull request #1012 created #created-1012
Fix some incorrect examples in the F&O spec
No issue raised; the errors are revealed by the generated QT4 tests.
Issue #1011 created #created-1011
fn:transform() improvements
- The spec talks about how to invoke an XSLT 1.0, 2.0, or 3.0 processor, but not a 4.0 processor.
- There is no way of supplying a source document in a way that allows streaming. Saxon has added a source-location parameter for this purpose; this should be in the standard.
- If the stylesheet is to read streamed input, then there also needs to be control over whether and how it does schema validation.
- When calling from XSLT, the best default for base-output-uri is probably the value of current-output-uri(). The default is currently implementation-defined, but we should recommend this possibility.
- The post-process option was added with the aspiration that it would enable secondary result documents (xsl:result-document output) to be written directly (e.g. to filestore) as a side-effect. However, it fails to achieve this. There should probably be an option to request this even though we cannot define its semantics precisely.
Pull request #1010 created #created-1010
1009 Examples, Return type of parse-json:fallback
Issue: #1009
I’ve used this PR to fix some other buggy examples in the XQFO spec.
Issue #1009 created #created-1009
QT4CG-064-03, QT4CG-064-04: Examples, Return type of `fallback`
Thanks for the attentive inspection of #975.
The type xs:untypedAtomic
for the fallback
function of fn:parse-json
made no sense indeed: JSON escape sequences can never be converted to numbers. The return type will be xs:anyAtomicType
instead of item()
(the result will be converted to a string).
I’ll revise the rules and add some examples.
Issue #977 closed #closed-977
Ignore this, it's just a test
Pull request #1008 created #created-1008
1002 Add fn:take-while function (replacing subsequence-before)
Adds function fn:take-while
, replacing/reinstating previously proposed items-before()
and subsequence-before()
.
Fix #1002
Issue #1007 created #created-1007
How to invert a predicate function
It's nice to be able to write
index-where($in, contains(?, 'e'))
to select all the items that contain an 'e'.
What should we write in order to select all the items that do not contain an 'e'? All formulations seem a bit clumsy in comparison:
index-where($in, fn{not(contains(., 'e'))})
index-where($in, chain((contains(?, 'e'), not#1)))
Perhaps this is a sufficiently common requirement that it would be helpful to allow
index-where($in, inverse(contains(?, 'e')))
where inverse($predicate)
is defined as fn($it, $pos){not($predicate($it, $pos))}
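The same idea is easy to express as a higher-order function in other languages. The following Python sketch is an analogy, not a proposal for the specification text: `index_where` is a simplified stand-in for fn:index-where that passes the item and its 1-based position to the predicate, and `contains_e` is a hypothetical example predicate.

```python
def inverse(predicate):
    """Return a predicate that yields the negation of the supplied one."""
    return lambda item, pos: not predicate(item, pos)


def index_where(items, predicate):
    """Return the 1-based positions of the items for which predicate is true."""
    return [pos for pos, item in enumerate(items, start=1) if predicate(item, pos)]


def contains_e(item, pos):
    # The position argument is accepted but unused, as with contains(?, 'e')
    return "e" in item


words = ["red", "blue", "pink"]
print(index_where(words, contains_e))
print(index_where(words, inverse(contains_e)))
```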
Issue #1006 created #created-1006
regular expression addition - word boundaries
Could we provide support for the regex character sequence \b
for matching word boundaries?
It’s already supported by some processors via vendor-specific flags, and would be very helpful even if it didn’t cover the full Unicode range.
Issue #1005 created #created-1005
regular expressions - whitespace
There is some confusion about the rationale for defining the multi-character escape for whitespaces in a recent discussion on Slack:
- \s is limited to [#x20\t\n\r]
- In contrast, \w covers [#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}], i.e., considers the full Unicode range
Do we know the reason?
I assume it’s both too late and out of scope to change that in our specs, but maybe we can improve the XQFO spec and…
- mention why \s does not include \p{Zs} or \p{Z}
- add an example for looking up non-breaking spaces… for example:
matches(
string-join(('my', 'pleasure'), char(0xA0)),
'\p{Z}'
)
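The difference between the two character classes can be checked against the Unicode database. The Python sketch below deliberately avoids Python's own `\s`, which (for str patterns) also matches non-breaking spaces and so behaves differently from the XSD definition. The helper names are hypothetical, and `matches_p_Z` approximates \p{Z} by testing whether the general category starts with "Z".

```python
import unicodedata

# The four characters listed in the issue as the definition of \s
XSD_WHITESPACE = {"\u0020", "\t", "\n", "\r"}


def matches_xsd_s(ch):
    """True if ch is in the XSD-style \\s set quoted in the issue."""
    return ch in XSD_WHITESPACE


def matches_p_Z(ch):
    """True if ch is in a separator category (Zs, Zl or Zp), approximating \\p{Z}."""
    return unicodedata.category(ch).startswith("Z")


text = "\u00A0".join(["my", "pleasure"])  # joined with a non-breaking space
print(any(matches_xsd_s(ch) for ch in text))
print(any(matches_p_Z(ch) for ch in text))
```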
Issue #1004 closed #closed-1004
fn:char updated as agreed 2024-02-06: drop the form char('#x20')
Pull request #1004 created #created-1004
fn:char updated as agreed 2024-02-06: drop the form char('#x20')
The forms char('#32') and char('#x20') are dropped.
Issue #963 closed #closed-963
Errors in forming function items (continued)
Issue #888 closed #closed-888
Reclassify XPDY0002 as a type error
Issue #992 closed #closed-992
888, 963: Error handling for unsatisfied context dependencies
Pull request #1003 created #created-1003
919 Use EBV in boolean callbacks
Changes functions with a $predicate callback function to use the effective boolean value of the result, mainly to allow things like index-where(*, fn{self::x})
.
Fix #919
QT4 CG meeting 064 draft minutes #minutes-02-06
Draft minutes published.
Issue #187 closed #closed-187
Add a 'while' clause to FLWOR expressions
Issue #943 closed #closed-943
187 Add FLWOR expression while clause
Issue #260 closed #closed-260
array:index-of
Issue #968 closed #closed-968
260 array:index-of
Issue #969 closed #closed-969
843-partial Standard, array & map functions: Equivalencies
Issue #973 closed #closed-973
fn:parse-json, fn:json-to-xml: `number-parser`, `fallback`
Issue #975 closed #closed-975
973 fn:parse-json, fn:json-to-xml: number-parser, fallback
Issue #984 closed #closed-984
959-partial Add fn:seconds function
Issue #993 closed #closed-993
989 (partial) Allow char() to take integer argument
Issue #830 closed #closed-830
Revise appendix D.4 of F+O: Illustrative user-written functions
Issue #997 closed #closed-997
830 Drop F+O appendix D.4
Issue #816 closed #closed-816
Predicates: Support for numeric sequences
Issue #996 closed #closed-996
816 Allow a predicate in a filter expression to be a sequence of numbers
Issue #995 closed #closed-995
937 (fn:hash) revised in light of CG feedback
Issue #628 closed #closed-628
distinct-values and duplicate-values: order of results
Issue #987 closed #closed-987
628 Define result order for distinct-values and duplicate-values
Issue #911 closed #closed-911
Type "Promotion" in the coercion rules
Issue #980 closed #closed-980
911 Coercion to allow double to decimal etc
Issue #966 closed #closed-966
Rewrite spec of deep lookup operator: edits
Issue #979 closed #closed-979
966 Minor fixes to deep lookup
Issue #964 closed #closed-964
fn:has-attributes
Issue #970 closed #closed-970
XQFO: Context item → value
Issue #971 closed #closed-971
970 XQFO: Context item → value
Issue #1002 created #created-1002
Reinstate subsequence-before
There's a question on StackOverflow today:
https://stackoverflow.com/questions/77944304/
that makes me think dropping subsequence-before
might have been a mistake (the replacement, subsequence-where
, doesn't allow the end condition to be exclusive).
The question is how to find all the consecutive list
elements that follow a given para
element. That would be solved with subsequence-before(following-sibling::*, fn{not(self::list)})
. Doing it with subsequence-where
is much harder - you need to drop the final element in the result if it is not a list
element, while also taking into account that the result might be empty.
I would like to propose reinstating subsequence-before; or perhaps inverting the predicate and naming it subsequence-while()
, so it becomes subsequence-while(following-sibling::*, fn{self::list})
assuming we accept the proposal in issue #919 to allow a callback predicate to use EBV.
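The inverted-predicate form corresponds to the familiar "take while" operation. As a hedged illustration, the Python sketch below uses `itertools.takewhile`; element names stand in for the sibling nodes, and `subsequence_while` is a hypothetical name echoing the one suggested above.

```python
from itertools import takewhile


def subsequence_while(items, predicate):
    """Return the leading items for which predicate holds, stopping at the first failure."""
    return list(takewhile(predicate, items))


def is_list(name):
    return name == "list"


# Names of the elements that follow a given para element
following_siblings = ["list", "list", "para", "list"]
print(subsequence_while(following_siblings, is_list))

# The empty case needs no special handling
print(subsequence_while(["para", "list"], is_list))
```

The end condition is exclusive by construction, so there is no final element to drop and the empty result falls out naturally.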
Issue #1001 created #created-1001
fn:subsequence-where: equivalent `fn:slice` expression
I probably should have tagged #940 with »Request Changes«, as I believe the equivalent expression with fn:slice
needs to be fixed (or removed if it turns out to be too quirky): https://github.com/qt4cg/qtspecs/pull/940#issuecomment-1919399348.
QT4 CG meeting 064 draft agenda #agenda-02-06
Draft agenda published.
Issue #940 closed #closed-940
878 Add subsequence-where function
Issue #1000 created #created-1000
XQFO Code in the Rules sections
In #978, it’s being discussed what is the best language for presenting code in the Rules sections of the XQFO specification. Currently, XPath is used for compact equivalencies, for example…
(: array:size :)
count(array:members($array))
(: fn:remove :)
$input[not(position() = $positions)].
...while XQuery is used for more complex expressions, including function declarations, or when the XPath representation would be syntactically more complex. Examples:
(: fn:deep-equal :)
declare function equal-strings(
$string1 as xs:string,
$string2 as xs:string,
$collation as xs:string,
$options as map(*)
) as xs:boolean {
let $n1 := if ($options?whitespace = "normalize")
then normalize-unicode(?, $options?normalization-form)
else identity#1
let $n2 := if ($options?normalize-space)
then normalize-space#1
else identity#1
return compare($n1($n2($string1)), $n1($n2($string2)), $collation) eq 0
}
(: fn:index-where :)
for $item at $pos in $input
where $predicate($item, $pos)
return $pos
(: …flatten, fold-left, while-do, others :)
Finally, we have many cases in which XPath/XQuery code is omitted, either because the presented feature is basic enough, because the equivalent code would get too complicated, or (e.g., for fn:doc
) because it does not provide means to express the feature.
We should strive for consistency and decide which language(s) the majority of us believes is the best choice…
- XPath & XQuery (what we currently have)
- XPath only
- XPath, XQuery and XSLT (whatever seems most appropriate)
- Other pseudocode
- Don’t use pseudocode at all if it is too complex to be represented with moderately simple XPath code
Issue #999 created #created-999
regular expression addition - comments
The original Perl regular expression syntax allows comments with the x flag. They use # to introduce comments up to a newline.
Maybe we could support XPath-style comments in regular expressions, such as (:#.......#:)
when the x flag is present?
Today i use (?:comment: stuff here )?
but this requires that "stuff here" can be compiled into a regular expression!
Issue #998 created #created-998
regular expression addition - lookbehind assertions and lookahead assertions
look-ahead assertions are i think the most useful things not found in qt regular expressions, and also look-behind.
This lets you do things like
replace( ., '
/ ( [^/]+ ) (*positive_lookahead: /)
', '...', 'x')
replacing components between /..../ but not consuming the trailing /, so that /a/b/c/d/ comes out as /../../../../
Perl uses (?=pattern), (*pla:pattern), (*positive_lookahead:pattern) (?!pattern), (*nla:pattern), (*negative_lookahead:pattern) to match only if the pattern is (or is not) followed by a match to pattern,
and (?<=pattern), \K, (*plb:pattern), (*positive_lookbehind:pattern) (?<!pattern), (*nlb:pattern), (*negative_lookbehind:pattern) for zero-width look-behind assertions.
Note, libpcre (and older Perl versions) restrict lookbehind assertions to fixed length. You can write (?<=dog|cat) food to match " food" preceded by "dog" or "cat", but you cannot write (?<=dogs?|cats?) barking.
\C is also forbidden, as are capturing subgroups. But the facility is still very useful, and reduces the need for repeated substitutions.
I propose adding only the first form in each case, not the newer "*" forms, which are less widely supported.
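The path example can be reproduced in any engine that already supports the `(?=pattern)` form. The following Python sketch uses the standard `re` module; the function names are hypothetical, and the replacement string "/.." is an assumption chosen so that the output matches the result quoted above.

```python
import re


def mask_with_lookahead(path):
    """Replace each component that is followed by '/', without consuming that '/'."""
    return re.sub(r"/[^/]+(?=/)", "/..", path)


def mask_without_lookahead(path):
    """Same intent, but the trailing '/' is consumed, so adjacent components are skipped."""
    return re.sub(r"/[^/]+/", "/../", path)


print(mask_with_lookahead("/a/b/c/d/"))
print(mask_without_lookahead("/a/b/c/d/"))
```

The second function illustrates why repeated substitutions are otherwise needed: every other component is left untouched after a single pass.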
Pull request #997 created #created-997
830 Drop F+O appendix D.4
This PR drops the non-normative appendix D.4, which contained illustrative user-written functions. It was a very patchy and disorganised collection of functions which were primarily there because someone had proposed adding a function to the standard library and the WG had turned down the suggestion on the grounds that users could easily write the function themselves. It's not worth the effort of rewriting the appendix to take 4.0 enhancements into account.
Fix #830
Pull request #996 created #created-996
816 Allow a predicate in a filter expression to be a sequence of numbers
Fix #816
Pull request #995 created #created-995
937 (fn:hash) revised in light of CG feedback
This revises #937 (catalyzed by #779) in light of the CG discussion that approved the PR.
- Output is now xs:hexBinary?
- Second parameter $algorithm replaced with an option map. In the specs I avoided fos:values/fos:value, because this would disallow case/space normalization, and it would effectively disallow any implementation-defined algorithms not on the list of three algorithms, and trigger the dynamic error described in rule 6 in the option parameter conventions.
- Options map has only one option. It doesn't make sense to provide an option changing the kind of output.
- I think this is the first example of an options map where the fos:meaning has rich text (paragraphs, unordered lists). It builds and renders fine locally.
- Extra note on the output format.
- Examples are expressed as chained functions, to illustrate how to get the customary string values.
Issue #994 created #created-994
Invoking maps & arrays: allow sequences?
Can’t we support integer sequences as arguments in dynamic function calls on maps and arrays?
The following query is already valid…
[ 3, 4, 5 ] ? (2, 1)
…but the number of users who are able to decode this syntax is very limited. It would be easier to allow:
[ 3, 4, 5 ](2, 1)
I would expect the results to be returned in the supplied order (4
and 3
).
Pull request #993 created #created-993
989 (partial) Allow char() to take integer argument
Addresses the use case in issue #989. (But leave the issue open for now).
Discussion point: should we drop the options char("#32")
and char("#x20")
as they now seem redundant?
Pull request #992 created #created-992
888, 963: Error handling for unsatisfied context dependencies
Fix #963 by providing more detail on the expected error handling for partial function application.
Fix #888 by making XPDY0002 a type error rather than a dynamic error.
Issue #991 created #created-991
Invisible-xml - missing details
The spec for invisible-xml doesn't say whether the parsing function returns an element node or a document node.
It should also say, for completeness, that the parsing function is "nondeterministic with respect to node identity" - that is, if you parse the same input twice, it's undefined whether you get the same node twice, or different nodes.
Issue #990 created #created-990
Transitive closure on non-nodes
In PR #988 I inadvertently used the transitive-closure function to process non-nodes; that's not supported by the current specification of the function.
The only difficulty in extending it is how to define a suitable identity comparator so we know when to terminate. Probably this should be done using a callback, defaulting to op('is')
. In the use case of PR #988, the comparator could be supplied as false#0
- the step function is acyclic, so we can treat all items reached as distinct.
Issue #989 created #created-989
character sequence constructor 'a' to 'z'
Although you can write 'a' to 'z' as
(string-to-codepoints('a') to string-to-codepoints('z')) ! codepoints-to-string(.)
i’m not sure this is easily discoverable.
Note, 'a' to 'z' is obviously dependent on the current collation - if you're using EBCDIC may the dogs help you. I’m assuming, however, any two Unicode characters could appear in the string literals, and it’d be an error to have more than one character in either string.
This means 'ċ' to 'ŗ’ would be an error, not equivalent to 'c' to 'r' (taking the first character of each string), since those are not precomposed forms.
Mostly, i tend to write this because of other languages - e.g. Perl has 'a' .. 'ÿ' or whatever, as does Ruby.
ICU in Python has UnicodeSet('[[:Ll:]&[:Latin:]]') which is powerful but grokhard (& here is intersect i think).
Although 'a' to 'z' is probably what i've seen & used most often, 'a' to 'f' and '0' to '9' are also obvious candidates.
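For comparison, the intended semantics can be sketched in Python. This is an illustration under the assumptions stated in the issue: each bound must be exactly one character (one code point), and the range is defined by code point order. The name `char_range` is hypothetical.

```python
def char_range(first, last):
    """Return the characters from first to last inclusive, by code point."""
    if len(first) != 1 or len(last) != 1:
        # A decomposed character such as 'c' + U+0307 has length 2 and is rejected
        raise ValueError("each bound must be exactly one character")
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]


print("".join(char_range("a", "z")))
print("".join(char_range("0", "9")))
try:
    char_range("c\u0307", "r")  # 'c' followed by COMBINING DOT ABOVE
except ValueError as err:
    print("error:", err)
```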
Pull request #988 created #created-988
960 Pinned and labeled values
This PR introduces the concepts of pinned and labelled values and the way in which they can be used to obtain additional information about the results of a deep lookup or map/array navigation operation. The changes at this stage are confined to a new section in the data model spec introducing the concept of labeled items, and a new section in the XPath language spec showing how these are used when navigating maps and arrays. This is a first step; if the WG approves of the general approach, there will be a lot more detail to add in due course.
The PR addresses a number of open issues:
- Issue #960 - flattening of results from $map??KS
- Issue #711 - using annotations for navigation of JSON trees
- Issue #596 - pinned values: transforming trees
- Issue #350 - CompPath (composite objects path) expressions
- Issue #334 - Transient properties: selection and update in maps and arrays
- Issue #262 - navigation in deep-structured arrays
- Issue #108 - template match using values of tunnel parameters
It does not claim to resolve them all, but I believe it provides the groundwork for doing so.
Pull request #987 created #created-987
628 Define result order for distinct-values and duplicate-values
Fix #628
Issue #986 created #created-986
Numeric Comparisons
We've been trying to change the semantics of numeric comparison without breaking existing applications. As a result, the current status quo is very messy. Let's review where we are.
The eq/lt operators, given mixed operand types, convert decimal operands to double and compare as double. No change from 3.1. This comparison is not transitive in edge cases. The =
and <
operators are defined in terms of eq
and lt
.
Map key comparisons compare as "infinite precision decimal". No change from 3.1. This comparison is now exposed as fn:atomic-equal().
deep-equal() refers to atomic-equal(), which is a change in behaviour from 3.1.
distinct-values() refers to deep-equal(), which is a change in behaviour - deliberate, because it needs to be transitive.
index-of() refers to eq. No change from 3.1.
compare() has been newly introduced; like atomic-equal() it uses infinite precision decimal for comparison.
sort() uses compare(). This is a change from 3.1; again needed because transitivity is important.
min() and max() use compare(). This is a change from 3.1.
The new highest() and lowest() functions use sort().
XSLT for-each-group refers to distinct-values().
XSLT xsl:sort currently refers to numeric-compare() and will presumably change to use compare().
XSLT xsl:merge refers to xsl:sort
XQuery "group by" refers to deep-equal()
XQuery "order by" refers to compare()
So:
- Nearly everything now uses decimal comparison where the operands are of mixed type
- There are many different ways that we say this - it's often indirect. There are only two comparison methods, but you have to follow a chain of references to work out which one applies.
- The two exceptions that still do comparison the 3.1 way (converting both operands to xs:double) are (a) the eq/lt/=/< operators, and (b) the index-of and array:index-of functions.
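The edge-case non-transitivity of the double-based comparison can be illustrated with a minimal Python sketch. This is a model, not XPath: Python's float stands in for xs:double, decimal.Decimal stands in for xs:decimal, and eq_via_double is a hypothetical helper that mimics "convert to double when the operand types are mixed".

```python
from decimal import Decimal


def eq_via_double(a, b):
    """Model of 'eq' on mixed operands: if either side is a double, compare as doubles."""
    if isinstance(a, float) or isinstance(b, float):
        return float(a) == float(b)
    return a == b  # both sides are decimals: exact comparison


a = Decimal(2**53)      # models xs:decimal 9007199254740992
b = float(2**53)        # models xs:double  9007199254740992e0
c = Decimal(2**53 + 1)  # models xs:decimal 9007199254740993

print(eq_via_double(a, b))  # a eq b holds
print(eq_via_double(b, c))  # b eq c holds, because 2**53 + 1 rounds to 2**53 as a double
print(eq_via_double(a, c))  # a eq c does not hold, so the relation is not transitive
```

Comparing all three values as exact decimals instead (the "infinite precision decimal" approach) gives a consistent answer: a equals b, and neither equals c.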
There are definitely things that now break. For example I was working on tests yesterday with assertions in the form deep-equal(nodes/number(.), (8.2, 5.4, 6.5)) - that is, comparing doubles to decimals. The nodes actually contain the strings "8.2", "5.4", "6.5". The test was failing because converting the string "8.2" to a double and then converting the double to a decimal does not produce the decimal value 8.2.
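The failure described here can be reproduced outside XPath. The following Python sketch is only an analogy (float models xs:double, decimal.Decimal models xs:decimal), but it shows why the string "8.2" converted to a double and then to a decimal is not the decimal 8.2.

```python
from decimal import Decimal

as_double = float("8.2")         # models number("8.2"), the nearest double to 8.2
via_double = Decimal(as_double)  # exact decimal expansion of that double
literal = Decimal("8.2")         # models the decimal literal 8.2

print(via_double == literal)  # False: the double is only an approximation of 8.2
print(via_double < literal)   # True: the nearest double lies slightly below 8.2
```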
This mixed bag really doesn't seem acceptable. What options do we have?
- Be bold: make everything uniformly use transitive comparisons, and accept that some user code will break.
- Be timid: use transitive comparisons only where it really matters (distinct-values, grouping, sorting) and use promotion to double everywhere else.
- Compromise: introduce a compatibility mode, or a context option that allows users to control the behaviour, or another set of comparison operators.
Any other ideas?
Pull request #985 created #created-985
720 Add lookup arrow expressions (method invocations)
Fix #720.
Replaces #916.
Issue #948 closed #closed-948
fn:scan-left and fn:scan-right - produce accumulation of results
Pull request #984 created #created-984
959-partial Add fn:seconds function
Issue: #959
Issue #983 created #created-983
fn:reduce (or fn:fold without initial value)
Various languages (Kotlin, F#, Haskell, Rust, Scala, others) offer two functions for what we summarize as folds: one that accepts an initial value and another one that consumes the first item of the input as initial value. The first function is usually called fold, the latter is called reduce, but some languages (like JavaScript) pack the functionality into a single function.
We have the same options:
- We could tweak fn:fold-left and fn:fold-right in a way that the zero argument is ignored if it’s not explicitly supplied:
fold-left(1 to 5, action := op('*'))
However, the behavior would then differ from fold-left(1 to 5, (), action := op('*')), which is something we tried to avoid in more recent functions (there are functions like fn:name, though, that behave similarly: fn:name(()) and fn:name() do something different).
- We could also introduce a separate function fn:reduce (or 2 variants, resp. 4 if we include arrays):
fn:reduce(
$input as item()*,
$action as function(item()*, item()) as item()*
) as item()*
This would allow us to do reduce(1 to 5, op('*')), and for some people, a reduce function will be more familiar than a fold.
On Wikipedia, there’s a good summary on folds in various languages.
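For comparison, Python is one of the languages that packs both variants into a single function (functools.reduce, with an optional initial value). The sketch below separates them again; the names fold_left and reduce_left are hypothetical wrappers and not part of any proposal.

```python
from functools import reduce
from operator import mul


def fold_left(items, zero, action):
    """Fold with an explicit initial value, like fn:fold-left."""
    return reduce(action, items, zero)


def reduce_left(items, action):
    """Fold that takes the first item as the initial value, like the proposed fn:reduce."""
    return reduce(action, items)


print(fold_left(range(1, 6), 1, mul))  # 120
print(reduce_left(range(1, 6), mul))   # 120
```

Python raises a TypeError when reduce is called on an empty input without an initial value; what fn:reduce should return for an empty sequence is not stated in the issue.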
Issue #982 created #created-982
scan-left, scan-right: position argument, array functions
We have added an optional position argument to nearly all callback functions that are invoked once for each item in a sequence. This argument is omitted from the new scan-left and scan-right functions. It should be added for consistency.
One of the proposed use cases for scan-left and scan-right is for debugging calls on fold-left and fold-right. This use case requires that the callback functions in the two cases are compatible.
Background note: the optional position argument is modelled on JavaScript, where it is permitted in all the common higher-order functions such as filter, forEach, reduce and reduceRight (which are the JavaScript equivalent of fold-left and fold-right). JavaScript doesn't appear to offer an equivalent of scan-left and scan-right.
Issue #981 created #created-981
Identify optional arguments in callback functions
It was pointed out today that it is not obvious, looking at a function signature like
fn:filter(
$input as item()*,
$predicate as function(item(), xs:integer) as xs:boolean
) as item()*
that the second argument of the $predicate function is optional.
At least in the documentation, it would be useful to capture this in some way. Being "optional" here means that it makes sense, semantically, to supply an arity-1 function, in which case the caller will not supply the second argument.
Perhaps it would also be useful to go beyond documentation, and attach some syntax and semantics to it. Specifically, if the signature of the callback function indicates that the first N arguments are required, then supplying a function item of arity less than N will result in a type error.
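A rough Python model of this behaviour, assuming the caller inspects the arity of the supplied callback: filter_items is a hypothetical stand-in for fn:filter, the first callback argument is treated as required and the position as optional, and any other arity is rejected with a type error.

```python
import inspect


def filter_items(items, predicate):
    """Model of fn:filter: the callback receives (item) or (item, position)."""
    arity = len(inspect.signature(predicate).parameters)
    if arity not in (1, 2):
        # One argument is required, at most two are supplied.
        raise TypeError(f"predicate must have arity 1 or 2, got {arity}")
    result = []
    for position, item in enumerate(items, start=1):  # positions are 1-based
        keep = predicate(item, position) if arity == 2 else predicate(item)
        if keep:
            result.append(item)
    return result


print(filter_items([10, 20, 30], lambda item: item > 10))      # [20, 30]
print(filter_items([10, 20, 30], lambda item, pos: pos != 2))  # [10, 30]
```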
Pull request #980 created #created-980
911 Coercion to allow double to decimal etc
Fix #911. The coercion rules are changed to allow implicit casts among numeric types, for example a double can be supplied where the required type is decimal. (The term "promotion" is now used only for operators, where the two operands must be converted to a common type.)
Pull request #979 created #created-979
966 Minor fixes to deep lookup
Fix #966
Apply suggestions in Christian Grün's comments on PR #927.
Pull request #978 created #created-978
948 Reflected the comments of the CG on the specification of scan-left and scan-right
Reflected the comments of the CG on the specification of scan-left and scan-right
Pull request #977 created #created-977
Ignore this, it's just a test
I'm trying to work out why sometimes PR succeeds when it contains markup errors...
Issue #976 closed #closed-976
Fix markup errors with fos:notes
Pull request #976 created #created-976
Fix markup errors with fos:notes
QT4 CG meeting 063 draft minutes #minutes-01-30
Draft minutes published.
Issue #957 closed #closed-957
948 Added fn:scan-left and fn:scan-right
Issue #965 closed #closed-965
XQFO: minor edits and bug fixes
Pull request #975 created #created-975
973 fn:parse-json, fn:json-to-xml: number-parser, fallback
Issue: #973.
In addition, I fixed the description for the fallback option of fn:parse-json, as it seemed incomplete to me:
The function is called when the JSON input contains a special character (as defined under the escape option) that is valid according to the JSON grammar, whether the special character is represented in the input directly or as an escape sequence.
Issue #974 closed #closed-974
Rules for context-dependent function references in XSLT (e.g. regex-group#1)
QT4 CG meeting 063 draft agenda #agenda-01-30
Draft agenda published.
Issue #974 created #created-974
Rules for context-dependent function references in XSLT (e.g. regex-group#1)
I'm not sure where we are on this one.
Does regex-group#1 capture the "current matching substrings" component of the dynamic context?
XSLT 4.0 test case analyze-string-101 suggests that it does, and that this represents a change from 3.0 -- there are separate versions of the test with different expected results for the two cases.
But §5.4 of the XSLT 4.0 spec still ends with the sentence:
This rule does not extend to the XSLT extensions to the dynamic context defined in this section. If a dynamic function call is made that depends on the XSLT part of the dynamic context (for example, regex-group#1(2)), then the relevant components of the context are cleared as described in the table above.
I suspect this sentence should have been deleted, but need to track down the history.
Issue #973 created #created-973
fn:parse-json, fn:json-to-xml: `number-parser`, `fallback`
- See #33: number-parser option needs to be added to fn:json-to-xml.
- Similar to fn:replace, I would suggest using function(xs:untypedAtomic) as item()? as signature for both the fallback and the number-parser option, and (for fallback) to invoke fn:string#1 on the result. This way, explicit casts in the code become obsolete. Queries like the following one…
parse-json(
'-123',
map { 'number-parser': fn($n) { $n => number() => abs() } }
)
…can then be simplified to parse-json('-123', abs#1).
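As a point of comparison only, Python's json module has a similar hook: json.loads accepts parse_int and parse_float callbacks that receive the lexical form of each number as a string. The sketch below mirrors the number-parser example above; number_parser is a hypothetical name, and the Python API is not a model of the proposed signature.

```python
import json


def number_parser(lexical: str) -> float:
    """Receives the number as written in the JSON text and returns its absolute value."""
    return abs(float(lexical))


# Use the same callback for integers and for numbers with a fraction or exponent.
result = json.loads("-123", parse_int=number_parser, parse_float=number_parser)
print(result)  # 123.0
```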
Pull request #972 created #created-972
949 Partial Function Applications: Allow return of function name
Issue: #949. The major change is the rule for the name property of partial function applications for static function calls.
In addition, I have unified the presentation of the different function item expressions.
Affects test cases like xqhof40 (which need to be fixed with or without this PR).
Pull request #971 created #created-971
970 XQFO: Context item → value
Issue: #970
Issue #970 created #created-970
XQFO: Context item → value
Resulting from #129 (related: #755): Many rules in the XQFO spec still refer to the context item, which is currently defined as follows:
When the context value is a single item, it can also be referred to as the context item; when it is a single node, it can also be referred to as the context node.
We need to make clear what’s going to happen if the context value is not a single item:
- In many cases, this can simply be done by replacing “context item” with “context value”.
- In some cases (e.g. for fn:string#0), we should specify that an error is raised (usually XPTY0004) if the input is not a single item.
Issue #878 closed #closed-878
Proposed extension to subsequence
Pull request #969 created #created-969
843-partial Standard, array & map functions: Equivalencies
Issue: #843
Maybe we should keep the issue open after merging this PR.
Pull request #968 created #created-968
260 array:index-of
Issue: #260
Issue #967 created #created-967
XPath Appendix I: Comparisons
Adopted from https://github.com/qt4cg/qtspecs/issues/260#issuecomment-1908033129
It would be useful if all functions that perform comparison included a cross-reference to the new XPath Appendix I; and I note that appendix doesn't seem to mention index-of.
@michaelhkay I’ve taken the liberty of assigning this to you, as I wasn’t sure what this is about.
Issue #874 closed #closed-874
878 Proposed extension to subsequence
Issue #966 created #created-966
Rewrite spec of deep lookup operator: edits
@michaelhkay I've created a little new issue to keep track of the 3 minor suggestions that I made in the PR that we merged today: https://github.com/qt4cg/qtspecs/pull/927
Pull request #965 created #created-965
XQFO: minor edits and bug fixes
Issue #818 closed #closed-818
Foxpath integration
Issue #693 closed #closed-693
QT4 Tests without counterpart in the specs
Issue #639 closed #closed-639
fn:void: Naming, Arguments
Issue #937 closed #closed-937
779 hash function
Issue #946 closed #closed-946
fn:iterate-while → fn:while-do, fn:do-until
Issue #962 closed #closed-962
946 fn:iterate-while → fn:while-do, fn:do-until
Issue #951 closed #closed-951
Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-with-id
Issue #958 closed #closed-958
951 Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-id
Issue #945 closed #closed-945
Module import: apparent contradiction
Issue #952 closed #closed-952
945 module import contradiction
Issue #950 closed #closed-950
Minor edits (examples, rules)
Issue #939 closed #closed-939
Remove fn:numeric-compare
Issue #941 closed #closed-941
939 Remove fn:numeric-compare
Issue #936 closed #closed-936
877 revised rules for op:binary-less-than
Issue #861 closed #closed-861
Precise meaning of $E??KS
Issue #927 closed #closed-927
861 Rewrite spec of deep lookup operator
QT4 CG meeting 062 draft minutes #minutes-01-23
Draft minutes published.
Issue #964 created #created-964
fn:has-attributes
Trivial (motivated by a user request):
As there is an fn:has-children function, it seems surprising that there is no fn:has-attributes function.
I would suggest…
- adding this function to the spec, or
- indicating in a note (for fn:has-children) why this function is missing.
Issue #963 created #created-963
Errors in forming function items (continued)
In #894, the following rule was added to the definition of Named Function References:
An error is raised if the identified function depends on components of the static or dynamic context that are not present, or that have unsuitable values. […]
DC0001 is raised for the call fn:id#1 if the context item is not a node in a tree that is rooted at a document node.
We should be consistent and add this rule to Partial Function Applications and Inline Function Expressions as well. Perhaps such rules could be defined just once for all affected function item constructors in the parent section?
Also, the error code doesn’t seem to be properly defined in the spec; it shows [ERROR errorref DC0001 NOT FOUND] (maybe that’s intentional at this editing stage).
QT4 CG meeting 062 draft agenda #agenda-01-23
Draft agenda published.
Pull request #962 created #created-962
946 fn:iterate-while → fn:while-do, fn:do-until
My first thought was to name the second function fn:do-while, but fn:do-until with an inverted predicate seemed more appropriate to me.
Issue: #946
Issue #961 created #created-961
Simulating Objects: Performance
Related to #953, #917 and #916, I wonder whether we are aware enough of the essential differences when we think of objects in a functional language:
- Mutable objects are extremely efficient, as an update is a simple main-memory value change.
- Immutable data structures need to be fully copied if a single value changes. As a result, the update of a map with, let’s say, 1 string and 50 functions would be a new map with 1 string and 50 functions. Even with efficient immutable map implementations that we have, I doubt that it makes sense to create full copies with 1+50 entries, of which only 1 string will be different.
- Imagine a FLWOR expression that creates 1000 of such maps, with possibly 1 value that’s different in each instance. We don’t need 1000 copies of 50 functions; the memory consumption would be much smaller if we only stored relevant values.
This thread is not about premature optimization; I just want to be sure we think about the obstacles when using maps for objects. Maybe the solutions are already on the horizon; maybe we could tackle some of the concerns with the definition of default values…
declare record person(
name as xs:string,
title := (),
full := fn { string-join((?title, ?name), ' ') }
);
…and maps with type annotations. If we don’t materialize defaults, the embedded annotation would indeed need to affect functions like map:get, as questioned by Michael in https://github.com/qt4cg/qtspecs/pull/953#issuecomment-1896078605.
Issue #960 created #created-960
Should ??KS flatten the results
Currently the result of ??KS (like ?KS) is flattened. So if you do ??dimensions and the value of each dimensions entry is a sequence of zero or more numbers, the result munges them all together into a single sequence (dropping any empty values in the process).
Should we change this, for example to return a sequence of arrays, or an array of sequences?
This makes life a bit more difficult in the simple case where all the values are singletons -- and notably, when constructing a path such as ??A??B??C -- but it makes it possible to handle the more general case where they aren't all singletons.
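The difference between the two result shapes can be sketched in Python. This is a loose model under stated assumptions: dicts stand in for maps, lists stand in for both arrays and sequences, and deep_lookup is a hypothetical helper that returns one entry per match instead of flattening.

```python
def deep_lookup(value, key):
    """Collect the value stored under `key` at any depth, one list entry per match."""
    found = []
    if isinstance(value, dict):
        for k, v in value.items():
            if k == key:
                found.append(v)
            found.extend(deep_lookup(v, key))
    elif isinstance(value, list):
        for member in value:
            found.extend(deep_lookup(member, key))
    return found


data = {"boxes": [{"dimensions": [1, 2, 3]}, {"dimensions": []}, {"dimensions": [4]}]}

grouped = deep_lookup(data, "dimensions")        # one entry per match, empties kept
flattened = [n for seq in grouped for n in seq]  # current behaviour: everything merged

print(grouped)    # [[1, 2, 3], [], [4]]
print(flattened)  # [1, 2, 3, 4]
```

In the flattened form the empty entry disappears and the boundaries between the other entries are lost, which is the information the question is about.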
Issue #959 created #created-959
Milliseconds ↔ xs:dayTimeDuration, Unix time ↔ xs:dateTime
We should extend the constructor functions to convert integers (milliseconds, and the Unix time, starting from 1970-01-01T00:00:00Z) to xs:dayTimeDuration and xs:dateTime instances…
xs:dateTime(12345),
xs:dayTimeDuration(12345)
…and it should be possible to convert the values back to integers.
Related: https://docs.basex.org/wiki/Conversion_Module#Dates_and_Durations
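A minimal Python sketch of the round trip, assuming that both integers are counted in milliseconds (the issue does not say whether Unix time is meant in seconds or milliseconds). All four function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # 1970-01-01T00:00:00Z
ONE_MS = timedelta(milliseconds=1)


def duration_from_millis(ms: int) -> timedelta:
    return timedelta(milliseconds=ms)


def millis_from_duration(duration: timedelta) -> int:
    return duration // ONE_MS  # timedelta // timedelta yields an int


def datetime_from_millis(ms: int) -> datetime:
    return EPOCH + timedelta(milliseconds=ms)


def millis_from_datetime(moment: datetime) -> int:
    return (moment - EPOCH) // ONE_MS


print(duration_from_millis(12345))                          # 0:00:12.345000
print(datetime_from_millis(12345).isoformat())              # 1970-01-01T00:00:12.345000+00:00
print(millis_from_datetime(datetime_from_millis(12345)))    # 12345
```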
Pull request #958 created #created-958
951 Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-id
Editorial (#951): Reverts the changes made in #901 for 4 context-dependent functions.
Pull request #957 created #created-957
948 Added fn:scan-left and fn:scan-right
As discussed at https://github.com/qt4cg/qtspecs/issues/948
Pull request #956 created #created-956
850-partial Editorial improvements to parse-html()
Related to issue #850, but doesn't close it entirely.
Issue #955 created #created-955
Options parameters as record types
In the new parse-html() function, the content of the options parameter is described using a record type. This differs from other functions, that describe the type as map(*), and have a statement that "the option parameter conventions apply".
Ideally we should use record types for all options parameters. However we need to check carefully that this does not affect edge-case compatibility, for example implicit conversions or the acceptability of extensions. If we can't do that then we should bring parse-html() into line with other functions such as parse-json().
Issue #954 created #created-954
Establish a default value for the XSLT fixed-namespaces attribute
The newly-defined fixed-namespaces attribute on the xsl:stylesheet element is a huge positive step towards improving programmer's productivity by removing the need to provide up to 9 namespace declarations in every stylesheet module, thus reducing unnecessary cluttering, simplifying and slimming the code and increasing its readability.
It seems like an accidental omission that the current text doesn't specify a default value for this attribute. If there is a well-chosen default value, this would even further decrease the requirements for the programmer to engage in such a non-problem-solving activity as entering memorized strings, and would prevent errors such as either not providing the correct values for the namespace-uris or forgetting to specify this new attribute.
One obvious candidate for a default value of the fixed-namespaces attribute is #standard, which means that without having to press even a single additional key, the XSLT programmer gets all standard namespaces automatically bound to the well-known prefixes:
xsl
xml
xs
xsi
fn
math
map
array
err
Proposal:
Please, augment the current text by specifying that the default value for the fixed-namespaces attribute is #standard
Pull request #953 created #created-953
617 Define record constructors
Fix #617
Note that this is a first step. Noticeably we can't yet use these constructor functions to create records that have methods. It's nevertheless a big step forward.
Pull request #952 created #created-952
945 module import contradiction
Some editorial clarifications regarding XQuery module import and schema import.
Fix #945
Issue #928 closed #closed-928
Minor edits through ch. 15
Issue #947 closed #closed-947
Reorganise F+O chapter 15 [editorial]
Issue #530 closed #closed-530
Escaping of forward slash in JSON output method
Issue #942 closed #closed-942
530 Fix typo, escape-solidus not escape-uri-attributes
Issue #880 closed #closed-880
872 Symmetry: fn:items-at → fn:get
QT4 CG meeting 061 draft minutes #minutes-01-16
Draft minutes published.
Issue #930 closed #closed-930
Obsolete comment under fn:deep-equal()
Issue #933 closed #closed-933
930 drop obsolete note about comments and PIs
Issue #932 closed #closed-932
931 Add rules for duration precision
Issue #931 closed #closed-931
Precision of duration arithmetic
Issue #737 closed #closed-737
295: Boost the capability of recursive record types
Issue #951 created #created-951
Parameters with default values: fn:lang, fn:id, fn:idref, fn:element-with-id
Since #895, absent optional arguments and empty sequences in built-in functions are treated identically. Exceptions are functions that already had different rules for such cases (e.g., fn:node(()) always returns an empty sequence, no matter what the context is).
I noticed we should also exclude fn:lang, fn:id, fn:idref, and fn:element-with-id: Otherwise, a compiler won’t be able to statically assess if a function call is dependent on the context.
Pull request #950 created #created-950
Minor edits (examples, rules)
- Examples were fixed.
- The equivalent expression for map:values was changed to $map?*.
Issue #949 created #created-949
Partial Function Applications: Allow return of function name
Without wanting to revive #889, an important observation I picked up is that named function references and “partially applied functions without applications” can be considered identical. That is, there should be no reason to distinguish between count#1 and count(?).
However, partially applied functions are currently defined to lose the reference to the original function and its arity (and some QT4 tests ensure that this is the case: https://github.com/qt4cg/qt4tests/blob/8649941e0e695ff8fb4cb27c52e99590cc88126f/misc/HigherOrderFunctions.xml#L1933).
From a user perspective, I see no reason why the two cases should be treated differently, and I would argue that we should either treat them identically or (at least) allow implementations to treat them identically, i.e., allowing an implementation to return count for function-name(count(?)).
Issue #948 created #created-948
fn:scan-left and fn:scan-right - produce accumulation of results
fn:scan-left and fn:scan-right - produce accumulation of results
In XPath 4.0 so far we still don't have a convenient way to express the functionality of producing a series of accumulated (accrued) results when applying a folding function over a collection (sequence, array, ...) of items. The general use-case for this is the task to produce a sequence of running totals when applying an operation over a sequence of data points: produce the partial sums of loan payments over fixed periods, produce the compounded amounts of a deposit with fixed interest rate over years, ..., etc.
Two functions (shamelessly borrowed from Haskell):
- fn:scan-left
- fn:scan-right
fn:scan-left
This function has a similar signature to that of fn:fold-left and produces the same final result, however it produces the complete (ordered) sequence of all partial results from every new value the accumulator gets during the evaluation of fn:fold-left.
Signature
fn:scan-left($input as item()*,
$zero as item()*,
$action as function(item()*, item()) as item()*
) as array(*)*
Properties
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Rules
The function is equivalent to the following implementation in XPath (return clause added for completeness):
let $scan-left-inner := function($seq as item()*,
                                 $zero as item()*,
                                 $fun as function(item()*, item()) as item()*,
                                 $self as function(*)
                                ) as array(*)*
  {
    (: wrap the current accumulator value in an array :)
    let $result := [$zero]
    return
      if (empty($seq)) then $result
      else
        (
          (: emit the current value, then recurse with the updated accumulator :)
          $result, $self(tail($seq), $fun($zero, head($seq)), $fun, $self)
        )
  },
  $scan-left := function($seq as item()*,
                         $zero as item()*,
                         $fun as function(item()*, item()) as item()*
                        ) as array(*)*
  {
    $scan-left-inner($seq, $zero, $fun, $scan-left-inner)
  }
return
  $scan-left(1 to 10, 0, op('+'))
Examples:
$scan-left(1 to 10, 0, op('+'))
produces:
[0]
[1]
[3]
[6]
[10]
[15]
[21]
[28]
[36]
[45]
[55]
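For readers more familiar with other languages, the same running totals can be produced in Python with itertools.accumulate. This is only an analogy: scan_left is a hypothetical wrapper, and the partial results are returned as a plain list, not as a sequence of single-member arrays.

```python
from itertools import accumulate
from operator import add


def scan_left(items, zero, action):
    """Return the initial value followed by every intermediate result of a left fold."""
    return list(accumulate(items, action, initial=zero))


print(scan_left(range(1, 11), 0, add))
# [0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
```

The last entry is the value a left fold would return, and an empty input yields just the initial value.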
fn:scan-right
This function has a similar signature to that of fn:fold-right and produces the same final result, however it produces the complete (ordered) sequence of all partial results from every new value the accumulator gets during the evaluation of fn:fold-right.
Signature
fn:scan-right($input as item()*,
$zero as item()*,
$action as function(item(), item()*) as item()*
) as array(*)*
Properties
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Rules
The function is equivalent to the following implementation in XPath (return clause added for completeness):
let $scan-right-inner := function($seq as item()*,
                                  $zero as item()*,
                                  $f as function(item(), item()*) as item()*,
                                  $self as function(*)
                                 ) as array(*)*
  {
    if (empty($seq)) then [$zero]
    else
      let $rightResult := $self(tail($seq), $zero, $f, $self)
      return
        ([$f(head($seq), head($rightResult))], $rightResult)
  },
  $scan-right := function($seq as item()*,
                          $zero as item()*,
                          $f as function(item(), item()*) as item()*
                         ) as array(*)*
  {
    $scan-right-inner($seq, $zero, $f, $scan-right-inner)
  }
return
  $scan-right(1 to 10, 0, op('+'))
Examples:
$scan-right(1 to 10, 0, op('+'))
produces:
[55]
[54]
[52]
[49]
[45]
[40]
[34]
[27]
[19]
[10]
[0]
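An equivalent iterative sketch in Python, again as an analogy only: scan_right is a hypothetical name, the callback takes (item, accumulator) as in fn:fold-right, and results are plain list entries, not single-member arrays.

```python
from operator import add


def scan_right(items, zero, action):
    """Return every intermediate result of a right fold, ending with the initial value."""
    results = [zero]
    for item in reversed(list(items)):
        # Combine the current item with the result accumulated to its right.
        results.append(action(item, results[-1]))
    results.reverse()
    return results


print(scan_right(range(1, 11), 0, add))
# [55, 54, 52, 49, 45, 40, 34, 27, 19, 10, 0]
```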
QT4 CG meeting 061 draft agenda #agenda-01-16
Draft agenda published.
Issue #899 closed #closed-899
Simplifying the language - types have behaviour.
Pull request #947 created #created-947
Reorganise F+O chapter 15 [editorial]
This PR reorganizes the subsections of F+O chapter 15 (the XML/HTML/JSON/CSV/IXML chapter). There is no change to content apart from a couple of introductory sentences. I'm planning to do some fine-grained work on the content in due course, but to make that easier to review it seems best to do the top-level reorganisation first. I'm therefore hoping that this PR will go through quickly "on the nod" so I can use it as a baseline for the detail changes.
Issue #946 created #created-946
fn:iterate-while → fn:while-do, fn:do-until
First feedback shows that fn:iterate-while is helpful, but the name needs to be improved: It implies that the first iteration occurs before the invocation of the test… Which could sometimes be helpful, too.
I suggest renaming the function to fn:while-do (“do” is commonly used when while loops are specified), and adding fn:do-until.
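The difference between the two proposed functions can be sketched in Python. The parameter names and their order are assumptions made for this illustration; the only point is whether the test runs before or after the first application of the action.

```python
def while_do(value, predicate, action):
    """Test first: the action may never be applied."""
    while predicate(value):
        value = action(value)
    return value


def do_until(value, action, predicate):
    """Act first: the action is applied at least once."""
    while True:
        value = action(value)
        if predicate(value):
            return value


def double(x):
    return x * 2


print(while_do(1, lambda x: x < 100, double))    # 128
print(do_until(1, double, lambda x: x >= 100))   # 128
print(while_do(200, lambda x: x < 100, double))  # 200, action never applied
print(do_until(200, double, lambda x: x >= 100)) # 400, action applied once
```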
Issue #945 created #created-945
Module import: apparent contradiction
XQuery 5.12 paragraph 2 says:
If a module A imports module B, the static context of module A will contain the [in-scope schema definitions]... of module B.
Paragraph 10 says:
A [module import] imports only functions, variable declarations, and item type declaratons; it does not import other objects from the imported modules, such as [in-scope schema definitions] or [statically known namespaces].
They can't both be right, surely?
Issue #944 created #created-944
Coercion rules: implicit types
Since 2.0, the coercion rules (formerly function conversion rules) have allowed implicit conversion from decimal to double, decimal to float, and float to double on function calls; other conversions such as double to decimal or float to decimal are not allowed. This has never made very much sense because in some implementations, decimal to float is a lossy conversion whereas float to decimal is not.
One option would be to allow conversion from any numeric type to any other.
The main caveat here is that I don't think it makes sense to allow a double such as 1.5e0 to be supplied where the required type is xs:integer. We have introduced new conversions that make it possible to supply a decimal where an integer is expected, but only if the decimal is in the value space of integer.
A possible formulation would be:
If the required type is a numeric type (that is, xs:decimal, xs:double, xs:float, or any type derived from these), and if the supplied value is a numeric value, then the supplied value is cast to the required primitive type, and if the result is in the value space of the actual required type it is then relabelled as an instance of the actual required type (if not, the conversion fails).
This means that supplying 1.0e0 for an argument expecting xs:integer (or xs:positiveInteger, etc) would work (it would cast to xs:decimal and then relabel as xs:integer), but supplying 1.1e0 would fail.
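The proposed rule (cast to the primitive type, then relabel if the value is in the value space of the required type) can be modelled in Python for the case where xs:integer is required. This is a sketch under assumptions: float models xs:double, decimal.Decimal models xs:decimal, the cast from double to decimal is taken to be exact, and coerce_to_integer is a hypothetical helper.

```python
from decimal import Decimal


def coerce_to_integer(value):
    """Cast a numeric value to decimal, then relabel it as an integer if it is whole."""
    as_decimal = Decimal(value)  # exact conversion for float, int and Decimal input
    if not as_decimal.is_finite() or as_decimal != as_decimal.to_integral_value():
        # NaN, infinities and values with a fractional part are outside xs:integer.
        raise TypeError(f"{value!r} is not in the value space of xs:integer")
    return int(as_decimal)


print(coerce_to_integer(1.0))  # 1

try:
    coerce_to_integer(1.1)
except TypeError as error:
    print("conversion fails:", error)
```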
Pull request #943 created #created-943
187 Add FLWOR expression while clause
Fix #187
Pull request #942 created #created-942
530 Fix typo, escape-solidus not escape-uri-attributes
Fix #530.
On half a dozen occasions, escape-uri-attributes is used where escape-solidus is clearly intended.
Issue #886 closed #closed-886
Binary map keys
Pull request #941 created #created-941
939 Remove fn:numeric-compare
Issue: #939
Pull request #940 created #created-940
878 Add subsequence-where function
Supersedes PR #874
Following discussion of PR #874 which proposed an extended subsequence() function with options to define the start and end position by predicates, this new PR proposes instead a subsequence-where() function that allows the start and end position to be defined by predicates, leaving the existing subsequence() function unchanged.
The items-before/after/starting-where/ending-where quartet are dropped.
The new function is inclusive at both ends. To start at the item after the one that matches the start condition, apply tail() to the result. To finish before the item that matches the end condition, apply trunk() to the result.
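A small Python model of the inclusive behaviour described above. The function and parameter names are hypothetical, and the sketch assumes that a single item may satisfy both the start and the end condition; the PR text does not say how that case is handled.

```python
def subsequence_where(items, from_pred=lambda item: True, to_pred=lambda item: False):
    """Return the items from the first match of from_pred to the next match of to_pred, inclusive."""
    result = []
    started = False
    for item in items:
        if not started and from_pred(item):
            started = True
        if started:
            result.append(item)
            if to_pred(item):
                break
    return result


selected = subsequence_where([1, 2, 3, 4, 5, 6], lambda x: x >= 3, lambda x: x >= 5)
print(selected)       # [3, 4, 5]  inclusive at both ends
print(selected[1:])   # [4, 5]     like applying tail() to the result
print(selected[:-1])  # [3, 4]     like applying trunk() to the result
```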
Issue #893 closed #closed-893
fn:compare: Support for arbitrary atomic types
Issue #918 closed #closed-918
Minor cx through chap. 14
Issue #939 created #created-939
Remove fn:numeric-compare
Related action: QT4CG-060-04
@michaelhkay I proposed to merge fn:numeric-compare into fn:compare in #866; your response was:
Folding fn:numeric-compare into fn:compare is more feasible, but you've then got one function that does two different jobs; there's no type safety to ensure that the arguments have compatible types, and you need ad-hoc rules to say which combinations of arguments are valid and which aren't. The merit of two separate functions is that each is a total function over the domain implied by its signature.
Do you think the concerns are still relevant, or should we tackle this?
Issue #938 created #created-938
Canonical serialization
This issue picks up suggestions from #779 regarding canonical serialization, and solicits input from the community group on whether such a function is desirable, and what such a function might look like.
In the context of #779, the idea was that two XML documents with different physical representations, but semantically equivalent, could be serialized to a canonical form, with a hash value applied to each confirming identity. Of course, with canonical operation, a simple string comparison would be sufficient, absent any hashing.
XML Signature was suggested as one approach, with some hesitation. I would like to suggest, instead, that we look to implement Canonical XML Version 1.1 (herein CX1.1), perhaps with map options that calibrate how CX1.1 is implemented. I have no experience using CX1.1, so user input is welcome.
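To make the idea concrete, here is a Python sketch of "canonicalize, then compare or hash". Note the substitution: Python's standard library implements Canonical XML 2.0 (xml.etree.ElementTree.canonicalize), not the Canonical XML 1.1 suggested here, so this only illustrates the principle. canonical_hash is a hypothetical helper.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Two physically different serializations of the same element.
doc1 = '<a b="1" c="2"/>'
doc2 = "<a   c='2'  b='1'></a>"


def canonical_hash(xml_text: str) -> str:
    """Hash the canonical form, so equivalent documents get the same digest."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


print(doc1 == doc2)                                  # False
print(canonicalize(doc1) == canonicalize(doc2))      # True
print(canonical_hash(doc1) == canonical_hash(doc2))  # True
```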
Another point of discussion is whether this merits a new function, e.g., fn:canonical-serialize, or should be built upon fn:serialize. A problem with the latter option is that such an approach makes no sense without the method option specified as xml. Another approach would be to go deeper, into the serialization spec, and expand the xml method to ensure a canonical option.
I believe that this function would be extremely useful. When preparing test suites, output could be saved as secondary documents as canonical XML, and any subsequent regression tests could adjust comparanda to canonical XML, and very precise node-wise comparisons could be made.
I look forward to everyone's input.
Pull request #937 created #created-937
779 hash function
First draft of hash function, proposed in #779.
Error message left as to-do item; guidance from editors appreciated.
I opted to leave out wrapper/cryptographic functionality, such as salting, and to demonstrate via example how it could be done by a developer on their own. In my opinion what we need here is a simple atomic function that can be incorporated into other molecular functions.
I may tinker with the prose description up to CG discussion, so comments are welcome.
Pull request #936 created #created-936
877 revised rules for op:binary-less-than
Rule 3 for op:binary-less-than was a bit of a mess (see #877), and needed to be expressed as a recursive operation.
My proposed revision depends on phraseology drawn from fn:decode-from-uri, fn:deep-equal, and 5.3.2 Unicode Codepoint Collation (here slightly adjusted from unordered to ordered list).
Issue #876 closed #closed-876
Placement of fn:in-scope-namespaces(), fn:in-scope-prefixes(), fn:namespace-uri-for-prefix()
Issue #909 closed #closed-909
893 fn:compare: Support for arbitrary atomic types
Issue #860 closed #closed-860
Unary Lookup when the context value is a sequence
Issue #926 closed #closed-926
860 Editorial rearrangement of spec for shallow lookup
Issue #780 closed #closed-780
format-number() etc incompatibility
Issue #925 closed #closed-925
780 Document incompatibility in format-number etc
Issue #935 closed #closed-935
Fix the fo test catalog
Pull request #935 created #created-935
Fix the fo test catalog
A duplicate name was introduced.
Issue #648 closed #closed-648
Schema for FN namespace should block extension and substitution
Issue #924 closed #closed-924
648 Disallow user modifications to schema for FN namespace
Issue #913 closed #closed-913
XQFO: under/unused variable apparatus
Issue #923 closed #closed-923
913-new-examples-for-local-name-etc
Issue #915 closed #closed-915
[Editorial] Incorrect terminology: function implementation is now function body
Issue #922 closed #closed-922
915 function body terminology
Issue #914 closed #closed-914
XQFO minor edits
Issue #912 closed #closed-912
XQFO: Minor edits
Issue #906 closed #closed-906
fn:deep-equal: unordered → ordered
Issue #907 closed #closed-907
906 fn:deep-equal: unordered → ordered
Issue #898 closed #closed-898
Drop the requirement for document-uri() uniqueness
Issue #905 closed #closed-905
898 - relax the constraints on document-uri
Issue #821 closed #closed-821
Annotations: Make default namespace explicit
Issue #904 closed #closed-904
821 Annotations: Make default namespace explicit
Issue #895 closed #closed-895
Parameters with default values: allow empty sequences
Issue #901 closed #closed-901
895 Parameters with default values: allow empty sequences
Issue #934 created #created-934
String comparison in deep-equal
The code showing how strings should be compared in deep-equal has gone awry: it doesn't match the prose. In equal-strings(), the lines
let $n1 := if ($options?whitespace = "normalize"))
then normalize-unicode(?, $options?normalization-form)
else identity#1
let $n2 := if ($options?normalize-space)
then normalize-space#1
else identity#1
should read:
let $n1 := if ($options?whitespace = 'normalize')
then normalize-space#1
else identity#1
let $n2 := if ($options?normalization-form)
then normalize-unicode(?, $options?normalization-form)
else identity#1
Actually, the whole thing can now be expressed more concisely using fn:chain:
declare function equal-strings(
$string1 as xs:string,
$string2 as xs:string,
$collation as xs:string,
$options as map(*)
) as xs:boolean {
let $norm := fn:chain(?,
(normalize-space#1[$options?whitespace = "normalize"],
normalize-unicode(?, $options?normalization-form)[$options?normalization-form]))
return compare($norm($string1), $norm($string2), $collation) eq 0
}
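For readers who want to experiment with the intended behaviour, here is a hedged Python sketch of the pairing stated in the "should read" version: the whitespace option selects whitespace normalization, and normalization-form selects Unicode normalization. It models the options map as a dict, ignores collations (plain codepoint equality is assumed), and all names are illustrative.

```python
import re
import unicodedata


def normalize_space(s: str) -> str:
    """Collapse runs of XML whitespace (space, tab, CR, LF) and trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def equal_strings(s1: str, s2: str, options: dict) -> bool:
    """Compare two strings after applying the normalizations the options select."""
    steps = []
    if options.get("whitespace") == "normalize":
        steps.append(normalize_space)
    form = options.get("normalization-form")
    if form:
        steps.append(lambda s: unicodedata.normalize(form, s))

    def norm(s: str) -> str:
        # Apply the selected steps in order, like a function chain.
        for step in steps:
            s = step(s)
        return s

    # Assumption: codepoint comparison stands in for a collation-aware compare.
    return norm(s1) == norm(s2)


if __name__ == "__main__":
    print(equal_strings("a  b\n", "a b", {"whitespace": "normalize"}))
```

Building a list of steps and applying them in order plays the same role as the function chain in the XQuery version.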
Pull request #933 created #created-933
930 drop obsolete note about comments and PIs
Fix #930
The note is obsolete because adjacent text nodes are now combined after stripping comments and PIs.
Pull request #932 created #created-932
931 Add rules for duration precision
Fix #931
Adds rules for the precision of durations and operations on durations, analogous to the existing rules for dates/times.
Issue #931 created #created-931
Precision of duration arithmetic
We specify that dates/times are manipulated at least to millisecond precision, but we have no similar statement for durations.
See https://stackoverflow.com/questions/77752844
michael.hor257k points out:
The 2.0 specification states: "The result is obtained by casting $arg to an xs:dayTimeDuration ... and then computing the seconds component as described in 10.3.2.3 Canonical representation." And then: "The canonical representation of xs:dayTimeDuration restricts ... the value of the seconds component to xs:decimal valued from 0.0 to 59.999... ", with reference to XML Schema Part 2: Datatypes which mandates "a minimum fractional second precision of milliseconds or three decimal digits". None of this appears in the 3.0 spec, though the examples still show a decimal digit being extracted.
Issue #929 closed #closed-929
map:values() - Would it be better to return an array?
Issue #930 created #created-930
Obsolete comment under fn:deep-equal()
The notes for fn:deep-equal() include the paragraph:
By default, the contents of comments and processing instructions are significant only if these nodes appear directly as items in the two sequences being compared. The content of a comment or processing instruction that appears as a descendant of an item in one of the sequences being compared does not affect the result. However, the presence of a comment or processing instruction, if it causes a text node to be split into two text nodes, may affect the result.
This is no longer true: we fixed it so that adjacent text nodes are merged after stripping comments and PIs.
Issue #929 created #created-929
map:values() - Would it be better to return an array?
The new function map:values() returns the values present in a map, flattened into a sequence.
This loses information if the values are not all singletons.
Would it be better to return an array?
That is, to return array:build(map:pairs($map), fn{?value})
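The information loss is easy to see in a small Python analogy, where a sequence-valued map entry is modelled as a list. This is only an illustration of the trade-off; the function names are invented for the example, and Python lists are not XDM sequences.

```python
from itertools import chain


def values_flattened(m: dict) -> list:
    """Analogue of returning a flat sequence: entry boundaries are lost."""
    return list(chain.from_iterable(m.values()))


def values_as_array(m: dict) -> list:
    """Analogue of returning an array: one member per map entry."""
    return list(m.values())


# Hypothetical map whose values are not all singletons.
sample = {"a": [1, 2], "b": [], "c": [3]}

if __name__ == "__main__":
    print(values_flattened(sample))
    print(values_as_array(sample))
```

In the flattened result, the empty value for "b" vanishes and the two items for "a" can no longer be told apart from separate entries; the array form keeps one member per entry.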
Pull request #928 created #created-928
Minor edits through ch. 15
Light edits here for consistency, clarity. I didn't touch the CSV prose much, knowing it is subject to major revisions.
Pull request #927 created #created-927
861 Rewrite spec of deep lookup operator
Fix #861
This is a complete rewrite of the spec for deep-lookup, hopefully clarifying some edge cases and fixing bugs, but not intended to introduce any major changes.
Pull request #926 created #created-926
860 Editorial rearrangement of spec for shallow lookup
Rearranges the spec for lookup expressions so that unary lookup is now defined in terms of postfix lookup, not the other way around; this simplifies the rules when the context value is not a singleton, or when the key specifier expression is context-dependent.
Fix #860
Pull request #925 created #created-925
780 Document incompatibility in format-number etc
Fix #780
Changes the XSLT and F+O specs to document a minor incompatibility arising from the change to functions such as format-number() to accept an argument of type union(xs:string, xs:QName) rather than xs:string.
In addition, in XSLT, all such functions now accept union(xs:string, xs:QName) rather than union(xs:QName, xs:string). This is primarily to make them all consistent.
Pull request #924 created #created-924
648 Disallow user modifications to schema for FN namespace
Fix #648
Issue #889 closed #closed-889
Rename "Named Function Reference"
Pull request #923 created #created-923
913-new-examples-for-local-name-etc
I have created new (executable) examples for functions name, local-name, namespace-uri, node-name, count, number.
There are of course many other functions that would benefit from the same treatment.
Fix #913
Pull request #922 created #created-922
915 function body terminology
Fix #915
Pull request #921 created #created-921
920 Allow xsl:break and xsl:next-iteration within branch of xsl:switch
Allow xsl:break and xsl:next-iteration within branch of xsl:switch
Fix #920