QT4 CG Meeting 167 Minutes 2026-06-02
Table of Contents
- Summary of new and continuing actions [1/11]
- Draft Minutes
- 1. Administrivia
- 2. Technical agenda
- 2.1. Review of issues
- 2.1.1. Issue #2660: `fn:matching-segments`: named capture groups
- 2.1.2. Issue #2653: FLWOR, member/key/value clauses: allow sequences
- 2.1.3. Issue #2644: Coercing a map to a record type
- 2.1.4. Issue #2567: Allow one named record type to extend another
- 2.1.5. Issue #2641: [Feature] Comments and parse-csv()
- 2.1.6. Issue #2615: XSLT: should xsl:map-entry return a JNode?
- 2.1.7. Issue #2600: Add options record as second parameter to `fn:collection` and `fn:uri-collection`
- 2.1.8. Issue #2591: Grammar: step?lookup is currently invalid
- 2.1.9. Issue #2588: Reintroduce xsl:record
- 2.1.10. Issue #2581: XSLT Match patterns for GNodes
- 2.1.11. Issue #2576: parse-json() option to match record types
- 2.1.12. Issue #2521: File retrieval, URIs: fragment identifiers
- 2.1.13. Issue #2518: Type Safety
- 2.1.14. Issue #2482: Add a fallback function to the `bin:decode-string` function
- 2.1.15. Issue #2464: add method to use path()
- 2.1.16. Issue #2393: Keep or drop array:members and array:of-members
- 2.1.17. Issue #2390: methods and inheritance
- 2.1.18. Issue #2257: Record declarations without namespace
- 2.1.19. Issue #2219: Generalize method calls to sequences
- 2.1.20. Issue #2169: Longest-token rule incorrectly produces StringInterpolation delimiter
- 2.1.21. Issue #2073: JNodes and sequences
- 2.1.22. Issue #2039: Generalize context item to context value in XSLT
- 2.1.23. Issue #1949: fn:element-to-map: Updated Feedback
- 2.1.24. Issue #1777: Shallow copy in XSLT with maps and arrays
- 2.1.25. Issue #1234: Serialization Parameters: Indentation, Whitespace, Newlines
- 2.1.26. Issue #675: XSLT streaming rules for new constructs
- 2.1.27. Issue #285: Stability of collections
- 2.2. JSON transformations
- 2.3. Trusted execution
- 2.4. Publication scheduling
- 3. Any other business
Summary of new and continuing actions [1/11]
- [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
- [X] QT4CG-165-02: NW to draft an agenda for the face-to-face meeting
- [ ] QT4CG-167-01: DB to write a PR for #2641, comments in CSV
- [ ] QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
- [ ] QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
- [ ] QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
- [ ] QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
- [ ] QT4CG-167-06: NW to write a PR to resolve #2169 per GuntherRademacher
- [ ] QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
- [ ] QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
- [ ] QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed
Draft Minutes
1. Administrivia
1.1. Roll call [6/11]
Regrets: RD, CG, JK, JWL, WP
- [X] David J Birnbaum (DB)
- [ ] Reece Dunn (RD)
- [ ] Christian Grün (CG)
- [ ] Joel Kalvesmaki (JK)
- [X] Michael Kay (MK)
- [X] Juri Leino (JLO)
- [ ] John Lumley (JWL)
- [X] Alan Painter (AP)
- [ ] Wendell Piez (WP)
- [X] Bethan Tovey-Walsh (BTW)
- [X] Norm Tovey-Walsh (NW) Scribe. Chair.
1.2. Accept the agenda
Proposal: Accept the agenda.
Accepted.
1.3. Next meeting
Perhaps tomorrow, and then 16 June. Meeting of 9 June is canceled.
CG gives regrets for 2, 9, and 16 June.
2. Technical agenda
2.1. Review of issues
2.1.1. Issue #2660: `fn:matching-segments`: named capture groups
2.1.2. Issue #2653: FLWOR, member/key/value clauses: allow sequences
See issue 2653
- MK: I’m uneasy. This feels like something that should be done explicitly.
- JLO: This is what I would expect.
- MK: Allowing empty sequence might be reasonable.
- DB: There are other places where we might have sequences of maps and arrays? Why is this only in FLWOR expressions?
- JLO: There’s an implicit merging that I don’t like. I was thinking more of the lookup operator.
There’s reluctance here, but we’ll leave it open and wait to see if a PR is forthcoming.
2.1.3. Issue #2644: Coercing a map to a record type
See issue 2644
- MK: I can see the argument for wanting to detect errors. But there’s no easy way to drop the unwanted fields.
- NW: Can we have a function for that?
- MK: Types aren’t values, so no. We could have a “castable something” kind of expression.
- … If we think the feature is important enough to justify new syntax.
Consensus in the comments seems to settle on extending cast as expressions.
- BTW: Would it be possible to allow the coercion if the keys of the supplied value are a superset of the field names from the record type?
- MK: One of the problems is that this interacts with other things; for example the issue about subtyping record types.
- … If a subtype was defined, then it would be safe to ignore the extra fields because you’ve declared it explicitly.
Let’s consider that issue for a moment…
Defer discussion of #2644 until after we’ve resolved #2567, below.
2.1.4. Issue #2567: Allow one named record type to extend another
See issue 2567
- MK: This is all related to changing to using the names of record types.
- MK: I have a PR in progress for this, but I’m not happy with it.
- NW: Are you confident we’ll get there?
- MK: I think so. The main issue is how much of the current functionality you drop.
- … There’s a lot of capability for loose typing with maps; do you move to something much more strict for records, or do you try to preserve some of the looseness?
- JLO: Don’t the type annotations mean we have to be strict?
- MK: One of the issues is anonymous record types; can you annotate an anonymous record with a type?
- … What then does it mean for them to be the same?
- … There are also complexities related to recursiveness. If you say a named record type is just an alias, then recursive types become infinite.
- … Some of the academic literature on typing says you conceptually expand it and compare two infinite definitions.
- … Let’s avoid that!
- … The other way is that they’re only equivalent if they use the same type name.
- JLO: Then two anonymous records can never be equal?
- MK: Right. There are a lot of questions about anonymous record types.
- JLO: What’s the use case for anonymous record types?
- MK: We could get rid of them. It’s sometimes inconvenient to have to name a record type.
Some discussion of the use of an anonymous record type as a function argument.
- MK: We could say that anonymous record types have an unknowable name.
- MK: Parsing a JSON tree gives you maps at the moment; if you want to select maps based on their content, like the pattern syntax in XSLT, when we had extensible records you could do that. But we’ve lost that now and we don’t really have a substitute.
- JLO: So there’s no way?
- MK: You can do it with a complex predicate, but there’s no convenient way.
2.1.5. Issue #2641: [Feature] Comments and parse-csv()
See issue 2641
- DB: This seems low cost and useful.
- MK: How does this relate to blank lines?
- NW: Blank lines are skipped?
- MK: No, there’s an arbitrary rule about what it gives you; I think it gives you a record containing a single empty field.
ACTION QT4CG-167-01: DB to write a PR for #2641, comments in CSV
2.1.6. Issue #2615: XSLT: should xsl:map-entry return a JNode?
See issue 2615
- MK: Can we make xsl:map-entry more consistent with xsl:array?
- “Maybe.”
No progress; but we need to do the work.
2.1.7. Issue #2600: Add options record as second parameter to `fn:collection` and `fn:uri-collection`
See issue 2600
- JLO: I’m working on a PR, but it isn’t ready yet.
- MK: Do we want to make collections a bit less abstract? Do we want to define mappings from abstract collections to directories or zip archives?
- JLO: I would like to keep it abstract; in the database context “collections” are something else.
- … My main concern is do we have to have a mechanism to define those mappings in the options map, or can we leave it implementation defined.
- MK: Do we introduce a concept of media type?
- JLO: Not yet. The original spec is just undefined, it says use a mapping if it exists.
- MK: So you don’t want the map to say?
- NW: Letting users set up the mapping themselves would have some value.
Some discussions of various aspects of media types and filesystem metadata.
- JLO: This is a complex problem; some type of media type mapping would be okay.
- MK: I’d be inclined to allow the options to define a mapping table and it’s implementation defined if you don’t.
- JLO: Then there’s the parser question. Do we define a mapping from media type to parser?
- … This is where I’m struggling at the moment!
- MK: You could make it a mapping from file extension to media type and then from media type to parsing function.
- MK: Mapping to a parsing function would allow you to move the complexities of options to the parser out of the collection options.
- … But what is the signature of that function? Pass a binary?
JLO will continue to develop the PR.
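The two-step mapping MK describes (file extension to media type, then media type to parsing function) can be sketched roughly as follows. This is a minimal Python illustration, not a proposal for the options record: the table names, the `parse_resource` helper, and the choice to pass the resource to the parser as raw bytes are all assumptions made for the example. Returning the bytes unchanged stands in for the "implementation defined" case where no mapping applies.

```python
import json
from pathlib import PurePosixPath
from typing import Any, Callable, Dict

# Hypothetical signature for a parsing function: it receives the raw binary.
Parser = Callable[[bytes], Any]

# Hypothetical option tables; neither name comes from the specification.
EXTENSION_TO_MEDIA_TYPE: Dict[str, str] = {
    ".json": "application/json",
    ".txt": "text/plain",
}

MEDIA_TYPE_TO_PARSER: Dict[str, Parser] = {
    "application/json": lambda data: json.loads(data.decode("utf-8")),
    "text/plain": lambda data: data.decode("utf-8"),
}


def parse_resource(
    uri: str,
    data: bytes,
    extensions: Dict[str, str] = EXTENSION_TO_MEDIA_TYPE,
    parsers: Dict[str, Parser] = MEDIA_TYPE_TO_PARSER,
) -> Any:
    """Pick a parser in two steps: extension -> media type -> parser."""
    suffix = PurePosixPath(uri).suffix.lower()
    media_type = extensions.get(suffix)
    parser = parsers.get(media_type) if media_type is not None else None
    if parser is None:
        # No mapping supplied: fall back to the raw bytes, standing in for
        # whatever an implementation would choose to do by default.
        return data
    return parser(data)
```

Keeping the parser behind a single function signature is what moves parser-specific options out of the collection options: a caller who needs different parser settings supplies a different function rather than more option keys.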
2.1.8. Issue #2591: Grammar: step?lookup is currently invalid
See issue 2591
Some discussion about the fact that the grammar has changed and some of the nonterminals described in the issue are now gone.
- NW: Are we inclined to fix this?
- MK: I think so.
- JLO: Would (a/b)?x work? If it would, maybe we don’t want to fix this?
- MK: a/b/x will return a JNode rather than the value from the map.
- JLO: So what’s desired here is an implicit call to fn:jvalue.
- MK: Yep.
ACTION QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
- JLO: The parenthesized version works, and I still prefer that.
2.1.9. Issue #2588: Reintroduce xsl:record
See issue 2588
- MK: Let’s leave this until we’ve sorted out other things; it’s only syntactic sugar.
2.1.10. Issue #2581: XSLT Match patterns for GNodes
See issue 2581
- MK: The debate is what we want the endpoint to be. In particular, do we want traditional style match patterns of the form a/b/c to match JNodes in JTrees?
- … I think I’ve been swayed a little bit by implementation convenience.
- … It’s useful to know what kind of thing is going to be matched and we’d lose that.
- … So I’m slightly torn. Also, from a user point of view, it’s useful to be able to look at the code and know what it’s going to match.
- … So there’s both an implementation and usability argument for making the patterns distinct.
- DB: It seems to be a consequence of making them nodes.
- JLO: Not doing this requires a different form of pattern match.
- MK: Yes, and we’ve introduced the new syntax for matching maps.
- NW: I’m persuaded by the usability issue.
- MK: There’s a difference in XPath and XQuery. In XQuery, a/b/c often appears in a context where you can see what it’s doing. But a match pattern in a template rule is much more context free.
- … In XQuery, $map/a/b/c, you can tell that $map is a map.
- … But template rules are much more stand alone.
Some discussion of the current problem, including enclosed modes.
- JLO: In XQuery there will be places where it will happen.
- MK: It will happen, not everyone declares the type of their parameters.
MK observes that you have to use predicates to go upwards if you separate them.
MK to continue working on it.
2.1.11. Issue #2576: parse-json() option to match record types
See issue 2576
- MK: Maps often come from parsing JSON, an opportunity to turn them into records would be useful.
Some discussion of possible consequences.
- JLO: Doesn’t this make the parsing non-streamable?
- MK: Yes, but it’s already not fully streamable because of duplicate key checking.
Consensus: this is a good idea.
2.1.12. Issue #2521: File retrieval, URIs: fragment identifiers
See issue 2521
- NW: Simplifying the error codes seems reasonable.
- MK: I agree. They’re a mess.
- NW: What do folks think of allowing fragment identifiers more generally?
- JLO: I like them. But I think they should also be allowed as an option.
- … I say no to fragment identifiers in the URI, but yes to allowing them in the options.
- NW: “Ugh.”
Presumably this is for all the parsing functions that take URIs.
- MK: The doc function already has a lot of trouble because most parsers don’t report ID attributes.
- … And if the input comes from a DOM or something, it tends to have been lost.
- NW: We need to explain why it’s only on the fn:doc function or allow it everywhere.
2.1.13. Issue #2518: Type Safety
See issue 2518
- MK: If you use the lookup operator on a record, you get an error if the record type doesn’t define what you’re looking for.
- … I think that’s probably the best we can do.
- NW: Propose close with no further action?
Done.
2.1.14. Issue #2482: Add a fallback function to the `bin:decode-string` function
See issue 2482
- NW: Do folks agree we should add a fallback function?
Some discussion of what you could pass to the fallback function. If you can’t decode the input, it isn’t properly a code point, but it also isn’t necessarily a single octet.
- MK: Related: what’s actually feasible if you’re not hand decoding it.
- NW: I propose we mark this “nice to have” and see if NW produces a PR.
ACTION QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
2.1.15. Issue #2464: add method to use path()
See issue 2464
- MK: Defining a function that accepts only the subset of XPath returned by fn:path is hard work and of little value.
- MK: There’s a generic solution using load-xquery-module…
Proposal: give an example of how to use load-xquery-module to achieve the effect.
ACTION QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
2.1.16. Issue #2393: Keep or drop array:members and array:of-members
See issue 2393
- MK: The introduction of JNodes has changed the equation here a bit.
- … You could define array members to evaluate the child axis to return JNodes.
- … We could have a children function and it would be similar to that. (We do have siblings.)
Proposal: recast the functions as returning JNodes.
ACTION QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
2.1.17. Issue #2390: methods and inheritance
See issue 2390
- MK: The comment I made 30 May is related; we could allow maps to be %frozen.
- … Doing a map:put or map:get, you either get an error or it creates a new map, it doesn’t give you another instance of the same type.
Some discussion of “frozen” vs “immutable” vs “persistent”…
Leave until/add to the question of extensible record types.
2.1.18. Issue #2257: Record declarations without namespace
See issue 2257
- MK: The big issue here is different default namespaces for functions and types.
- … A record type acts as both function name and a type name; if the defaults are different, you get into an awful mess.
Leave open, CG has indicated he wants to try a PR.
2.1.19. Issue #2219: Generalize method calls to sequences
See issue 2219
- MK: There are two parts and we did one part. If the left hand side selects multiple maps, we apply the method to each one.
- … What we haven’t done is allow the right hand side to select multiple methods.
History is complicated on this. There’s a valid consistency argument. We need a PR.
2.1.20. Issue #2169: Longest-token rule incorrectly produces StringInterpolation delimiter
See issue 2169
Consensus: that probably works.
- JLO: It looks the same to me.
- MK: It’s just a consequence of the tokenization rules.
ACTION QT4CG-167-06: NW to write a PR to resolve #2169 per GuntherRademacher
- MK: I hesitated because I wondered about edge cases.
ACTION QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
2.1.21. Issue #2073: JNodes and sequences
See issue 2073
- MK: The classic scenario is that you have a map and one of the entries in a map has a value that is a sequence that contains two maps.
- … How many JNodes do you get?
- … Do you get a single child JNode containing a sequence of two maps, or two JNodes each of which is a single map.
- NW: What does the spec say today?
- MK: What it says isn’t very approachable.
Looking at the expression in Data Model 8.5 JNodes
$PARENT = JNode( [ ([1], [2], "X"), ([3], [4]) ] )
dm:j-value = [ ([1], [2], "X"), ([3], [4]) ]
(: That item is an array :)
for each member:
([1], [2], "X")
return dm:JNode with that sequence as its jvalue
[1]
return dm:JNode with the 1 as its jvalue
[2]
return dm:JNode with the 2 as its jvalue
"X"
return ()
- NW: That’s certainly not useful…
- JLO: Wouldn’t it be better to break it out of a JTree at this point?
- MK: We could make it an error in this case.
Is there an analogy with text nodes? It’s sort of mixed content.
- MK: Maybe JNodes can have leaf nodes that are treated differently.
Will continue to think about it.
2.1.22. Issue #2039: Generalize context item to context value in XSLT
See issue 2039
- MK: I think we can live with the status quo; we have a context value in the XPath context but in XSLT, we still have a context item.
There’s a lot of work without a lot of value in making the initial context thing a value.
Proposal: close with no action.
2.1.23. Issue #1949: fn:element-to-map: Updated Feedback
See issue 1949
ACTION QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
2.1.24. Issue #1777: Shallow copy in XSLT with maps and arrays
See issue 1777
- MK: In shallow-copy-mode, if you don’t write any template rules, the input is unchanged.
- … You should be able to write templates that match individual items in the tree.
- … It should behave as much like the same model with XNodes as possible.
Next on the agenda, see below.
2.1.25. Issue #1234: Serialization Parameters: Indentation, Whitespace, Newlines
See issue 1234
- JLO: I like it.
- MK: We have extension attributes in Saxon to do this.
- … We do indent attributes by default, but you can switch it off (by setting the line length)
- DB: I’d like line-length as well.
- JLO: I’ve had several users requesting this.
- MK: An option to reformat paragraphs would also be nice.
- NW: Let’s leave aside the question “what is a paragraph!”
- MK: Well, it’s clear in HTML.
We’ll wait to see if CG writes a PR.
2.1.26. Issue #675: XSLT streaming rules for new constructs
See issue 675:
“Yes, we need them.”
- MK: I’ve done some work on them, but they need reviewing.
- … And there’s an enormous gap in testing.
Some discussion of the philosophy of streaming.
- MK: The demand for streaming is down (because hardware is bigger), but the 2^31 limit on Java and C# strings is becoming problematic. But that’s an implementation problem.
2.1.27. Issue #285: Stability of collections
See issue 285:
- MK: There’s another significant problem with stability: all these functions now have options and if you read the same URI with different options, you sometimes need to get back a different document.
- NW: Is it time to consider abandoning the stability requirement altogether?
- MK: Well, that’s a radical choice. Lots of users call “doc” inside a match pattern, so that would be catastrophic.
- … Saxon fn:collection has never been stable and no one has ever complained.
- JLO: Writing into a collection in a database would make more documents appear.
Some discussion of hashing the options and doing stability with the URI + the options.
- MK: That’s what we’re doing with the fn:doc function.
- … Then there’s consistency: if you do fn:doc and fn:unparsed-text on the same URI, do they have to be “consistent”?
- NW: It boils down to keeping the bag of bits you got from the first access.
- MK: Stability for the fn:doc function has two aspects: has the underlying resource changed, but also, you are guaranteeing node identity of the document. That doesn’t apply to parsing JSON, or other sources.
We have the option on fn:doc to say whether or not you want guaranteed stability.
Leave open, still unresolved.
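The idea of keying stability on the URI plus a hash of the options, and keeping the result of the first access, might look something like the following Python sketch. It is an illustration under stated assumptions rather than a description of how any processor works: the `StableResourceCache` class and its `fetch` callback are hypothetical, and the options are assumed to be JSON-serializable so that they can be put into a canonical form before hashing.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional, Tuple


class StableResourceCache:
    """Return the same result for repeated reads of the same URI + options."""

    def __init__(self, fetch: Callable[[str, Dict[str, Any]], Any]) -> None:
        # fetch is a hypothetical callback that really retrieves and parses
        # the resource; it is called at most once per (URI, options) pair.
        self._fetch = fetch
        self._cache: Dict[Tuple[str, str], Any] = {}

    @staticmethod
    def options_key(options: Dict[str, Any]) -> str:
        # Sorting keys makes the hash independent of the order in which
        # the caller happened to write the options.
        canonical = json.dumps(options, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, uri: str, options: Optional[Dict[str, Any]] = None) -> Any:
        opts = dict(options or {})
        key = (uri, self.options_key(opts))
        if key not in self._cache:
            self._cache[key] = self._fetch(uri, opts)
        return self._cache[key]
```

With this shape, reading the same URI with different options yields a different cached result, while repeated reads with equal options return the identical object, which is the analogue of preserving node identity.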
2.2. JSON transformations
- MK: I did a case study on this. I wrote it up in an issue as it progressed, #1786.
- … What I needed to test as a feasibility study was to take an application that uses recursive JSON
- … The example I came up with is the Java to C# transpiler; we use the Java parser to get a syntax tree, we do a transformation, and serialize the result as C#.
- … The open source parser we use gives us XML. But what if it gave us JSON?
- … I was testing how we’d have to change the transpiler if the syntax tree was JSON.
- … That turned out to be a fairly useful exercise, but in some ways we were lucky because the XML it generates is slightly peculiar.
- … The XML uses element names to identify the kind of objects in the AST.
- … Consequently, the transformation in that case is totally based on the role attribute, not the element names.
- … In JSON, you don’t have element names and the chances are the JSON you would get would represent the kind of construct and the role as JSON properties. That affects the feasibility of matching them.
- MK: The other thing that was slightly unusual was that it didn’t need the ancestor access. Usually, there’s at least one place where you have to look at the context. But Java doesn’t have context dependencies like that.
- MK: The lack of element names is the most significant thing that emerges; everything has to be driven by property values.
- MK: What I’m struggling with now is, how do you put maps back together when you’re going up the tree after the transformation.
- MK: I thought arrays would be difficult, but I think in practice users will do for-each constructs over arrays.
- … You aren’t likely to have a template that matches an array-of-authors and another that matches an array-of-dimensions. You’re going to loop over the arrays where you know what they are.
- MK: A recursive structure is usually done with maps; I imagine that a recursive array structure is very unusual.
Some discussion of the nature of JSON; we’re dealing with a hierarchical data structure. It’s recursive in that sense.
- MK: Recursive data structures are common in documents, but less common perhaps in data.
Some discussion of the dimensions of nested arrays. Usually a fixed dimension in strongly typed languages.
Consider this input:
{
  "name": "europe",
  "type": "territory",
  "part": [
    {
      "name": "germany",
      "type": "country",
      "part": [
        { "name": "berlin", "type": "city", "population": 3800000 },
        { "name": "bonn", "type": "city", "population": 340000 }
      ]
    }
  ]
}
The CG did some whiteboarding exercises exploring various possible approaches. Some of the syntax may be speculative.
Updating the population of each territory:
<xsl:template match="{'type': enum('territory')}">
  <xsl:variable name="result" as="map(*)">
    <xsl:apply-templates/>
  </xsl:variable>
  <xsl:sequence select="map:put($result, 'population', sum($result//population))"/>
</xsl:template>
Attempting to fix the capitalization of city names:
<xsl:template match="{'name'}[map:size(.) = 1]">
  <xsl:sequence select="map:put(., 'name', eg:fix-name-case(.?name))"/>
</xsl:template>
Increasing the population of each city by 10%:
<xsl:template match="{'population'}[map:size(.) = 1 and ../type = 'city']">
  <xsl:sequence select="map{'population': ../population * 1.1}"/>
</xsl:template>
Sorting the cities by population:
<xsl:template match="array-of({'type':enum('city')})">
  <xsl:apply-templates>
    <xsl:sort select="?population"/>
  </xsl:apply-templates>
</xsl:template>
Attempting to add a new property to each city:
<xsl:template match="j{'type': enum('city')}">
  <!-- add: area -->
  <xsl:map>
    <xsl:apply-templates/>
    <xsl:map-entry key="'area'" select="eg:area-of(?name)"/>
  </xsl:map>
</xsl:template>
Notes:
- In shallow-copy mode, if apply-templates operates in the context of a map, it reconstructs a map from the result of applying templates
- In shallow-copy mode, if apply-templates operates in the context of an array, it reconstructs an array from the result of applying templates
- If apply-templates is operating on an array, sorting operates on the array elements
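For readers less familiar with the speculative XSLT syntax, the shallow-copy behaviour in the notes and the "update the population of each territory" rule can be approximated in ordinary Python. This is only an analogy under stated assumptions: `transform` and `descendant_populations` are hypothetical helper names, the recursion plays the role of apply-templates rebuilding maps and arrays on the way back up, and the `type == "territory"` check plays the role of the match pattern. As with `sum($result//population)`, nested territories would be counted more than once; the sample input has only one.

```python
from typing import Any, Iterator


def descendant_populations(value: Any) -> Iterator[Any]:
    """Yield every 'population' value found anywhere inside value."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key == "population":
                yield child
            else:
                yield from descendant_populations(child)
    elif isinstance(value, list):
        for member in value:
            yield from descendant_populations(member)


def transform(value: Any) -> Any:
    """Rebuild the tree bottom-up, like a shallow-copy mode with one rule."""
    if isinstance(value, dict):
        # Default behaviour: reconstruct a map from the transformed entries.
        result = {key: transform(child) for key, child in value.items()}
        # The one explicit rule: territories get a computed population.
        if result.get("type") == "territory":
            result["population"] = sum(descendant_populations(result))
        return result
    if isinstance(value, list):
        # Default behaviour: reconstruct an array from the transformed members.
        return [transform(member) for member in value]
    return value


world = {
    "name": "europe",
    "type": "territory",
    "part": [
        {
            "name": "germany",
            "type": "country",
            "part": [
                {"name": "berlin", "type": "city", "population": 3800000},
                {"name": "bonn", "type": "city", "population": 340000},
            ],
        }
    ],
}

updated = transform(world)
print(updated["population"])
```

The input is left untouched and a new tree is returned, which mirrors the way the templates construct new maps rather than modifying the source.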
2.3. Trusted execution
Is the trusted execution model complete and sufficient?
- MK: The real issue is transitive reference. You have to be able to ask for books.xml, but what’s scary is when books.xml contains a reference to /etc/passwd.
- JLO: But I do want to be able to limit external access in general.
- MK: When you do an fn:transform call, the question is, “do I trust the person who wrote the stylesheet I’m invoking”.
- JLO: The spec should at least say what functions have potential external access.
- MK: The classic billion laughs kind of vulnerability raises the question of whether you trust the document itself.
- NW: The “available documents” question is what’s really important.
- MK: For xsl:include and xsl:import, you’re in the domain of the compiler. Is there a security mode in the C compiler that prevents you from accessing resources?
- NW: For fn:transform, I might want to have different documents for different transforms.
- MK: That would require the trust level being not a boolean but something like a domain name.
Some discussion of the context of xsl:use-package; it’s static.
- MK: There’s still a bit of a disconnect between the static notion of available documents and the security model.
2.3.1. Where is “trusted” needed?
- Any function that accesses an external resource
- All of the file: functions, the fn:doc function and friends
2.4. Publication scheduling
What are we going to publish, and when? How do we inform the broader community when we think we’re ready for “last call” review? How do we manage the details of final publication?
- NW: We’re not a rec-track document, so there’s no official sequence.
- … We can do “dated/frozen” specifications as effective “last call” drafts.
- … Give users 60 days and iterate until done.
- MK: What about logos and patent policy?
- NW: That’s covered by the fact that we’re a W3C community group.
- NW: Last call in 2026 seems…unlikely. We really must try to get to a last call status in 2027.
- Proposal: get to last call by 1 March 2027?
ACTION QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed
3. Any other business
None heard. It looks like we don’t need to meet on Wednesday. Next meeting: 16 June.