QT4 CG Meeting 169 Minutes 2026-06-23
Table of Contents
- Summary of new and continuing actions [0/10]
- Draft Minutes
- 1. Administrivia
- 2. Technical agenda
- 2.1. Issue 2169: Longest-token rule incorrectly produces StringInterpolation delimiter
- 2.2. PR #2696: 2695 Apply templates to maps, arrays, and JNodes
- 2.3. PR #2691: 2683 Casting cannot return subtype
- 2.4. PR #2688: 1949 Refine rules for element-to-map() handling of atomic types
- 2.5. PR #2686: 2684 strip-space - align fn:doc and fn:document
- 2.6. PR #2649: 2647 file:descendants: recursion, filtering
- 3. Any other business
Summary of new and continuing actions [0/10]
- [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
- [ ] QT4CG-167-01: DB to write a PR for #2641, comments in CSV
- [ ] QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
- [ ] QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
- [ ] QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
- [ ] QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
- [ ] QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
- [ ] QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
- [ ] QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed
Draft Minutes
1. Administrivia
1.1. Roll call [10/11]
Regrets: AP.
- [X] David J Birnbaum (DB)
- [X] Reece Dunn (RD)
- [X] Christian Grün (CG)
- [X] Joel Kalvesmaki (JK)
- [X] Michael Kay (MK)
- [X] Juri Leino (JLO)
- [X] John Lumley (JWL)
- [ ] Alan Painter (AP)
- [X] Wendell Piez (WP)
- [X] Bethan Tovey-Walsh (BTW)
- [X] Norm Tovey-Walsh (NW) Scribe. Chair.
1.2. Accept the agenda
Proposal: Accept the agenda.
Accepted.
1.3. Approve minutes of the previous meeting
Proposal: Accept the minutes of the previous meeting.
Accepted.
1.4. Next meeting
The next meeting is planned for 30 June.
No regrets heard.
1.5. Review of open action items [0/10]
- [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
- [ ] QT4CG-167-01: DB to write a PR for #2641, comments in CSV
- [ ] QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
- [ ] QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
- [ ] QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
- [ ] QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
- [ ] QT4CG-167-06: NW to write a PR to resolve #2169 per GuntherRademacher
- [ ] QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
- [ ] QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
- [ ] QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed
1.6. Review of open pull requests and issues
This section summarizes all of the issues and pull requests that need to be resolved before we can finish. See Technical Agenda below for the focus of this meeting.
1.6.1. Blocked
The following PRs are open but have merge conflicts or comments which suggest they aren’t ready for action.
- PR #2698: 2641 Support comments in csv
- PR #2638: 2632-6: cross-cutting consistency
- PR #2637: 2632-5: refresh stale 4.0 content
- PR #2636: 2632-4: logic and semantics
- PR #2635: 2632-3: typos and grammar
- PR #2634: 2632-2: fix broken examples in expressions.xml
- PR #2633: 2632-1: fix critical bugs and DTD-validity issues
- PR #2594: 2389 Adaptive Serialization: more freedom
- PR #2350: 708 An alternative proposal for generators
- PR #2247: 716 Deferred Evaluation in XPath - the f:generator record
- PR #2160: 2073 data model changes for JNodes and Sequences
- PR #2071: 77c deep update
1.6.2. Merge without discussion
The following PRs are editorial, small, or otherwise appeared to be uncontroversial when the agenda was prepared. The chairs propose that these can be merged without discussion. If you think discussion is necessary, please say so.
- PR #2697: 2689 Updates/corrections to XSLT examples
- PR #2694: Reserved constructor names should include `trace`
- PR #2690: Corrections to PR2581 noted during review
Proposal: Merge without further discussion
Accepted.
2. Technical agenda
2.1. Issue 2169: Longest-token rule incorrectly produces StringInterpolation delimiter
See #2169
- NW attempts to review the issue.
- MK: I think Gunther wants to re-engineer the rules for complex terminals and tokenization.
- … I’d be full of admiration if he succeeds. I’ve done something ad hoc.
- … I think he’s looking for a single-level grammar without a concept of
- … That might be possible, but strikes me as disruptive.
- RD: We already have subtokenization: when handling “<”, you need to do a lookahead to determine if it’s a markup start character.
- MK: That’s one of the complex terminals. The idea is that you can tell what comes next by looking ahead a bit.
- … Tokenization involves parsing of sub expressions.
- RD: I think there are also issues around pragma parsing.
- MK: Those can be resolved by looking a character at a time, you don’t have to be recursive.
Some discussion of statefulness and regex.
- JLO: This is only for the string constructor, is that right?
- MK: Yes, I think this particular problem is with interpolation and not with templates.
- … But Gunther has raised it as a sort of general problem.
- RD: Is this issue also present in 3.1?
- JLO: It could (maybe) have been introduced, but …
- MK: What we’ve changed is the longest token rule.
- … It used to say “take the longest token consistent with the EBNF” and no one really knew what that meant.
- … It sounds like you’re supposed to backtrack, but no one ever convinced me.
ACTION: NW to attempt to get more detail from Gunther about what the proposal entails.
(Overtaken by events; see RD in Any other business.)
- CG: Would it be possible to use Gunther’s suggestion of using two characters?
- MK: That is worth exploring.
- JLO: I think it would also be helpful to give it a little more context about where it can occur.
- … From the input it looks like it could be anywhere, but that’s not the case.
2.2. PR #2696: 2695 Apply templates to maps, arrays, and JNodes
See PR #2696
- MK: I’m not 100% confident in this one. It was useful to write, but I’m not sure it’s the final answer.
- … We’re in XSLT. This is another stage on the journey of how we transform JSON trees.
- … I wanted to explore how apply-templates should work when applied to maps, arrays, or JNodes.
- … If there’s a `select` attribute, we should make no changes. I tried a few things, but they didn’t work.
- … If there’s no `select` attribute, I proposed some changes but I’m not 100% confident.
- MK: The change is that the default value of the `select` depends on the type of the item.
  - `child::node()` for XNodes
  - `child::*` for JNodes
  - `jtree(.)/child::*` if the context item is a map or an array (but that’s the part I’m not sure of).
  - for any other item.
- MK: That gets interesting when you look at the examples.
MK reviews examples in the PR.
- MK: There seem to be two distinct ways to process the tree, turn them into JNodes or process them explicitly.
- … They don’t mix very well; you sort of have to do it one way or the other.
- … Once you go to maps and arrays, you lose the ability to go back to JNodes.
- NW: I thought the former example with JNodes was clear and nice.
- JWL: If you took the analogous XNode tree, either of those methods would be equally useful.
- … What you’re trying to do is get as much harmonization as you can between the two models.
- MK: Yep.
- JWL: If you take the case of the category as the acid test; in the XNode operation, you can get the category because you can always go back up.
- … If you look at the second example and the XNode equivalent, you can do the tunnel but you don’t have to because you can always go back up.
- JWL: One effectively produces a JNode tree and the other one doesn’t. And you can’t detect whether or not you’d want to do that in a match system.
- MK: One issue is the discontinuity, if you start doing it the second way and decide you want to go back up, you don’t want to have to rewrite it all.
- MK: What I’d like to do next is see how this works with built-in template rules and shallow-copy. Do some more examples and use cases.
- NW: That sounds good to me.
- WP: I like this so far, I’m not sure it’s necessarily a problem so much as a cost of the new capability.
- … I’d like to check how it works in more scenarios, and I’m interested in how it works to match strings and atomics.
- … There seems to be an either-or. Maybe we want to make sure the new thing doesn’t break the old thing.
2.3. PR #2691: 2683 Casting cannot return subtype
See PR #2691
- MK: I think this is reasonably straightforward in comparison.
- MK: We had a contradiction; this PR resolves it. It says `cast as` and a constructor function can’t return a subtype but everything else can.
- MK: In the processing model, we make the exceptions clear there as well.
- MK: What brought this on was Saxon doing an optimization where casting a decimal to an integer didn’t do anything.
- NW: I think that will be a lot less surprising.
Proposal: accept this PR.
Accepted.
2.4. PR #2688: 1949 Refine rules for element-to-map() handling of atomic types
See PR #2688
- MK: This partially addresses some of the comments that CG made. Not all of them because I didn’t agree with all of them.
- MK: I’ve factored out the determination of a property type so that the rule is the same for elements and attributes.
- … The new section, Inferring a Datatype, describes how to infer the type.
- … This can now handle doubles, decimals, integers, booleans, and strings.
- … This is what you get in an automatically generated plan. You can override it yourself.
- JWL: How would you get 0 or 1 as boolean in case 2?
- MK: If it’s all 0 or 1, you’re going to get integer. You’d only get it if there was a mixture of “false” and “true” as well.
- CG: Maybe we should also consider leading `+` symbols as well. And maybe a leading `-` as well.
- JLO: I wanted to understand better what the goal is. Is this so you have nice fallbacks or to “do the right thing”?
- … Whenever that’s not good enough, I can adjust the plan to get what I want.
- MK: It’s to handle the common cases, like when all the prices are decimals you want to make them numbers not strings.
- JLO: I think leading `+` is really important in that case.
- CG: One suggestion was to stick with the explicit types, if a user provides a plan that says something that is an integer, don’t switch to integer.
- … Whenever you design a plan deliberately, you want the processor to follow it.
- MK: I think that’s best handled with an option. I’m concerned about the case where you’re processing a lot of invoices and after five years, someone passes in one with a “$”. Do you reject that, making someone work out what the heck is going on? Or do you just convert it to a string, making it a bit more obvious what went wrong?
- CG: Isn’t that like schema validation?
- MK: Yes, if you validate, you’ll see the invalidity.
- JLO: If I had a schema aware processor and I thought something that’s not valid, what happens?
- MK: Well, the ideal is that you validate first. And the transformation uses the schema information.
- JLO: I’d like an option here as well.
- CG: We have many cases where the data might change and cause errors, so I think it would be more consistent to always have an error.
- WP: In the real world, this is really ugly. I think the bar is low. We’re already ahead of the game. Options are good and errors are good. Don’t let it get too fancy.
Proposal: accept this PR.
Accepted.
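The inference MK describes for PR #2688 can be pictured with a small sketch. It is an illustration only, not the rules in the PR: the function name `infer_type`, the order of the checks, and the lexical patterns are all assumptions. It reflects only what is recorded above: the candidate types are integer, decimal, double, boolean, and string; values that are all 0 or 1 are inferred as integer; boolean is chosen only when “true” or “false” also appear; and a stray value such as one containing “$” falls back to string.

```python
import re

# Hypothetical lexical patterns; the PR's actual rules may differ
# (for example, on whether a leading "+" is accepted, as CG and JLO raised).
INTEGER = re.compile(r"-?[0-9]+")
DECIMAL = re.compile(r"-?([0-9]+\.[0-9]*|\.[0-9]+)")
DOUBLE = re.compile(r"-?([0-9]+\.?[0-9]*|\.[0-9]+)[eE][+-]?[0-9]+")
BOOLEAN_WORDS = {"true", "false"}
BOOLEAN_DIGITS = {"0", "1"}


def infer_type(values):
    """Pick one type name for a collection of lexical values (assumed rules)."""
    values = [v.strip() for v in values]
    if not values:
        return "string"
    # Try the narrowest numeric type first, then widen.
    if all(INTEGER.fullmatch(v) for v in values):
        return "integer"
    if all(INTEGER.fullmatch(v) or DECIMAL.fullmatch(v) for v in values):
        return "decimal"
    if all(INTEGER.fullmatch(v) or DECIMAL.fullmatch(v) or DOUBLE.fullmatch(v)
           for v in values):
        return "double"
    # Reached only when at least one value is not numeric, so a column of
    # nothing but 0 and 1 has already been classified as integer above.
    if all(v in BOOLEAN_WORDS or v in BOOLEAN_DIGITS for v in values):
        return "boolean"
    return "string"
```

Under these assumptions, `infer_type(["0", "1", "1"])` gives `"integer"`, while `infer_type(["0", "true", "1", "false"])` gives `"boolean"`, matching MK's answer to JWL.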
2.5. PR #2686: 2684 strip-space - align fn:doc and fn:document
See PR #2686
- MK: The issue here is that `fn:doc` and `fn:document` didn’t align with how they handled strip space.
  - That’s partly because `fn:document` is only in XSLT; but also it said that the semantics of `fn:doc` was different in an XSLT context.
- MK: The challenge is to try to come to some acceptable compromise.
- MK: This proposal changes `strip-space` on `fn:doc` to being three valued.
  - (MK describes the three choices: `all`, `none`, `conditional`.)
- MK: The proposed default is to follow strip-space and preserve-space, which is equivalent to `none` in XPath contexts.
- MK: In XSLT, the type on `fn:document` is the same thing. The definitions of `all` and `none` are the same and `conditional` is expressed in XSLT terms.
- CG: I think I still don’t fully understand why an optional boolean wouldn’t be a better choice.
  - … XQuery users wouldn’t understand what `conditional` means, and the serialization specifications all use `yes` and `no`.
- MK: I just don’t like having a three-valued item without any way to make all the values explicit.
Some discussion of where you might use this. In a conditional checking for the host language, for example.
- RD: I was looking at the XQuery boundary space and wondering why you didn’t follow the preserve and strip; but then looking at the XSLT, those are separate instructions.
Some discussion of a parameter for XQuery to control this.
- WP: Is it more or less confusing to add “yes” and “no” as aliases for “all” and “none”?
- JLO: I expect that I’ll never use this, but I like “inherit” more than “conditional”. The former immediately has a meaning to me.
Proposal: accept this PR.
Accepted.
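The three-valued option discussed for PR #2686 can be sketched as a small decision function. This is an assumption-laden illustration, not the specification text: the function name, its parameters, and the idea of passing the stylesheet's declarations as a callable are hypothetical. It encodes only what MK states above: `all` and `none` mean the same everywhere, and `conditional` follows strip-space and preserve-space, which amounts to `none` where no such declarations exist.

```python
def strips_whitespace(option, element_name, xslt_strip_rule=None):
    """Decide whether whitespace-only text in element_name is stripped.

    option is "all", "none", or "conditional". xslt_strip_rule is a
    hypothetical callable standing in for the stylesheet's strip-space and
    preserve-space declarations; it is None outside XSLT.
    """
    if option == "all":
        return True
    if option == "none":
        return False
    if option == "conditional":
        # With no declarations to consult, this behaves like "none".
        if xslt_strip_rule is None:
            return False
        return xslt_strip_rule(element_name)
    raise ValueError(f"unsupported strip-space value: {option}")
```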
2.6. PR #2649: 2647 file:descendants: recursion, filtering
See PR #2649
- CG: This PR changes some options in the file functions.
- … You can say things like you want to have all subdirectories up to a depth.
- MK: Is it clear what the predicate is being applied to?
- … I’m just wondering if it’s a full path or a local path or…
- CG: That’s the path that will be returned in the end; it’s unchanged but not explicitly stated.
- MK: It might be helpful to be a little bit more explicit.
- JLO: Who’s the first to write a globbing function for this?
- … This isn’t the easiest way to do this.
- … What is `$path` then?
- CG: It’s the path that will be returned in the end.
CG to add a little clarity about that path.
- CG: You’d like an extra function for the glob pattern?
- JLO: I’d like one; but I will add one with recurse and filter if I have to.
- NW: There are lots of flavors of globbing.
- WP: Are recurse and depth both syntactic sugar?
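The behaviour discussed for PR #2649 (a depth limit on recursion, plus a predicate applied to the path that is eventually returned) can be sketched in Python. This is not the proposed function: the name `descendants` and the parameters `max_depth` and `predicate` are hypothetical stand-ins for the options in the PR, and the sketch assumes that filtering affects only what is returned, not which directories are visited.

```python
import os


def descendants(root, max_depth=None, predicate=None):
    """Return paths below root, optionally limited in depth and filtered.

    max_depth and predicate are hypothetical stand-ins for the options
    discussed; the predicate receives the same path string that is returned.
    """
    results = []

    def walk(directory, depth):
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if predicate is None or predicate(path):
                results.append(path)
            # Descend only while the depth limit (if any) allows it.
            # Note: os.path.isdir follows symbolic links.
            if os.path.isdir(path) and (max_depth is None or depth < max_depth):
                walk(path, depth + 1)

    walk(root, 1)
    return results
```

In this sketch a glob-style filter of the kind JLO asks about would be written as a predicate, for example `lambda p: fnmatch.fnmatch(p, "*.txt")` using Python's standard `fnmatch` module.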
3. Any other business
- RD: I’ve had a look at Gunther’s issue and I don’t think it’s an issue. The way lexing works, the `{ is only valid within the string constructor content. So it should only be tokenized in that context.
  - … The element content char state that’s present when doing direct element content doesn’t have that ` so it gets lexed and tokenized correctly.
- RD: I’ve left a comment.
- NW: Thank you.
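RD's point about lexer state can be illustrated with a toy tokenizer. It is a sketch under stated assumptions, not the XQuery grammar: the state names, token names, and the function `next_token` are hypothetical labels, and only a handful of tokens are modelled. What it shows is that the two-character interpolation delimiter is a candidate token only in the string-constructor content state, so a longest-match choice made in the element-content state never produces it.

```python
def next_token(text, pos, state):
    """Return (token_name, lexeme) at pos, given the current lexer state.

    States and token names are hypothetical labels for illustration; real
    lexers handle many more tokens (escapes, markup, and so on).
    """
    if state == "string-constructor-content":
        # Only here is the two-character sequence a single delimiter.
        if text.startswith("`{", pos):
            return ("InterpolationStart", "`{")
        if text.startswith("]``", pos):
            return ("StringConstructorEnd", "]``")
        return ("StringConstructorChar", text[pos])
    if state == "element-content":
        # No "`{" token exists in this state, so the backtick is an
        # ordinary character and "{" is tokenized on its own.
        if text[pos] == "{":
            return ("EnclosedExprStart", "{")
        return ("ElementContentChar", text[pos])
    raise ValueError(f"unknown state: {state}")
```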