QT4 CG Meeting 169 Minutes 2026-06-23

Summary of new and continuing actions [0/10]

  • [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
  • [ ] QT4CG-167-01: DB to write a PR for #2641, comments in CSV
  • [ ] QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
  • [ ] QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
  • [ ] QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
  • [ ] QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
  • [ ] QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
  • [ ] QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
  • [ ] QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed

Draft Minutes

1. Administrivia

1.1. Roll call [10/11]

Regrets: AP.

  • [X] David J Birnbaum (DB)
  • [X] Reece Dunn (RD)
  • [X] Christian Grün (CG)
  • [X] Joel Kalvesmaki (JK)
  • [X] Michael Kay (MK)
  • [X] Juri Leino (JLO)
  • [X] John Lumley (JWL)
  • [ ] Alan Painter (AP)
  • [X] Wendell Piez (WP)
  • [X] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW) Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.

Accepted.

1.3. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.

Accepted.

1.4. Next meeting

The next meeting is planned for 30 June.

No regrets heard.

1.5. Review of open action items [0/10]

  • [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
  • [ ] QT4CG-167-01: DB to write a PR for #2641, comments in CSV
  • [ ] QT4CG-167-02: MK to make a PR for #2591, grammar for step?lookup is invalid
  • [ ] QT4CG-167-03: NW to make a PR for #2482, fallback on bin:decode-string
  • [ ] QT4CG-167-04: NW to make a PR explaining load-xquery-module for PR #2464
  • [ ] QT4CG-167-05: MK to write a proposal to change #2393 so the functions return JNodes
  • [ ] QT4CG-167-06: NW to write a PR to resolve #2169 per GuntherRademacher
  • [ ] QT4CG-167-07: NW to review tests for interpolated strings with edge cases in mind
  • [ ] QT4CG-167-08: MK to review the state of #1949 to see which items are still outstanding.
  • [ ] QT4CG-167-09: NW to close all “nice to have” issues at the end of October if they haven’t progressed

1.6. Review of open pull requests and issues

This section summarizes all of the issues and pull requests that need to be resolved before we can finish. See Technical Agenda below for the focus of this meeting.

1.6.1. Blocked

The following PRs are open but have merge conflicts or comments which suggest they aren’t ready for action.

  • PR #2698: 2641 Support comments in csv
  • PR #2638: 2632-6: cross-cutting consistency
  • PR #2637: 2632-5: refresh stale 4.0 content
  • PR #2636: 2632-4: logic and semantics
  • PR #2635: 2632-3: typos and grammar
  • PR #2634: 2632-2: fix broken examples in expressions.xml
  • PR #2633: 2632-1: fix critical bugs and DTD-validity issues
  • PR #2594: 2389 Adaptive Serialization: more freedom
  • PR #2350: 708 An alternative proposal for generators
  • PR #2247: 716 Deferred Evaluation in XPath - the f:generator record
  • PR #2160: 2073 data model changes for JNodes and Sequences
  • PR #2071: 77c deep update

1.6.2. Merge without discussion

The following PRs are editorial, small, or otherwise appeared to be uncontroversial when the agenda was prepared. The chairs propose that these can be merged without discussion. If you think discussion is necessary, please say so.

  • PR #2697: 2689 Updates/corrections to XSLT examples
  • PR #2694: Reserved constructor names should include `trace`
  • PR #2690: Corrections to PR2581 noted during review

Proposal: Merge without further discussion

Accepted.

2. Technical agenda

2.1. Issue 2169: Longest-token rule incorrectly produces StringInterpolation delimiter

See #2169

  • NW attempts to review the issue.
  • MK: I think Gunther wants to re-engineer the rules for complex terminals and tokenization.
    • … I’d be full of admiration if he succeeds. I’ve done something ad hoc.
    • … I think he’s looking for a single-level grammar without a concept of
    • … That might be possible, but strikes me as disruptive.
  • RD: We already have subtokenization; when handling “<”, you need to do a lookahead to determine if it’s a markup start character.
  • MK: That’s one of the complex terminals. The idea is that you can tell what comes next by looking ahead a bit.
    • … Tokenization involves parsing of sub expressions.
  • RD: I think there are also issues around pragma parsing.
  • MK: Those can be resolved by looking a character at a time, you don’t have to be recursive.

Some discussion of statefulness and regex.

  • JLO: This is only for the string constructor, is that right?
  • MK: Yes, I think this particular problem is with interpolation and not with templates.
    • … But Gunther has raised it as a sort of general problem.
  • RD: Is this issue also present in 3.1?
  • JLO: It could (maybe) have been introduced, but …
  • MK: What we’ve changed is the longest token rule.
    • … It used to say “take the longest token consistent with the EBNF” and no one really knew what that meant.
    • … It sounds like you’re supposed to backtrack, but no one ever convinced me.

ACTION: NW to attempt to get more detail from Gunther about what the proposal entails.

(Overtaken by events; see RD in Any other business.)

  • CG: Would it be possible to use Gunther’s suggestion of using two characters?
  • MK: That is worth exploring.
  • JLO: I think it would also be helpful to give it a little more context about where it can occur.
    • … From the input it looks like it could be anywhere, but that’s not the case.

2.2. PR #2696: 2695 Apply templates to maps, arrays, and JNodes

See PR #2696

  • MK: I’m not 100% confident in this one. It was useful to write, but I’m not sure it’s the final answer.
    • … We’re in XSLT. This is another stage on the journey of how we transform JSON trees.
    • … I wanted to explore how apply-templates should work when applied to maps, arrays, or JNodes.
    • … If there’s a select attribute, we should make no changes. I tried a few things, but they didn’t work.
    • … If there’s no select attribute, I proposed some changes but I’m not 100% confident.
  • MK: The change is that the default value of the select depends on the type of the item.
    • child::node() for XNodes
    • child::* for JNodes
    • jtree(.)/child::* if the context item is a map or an array (but that’s the part I’m not sure of)
    • . for any other item.
  • MK: That gets interesting when you look at the examples.
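
The proposed defaults listed above can be sketched as a simple dispatch on the kind of context item. This is only an illustration in Python: the Kind enumeration and the default_select function are hypothetical names invented here, not anything defined in the PR; the select expressions are the ones MK listed.

```python
from enum import Enum, auto


class Kind(Enum):
    """Hypothetical classification of the context item (illustration only)."""
    XNODE = auto()
    JNODE = auto()
    MAP = auto()
    ARRAY = auto()
    OTHER = auto()  # any other item, e.g. an atomic value


def default_select(kind: Kind) -> str:
    """Return the default select expression proposed for apply-templates
    when no select attribute is present."""
    if kind is Kind.XNODE:
        return "child::node()"
    if kind is Kind.JNODE:
        return "child::*"
    if kind in (Kind.MAP, Kind.ARRAY):
        # The case MK said he is least sure of.
        return "jtree(.)/child::*"
    return "."
```

When a select attribute is present, nothing changes, so a dispatch like this would only be consulted in the no-select case.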

MK reviews examples in the PR.

  • MK: There seem to be two distinct ways to process the tree, turn them into JNodes or process them explicitly.
    • … They don’t mix very well; you sort of have to do it one way or the other.
    • … Once you go to maps and arrays, you lose the ability to go back to JNodes.
  • NW: I thought the former example with JNodes was clear and nice.
  • JWL: If you took the analogous XNode tree, either of those methods would be equally useful.
    • … What you’re trying to do is get as much harmonization as you can between the two models.
  • MK: Yep.
  • JWL: If you take the case of the category as the acid test; in the XNode operation, you can get the category because you can always go back up.
    • … If you look at the second example and the XNode equivalent, you can do the tunnel but you don’t have to because you can always go back up.
  • JWL: One effectively produces a JNode tree and the other one doesn’t. And you can’t detect whether or not you’d want to do that in a match system.
  • MK: One issue is the discontinuity, if you start doing it the second way and decide you want to go back up, you don’t want to have to rewrite it all.
  • MK: What I’d like to do next is see how this works with built-in template rules and shallow-copy. Do some more examples and use cases.
  • NW: That sounds good to me.
  • WP: I like this so far, I’m not sure it’s necessarily a problem so much as a cost of the new capability.
    • … I’d like to check how it works in more scenarios, and I’m interested in how it works to match strings and atomics.
    • … There seems to be an either-or. Maybe we want to make sure the new thing doesn’t break the old thing.

2.3. PR #2691: 2683 Casting cannot return subtype

See PR #2691

  • MK: I think this is reasonably straightforward in comparison.
  • MK: We had a contradiction; this PR resolves it. It says cast as and a constructor function can’t return a subtype but everything else can.
  • MK: In the processing model, we make the exceptions clear there as well.
  • MK: What brought this on was Saxon doing an optimization where casting a decimal to an integer didn’t do anything.
  • NW: I think that will be a lot less surprising.

Proposal: accept this PR.

Accepted.

2.4. PR #2688: 1949 Refine rules for element-to-map() handling of atomic types

See PR #2688

  • MK: This partially addresses some of the comments that CG made. Not all of them because I didn’t agree with all of them.
  • MK: I’ve factored out the determination of a property type so that the rule is the same for elements and attributes.
    • … The new section, Inferring a Datatype, describes how to infer the type.
    • … This can now handle doubles, decimals, integers, booleans, and strings.
    • … This is what you get in an automatically generated plan. You can override it yourself.
  • JWL: How would you get 0 or 1 as boolean in case 2?
  • MK: If it’s all 0 or 1, you’re going to get integer. You’d only get it if there was a mixture of “false” and “true” as well.
  • CG: Maybe we should also consider leading + symbols as well. And maybe a leading - as well.
  • JLO: I wanted to understand better what the goal is. Is this so you have nice fallbacks or to “do the right thing”?
    • … Whenever that’s not good enough, I can adjust the plan to get what I want.
  • MK: It’s to handle the common cases, like when all the prices are decimals you want to make them numbers not strings.
  • JLO: I think leading + is really important in that case.
  • CG: One suggestion was to stick with the explicit types: if a user provides a plan that says something is an integer, don’t switch to string.
    • … Whenever you design a plan deliberately, you want the processor to follow it.
  • MK: I think that’s best handled with an option. I’m concerned about the case where you’re processing a lot of invoices and after five years, someone passes in one with a “$”. Do you reject that, making someone work out what the heck is going on? Or do you just convert it to a string, making it a bit more obvious what went wrong?
  • CG: Isn’t that like schema validation?
  • MK: Yes, if you validate, you’ll see the invalidity.
  • JLO: If I had a schema-aware processor and I passed in something that’s not valid, what happens?
  • MK: Well, the ideal is that you validate first. And the transformation uses the schema information.
  • JLO: I’d like an option here as well.
  • CG: We have many cases where a change in the data might cause errors, so I think it would be more consistent to always have an error.
  • WP: In the real world, this is really ugly. I think the bar is low. We’re already ahead of the game. Options are good and errors are good. Don’t let it get too fancy.
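
The kind of inference discussed above can be sketched as follows. This is a minimal Python illustration, not the rules in the PR: the regular expressions, the function name infer_type, and the order of the checks are assumptions. It accepts unsigned lexical forms only; whether leading + or - signs should be recognized was raised in the discussion and is deliberately left out here.

```python
import re

# Assumed lexical forms (unsigned only); the PR's actual rules may differ.
_INTEGER = re.compile(r"[0-9]+")
_DECIMAL = re.compile(r"[0-9]+\.[0-9]*|\.[0-9]+")
_DOUBLE = re.compile(r"([0-9]+(\.[0-9]*)?|\.[0-9]+)[eE][+-]?[0-9]+")
_BOOLEAN = {"true", "false", "0", "1"}


def infer_type(values):
    """Pick the narrowest type name that fits every value, else 'string'."""
    vals = [v.strip() for v in values]
    if not vals:
        return "string"
    if all(_INTEGER.fullmatch(v) for v in vals):
        # All 0s and 1s land here, so they are integers, not booleans.
        return "integer"
    if all(_INTEGER.fullmatch(v) or _DECIMAL.fullmatch(v) for v in vals):
        return "decimal"
    if all(_INTEGER.fullmatch(v) or _DECIMAL.fullmatch(v) or _DOUBLE.fullmatch(v)
           for v in vals):
        return "double"
    if all(v in _BOOLEAN for v in vals):
        # Only reached when "true"/"false" appear alongside any 0s and 1s.
        return "boolean"
    # Anything unexpected, such as "$12", falls back to string.
    return "string"
```

With these assumptions, a column of prices such as "9.99" and "12" is inferred as decimal, while a single value like "$12" makes the whole column a string. That fallback is the behaviour MK describes; CG's alternative of raising an error would replace the final return.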

Proposal: accept this PR.

Accepted.

2.5. PR #2686: 2684 strip-space - align fn:doc and fn:document

See PR #2686

  • MK: The issue here is that fn:doc and fn:document didn’t align in how they handled strip-space.
    • That’s partly because fn:document is only in XSLT; but also it said that the semantics of fn:doc was different in an XSLT context.
  • MK: The challenge is to try to come to some acceptable compromise.
  • MK: This proposal changes strip-space on fn:doc to being three valued.
    • (MK describes the three choices: all, none, conditional.)
  • MK: The proposed default is to follow strip-space and preserve-space, which is equivalent to none in XPath contexts.
  • MK: In XSLT, the type on fn:document is the same thing. The definitions of all and none are the same, and conditional is expressed in XSLT terms.
  • CG: I think I still don’t fully understand why an optional boolean wouldn’t be a better choice.
    • … XQuery users wouldn’t understand what conditional means, and the serialization specifications all use yes and no.
  • MK: I just don’t like having a three-valued item without any way to make all the values explicit.
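
One way to picture the three-valued option is the Python sketch below. It assumes that all means whitespace-only text is stripped everywhere and none means it is stripped nowhere. The names StripSpace, should_strip, and host_rule are hypothetical; host_rule stands in for whatever strip-space and preserve-space declarations the host language provides, and is absent in XPath contexts.

```python
from enum import Enum
from typing import Callable, Optional


class StripSpace(Enum):
    ALL = "all"
    NONE = "none"
    CONDITIONAL = "conditional"


def should_strip(element_name: str,
                 option: StripSpace = StripSpace.CONDITIONAL,
                 host_rule: Optional[Callable[[str], bool]] = None) -> bool:
    """Decide whether whitespace-only text in element_name is stripped."""
    if option is StripSpace.ALL:
        return True
    if option is StripSpace.NONE:
        return False
    # CONDITIONAL: defer to the host language's declarations if there are any;
    # with none available (the XPath case) this behaves like NONE.
    return host_rule(element_name) if host_rule is not None else False
```

Making the default an explicit third value is what lets all three behaviours be written out, which is the point MK makes against an optional boolean.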

Some discussion of where you might use this. In a conditional checking for the host language, for example.

  • RD: I was looking at the XQuery boundary space and wondering why you didn’t follow the preserve and strip; but then looking at the XSLT, those are separate instructions.

Some discussion of a parameter for XQuery to control this.

  • WP: Is it more or less confusing to add “yes” and “no” as aliases for “all” and “none”?
  • JLO: I expect that I’ll never use this, but I like “inherit” more than “conditional”. The former immediately has a meaning to me.

Proposal: accept this PR.

Accepted.

2.6. PR #2649: 2647 descendants: recursion, filtering

See PR #2649

  • CG: This PR changes some options in the file functions.
    • … You can say things like you want to have all subdirectories up to a depth.
  • MK: Is it clear what the predicate is being applied to?
    • … I’m just wondering if it’s a full path or a local path or…
  • CG: That’s the path that will be returned in the end; it’s unchanged but not explicitly stated.
  • MK: It might be helpful to be a little bit more explicit.
  • JLO: Who’s the first to write a globbing function for this?
    • … This isn’t the easiest way to do this.
    • … What is $path then?
  • CG: It’s the path that will be returned in the end.
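
To make the discussion concrete, here is a rough Python analogue of a depth-limited listing with a filter, plus a glob built on top of it as JLO suggests. The names descendants, max_depth, keep, and glob_filter are hypothetical and do not reflect the signatures in the PR. In this sketch the predicate receives each path exactly as it would be returned, which matches CG's answer.

```python
import fnmatch
import os
from typing import Callable, Iterator, Optional


def descendants(root: str,
                max_depth: Optional[int] = None,
                keep: Callable[[str], bool] = lambda path: True) -> Iterator[str]:
    """Yield paths below root, descending at most max_depth levels.

    keep() is called with each candidate path in the form it is returned.
    """
    def walk(directory: str, depth: int) -> Iterator[str]:
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if keep(path):
                yield path
            # Recurse into real directories only, to avoid symlink loops.
            if (os.path.isdir(path) and not os.path.islink(path)
                    and (max_depth is None or depth < max_depth)):
                yield from walk(path, depth + 1)

    yield from walk(root, 1)


def glob_filter(pattern: str) -> Callable[[str], bool]:
    """Build a predicate that matches the last path segment against a glob."""
    return lambda path: fnmatch.fnmatch(os.path.basename(path), pattern)
```

For example, descendants(some_dir, max_depth=2, keep=glob_filter("*.xml")) would list XML files in some_dir and its immediate subdirectories. Matching only the last segment is one of several possible glob flavors, which is NW's point.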

CG to add a little clarity about that path.

  • CG: You’d like an extra function for the glob pattern?
  • JLO: I’d like one; but I will add one with recurse and filter if I have to.
  • NW: There are lots of flavors of globbing.
  • WP: Are recurse and depth both syntactic sugar?

3. Any other business

  • RD: I’ve had a look at Gunther’s issue and I don’t think it’s an issue. The way lexing works, the `{ is only valid within the string constructor content. So it should only be tokenized in that context.
    • … The element content char state that’s present when doing direct element content doesn’t have that ` so it gets lexed and tokenized correctly.
  • RD: I’ve left a comment.
  • NW: Thank you.
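
RD's point about context-dependent lexing can be illustrated with a toy tokenizer. This Python sketch is an assumption-laden simplification, not the grammar in the specification: the token names and the two lexer states are invented here, and expression tokens are reduced to single characters. It only shows that `{ is recognized as one token inside string constructor content and nowhere else.

```python
def tokenize(text):
    """Toy two-state lexer: 'expr' for expressions, 'content' for the
    inside of a string constructor. Returns a list of (kind, text) pairs."""
    tokens = []
    stack = ["expr"]
    i = 0
    while i < len(text):
        if stack[-1] == "content":
            if text.startswith("`{", i):
                # Interpolation start is only a token in this state.
                tokens.append(("INTERP_START", "`{"))
                stack.append("expr")
                i += 2
            elif text.startswith("]``", i):
                tokens.append(("CONSTRUCTOR_END", "]``"))
                stack.pop()
                i += 3
            else:
                # Collect literal characters up to the next delimiter.
                j = i
                while (j < len(text) and not text.startswith("`{", j)
                       and not text.startswith("]``", j)):
                    j += 1
                tokens.append(("CHARS", text[i:j]))
                i = j
        else:
            if text.startswith("``[", i):
                tokens.append(("CONSTRUCTOR_START", "``["))
                stack.append("content")
                i += 3
            elif text.startswith("}`", i) and len(stack) > 1:
                # Closes an interpolation and returns to content state.
                tokens.append(("INTERP_END", "}`"))
                stack.pop()
                i += 2
            elif text[i].isspace():
                i += 1
            else:
                tokens.append(("CHAR", text[i]))
                i += 1
    return tokens
```

Outside a string constructor, the input `{ comes out as two separate single-character tokens; inside one, the same two characters form a single interpolation-start token.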