QT4 CG Meeting 143 Minutes 2025-11-25
Draft Minutes
Summary of new and continuing actions [0/3]
- [ ] QT4CG-143-01: CG to make another attempt at binary functions.
- [ ] QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
- [ ] QT4CG-143-03: JK to look for C14N test suites.
1. Administrivia
1.1. Roll call [8/10]
Regrets: EP.
- [X] David J Birnbaum (DB)
- [ ] Reece Dunn (RD)
- [X] Christian Grün (CG)
- [X] Joel Kalvesmaki (JK)
- [X] Michael Kay (MK)
- [X] Juri Leino (JLO)
- [X] John Lumley (JWL)
- [X] Wendell Piez (WP)
- [ ] Ed Porter (EP)
- [X] Norm Tovey-Walsh (NW). Scribe. Chair.
1.2. Accept the agenda
Proposal: Accept the agenda.
Accepted.
1.3. Approve minutes of the previous meeting
Proposal: Accept the minutes of the previous meeting.
Accepted.
1.4. Next meeting
The next meeting is planned for 2 December 2025.
Regrets: DB.
Looking forward, the chair proposes that we meet on 2, 9, and 16 December, then recess for the end-of-year holidays, with the following meeting on 6 January 2026.
Agreed.
1.5. Review of open action items [4/4]
- [X] QT4CG-140-02: MK to add a note about dealing with binary in parse-csv and parse-json
- [X] QT4CG-141-01: MK to follow up on a comment by JWL on #2269
- [X] QT4CG-142-01: MK to review the “Captured Groups within Lookahead” example.
- [X] QT4CG-142-02: MK to add explanatory note about the difference between typed and untyped values in string-length
1.6. Review of open pull requests and issues
This section summarizes all of the issues and pull requests that need to be resolved before we can finish. See Technical Agenda below for the focus of this meeting.
1.6.1. Blocked
The following PRs are open but have merge conflicts or comments which suggest they aren’t ready for action.
- PR #2309: Allow SimpleMapExpr after ArrowExpr
- PR #2266: 540 system-property equivalent for XQuery
- PR #2256: 2216 All atomic types become ordered
- PR #2247: Deferred Evaluation in XPath - the f:generator record
- PR #2160: 2073 data model changes for JNodes and Sequences
- PR #2124: 573 Functions to Construct Trees
- PR #2071: 77c deep update
- PR #2019: 1776: XSLT template rules for maps and array
1.6.2. Merge without discussion
The following PRs are editorial, small, or otherwise appeared to be uncontroversial when the agenda was prepared. The chairs propose that these can be merged without discussion. If you think discussion is necessary, please say so.
- PR #2308: QT4CG-140-02 Add note about binary input to parse-csv and parse-json
- PR #2306: QT4CG-141-01 Fix formal equivalent of array:for-each
- PR #2304: QT4CG-142-02: Add notes on gotchas for string-length and normalize-space
- PR #2303: 2195 Fix some more simple editorial errors
- PR #2296: 2288 XSLT implicit document nodes
Proposal: merge without discussion.
Accepted.
1.6.3. Close without action
It has been proposed that the following issues be closed without action. If you think discussion is necessary, please say so.
- Issue #2291: load-xquery-module: formalizing (loading-)parameters
- Issue #2287: Ordered maps break JSON interoperability
Proposal: close without action.
- MK: These are external comments, we should try to give a little bit of rationale.
- WP: I think #2287 has been adequately answered.
- MK: I think we can also observe that the CG spent a lot of time talking about the consequences of the change.
Accepted.
1.6.4. Substantive PRs
The following substantive PRs were open when this agenda was prepared.
- PR #2301: 2198 Add cdata attribute to xsl:text and xsl:value-of
- PR #2296: 2288 XSLT implicit document nodes
- PR #2289: 2195 (partial) Editorial notes (incremental)
- PR #2283: 2276 Relax XSLT rules on Extension Attributes
- PR #2282: 2278 Add function bin:infer-encoding; simplify bin:decode-string
- PR #2281: 2280 Usability of xsl:array-member
- PR #2274: 407 Function items capturing XSLT context components
- PR #2259: 938 Canonical serialization
- PR #2213: 2047 External resources and security
- PR #2019: 1776: XSLT template rules for maps and array
2. Technical agenda
2.1. PR #2282: 2278 Add function bin:infer-encoding; simplify bin:decode-string
See PR #2282
Discussion continues from last week, please see also that discussion.
- MK opens again with a quick review of the PR.
- CG: I like the idea of making the offset completely optional, but I just wonder what would happen if we did it with the current solution.
- … We’re introducing new logic here and I think it would be nice if the solutions worked the same way.
- MK: The main difference with unparsed text is that it can take information from HTTP headers.
- CG: But it also has semantic details that aren’t considered in decode-string here.
Some discussion of how to simulate the old behavior.
- CG: We spent time defining the current rules, it would make sense to preserve them and only change the offset idea.
- MK: I think it’s useful to have the additional function to infer the encoding.
- … But I guess you can make it have the same semantic effect.
- … Do you want to make another attempt at this?
- CG: I think it would make sense to use the infer-encoding feature only if the offset is zero.
- JLO: I like the changes; I think I would also agree with CG that it would be nice to have a little extra magic: if UTF-16 is provided, it tries to figure out whether it’s BE or LE if it can.
- … I also think it would be nice if the logic could be the same in different parts of the spec.
ACTION: QT4CG-143-01: CG to make another attempt at binary functions.
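For readers following the discussion above, here is a minimal Python sketch of the two suggestions: infer the encoding only when the offset is zero (CG), and resolve a generic UTF-16 request to BE or LE from a byte-order mark where one is present (JLO). It is not the rule set in PR #2282. The names `infer_encoding` and `decode_string`, the UTF-8 default, and the BOM-only detection are all assumptions made for illustration.

```python
import codecs
from typing import Optional, Tuple

# Hypothetical helper, not the bin:infer-encoding rules from the PR.
# Assumption: only a leading byte-order mark is inspected.
def infer_encoding(data: bytes, default: str = "utf-8") -> Tuple[str, int]:
    """Return (encoding name, BOM length) guessed from a leading BOM."""
    boms = [
        (codecs.BOM_UTF8, "utf-8"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name, len(bom)
    return default, 0

# Hypothetical helper illustrating the behaviour discussed in the meeting.
def decode_string(data: bytes, encoding: Optional[str] = None, offset: int = 0) -> str:
    skip = 0
    if offset == 0:
        # Inference is attempted only at the start of the data.
        inferred, bom_len = infer_encoding(data)
        if encoding is None:
            encoding, skip = inferred, bom_len
        elif encoding.lower() == "utf-16" and bom_len and inferred.startswith("utf-16"):
            # A generic UTF-16 request is narrowed to BE or LE by the BOM.
            encoding, skip = inferred, bom_len
    if encoding is None:
        encoding = "utf-8"  # assumed default when nothing can be inferred
    return data[offset + skip:].decode(encoding)
```

With a non-zero offset the sketch performs no inference at all, so the caller's encoding (or the assumed default) is used as given.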
2.2. PR #2289: 2195 (partial) Editorial notes (incremental)
See PR #2289
CG displays the PR.
- CG: I looked at the various arrows and decided that there was too much variation.
- … I switched to text.
- CG: I also updated the pseudocode for bin:shift.
- MK: I had a stylesheet that extracted the formal equivalence into a test set.
ACTION: QT4CG-143-02: MK to try to recover the ability to extract formal equivalences into tests
- JLO: I’m all for text.
- … We could think of having arrows next to it.
- NW: It’s also more accessible.
Proposal: accept this PR
Accepted.
2.3. PR #2259: 938 Canonical serialization
See PR #2259
- JK: Canonical serialization makes it possible to do things like take hashes of serialization to assure that they’re the same.
- JK: And canonical serialization is also possible for JSON.
JK reviews the PR.
- JK: For XML, I’m assuming that dropping comments during canonicalization is always false.
- JK: In JSON, there are slight changes to the output method accounting for canonicalization.
- … There’s a carve out for dealing with XNodes in JNodes.
- There are changes in some of the other specs to expand the serialization parameters.
- MK: Do we know if there are test suites for C14N?
- JK: I don’t.
ACTION: QT4CG-143-03: JK to look for C14N test suites.
- NW: Should there be errors in XSLT and XQuery for attempts to set canonicalization and output properties that are ignored when C14N is enabled?
- JK: I considered it, but I opted to keep it silent.
- MK: Generally, if you ask for URI escaping with the JSON output method, for example, it’s going to be ignored. That’s the precedent.
- JWL: Why do we choose to work on the octet stream instead of the XDM model?
- JK: It’s a node set that’s required, but all we can require is a sequence. I tried to do that, but MK pointed out that there’s an incompatibility there.
- MK: The C14N spec has enormous problems supporting an arbitrary node set. We can avoid all those problems by notionally serializing to an octet stream and then canonicalizing.
Proposal: accept this PR.
Accepted.
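JK’s point about hashing can be illustrated with a small Python sketch for the JSON case. It is not the canonicalization defined in PR #2259: the rules used here (sorted keys, no insignificant whitespace, UTF-8 output) and the names `canonical_json` and `digest` are assumptions chosen only to show why a canonical form makes hashes comparable.

```python
import hashlib
import json

# Hypothetical helper: the canonical form here is an assumption
# (sorted keys, compact separators, UTF-8), not the PR's definition.
def canonical_json(value) -> bytes:
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

# Hypothetical helper: hash of the canonical serialization.
def digest(value) -> str:
    return hashlib.sha256(canonical_json(value)).hexdigest()

# Two serializations that differ only in key order and whitespace.
a = json.loads('{"b": 1, "a": [1, 2]}')
b = json.loads('{ "a":[1,2], "b":1 }')
same = digest(a) == digest(b)
```

Array order is significant in this sketch, so reordering the members of an array produces a different digest.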
2.4. PR #2213: 2047 External resources and security
See PR #2213
- MK: I looked at it again and don’t have any direction to change it. I think it’s pretty good.
MK reviews the new “external resources and security” section.
- JK: This looks good to me, my question is, do we have good feedback from the folks to whom this might matter?
- WP: I think it’s an awesome question. Who are those people?
- MK: We get pushback from customers on security, ranging from people who know what they’re talking about to folks who just look at the output from scanners.
- … It’s a wide range of input.
- … This seems like enough to address security concerns from the outside.
- CG: Security is definitely a big issue for users. We get a lot of feedback. We’ve incorporated that into our products. I like the ‘trust’ value, but maybe it could be a string so that it’s easier to extend with vendor-specific features. So everyone has to support ‘true’ and ‘false’, but it could be extended to things like databases.
- … But if we have different features, standard and non-standard, then things can get even more complicated.
- MK: These are going into options parameters that are already extensible, so I think it mixes quite well. But it’s obviously up to the vendor to say what that interaction is.
- WP: From my perspective, the security experts are often unsure, but customers are a great place to start. In view of that, I like this very much. I don’t think perfect is possible. Just having a stake in the ground is important.
- JLO: I still think a boolean is good. We could make them strings; but it’s an enum and if you want it to change it would be a different enum. I think it’s better to have two sets: trusted and untrusted, and then vendor extensions.
- … I like the very simple boolean first.
- … If there are specific examples where it’s very complicated, I’d like to see them.
- CG: One thing I have in mind is the general challenge of the size of resources. Some users can only open small files, etc. But we don’t have a concept for that. We mostly work on database concepts: read/write access to specific databases, or groups of databases, admin features, etc.
- … It’s most interesting for extension modules, like the http or file module.
- … The easiest solution would be only to allow it for trusted users.
- … I have many ideas in mind but I haven’t reflected on how it might work.
- … But I agree that having a boolean is simple.
- JLO: If we do this now, and we know that someone has to build on this in the future, could this backfire in some way?
- MK: I think the fact that it’s going in options parameters makes it pretty extensible.
MK reviews the detail on one of the functions.
- MK: In F&O, we get detail on the functions.
- … There’s a mention in collations because they can be external resources.
- … fn:doc gets a trusted parameter.
- … fn:collection gets a similar option.
- CG: With regards to external resources, does that also apply to database resources?
- MK: Well, we have this let out that says trusted=no says you can only get access to things that are explicitly made available. In a database context, that would include the parts of the database that you have explicit permission to access.
- CG: That makes sense.
- MK: You don’t need an extra parameter to control external entities for fn:unparsed-text.
MK continues the review.
- MK: load-xquery-module has a trusted option.
- … That turns out to be a little tricky to implement. But I think it’s the right answer.
- JWL: In terms of things like fn:doc, the default would be that if I’m trusted, the doc is also trusted.
- … We aren’t going to get backwards compatibility problems.
- MK: I think we could have backwards compatibility problems; the advice is to make things secure by default.
Proposal: merge this PR?
Accepted.
3. Any other business
None heard.