QT4 CG Meeting 021 Minutes 2023-02-07

Approved at meeting 022 on 14 February 2023.

Summary of new and continuing actions [0/6]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-021-01: NW to raise a PR addressing the points in issue #307
  • [ ] QT4CG-021-02: NW to check for deprecated URI features and add options for them in fn:build-uri and fn:parse-uri
  • [ ] QT4CG-021-03: RD to change must to will in DOM notes about lowercase
  • [ ] QT4CG-021-04: RD to revise and move the note about unrecognized entities

1. Administrivia

1.1. Roll call [10/14]

Regrets: BTW

  • [ ] Anthony (Tony) Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Sasha Firsov (SF)
  • [X] Christian Grün (CG)
  • [X] Joel Kalvesmaki (JK) [:08-]
  • [X] Michael Kay (MK)
  • [X] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [X] Ed Porter (EP)
  • [ ] Liam Quin (LQ)
  • [ ] Adam Retter
  • [X] C. M. Sperberg-McQueen (MSM)
  • [ ] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.


1.3. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.


1.4. Next meeting

The next meeting is scheduled for Tuesday, 14 February 2023.

No regrets heard.

1.5. Review of open action items [10/12]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [X] QT4CG-016-02: NW to add an ed-note indicating when it was approved.
  • [X] QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping
  • [X] QT4CG-016-07: NW to make an issue about the problems of document-uri uniqueness
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [X] QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
  • [X] QT4CG-016-10: RD to consider how to clarify parsed entity parsing.
  • [X] QT4CG-019-01: MK to fix the type of $pattern in fn:tokenize()
  • [X] QT4CG-020-01: MK to check if the default for type-variety is correct.
  • [X] QT4CG-020-02: MK to review the description of the equal-strings function
  • [X] QT4CG-020-03: MK to add examples using options.
  • [X] QT4CG-020-03: MK to add an option to return false instead of raising an error

2. Technical Agenda

2.1. Issue #307, parsing and building URIs comments and queries

See issue #307.

  • RD: I have some questions…
  • NW: I think the issues you raise in 1a/1b are places where I have to correct the text.

RD shares his screen to review the issue.

  • RD: Issue 2 is that I don’t think fn:build-uri will work correctly for IPv6 addresses, where there are colons in the host part.
  • RD: Issue 3 is that user:password credentials are now rejected by browsers. Perhaps it shouldn’t render the password?
  • NW: What do you mean by render?

Some discussion. NW remains confused about what render means. The string has to contain the password or the resulting string isn’t useful.

  • RD: Maybe we should at least add a note to say that having a password in the string is a risk.
  • RD: The last point is that the spec suggests that known port numbers be omitted.
  • NW: I wonder about the implementations knowing the port numbers.
  • SF: In modern browsers, http: isn’t a valid protocol anymore. The same as gopher:.
  • MK: I think these functions are about generic URI syntax rather than specific schemes and dereferencing. You might want to construct a namespace URI, for example, and you could use http: for that or xyz:.
  • NW: I think that’s right.
  • DN: I think what MK says is right, but on the other side, one of our goals is to define the function to be convenient to the user. I think we should have options. For example, we should have an option that forbids username:password by default. And also we could forbid http: by default.
  • NW: I have reservations about making a user specify an option to do the thing that the request asks for.
  • RD: I don’t think we should be adding any additional logic. The http: to https: mapping is dependent on when you make a request to the server.
    • … What’s convenient for one application is inconvenient for another. What works for targeting web browsers might not work when targeting a different platform.
    • … I’m in favor of keeping these two functions purely about the RFC.
  • DN: I think there is a principle in design that it’s good to make dangerous things difficult. For the user:password, I think we need to apply this principle.
  • NW: Point taken, but it’s not the sort of thing that feels like a user is going to do it accidentally. And if they need a username:password, making them specify that and set an option that says they meant to specify that seems odd.
  • RD: The RFC specification says that the username/password is deprecated. Applications may choose to ignore or reject such data.
    • … I think supporting those different use cases for the user:password bit makes sense.
  • NW: Okay.
  • MSM: Just for consistency, I think you should generalize that action to check for anything that’s deprecated.

ACTION QT4CG-021-01: NW to raise a PR addressing the points in issue #307

ACTION QT4CG-021-02: NW to check for deprecated URI features and add options for them in fn:build-uri and fn:parse-uri

  • JK: Perhaps a way to do this is to have another arity that says “safe mode on” or some such.
  • MK: Are we just talking about building URIs, or also parsing?
  • RD: Both, I think.
  • CG: I think it’s somewhat confusing if passwords are rejected when using parse-uri.

2.2. PR #320: Issue 98 - add options parameter to fn:deep-equal

See pull request #320. We reviewed this last week with an eye towards getting approval this week.

  • MK: I’ve done the actions in response to the review, but there’s also work in progress. I’ve got more than 50 tests and I’m working on an implementation. That’s revealed places where the spec needs more detail. I think we should refrain from further review until I’ve done that.

2.3. PR #289: Proposal to add fallback behaviour to map:get and array:get

See pull request #289.

MK reviews the PR.

  • MK: In its current state, it basically affects map:get and array:get.
    • … There’s now a fallback function on map:get that says what the function should do if the key isn’t present. Default is to return ().
    • … And similarly on array:get, but the default is to throw an error.
  • RD: Shouldn’t the namespace of the error be the err: namespace?
  • MK: Yes, well spotted.

This brings us to the context of defaults, but that’s a different issue.

  • DN: I think this seems reasonable. I’m only worried about the default for maps. It’s perfectly possible that for a given actual key, the map could actually contain an empty sequence. That would make it difficult to distinguish these two cases.
  • MK: But we’ve got to be compatible with the existing capability which returns an empty sequence.
  • DN: I thought someone raised an issue that errors were raised and they wanted forgiving behavior. But maybe that’s something different.

(Before the end of the call, DN finds the reference and sends email about it.)

  • MK: In 3.1, it returns an empty sequence and there are notes that say if that’s a problem check first with map:contains.
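
The fallback behaviour and DN's concern can be sketched as a rough Python analogue. The names map_get and array_get are hypothetical stand-ins, the 1-based index mirrors XPath arrays, and passing the key (or index) to the fallback is an assumption made for illustration rather than a statement of what the PR specifies.

```python
from typing import Any, Callable, Mapping, Sequence


def map_get(m: Mapping[Any, Any], key: Any,
            fallback: Callable[[Any], Any] = lambda k: ()) -> Any:
    """Return m[key], or fallback(key) when the key is absent."""
    return m[key] if key in m else fallback(key)


def _raise_out_of_bounds(index: int) -> Any:
    raise IndexError(f"array index {index} out of bounds")


def array_get(a: Sequence[Any], index: int,
              fallback: Callable[[int], Any] = _raise_out_of_bounds) -> Any:
    """Return the item at a 1-based index, or fallback(index) when out of range."""
    return a[index - 1] if 1 <= index <= len(a) else fallback(index)


m = {"present": ()}  # the key exists and its value is the empty sequence

# With the default fallback, a stored empty value and a missing key look alike.
assert map_get(m, "present") == map_get(m, "absent") == ()

# A caller who cares can test membership first (compare map:contains) ...
assert "present" in m and "absent" not in m

# ... or supply a fallback that cannot be confused with a stored value.
MISSING = object()
assert map_get(m, "absent", lambda k: MISSING) is MISSING
```

Keeping the empty result as the map default preserves the existing behaviour, while the array default keeps raising an error.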

Some discussion of having an empty map returned so that you can dereference through the sequence. DN will try to find the use case.

Proposal: Accept this PR


2.4. PR #330: Update fn:parse-html to apply review feedback.

See pull request #330.

RD reviews changes in PR 330.

Some discussion of current build failures. NW thinks rebasing off the current master will fix it.

  • RD: I’ve applied all the feedback except for namespaces which I’m not sure about.
  • RD: There’s now a note about character encodings.
    • … it attempts to avoid a decode/recode cycle

Some … discussion … about the fact that the WHATWG defines the ISO 8859-1 and ASCII encoding labels as aliases for Windows-1252.
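
The practical effect of that aliasing can be shown with a small Python sketch. The label table is a hypothetical, partial excerpt, and decode_like_a_browser is an illustrative name, not an API from any specification.

```python
# Hypothetical, partial label table; the Encoding Standard lists many more labels.
WHATWG_LABELS = {
    "ascii": "windows-1252",
    "us-ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "windows-1252": "windows-1252",
    "utf-8": "utf-8",
}


def decode_like_a_browser(data: bytes, label: str) -> str:
    """Resolve a declared label through the alias table, then decode."""
    encoding = WHATWG_LABELS[label.strip().lower()]
    # Caveat: Python's cp1252 codec rejects the five bytes that Windows-1252
    # leaves unassigned, whereas the WHATWG table maps them to C1 controls.
    return data.decode("cp1252" if encoding == "windows-1252" else encoding)


raw = b"\x80 10"
print(repr(raw.decode("iso-8859-1")))            # '\x80 10' (a C1 control)
print(decode_like_a_browser(raw, "ISO-8859-1"))  # € 10
```

A document declared as ISO 8859-1 therefore yields a euro sign for byte 0x80 when decoded the way a browser would, rather than a control character.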

  • MK: I found it simpler when describing something analogous for serialization to treat it as if you were encoding it into UTF-8 and then back out again, as a simpler way of specifying it.

Some further discussion about the fact that one of the goals here is to make it clear that you aren’t doing a decoding/recoding step.

  • RD: I’ve clarified what * meant in the table.
    • … Updated the section about mapping to XDM from HTML DOM Nodes.
    • … Reworded the introduction to make that clearer.
    • … Where that section uses attribute and element local names, I’ve added a note about the names being lowercase.
  • MK: Will the names already be lowercase in the DOM?
  • RD: If you parse them with the HTML5 parsing algorithm, they will be lowercase. It’s not clear what the HTML3 or HTML4 specifications say about them.
  • MSM: I was thinking that the implication of MK’s observation is that you could change that must to a will. I thought in the introduction we said it was HTML5 DOM nodes.
  • RD: I’m happy to change must to will.
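
The lowercasing RD describes can be observed with a short Python sketch. It uses the standard library's html.parser, which is not a conforming implementation of the HTML5 parsing algorithm but is documented to lowercase tag and attribute names in the same way; NameCollector is a hypothetical class name.

```python
from html.parser import HTMLParser


class NameCollector(HTMLParser):
    """Record the element and attribute names the parser reports."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # tag and attribute names arrive already converted to lowercase
        self.names.append((tag, [name for name, _ in attrs]))


collector = NameCollector()
collector.feed('<DIV CLASS="x"><Span ID="y">text</Span></DIV>')
print(collector.names)  # [('div', ['class']), ('span', ['id'])]
```

Because the names are already lowercase when they reach the consumer, the note describes what will be the case rather than a requirement on the mapping.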

ACTION QT4CG-021-03: RD to change must to will in DOM notes about lowercase

  • RD: I’ve replicated the note wherever localname is referenced.
    • … The note for element localname is slightly different because of how HTML5 describes the process.
    • … Added a note saying that the HTML specification does not include unparsed entity references. The way that the parsing algorithm works, if the entity is unknown then the entity character data gets put into the text node as is.
  • NW: “Unparsed” has a particular meaning, it might be clearer to say “unrecognized” or “unknown”
  • MSM: But in fact, such an entity is expanded, it’s just expanded to the text of its representation!
  • MK: I think my point is that the headings are wrong: unparsed entities are something completely different. These notes are about parsed entities.
  • RD: So maybe this should be in the text section?
  • NW: That makes sense.
  • MSM: And the note should also be revised to avoid the term “unparsed entity”.
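
The behaviour for unrecognized character references can be seen in a minimal Python sketch. It relies on html.unescape from the standard library, which applies the HTML5 rules for character references to a string; it is a stand-in for illustration, not the full parsing algorithm under discussion.

```python
from html import unescape

# A known named reference is replaced by the character it denotes.
print(unescape("AT&amp;T"))          # AT&T

# An unknown name is left in the text exactly as written.
print(unescape("fish &foo; chips"))  # fish &foo; chips
```

The unknown reference survives as ordinary character data, which is why the note fits the discussion of text nodes.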

ACTION QT4CG-021-04: RD to revise and move the note about unrecognized entities

  • RD: The other bit was in terms of namespaces.
  • MK: The HTML WG has invented a distinction between HTML nodes and XML nodes, and they have different semantics. That roughly worked in XPath 1.0, where usually all your nodes came from one document.
    • … We have to find a different way to try to achieve those usability benefits.
    • … It’s sort of out-of-scope for this PR except to add a note that says that HTML and XML nodes are indistinguishable in the data model and that the note about namespace handling in XPath 1.0 can’t apply.

[Scribe fails to capture some nuance in the discussion of namespace handling.] Something about special handling for specific namespace prefixes.

  • RD: There’s also other handling for putting HTML, MathML, and SVG elements in their respective namespaces. It’s effectively treating specific QNames as special indicators and ignoring everything else. Do we want to follow suit in the construction of the XDM, or do we want to willfully violate it back into the way XML works?

Proposal: Accept this PR.


3. Any other business

  • MK: I’d like to make some progress on XSLT things.

Proposal: Hold a meeting explicitly about XSLT features in two weeks.