QT4 CG Meeting 016 Minutes 2022-12-20

Table of Contents

Minutes

Approved at meeting 017 on 10 January 2023.

Summary of new and continuing actions [0/14]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-015-02: NW to improve the width of the diagrams, perhaps multiple views
  • [ ] QT4CG-015-03: NW to make sure the direction of the arrow is in the legend
  • [ ] QT4CG-015-04: NW to investigate if a dynamic presentation is practical
  • [ ] QT4CG-016-01: DN to provide prose with more of the details for #281
  • [ ] QT4CG-016-02: NW to add an ed-note indicating when it was approved.
  • [ ] QT4CG-016-03: RD to add a note clarifying “known character encoding”
  • [ ] QT4CG-016-04: RD to add a note clarifying the “*”/“*” html/version combination
  • [ ] QT4CG-016-05: RD to add a “todo” noting the dependency on keyword arguments
  • [ ] QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping
  • [ ] QT4CG-016-07: NW to make an issue about the problems of document-uri uniqueness
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
  • [ ] QT4CG-016-10: RD to consider how to clarify parsed entity parsing.

1. Administrivia

1.1. Roll call [10/14]

  • [ ] Anthony (Tony) Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Sasha Firsov (SF) [0:20-]
  • [X] Christian Grün (CG)
  • [X] Joel Kalvesmaki (JK) [0:15-]
  • [X] Michael Kay (MK)
  • [X] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [X] Ed Porter (EP)
  • [ ] Liam Quin (LQ)
  • [ ] Adam Retter
  • [ ] C. M. Sperberg-McQueen (MSM)
  • [X] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.

Accepted.

1.3. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.

  • JL: The previous minutes imply that I wrote all of the XSLT compiler; I only wrote the XSLT ones.

Accepted with that amendment.

1.4. Next meeting

The next meeting is scheduled for Tuesday, 10 January 2023.

No regrets heard.

1.5. Review of open action items [3/7]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [X] QT4CG-014-01: MK to propose new try-get functions for arrays, maps, and (maybe) sequences
    • PR #289
  • [X] QT4CG-015-01: MK to change the name of the argument to array:index-where from $input to $array
    • Slipped into one of the PRs
  • [ ] QT4CG-015-02: NW to improve the width of the diagrams, perhaps multiple views
  • [ ] QT4CG-015-03: NW to make sure the direction of the arrow is in the legend
  • [ ] QT4CG-015-04: NW to investigate if a dynamic presentation is practical
  • [X] QT4CG-015-05: MK to raise a PR to resolve issue #107, self::(a|b|c)
    • PR #286

2. Technical Agenda

2.1. Issue #281

We had some discussion of #281 last week, but no resolution.

  • DN: I think there was some progress in the comments. MK suggested I write some prose.
  • MK: I need to see the detail.

ACTION QT4CG-016-01: DN to provide prose with more of the details for #281

  • DN: I also reopened an issue about guards on function evaluation.

2.2. Issue #170, XPath “otherwise” operator

MK proposes that this issue may be ready to be decided.

At the meeting last week, no one present could recall why this issue had been postponed.

  • NW: Does anyone know why we postponed it?
  • MK: I think there was debate about how it should be spelled.
  • DN: I agree, it was the name.

A first straw poll. From the candidates, vote for any and all that you’d consider:

Candidates:

  • [X] otherwise
  • [X] fallback
  • [ ] if-empty
  • [ ] if-null
  • [X] when-empty
  • [ ] when-null
  • [X] on-empty
  • [ ] on-null
  • [X] default

Okay, that eliminates a few options. Next straw poll: vote for exactly one option, your preference.

New candidates:

  • otherwise: 7 votes
  • fallback: 0 votes
  • when-empty: 0 votes
  • on-empty: 1 vote
  • default: 0 votes

Looks like “otherwise” is the favorite.

  • NW: Is there anyone who can’t live with “otherwise”?

No one says they can’t live with it.

Proposal: use “otherwise”

Accepted.

ACTION QT4CG-016-02: NW to add an ed-note indicating when it was approved.
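
For reference, a minimal Python sketch of the behaviour the operator is understood to have. It assumes the semantics suggested by the candidate names above (“if-empty”, “on-empty”): the left operand is returned unless it is the empty sequence, in which case the right operand is returned. Sequences are modelled as Python lists, and the function name is only illustrative.

```python
from typing import List, TypeVar

T = TypeVar("T")


def otherwise(primary: List[T], fallback: List[T]) -> List[T]:
    """Return `primary` unless it is the empty sequence, else `fallback`.

    Assumption: this mirrors `A otherwise B` as described in issue #170.
    The test is on emptiness, not on truthiness of the items.
    """
    return primary if len(primary) > 0 else fallback
```

Note that under this reading a sequence containing a single false value is not empty, so otherwise([False], [0]) keeps the left operand.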

2.3. Review pull request #259: parse-html (issue #74)

See pull request #259

RD gives us a brief tour.

  • RD: Moved “parsing and serializing” up to a main section
    • … Moved the parse-xml, parse-json, and some related functions into that section
    • … All of the object parsing and serialization functions are now in the same place
    • … 15.3.1, there’s an options method for the parsing options
  • MK: Records take string keys where traditionally our options have taken QName keys, especially extensible ones.

Some discussion of whether or not QNames are allowed. They are, because * occurs in the record.

  • RD: There’s an include-template-content parameter that defines the behavior for handling the template element in HTML5.
    • … HTML specification says that the content in the template doesn’t appear in the tree, but specifies different behavior. So it’s a bit tricky; this option lets you decide if you want them or not.
    • … Review of the signature

Some discussion of what “parsing with a known character encoding” means when the input is a string. Could add a note to clarify.

ACTION QT4CG-016-03: RD to add a note clarifying “known character encoding”

  • RD: Outlines how the encoding is determined in the case of binary.
    • … You can override the encoding.
    • … Outlines what the various html/version combinations mean.

Some discussion of what the html/version combination “*”/“*” means. It’s a wildcard, not a literal value.

ACTION QT4CG-016-04: RD to add a note clarifying the “*”/“*” html/version combination

  • RD: Trying to be quite loose with specifying how these are handled.
    • … An implementation could, for example, use HTML: The Living Standard for all of the versions.
    • … A few error conditions are defined.

Some discussion of the use of keyword arguments in the examples. RD is assuming that the proposal to build a map from any additional arguments has been accepted.

ACTION QT4CG-016-05: RD to add a “todo” noting the dependency on keyword arguments

  • RD: The “Conversion from HTML” section is quite complicated
  • MK: Does the DOM: Living Standard define the mapping?
  • RD: No, they define the node types, we’re defining the mapping.

ACTION QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping

  • RD: There are some notes about how processing instructions are handled
    • … Similarly, CDATA sections become text nodes
    • … The DocumentFragment node doesn’t have a corresponding XDM node.
    • … It’s used in the ShadowRoot which is limited to JavaScript so we don’t care
    • … And in template content which we have include-template-content to support.
    • … The following sections go into more detail about how each of the node conversions is supported.
  • MK: Am I right that DOM attribute nodes don’t have a parent?
  • RD: The DOM Attr interface does have a parent.
  • MK: Okay. I thought they’d changed that, maybe I’m thinking of something else.
  • RD: Continues…
    • … The child nodes have to handle the HTMLTemplateElement
    • … The DocumentType is removed

Some discussion of how adjacent text nodes are handled.

Some discussion of the document-uri accessor. The fact that document URIs are required to be unique is a problem.

ACTION QT4CG-016-07: NW to make an issue about the problems of document-uri uniqueness

  • RD: Continues…
    • … Observes that the is-id accessor has to be sensitive to the fact that HTML ID values don’t have to be NCNames.
    • … Handling namespace nodes requires a special accessor that deals with “namespace attributes”.
    • … The tree is traversed to collect the relevant namespaces.
  • MK: In point 2a, the comparison has to be defined, it can’t be node identity.

ACTION QT4CG-016-08: RD to clarify how namespace comparisons are performed.

  • RD: Continues…
    • … The node kind accessor has special handling for namespaces
    • … The node name accessor is quite tricky because in the HTML parser the local name can contain the “:”. That needs to be split and parsed.
  • MK: Are there any differences in the value spaces for HTML and XML names other than the “:”?
  • RD: No, I don’t think so. And the local-name is always lowercase with the HTML: The Living Standard parser.

ACTION QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
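
The splitting RD describes for the node name accessor could look roughly like the Python sketch below. It is an illustration only: the function name is hypothetical, and splitting at the first “:” is an assumption, since the minutes do not record the exact rule the PR uses.

```python
from typing import Optional, Tuple


def split_html_name(local_name: str) -> Tuple[Optional[str], str]:
    """Split an HTML-parser local name such as "e:bug" into (prefix, local part).

    Assumption: the name is split at the first colon. Names with no colon,
    or with nothing on one side of it, are returned unsplit with no prefix.
    """
    prefix, sep, rest = local_name.partition(":")
    if not sep or not prefix or not rest:
        return None, local_name
    return prefix, rest
```

For example, split_html_name("e:bug") yields the prefix "e" and the local part "bug", while split_html_name("div") yields no prefix.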

  • RD: Continues…
    • … The parent accessor also has to check for the HTMLTemplateElement case.
    • … There’s a note that describes consequences of the fact that templates don’t have a link back into the tree. An implementation might be able to do better if it has access to the host property.
    • … The string-value accessor is described. It handles combining adjacent text and CDATA sections into a single value.
    • … The type-name accessor is described.
    • … The typed-value accessor is described.
    • … The unparsed-entity-public-id and unparsed-entity-system-id are empty.
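
As an illustration of the combining step mentioned for the string-value accessor, here is a minimal Python sketch. The (kind, text) pair model and the function name are assumptions made for the example, not taken from the PR.

```python
from typing import List, Tuple

# Each child is modelled as a (kind, text) pair.
# Kinds used in this sketch: "text", "cdata", "element".
Child = Tuple[str, str]


def merge_adjacent_text(children: List[Child]) -> List[Child]:
    """Collapse runs of adjacent text and CDATA children into single text items."""
    merged: List[Child] = []
    for kind, text in children:
        is_textual = kind in ("text", "cdata")
        if is_textual and merged and merged[-1][0] == "text":
            # Extend the text item that is already at the end of the result.
            merged[-1] = ("text", merged[-1][1] + text)
        elif is_textual:
            # Start a new text item; CDATA is reported as plain text.
            merged.append(("text", text))
        else:
            merged.append((kind, text))
    return merged
```

Non-text children act as boundaries, so text on either side of an element stays in separate items.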

ACTION QT4CG-016-10: RD to consider how to clarify parsed entity parsing.

  • NW: I’d have been happy to leave out all the HTML options and just do HTML5.
  • RD: I think they’re necessary because of MK’s comments about how the conversion would actually work.
  • MK: I’m worried about whether we’re capable of writing test cases about this many versions and variants.
  • RD: Liam Quin mentioned support for XHTML which also introduced versions.
  • NW: Okay. I wasn’t trying to derail things.
  • NW: I think the extra options should be in their own option, not tacked on the end. I created a comment with an example and rationale.
  • SF: Do we need to specify our own algorithm in addition to what HTML defines?
  • RD: There are complexities that have to be addressed that depend on the specification, for example the template content support. Likewise, namespaces and attributes have to be specified.
  • SF: I’m trying to make a bridge between the HTML and XML.
    • … You’re doing it by converting into the XML standard.
    • … It could also be done by changing HTML to accept XML.
  • RD: There are a number of places where XML and HTML disagree, for example, the value space of IDs. And HTML says it willfully ignores HTML rules for things like the template element.
    • … And because HTML isn’t namespace aware, it doesn’t know how to handle namespace nodes.
    • … We’re not bringing in the living standard directly because we need a mapping.
  • SF: The mapping is understandable but it implies that you’re not going to work on HTML itself. You’re going to be importing the DOM, not operating on the browser DOM. That’s a very significant break for adoption of this proposal into the HTML standard.
  • RD: I’ve noted that there are several ways to construct the tree; it could take the HTML DOM and implement the XDM accessors in terms of those.
  • SF: It can but it will be not just a bottleneck but a blocker to getting this into the browser. We could delegate the creation of the nodes to the underlying engine. If we give the original document power to create the nodes, they’ll be able to adopt our XML tree as their HTML tree.
  • RD: A browser will parse XML to the HTML DOM and will parse HTML to the HTML DOM. The issue is that if you do things like element.localName, that can return things like e:bug.

Some discussion of the various ways that you can do this “conversion”. It’s a question of how to present the view.

  • NW: I hate to interrupt, but we’ve run out of time today.
  • DN: Can we continue with this discussion on the 10th?
  • NW: Yes, we’ll continue on the 10th.

3. Any other business

Happy holidays, everyone!