QT4 CG Meeting 019 Minutes 2023-01-24

Minutes

Approved at meeting 020 on 31 January 2023.

Summary of new and continuing actions [0/17]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-015-02: NW to improve the width of the diagrams, perhaps multiple views
  • [ ] QT4CG-015-04: NW to investigate if a dynamic presentation is practical
  • [ ] QT4CG-016-02: NW to add an ed-note indicating when it was approved.
  • [ ] QT4CG-016-03: RD to add a note clarifying “known character encoding”
  • [ ] QT4CG-016-04: RD to add a note clarifying the “*”/“*” html/version combination
  • [ ] QT4CG-016-05: RD to add a “todo” noting the dependency on keyword arguments
  • [ ] QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping
  • [ ] QT4CG-016-07: NW to make an issue about the problems of document-uri uniqueness
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
  • [ ] QT4CG-016-10: RD to consider how to clarify parsed entity parsing.
  • [ ] QT4CG-020-01: MK to fix the type of $pattern in fn:tokenize()
  • [ ] QT4CG-020-02: NW to revisit the width issue in the type diagrams

1. Administrivia

1.1. Roll call [8/14]

Regrets: BTW, EP, MK

  • [ ] Anthony (Tony) Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Sasha Firsov (SF)
  • [X] Christian Grün (CG)
  • [X] Joel Kalvesmaki (JK) [:12-]
  • [ ] Michael Kay (MK)
  • [X] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [ ] Ed Porter (EP)
  • [ ] Liam Quin (LQ)
  • [ ] Adam Retter
  • [X] C. M. Sperberg-McQueen (MSM)
  • [ ] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.

Accepted.

1.3. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.

Accepted.

1.4. Next meeting

The next meeting is scheduled for Tuesday, 31 January 2023.

No regrets heard.

1.5. Review of open action items [5/17]

(Items marked [X] are believed to have been closed via email before this agenda was posted.)

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-015-02: NW to improve the width of the diagrams, perhaps multiple views
  • [ ] QT4CG-015-04: NW to investigate if a dynamic presentation is practical
  • [ ] QT4CG-016-02: NW to add an ed-note indicating when it was approved.
  • [ ] QT4CG-016-03: RD to add a note clarifying “known character encoding”
  • [ ] QT4CG-016-04: RD to add a note clarifying the “*”/“*” html/version combination
  • [ ] QT4CG-016-05: RD to add a “todo” noting the dependency on keyword arguments
  • [ ] QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping
  • [ ] QT4CG-016-07: NW to make an issue about the problems of document-uri uniqueness
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
  • [ ] QT4CG-016-10: RD to consider how to clarify parsed entity parsing.
  • [X] QT4CG-018-01: MK to make a PR that removes ternary expressions.
  • [X] QT4CG-018-02: MK to review “In this notation, function-name, in bold face…” in 1.5.
  • [X] QT4CG-018-03: MK to address the issues raised in the comment on PR 304
  • [X] QT4CG-018-04: MK to consider the editorial suggestion on “a predicate is not part of a step”

2. Technical Agenda

We have regrets from MK for this week, but some of these PRs seem straightforward enough to resolve in his absence.

2.1. Review pull request #313 fn:remove()

See pull request #313.

Proposal: Accept this PR

Accepted.

2.2. Review pull request #312 minor editorial improvements

See pull request #312.

  • NW: This was your change, MSM. Are you satisfied?
  • MSM: Okay.

Some discussion of the “fn” prefix.

  • DN: It’s good that in XPath there are implicit bindings for prefixes. We don’t have this in XSLT; I’ve proposed it.

Proposal: Accept this PR

Accepted.

2.3. Review pull request #310 outstanding issues from PR #304

See pull request #310.

  • JL: The type of $pattern in fn:tokenize() has to be xs:string?

ACTION QT4CG-020-01: MK to fix the type of $pattern in fn:tokenize()

Some discussion of parameter passing…

Proposal: Accept this PR

Accepted.

2.4. Review pull request #309 drop ternary conditionals

See pull request #309.

Proposal: Accept this PR

Accepted.

2.5. Review pull request #308 improvements to type diagrams

See pull request #308.

Changes not successfully presented.

  • RD: If we moved xs:integer and xs:string out, then maybe it would be narrower.
  • DN: I would like to see the nodes on the right levels.

Some further discussion of the layout.

  • JL: If you start to pull the diagrams apart and give labels to the groups, that will begin to make people think they have those names.
  • MSM: Just keep xs:anyAtomicType on the root level, don’t add intermediate labels.

Some discussion of a prose description.

  • MSM: I think this is as good a job narrowing it as can be done. I’d be happy to accept this. If we want to avoid side-to-side scrolling, the tree notation that’s most successful at dealing with narrow columns is the indentation form that you see in file explorers. Splitting out subtrees for width reasons makes the shape of the entire tree harder to visualize, so I’m concerned about that.
  • NW: I’ll explore that. And I’ll fix the horizontal scrolling.
  • JK: When I look at the legend, what is the meaning of a box with two items in it?
  • RD: We could try to make a graph something like this, from the XML Schema spec: https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
  • DN: It’s also narrower because it leaves off the xs: prefixes.
  • MSM: I prefer the 1.1 diagram: https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes. And I see that it’s vertical.
  • RD: I think it makes sense.

RD offers to try experimenting with a few layouts.

  • MSM: Apropos of trying to make this something that could be generated, the vertical diagram is SVG and it’s unlikely that I didn’t generate it!

ACTION QT4CG-020-02: NW to revisit the width issue in the type diagrams
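
As an illustration of the indentation form MSM mentions, here is a minimal sketch in Python. The nested dictionary holds only a small, hand-picked slice of the atomic type hierarchy, and the names TYPE_TREE and render_indented are hypothetical; this is not how the spec diagrams are generated.

```python
# Illustrative slice of the atomic type hierarchy (assumption: not the full diagram).
TYPE_TREE = {
    "xs:anyAtomicType": {
        "xs:string": {
            "xs:normalizedString": {
                "xs:token": {},
            },
        },
        "xs:decimal": {
            "xs:integer": {
                "xs:long": {},
            },
        },
        "xs:boolean": {},
    },
}


def render_indented(tree, depth=0, indent="  "):
    """Return one line per type, indented by depth, like a file explorer listing."""
    lines = []
    for name, children in tree.items():
        lines.append(indent * depth + name)
        # Children are listed directly under their parent, one level deeper.
        lines.extend(render_indented(children, depth + 1, indent))
    return lines


print("\n".join(render_indented(TYPE_TREE)))
```

The width of the output grows with the depth of the tree rather than with the number of siblings, which is why this form suits narrow columns.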

2.6. Issue #299 (formerly #281)

We had some discussion of #281 previously, but no resolution. Awaiting more feedback from the CG in the issue.

  • NW: It seems like most of the feedback is that this should be an implementation detail.
  • DN: In MK’s absence, I’m not sure we can discuss this, but I would like to give an update. I’ve made the specification much more concise; wildcards in a destructuring expression should be lazy by default. That was the constructive part. There was a lot of discussion about what an “average” user is. I think this is also an important topic.
    • … RD provided information that such functions are in MarkLogic.
    • … While MarkLogic’s functions are not exactly what is proposed here, it is notable that an implementor thought about the laziness/eagerness issue and provided facilities for the users to indicate this. Probably they had significant reasons to do this.
    • … MK is saying that if we give this option to the user, the user will not be able to take advantage of it.

DN shares his screen, showing xsl:evaluate from XSLT

  • DN: There are lots of things in xsl:evaluate that are much more dangerous than my proposal for a lazy keyword. We’re all users, I don’t know what the average user is.
  • RD: It would be interesting to see if there are any other programming languages that have this kind of functionality. As I said in the thread, optimization is hard. A lot of processors invest a lot of time and resources optimizing the queries. In general writing an optimal query can in some cases be counter-intuitive and heavily dependent on the particular processor and processor version. The thing I’m not sure on is how a thing like a “lazy” keyword would benefit the processor. And how will the user know where to put it without causing performance issues. It can also be highly dependent on the actual data that you’re using.
    • … There’s an example where it’s more optimal to use bubble sort than any other kind of sort if you have a list that’s already mostly sorted.
    • … Those kinds of decisions are generally better made by the processor.
  • DN: I totally agree. I’m afraid most people misunderstand the proposal. It’s not supposed to replace the optimizer. I would only use it when I was desperate. It should only be done as a last resort.
  • NW: I’m not sure that the danger of xsl:evaluate is exactly relevant.
  • DN: I think it was good that xsl:evaluate was added and I think that it would be good to add lazy.
  • DN: In Haskell everything is lazy by default. You have to request eager evaluation.
  • MSM: I’m grateful to RD and DN for working together to allow me to see this in a slightly different light. It’s one particular way that one could give a hint to an optimizer. The bubble sort example is a good one, under particular conditions this algorithm is better than that one. In general, type systems let you know those things, but it’s always possible that the optimizer could use information that it doesn’t have, if it could have it. If I think of it as a hint I could give the processor, then it feels like a processing instruction. It shows the implementors know what it’s like to have a four o’clock deadline!
  • CG: My personal impression is that we need at least one implementor that’s convinced about how to implement such a keyword. There are so many ways that users could use it. Would it have an impact on error reporting? What happens if the query is reordered? If it’s not 100% specified.
  • DN: I think the lazy keyword is defined strictly. I think what you’re asking about is the lazy function, and its semantics are defined strictly.
  • RD: To the point of why xsl:evaluate is dangerous, it’s because if you’re referencing user generated data, you could then open yourself up to running arbitrary code. To the point of the lazy keyword, the some and every quantifier expressions already specify the behavior of lazy evaluation if a processor wants to short-circuit when they encounter a value that’s either true or false (depending). I’m not sure that the specification says that a let expression has to be evaluated then-and-there. There are consistency constraints, but apart from that as long as it evaluates the expression that it needs to in ways that are functionally equivalent, it can defer it.
    • … There’s no guarantee that a processor will know what to do with a lazy keyword.
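
RD’s bubble sort example in the discussion above can be made concrete with a minimal sketch in Python. The function name bubble_sort and the pass counter are hypothetical, added only to show how the amount of work depends on the input data; nothing here was presented at the meeting.

```python
def bubble_sort(items):
    """Sort a copy of items; return (sorted_list, passes), where passes counts sweeps."""
    data = list(items)
    passes = 0
    n = len(data)
    while True:
        passes += 1
        swapped = False
        for i in range(n - 1):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            # A sweep with no swaps means the list is already in order.
            break
        n -= 1  # the largest remaining element is now in its final place
    return data, passes


nearly_sorted, passes_nearly = bubble_sort([1, 2, 4, 3, 5, 6])
reversed_sorted, passes_reversed = bubble_sort([6, 5, 4, 3, 2, 1])
print(passes_nearly, passes_reversed)
```

On the nearly sorted list the early exit stops after two sweeps, while the reversed list needs a sweep per element. The cost depends on the data, which is the kind of information a query author may not have when deciding where to place a hint.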

Some discussion of whether this should be a hint or something mandatory.

  • RD: I think it might be more useful to try to highlight areas where lazy evaluation could be done by a processor.
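
One such area is the quantified expressions RD mentioned earlier. The following minimal sketch, in Python rather than XPath, shows the short-circuit behavior a processor is allowed (but not required) to use for some and every. The functions some and every and the evaluation counter are hypothetical names for illustration, and the fixed left-to-right order is an assumption of this sketch, not something the XPath specification mandates.

```python
def some(items, predicate):
    """Return (result, evaluated); stop at the first item that satisfies predicate."""
    evaluated = 0
    for item in items:
        evaluated += 1
        if predicate(item):
            return True, evaluated
    return False, evaluated


def every(items, predicate):
    """Return (result, evaluated); stop at the first item that fails predicate."""
    evaluated = 0
    for item in items:
        evaluated += 1
        if not predicate(item):
            return False, evaluated
    return True, evaluated


print(some(range(1, 1000), lambda x: x > 3))
print(every(range(1, 1000), lambda x: x < 3))
```

In both calls only a handful of the 999 items are tested before the answer is known, without the caller marking anything as lazy.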

3. Any other business