QT4 CG Meeting 026 Minutes 2023-03-14

Minutes

Approved at meeting 027 on 21 March 2023.

Summary of new and continuing actions [0/9]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-023-01: NW to review the stylesheets for functions across XPath and XSLT
  • [ ] QT4CG-024-01: MK to add namespace-uri-for-prefix argument changes to the compatibility appendix
  • [ ] QT4CG-024-02: DN to develop an alternative proposal for deep-action.
  • [ ] QT4CG-025-02: MK to make the context function properties simple values instead of functions
  • [ ] QT4CG-025-03: MK to revise and expand technical detail in PR #375
  • [ ] QT4CG-025-04: RD to remove the note in 15.5.15 of functions and operators.
  • [ ] QT4CG-026-01: MK to write a summary paper that outlines the decisions we need to make on “value sequences”

1. Administrivia

1.1. Roll call [9/12]

Regrets: BTW, JK

  • [ ] Anthony (Tony) Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Sasha Firsov (SF)
  • [X] Christian Grün (CG)
  • [ ] Joel Kalvesmaki (JK)
  • [X] Michael Kay (MK)
  • [X] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [X] Ed Porter (EP)
  • [X] C. M. Sperberg-McQueen (MSM)
  • [ ] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.

Accepted.

1.3. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.

Accepted.

1.4. Next meeting

The next meeting is scheduled for Tuesday, 21 March 2023.

ATTENTION: This meeting is scheduled at 16:00 local time in Europe and the UK. The United States switched to Daylight Saving Time starting on Sunday, 12 March 2023. Until Europe and the UK switch to summer time (on 26 March 2023), this meeting will be one hour later in the United States.

No regrets heard.

1.5. Review of open action items [7/15]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [ ] QT4CG-023-01: NW to review the stylesheets for functions across XPath and XSLT
  • [X] QT4CG-023-05: NW to put record types on an agenda.
  • [ ] QT4CG-024-01: MK to add namespace-uri-for-prefix argument changes to the compatibility appendix
  • [ ] QT4CG-024-02: DN to develop an alternative proposal for deep-action.
  • [X] QT4CG-025-01: NW to close the issues we agreed to close in meeting 024
  • [ ] QT4CG-025-02: MK to make the context function properties simple values instead of functions
  • [ ] QT4CG-025-03: MK to revise and expand technical detail in PR #375
  • [ ] QT4CG-025-04: RD to remove the note in 15.5.15 of functions and operators.
  • [X] QT4CG-025-05: MK to correct the accidental deletion in 3.6.2 of XQuery
  • [X] QT4CG-025-06: NW to fix the “blue boxes” in the serialization specification
  • [X] QT4CG-025-07: MK to update the text about for/let changes to use definitions and termrefs correctly.
  • [X] QT4CG-025-08: MK to add an error code for the case where member is used on something that isn’t an array
  • [X] QT4CG-025-09: NW to add the parse-uri and build-uri tests to the test suite

2. Technical Agenda

This agenda begins with record types, proposed for the agenda some weeks ago, followed by a few more substantial PRs then a few small PRs.

2.1. Record types

In meeting 023, we encountered the proposal for a new pattern syntax, (type(T), record(N, M, N)) that allows matching of items by item type. This needs CG discussion.

  • MK: We’re doing a lot of things that are building on top of that syntax but we’ve never agreed to it.
    • … There may be some loose ends that are worth discussing (esp. in XSLT with respect to patterns).

MK shares XQuery-40#id-record-test

  • MK: Record is used instead of tuple because tuples are often unnamed.
  • MK describes the design presented in the spec.
    • … Limited recursion is allowed; a record can refer to itself.
    • … Record types would have to be able to name each other if we wanted more comprehensive support for recursive types.
    • … The subtyping rules are a bit complicated; review encouraged!
    • … Considered defining the subtyping rules intensionally but tried to spell them out instead.
    • … Largely meets requirements except for the ability to name record types by reference.
  • MSM: I have to ask what might seem like a dumb question: as far as I can tell, everything you’ve described I can do with maps, but not quite vice versa. Why do we need records in addition to maps?
  • MK: We’re not introducing new values, we’re introducing new types. It’s a way of constraining the values that can be in a map.
  • MSM: That’s an excellent answer! And in that case the subtyping rules are important.
  • RD: I raised an issue a while ago about allowing “*” in a record, like we can in maps. It would be a useful alias.
  • MK: I’m sort of neutral; I don’t think it’s necessary but I’m not strongly opposed.
  • JL: RD, are you implying that it would be an alias?
  • MK: Yes, that’s what it would be if we allowed it. We’d just be saying that we didn’t require at least one field.
  • DN: I want to add something to what RD is saying; I see the value of “*” but for me the big step from maps to records is that I regard them as typed maps. So adding the “*” makes them less strictly typed. Could we have “strict” and “untyped” records?
  • RD: A record(*) would be an untyped record type. See #id-map-test. The same pattern would apply to records.
  • MSM: If I understood MK’s description at the outset correctly, saying “this can be any record” is not the same as saying “this can be any map”. So when we said “it would just be a synonym for map(*)” did I misunderstand?
  • MK: No, I think that would be a synonym. The star allows you to have anything else and if the anything is empty, that’s just like a map.
  • MSM: So if I have a “*” I no longer have the constraints on recursion?
  • MK: The only constraint on recursion is that you can’t constrain it. You can always have a field that can contain any map. What’s difficult is to say that it must contain one of these.
  • MSM: So the checking is not quite as tight as I’d thought. And if I have an extensible record, is that different from map(*)? I guess it is if I declared some fields as required.
  • MK: Yes, you’re expressing constraints on keys.
  • RD: As an example, the serialize function takes an option map and we could define it as a record type that constrains the various fields like output and things like that. And then allow unconstrained extensions.
  • DN: This discussion only amplifies for me the real value of a record without a “*”. The most important achievement of records is their strict type.
  • MK: When you’re using record types in XSLT style processing with match patterns, it turns out you very often want to specify just enough of the map to make it match uniquely. For example, if you want to match the kinds of records you get in JSON to represent employees, you want to be able to say this has a key called “social-security-number” so I’m going to assume it’s an employee.
  • DN: Yes, but it makes me think about some kind of subtyping relationship between records.
  • DN: The fields are unordered, but we’re defining the record presenting the fields in some order. Maybe it would be good to think of some sort of normalized presentation…I don’t know.
  • MK: You’ve got to be careful about that because at the moment, instances of a record type don’t have any flag that says a map is of a particular record type. A map can conform to many different record types; there’s no intrinsic type.
  • DN: We should also think about the instance of operator when the right hand side is a record. And maybe we should think about deep-equal as well.
  • MK: You can’t say that an item is a kind of record; we only have tests.
  • RD: You can think about it like duck typing in Python or the way that JSON bindings are implemented in TypeScript, where you’ve got a Java object that represents your JSON but you don’t know whether it conforms, so you have to check if it is an instance of that, which you do by checking its properties. That’s similar to how records work here.
  • JL: MK, the self-referential “..” is strictly to the parent of the type. Is there any case where you might want to point back up to the grand-parent?
  • MK: That’s typically where you want mutual recursion. A use case for that: I tried to model the schema component model with maps. If you do that, then every component is a record type and you want a graph of them pointing to each other, so you’d like mutual recursion to describe that structure.
    • … The fact that you can’t construct a graph of maps is a different problem!
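
As an illustration of the points above (a record type is a test on a map rather than a new kind of value, a map can pass several such tests, and “*” relaxes the test to allow undeclared keys), here is a minimal Python sketch. It was not presented at the meeting; `conforms`, `is_string`, and the sample data are hypothetical names, and the rules are a simplification rather than the draft specification's subtyping or matching rules.

```python
from typing import Any, Callable, Mapping, Optional

# A field test is any predicate over a value; the names here are illustrative only.
FieldTest = Callable[[Any], bool]


def is_string(value: Any) -> bool:
    return isinstance(value, str)


def conforms(
    value: Any,
    required: Mapping[str, FieldTest],
    optional: Optional[Mapping[str, FieldTest]] = None,
    extensible: bool = False,
) -> bool:
    """Return True if `value` is a map that passes this record-style test."""
    optional = optional or {}
    if not isinstance(value, Mapping):
        return False
    # Every required field must be present and pass its test.
    for name, test in required.items():
        if name not in value or not test(value[name]):
            return False
    # Optional fields are checked only when present.
    for name, test in optional.items():
        if name in value and not test(value[name]):
            return False
    # A non-extensible test rejects keys it does not declare.
    if not extensible:
        declared = set(required) | set(optional)
        return all(key in declared for key in value)
    return True


employee = {"name": "A. Example", "social-security-number": "000-00-0000"}

# Extensible test with one required key, in the spirit of the employee example.
looks_like_employee = conforms(
    employee, {"social-security-number": is_string}, extensible=True
)
# Strict test: both fields declared, nothing else allowed.
strict_employee = conforms(
    employee, {"name": is_string, "social-security-number": is_string}
)
# Strict test that does not declare "name", so the extra key is rejected.
strict_ssn_only = conforms(employee, {"social-security-number": is_string})
# No declared fields plus extensibility accepts any map at all.
any_map = conforms(employee, {}, extensible=True)
```

The same map passes two different tests here, which matches the observation that a map carries no intrinsic record type; the last call mirrors the reading of record(*) as a synonym for map(*).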

Some discussion of how the self-reference only works one level deep.

  • RD: You could describe a binary tree.
  • MK: I think John Snelson wrote something about this a while back. He was ahead of his time with respect to making tuples into named types that could refer to each other.
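
To make the binary tree remark concrete, the following Python sketch (not from the meeting; `is_tree` and the field names `value`, `left`, and `right` are assumptions) shows a test whose optional fields refer back to the enclosing test. A single level of self-reference is enough for this shape; a graph of different component types pointing at each other would need tests that can name one another, which this sketch does not attempt.

```python
from typing import Any, Mapping


def is_tree(value: Any) -> bool:
    """Test a map against a hypothetical tree record with self-referencing fields."""
    if not isinstance(value, Mapping):
        return False
    # Only the declared keys are allowed, and "value" is required.
    if "value" not in value or not set(value) <= {"value", "left", "right"}:
        return False
    # "left" and "right" are optional; when present they must pass the same test.
    return all(is_tree(value[side]) for side in ("left", "right") if side in value)


tree = {"value": 2, "left": {"value": 1}, "right": {"value": 3, "right": {"value": 4}}}
broken = {"value": 2, "left": {"label": "not a tree"}}
```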

Proposal: We accept this as a consensus position.

Accepted.

2.2. PR #368: Issue 129 - Context item generalized to context value

See PR #368.

  • CG walks us through the PR.
  • CG: Issue #129 proposes expanding context item to context value.
    • … The sequence of items or nodes could also be bound externally.
    • … We could have a fat arrow operator to bind them.
    • … We consider using “.” to represent the context item and “~” to refer to the context value.
    • … Thanks to MK for helping to make a full proposal.

(The diff is a bit confusing because the sections have been reorganized.)

  • MK: My first thought was that this was going to be very disruptive because the notion that “.” is a singleton is so embedded. So if we’re going to have one, let’s try to make it a slightly different thing and keep the context item with its current semantics. I explored how it works in XPath and XQuery and it works quite nicely.
    • … We define a fixed relation between them. If the context value is a singleton, then “.” and “~” are the same thing.
    • … Then I found some useful things you can do with “~” like defining things over arrays.
    • … But what do you do about path expressions and axis steps?
      • … Should absolute paths use the context value and allow multiple root nodes?
      • … And what about relative path expressions?
      • … What scares me from an implementation point of view is that we might not be able to eliminate a lot of sorting into document order at compile time.
      • … Lots of times we know that the result will be a singleton so that we don’t have to do the sort. Now we might not know that until runtime.
      • … The compromise in this proposal is that absolute path expressions allow multiple selection. But using “!” not “/”. This can lead to unnecessary sorting.
      • … It’s certainly nice to do array filtering.
  • CG: We started from the very beginning to allow sequences of items in the path expression because that’s common in databases.
    • … People have become used to getting ordered and duplicate free results from path expressions.
    • … We could also optionally allow multiple nodes as input to absolute paths, but allow processors to raise errors.
  • MK: What do you do about document order for different documents in a database?
  • CG: It’s like fragments; every fragment has an ID which can be used for comparison. So documents stored sooner are “before” documents stored later.
  • MK: I wonder if someone could relax the constraints on “/” sorting into document order to say that in the case where you’re sorting multiple documents, the results are in arbitrary order rather than in some consistent order.

Some discussion of the consequences. You’d allow documents in arbitrary order but keep document order for items within each document.

  • RD: What’s the motivation for keeping the results in document order?
  • CG: You can have queries like document {<xml><a/><a/></xml>}//a => { /xml } which would return two copies of “xml”.
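
The ordering and duplicate-elimination behaviour under discussion can be sketched in Python as follows. This is an illustration only, not something shown at the meeting: the `Node` class, the `doc_id` and `position` fields, and `document_order` are hypothetical, and the model assumes (as in the fragment-ID description above) that documents stored earlier sort before documents stored later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    doc_id: int    # assumed: storage order of the containing document
    position: int  # assumed: position of the node within its document
    name: str


def document_order(nodes):
    """Remove duplicate nodes, then sort by (doc_id, position)."""
    unique = set(nodes)  # frozen dataclasses hash by field values
    return sorted(unique, key=lambda n: (n.doc_id, n.position))


xml = Node(doc_id=1, position=0, name="xml")
a1 = Node(doc_id=1, position=1, name="a")
a2 = Node(doc_id=1, position=2, name="a")
other = Node(doc_id=2, position=0, name="xml")

# Each "a" element maps to its root; plain concatenation repeats the root.
roots_per_a = [xml for _ in (a1, a2)]
deduplicated = document_order(roots_per_a)

# Nodes from two documents: the earlier-stored document sorts first.
mixed = document_order([other, a2, xml, a1])
```

Without the deduplication step the root appears once per `a` element, which is the "two copies" effect in the example; with it, a single node remains.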

“Where are we?” asks the scribe.

  • MK: We’ve jumped over the easy bits and just talked about the hard bits. The key thing is that the idea of context values does allow us to have array predicates.

CG shows us an example from XQuery 4.0 section 4.13.3.2.

  • NW: Is the proposal to accept this PR?
  • MK: It definitely needs more work because we need to work through the issues for XSLT.
  • DN: For the predicates for arrays, I have difficulty understanding. I made a proposal for “composite path language” that I think would give a much better syntax. I think “/” is going to be confusing. I think that it needs a lot more attention and investigation. It seems a little too complicated. Introducing new symbols doesn’t seem natural.
  • NW: What can we do to facilitate getting more work done here?
  • SF: We need to gather some examples of some of the other options, for example what is DN’s syntax.
  • DN: I don’t remember having read this in depth.

ACTION QT4CG-026-01: MK to write a summary paper that outlines the decisions we need to make on “value sequences”

  • RD: It would also be useful to have some motivating examples.

3. Any other business

DN shares some ideas for deep-equal-sequence.

  • DN: The current proposal for deep-equal is very long and complicated.
    • … We could have deep-equal-sequence that expands to deep-equal-item.
    • deep-equal-item can expand to deep-equal-atomic, deep-equal-map, deep-equal-array, deep-equal-node.
    • deep-equal-map uses deep-equal-atomic and deep-equal-sequence
    • deep-equal-array uses deep-equal-sequence
    • deep-equal-node uses different functions for the different node features
    • … We would also have deep-equal-set for things like attributes.
  • DN: On every level we would have greater understandability; we can construct the whole from these parts.
    • … This technique could also be applied to other things like parsing HTML.
    • … I think this would be a better technique generally.
  • MSM: I like this idea a lot. I like the idea of having the structure of a function like deep-equal be, as far as possible, visually and obviously similar to the recursive definition of the structure of our data model. This seems like it would be useful for us and for readers.
  • RD: I think this is interesting. My only concern is in specifying options that apply to nested calls.
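
The decomposition described above can be sketched in Python as a set of small, mutually recursive functions. This is a rough illustration under stated assumptions, not the proposal itself: sequences are modelled as tuples, arrays as lists, maps as dicts, atomic comparison is plain Python equality on matching types, and nodes, deep-equal-set, and options are left out entirely.

```python
from typing import Any


def as_sequence(value: Any) -> tuple:
    # Assumption of this sketch: a single item is a sequence of length one.
    return value if isinstance(value, tuple) else (value,)


def deep_equal_atomic(a: Any, b: Any) -> bool:
    # Simplification: same Python type and equal value.
    return type(a) is type(b) and a == b


def deep_equal_sequence(a: Any, b: Any) -> bool:
    # Sequences are equal when they pair up item by item.
    a, b = as_sequence(a), as_sequence(b)
    return len(a) == len(b) and all(deep_equal_item(x, y) for x, y in zip(a, b))


def deep_equal_array(a: list, b: list) -> bool:
    # Each array member is compared as a sequence.
    return len(a) == len(b) and all(deep_equal_sequence(x, y) for x, y in zip(a, b))


def deep_equal_map(a: dict, b: dict) -> bool:
    # Keys are matched with the atomic comparison, values with the sequence one.
    if len(a) != len(b):
        return False
    for key_a, value_a in a.items():
        matches = [k for k in b if deep_equal_atomic(key_a, k)]
        if len(matches) != 1 or not deep_equal_sequence(value_a, b[matches[0]]):
            return False
    return True


def deep_equal_item(a: Any, b: Any) -> bool:
    # Dispatch on the kind of item, mirroring the shape of the data model.
    if isinstance(a, dict) and isinstance(b, dict):
        return deep_equal_map(a, b)
    if isinstance(a, list) and isinstance(b, list):
        return deep_equal_array(a, b)
    if isinstance(a, (dict, list)) or isinstance(b, (dict, list)):
        return False
    return deep_equal_atomic(a, b)
```

The concern about options that apply to nested calls shows up here too: any option would have to be threaded through every one of these functions, which this sketch does not attempt.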

Chair proposes that DN create a new issue to track this.

4. Adjourned

We ran out of time.