QT4 CG Meeting 012 Minutes 2022-11-22

Minutes

Approved at meeting 013 on 29 November 2022.

Summary of new and continuing actions [0/9]

  • [ ] QT4CG-002-10: BTW to coordinate some ideas about improving diversity in the group
  • [ ] QT4CG-012-01: MK to add a reference to the place in XSD where it says that value spaces are non-overlapping
  • [ ] QT4CG-012-02: MK to point out the rules for union types are actually a bit more complex than described in section 2.2.1.
  • [ ] QT4CG-012-03: MK to consider introducing schema element and schema attribute in 2.8
  • [ ] QT4CG-012-04: MK to add “and also of other specific function types” to 2.8.
  • [ ] QT4CG-012-05: NW to see if the presentation of the type hierarchy tables can be improved.
  • [ ] QT4CG-012-06: MK to remove xs:untyped from the paragraph at the top of 2.8.2
  • [ ] QT4CG-012-07: NW to work with MK to sort out the server build issues with PR #237
  • [ ] QT4CG-012-08: NW to put the coercion rule changes on the agenda next week.

1. Administrivia

1.1. Roll call [8/13]

Regrets: BTW, JL

  • [ ] Anthony (Tony) Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Christian Grün (CG)
  • [X] Joel Kalvesmaki (JK) [0:20-]
  • [X] Michael Kay (MK)
  • [ ] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [X] Ed Porter (EP)
  • [ ] Liam Quin (LQ)
  • [ ] Adam Retter
  • [X] C. M. Sperberg-McQueen (MSM)
  • [ ] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Scribe. Chair.

1.2. Accept the agenda

Proposal: Accept the agenda.

Accepted.

1.3. Next meeting

The next meeting is scheduled for Tuesday, 29 November.

Regrets: JL

1.4. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting.

Accepted.

1.5. Review of open action items [7/8]

2. Technical Agenda

2.1. Review pull request #232: Data model clarifications, issue #225

See pull request #232

  • MK: Mostly in response to MSM’s comment about better description of data values and type annotations
    • … Cleaned up the Introduction; added missing definitions
    • … Added 2.2 Basic Concepts
    • … Tried to make minimal changes
    • … We do use the terms value and sequence synonymously, nothing we can do about that.
    • … “Instance of the data model” is a synonym for sequence
    • … Define “item type”
    • … Cleaned up the term “tree”
    • … Changed definition of atomic value to say it’s a pair of a type annotation and a datum.
    • … Clarified that datums cannot have overlapping value spaces

ACTION QT4CG-012-01: MK to add a reference to the place in XSD where it says that value spaces are non-overlapping

  • MSM: I thought value spaces did overlap
  • MK: If you use identity, the value spaces don’t overlap.
  • MSM: I would have thought it was simpler to say that a given datum may appear in more than one value space, but as an atomic value, they are different.
  • DN: Maybe it would be good to list explicitly all the primitive atomic types.

General agreement that the atomic types are listed in the spec later on.
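
To illustrate MK's revised definition and MSM's remark above, the following Python sketch models an atomic value as a pair of a type annotation and a datum. The class name `AtomicValue` and the use of Python's `Decimal` as a stand-in datum are assumptions made for illustration only; neither comes from the PR.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AtomicValue:
    """Hypothetical model: an atomic value is a (type annotation, datum) pair."""
    type_annotation: str  # e.g. "xs:integer"
    datum: object         # the underlying value, independent of its annotation


# The same datum paired with two different type annotations.
as_integer = AtomicValue("xs:integer", Decimal(42))
as_decimal = AtomicValue("xs:decimal", Decimal(42))

# The datums compare equal, but the atomic values are distinct pairs.
same_datum = as_integer.datum == as_decimal.datum
same_atomic_value = as_integer == as_decimal
```

In this toy model the datum is shared while the two atomic values differ, which is one way to read MSM's suggestion; whether the spec should describe value spaces this way was left open in the discussion.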

  • MK continues
    • … Note that type annotation means slightly different things for nodes and atomic values.
    • … Defined “schema type”
    • … Attempt to clarify the relationship between schema types and item types
  • RD: Suggests changing the prose to make it clear that “pure union types” are only of atomic types.
  • MK: We need to come back to that, I’m avoiding it here for the moment.

ACTION QT4CG-012-02: MK to point out the rules for union types are actually a bit more complex than described in section 2.2.1.

  • MK continues
    • … Getting away from the idea that every item has a most specific item type.
    • … Every item is an instance of one or more item types
    • … Atomic values do have a most specific item type
  • RD: Because you’re using item type syntax, “document()” needs to be “document-node()” etc.
  • MK: Right.

Some discussion of whether or not all attributes are instances of attribute(*).

ACTION QT4CG-012-03: MK to consider introducing schema element and schema attribute in 2.8

  • MSM: Is there just one function type?
  • MK: No, it will also be an instance of a more general function type based on co-variance and contra-variance.
  • MSM: Then I suggest adding that there are also other types, parallel to the wording for maps.
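
MK's point about co-variance and contra-variance can be sketched in Python as follows. This is a minimal illustration, assuming one-argument functions and a small fragment of the atomic type hierarchy; the helper names `is_subtype` and `function_is_subtype` are hypothetical and are not taken from the spec.

```python
# Toy fragment of the atomic type hierarchy: each type maps to its base type.
BASE_TYPE = {
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:string": "xs:anyAtomicType",
    "xs:anyAtomicType": None,
}


def is_subtype(sub: str, sup: str) -> bool:
    """Walk up the base-type chain from `sub` looking for `sup`."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = BASE_TYPE[current]
    return False


def function_is_subtype(f: tuple, g: tuple) -> bool:
    """f and g are (argument type, result type) pairs for one-argument functions.

    f is a subtype of g when g's argument type is a subtype of f's
    (contravariance) and f's result type is a subtype of g's (covariance).
    """
    f_arg, f_result = f
    g_arg, g_result = g
    return is_subtype(g_arg, f_arg) and is_subtype(f_result, g_result)


specific = ("xs:decimal", "xs:integer")  # function(xs:decimal) as xs:integer
general = ("xs:integer", "xs:decimal")   # function(xs:integer) as xs:decimal

specific_fits_general = function_is_subtype(specific, general)
general_fits_specific = function_is_subtype(general, specific)
```

Under these assumptions a function accepting xs:decimal and returning xs:integer is an instance of both function types, which is why a single function item can belong to more than one specific function type.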

ACTION QT4CG-012-04: MK to add “and also of other specific function types” to 2.8.

  • MK: The data model tended to assume that all types are named; I’ve added a few notes about anonymous types.
  • DN: Maybe say they don’t have “permanent names”
  • MK: If they do have names, the names aren’t exposed.
  • DN: We don’t have a “type object” in the data model; I think we should add one.
  • RD: In the formal semantics for XPath and XQuery 1.0, there was an XPath- and XQuery-like formulation of the XML schema data model.
  • MK: I haven’t looked at that spec for a very long time; I’ll take another look.
  • MSM: I have one concern which is that I seem to remember that in the early days of the QT work, the formalists said that it complicates things too much if we say that types sometimes have a name and sometimes don’t, so we’re just going to assume they always do. So my concern about this note is that we don’t want to introduce a pervasive problem.
  • MK: That’s scattered about a bit, but nowhere do we explicitly say that.
  • MSM: As we work through the specs, everyone please keep an eye open for ramifications.

ACTION QT4CG-012-05: NW to see if the presentation of the type hierarchy tables can be improved.

  • RD: Should we be specifying the value space of the untyped- and any-typed types?
  • MK: Yes, probably. I haven’t made any changes in that area.
    • … There’s an error here, xs:untyped is not an atomic type.

ACTION QT4CG-012-06: MK to remove xs:untyped from the paragraph at the top of 2.8.2

  • MSM: To address RD’s concern, I don’t know what the value space of untypedAtomic is. The value space that seems obvious to me is that untypedAtomic has the union of the value spaces of all the primitive types. That was explicitly suggested by the schema working group, but the QT group said “nah”.
  • RD: Wouldn’t untypedAtomic be better being the same value space as xs:string?
  • MSM: That’s ok for the lexical space, but for the value space, shouldn’t the assignment of a more specific type be a refinement? Typing an xs:string as an xs:integer is not a refinement.
  • RD: But the infoset-mapping section says that an attribute or text value has a “data” type that’s of xs:untypedAtomic. It doesn’t make sense there to say the union of all possible types.
  • MK: We always describe moving from an untypedAtomic to something else as “casting”.
  • RD: Also, the casting rules treat xs:string and xs:untypedAtomic identically.
  • MK: They’re identical for nearly all purposes except for the behavior of the coercion rules.

Proposal: accept this PR.

Accepted.

2.2. Review pull request #237: XSL conditionals

See pull request #237

  • MK: I did a revised PR that was responsive to the actions I was given, but also went a little further. It builds locally but not on the server.

ACTION QT4CG-012-07: NW to work with MK to sort out the server build issues with PR #237

2.3. Review pull request #247

See pull request #247 which resolves MK actions QT4CG-011-01 and QT4CG-011-03

  • MK: The first is completely trivial, just a one word change.

On review, these are just markup fixes.

Proposal: Accept this PR.

Accepted.

2.4. Review pull request #249: fn:items-at, issue #213

See pull request #249

  • MK reviews fn:items-at()
    • … One doubt I had is that the most common use is going to be a single number, so it’s not clear if item-at would be better. But I used the plural.
    • … The use cases for this are primarily where the square bracket predicate notation gives you problems with the context item.
    • … The use cases for returning multiple items are sometimes equivalent to subsequence
    • … One point of detail is whether this should be like substring and subsequence and take xs:double instead of integers.
    • … We had a big debate about this with respect to functions we added before. I think we got that right, the explicit cast is useful and the cases where it’s needed are quite rare.
    • … If you use xs:double you have to talk about what happens if it’s not an exact whole number.
  • CG: How about xs:nonNegativeInteger?
  • MK: Yes, once we get the coercion rules sorted out.
  • MSM: I can still cast to xs:integer, yes?
  • MK: Yes, if we accept the change to the coercion rules.
  • DN: I think it might be useful to allow negative integers, which take from the end, like Python.
  • DN: We should add a corresponding array function, fn:members-at()?
  • MK: If we do add negative numbers, we should add it everywhere that it’s appropriate, for example fn:remove().
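
The behaviour under discussion can be sketched in Python. This is not the text of the PR: it assumes 1-based integer positions, assumes that out-of-range positions select nothing, and the second function is a hypothetical rendering of DN's suggestion that negative integers count from the end.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def items_at(items: Sequence[T], positions: Sequence[int]) -> List[T]:
    """Return the items at the given 1-based positions, in the order requested.

    Assumption: positions outside 1..len(items) select nothing.
    """
    return [items[p - 1] for p in positions if 1 <= p <= len(items)]


def items_at_from_end(items: Sequence[T], positions: Sequence[int]) -> List[T]:
    """Hypothetical variant: a negative position counts back from the end."""
    resolved = [p if p > 0 else len(items) + 1 + p for p in positions]
    return items_at(items, resolved)


letters = ["a", "b", "c", "d"]
single = items_at(letters, [2])                  # the common single-position case
several = items_at(letters, [4, 1])              # multiple positions, in request order
out_of_range = items_at(letters, [9])            # nothing selected
from_end = items_at_from_end(letters, [-1, 1])   # last item, then first
```

Taking integers rather than doubles, as MK prefers, means the sketch never has to decide what a position such as 2.5 selects.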

ACTION QT4CG-012-08: NW to put the coercion rule changes on the agenda next week.

NW asks if there’s any possibility of harmonizing arrays and sequences for functions.

  • MK: I’ve proposed a way to parcel and unparcel arrays and sequences so that it’s easier. But it’s hard to make that nice and usable without explicit parceling.
  • RD: If we had explicit union types, we could…
  • MK: No, we couldn’t, we have this wretched problem that an array is a sequence of length one.

Returning to items-at…

  • CG: I think doubles would be better than integers because that would make it easier to rewrite predicates to this function. But you could say that was an optimization issue not something a user has to be concerned about?

The CG is not moved to make that change.

Proposal: Accept this PR

Accepted.

2.5. Review pull request #250: fn:foot, etc.

See pull request #250

  • DN: I find the names not good, especially truncate. Where head / tail make a pair, foot / truncate don’t. I proposed heel. We could also consider root / stalk. The other thing that’s more serious is that the corresponding functions for arrays will raise exceptions when the first argument is the empty sequence, but here we’re returning the empty sequence. The semantics are not similar. I think we need an additional argument whose default is to raise an exception.
  • MK: The issue of array bounds checking is a very good issue and a very difficult one. I think it’s indefensible that we have array bounds checking on arrays and not sequences. It’s a classic working group kind of artifact that introduces an enormous non-orthogonality in the language. At the same time, that’s a problem that’s enormously difficult to fix. I tend to want to keep things consistent whenever possible.
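
The inconsistency DN and MK describe can be shown with a small Python sketch. It assumes that foot returns the last item and truncate returns everything except the last item, which is an inference from the head / tail comparison rather than something stated in these minutes; all three function names are hypothetical.

```python
from typing import List, TypeVar

T = TypeVar("T")


def seq_foot(seq: List[T]) -> List[T]:
    """Sequence-style: the last item as a sequence, or empty for empty input."""
    return seq[-1:]


def seq_truncate(seq: List[T]) -> List[T]:
    """Sequence-style: everything except the last item; empty input stays empty."""
    return seq[:-1]


def array_foot(members: List[T]) -> T:
    """Array-style (hypothetical counterpart): an empty array is an error."""
    if not members:
        raise IndexError("array is empty")
    return members[-1]


foot_of_empty_sequence = seq_foot([])   # silently empty
truncated = seq_truncate([1, 2, 3])

try:
    array_foot([])
    array_raised = False
except IndexError:
    array_raised = True                 # bounds checking applies to arrays only
```

The same logical operation succeeds quietly on an empty sequence and fails on an empty array, which is the non-orthogonality MK refers to.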

Out of time.

3. Any other business

  • RD: Can folks look over my parse-html PR?
  • MK: That’s good work, and needs careful review.
  • CG: I proposed in email that we could reference issues from the QT specs with the hash symbol and a number. Then we’d get linking from PRs to issues and commits.
  • NW: Good idea.