QT4 CG Meeting 003 Minutes 2022-09-20

Minutes

Approved on 27 September.

1. Administrivia

Summary of new and continuing actions [0/6]

  • [ ] QT4CG-002-01: NW to incorporate email feedback and produce new versions of the process documents.
  • [ ] QT4CG-003-01: MK to find a way to specify fn:characters() more formally
  • [ ] QT4CG-003-02: MK to propose a reformulation of fn:index-of() in terms of fn:index-where()
  • [ ] QT4CG-003-03: NW to tweak the CSS for function signatures to avoid line breaks on - characters.
  • [ ] QT4CG-003-04: MK to rename fn:uniform() and fn:unique() to fn:all-equal() and fn:all-different(), respectively
  • [ ] QT4CG-003-05: MK to consider how fn:array-filter() could be generalized to handle array predicates on the index as well as the value.

1.1. Roll call [9/12]

Regrets: Bethan Tovey-Walsh

  • [X] Anthony Bufort (AB)
  • [X] Reece Dunn (RD)
  • [X] Christian Grün (CG)
  • [ ] Joel Kalvesmaki (JK) [x:35-]
  • [X] Michael Kay (MK)
  • [X] John Lumley (JL)
  • [X] Dimitre Novatchev (DN)
  • [X] Ed Porter (EP)
  • [ ] Liam Quin (LQ)
  • [X] C. M. Sperberg-McQueen (CSM)
  • [ ] Bethan Tovey-Walsh (BTW)
  • [X] Norm Tovey-Walsh (NW). Chair. Scribe.

1.2. Agenda

Proposal: Accept the agenda without amendments.

No objections.

1.3. Next meeting

The next meeting is scheduled for Tuesday, 27 September. Any regrets?

No regrets heard.

1.4. Approve minutes of the previous meeting

Proposal: Accept the minutes of the previous meeting as a correct record.

No objections.

1.5. Review of open action items [8/9]

(Items marked [X] are believed to have been closed via email before this agenda was posted.)

  • [ ] QT4CG-002-01: NW to incorporate email feedback and produce new versions of the process documents. Continues
  • [X] QT4CG-002-02: RD to make a triage pass over the issues. Completed

NW: Thank you!

RD: I haven’t checked if the editorial labels need to be clarified. I also didn’t add milestones or priorities as those should be decided by the community.

Nods of agreement.

DN: I also added some comparisons of array vs sequence functions.

DN: Also, I think sometimes there are too many labels. An issue marked XPath doesn’t also need to be marked XQuery.

RD: I think that makes sense, but also the converse doesn’t apply: if something is labeled XQuery, it’s not necessarily about XPath. I’m happy if XPath implies XQuery. I wonder if there are cases where things are tagged XPath that don’t fit that rule.

CSM: I can’t think of any examples either, but for the imaginable case where something might apply to only some XPath implementations, we could have both tags.

Consensus appears to be that XPath will imply XQuery and XQuery implementors will have to search the list for both tags.

RD: Should we remove the square brackets [XPath], [XQuery], etc. in the titles?

NW: I don’t think anyone needs to take an action to do it, but as we address issues, if it seems clearer we can. Conversely, if someone really wants to go through and do it, I don’t have any objections to that either.

2. Technical Agenda

2.1. fn:characters

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-characters

  • CSM: Is there a way to do this with tokenize?
  • NW: No, I don’t think so. You’d have to split on empty string and that’s an error.
  • CSM: Ah, right. I use fn:string-to-codepoints() and then fn:codepoints-to-string() on the result.
  • RD: fn:characters() also groups the Unicode combining characters, correct?
  • MK: Not when using normalization. It’s not very formally defined. We could do it by defining it in terms of string-to-codepoints and back.
  • RD: It might be worth having a note describing the behavior. I made a proposal for a string-to-graphemes function that would keep combining characters together.
  • MK: I remember the proposal, but isn’t that the same as normalizing and then splitting?
  • RD: Not for characters that don’t have a corresponding code point for the composed form. Consider, for example, an “e” with grave and ring accents: è̊
  • MK: Right.
  • DN: I want to make a general comment. I think it’s important to have functions for convenience, even if it’s possible to do it in some other way in the language. It makes programmers more productive and avoids errors.
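
As an illustration only, here is a minimal Python sketch of the behavior discussed above: splitting by code point, and RD's example where normalization cannot fold every combining character into the base letter. The helper name characters() is a stand-in chosen for this sketch, not the specified function, and NFC is assumed as the normalization form.

```python
import unicodedata


def characters(value: str) -> list[str]:
    """Split a string into one-character strings, one per code point."""
    return [chr(code_point) for code_point in map(ord, value)]


# "e" followed by COMBINING GRAVE ACCENT (U+0300) and COMBINING RING ABOVE (U+030A).
sample = "e\u0300\u030a"

# Splitting by code point separates the base letter from both accents.
parts = characters(sample)

# NFC composes "e" + grave into U+00E8, but there is no precomposed
# code point that also includes the ring, so the ring stays separate.
composed_parts = characters(unicodedata.normalize("NFC", sample))
```

Here parts has three entries and composed_parts has two, so normalizing and then splitting still does not yield a single grapheme for this input.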

ACTION QT4CG-003-01: MK to find a way to specify fn:characters() more formally

Proposal: Accept this function.

No objections.

2.2. fn:identity

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-identity

  • MK: This was originally proposed by CG.
  • MK: I have slight reservations about the name because “identity” is associated with node identity.
  • DN: And also with the fn:id() function.
  • CSM: I like calling the function fn:identity(). While I agree with Mike about the current use of the word “identity” in the current specifications, I think that sometimes misleads readers!

Proposal: Accept this function.

No objections.

2.3. fn:index-where

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-index-where

  • MK: This is a generalization of fn:index-of() with more complex predicates and it can process sequences of things other than atomic values.
  • DN: This function is not very useful if it is applied to an array, especially if the array has members that are arrays themselves.

    … maybe we should note that there are no functions for searching in arrays and maps.

  • JL: DN, are you suggesting that we have a set of searching higher-order functions?
  • DN: I don’t understand.
  • JL: When you want to do something like fn:index-where() on a structured array, you’re invoking the possibility of recursive application in certain parts. You’d want to know not that it was just in elements 4, 9, and 12, but that it was item 7 in the 4th top-level member, etc.
  • DN: Yes. In the email thread for this function, I gave an example.
  • MK: We have a separate issue open on trying to define functions for deep search of a hierarchic structure. This function isn’t meeting that requirement, but a simpler one. You could have an exactly analogous function that does a shallow search of an array. The deep search is harder to specify. Out of scope for this function.
  • RD: I wonder if we should describe the definition of fn:index-of() in terms of fn:index-where().
  • MK: We could do that. It needs a bit of examination because of exactly what the equality semantics of fn:index-of() are, given that it takes a collation as an argument.

ACTION QT4CG-003-02: MK to propose a reformulation of fn:index-of() in terms of fn:index-where()

  • JL: Sometimes you want both the item itself and the index.
  • MK: Sometimes you want all before, or all after, or grouping…
  • CSM: The first thing that occurs to me is to filter on it, but we already have filtering.
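
As an illustration only, a minimal Python sketch of the relationship discussed above: a predicate-based position search, and an index-of style lookup expressed in terms of it. The names index_where() and index_of() are stand-ins for this sketch; positions are assumed to be 1-based, and the collation argument MK mentions is deliberately ignored.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def index_where(items: Iterable[T], predicate: Callable[[T], bool]) -> list[int]:
    """Return the 1-based positions of the items that satisfy the predicate."""
    return [pos for pos, item in enumerate(items, start=1) if predicate(item)]


def index_of(items: Iterable[T], search: T) -> list[int]:
    """Equality search written as a special case of index_where().

    Assumption: plain == stands in for the collation-aware comparison.
    """
    return index_where(items, lambda item: item == search)


# The search is shallow: nested lists are tested as whole items.
nested_hits = index_where([[1, 2], [3], [4, 5, 6]], lambda member: len(member) > 1)
```

The last line shows the shallow behavior: a nested member is reported by its top-level position only, which is the limitation DN and JL raise.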

Proposal: Accept this function.

No objections.

2.4. fn:in-scope-namespaces

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-in-scope-namespaces

Some discussion of the poor formatting in the published specification.

ACTION QT4CG-003-03: NW to tweak the CSS for function signatures to avoid line breaks on - characters.

  • MK: This is the function we would have had in place of the existing namespace functions, if we’d had maps from the start. I’ve reformulated the existing functions (fn:in-scope-prefixes() and fn:namespace-uri-for-prefix()) in terms of these.
  • RD: The reformulation looks fine to me.
  • MK: To make everyone aware, in defining the signatures, I’ve made use of the proposed capability to define union and enum types locally.
  • CSM: Looking ahead, is that a change to the type system or just to the way we document things?
  • MK: It’s a change to the type system that doesn’t change the value space. It adds types that partition the value space in a different way.
  • CSM: Even if we didn’t adopt that change to the type system, you could say we’re using it in signatures anyway.
  • MK: Yes, as a documentation convention.

Proposal: Accept this function.

No objections.

2.5. fn:is-NaN

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-is-NaN

  • CG: We could use fn:not-a-number() instead, so we don’t have the abbreviation “NaN” in the name.

Some discussion. General agreement seems to be that “NaN” is sufficiently well known as a term of art.

  • MK: While we’re discussing this, should we also have fn:NaN() that returns NaN?
  • CSM: IEEE defines a zillion forms of NaN and I thought that XSD tried to preserve that by specifying that NaN != NaN.
  • MK: In our value space, in both XSD and XDM, the value space of xs:double includes only one NaN and xs:float has a different NaN. But we don’t have all the other forms of NaN.

MK, JL, CSM observe that the different forms do become apparent in the EXPath binary extensions.

  • JL: Is fn:is-NaN() equivalent to a “castable as” test against a numeric type?
  • MK: No, I think it’s equivalent to not($value = $value).
  • MK: What about fn:NaN()?
  • NW: I was in favor until you pointed out that there are two different NaNs, one for double and one for float!
  • RD: Casting a float or a double from the string "NaN" returns the corresponding NaN value, doesn’t it?
  • MK: No, it returns a failure, but the fn:number() function does.

Correction: Casting "NaN" to xs:float or xs:double works as RD suggests. It is an error to attempt to cast something like "junk" to a float or double, but fn:number("junk") returns NaN.

  • JL: So if we have the fn:number() function and a way of creating NaN, do we need the function?
  • MK: I suppose not.
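
As an illustration only, a minimal Python sketch of the points above, using Python floats as a loose analogue of xs:double. The helpers is_nan() and number() are stand-ins for this sketch; Python's float parsing accepts some strings that XPath would not, so number() mirrors only the "unparseable input yields NaN" behavior.

```python
import math


def is_nan(value: float) -> bool:
    """NaN is the only value that is not equal to itself."""
    return value != value


def number(text: str) -> float:
    """Return NaN instead of raising when the text is not a number."""
    try:
        return float(text)
    except ValueError:
        return math.nan


# Converting the literal string "NaN" succeeds and yields NaN directly.
from_literal = float("NaN")

# Unparseable input only yields NaN through the forgiving helper.
from_junk = number("junk")
```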

Proposal for fn:NaN() fades away.

Proposal: Accept the fn:is-NaN() function.

No objections.

2.6. fn:highest, fn:lowest

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-highest

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-lowest

  • MK: This is in the example of user written functions in the 3.1 spec, but I found it sufficiently useful that it’s worth including. It’s modeled on fn:sort in that it has three variants.
  • DN: This can be generalized to a function that returns all the items at a given rank, where highest is rank 1.
  • RD: I think we’d need a separate proposal for that function.
  • MK: Note that I’ve made it consistent with fn:min() and fn:max() rather than fn:sort(), so the details are important.
  • CG: We’ve implemented it and it worked for us.
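
As an illustration only, a minimal Python sketch of what such functions might look like. It assumes that every item sharing the extreme key value is returned, in input order, and it omits the collation variant; the names highest() and lowest() and the key parameter are stand-ins for this sketch.

```python
from typing import Any, Callable, Iterable, TypeVar

T = TypeVar("T")


def highest(items: Iterable[T], key: Callable[[T], Any] = lambda item: item) -> list[T]:
    """Return every item whose key equals the maximum key, in input order."""
    materialized = list(items)
    if not materialized:
        return []
    top = max(key(item) for item in materialized)
    return [item for item in materialized if key(item) == top]


def lowest(items: Iterable[T], key: Callable[[T], Any] = lambda item: item) -> list[T]:
    """Return every item whose key equals the minimum key, in input order."""
    materialized = list(items)
    if not materialized:
        return []
    bottom = min(key(item) for item in materialized)
    return [item for item in materialized if key(item) == bottom]
```

For example, lowest(["pear", "fig", "kiwi", "yam"], key=len) keeps both three-letter names rather than picking one.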

Proposal: Accept these functions.

No objections.

2.7. fn:uniform, fn:unique

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-uniform

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-unique

  • MK: I find these really useful in assertions. Not something you need every day but useful when they are needed and possibly faster than a user-defined version.
  • JL: What about finding duplicates?
  • MK: That was a follow-on proposal, inspired by these, but it hasn’t been proposed.
  • DN: The name “uniform” doesn’t tell me what the function returns as true or false; it would be more obvious if it was something like, “contains-single-value” or something.
  • MK: “all-equal”?
  • DN: Have we talked about a similar function for arrays?
  • MK: That’s equally applicable.
  • CSM: I like “all-equal” in part because it tells you up front that you’re testing for equality not identity. I have the opposite problem with “distinct” because I think of identity not equality!
  • MK: And “all-different”?
  • CSM: Regardless of the name, the summary included should call out “equality” explicitly not just “distinct”.
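
As an illustration only, a minimal Python sketch of the two tests under the proposed names. It assumes plain == comparison with no collation, and that an empty sequence satisfies both tests; neither assumption comes from the discussion.

```python
from typing import Iterable, TypeVar

T = TypeVar("T")


def all_equal(items: Iterable[T]) -> bool:
    """True when every item compares equal to the first one."""
    materialized = list(items)
    return all(item == materialized[0] for item in materialized[1:])


def all_different(items: Iterable[T]) -> bool:
    """True when no two items at different positions compare equal."""
    materialized = list(items)
    return all(
        first != second
        for position, first in enumerate(materialized)
        for second in materialized[position + 1:]
    )


# Two separate list objects with equal contents: equal, but not identical.
equal_not_identical = all_equal([[1], [1]])
```

The last line reflects CSM's point that the test is about equality rather than identity.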

Proposal: rename the functions fn:all-equal() and fn:all-different()

No objections.

ACTION QT4CG-003-04: MK to rename fn:uniform() and fn:unique() to fn:all-equal() and fn:all-different(), respectively

Proposal: Accept these functions as renamed.

No objections.

2.8. map:filter

See https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-map-filter

  • MK: I’ve forgotten what the use case for map:filter() was, but I recall that it was convincing. It returns a sub-map and you can filter on the key or the value.
  • JL: My first reading of the examples is that they were the same. Maybe change it so you get different results?
  • CSM: Make the second one filter on six character names?
  • CSM: I can think of a use case. For some algorithms you have to build a big-big map and once you’re done you can throw away two-thirds of it. This would be a way to do that.
  • CG: We already had a BaseX implementation.

[Someone, CG?] observes that the return type is wrong in the function signature. It should be map()* not item()*.

  • DN: There is a function array:filter() that does something different. That might be confusing. It might be good to have for array what map:filter() does. The filtering for array only applies the predicate to the elements of the array, not to the indexes. Maybe think about reconciling them?
  • CSM: If I want to filter on the array index as well as the array value, presumably all I have to do is call map:filter() on the array because arrays are maps.
  • MK: Arrays aren’t maps. They’re both functions. The reason we did that was because we decided, for better or worse, that we didn’t want sparse arrays. If you remove the third item from an array, the fourth item becomes the third, it doesn’t remain “4”.
  • RD: Do we have an array function to get a map of the indexes and the corresponding values?
  • MK: Not as a single shot function.
  • DN: Even the array:for-each-pair() function could benefit from making the index available. We can enrich array functionality by giving access to the index.
  • RD: If we do that then currently as specified would that work? Or would we need the optional/default value in the supplied function?
  • MK: We’d require more flexibility with arities and what function coercion rules do, it could get complicated…it depends on enhancements we haven’t made yet.
  • RD: I think it’d be worth a proposal for that, to look at it.
  • CSM: The big problem I see there is, in every other case, we have the pattern that the single arity function takes an argument, the two argument function takes that and an additional one, etc. The argument at position 1 is constant. It would be backwards from the way map:filter() works and that looks like a real usability problem!
  • RD: The other possibility is to keep the array functions as defined but have equivalent with-position variants.
  • MK: fn:array:filter2()
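
As an illustration only, a minimal Python sketch of the two shapes discussed above: a map filter whose predicate sees both key and value, and a hypothetical with-position array filter. Both helper names are stand-ins for this sketch. The (member, position) argument order is an arbitrary choice here; it is the reverse of the key-first order in map_filter(), which is the kind of mismatch CSM is concerned about.

```python
from typing import Any, Callable


def map_filter(mapping: dict, predicate: Callable[[Any, Any], bool]) -> dict:
    """Return the sub-map of entries for which predicate(key, value) holds."""
    return {key: value for key, value in mapping.items() if predicate(key, value)}


def array_filter_with_position(
    members: list, predicate: Callable[[Any, int], bool]
) -> list:
    """Hypothetical variant: predicate(member, position) with 1-based positions."""
    return [
        member
        for position, member in enumerate(members, start=1)
        if predicate(member, position)
    ]


# Dropping the third member renumbers what follows: "z" moves from 4th to 3rd.
remaining = array_filter_with_position(
    ["w", "x", "y", "z"], lambda member, position: position != 3
)
```

The renumbering in the last example is why, as MK notes, an array cannot simply be treated as a map keyed by index.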

Proposal: Accept this function?

  • CSM: Because of the alignment issues with array:filter(), I’d kind of like to leave this one open. Or is that unhelpful?
  • NW: I’m ok with that.

ACTION QT4CG-003-05: MK to consider how fn:array-filter() could be generalized to handle array predicates on the index as well as the value.

3. Any other business

  • DN: We’ve had new functions proposed; this list should be updated.
  • RD: Once we’ve gone through the functions that have already been added, we should go back to the issues.

Some discussion; general agreement seemed to be that it was better to do the functions currently defined in the draft first, then come back and review functions proposed more recently.

  • DN: Okay, but we should be on the lookout for dependencies, if currently drafted functions would be better specified in terms of newer proposals, for example.
  • JL: I’d like to congratulate NW on the quality of the minutes.

4. Adjourned