@qt4cg statuses in 2023

This page displays status updates about the QT4 CG project from 2023.

See also recent statuses.

Issue #920 created #created-920

30 Dec at 12:55:53 GMT
The rules for the "tail position" of a sequence constructor need to take account of xsl:switch

Under xsl:iterate, there are rules defining what it means for an instruction to be in a tail position in a sequence constructor. In these rules xsl:switch should be treated the same way as xsl:choose.

Issue #919 created #created-919

28 Dec at 16:14:49 GMT
Should predicate callbacks use EBV?

Currently predicate callback functions used by things like fn:filter have to return a boolean; they can't rely on EBV semantics. For example you have to write fn{boolean(self::p)} rather than fn{self::p}.

It's probably with self tests that this is most noticeable, because self is so often used in a boolean context.

Of course the current rule gives stricter type-checking which will presumably catch some user errors. But it seems an unnecessary inconsistency.
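To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is a model rather than spec text: sequences are represented as Python lists, the Node class and the names ebv, filter_strict, filter_ebv and self_p are hypothetical, and only the main effective boolean value rules are covered.

```python
import math

class Node:
    """Hypothetical stand-in for an XDM node."""
    def __init__(self, name):
        self.name = name

def ebv(seq):
    """Simplified effective boolean value of a sequence (a Python list)."""
    if not seq:
        return False                       # empty sequence
    if isinstance(seq[0], Node):
        return True                        # first item is a node
    if len(seq) == 1:
        item = seq[0]
        if isinstance(item, bool):         # check bool before int
            return item
        if isinstance(item, str):
            return len(item) > 0
        if isinstance(item, (int, float)):
            is_nan = isinstance(item, float) and math.isnan(item)
            return not (item == 0 or is_nan)
    raise TypeError("EBV is not defined for this sequence")

def filter_strict(items, predicate):
    """Current rule: the callback must return exactly one boolean."""
    kept = []
    for item in items:
        result = predicate(item)
        if not (len(result) == 1 and isinstance(result[0], bool)):
            raise TypeError("predicate must return a single boolean")
        if result[0]:
            kept.append(item)
    return kept

def filter_ebv(items, predicate):
    """Alternative rule: the callback result is reduced with EBV."""
    return [item for item in items if ebv(predicate(item))]

def self_p(node):
    """Models self::p: a sequence holding the node if it is named 'p'."""
    return [node] if node.name == "p" else []
```

Under the strict rule, self_p has to be wrapped (the equivalent of boolean(self::p)); under the EBV rule it can be passed directly.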

Pull request #918 created #created-918

27 Dec at 03:27:33 GMT
Minor cx through chap. 14
  • fn:splice examples expanded to illustrate integer steps other than 1.
  • fn:unparsed-text-lines updated to reflect recent decisions about line handling in fn:unparsed-text.
  • Other clarifications or corrections.

Let me know if any of these edits are misfires.

Issue #917 created #created-917

22 Dec at 19:41:46 GMT
Better support for typed maps

Edit (2024-01-04): See https://github.com/qt4cg/qtspecs/issues/917#issuecomment-1875712638 for the most promising suggestion resulting from the discussion in this thread.


Inspired by #720 and concerns regarding usability and performance, it may be a big step, but couldn’t we define records as subtypes of maps?

  • The main difference would be that updates on records are only allowed as long as the resulting map matches a record definition.
  • This would allow us to return much better error messages, and to prevent users from deconstructing their own data structures.
  • We could still benefit from the existing map functions… provided that we believe it's an advantage. A stricter solution would be to disallow optional map entries completely (and treating records as a separate type).
  • From a technical point of view, data with a fixed structure can be optimized much better than a structure that changes dynamically.
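The first bullet can be sketched in Python as follows. This is only an illustration of the idea, not a proposed design: RecordType, matches and record_put are hypothetical names, maps are modelled as dicts, and field types are plain Python types.

```python
class RecordType:
    """Hypothetical record definition: field name -> (Python type, optional?)."""
    def __init__(self, fields):
        self.fields = fields

    def matches(self, entries):
        # Every entry must be a declared field holding a value of the right type.
        for key, value in entries.items():
            if key not in self.fields:
                return False
            if not isinstance(value, self.fields[key][0]):
                return False
        # Every non-optional field must be present.
        return all(optional or name in entries
                   for name, (_, optional) in self.fields.items())

def record_put(record_type, entries, key, value):
    """Return an updated copy, but only if it still matches the record type."""
    updated = {**entries, key: value}
    if not record_type.matches(updated):
        raise TypeError(f"update of {key!r} would violate the record definition")
    return updated
```

Because the failure is raised at the point of the update, the error message can name the offending field, which is the usability benefit the second bullet refers to.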

Pull request #916 created #created-916

21 Dec at 17:15:17 GMT
720 Allow methods in maps with access to $this

This proposal allows functions within maps to access the containing map using the variable $this.

The proposal needs editorial work to integrate it fully into the text, but it is intended to be sufficiently complete to enable a full technical review.

Fix #720

Issue #915 created #created-915

21 Dec at 11:45:54 GMT
[Editorial] Incorrect terminology: function implementation is now function body

In 4.6.2.5 Inline Function Expressions we refer to the "implementation" property of a function item; this property has been renamed as the "function body".

Pull request #914 created #created-914

21 Dec at 04:20:17 GMT
XQFO minor edits

From reading up through chap. 13. The change in the title of chap. 10.2 is more accurate, and avoids the repetition of the title of chap. 10 itself.

Issue #913 created #created-913

20 Dec at 23:00:56 GMT
XQFO: under/unused variable apparatus

In the preamble of XQFO chap. 13, several paragraphs are spent introducing a tree structure example, defining variables $po, $item1, $item2, and $item3. The prose leads the reader to expect frequent invocation of this tree example, but chap. 13 never uses it.

Chapter 13's tree example is referred to (sans link) and summarized in the preamble of chap. 14. In that chapter it is used only once, in a fn:count example that really doesn't rely upon anything special in the tree example.

Variables $item1 and $item2 are invoked only once more: in chap. 4, for fn:number. It's rather out of the blue, because neither the function definition nor chapter 4's preamble say anything about what the variables mean.

My recommendation is to drop this material altogether, and to replace the fn:count and fn:number examples with simpler ones.

OTOH, I might have come across an incomplete implementation, and the editors might prefer to make more thorough use of this tree example. I don't know.

Pull request #912 created #created-912

20 Dec at 12:13:38 GMT
XQFO: Minor edits

Editorial; examples fixed

Issue #297 closed #closed-297

19 Dec at 18:24:56 GMT

Lookup in deeply nested JSON, an abbreviated syntax for map:find

Issue #20 closed #closed-20

19 Dec at 16:37:57 GMT

Highlight EBNF grammar differences in the diff versions of the specs

Issue #51 closed #closed-51

19 Dec at 16:37:51 GMT

Generalize lookup operator for function items

Issue #705 closed #closed-705

19 Dec at 16:37:45 GMT

Function Coercion: Function Arities

Issue #707 closed #closed-707

19 Dec at 16:37:40 GMT

Dynamic Function Calls: Processing Empty Sequences

Issue #892 closed #closed-892

19 Dec at 16:36:36 GMT

XPDY0002: Misleading examples

Issue #903 closed #closed-903

19 Dec at 16:36:34 GMT

892 XPDY0002: Misleading examples

Issue #902 closed #closed-902

19 Dec at 16:33:41 GMT

900 fn:sort, array:sort: Parameter names

Issue #900 closed #closed-900

19 Dec at 16:33:41 GMT

fn:sort, array:sort: Parameter names

Issue #894 closed #closed-894

19 Dec at 16:30:43 GMT

Errors in forming function items

Issue #897 closed #closed-897

19 Dec at 16:30:42 GMT

894 - errors in forming function items

Issue #866 closed #closed-866

19 Dec at 16:27:14 GMT

fn:sort, and XSLT and XQuery sorting, should use transitive comparisons

Issue #881 closed #closed-881

19 Dec at 16:27:13 GMT

866 Introduce and exploit new numeric-compare() function

QT4 CG meeting 059 draft minutes #minutes-12-19

19 Dec at 16:15:00 GMT

Draft minutes published.

Issue #911 created #created-911

18 Dec at 12:44:28 GMT
Type "Promotion" in the coercion rules

I have an open action: QT4CG-052-06: MK to consider the editorial question of “promotion” for the symmetric relations.

I think the point that led to this was the fact that the word "promotion" seems inappropriate for cases like (string/uri) where the implicit conversion can take place in either direction.

I'd like to propose a fix to this that is not merely editorial. I propose that we allow any cast from one numeric type to another in the coercion rules. For example, if the required type is decimal, then a double or float can be supplied. Since, for many implementations of xs:decimal, this can be done losslessly, it makes at least as much sense to convert from double to decimal as from decimal to double.

The word "promotion" is in fact used (in relation to the coercion rules) only in a table heading, and we can change this heading to "implicit casting".

Appendix B.1 currently says:

B.1 Type Promotion

[Definition: Under certain circumstances, an atomic value can be promoted from one type to another.] Type promotion is used in a number of contexts:

  • It forms part of the process described by the [coercion rules], invoked for example when a value of one type is supplied as an argument of a function call where the required type of the corresponding function parameter is declared with a different type.
  • It forms part of the process described in [B.2 Operator Mapping], which selects the implementation of a binary operator based on the types of the supplied operands.
  • It is invoked (by explicit reference) in a number of other situations, for example when computing an average of a sequence of numeric values (in the fn:avg function).

and I suggest we retain the term only for the second case, operator mapping. This differs from the coercion rules in that there are two operands and the effect is always to convert one to the type of the other. This affects numeric types only (not string/uri or binary), and it will continue to promote decimal to double, decimal to float, and float to double.

Where functions (fn:avg, fn:sum, math:pow) refer to the promotion rules, I suggest that we spell out the conversions that happen explicitly, since it's not entirely obvious how the rules should be extrapolated. (For example, fn:avg doesn't make it entirely clear what should happen if the first item is a decimal, the second is a float, and the third is a double. Are you expected to "look ahead" to see what types are present, rather than evaluating the average incrementally?)
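The "look ahead" question is not purely academic, because the two strategies can give different answers. The following Python sketch illustrates this under simplifying assumptions: xs:decimal is modelled by decimal.Decimal, xs:double by float, xs:float is left out (Python has no native 32-bit float), and the function names are hypothetical.

```python
from decimal import Decimal

def _add(total, value):
    """Add two values, promoting the decimal operand to double if types differ."""
    if isinstance(total, Decimal) and isinstance(value, float):
        total = float(total)
    elif isinstance(total, float) and isinstance(value, Decimal):
        value = float(value)
    return total + value

def incremental_avg(values):
    """Promote only when a double is actually encountered during the sum."""
    total = values[0]
    for value in values[1:]:
        total = _add(total, value)
    return total / len(values)

def lookahead_avg(values):
    """Inspect all types first; if any double is present, promote everything."""
    if any(isinstance(v, float) for v in values):
        values = [float(v) for v in values]
    return incremental_avg(values)

values = [Decimal("0.1"), Decimal("0.2"), 0.0]
# Incremental: 0.1 + 0.2 is summed exactly as a decimal before promotion.
# Look-ahead: 0.1 and 0.2 are promoted first, so the sum carries
# binary rounding error.
print(incremental_avg(values) == lookahead_avg(values))
```

In this model the two results are not equal, so a specification that leaves the strategy open also leaves the result open.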

QT4 CG meeting 059 draft agenda #agenda-12-19

18 Dec at 11:00:00 GMT

Draft agenda published.

Issue #910 created #created-910

18 Dec at 02:45:17 GMT
Introduce a Kollection object with functions that operate on all types of items that can be containers of unlimited number of "members"

The base for this issue is the email sent by @ndw to public-xslt-40@w3.org on Dec. 13th 2023, fully quoted below:

Hello all,

After a couple of weeks of discussion[1][2] about naming things, there seem to be some quite different perspectives on the problem.

As background, let’s remember that we have a language (or a set of languages) that evolved over time. We couldn’t anticipate in version 1.0 what we would have in 4.0. We added new features in 2.0 and 3.0 that weren’t anticipated in previous versions either.

We live with decisions (some the result of long and hard battles within the working group(s)) like the fact that sequences don’t nest so all individual items are also sequences of length one.

The context for each addition to the language has been roughly: how can we add new, useful features with a minimum of backwards incompatibility.

It’s a natural consequence of this sort of evolution that there are rough edges. Why does fn:count return the number of items in a sequence but always return 1 if the argument is an array? Because an array is an item and an item is a sequence of length one.

(It doesn’t help that the vision of what the X* languages should be has changed over time. What started out envisioned as a tool for transforming documents from one format to another for presentation on the web or in print has grown into something that at least some members of the group view as first class, functional programming languages. That’s not bad, but it puts entirely different stresses on the design, I think.)

As we add new functions (specifically, in the case of recent discussions, but I expect the same perspectives apply more generally), I think one perspective is roughly this:

How can we name and organize the functions so that users are least likely to be surprised and most likely to be able to figure out how to solve a particular problem?

Taken to an extreme, this perspective isn’t about changing the semantics of the functions at all, it’s “just” about naming them. Is fn:get() better (easier to understand, less confusing) than fn:items-at?

I think another perspective is roughly this:

We have a messy design. It would be better if we could refactor the design so that it was more harmonious and logical. We don’t need four different, closely related functions to get items out of different sorts of data structures, we need a set of abstractions that make it obvious that only one function is necessary.

Taken to an extreme, this perspective is about reshaping the whole language so that a single, obvious set of function names emerges naturally from the carefully constructed abstractions.

I don’t think anyone holds exactly one perspective (discussions about renaming often involve some level of discussion about semantics, for example) and I’m attempting to polarize the perspectives a little bit in an effort to shine light on a larger problem, not to be divisive.

With my chair’s hat on, the main problem I see with the first perspective is that naming is hard, often personal and emotional, and will never be wholly logical (so there will always be more to discuss, so the “problem” is never resolved). It’s not quite fair to say it’s a distraction from the “bigger” issues we need to resolve, but it does take a lot of time.

I see the appeal of the second perspective. If we had a green field, we’d do things differently. I think we might all agree that, ideally, fn:count should return the number of items in a sequence, the number of items in an array, and the number of key-value pairs in a map. But it doesn’t and it can’t without fundamentally breaking things. I don’t think we’d get agreement to break fn:count, so what can we do?

A proposal to fundamentally redesign the data model would be a tough sell, I think.

One thing we could do is define a new namespace “gn” with functions that work more logically, that treat sequences, arrays, and maps, as collections and operate on them uniformly.

I suppose we could reconstruct the whole set of functions in this new namespace and focus our efforts there, perhaps going so far as to deprecate the current fn: namespace in favor of this new one. But could we get consensus to do that? Would users thank us?

I dunno. Innovations welcome.

                                    Be seeing you,
                                      norm

This issue addresses the 2nd alternative formulated briefly by Norm as:

One thing we could do is define a new namespace “gn” with functions that work more logically, that treat sequences, arrays, and maps, as collections and operate on them uniformly.

I suppose we could reconstruct the whole set of functions in this new namespace and focus our efforts there, perhaps going so far as to deprecate the current fn: namespace in favor of this new one. But could we get consensus to do that? Would users thank us?

Here are some of the obvious advantages of having a uniform Kollection concept that covers arrays, sequences, maps, ... and possibly future new, specific, collection-like datatypes such as sets:

  1. Uniform definition and understanding of a single data type - the Kollection.

  2. O(N) functions only, compared to O(M * N) at present. Here N is the number of functions needed for each of the current collection-like data types (Arrays, Sequences and Maps) and M is the number of collection-like data types (currently 3).

  3. The users will need to know about and understand just the single Kollection data type and its functions, not 3 or more similar collection-like data types and 3 or more sets of similar (but different) functions. Reducing by a factor of 3 the amount of factual knowledge that a user needs is something HUGE and extremely positive.

  4. Allowing users to say "Good Bye" to the unclear and treacherous flat-sequence concepts we have as legacy from XPath 1.0.

  5. Staying aligned to the examples of other modern programming languages such as C# with its IEnumerable interface. It is good to know that this has already been done in other shining programming languages, thus a nay-sayer will not be able to argue that this is not doable or, if done, would be negative to the language and its users.

  6. Freeing enormous resources and time for the members of the Community Group so that they can spend these on more valuable work than trying to find the best matching names for M similar functions, one for each of the M current collection-like data types.

Now, to dispel some plausible myths before they start circulating here:

  1. Myth 1: This will break backwards-compatibility? No, as proposed by Norm, all the functions operating on the generalized collection data type can be in a separate, new namespace and thus no existing user-code is affected.

  2. Myth 2: If a sequence containing a single Kollection still has count() of 1, then what is the use of the Kollection data type? Actually, as proposed by Norm, the Kollection data type and its functions reside in their own namespace. Doing things using only functions from this new namespace eliminates the possibility of using fn:count as it resides in the different, currently existing standard function namespace.

  3. Myth 3: This will be too complex for the users and the users will not embrace it, so let us not waste time designing it. Wow, there were such prophets saying exactly the same about LINQ in 2005. As often happens, the future proved them wrong. Users clearly and overwhelmingly "voted with their code" incorporating LINQ in almost all everyday applications and code repositories.

  4. Myth 4: Banning the current functions operating on sequences, arrays and maps would be a huge burden to the users, and would interfere with their programming. In fact, nobody would be banning any of the existing functions. Users can continue to use them forever. The acceptance of the uniform and generalized Kollection data type can happen gradually with time, as was the case with the addition of LINQ to C#.
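The contrast between today's fn:count and a uniform count can be sketched in Python. Everything here is a hypothetical model: XArray stands in for an XDM array, a Python list for a sequence, a dict for a map, and k_count is an invented name for a count function in the new namespace. One caveat: in XDM a single item and a sequence of length one are the same thing, which this model cannot reproduce, and that is precisely the difficulty raised in Myth 2.

```python
class XArray:
    """Hypothetical stand-in for an XDM array: one item holding members."""
    def __init__(self, *members):
        self.members = list(members)

def fn_count(sequence):
    """Models fn:count: the number of items in a sequence.

    A lone array or map is a single item, so it counts as 1.
    """
    if isinstance(sequence, list):
        return len(sequence)
    return 1

def k_count(value):
    """Hypothetical uniform count over sequences, arrays and maps."""
    if isinstance(value, XArray):
        return len(value.members)     # array members
    if isinstance(value, dict):
        return len(value)             # key-value pairs
    if isinstance(value, list):
        return len(value)             # sequence items
    return 1                          # any other single item
```

With these definitions, fn_count(XArray(1, 2, 3)) gives 1 while k_count(XArray(1, 2, 3)) gives 3, and existing callers of fn_count are unaffected.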

Pull request #909 created #created-909

17 Dec at 14:04:12 GMT

893 fn:compare: Support for arbitrary atomic types

Issue #908 created #created-908

17 Dec at 11:57:56 GMT
Function identity: documentation, nondeterminism

In #520, the concept of function identities was introduced. This is what the current draft says:

XDM, 2.9.4 Function Items

identity: an abstract property that can be used to test whether two variables refer to the same function or to different functions. This property is exposed only for this purpose.

Note: Currently, the concept of function identity is used for two purposes: firstly, when functions appear in the arguments supplied to the fn:deep-equal function; and secondly, in establishing whether the arguments and results of a function are "the same" when deciding whether the function is deterministic.

Note: Function identity is not currently defined for maps and arrays, because in the circumstances where function identity would otherwise be used, maps and arrays are compared by examining their content.

XQFO, 1.8.4 Properties of functions

  1. […] the two function items have the same function identity. The concept of function identity is explained in Section 2.9.4 Function Items.

XQFO, 14.2.8 fn:deep-equal

c. $i1 and $i2 have the same function identity. The concept of function identity is explained in Section 2.9.4 Function Items.

XQFO, 17.1.1 fn:function-lookup

The function identity is determined in the same way as for a named function reference. Specifically, if there is no context dependency, two calls on fn:function-lookup with the same name and arity must return the same function.

While I definitely believe in the concept, I believe the documentation is still cryptic, or even impossible, to understand, at least without reading #520 or consuming the existing QT4 test cases. Here are some questions that I’m trying to answer:

  • Does “abstract property” mean that the property will not be materialized in an implementation, or does it mean that the property is too vague to be precisely defined?
  • We should try to specify what “refer to the same function” means. Are function properties that allow us to at least safely identify a subset of functions the same? For example, will true#0 and true#0 always be identical? The test cases imply this, whereas #520 doesn’t.
  • In XQFO 17.1.1, there’s a hint that context-dependency influences the decision if functions are identified as equal. Does this mean that name#0 and name#0 cannot be equal? Or can, or will, they be equal if the context is identical?
  • The term “deterministic” does not appear anywhere else in the XDM spec, so one is inclined to think of the XQFO nondeterminism. It is then unclear whether two instances of, for example, map:entries and fn:parse-xml can be “the same” if the parameters are the same.
  • In XQFO 17.1.1, “must return the same function.” is also unclear: What exactly is meant by “same function”? Is it a function that creates an identical result (thus, excluding nondeterministic functions like fn:parse-xml)?
  • We should explain better what was the motivation to include the context-dependency of functions in the definition. It would certainly be more intuitive if both deep-equal(name#0, name#0) and deep-equal(name#1, name#1) returned true.

I’m sorry for not offering good answers in return. I could try to describe what we’ve implemented so far – mostly inspired by the test cases – but I’m not sure if it meets the requirements.

Related: #333

Pull request #907 created #created-907

16 Dec at 09:33:47 GMT

906 fn:deep-equal: unordered → ordered

Issue #906 created #created-906

16 Dec at 09:32:20 GMT
fn:deep-equal: unordered → ordered

As already suggested in https://github.com/qt4cg/qtspecs/pull/798#pullrequestreview-1709271106 (and by first user feedback), the upcoming PR renames the option unordered to ordered, with true as default.

Issue #339 closed #closed-339

15 Dec at 16:24:41 GMT

The constraints on document-uri are too...constraining

Pull request #905 created #created-905

14 Dec at 13:49:22 GMT
898 - relax the constraints on document-uri

Fix #898

Changes non-normative text for the doc() and document-uri() functions to make it clearer that the consequences of the normative rules are not quite as previously stated.

Pull request #904 created #created-904

14 Dec at 11:42:11 GMT
821 Annotations: Make default namespace explicit

In addition: Lexicographic order; formatting.

Issue #129 closed #closed-129

14 Dec at 11:22:47 GMT

Context item → Context value?

Issue #608 closed #closed-608

14 Dec at 11:21:15 GMT

Formatting Monospace (II)

Pull request #903 created #created-903

14 Dec at 09:13:42 GMT

892 XPDY0002: Misleading examples

Pull request #902 created #created-902

14 Dec at 08:18:46 GMT
900 fn:sort, array:sort: Parameter names

Closes #900. In addition, the equivalent expression for array:sort was fixed.

Pull request #901 created #created-901

13 Dec at 18:05:34 GMT
895 Parameters with default values: allow empty sequences

Closes #895

Issue #900 created #created-900

13 Dec at 16:03:14 GMT
fn:sort, array:sort: Parameter names

The sort functions now accept multiple collations, keys, and orders, and this needs to be reflected in the parameter names (which are still singular).

Issue #899 created #created-899

13 Dec at 15:31:05 GMT
Simplifying the language - types have behaviour.

I may misunderstand something but I always find the use of types and "as" to be counter intuitive (I'd prefer to be able to run an xslt 3+ script in some sort of 'strict' mode that was a bit more rigid, but thus simpler) e.g.

consider

      <xsl:variable name="foo1">
         <foo/>
      </xsl:variable>

question - what is the type of foo1? answer - (according to my saxon/oxygen setup the answer is) "document-node"

consider

      <xsl:variable name="foo2" as="element(foo)">
         <foo/>
      </xsl:variable>

question - is this code valid then (I would as someone not used to xslt 2+ assume not)? answer - yes

but surely this code is identical to foo1, so the 'type' of variable is actually changing the interpretation of the expression.

For me that's quite confusing

It would appear that these 2 values are not two different views (interfaces) of the same underlying value, else this

<xsl:variable name="foo3" as="element(foo)" select="$foo1"/>

would be valid.

It isn't (i.e. this doesn't appear to be some subtle OO style scenario where an evaluation can have multiple interfaces, here 'document-node' and 'element' are presumably disjoint types).

For me conceptually types are descriptions of expressions, they have no behaviour, yet here they appear (to me) to affect the interpretation, not simply describe it.

For me, I'd prefer a 'strict' mode where either

      <xsl:variable name="foo2" as="element(foo)">
         <foo/>
      </xsl:variable>

is a type error, because the expression is clearly a document-node OR some other mechanism to clarify the ambiguity without this conceptual wrinkle.

An expression should either have 1 interpretation, or if it's ambiguous, that should be an error; I don't think the language should default to prefer one over another.

P.S. why doesn't this work? I genuinely don't know how to explicitly declare something as a document-node.

  <xsl:variable name="foo1" as=document-node()>
     <foo/>
  </xsl:variable>

Issue #898 created #created-898

13 Dec at 10:04:04 GMT
Drop the requirement for document-uri() uniqueness

The specification of document-uri() states that the returned URI must be useable as input to the doc() function and that it always round-trips, so doc(document-uri($X)) is $X is guaranteed true.

A consequence of this rule is that you can't have two documents in the same execution scope with the same document-uri() property.

Enforcing this rule causes a lot of trouble. For example, at the API level, the user can set two stylesheet or query parameters to two different documents that are associated with the same URI. As another example, two XSLT packages in the same stylesheet can call doc() on the same URI and get different documents back because they have set different validation and whitespace-stripping options. Use of fn:transform() causes further complications when documents are passed across the boundary (in fact that's where we first encountered the problem). And collection() brings further complications.

In order to conform to the rule in the spec, we changed Saxon a while back so the only documents that are guaranteed to have a document-uri() property are those that were read using the doc() function - and even then, things like validation and whitespace variations are troublesome. This causes users much confusion, partly because of the change from earlier Saxon releases, but more because there are many situations where they would expect document-uri() to give a useful result and it doesn't.

I think we can fix this simply by removing the guarantee. That will cause less inconvenience to users than the current rule. We could perhaps modify it to say that in the case of a document returned from the doc() function, its document-uri() property will be such that a call to doc() with that URI will return the same document, but in the case of documents derived from other sources (for example collection(), or a result from fn:transform) there is no such guarantee.
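A small Python model shows why the round-trip guarantee implies uniqueness. It is a sketch under stated assumptions: Document, ExecutionScope and round_trips are hypothetical names, doc() is modelled as a per-scope cache keyed by URI, and node identity is modelled with Python's "is".

```python
class Document:
    """Hypothetical document node carrying a document-uri property."""
    def __init__(self, uri, content):
        self.uri = uri
        self.content = content

class ExecutionScope:
    """Models the stability of doc() within one execution scope."""
    def __init__(self):
        self._by_uri = {}

    def doc(self, uri, loader):
        # Repeated calls with the same URI return the same node.
        if uri not in self._by_uri:
            self._by_uri[uri] = Document(uri, loader(uri))
        return self._by_uri[uri]

def round_trips(scope, document):
    """Checks the rule: doc(document-uri($X)) is $X."""
    return scope.doc(document.uri, lambda uri: None) is document

scope = ExecutionScope()
from_doc = scope.doc("file:/a.xml", lambda uri: "<a/>")
# A second document with the same URI, e.g. supplied as a parameter
# or built with different whitespace-stripping options.
from_param = Document("file:/a.xml", "<a> </a>")
```

In this model round_trips holds for from_doc but cannot hold for from_param, so one of the two documents has to give up its document-uri property unless the guarantee is relaxed as suggested.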

Issue #887 closed #closed-887

12 Dec at 17:57:06 GMT

Trivial syntax error under "named function references"

Issue #896 closed #closed-896

12 Dec at 17:57:05 GMT

887 - fix simple typo in example

Issue #862 closed #closed-862

12 Dec at 17:54:49 GMT

Examples needed for "Implausible Expressions"

Issue #884 closed #closed-884

12 Dec at 17:54:48 GMT

862 Add explanations and examples of implausible expressions

Issue #844 closed #closed-844

12 Dec at 17:53:03 GMT

New sequence functions: names

Issue #879 closed #closed-879

12 Dec at 17:53:02 GMT

844 New sequence functions: names

Issue #875 closed #closed-875

12 Dec at 17:49:04 GMT

XQFO, chap. 9 minor edits

Issue #865 closed #closed-865

12 Dec at 17:46:03 GMT

Need to explain change in numeric comparison semantics

Issue #873 closed #closed-873

12 Dec at 17:46:02 GMT

865 Improve explanation of equality comparisons

Issue #867 closed #closed-867

12 Dec at 17:42:59 GMT

Signature notation in F+O: default values

Issue #870 closed #closed-870

12 Dec at 17:42:58 GMT

867 Explain defaults in function signatures

Issue #864 closed #closed-864

12 Dec at 17:41:36 GMT

$position argument in fold-right

Issue #742 closed #closed-742

12 Dec at 17:39:58 GMT

xsl:function-library: keep, drop, or refine?

Issue #863 closed #closed-863

12 Dec at 17:39:57 GMT

742 Drop xsl:function-library declaration

Issue #847 closed #closed-847

12 Dec at 17:24:52 GMT

build-uri() - is {"port":()} legal?

Issue #849 closed #closed-849

12 Dec at 17:24:51 GMT

847 Allow uri-structure-record keys to have empty sequence values

Issue #479 closed #closed-479

12 Dec at 17:21:14 GMT

fn:deep-equal: Input order

Issue #798 closed #closed-798

12 Dec at 17:21:13 GMT

479: fn:deep-equal: Input order

QT4 CG meeting 058 draft minutes #minutes-12-12

12 Dec at 17:20:00 GMT

Draft minutes published.

Pull request #897 created #created-897

12 Dec at 15:59:49 GMT
894 - errors in forming function items

Fix #894

Pull request #896 created #created-896

12 Dec at 15:18:10 GMT
887 - fix simple typo in example

Fix #887

Issue #895 created #created-895

12 Dec at 11:47:58 GMT
Parameters with default values: allow empty sequences

We need a consistent approach for defining types of optional function arguments. In most current cases, if a function argument is supplied, it must be non-empty:

map:get(
  $map       as map(*),
  $key       as xs:anyAtomicType,
  $fallback  as function(xs:anyAtomicType) as item()*  := void#1
) as item()*

fn:starts-with-sequence(
  $input        as item()*,
  $subsequence  as item()*,
  $compare      as function(item(), item()) as xs:boolean  := fn:deep-equal#2
) as xs:boolean

In some cases, it’s optional:

fn:replace(
  $value        as xs:string?,
  $pattern      as xs:string,
  $replacement  as xs:string?                                                   := (),
  $flags        as xs:string?                                                   := '',
  $action       as (function(xs:untypedAtomic, xs:untypedAtomic*) as item()?)?  := ()
) as xs:string

(: #874 :)
fn:subsequence(
  $input   as item()*,
  $start   as xs:double?                                     := (),
  $length  as xs:double?                                     := (),
  $from    as (function(item(), xs:integer) as xs:boolean)?  := (),
  $while   as (function(item(), xs:integer) as xs:boolean)?  := (),
  $until   as (function(item(), xs:integer) as xs:boolean)?  := ()
) as item()*

As a result, map:get($map, fallback := ()) is invalid, while replace($string, $pattern, action := ()) would be valid.

I think it’s better to enforce non-empty arguments (provided that a single item is expected).
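The two conventions can be mimicked in Python, with None playing the role of the empty sequence. This is a loose analogy only: map_get and replace_model are hypothetical, heavily simplified stand-ins for map:get and fn:replace (for instance, the action callback here receives just the matched string).

```python
import re

def void(_key):
    """Models void#1: ignores its argument and returns the empty sequence."""
    return []

def map_get(mapping, key, fallback=void):
    """Parameter declared without '?': an explicitly empty argument is rejected."""
    if fallback is None:
        raise TypeError("fallback must be a function, not an empty sequence")
    return mapping[key] if key in mapping else fallback(key)

def replace_model(value, pattern, replacement=None, action=None):
    """Parameters declared with '?': passing None equals omitting the argument."""
    if action is not None:
        return re.sub(pattern, lambda m: action(m.group(0)), value)
    return re.sub(pattern, replacement or "", value)
```

So map_get({"a": 1}, "b", fallback=None) fails, while replace_model("abc", "b", action=None) behaves exactly like replace_model("abc", "b"), which is the inconsistency described above.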

Issue #894 created #created-894

12 Dec at 11:21:21 GMT
Errors in forming function items

Follow-up to issue #888

There are a number of situations in which error behaviour is insufficiently specified:

  • Consider the partial function application fn:contains(?, 23), where the second argument is an integer, not a string. Does the partial application fail, or does it result in a function item that fails when dynamically evaluated? The spec does not say.
  • Consider the expression function-lookup(fn:name, 0) evaluated when there is no context item in the dynamic context. Does the call on function-lookup fail, or does it return a function item that fails when dynamically evaluated? The spec does not say, though there are numerous test cases in the function-lookup test set that suggest the latter.
  • The same is true for the expression fn:name#0 evaluated when there is no context item in the dynamic context.
  • Consider a user-defined function my:f with a parameter that has a default value of "."; consider both a function reference my:f#0 and a partial application my:f(?) evaluated when there is no context item. Again, I think the spec is unclear on the error behaviour.

It feels to me that the right thing to do in all these cases is to raise the error early: that is, to fail at the point where a function item is being constructed, not at the point where the function item is subsequently evaluated. However, this disagrees with QT3 expected test results for tests such as fn-function-lookup-267 and function-literal-267.
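The choice between failing early and failing late can be illustrated with partial application in Python. This is a sketch, not a statement of what any processor does: contains models fn:contains with a run-time type check, and partial_late and partial_early are hypothetical helpers for the two policies.

```python
from functools import partial

def contains(value: str, substring: str) -> bool:
    """Models fn:contains, checking argument types when it is called."""
    for arg in (value, substring):
        if not isinstance(arg, str):
            raise TypeError("XPTY0004: expected a string")
    return substring in value

def partial_late(func, **fixed):
    """Late policy: a bad fixed argument surfaces only when the item is called."""
    return partial(func, **fixed)

def partial_early(func, **fixed):
    """Early policy: validate fixed arguments while building the function item."""
    for name, arg in fixed.items():
        expected = func.__annotations__[name]
        if not isinstance(arg, expected):
            raise TypeError(f"XPTY0004: {name} must be {expected.__name__}")
    return partial(func, **fixed)
```

With the late policy, partial_late(contains, substring=23) succeeds and the error appears only on a later call such as f("abc"); with the early policy, partial_early(contains, substring=23) fails immediately, which corresponds to the behaviour the issue argues for.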

Issue #893 created #created-893

11 Dec at 17:46:56 GMT
fn:compare: Support for arbitrary atomic types

Inspired by #866:

We should extend fn:compare to support arbitrary atomic types. The comparison rules…

  • would be unchanged for strings,
  • would rely on fn:numeric-compare for numbers, and
  • would rely on the existing op: functions for the remaining types.

For example, the rule for dates would be:

  • 0 is returned if op:date-equal(A, B) is true,
  • -1 is returned if op:date-less-than(A, B) is true,
  • 1 is returned otherwise.

Some types will be rejected (xs:duration, xs:QName, xs:NOTATION, Gregorian types).

In addition, I would vote for making fn:numeric-compare and fn:atomic-equal private. I don’t see a benefit to expose them; I rather expect people to be confused.

Issue #892 created #created-892

11 Dec at 16:38:21 GMT
XPDY0002: Misleading examples

Related to #888. The examples in 4.6.2.2 Evaluating Dynamic Function Calls

…are misleading; all of them raise XPDY0002 if no element is bound to the global context value. Maybe shop could be replaced with $shop or doc('wares.xml')/shop.

QT4 CG meeting 058 draft agenda #agenda-12-12

11 Dec at 13:15:00 GMT

Draft agenda published.

Issue #891 closed #closed-891

11 Dec at 10:01:54 GMT

Cleanup the post-diff-hacking hack

Pull request #891 created #created-891

11 Dec at 10:01:48 GMT
Cleanup the post-diff-hacking hack

Improved, I think.

Issue #890 closed #closed-890

11 Dec at 09:38:20 GMT

Stop fussing with merge base branch

Pull request #890 created #created-890

11 Dec at 09:38:15 GMT
Stop fussing with merge base branch

Trying to track down the spurious diffs that we see in PRs.

I'll have to merge this to test it, so there will be a few random merges here. Sorry.

Issue #889 created #created-889

11 Dec at 00:26:25 GMT
Rename "Named Function Reference"

The term "named function reference" is used for a construct like name#1.

Although we define it as "A named function reference is an expression (written name#arity) which evaluates to a [function item]", the term "named function reference" perpetuates the incorrect assumption that it is some kind of literal or constant denoting a function item.

Of course, in many cases it can be treated as just that. But not when the function is context-dependent, for example name#0 or lang#1.

The term is also questionable because one would assume that a "named function reference" is a reference to a "named function", but there is no such concept as a "named function".

So what might be a better name? What the expression actually does (when evaluated) is to search the static context for a function definition whose name and arity range correspond, and then construct a function item that captures the relevant part of the dynamic context in its closure. It's hard to encapsulate all of that in a simple name for the construct, but I would suggest named function generator. This is sufficiently close to the current term to be recognisable, but tries to capture the fact that it's not just a constant or literal, it's an expression that actively does something when evaluated; and it's reasonably accurate in that the result of the evaluation is a function item that has a non-absent name.

Issue #888 created #created-888

09 Dec at 23:31:39 GMT
Reclassify XPDY0002 as a type error

I propose to reclassify XPDY0002 (context item is absent) as a type error rather than a dynamic error.

I don't propose to change the error code.

The only practical distinction is that this will allow the error to be reported statically when it can be detected statically, for example if the user writes something like

function($x as node()) {
  starts-with(name(), 'x')
}

At present Saxon will give you a compile-time warning for this, followed by a run-time error if the code is actually executed; this is the required behaviour for dynamic errors.

The change does mean that in a case like this example, it will no longer be possible to catch the error using try/catch. However, type errors can only be reported statically if the code is bound to fail at run-time, and catching errors that occur every time is not especially useful.

Issue #887 created #created-887

08 Dec at 15:04:55 GMT
Trivial syntax error under "named function references"

In the XPath/XQuery book, §4.6.2.4,

let $f := <foo/>/fn:name#0 return <bar>/$f()

should be

let $f := <foo/>/fn:name#0 return <bar/>/$f()

Digging a bit deeper, this reveals that we are not properly tagging and syntax-checking code examples in the spec.

Issue #886 created #created-886

07 Dec at 23:56:54 GMT
Binary map keys

We have made xs:hexBinary and xs:base64Binary comparable and we now allow implicit coercion between the two types.

I've been assuming, though I'm not sure we ever discussed it, that this automatically means that the two types can be "atomic equal" from the point of view of entries in maps: that is, a hexBinary representation of a particular binary value can no longer coexist in a map with a base64Binary representation of the same binary value.

If we were starting from scratch this would clearly make sense, but it has some messy implications:

  • It's a backwards incompatibility; in 3.1 you could construct maps that you can no longer construct in 4.0
  • It potentially affects interoperability of 3.1 and 4.0 applications. For example, an XQuery 4.0 application invoking an XSLT 3.0 transformation via fn:transform might get back a map that's not a valid map in 4.0.

In effect, this is not just a change to the behaviour of one function/operator, it is a data model change, because it changes the value space of the map(*) data type.

And more parochially, I freely admit, there's a lot of internal complexity trying to maintain a code base that supports both the 3.1 and 4.0 rules simultaneously.

Is this a feature that benefits users sufficiently to justify the transition complexities? Note that we can still support "eq" between the two data types without supporting fn:atomic-equal.
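
The coexistence problem can be modelled outside XPath. The Python sketch below is only an analogy: a key is modelled as a (type name, bytes) pair for the 3.1 rules and as the bare bytes for the 4.0 rules, and the choice to let a later entry overwrite an earlier one is an assumption of the sketch, not something the issue specifies.

```python
import base64

# The same five octets, written in the two lexical forms.
hex_key = ("xs:hexBinary", bytes.fromhex("48656C6C6F"))
b64_key = ("xs:base64Binary", base64.b64decode("SGVsbG8="))

# 3.1-style model: the type is part of the key identity, so both entries fit.
map_31 = {hex_key: "from hex", b64_key: "from base64"}

# 4.0-style model: only the binary value identifies the key.
# Assumption of this sketch: a later entry silently replaces an earlier one.
map_40 = {}
for (type_name, value), entry in map_31.items():
    map_40[value] = entry

print(len(map_31), len(map_40))
```

Under this model, a map built with the 3.1 rules cannot be carried over unchanged, which is the interoperability concern raised for fn:transform.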

Issue #885 created #created-885

07 Dec at 17:38:24 GMT
fn:uuid

…to create a random universally unique identifier (UUID), represented as 128-bit value.

Should ideally be nondeterministic, or we may need to do something that’s similar to fn:random-number-generator.
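
For comparison only, this is how a random 128-bit UUID is produced in Python. The sketch assumes a version 4 (random) UUID, which the issue does not specify, and new_uuid is a hypothetical name.

```python
import uuid


def new_uuid() -> uuid.UUID:
    """Return a random (version 4) UUID; each call yields a new value."""
    return uuid.uuid4()


first = new_uuid()
second = new_uuid()
print(first, second)
```

Two calls return different values, which is the nondeterministic behaviour the issue asks about.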

Pull request #884 created #created-884

06 Dec at 22:49:54 GMT
862 Add explanations and examples of implausible expressions

Fix #862

Issue #883 created #created-883

06 Dec at 21:57:23 GMT
Improve return type for fn:load-xquery-module()

The return type is given as map(*). We could make it more precise with a record type.

The same goes for a number of other function signatures that currently use map(*) as an argument or result type.

Perhaps we should also define a more precise type for options parameters.

Issue #882 created #created-882

06 Dec at 18:53:12 GMT
fn:chain or fn:compose

I thought I had a great opportunity to use fn:chain the other day, and then found it didn't do what I wanted.

I wanted to negate a predicate: in pseudo-code

items-where($seq, not contains(?, "e"))

and I thought I could do this by chaining contains and not. But it doesn't work that way: fn:chain applies a sequence of functions to an argument, it doesn't compose a sequence of functions to yield a new function.

I wonder if a function that composes functions would be more useful, so I could write

items-where($seq, compose((contains(?, "e"), not#1)))
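
The distinction can be sketched in Python. The sketch only reflects the behaviour described in this issue: chain applies functions to a value, while the hypothetical compose returns a new function. Neither definition reproduces the actual fn:chain signature.

```python
from functools import reduce


def chain(value, functions):
    """Apply each function in turn to a value and return the final value."""
    return reduce(lambda acc, f: f(acc), functions, value)


def compose(functions):
    """Combine functions, applied left to right, into one new function."""
    functions = list(functions)
    return lambda value: chain(value, functions)


def contains_e(s):
    return "e" in s


def negate(b):
    return not b


words = ["red", "pink", "green", "black"]
no_e = compose([contains_e, negate])      # a reusable predicate
selected = [w for w in words if no_e(w)]
print(selected)  # ['pink', 'black']
```

Because compose returns a function, its result can be passed where a predicate is expected, which chain alone does not allow.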

Pull request #881 created #created-881

06 Dec at 11:59:38 GMT
866 Introduce and exploit new numeric-compare() function

Fix #866

The proposal introduces a new fn:numeric-compare function that differs from lt/eq primarily in that decimals are compared retaining their full precision, rather than converting them to doubles which may lose precision. This makes the comparison fully transitive which makes it safe to use in all sorting algorithms.

The new comparison semantics are exploited in max(), min(), and sort(), and indirectly in highest() and lowest(); they are also referenced for comparing numeric values in XSLT xsl:sort (and therefore indirectly in xsl:merge) and in XQuery order by.

An effect of the change is that max() and min() applied to a sequence of integers now return an integer, not a double.
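
The transitivity argument can be illustrated with a small Python model. It is a sketch under assumptions: Python's float and Decimal stand in for xs:double and xs:decimal, eq_with_promotion models the existing rule (mixed operands are converted to double), and exact_compare models a full-precision comparison. Both function names are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction


def eq_with_promotion(x, y):
    """Existing-style equality: a mixed decimal/double pair is compared as doubles."""
    if isinstance(x, float) or isinstance(y, float):
        return float(x) == float(y)
    return x == y


def exact_compare(x, y):
    """Three-way comparison on exact values, with no precision loss."""
    fx, fy = Fraction(x), Fraction(y)
    return (fx > fy) - (fx < fy)


a = Decimal("9007199254740993")   # 2**53 + 1, not representable as a double
b = 9007199254740992.0            # 2**53 as a double
c = Decimal("9007199254740992")   # 2**53 as a decimal

# With promotion: a = b and b = c, yet a != c, so equality is not transitive.
print(eq_with_promotion(a, b), eq_with_promotion(b, c), eq_with_promotion(a, c))

# Exact comparison gives a consistent ordering: a > b, b = c, a > c.
print(exact_compare(a, b), exact_compare(b, c), exact_compare(a, c))
```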

Pull request #880 created #created-880

06 Dec at 11:13:12 GMT
872 Symmetry: fn:items-at → fn:get

Closes #872.

Pull request #879 created #created-879

06 Dec at 11:05:48 GMT
844 New sequence functions: names

Closes #844. The items keyword in the function names (excluding items-at) has been changed to subsequence.

See #878 for the controversial discussion on what to do with subsequence-(after|before|starting-where|ending-where).

Issue #855 closed #closed-855

06 Dec at 10:50:57 GMT

844 New sequence functions: names

Issue #869 closed #closed-869

06 Dec at 10:29:35 GMT

Incorrect example: for-each-pair

Issue #878 created #created-878

06 Dec at 08:59:17 GMT
Proposed extension to subsequence

Copied from https://github.com/qt4cg/qtspecs/issues/844#issuecomment-1841415417:


I'm thinking again about integrating the items-* quartet into a heavily overloaded subsequence function.

Must supply zero or one of:

  • $start - the start position
  • $from - a predicate, such that the start is the first item to match the predicate
    • defaulting to 1

And zero or one of:

  • $length - the number of items to include
  • $while - a predicate, the subsequence takes items so long as the predicate is true
  • $until - a predicate, the subsequence takes items up to and including the first for which the predicate is false
    • defaulting to the end of the sequence.

This omits the "items-after" combination, but that one is easily achieved using tail(subsequence(from:="xxx")).

Issue #877 created #created-877

06 Dec at 03:29:12 GMT
Inconsistency in XQFO comparator functions/operators with recursive rules

The rules for op:hexBinary-less-than() appear to define a recursive octet-by-octet operation, but I think it flounders in rule 3, where it does not ask for rule 2 to be applied seriatim to each octet pair, but asks for an en masse comparison of two octet sequences.

Compare to 5.3.2, Unicode Codepoint Collation, which describes a similar recursive item-for-item comparison. Interesting formal differences (e.g., unordered list versus ordered list).

fn:deep-equal() is similar, but it is also much more complex. Nevertheless, the way it breaks down the problem at the outset, to dispense immediately with the recursive factor, and deal simply with the rules for equality, is IMO admirable.

It would be nice if there were a bit more consistency in the prose and presentation of recursive rules. Do others agree, and are there other functions/operations that should be considered in this question? I'm thinking immediately only of comparator functions/operations, not functions that use recursion to filter or create. (There may be parallels, but let's start with those functions that are most similar.)

Issue #876 created #created-876

06 Dec at 02:50:51 GMT
Placement of fn:in-scope-namespaces(), fn:in-scope-prefixes(), fn:namespace-uri-for-prefix()

Currently fn:in-scope-namespaces(), fn:in-scope-prefixes(), and fn:namespace-uri-for-prefix are filed under XQFO chapter 10, which purports to deal exclusively with QNames. But these three functions have no direct bearing on QNames in either input or output.

Two options occur to me:

  1. Move sections 10.2.6-8 to fall after 13.3.
  2. Rename chapter 10 to "Functions related to QNames and namespaces". Create a 10.3 that pertains exclusively to namespaces. Move to this new subchapter 10.2.4, 10.2.6-8, as well as 13.3 fn:namespace-uri() .

Or some variant of the above.

At any rate, I think the current placement doesn't properly expose these functions to the browsing reader.

Pull request #875 created #created-875

06 Dec at 02:18:31 GMT
XQFO, chap. 9 minor edits

Hopefully nothing controversial here. Edits are motivated by consistency and clarity.

Issue #624 closed #closed-624

06 Dec at 01:14:03 GMT

XPath function definition clarification

Issue #616 closed #closed-616

06 Dec at 01:13:23 GMT

XDM: X Node vs. x node

Issue #464 closed #closed-464

06 Dec at 01:12:29 GMT

Serialization sequence normalization step 3 needs clarification

Pull request #874 created #created-874

06 Dec at 00:09:05 GMT
878 Proposed extension to subsequence

Following discussion under issue #844, I decided to explore the possibility of extending subsequence() with optional parameters, with the aim of making the quartet of items-before/after/starting-with/ending-with unnecessary.

This is the spec that results. I feel it's a good trade-off; by adding three optional parameters to fn:subsequence, we can eliminate 4 functions that we are having trouble finding names for. The examples feel to me to be intuitive and readable; and there is more capability in the new function than we had before, for example by combining a predicate for the start position with an integer for the length.

I haven't explored arity-2 callbacks - these certainly need some notes and examples.

Issue #822 closed #closed-822

05 Dec at 17:10:01 GMT

XQuery, XQFO: Edits (pool)

Issue #851 closed #closed-851

05 Dec at 17:10:00 GMT

822: XQuery, XQFO: Edits (pool)

QT4 CG meeting 057 draft minutes #minutes-12-05

05 Dec at 17:10:00 GMT

Draft minutes published.

Pull request #873 created #created-873

05 Dec at 14:56:21 GMT
865 Improve explanation of equality comparisons

Fix #865

This PR:

  • Adds a non-normative appendix to XPath and XQuery comparing and contrasting the different ways of doing equality comparisons
  • Changes fn:atomic-equal so it no longer refers to fn:deep-equal (the recursion terminated, but was confusing to follow)
  • Removes text in XQuery describing the non-transitivity of group by clauses, which is now a solved problem
  • Corrects the description of backwards incompatibilities relating to numeric comparisons in the F+O spec.

Issue #872 created #created-872

05 Dec at 11:58:50 GMT
Symmetry: fn:items-at → fn:get

I think that fn:items-at should be changed to fn:get:

  • In #843, we try to harmonize the function names across sequences, maps, and arrays. We have array:get and map:get to retrieve single entries of the input, but we have fn:items-at for sequences.
  • fn:items-at allows you to supply more than a single position, but items-at($seq, (1, 3, 2)) can easily be rewritten to (1, 3, 2) ! get($seq, .) – similar to (1, 3, 2) ! array:get($array, .) and (1, 3, 2) ! map:get($map, .).
  • With #844, fn:items-at would be the only function left with items in its name.

The function signature would be as simple as:

fn:get(
  $input  as item()*,
  $at     as xs:integer	
) as item()

Obviously, most people will still use $input[$at] – but the same applies to arrays and maps (and other functions like fn:head). One of the advantages of fn:get is that you can pass on the context item as position argument.
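
The claimed equivalence can be checked with a Python model. This is a sketch under assumptions: positions are 1-based as in XPath, both function names are stand-ins, and the out-of-range behaviour (skip in items_at, raise in get) is an assumption of the sketch rather than something stated in the issue.

```python
def items_at(seq, positions):
    """Items at the given 1-based positions, in the order requested."""
    return [seq[p - 1] for p in positions if 1 <= p <= len(seq)]


def get(seq, at):
    """Single item at a 1-based position; out of range raises IndexError."""
    if not 1 <= at <= len(seq):
        raise IndexError(at)
    return seq[at - 1]


seq = ["a", "b", "c"]
via_items_at = items_at(seq, [1, 3, 2])
via_get = [get(seq, p) for p in [1, 3, 2]]   # models (1, 3, 2) ! get($seq, .)
print(via_items_at, via_get)
```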

Pull request #871 created #created-871

04 Dec at 15:25:44 GMT

Action qt4 cg 027 01 next match

Pull request #870 created #created-870

04 Dec at 15:24:35 GMT
867 Explain defaults in function signatures

Fix #867

Issue #869 created #created-869

04 Dec at 14:45:55 GMT
Incorrect example: for-each-pair

The fourth example of fn:for-each-pair is wrong:

for-each-pair(
  (1, 8, 2),
  (3, 4, 3),
  fn($item1, $item2, $pos) {
    $pos || ': ' || max(($item1, $item2))
  }
)

Result: ("1: 1", "2: 4", "3: 2")

The results as given return the min of the pair, not the max.
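
The discrepancy can be reproduced with a Python model of the example. The helper for_each_pair is a stand-in that passes a 1-based position as the third argument; it is not the specification's definition.

```python
def for_each_pair(seq1, seq2, action):
    """Call action(item1, item2, position) for each pair, positions from 1."""
    return [
        action(a, b, pos)
        for pos, (a, b) in enumerate(zip(seq1, seq2), start=1)
    ]


with_max = for_each_pair(
    [1, 8, 2], [3, 4, 3], lambda a, b, pos: f"{pos}: {max(a, b)}"
)
with_min = for_each_pair(
    [1, 8, 2], [3, 4, 3], lambda a, b, pos: f"{pos}: {min(a, b)}"
)
print(with_max)  # ['1: 3', '2: 8', '3: 3']
print(with_min)  # ['1: 1', '2: 4', '3: 2']
```

The result published with the example matches with_min, not with_max.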

QT4 CG meeting 057 draft agenda #agenda-12-05

04 Dec at 12:15:00 GMT

Draft agenda published.

Issue #868 created #created-868

04 Dec at 11:54:03 GMT
fn:intersperse → fn:join, array:join($arrays, $separator)

With string-join, you can create a string from multiple strings, optionally interspersed with a separator. array:join can be used to create an array from multiple arrays.

fn:intersperse, which has been added to the XQuery 4 draft, does something very similar, and early feedback indicates that the function is useful, but easy to overlook due to its name.

I propose to unify the functions by…

  1. renaming fn:intersperse to fn:join;
  2. adding a parameter to array:join: $separator as array(*)* := (); and
  3. allowing a separator sequence for fn:string-join: $separator as xs:string* := ().

Examples

Query | Result | Info
-- | -- | --
string-join(('1','2','3'), '-') | '1-2-3' | existing syntax
array:join([[1],[2],[3]], ['-']) | [1,'-',2,'-',3] | new
join((1,2,3), '-') | (1,'-',2,'-',3) | now: intersperse((1,2,3),'-')
string-join(('1','2','3')) | '123' | existing syntax
array:join([[1],[2],[3]]) | [1,2,3] | existing syntax
join((1,2,3)) | (1,2,3) | now: intersperse((1,2,3)) (or just (1,2,3))
string-join(('1','2','3'), ('-','+')) | '1-+2-+3' | new
array:join([[1],[2],[3]], ['-','+']) | [1,'-','+',2,'-','+',3] | new
join((1,2,3), ('-','+')) | (1,'-','+',2,'-','+',3) | now: intersperse((1,2,3),('-','+'))
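
The proposed sequence behaviour, including a multi-item separator, can be modelled in Python. This is a minimal sketch: Python lists stand in for sequences, and the function below is an illustration, not the specification's definition.

```python
def join(items, separator=()):
    """Insert every item of separator between adjacent items of the input."""
    result = []
    for index, item in enumerate(items):
        if index > 0:
            result.extend(separator)
        result.append(item)
    return result


single = join([1, 2, 3], ["-"])
default = join([1, 2, 3])
multi = join([1, 2, 3], ["-", "+"])
print(single, default, multi)
```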

Issue #867 created #created-867

03 Dec at 16:31:20 GMT
Signature notation in F+O: default values

Section 1.5 of F+O introduces the signature proforma notation, and indicates that default values may be included for parameters.

It does not however say how the default value is interpreted. For example, with the signature

fn:starts-with-sequence(
    $input as item()*,  
    $subsequence as item()*,   
    $compare as function(item(), item()) as xs:boolean := fn:deep-equal#2
) as xs:boolean

There is nothing to tell us that the expression fn:deep-equal#2 is evaluated with the static and dynamic context of the caller (or of the function reference).

Note that this is different from the similar notation used for function declarations in XQuery, where the static context for the default fn:deep-equal#2 would be taken from the function declaration in the Query prolog.

Issue #866 created #created-866

03 Dec at 13:19:13 GMT
fn:sort, and XSLT and XQuery sorting, should use transitive comparisons

We have addressed the question of non-transitivity of equality matching in distinct-values(), and in XSLT and XQuery grouping, but the same issue exists for sorting. Currently fn:sort, as well as XSLT and XQuery sorting, rely on the "lt" operator for comparing values including mixed numerics such as doubles and decimals. Because this promotes to double, it is capable of losing precision, and is therefore non-transitive. Most sort algorithms rely on the supplied comparison function being transitive, and if it isn't, then undefined failures may occur including non-termination.

One particular quirk (which led me here) is that fn:highest and fn:lowest start by using fn:sort semantics to put the values in order, and then rely on fn:deep-equal semantics to find the values that are "equal highest" or "equal lowest". But fn:sort and fn:deep-equal have different ways of deciding whether two values are equal: decimal 1.2 and double 1.2 are equal for fn:sort, but not for fn:deep-equal.

Issue #865 created #created-865

03 Dec at 00:49:44 GMT
Need to explain change in numeric comparison semantics

We need to explain more clearly that we now have different rules for comparing numeric values in different circumstances. My understanding of the situation is:

For eq, =, etc, we continue to use the XPath 2.0/3.0/3.1 rules for backwards compatibility reasons: for example comparison between decimal and double is done by converting the decimal to a double. This has known problems in terms of transitivity, but we have retained the rules because we identified that too many compatibility problems would be introduced by changing them.

For deep-equal, distinct-values, XSLT and XQuery grouping, etc, we have switched to the rules that were introduced for comparing map keys in 3.0, now available through the fn:atomic-equal function. Under these rules, doubles are promoted to decimals for comparison.

We should probably include a table showing which rules are used where.

A good example to use is (1e-3 = 0.001). This is true in both 3.1 and 4.0. But under the rules for maps in 3.1, and the new rules for distinct-values in 4.0, these two values are considered distinct.
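
The two outcomes can be reproduced in Python, where float stands in for xs:double and Decimal for xs:decimal. This is an analogy under those assumptions, not an evaluation of the XPath rules themselves.

```python
from decimal import Decimal

double_value = 1e-3                 # models the xs:double literal 1e-3
decimal_value = Decimal("0.001")    # models the xs:decimal literal 0.001

# Rule for eq and =: convert the decimal to a double, then compare.
equal_as_doubles = double_value == float(decimal_value)

# Rule for map keys and distinct-values: promote the double to its exact
# decimal value, then compare. The double nearest to 0.001 is not exactly 0.001.
equal_as_decimals = Decimal(double_value) == decimal_value

print(equal_as_doubles, equal_as_decimals)
```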

Issue #864 created #created-864

01 Dec at 23:34:39 GMT
$position argument in fold-right

In fn:fold-right (and thus array:fold-right) it's not clear how the position parameter works.

It appears to start at 1, and then to be decremented, which seems a little weird.

Working out what happens seems to involve reverse engineering the code, which isn't ideal. It's useful to have a formal definition of the function using code, but it shouldn't be necessary to reverse engineer 20 lines of difficult recursive code in order to get a feel for what the function does.

The only example given doesn't add any clarification.

Issue #470 closed #closed-470

29 Nov at 13:33:50 GMT

369: add fixed-prefixes attribute in XSLT

Issue #412 closed #closed-412

29 Nov at 13:33:21 GMT

409, QT4CG-027-01: xsl:next-match

Issue #856 closed #closed-856

29 Nov at 11:58:23 GMT

Spec for deep-equal() still references FOTY0015

Issue #857 closed #closed-857

29 Nov at 11:58:22 GMT

856 Drop reference to obsolete error condition in deep-equal()

Pull request #863 created #created-863

29 Nov at 11:53:22 GMT
742 Drop xsl:function-library declaration

Fix #742

Issue #862 created #created-862

29 Nov at 04:08:54 GMT
Examples needed for "Implausible Expressions"

In XPath 4.0 there is a new concept of Implausible Expressions.

There are several different sections about different types of implausible expressions:

  • 2.4.6 Implausible Expressions - only a single example, and it is not in an Examples sub-section and is difficult to locate.
  • 3.8.1 Implausible Coercions - has 3 examples and seems OK.
  • 4.7.4.3 Implausible Axis Steps - has no visible examples.
  • 4.15.3.4 Implausible Lookup Expressions - has no examples.

Another problem is that the definition of "implausible" seems imprecise and subjective (what is the meaning of "there is a high probability that they were written incorrectly"):

"implausible Certain expressions, while not erroneous, are classified as being implausible, because there is a high probability that they were written incorrectly."

Proposed fixing actions:

  1. Provide a more precise and non-subjective definition of the concept.

  2. Provide many examples of implausible expressions - both in the central section 2.4.6 and in all other sections dealing with more specific types of implausible expressions.

Issue #169 closed #closed-169

28 Nov at 17:24:29 GMT

Handling of duplicate keys in xsl:map

QT4 CG meeting 056 draft minutes #minutes-11-28

28 Nov at 17:18:00 GMT

Draft minutes published.

QT4 CG meeting 056 draft agenda #agenda-11-28

27 Nov at 12:30:00 GMT

Draft agenda published.

Issue #858 closed #closed-858

27 Nov at 08:38:26 GMT

fn:identity: accept 2 arguments, ignore second

Issue #861 created #created-861

26 Nov at 22:52:33 GMT
Precise meaning of $E??KS

I don't think that the semantics of the expression $E??KS are clearly enough defined.

The effect of the deep lookup expression E??KS is obtained by evaluating E, establishing its recursive content C, removing any item that is not a map or array to yield a sequence D, and then evaluating the shallow lookup expression D?KS, but with one exception: if evaluation of any shallow lookup fails, then the error is not propagated, but instead its result is taken to be an empty sequence.

  1. The definition of "recursive content" needs to be tightened up.
  2. It needs to be more clearly stated which errors we ignore, and which we don't. For example, what if the key specifier evaluates to a non-singleton sequence?
  3. In the case of $E??* in particular, I don't think it makes much sense to exclude items that are not maps or arrays. I think the expectation in this case is to return the full recursive content.
  4. As currently defined, if $M is a map, the result of $M??* includes the map $M itself. I don't think this matches expectations. Certainly, with the parallel expression $node//*, the result does not include $node.

Issue #860 created #created-860

26 Nov at 19:46:04 GMT
Unary Lookup when the context value is a sequence

We have added the text

If the context value is anything other than a single item, the semantics of the expression ?KS are defined to be equivalent to the expression . ! ?KS. The remainder of this section therefore explains the semantics on the assumption that the [context value] is a single item, referred to as the context item.

Consider the case where the context value is a sequence of two maps (map{'x':1, 1:'p', 2:'q'}, map{'x':2, 1:'P', 2:'Q'}) and the expression is ?(?x). What is the context for evaluation of the key specifier (?x)? I would have expected that we evaluate the key specifier once, in the outer context, so the key specifier value is (1,2) and we therefore take the entries with keys 1 and 2 in both maps, giving a result of ('p', 'q', 'P', 'Q'). But the cited paragraph suggests we evaluate KS separately for each item in the context value, and thus return entry 1 of map 1 and entry 2 of map 2, giving a result of ('p', 'Q').

I know it's an edge case and it's unlikely in practice that people will write context-dependent key specifiers, but the rules need to be clear. I thought we had previously decided that the key specifier expression should be evaluated in the outer context.
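
The two readings can be compared with a Python model in which dictionaries stand in for the two maps. The variable names are illustrative only.

```python
maps = [
    {"x": 1, 1: "p", 2: "q"},
    {"x": 2, 1: "P", 2: "Q"},
]

# Reading 1: the key specifier (?x) is evaluated once in the outer context,
# giving the keys (1, 2), which are then looked up in every map.
outer_keys = [m["x"] for m in maps]
outer_context = [m[k] for m in maps for k in outer_keys]

# Reading 2: the expression is treated as . ! ?(?x), so the key specifier
# is evaluated separately for each map.
per_item = [m[m["x"]] for m in maps]

print(outer_context)  # ['p', 'q', 'P', 'Q']
print(per_item)       # ['p', 'Q']
```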

Issue #859 created #created-859

26 Nov at 16:32:13 GMT
Syntax problem with type-qualified wildcards in lookup expressions

The new syntax for type-qualified wildcards has problems when used in a chained lookup expression, for example

[[1,2], [3,4], 5, 6]?*::array(*)?1

because the "?" that follows array(*) is interpreted as an occurrence indicator. This can be avoided by using parentheses, but it's too much of an elephant trap - a better solution is needed.

Issue #858 created #created-858

25 Nov at 19:27:17 GMT
fn:identity: accept 2 arguments, ignore second

Apart from id, Haskell has const, which accepts 2 arguments, but only returns the first. Thanks to the introduction of default arguments, it’s straightforward to extend fn:identity to be able to accept 2 arguments:

fn:identity(
  $input    as item()*,
  $ignored  as item()*  := ()
) as item()*

Pull request #857 created #created-857

24 Nov at 16:00:09 GMT
856 Drop reference to obsolete error condition in deep-equal()

Fix #856

Issue #856 created #created-856

24 Nov at 13:05:14 GMT
Spec for deep-equal() still references FOTY0015

The errors section of fn:deep-equal still says

A type error is raised [err:FOTY0015] if either input sequence contains a function item that is not a map or array.

This is no longer the case (and error FOTY0015 is now obsolete and should be removed from the appendix)

Pull request #855 created #created-855

24 Nov at 10:46:18 GMT
844 New sequence functions: names

Closes #844

Issue #852 closed #closed-852

23 Nov at 12:04:43 GMT

Typo in XQuery equivalent for fn:transitive-closure

Issue #853 closed #closed-853

23 Nov at 12:04:42 GMT

852 Fix typo in transitive-closure

Issue #854 created #created-854

22 Nov at 22:16:48 GMT
Need more discussion and explanation of deep-lookup operator

During discussion of the ?? operator it was pointed out that we need more examples and explanation, especially of how to handle cases where the "flattening" behaviour of the operator is inconvenient. This applies equally to paths using the existing ? operator - $x?y?z and $x??z both have this problem. For example, there is no way of filtering the result of $x?y?z or $x??z to select only members of size 3.

Issue #57 closed #closed-57

22 Nov at 20:33:58 GMT

The item-type(T) syntax is not defined

Issue #172 closed #closed-172

22 Nov at 19:55:31 GMT

Record Tests

Issue #233 closed #closed-233

22 Nov at 19:53:35 GMT

Declare the result type of a mode, via @as

Issue #698 closed #closed-698

22 Nov at 19:20:04 GMT

GitHub: Line Endings

Issue #730 closed #closed-730

22 Nov at 16:44:23 GMT

Equivalence of map and function types

Issue #840 closed #closed-840

22 Nov at 16:37:35 GMT

Wrong example in fn:seconds-from-duration

Pull request #853 created #created-853

22 Nov at 16:34:36 GMT
852 Fix typo in transitive-closure

Fix #852

Issue #852 created #created-852

22 Nov at 16:04:34 GMT
Typo in XQuery equivalent for fn:transitive-closure

tc-inclusive($node/$step(.)), $step)

should read

tc-inclusive($node/$step(.), $step)

Issue #837 closed #closed-837

21 Nov at 17:15:23 GMT

297 Deep Lookup Operator "??" and wildcard qualifier "::"

QT4 CG meeting 055 draft minutes #minutes-11-21

21 Nov at 17:15:00 GMT

Draft minutes published.

Issue #848 closed #closed-848

21 Nov at 17:14:29 GMT

More fo spec examples corrections

Issue #833 closed #closed-833

21 Nov at 17:13:54 GMT

Fix the line endings, force a single lf in text files

Issue #841 closed #closed-841

21 Nov at 17:10:27 GMT

840: Typo in fn:seconds-from-duration example

Issue #845 closed #closed-845

21 Nov at 17:09:19 GMT

Quantified expressions and "binding tuples" (Editorial)

Issue #846 closed #closed-846

21 Nov at 17:06:54 GMT

845 Drop mention of tuples

Issue #842 closed #closed-842

21 Nov at 17:03:29 GMT

Improve stylesheet for generating keyword tests

Pull request #851 created #created-851

21 Nov at 14:39:32 GMT
822: XQuery, XQFO: Edits (pool)

Editorial; closes #822 (visit this issue for a list of the changes)

Issue #850 created #created-850

21 Nov at 13:04:45 GMT
fn:parse-html: Finalization

Now that fn:parse-html has been added to the specification, we need test cases for all provided options and input types (including binary input).

Looking at the current set of test cases, it seems unrealistic to use older libraries such as TagSoup for this function. I wonder if we should support ·implementation-defined· parsing algorithms at all. What do others think?

Next, is there any implementation available that supports all given method/html-version variants?

Issue #783 closed #closed-783

21 Nov at 12:56:10 GMT

Editorial: errors are raised (not reported, signaled, generated, or thrown).

Pull request #849 created #created-849

21 Nov at 10:09:49 GMT
847 Allow uri-structure-record keys to have empty sequence values

Fix #847

The description of build-uri was written with the expectation that if a key was in the map, its value should be used. I don't really want to replace every occurrence of

if `x` is present in the map

with

if `x` is present in the map and its value is not the empty sequence

So I've attempted to justify that globally with the following paragraph at the beginning of the description:

The components are derived from the contents of the $parts map. To simplify the description below, any key whose value is the empty sequence is ignored; this is equivalent to the key not being present in the map.

I'm not hugely proud of that bit of prose though. Suggestions for improvements most welcome.
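
The normalisation described in that paragraph amounts to a filtering step. In the Python sketch below, the empty sequence is modelled as an empty tuple, effective_parts is a hypothetical helper name, and the keys other than "port" are illustrative.

```python
def effective_parts(parts):
    """Drop keys whose value is the empty sequence (modelled here as ())."""
    return {key: value for key, value in parts.items() if value != ()}


# "scheme" and "host" are illustrative keys; "port" mirrors the spec example.
parts = {"scheme": "https", "host": "example.com", "port": ()}
normalised = effective_parts(parts)
print(normalised)
print("port" in normalised)
```

After this step, a test such as "x is present in the map" behaves as if the key had never been supplied.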

Pull request #848 created #created-848

21 Nov at 08:07:15 GMT
More fo spec examples corrections
  • Adds tagging for fos:test elements so we know which tests depend on XQuery (rather than XPath).

  • Corrects expected results for some tests

Issue #847 created #created-847

20 Nov at 23:52:36 GMT
build-uri() - is {"port":()} legal?

The (only) example in the spec of a build-uri() call specifies {"port":()} . But uri-structure-record has

port? as xs:string,

which means that the empty sequence is not a valid value. Either the record structure should be changed to specify the type as xs:string?, or the example should be changed.

QT4 CG meeting 055 draft agenda #agenda-11-21

20 Nov at 12:30:00 GMT

Draft agenda published.

Pull request #846 created #created-846

20 Nov at 00:32:21 GMT
845 Drop mention of tuples

Fix #845

Issue #845 created #created-845

19 Nov at 23:46:11 GMT
Quantified expressions and "binding tuples" (Editorial)

The section of the XPath specification on Quantified Expressions contains a paragraph starting

"The order in which test expressions are evaluated for the various binding tuples"

This is the only place in which "binding tuples" are mentioned in connection with quantified expressions, and in XPath (as distinct from XQuery) it is the only place where tuples are mentioned at all. The paragraph could easily be rewritten to avoid introducing a new concept.

Issue #844 created #created-844

18 Nov at 11:11:44 GMT
New sequence functions: names

Observations:

A. What about renaming fn:contains-sequence, fn:starts-with-sequence and fn:ends-with-sequence to fn:contains-items, fn:starts-with-items and fn:ends-with-items, in alignment with fn:items-at, fn:items-before, etc.? If we add equivalent functions for arrays, it could be array:contains-members, etc.

B. It seems confusing to have fn:items-starting-where and fn:items-after, instead of fn:items-starting-after. Maybe we can think of alternative (shorter) names for fn:items-starting-where and fn:items-ending-where? – I know we’ve discussed this before; I am raising it again because of user feedback.

Issue #843 created #created-843

18 Nov at 10:59:36 GMT
Standard, array & map functions: Equivalencies

In many threads (#135, others), we have discussed how to align the functions for sequences, arrays, and maps. This is an attempt to summarize the status quo, and I hope to keep it up-to-date in the coming weeks.

The 4.0 functions are the ones with the keyword new attached. If the function is followed by a question mark, there may be an existing issue for its addition, or it may be consistent to add it.

Please note that the data types have fundamental differences, so it’s not always possible to present or provide exact symmetries.

To be discussed

Functions | Array Functions | Map Functions
--- | --- | ---
fn:contains-subsequence new: #94, #844 | array:contains-subarray ? | map:contains
fn:ends-with-subsequence new: #96, #844 | array:ends-with-subarray ? | –
fn:starts-with-subsequence new: #96, #844 | array:starts-with-subarray ? | –
fn:distinct-values | array:distinct-members ? | –
fn:duplicate-values new: #123 | array:duplicate-members ? | –
fn:empty | array:empty new: #229 | map:empty ? #827
fn:exists | array:exists new: #229 | map:exists ? #827
fn:every new | array:every ? | map:every ?
fn:some new | array:some ? | map:some ?
fn:highest new | array:highest ? | –
fn:lowest new | array:lowest ? | –
fn:index-of | array:index-of ? #260 | –
fn:index-where new | array:index-where new: #114 | map:keys($m, $pred) new: #467
fn:items-at new: #213 → fn:get ? #872 | array:members-at ? #825; array:get | map:get
fn:intersperse new: #2 → fn:join ? #868 | array:join | –
fn:subsequence-where new: #878 | array:subarray-where ? | –
fn:substitute ? #553, #583 | array:replace new; array:substitute ? #583 | map:replace new; map:substitute ? #583
fn:slice new | array:slice new | –
– | array:split new | map:entries new
– | array:values new | map:keys; map:values new
– | array:entries ? #826 | map:entries new
– | array:merge ? #826 | map:merge
– | – | map:entry
– | array:members new → keep? #826 | map:pairs new → keep? #826
– | array:of-members new → keep? #826 | map:of-pairs new → keep? #826
– | – | map:pair new: #508 → keep? #826

Settled

Functions | Array Functions | Map Functions
--- | --- | ---
fn:count | array:size | map:size
fn:filter | array:filter | map:filter
fn:fold-left | array:fold-left | –
fn:fold-right | array:fold-right | –
fn:for-each-pair | array:for-each-pair | –
fn:for-each | array:for-each | map:for-each
fn:head | array:head | –
fn:insert-before | array:insert-before | –
fn:remove | array:remove | map:remove
fn:reverse | array:reverse | –
fn:sort | array:sort | –
fn:subsequence | array:subarray | –
fn:tail | array:tail | –
– | array:put; array:append | map:put
fn:foot new: #250 | array:foot new: #250 | –
fn:trunk new: #250 | array:trunk new: #250 | –

Issue #91 closed #closed-91

18 Nov at 09:58:22 GMT

name of map:substitute

Issue #104 closed #closed-104

18 Nov at 09:57:20 GMT

name of map:replace/array:replace

Issue #699 closed #closed-699

17 Nov at 16:34:57 GMT

GitHub: Signing

Pull request #842 created #created-842

17 Nov at 15:47:35 GMT
Improve stylesheet for generating keyword tests

Improves the stylesheet that generates the test set misc/BuiltInKeywords.xml; specifically, it's smarter about generating acceptable callback functions that won't trigger an unwanted error.

(The generated test calls each function twice, once with positional arguments and once with keyword arguments, and checks that the two results are deep-equal).
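The check described above can be sketched roughly as follows. This is a Python analogy, not the stylesheet's actual output: `pad_left` and `same_result_both_ways` are hypothetical names introduced only for illustration, and plain `==` stands in for deep-equal.

```python
def pad_left(value, width, fill=" "):
    """Toy function standing in for a built-in under test (hypothetical)."""
    return value.rjust(width, fill)


def same_result_both_ways(func, names, args):
    """Call func once positionally and once by keyword; compare the results."""
    positional = func(*args)
    by_keyword = func(**dict(zip(names, args)))
    return positional == by_keyword


print(same_result_both_ways(pad_left, ["value", "width", "fill"], ["7", 3, "0"]))
# prints True
```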

Issue #838 closed #closed-838

17 Nov at 10:35:27 GMT

Collations in F&O examples for functions such as fn:contains()

Issue #839 closed #closed-839

17 Nov at 10:35:26 GMT

838 Fix collation variable references

Pull request #841 created #created-841

16 Nov at 22:57:48 GMT
840: Typo in fn:seconds-from-duration example

Addresses #840

Issue #840 created #created-840

16 Nov at 22:43:58 GMT
Wrong example in fn:seconds-from-duration

The newly added example

seconds-from-duration(
   xs:duration("P1Y1D")
)

Result: 1

Looks clearly wrong.

Joel, what did you have in mind?

Pull request #839 created #created-839

16 Nov at 20:29:58 GMT
838 Fix collation variable references

Fix #838. Editorial.

Issue #838 created #created-838

16 Nov at 19:25:30 GMT
Collations in F&O examples for functions such as fn:contains()

The examples for a number of functions, such as fn:contains(), use a UCA collation URI bound to the variable $coll.

Two problems:

(a) These sections also contain prose saying "The collation used in these examples, http://example.com/CollationA is a collation in which both - and * are ignorable collation units." - but this is not the collation URI actually used. This error was already present in the published 3.1 Recommendation.

(b) The examples that use the variable $coll do not have a use attribute, which means that the test cases generated in the test suite do not declare the variable, which means that the tests fail.

Pull request #837 created #created-837

15 Nov at 16:57:44 GMT
297 Deep Lookup Operator "??" and wildcard qualifier "::"

Adds support for a deep lookup operator "??" as a transitive equivalent of "?", and allows a wildcard lookup X?* or X??* to be qualified with the required type of value, for example X??*::record(from, to).
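As a rough mental model of the "transitive" lookup, here is a Python sketch in which dicts stand in for maps and lists for arrays. It is an assumption-laden illustration, not the semantics defined in the PR: `deep_lookup` is a hypothetical name, only map keys are matched, and the result order is simply whatever this traversal produces.

```python
def deep_lookup(value, key):
    """Collect every value stored under `key` at any depth (sketch of "??")."""
    found = []
    if isinstance(value, dict):
        for k, v in value.items():
            if k == key:
                found.append(v)
            # Keep descending: nested maps may contain the key again.
            found.extend(deep_lookup(v, key))
    elif isinstance(value, list):
        for member in value:
            found.extend(deep_lookup(member, key))
    return found


data = {"from": "A", "legs": [{"from": "A", "to": "B"}, {"from": "B", "to": "C"}]}
print(deep_lookup(data, "to"))    # ['B', 'C']
print(deep_lookup(data, "from"))  # ['A', 'A', 'B']
```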

Issue #836 created #created-836

15 Nov at 13:24:48 GMT
Add support for CSV 'dialect' features covered by the OKFN's Frictionless Data CSV spec in `fn:parse-csv` and related functions

The OKFN's Frictionless Data project's CSV Standard specifies some additional things we should take into account for fn:parse-csv and related functions.

Most important is the option to specify a comment line character, whose presence at the start of a line will cause it to be treated as a comment. Because of the way that rows can span lines, post-processing to extract comments might be impossible in some cases.
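The difficulty with rows that span lines can be seen in a small Python sketch using the standard `csv` module. The sample data and the `#` comment character are assumptions made for illustration; nothing here reflects how fn:parse-csv is specified.

```python
import csv
import io

text = 'id,note\n1,"first line\n# not a comment"\n# a real comment\n2,plain\n'

# Physical lines that begin with "#": one is data inside a quoted field,
# the other is a genuine comment. A line-based filter cannot tell them apart.
hash_lines = [ln for ln in text.splitlines() if ln.startswith("#")]

# Row-aware parsing keeps the quoted field intact, but the comment then
# arrives as an ordinary one-field row unless the parser knows the rule.
rows = list(csv.reader(io.StringIO(text)))

print(len(hash_lines))  # 2
print(rows[1])          # ['1', 'first line\n# not a comment']
print(rows[2])          # ['# a real comment']
```

Without parser support, a caller has to guess whether a row such as `['# a real comment']` was a comment or data, which is the case the issue describes.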

Issue #835 created #created-835

15 Nov at 13:14:59 GMT
Review names of record types

The use of record types as a thing that users will interact with seems to be increasing. There are a number of new types proposed and we should review their names to ensure that we are using a coherent naming scheme for all spec-defined record types, and that the names are good enough - that they explain what they are, and aren't too unwieldy.

Issue #834 created #created-834

15 Nov at 13:09:36 GMT
Add creation function for `csv-row-record` type

The csv-row-record type used by the CSV XDM mapping provides its fields and an accessor function that can perform field lookup by index or column name. The column names are set when the CSV is parsed. For users who want to make use of csv-row-record (when needing to parse CSVs that fn:parse-csv itself cannot handle out-of-the-box, perhaps) we should provide a creation function that accepts the name: index map and the fields for the row and creates a csv-row-record with a correctly functioning field function entry.
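A minimal sketch of what such a creation function might look like, modelled in Python. The name `make_csv_row`, the dict keys `fields` and `field`, and the 1-based positions are all assumptions for illustration; the issue does not fix any of them.

```python
def make_csv_row(name_to_index, fields):
    """Hypothetical creation function: bundle fields with a lookup function."""
    def field(key):
        # Integer keys are 1-based positions; string keys are column names.
        position = key if isinstance(key, int) else name_to_index[key]
        return fields[position - 1]

    return {"fields": list(fields), "field": field}


row = make_csv_row({"id": 1, "name": 2}, ["42", "Ada"])
print(row["field"]("name"), row["field"](1))  # Ada 42
```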

Pull request #833 created #created-833

15 Nov at 12:48:13 GMT
Fix the line endings, force a single lf in text files

Hello,

Per discussion on the list, this PR changes the way git handles line endings so that text files will always, exclusively have line endings delimited by a single lf. This should be a largely transparent change for Mac/Unix users. On Windows, it means that checked out files will have lf line termination. If your favorite editing tool can handle this, then you don't have to care. Even if your editor saves files with cr/lf line endings, git will turn them back into single lf endings when you commit your changes.

Hat tip to @ChristianGruen for keeping focus on this issue.

This PR has no technical changes, and affects a relatively small number of files. I'll leave it here for a day or two, then I'm going to be inclined to merge it. Object now, if you object :-)

Pull request #832 created #created-832

15 Nov at 10:29:38 GMT
77 Add map:deep-update and array:deep-update

Note that unlike many of the functions we have added, these are non-trivial: they cannot easily be implemented in XSLT or XQuery.

This is a first cut and I expect some refinement will be needed, but reviews are invited.

I might subsequently propose layering some XSLT syntax on top of this for convenience.

Issue #831 closed #closed-831

14 Nov at 18:16:46 GMT

Fixed a couple of markup errors

Pull request #831 created #created-831

14 Nov at 18:16:40 GMT
Fixed a couple of markup errors

I don't really know how these crept in. Perhaps I was negligent in confirming that #828 passed tests in PR? Too late to tell now, but I think I've fixed them.

Issue #516 closed #closed-516

14 Nov at 17:47:43 GMT

Add position argument to HOF callbacks

Issue #828 closed #closed-828

14 Nov at 17:47:42 GMT

516 Add position argument to HOF callbacks

Issue #736 closed #closed-736

14 Nov at 17:40:59 GMT

730: Clarify (and correct) rules for maps as instances of function types

Issue #719 closed #closed-719

14 Nov at 17:37:27 GMT

413: Spec for CSV-related functions

Issue #554 closed #closed-554

14 Nov at 17:33:37 GMT

The Transitive Closure function produces an incomplete result, completeness/success and number of actual iterations must also be returned

Issue #754 closed #closed-754

14 Nov at 17:33:36 GMT

fn:transitive-closure: signature; remarks; too specific?

Issue #761 closed #closed-761

14 Nov at 17:33:35 GMT

554/754 Simplify the new transitive-closure function

Issue #216 closed #closed-216

14 Nov at 17:21:42 GMT

fn:unparsed-text: End-of-line characters

Issue #794 closed #closed-794

14 Nov at 17:21:41 GMT

216: fn:unparsed-text: End-of-line characters

Issue #712 closed #closed-712

14 Nov at 17:18:00 GMT

array:sort: to be aligned with fn:sort

Issue #823 closed #closed-823

14 Nov at 17:17:59 GMT

712 Extend array:sort to align with fn:sort

QT4 CG meeting 054 draft minutes #minutes-11-14

14 Nov at 17:15:00 GMT

Draft minutes published.

Issue #738 closed #closed-738

14 Nov at 17:12:28 GMT

FO: Why is fn:op under section "17.3 Dynamic loading"

Issue #799 closed #closed-799

14 Nov at 17:12:27 GMT

Errors in F&O spec examples

Issue #824 closed #closed-824

14 Nov at 17:12:26 GMT

799 errors in examples; 738 section heading for fn:op

Issue #747 closed #closed-747

14 Nov at 17:11:15 GMT

QName literals

Issue #743 closed #closed-743

14 Nov at 17:11:05 GMT

Extend enumeration types to allow values other than strings

QT4 CG meeting 054 draft agenda #agenda-11-14

13 Nov at 11:55:00 GMT

Draft agenda published.

Issue #830 created #created-830

12 Nov at 22:52:58 GMT
Revise appendix D.4 of F+O: Illustrative user-written functions

Many of the functions in this non-normative appendix are no longer needed, or can be expressed more concisely using new 4.0 language features.

Issue #829 created #created-829

12 Nov at 09:42:50 GMT
fn:boolean: EBV support for more item types

In #817, it was discussed that the current EBV semantics have been inspired a lot by XPath 1.0. Today, we have numerous other data types apart from strings, doubles, booleans, and nodes, and I believe it’s time to do justice to this by getting rid of the error for unsupported data types for fn:boolean.

We currently have:

Type | Rule to compute boolean value
--- | ---
node() | true()
xs:boolean | $item != 0 and not(is-NaN($item))
xs:untypedAtomic, xs:string, xs:anyURI | $item != ''

I have two options in mind:

  1. The easiest solution, which would come closest to JavaScript, would be to return true() for all other items. This would allow us to do simple checks like:
declare function local:byte-length($data as xs:base64Binary?) as xs:integer {
  (: instead of exists($data); utilizes the EXPath Binary Module :)
  if($data) then bin:length($data) else 0
};
  2. If we want to be more fine-grained, we could do justice to the specifics of 4 more types:

Type | Rule to compute boolean value
--- | ---
array(*) | array:size($item) != 0
map(*) | map:size($item) != 0
xs:base64Binary, xs:hexBinary | bin:length($item) != 0 or not($item = (xs:hexBinary(''), xs:base64Binary('')))

It would then be possible to write:

  • if($map) instead of map:size($map) != 0 or map:exists($map) (see #827 for the naming controversy).
    Note that if($map) will also return false() if $map is an empty sequence.
  • if($func) { $func(1, 2) } instead of exists($func)

In the last comments of #817, it was addressed that the behavior of existing code may change if errors are replaced by results. I hope we can live with that, as I cannot think of cases in which the EBV computation makes sense for items that always raise an error.

Which option do some of you prefer?

Issue #817 closed #closed-817

12 Nov at 09:07:52 GMT

EBV 4.0

Pull request #828 created #created-828

11 Nov at 22:51:30 GMT
516 Add position argument to HOF callbacks

I have added positional parameters to the following functions:

array:filter
array:fold-left
array:fold-right
array:for-each
array:for-each-pair
array:index-where
fn:every
fn:filter
fn:fold-left
fn:fold-right
fn:for-each
fn:for-each-pair
fn:index-where
fn:items-after
fn:items-before
fn:items-ending-where
fn:items-starting-where
fn:iterate-while
fn:partition
fn:some

Comments:

  • For fn:every and fn:some, the additional parameter seemed useful to me, as a positional variable has been requested for quantifier expressions in the past.
  • I’ve also added positional variables to folds.
  • I’ve unified and simplified the formal XPath/XQuery equivalencies in the rule sets.
  • I’ve dropped some XSLT equivalencies because I felt that the XQuery representations are more concise (I certainly won't mind if they're added back).

Closes #516.

Issue #827 created #created-827

09 Nov at 14:36:45 GMT
map:empty, map:exists ← array:empty, array:exists

We have array:empty and array:exists, but no equivalent functions for maps.

I think we have decided to live with the ambiguity (discussed in #229) that map:exists(map {}) will return false although the “map exists”. Same for arrays.

Issue #826 created #created-826

09 Nov at 14:03:11 GMT
Arrays: Representation of single members of an array

When introducing the new array features to some users, the for member syntax was welcomed by everyone.

However, there was some confusion (again, see my past feedback to the mailing list) about what the QT4 group considers to be “members of an array”, and about value records.

In particular, the “value record” representation of arrays led to questions that I didn’t have a good answer for. People didn’t understand why an array member was returned as a map, and why that map is (again) called “array member” or “value record”, a term no one associated with arrays (at least for now, which is not surprising, as it has just been introduced).

Next, due to atomization (as mentioned before), array:split allows us to omit the explicit ?value lookups that are required for array:members:

sum(array:members($array)?value)
sum(array:split($array))

I suppose I have been biased in my presentation, but I’ve failed to give good arguments to justify the current solution in the spec. The questions that I think need to be answered are:

  • How will people benefit from the (usually intermediate) map representation for array members?
  • What exactly do we win with array:members and array:of-members instead of using the existing array:join function, combined with the new array:split function?

Out of interest, I have rewritten the formal equivalencies for the array functions with array:split/array:join:

  • array:append

array:of-members((array:members($array), map{'value':$member}))
array:join((array:split($array), array { $member }))

  • array:build

array:of-members($input ! map { 'value': $action(.) })
array:join($input ! array { $action(.) })

  • array:filter

array:of-members(array:members($array) => filter(function($m) { $predicate($m?value) }))
array:join(array:split($array) => filter(function($m) { $predicate($m?*) }))

  • array:for-each

array:of-members(array:members($array) ! map { 'value': $action(?value) })
array:join(array:split($array) ! array { $action(?*) })

  • array:for-each-pair
array:of-members(
  for-each-pair(array:members($array1), 
    array:members($array2), 
    function($m, $n) {map{'value': $action($m?value, $n?value)}}))
array:join(
  for-each-pair(array:split($array1), array:split($array2),
    function($m, $n) { array { $action($m?*, $n?*) } }))
  • array:insert-before

array:of-members(array:members($array) => insert-before($position, map{'value':$member}))
array:join(array:split($array) => insert-before($position, array { $member }))

  • array:remove

array:of-members(array:members($array) => remove($positions))
array:join(array:split($array) => remove($positions))

  • array:reverse

array:of-members(array:members($array) => reverse())
array:join(array:split($array) => reverse())

  • array:slice

array:of-members(array:members($array) => slice($start, $end, $step))
array:join(array:split($array) => slice($start, $end, $step))

  • array:sort

array:of-members(array:members($array) => sort($collation, function($x) { $key($x?value) }))
array:join(array:split($array) => sort($collation, function($x) { $key($x?*) }))

  • array:subarray

array:of-members(array:members($array) => subsequence($start, $length))
array:join(array:split($array) => subsequence($start, $length))

  • array { $sequence }

array:of-members($sequence ! map { 'value': . })
array:join($sequence ! array { . })

  • [E1, E2, E3, ..., En]

array:of-members((map { 'value': E1 }, map { 'value': E2 }, map { 'value': E3 }, ... map { 'value': En }))
array:join((array { E1 }, array { E2 }, array { E3 }, ... array { En }))

  • $array?*

array:members($array) ! ?value
array:split($array) ! ?*

  • $array?$N / $array($N)

array:members($array)[$N]?value
array:split($array)[$N]?* (or array:get($array, $N))


As a side note, I noticed that the equivalence given for array:join must be buggy:

(: current equivalence presented in the spec :)
array:of-members($arrays ! array:members(.))

(: returns [ 1, 2, 3 ] :)
let $arrays := ([ 1 ], [ 2, 3 ])
return array:of-members($arrays ! array:members(.))

In conclusion, if I could choose, I would tend to drop array:members and array:of-members and rename array:split to array:members.

Issue #825 created #created-825

09 Nov at 02:03:55 GMT
array:members-at

The title says it all.

We have fn:slice and array:slice. We also have fn:items-at, but we have somehow failed to add the corresponding array:members-at function.

We could even think of the functions map:entries-at and map:values-for-keys. The first would return all map entries whose key is one of the keys supplied as an argument. The second would return the values of all map entries whose key is one of the keys supplied as an argument.

Here is a complete XPath 3.1 implementation:

let $members-at := function(
                 $input as array( *), 
                 $indexes as xs:integer*
                ) as array(*)*
 {
     for $ind in $indexes
       return [$input($ind)]
 }

Example:

Evaluating this expression:

let $members-at := function(
                 $input as array( *), 
                 $indexes as xs:integer*
                ) as array(*)*
 {
     for $ind in $indexes
       return [$input($ind)]
 }
  return
     $members-at([1, (2, 3), (4, 5, 6)], (1, 3) )

produces the wanted result:

[1], [(4,5,6)]

Issue #771 closed #closed-771

08 Nov at 20:59:30 GMT

British vs. American English

Pull request #824 created #created-824

08 Nov at 20:42:01 GMT
799 errors in examples; 738 section heading for fn:op

Fix #799 Fix #738

Pull request #823 created #created-823

08 Nov at 20:20:31 GMT
712 Extend array:sort to align with fn:sort

Fix #712

Issue #480 closed #closed-480

08 Nov at 14:17:43 GMT

Allow type promotion of xs:string to xs:anyURI

Issue #822 created #created-822

08 Nov at 12:41:17 GMT
XQuery, XQFO: Edits (pool)

XQuery spec:

  • [x] 4.3.4 Context Value Reference → 4.3.4 Context Value References (plural, in alignment with the other expressions)
  • [x] Move 4.3.4 before 4.3.3 Parenthesized Expressions (and after 4.3.2 Variable References)
  • [x] 4.19 Switch Expression → 4.19 Switch Expressions
  • [x] Try/Catch Expressions: There’s no CatchErrorList anymore

XQFO spec:

  • [x] Unify representation of equivalent examples, implementations, …see https://github.com/qt4cg/qtspecs/pull/828#issuecomment-1807222990
  • [x] Add History sections for new functions
  • [x] 9^XXX outputs "4451""3185"
  • [x] errors are signaled → raised (#783)

Issue #820 closed #closed-820

08 Nov at 11:40:39 GMT

FLWOR: Variable Bindings, coercion

Issue #821 created #created-821

08 Nov at 11:32:24 GMT
Annotations: Make default namespace explicit

In XQuery, the default namespace for annotations is http://www.w3.org/2012/. It’s the only namespace for which no prefix exists, and I think we should change that. ann feels like a reasonable choice to me as we tend to have short prefixes (such as err for errors).

Issue #820 created #created-820

08 Nov at 11:25:23 GMT
FLWOR: Variable Bindings, coercion

In the current XQuery 4, the coercion rules are applied to variable bindings of FLWOR expressions:

https://qt4cg.org/specifications/xquery-40/xquery-40-diff.html#id-binding-rules

I believe this is yet another feature that needs to be formally accepted. If that has already been done, this issue can be closed again immediately (maybe with a reference to the related issue, or the associated QT4 meeting).

Issue #65 closed #closed-65

08 Nov at 10:00:59 GMT

Support using different input/output element namespaces

Issue #238 closed #closed-238

08 Nov at 09:59:27 GMT

Support Invisible XML

Issue #789 closed #closed-789

08 Nov at 09:55:12 GMT

Serialization spec: terminology

Issue #807 closed #closed-807

08 Nov at 09:55:11 GMT

789 Serialization terminology [editorial]

Issue #791 closed #closed-791

07 Nov at 18:14:57 GMT

238: First draft of an fn:invisible-xml function

Issue #130 closed #closed-130

07 Nov at 18:12:22 GMT

New super/union type xs:binary?

Issue #815 closed #closed-815

07 Nov at 18:12:21 GMT

130,480 Binary Promotion

Issue #772 closed #closed-772

07 Nov at 18:08:54 GMT

Revise the fn:parse-html rules to make them clearer to follow.

Issue #809 closed #closed-809

07 Nov at 18:06:08 GMT

Placement of fn:atomic-equal in the specification

Issue #813 closed #closed-813

07 Nov at 18:06:07 GMT

809 Move fn:atomic-equal to section 14.2

Issue #806 closed #closed-806

07 Nov at 18:02:43 GMT

566 A few minor fixes for parse-uri

Issue #804 closed #closed-804

07 Nov at 17:58:07 GMT

Minor edits, XQFO chh. 7, 8

Issue #651 closed #closed-651

07 Nov at 17:45:13 GMT

fn:log → fn:message

Issue #803 closed #closed-803

07 Nov at 17:45:12 GMT

651: fn:log → fn:message

Issue #801 closed #closed-801

07 Nov at 17:45:07 GMT

nondeterministic vs non-deterministic

Issue #802 closed #closed-802

07 Nov at 17:45:06 GMT

801: non-deterministic → nondeterministic

Issue #660 closed #closed-660

07 Nov at 17:44:59 GMT

Static functions, default parameters

Issue #800 closed #closed-800

07 Nov at 17:44:58 GMT

660: Static functions, default parameters, XPST0017

Issue #797 closed #closed-797

07 Nov at 17:34:13 GMT

Edits to parse-uri()

Issue #704 closed #closed-704

07 Nov at 17:34:04 GMT

Context Value Expression → Context Value Reference

Issue #793 closed #closed-793

07 Nov at 17:34:03 GMT

704: Context Value Expression → Context Value Reference

Issue #819 closed #closed-819

07 Nov at 17:33:16 GMT

Fix markup error in example

Pull request #819 created #created-819

07 Nov at 17:33:11 GMT

Fix markup error in example

Issue #792 closed #closed-792

07 Nov at 17:21:35 GMT

783 XSLT: errors are raised

Issue #790 closed #closed-790

07 Nov at 17:21:24 GMT

129 XSLT40 and SER40 changes for context item -> value

Issue #775 closed #closed-775

07 Nov at 17:21:13 GMT

517: Reflected Christian Gruen's remarks

QT4 CG meeting 053 draft minutes #minutes-11-07

07 Nov at 17:20:00 GMT

Draft minutes published.

Issue #756 closed #closed-756

07 Nov at 17:18:10 GMT

JSON serialization - number formatting

Issue #818 created #created-818

07 Nov at 14:46:35 GMT
Foxpath integration

This is a placeholder issue for Syd Bauman’s suggestion on Slack to integrate Foxpath, or parts of it, in the standard.

Issue #817 created #created-817

06 Nov at 20:31:24 GMT
EBV 4.0

Yes, I dare to question the semantics of effective boolean values. The reason is that I never learned to fully like them. It seems obvious where the rules come from, and why they have been reasonable in previous versions of the language. From today’s perspective, I think there’s really some need to simplify and unify the rules, and I believe it’s possible with little effort and without endangering backward compatibility (provided that we are willing to drop errors and return results).

Some examples for the somewhat strange nature of the current rules:

  • boolean((<_>x</_>, <_>y</_>)) returns true, whereas boolean(('x', 'y')) raises an error.
  • boolean(xs:NCName('x')) returns true, whereas boolean(xs:QName('x')) raises an error.
  • boolean((<a/>, 1)) and boolean((1, <a/>)) may either return true or raise an error, depending on the implementation.

I believe it will make much more sense to

  1. check all values of the input equally (in analogy to the existential semantics of general comparisons), and
  2. use existence checks for more types instead of raising a clueless error.

The semantics would be tidied up a lot; it could look like this…

declare function ebv($input as item()*) as xs:boolean {
  some $item in $input satisfies typeswitch($item) {
    case xs:untypedAtomic | xs:string | xs:anyURI  return $item != ''
    case xs:numeric                                return $item != 0
    case xs:boolean                                return $item
    default                                        return true()
  }
};

…or, if we include more types, like this:

declare function ebv($input as item()*) as xs:boolean {
  some $item in $input satisfies typeswitch($item) {
    case xs:untypedAtomic | xs:string | xs:anyURI  return $item != ''
    case xs:numeric                                return $item != 0
    case xs:boolean                                return $item
    case xs:base64Binary                           return $item != xs:base64Binary('')
    case xs:hexBinary                              return $item != xs:hexBinary('')
    case array(*)                                  return array:size($item) != 0
    case map(*)                                    return map:size($item) != 0
    default                                        return true()
  }
};

(If we believe that it’s too progressive to accept all types, we could still raise an error for some specific types… although I don’t think that anyone would benefit from this choice).

As a result, EBV checks could also be used to check more than one item:

(: true if at least one tokenized string is non-empty :)
if(tokenize('a/', '/')) then ...
(: true if at least one number is unequal to 0 :)
if($numbers) then ...
(: true if at least one Boolean is true :)
if(false(), true(), true()) then ...

Nothing would change for the classical EBV checks: if($node/*), if($x = $y), if($ok), …

Regarding “1. check all values of the input equally”, one could argue that this might affect performance. I don’t actually think so: For node sequences, it will still be sufficient to retrieve only the first item. For mixed-type sequences, errors were raised in the past.

The resulting EBV could be easily combined with revised predicate semantics (#816).

Issue #816 created #created-816

06 Nov at 15:43:36 GMT
Predicates: Support for numeric sequences

Predicates provide a compact syntax for positional access to sequences but only single numbers are supported.

It would be handy to allow E[1, 2, 3] (E[3, 2, 1], E[1 to 3], etc.) as a shortcut for E[position() = (1, 2, 3)] (MarkLogic offers this possibility, if I remember correctly). We shouldn’t change the EBV syntax, and we should continue raising an error if the predicate sequence contains items other than numbers. Examples:

Expression | Result
--- | ---
(1 to 5)[2, 3] | 2, 3
(1 to 5)[3, 2] | 2, 3
(1 to 5)[2 to 3] | 2, 3
(1 to 5)[6, 'x'] | error
(1 to 5)[1 to 5, 'x'] | 1, 2, 3, 4, 5 or error (up to the implementation, similar to sequences that start with a node)
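The first four rows of the table above can be modelled with a short Python sketch that treats E[p1, p2, ...] as E[position() = (p1, p2, ...)]. `select_positions` is a hypothetical name, and this model always raises for a non-numeric item, which is only one of the two outcomes the last row allows.

```python
def select_positions(items, positions):
    """Model E[p1, p2, ...] as E[position() = (p1, p2, ...)]."""
    for p in positions:
        if isinstance(p, bool) or not isinstance(p, (int, float)):
            raise TypeError("predicate sequence contains a non-numeric item")
    wanted = set(positions)
    # Filtering keeps the original order, so [3, 2] and [2, 3] agree.
    return [item for pos, item in enumerate(items, start=1) if pos in wanted]


seq = list(range(1, 6))
print(select_positions(seq, [2, 3]))  # [2, 3]
print(select_positions(seq, [3, 2]))  # [2, 3]
```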

I bet this has already been discussed in the past…

QT4 CG meeting 053 draft agenda #agenda-11-07

06 Nov at 09:50:00 GMT

Draft agenda published.

Issue #538 closed #closed-538

06 Nov at 09:21:15 GMT

480: Attempt to allow xs:string to be 'promoted to' xs:anyURI

Pull request #815 created #created-815

03 Nov at 18:15:23 GMT
130,480 Binary Promotion

Introduces mutual promotion between xs:base64Binary and xs:hexBinary. Fix #130.

Reorganises the material on type promotion. Supersedes PR #538.

Issue #814 created #created-814

03 Nov at 08:53:14 GMT
XSLT: Rules for on-no-match="shallow-copy-all"

There are some details missing for the error handling of the built-in template processing for shallow-copy-all.

When the built-in template processes an array, the built-in rule constructs a value record for each member, and applies-templates to this value record, expecting the result to be a value record.

  • It's not stated whether the "value record" is extensible, that is, whether it can contain fields other than "value"
  • No error code is given, and it's not identified explicitly as a type error.

Similarly when a map is processed, the result is expected to be a key-value record, but the details of the error are not spelled out.

Pull request #813 created #created-813

02 Nov at 18:47:08 GMT
809 Move fn:atomic-equal to section 14.2

Fix #809

Issue #812 created #created-812

02 Nov at 10:59:58 GMT
Coercion Rules: Unifications

It has always been a challenge to teach the differences between conversion, coercion, promotion, casts and treats. Of course, we cannot get rid of the complexity, but I think it’s a good step forward that the conversion rules have recently been unified in the specification.

Maybe we can push it even further. I would suggest…

  1. renaming “Coercion Rules” to “Type Coercion” (…most other sections contain rules as well),
  2. making “Function Coercion” a subsection of “Type Coercion”, and
  3. making “Type Promotion” another subsection of “Type Coercion”.

I’m the wrong person to decide this, but maybe the full section can be moved to the Appendix, as it’s referenced all around the documents.

Issue #811 closed #closed-811

01 Nov at 19:16:17 GMT

Highlight changed functions in the ToCs and headings

Pull request #811 created #created-811

01 Nov at 19:16:10 GMT
Highlight changed functions in the ToCs and headings

Obviously, I should have done icons for functions that have changed as well. So now I have. I'm not convinced that the 🆙 emoji is sufficiently different from the 🆕 emoji. Maybe I should try to make them different colors or something. But it's a start.

Issue #810 closed #closed-810

01 Nov at 16:23:38 GMT

Fix line endings

Pull request #810 created #created-810

01 Nov at 16:23:29 GMT
Fix line endings

This PR adds a .gitattributes file that identifies some files explicitly as text files (and others explicitly as binary files).

After this PR is merged, I believe it will be the case that end-of-line handling will be correct on a per-platform basis. That is, if a Windows user checks out the repository, all the files will have PC-style line endings (CR followed by LF). If a Mac or Unix user checks out the repository, all the files will have Unix-style line endings (LF).

Commits should "do the right thing" to preserve the line endings appropriately.

Fingers crossed!

Issue #809 created #created-809

01 Nov at 14:35:28 GMT
Placement of fn:atomic-equal in the specification

On the XML.com slack, Pieter Lamers observes that fn:atomic-equal looks a little out of place in chapter 18 with the other map functions.

Issue #808 closed #closed-808

01 Nov at 13:22:03 GMT

Tweaks to highlight new functions in F&O

Pull request #808 created #created-808

01 Nov at 13:21:57 GMT
Tweaks to highlight new functions in F&O

Following a discussion on the XML.com slack, this is a little lunchtime hackery...

Any function listed in the new-functions section of the changes appendix or identified with an ednote that contains the string New in 4.0 will be marked as 🆕 in the specification. The 🆕 occurs in the ToC, the drop-down function list, and to the left of the section title.

(I'm just going to merge this as it has no spec changes and the effect won't be visible in the PR anyway.)

Pull request #807 created #created-807

01 Nov at 10:40:14 GMT
789 Serialization terminology [editorial]

"Instance of the data model" generally becomes "input tree"

An instance of the data model used to hold serialization parameters is now referred to as a "parameter document".

Fix #789

Pull request #806 created #created-806

01 Nov at 09:43:35 GMT
566 A few minor fixes for parse-uri

As CG observes in issue 566, there are still a couple of small problems with fn:parse-uri().

  1. The regular expressions used to parse the fragment identifier and query are incorrect. The URI specification allows ? to appear in a query string and # to appear in a fragment identifier, so the expressions have been rewritten to match everything after the first ? and #, respectively, even if they contain additional ? or # characters.
  2. The description of how to interpret the regular expression for parsing a Windows file path preceded by slashes was incorrect. It caused the leading "/" to be lost. That's been corrected.
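
The corrected behaviour in item 1 can be sketched as follows. This is a Python illustration, not the regular expressions used in the specification; the pattern and the name `split_query_fragment` are hypothetical. The fragment is taken to be everything after the first #, and the query everything between the first ? and that #, so any later ? or # characters stay inside their component.

```python
import re

# Hypothetical pattern (not the one in the spec): the leading part stops at
# the first "?" or "#", the query runs up to the first "#", and the fragment
# takes everything that remains.
_URI_TAIL = re.compile(r"^([^?#]*)(?:\?([^#]*))?(?:#(.*))?$", re.DOTALL)


def split_query_fragment(uri):
    """Return (leading part, query or None, fragment or None)."""
    match = _URI_TAIL.match(uri)  # the pattern matches any string
    leading, query, fragment = match.groups()
    return leading, query, fragment


if __name__ == "__main__":
    print(split_query_fragment("http://example.com/a?x=1?y=2#frag#more"))
```

Under these assumptions the extra ? remains part of the query and the extra # remains part of the fragment, rather than truncating either component.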

Issue #805 closed #closed-805

01 Nov at 08:42:37 GMT

Improve formatting of FO examples

Pull request #805 created #created-805

31 Oct at 22:02:26 GMT
Improve formatting of FO examples

Stylesheet changes to improve the formatting of F&O examples. (But they still aren't perfect...!)

  1. Variables used by the examples are pulled out under a separate heading.

  2. If any examples include long lines, then single-column ("wide") format is used automatically.

  3. In two-column layout, if the left-hand column uses "eg" format, then so does the right-hand column. (But it would be nice to get rid of the grey background for this case).

Pull request #804 created #created-804

31 Oct at 21:32:07 GMT
Minor edits, XQFO chh. 7, 8

In addition to a minor typo, note the following:

  • rearrangement of a rule for $input (the original syntax, mixing types and derived types, confused me)
  • examples in X-from-Y duration functions to help illustrate the problem when mixing the two components of a duration.

Issue #753 closed #closed-753

31 Oct at 19:22:57 GMT

65: Allow xmlns="xxx" to NOT change the default namespace for NameTests

QT4 CG meeting 052 draft minutes #minutes-10-31

31 Oct at 17:30:00 GMT

Draft minutes published.

Issue #770 closed #closed-770

31 Oct at 17:12:08 GMT

566: Use fn:decode-from-uri in fn:parse-uri

Issue #469 closed #closed-469

31 Oct at 17:11:24 GMT

array:of-members, map:of-pairs: Signatures, Examples

Issue #782 closed #closed-782

31 Oct at 17:11:23 GMT

469: array:of-members, map:of-pairs: Signatures, Examples

Issue #778 closed #closed-778

31 Oct at 17:11:13 GMT

XQFO edits 5.4-5.6

Issue #784 closed #closed-784

31 Oct at 17:11:10 GMT

fos xsd

Issue #785 closed #closed-785

31 Oct at 17:11:00 GMT

777: updated history

Issue #786 closed #closed-786

31 Oct at 17:10:51 GMT

695: Added xref to fn:slice()

Issue #787 closed #closed-787

31 Oct at 17:10:36 GMT

783(part) - Editorial changes to Serialization spec

Pull request #803 created #created-803

31 Oct at 17:01:07 GMT
651: fn:log → fn:message

Closes #651

Pull request #802 created #created-802

31 Oct at 12:47:59 GMT
801: non-deterministic → nondeterministic

Closes #801

Issue #801 created #created-801

31 Oct at 12:45:57 GMT
nondeterministic vs non-deterministic

In the specs, both “nondeterministic” and “non-deterministic” can be found. The first one is more frequent, so I guess it’s the one that’s preferred.

Issue #359 closed #closed-359

31 Oct at 10:57:15 GMT

fn:void: Absorb result of evaluated argument

Pull request #800 created #created-800

31 Oct at 10:51:03 GMT
660: Static functions, default parameters, XPST0017

I’ve chosen XPST0017 over XPST0003 (it felt more intuitive to me).

Closes #660.

Issue #799 created #created-799

31 Oct at 10:27:18 GMT
Errors in F&O spec examples

In fn:expanded-QName, http:/example.com should be http://example.com.

In map:merge, there's a stray "(" in

map:merge((
  ($week, map { 7: "Unbekannt" })
)

Pull request #798 created #created-798

31 Oct at 10:19:43 GMT
479: fn:deep-equal: Input order

I decided to follow my initial suggestion and add unordered to fn:deep-equal instead of adding a separate function for it. My motivation:

  • It’s convenient to be able to use the new option in combination with the other options.
  • It will be used more often than many other options we’ve added recently.
  • It seemed pretty straightforward to add, both in terms of documentation and implementation.

Closes #479

Pull request #797 created #created-797

31 Oct at 03:33:19 GMT
Edits to parse-uri()

Attn @ndw -- these edits pertain to 6.6.1 fn:parse-uri and require your careful review.

  • hierarchical option appears in the rules of the function but not its preamble
  • "This function is described..." moved to anticipate the actual narrative.
  • I was a bit thrown by the phrase "This approach...is not implementation advice" because "approach" is vague and the reader is left with the impression that a good deal of something that follows is non-normative. We offer non-normative prose in the rules, but that informal prose is normally followed by an equivalent description that is normative. I did not make any edits in this area, but I would recommend clarification as to exactly what parts of the narrative are normative and which are non-normative.
  • The period-to-colon change at 27573 is to make sure the reader remains aware of the governing "If..." clause.
  • I deleted a paragraph that attempted to distill the rather complicated description of fn:decode-from-uri. A mere xref provides a cleaner narrative here and a more accurate description of uri decoding, and the xref is within the same document, so should not pose a burden on the reader.
  • For the query-parameters there was a discrepancy in the data model presented (in the preamble versus in the function rules), and I opted for the array-of-maps model. If it's supposed to be the other way (simple map), let me know and I'll revise.
  • query-segments versus query-parameters; I went with the latter.

Let me know where things ain't right.

Issue #796 created #created-796

30 Oct at 17:28:31 GMT
allow explicit type expressions in XPath variable bindings

It would be useful to be able to write

let $n as xs:integer := some_expr ....

and also maybe

for $s as xs:string, $p as my:name return....

This might occur e.g. in the body of a function, for example... and would help type safety and debugging.

Pull request #795 created #created-795

30 Oct at 15:02:13 GMT
655 fn:sort-with

Closes #655

Pull request #794 created #created-794

30 Oct at 14:05:07 GMT
216: fn:unparsed-text: End-of-line characters

Closes #216. See this issue for more details on the proposed change.

Pull request #793 created #created-793

30 Oct at 13:33:27 GMT
704: Context Value Expression → Context Value Reference

Closes #704

Pull request #792 created #created-792

30 Oct at 12:52:11 GMT
783 XSLT: errors are raised

See issue #783.

Errors are raised, not signalled or reported or generated or thrown.

Pull request #791 created #created-791

30 Oct at 12:37:31 GMT
238: First draft of an fn:invisible-xml function

Pursuant to action QT4CG-051-02, here is a proposal for fn:invisible-xml.

There are a number of different design choices that could be made. For example, one could argue that a function of the form fn:invisible-xml($grammar as xs:anyURI, $input as xs:anyURI) would be the easiest thing for users in many cases. But not in all cases. You might want versions where either the grammar or the input were strings instead of URIs.

I've attempted to craft the smallest proposal that could get the job done.

Pull request #790 created #created-790

30 Oct at 11:23:12 GMT
129 XSLT40 and SER40 changes for context item -> value

Proposes XSLT 4.0 and Serialization 4.0 changes resulting from the generalization of context item to context value in XPath 4.0.

The serialization changes are purely editorial.

In XSLT, we acknowledge the introduction of the context value in XPath but don't take advantage of it; at the XSLT level, the context value for an instruction is still always a single item. The only technical change is that we allow xsl:evaluate to pass any value as the context value to the dynamically evaluated expression (while retaining the name of the relevant attribute "context-item").

Fulfils action QT4CG-046-01

QT4 CG meeting 052 draft agenda #agenda-10-31

30 Oct at 11:10:00 GMT

Draft agenda published.

Issue #789 created #created-789

30 Oct at 11:09:46 GMT
Serialization spec: terminology

The serialization spec makes extensive use of the phrase "an instance of the data model". This phrase is defined to be a synonym of "value" or "sequence", and it seems to add a lot of words without adding any clarity. In many cases the context makes clear that it is actually referring to a tree rooted at a document node.

In section 3.1 "Setting Serialization Parameters by Means of a Data Model Instance" it would be helpful to use a more specific phrase, for example "a parameter document".

Elsewhere the phrase is often used to mean "the value being serialized", and again, it would be helpful to use a more specific phrase, perhaps a term that is quite distinctive such as "the payload". (It's sometimes referred to as "the result tree", but that assumes that the value being serialized is the result of a query or stylesheet.)

Issue #788 created #created-788

30 Oct at 08:48:42 GMT
New function fn:annotate()

I propose a function fn:annotate() which will add annotations to a function item. It will create a new function item that differs from the original only in its annotations.

Currently annotations are a dynamic property of function items, but they can only be set in very limited ways. Allowing them to be set dynamically creates a lot of opportunities. For example, there is currently no way to set annotations on a map or array, but one could define annotations, for example, to indicate that a map should hold entries ordered by key, or that an array should use a "sparse" implementation; there is scope both for spec-defined and vendor-defined (and perhaps even user-defined) annotations.

Some of the possibilities this would open up have been outlined in other issues. At present, though, we have a capability in the data model that is underexploited, and an fn:annotate() function is a conceptually simple addition, which can be justified purely on the grounds of completeness - if something exists in the data model, surely it should have getters and setters?

Issue #635 closed #closed-635

30 Oct at 08:43:32 GMT

451: Schema compatibility

Issue #765 closed #closed-765

30 Oct at 08:41:48 GMT

XQuery version declaration: upgrade to 4.0

Issue #766 closed #closed-766

30 Oct at 08:41:47 GMT

765 Update version references etc to 4.0 status

Pull request #787 created #created-787

27 Oct at 07:46:18 GMT
783(part) - Editorial changes to Serialization spec
  1. Errors are raised, not signaled
  2. Cross-references point to 4.0 specs rather than 3.0/3.1
  3. Some internal markup changes for tidiness

Partial fix for #783 as it affects the serialization spec.

Issue #695 closed #closed-695

26 Oct at 22:22:22 GMT

Step in RangeExpression

Pull request #786 created #created-786

26 Oct at 22:21:48 GMT
695: Added xref to fn:slice()

This closes #695 .

Issue #777 closed #closed-777

26 Oct at 22:09:38 GMT

new replace() parameter $action

Pull request #785 created #created-785

26 Oct at 22:07:59 GMT
777: updated history

Closes issue #777

Pull request #784 created #created-784

26 Oct at 21:54:36 GMT
fos xsd

xsd:assert is illegal in schema version 1.0; updated to specify version 1.1.

Issue #783 created #created-783

26 Oct at 21:49:22 GMT
Editorial: errors are raised (not reported, signaled, generated, or thrown).

The most common phrase we use when describing an error condition is "A (dynamic|static|type) error is raised if ...".

There are other cases where we use the verb "reported" or "signaled" (and occasionally, "thrown" or "generated"). We should avoid these verbs, partly in the interests of consistency, and partly because they are misleading: an error is not reported if it is caught by a try/catch, but it is still raised.

XSLT also commonly uses the phrase "It is a (dynamic|static|type) error if ... " which is also acceptable.
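
The distinction between "raised" and "reported" can be illustrated with a small Python analogue, where try/except stands in for try/catch. The function names are hypothetical and this is only a sketch of the terminology, not of any XSLT or XQuery behaviour.

```python
def parse_count(text):
    # The error is raised here, whatever the caller later does with it.
    if not text.isdigit():
        raise ValueError(f"not a non-negative integer: {text!r}")
    return int(text)


def count_or_default(text, default=0):
    try:
        return parse_count(text)
    except ValueError:
        # Raised and then caught: nothing is ever reported to the user.
        return default


if __name__ == "__main__":
    print(count_or_default("12"))
    print(count_or_default("abc"))
```

In the second call an error is raised inside `parse_count`, but because it is caught no report reaches the user, which is why "raised" is the more accurate verb.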

Pull request #782 created #created-782

26 Oct at 13:26:37 GMT
469: array:of-members, map:of-pairs: Signatures, Examples

I’ve eventually renamed $pairs to $input (I didn’t rename $input to $members as initially suggested, as we have $member parameters in other functions that are of type item()*, not record(value as item()*)).

Closes #469

Issue #776 closed #closed-776

26 Oct at 11:37:57 GMT

/etc/XT40 does not get built

Issue #781 closed #closed-781

26 Oct at 11:37:56 GMT

Fix etc_XT40 output file

Pull request #781 created #created-781

26 Oct at 11:37:50 GMT
Fix etc_XT40 output file

Fix #776

You'll need to pull and rebase master after I merge this.

Issue #764 closed #closed-764

26 Oct at 07:25:17 GMT

XQuery: Simplify module imports

Issue #780 created #created-780

26 Oct at 04:04:48 GMT
format-number() etc incompatibility

We have changed format-number() and some XSLT functions including system-property(), function-available() etc so that the QName-valued argument is now declared with type union(xs:string, xs:QName) rather than xs:string. I believe that there are edge cases where this is an incompatible change -- for example if the supplied value is an xs:anyURI value. I think the edge cases are probably sufficiently obscure and unlikely to occur in practice that it is sufficient to document them.

Issue #779 created #created-779

26 Oct at 03:18:30 GMT
Hash/checksum function

I propose a new function for the core XPath functions, here called fn:hash() for the sake of discussion. The goal is to give XPath users access to CRC, checksum, and cryptographic hash functions.

Rationale

Simple checksum functions, such as those from the Fletcher family, are relatively easy to write in a host language such as XSLT. More complex ones are far more challenging to write, and may incur serious performance penalties. For example, from the TAN function library, see the MD5 checksum/hash functions. (Yes, one day I thought it would be fun to try to implement the MD5 algorithm in XSLT.) Most programming languages in which an implementation is written have access to cryptographic libraries that are highly performant.

Hash functions are widely used, and certainly important in XML-based workflows, whether as filenames, database fields, etc.

The closest comparable existing function is generate-id(), but this was designed as an identifier for nodes.

In short, I believe there is a significant need that outstrips current functionality.

In the draft below, I have adopted only the MD-5 algorithm as a core requirement, to catalyze discussion. I have assumed the user wants the string form of the output, not the raw bits. I have not tried to flesh out prose that would warn users away from security complacency.

For discussion, here is a list of relevant algorithms. I look forward to community feedback.

fn:hash Draft Specification

Summary

This function takes as input a string or octet sequence and returns a string representation of the results from a specified hash, checksum, or cyclic redundancy check function.

Signature

fn:hash(
   $value as union(xs:string, xs:hexBinary, 
                   xs:base64Binary)?   := fn:string(.),
   $algorithm as xs:string             := "md5"
) as xs:string?

Properties

The zero-argument form of this function is deterministic, context-dependent, and focus-dependent.

The one- and two-argument forms of this function are deterministic, context-independent, and focus-independent.

Rules

If the zero-argument version of the function is used, the result is the same as calling the one-argument version, with $value set to fn:string(.).

If the one-argument version of the function is used, the result is the same as calling the two-argument version, with $algorithm set to "md5".

The effective value of $algorithm is the value of the expression fn:lower-case(fn:replace($algorithm, '\W+', '')).

If $value is the empty sequence, or a string of zero length, the function returns the empty sequence.

If $value is an instance of xs:string it is cast to xs:hexBinary on the basis of UTF-8 encoding. If $value is an instance of xs:base64Binary it is cast to xs:hexBinary.

The function returns an xs:string representation of the bytes returned by passing the xs:hexBinary value of $value as an octet sequence through the specified hash or checksum function.

Conforming implementations MUST support md5 and the associated MD5 Message-Digest algorithm defined by RFC 6151 (update to RFC 1321). They MAY support other checksum and hash functions with implementation-defined semantics.

Error Conditions

A dynamic error is raised [err:XXXXXXX] if the effective value of $algorithm is not one of the values supported by the implementation.

Notes

  • The MD5 algorithm is normally not used for cryptographic purposes. [More cautionary prose about not assuming that something can be trusted as secure.]

Examples

Expression | Result
-- | --
fn:hash("abc") | 900150983cd24fb0d6963f7d28e17f72
fn:hash("ABC") | 902fbdd2b1df0c4f70b4a5d23525e932
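
To make the draft rules concrete, here is a minimal Python sketch of how they might behave, using the standard `hashlib` module. The name `hash_sketch` is hypothetical, only md5 is handled, the `ValueError` stands in for the unnamed dynamic error, and the order in which the checks are applied is an assumption, since the draft does not state it.

```python
import hashlib
import re


def hash_sketch(value, algorithm="md5"):
    """Illustrative reading of the draft fn:hash rules (md5 only)."""
    # Effective algorithm name: strip non-word characters, then lower-case,
    # so "MD-5" and "md5" are treated alike.
    effective = re.sub(r"\W+", "", algorithm).lower()
    if effective != "md5":
        # Stand-in for the draft's placeholder error code.
        raise ValueError(f"unsupported algorithm: {algorithm!r}")

    # The empty sequence (None here) or a zero-length string yields the
    # empty sequence.
    if value is None or (isinstance(value, str) and value == ""):
        return None

    # Strings are hashed as their UTF-8 octets; binary input is used as-is.
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    print(hash_sketch("abc"))
```

The string result is the lower-case hexadecimal digest, matching the form shown in the examples above.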

Pull request #778 created #created-778

26 Oct at 01:53:07 GMT
XQFO edits 5.4-5.6

Light edits for consistency.

  • Examples in char() have been reordered so that the most interesting and useful examples are at top.
  • Substring functions were difficult to read in the specs because of the overly long collation URI. These are now bound to a variable, to enhance legibility.
  • Schema location fixed.
  • extra examples in string-length(), to help users recognize the role of combining characters.

Issue #777 created #created-777

26 Oct at 01:35:10 GMT
new replace() parameter $action

In the current draft for replace() the history log states that the new parameter $action has not yet had community review. This ticket provides a placeholder for attention and discussion.

If the CG has already reviewed and accepted $action, comment here and I will update the function history accordingly.

Issue #776 created #created-776

25 Oct at 21:54:10 GMT
/etc/XT40 does not get built

The serialisation spec has xspecref spec="XT31" references that need to be updated to XT40. But for some reason the /etc/XT40 file isn't being built, so these references fail to resolve.

Pull request #775 created #created-775

25 Oct at 21:14:06 GMT
517: Reflected Christian Gruen's remarks

I deleted my fork, got a new one and applied the latest changes. Seems this fixed the issues.

Issue #773 closed #closed-773

25 Oct at 20:38:33 GMT

QT4CG-051-04/05: Reflected today's meeting editorial suggestions

Issue #768 closed #closed-768

25 Oct at 16:44:29 GMT

Details about decode-for-uri

Issue #769 closed #closed-769

25 Oct at 16:44:28 GMT

768: details about decode-from-uri

Issue #517 closed #closed-517

25 Oct at 15:11:04 GMT

fn:chain (before: fn:multi-compose)

Issue #758 closed #closed-758

25 Oct at 15:10:42 GMT

XQFO UCA keyword strength, quaternary setting

Issue #686 closed #closed-686

25 Oct at 15:09:13 GMT

XQFO presentation of diagnostic functions

Issue #774 created #created-774

25 Oct at 12:46:53 GMT
What should be percent-encoded in a URI?

(This is related to fn:parse-uri, fn:build-uri, and fn:decode-from-uri. I'm making it a distinct issue to call it out and see if we can get consensus on the right answer. I've come to the conclusion that what I've implemented isn't justified by any specific reading of the relevant specifications, so it's wrong.)

This question is slightly tricky because encoding (or not encoding) characters can change the meaning of the URI.

If you trace your way through the ABNF in RFC 3986 you eventually get to:

   path-abempty  = *( "/" segment )
   path-absolute = "/" [ segment-nz *( "/" segment ) ]
   path-noscheme = segment-nz-nc *( "/" segment )
   path-rootless = segment-nz *( "/" segment )
   path-empty    = 0<pchar>

The various segment nonterminals boil down to some number of pchar. (The segment-nz form is used to forbid a zero length string before the first /; the segment-nz-nc form is used for a URI that does not begin with a scheme: it must have a non-zero length string before the first / that additionally must not contain a :.)

  pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Where:

  unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
  pct-encoded   = "%" HEXDIG HEXDIG
  sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

I think it follows that only the following characters may appear unencoded:

Upper- and lower-case A to Z, the digits 0 to 9, -, ., _, ~, %, !, $, &, ', (, ), *, +, ,, ;, =, :, and @.

or, conversely, that any characters other than those must be encoded.

Observe that / isn't among the characters that are not encoded. That's because the / in hierarchical URIs divides the segments. It's part of the URI syntax. That's why a literal forward slash that appears somewhere in an actual path segment must be encoded as %2F and it's why casually unencoding such a character changes the URI.

Observe also that there's no provision here for encoding a space with +, even though it's fairly common. I've carried that error through to some of the test results for fn:build-uri and fn:parse-uri. I'll fix those tests.

I'm going to try changing my implementation to follow the rule above and see what happens.

If you think this analysis is incorrect, please explain where I went wrong.
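
The rule derived above can be sketched in Python with `urllib.parse.quote`. This is only an illustration of the analysis, not the behaviour of fn:build-uri or fn:parse-uri; the names `PCHAR_SAFE` and `encode_path_segment` are hypothetical. The sketch assumes the input is raw, not-yet-encoded text, so a literal % is itself encoded rather than being treated as the start of a pct-encoded triplet.

```python
from urllib.parse import quote

# Characters allowed unencoded in a path segment per the pchar production:
# unreserved punctuation, sub-delims, ":" and "@". Letters and digits are
# always left alone by quote().
PCHAR_SAFE = "-._~" + "!$&'()*+,;=" + ":@"


def encode_path_segment(segment):
    """Percent-encode one raw path segment as UTF-8 octets."""
    # "/" is deliberately absent from the safe set: inside a segment it must
    # become %2F, otherwise it would be read as a segment separator.
    return quote(segment, safe=PCHAR_SAFE)


if __name__ == "__main__":
    print(encode_path_segment("a/b c"))
```

Note that a space becomes %20 and a literal + is left as +, consistent with the observation that the ABNF makes no provision for encoding a space as +.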

Pull request #773 created #created-773

24 Oct at 18:48:15 GMT
QT4CG-051-04/05: Reflected today's meeting editorial suggestions

All 3 editorial suggestions made in today's meeting are now reflected.

Pull request #772 created #created-772

24 Oct at 17:17:32 GMT
Revise the fn:parse-html rules to make them clearer to follow.

This actions the comment in #767 that the rules for the function are unclear.

The SVG element name handling is correct in the spec per the comments in https://github.com/qt4cg/qt4tests/issues/57, so doesn't need changing.

QT4 CG meeting 051 draft minutes #minutes-10-24

24 Oct at 16:10:00 GMT

Draft minutes published.

Issue #734 closed #closed-734

24 Oct at 16:06:35 GMT

517: fn:chain

Issue #763 closed #closed-763

24 Oct at 16:05:23 GMT

686: XQFO diagnostic function documentation

Issue #762 closed #closed-762

24 Oct at 16:04:34 GMT

758: XQFO minor edits 3

Issue #653 closed #closed-653

24 Oct at 16:04:19 GMT

XQuery - option to suppress entity expansion

Issue #749 closed #closed-749

24 Oct at 16:04:18 GMT

653: Add string literals E".." and L".." to control entity expansion

Issue #647 closed #closed-647

24 Oct at 16:04:07 GMT

XQuery: import schema with multiple location hints

Issue #659 closed #closed-659

24 Oct at 16:04:06 GMT

647: schema location hints

Issue #383 closed #closed-383

24 Oct at 16:03:15 GMT

fn:deep-equal: Order of child elements (unordered-elements)

Issue #771 created #created-771

24 Oct at 12:42:18 GMT
British vs. American English

Purely editorial: Some words in the specification seem to be British English (organised, generalised, behaviour), whereas the majority of the text is American English. Should this be fixed?

I assume there are tools to get this straight? I am too rarely affected by it to know more about it…

Pull request #770 created #created-770

23 Oct at 14:03:21 GMT
566: Use fn:decode-from-uri in fn:parse-uri

I think this is sufficient to close issue 566. @ChristianGruen could you let me know if you agree?

Pull request #769 created #created-769

23 Oct at 13:26:43 GMT
768: details about decode-from-uri

Closes #768

Issue #768 created #created-768

23 Oct at 12:53:14 GMT
Details about decode-for-uri

I'm unclear on how %XA should be decoded. Is that %X (an error) followed by A, or is that %XA (an error)?

Also, a couple of typos:

s/to an sequence/to a sequence/
s/that are no hexadecimal/that are not hexadecimal/

QT4 CG meeting 051 draft agenda #agenda-10-24

23 Oct at 10:10:00 GMT

Draft agenda published.

Issue #767 created #created-767

23 Oct at 09:20:20 GMT
parse-html(): case of SVG element names

Gunther Rademacher points out at https://github.com/qt4cg/qt4tests/issues/57 that the expected test results for parse-html() expect SVG elements to be output in lower case; and as far as I can see, this is consistent with the spec. But is it useful? It means we're producing XML that isn't valid according to the SVG schema, and that presumably might be rejected by tools that expect to handle valid SVG.

In passing, I note that it's very hard to work out from the parse-html() spec what the function actually does; it's somehow written as if it's obvious, but it never gets around to actually saying it explicitly. It also doesn't say what happens when the method option is omitted, as it is in these tests.

In addition, it seems that we don't have any tests that exercise the various options of the function, e.g. the ability to consume binary as well as character input.

Pull request #766 created #created-766

19 Oct at 22:45:58 GMT
765 Update version references etc to 4.0 status

Includes the following changes:

  • Clarification of the syntax and semantics of the version number in the XQuery version declaration
  • Updates to front and back matter to reflect the current status of the 4.0 specs
  • Update cross-spec references to point to the 4.0 specs rather than 3.1 specs
  • A DTD change from spec="SER40" to spec="SE40" to reflect the name of the /etc file generated by the gradle script.

Fix #765

Issue #765 created #created-765

19 Oct at 10:33:18 GMT
XQuery version declaration: upgrade to 4.0

The text at §5.1 does not yet acknowledge version="4.0".

We should also be a bit more precise about the syntax, which currently says that the version number is two integers separated by a dot. We should give a regular expression to be absolutely clear that we mean decimal integers, no underscores allowed.

There are a few other version references to tidy up at the same time, e.g. "XQuery is designed to meet the requirements identified by the W3C XML Query Working Group [[XQuery 3.1 Requirements]]."
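
As an illustration of the kind of regular expression the issue asks for, here is a hedged Python sketch. The pattern is a guess at "two decimal integers separated by a dot, no underscores", not text from the specification, and `is_version_number` is a hypothetical name.

```python
import re

# Assumed pattern: one or more ASCII digits, a dot, one or more ASCII digits.
# [0-9] is used rather than \d so that non-ASCII digits are rejected too.
VERSION_NUMBER = re.compile(r"[0-9]+\.[0-9]+")


def is_version_number(text):
    """True if text is exactly two decimal integers separated by a dot."""
    return VERSION_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("4.0", "3.1", "4_0.0", "4"):
        print(candidate, is_version_number(candidate))
```

Spelling the digits out as [0-9] is one way to make "decimal integers" unambiguous, since in some regex dialects \d also matches digits from other scripts.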

Issue #764 created #created-764

19 Oct at 09:07:07 GMT
XQuery: Simplify module imports

With XQuery, you can import library modules as follows:

import module namespace utils = 'http://project.org/utils' at 'org/project/utils.xq';
utils:action()

If an implementation has a mechanism to resolve a namespace URI, you can also do this:

import module namespace utils = 'http://project.org/utils';
utils:action()

It would be desirable to simplify this further, and be able to do one of…

(: already legal, but no namespace binding takes place :)
import module namespace 'http://project.org/utils';
(: not supported yet :)
import module namespace http://project.org/utils;
(: no URI rewriting needed for relative paths :)
import module namespace org/project/utils;

utils:action()

…and…

  1. rewrite namespace URIs to local paths (this is how we proceed: https://docs.basex.org/wiki/Java_Bindings#URI_Rewriting)
  2. extract the prefix from the path.

I can imagine that the current ways to simplify imports depend a lot on the implementation used. Suggestions are welcome.

Issue #759 closed #closed-759

19 Oct at 05:41:18 GMT

Serialization: JSON Parameters

Pull request #763 created #created-763

19 Oct at 04:44:59 GMT
686: XQFO diagnostic function documentation

I think what prompted my original ticket was the use of "output". That term is used scores of times in the specs, and it always implies the output of a function within the XPath expression. Although the term is qualified in the diagnostic functions with the adjective "trace" or "log", it's tempting to think of the function's actual output. Coming back a few weeks later, I think some gentle massaging (instead of a convoluted preamble) can help readers make the requisite adjustment, reflected in the present PR.

I have offered two examples that I hope show the practical side of these functions. I might have missed something though, so please chime in.

Pull request #762 created #created-762

19 Oct at 03:38:34 GMT
758: XQFO minor edits 3

This covers edits to XQFO 4 through 5.3, and incorporates the suggestion in #758

Issue #746 closed #closed-746

18 Oct at 20:54:43 GMT

break-when -> split-when in fn:partition

Pull request #761 created #created-761

18 Oct at 20:42:49 GMT
554/754 Simplify the new transitive-closure function

Drops the $min and $max parameters, with the effect that this corresponds much more closely to the general computer-science definition of a transitive closure.

The function now computes the set of nodes delivered by the transitive closure of the supplied $step function when applied to a given $start node, rather than returning a function item that must then be applied to the chosen $start node. This is hopefully easier for most users to understand, and does not lose any useful functionality.

The $min parameter of the old function is effectively forced to its default value of 1, and the $max value to its default of infinity.

Fix #754 Fix #554

This PR addresses the main points of #554 in making the function correspond more closely to the mathematical (or at least the computer-science) definition of transitive closure. It doesn't implement other ideas in #554, like returning the depth of search alongside the actual closure. That's because I believe in the principle that wherever possible a function should do one thing in as simple a way as possible.
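
The simplified behaviour can be sketched in Python as a reachability search. This is an illustration of the general idea only, not the specification's definition: the name `transitive_closure`, the use of hashable values in place of nodes, and the discovery-order result are all assumptions.

```python
from collections import deque


def transitive_closure(start, step):
    """Values reachable from start by applying step one or more times."""
    seen = set()
    result = []
    # Begin with the result of one application of step, so that start itself
    # is included only if some chain of steps leads back to it.
    frontier = deque(step(start))
    while frontier:
        node = frontier.popleft()
        if node in seen:
            continue  # already visited; this also guards against cycles
        seen.add(node)
        result.append(node)
        frontier.extend(step(node))
    return result


if __name__ == "__main__":
    graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
    print(transitive_closure(1, lambda n: graph[n]))
```

Starting the search from `step(start)` rather than from `start` corresponds to a minimum of one step, and the absence of a depth limit corresponds to a maximum of infinity.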

Issue #760 created #created-760

18 Oct at 08:47:05 GMT
Serialize functions: consistency

We should be more ambitious about ensuring consistency when serializing data back to its original representation, and how to achieve it. This is the status quo:

Format | Serialize Function | Parse Function
--- | --- | ---
XML | fn:serialize($input) | fn:parse-xml
JSON | fn:serialize($input, map { 'method': 'json' }) | fn:parse-json
JSON | fn:xml-to-json($input) | fn:json-to-xml
XHTML | fn:serialize($input, map { 'method': 'xhtml' }) | fn:parse-html
HTML | fn:serialize($input, map { 'method': 'html' }) | fn:parse-html
CSV | still missing | fn:parse-csv and variants

In BaseX, the XML created by fn:json-to-xml can already be serialized back to JSON as follows (related: #759):

serialize(
  <map xmlns="http://www.w3.org/2005/xpath-functions">
    <number key="A">1</number>
  </map>,
  map { 'method': 'json', 'json': map { 'format': 'basic' }
})

For CSV, we’ve introduced a CSV serialization method that supports all CSV flavors we support (see CSV Module for more details):

serialize(
  <map xmlns="http://www.w3.org/2005/xpath-functions">
    <number key="A">1</number>
  </map>,
  map { 'method': 'csv', 'csv': map { 'format': 'direct' } }
)

Related (parse functions): #748

Issue #759 created #created-759

18 Oct at 08:41:04 GMT
Serialization: JSON Parameters

We have more and more serialization parameters that are specific to JSON (related: #530, #756, #576, #641). It won’t get better with each new option we add, so maybe it’s time to introduce a custom serialization parameter for JSON (and possibly other methods):

serialize(
  map { "abc": 123 },
  map { 'method': 'json', 'json': map { 'escape-solidus': true(), 'number-format': '#' } }
)

If we want to be able to use output options in the XQuery prolog, we additionally need to define a syntax for serializing the options to a single string:

declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
declare option output:method 'json';
declare option output:json 'escape-solidus=yes,number-format=#';
map { "abc": 123 }

Both suggestions are inspired by our own implementation; we use it for our custom JSON and CSV options.
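To make the single-string form concrete, here is a minimal Python sketch of how such an option string might be split back into a map. The function name `parse_option_string` and the yes/no to boolean mapping are assumptions for illustration; the issue does not define the exact grammar.

```python
from typing import Dict, Union

OptionValue = Union[bool, str]


def parse_option_string(value: str) -> Dict[str, OptionValue]:
    """Split 'name=value,name=value' into a dict (hypothetical grammar)."""
    options: Dict[str, OptionValue] = {}
    for pair in value.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        name, sep, raw = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in option {pair!r}")
        raw = raw.strip()
        # Assumption: 'yes'/'no' stand for booleans, everything else stays a string.
        options[name.strip()] = {"yes": True, "no": False}.get(raw, raw)
    return options


parsed = parse_option_string("escape-solidus=yes,number-format=#")
print(parsed)
```

One consequence of a comma-separated encoding is that a value containing a comma (for example a picture such as `#,##0`) would need some escaping rule, which this sketch does not attempt.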

Issue #706 closed #closed-706

18 Oct at 06:53:42 GMT

FLWOR: for member $m1 in $a1, member $m2 in $a2

Issue #752 closed #closed-752

18 Oct at 06:53:41 GMT

706: Fix "for member" grammar problems

Issue #750 closed #closed-750

18 Oct at 06:45:12 GMT

xsl:mode/@as and built-in template rules

Issue #751 closed #closed-751

18 Oct at 06:45:11 GMT

QT4CG-048-01: xsl:mode/@as with built-in templates

Issue #740 closed #closed-740

18 Oct at 06:32:38 GMT

QT4CG-047-01: Rename break-when to split-when, plus minor editorial cleanup

Issue #758 created #created-758

18 Oct at 03:56:23 GMT
XQFO UCA keyword strength, quaternary setting

In the XQFO specs, 5.3.3, the description of keyword strength, option quaternary, is I think confusing/misleading:

quaternary considers spaces and punctuation that would otherwise be ignored (for example data-base=database).

I propose the following:

"quaternary always considers as significant spaces and punctuation (data-base≠database; if maxVariable is punct or higher and alternate is not non-ignorable, lower strengths will treat data-base=database)."

That's a lot more words, but I think more accurate. And it may help the reader become familiar with the other two keywords.

Editors' input is welcome.

QT4 CG meeting 050 draft minutes #minutes-10-17

17 Oct at 17:30:00 GMT

Draft minutes published.

Issue #757 created #created-757

17 Oct at 16:24:09 GMT
Function families

We talked on the call today about the tension between defining multiple simple functions focussed on one task, and a small number of omnibus functions that have many different options.

I think we would all agree that multiple simple functions would be the better choice except for the problem that they all end up going into a single global namespace. So the question becomes: how can we better partition the name-space (using the term deliberately with a hyphen)?

We're reluctant to use the namespace mechanism to partition our function library because namespaces are cumbersome and clutter the code with lots of boilerplate; declaring namespaces for binding function libraries also has side-effects, for example on the semantics of element constructors.

One approach would be to build on the idea that @dnovatchev presented of using maps containing anonymous functions, so for example csv()?parse() would first call fn:csv() to load a family (or library) of functions, of which one is then selected for execution. This works, but I don't think it's a perfect solution; for example static analysis becomes a lot more difficult, and we don't get the benefits of default parameters and keyword arguments.

Most languages use hierarchic names with "." as a separator. Although XML names allow "." as a regular character, none of our built-in function names currently use it as such. So it would be entirely possible to adopt a convention where names like csv.parse() etc are used to name functions in a function family referred to as "csv". This wouldn't by itself require any language changes.

But if we adopted this convention, we could build on it to provide usability tweaks that make a large function library easier to manage. For example, we could put the math functions into the fn namespace with names like fn:math.sin(x), and then provide a way of binding a namespace prefix to a subtree of the fn namespace, so math:sin becomes a synonym for fn:math.sin(). The immediate benefit is that the namespace prefix doesn't need to be declared unless people want to use it. We could also then consider defining an algorithm for searching the fn namespace for abbreviated names such as sin(x), perhaps with some form of "import functions" declaration that says which subtrees of the fn namespace are to be searched.
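The lookup rules floated above could be sketched roughly as follows in Python. Everything here is hypothetical: `LIBRARY` stands in for the fn namespace, `prefixes` models a prefix bound to a subtree, and `imports` models an "import functions" declaration. The issue does not fix any of these details.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical flat library keyed by dotted names, standing in for the fn namespace.
LIBRARY: Dict[str, Callable] = {
    "abs": abs,
    "math.sin": math.sin,
    "math.cos": math.cos,
}


def resolve(name: str, prefixes: Dict[str, str], imports: Sequence[str] = ()) -> Callable:
    """Resolve a function name against LIBRARY.

    `prefixes` maps a prefix to a family (a subtree of the library);
    `imports` lists the families searched for unprefixed, abbreviated names.
    """
    if ":" in name:
        prefix, local = name.split(":", 1)
        # 'fn:math.sin' is the full name; 'math:sin' goes through the bound subtree.
        key = local if prefix == "fn" else f"{prefixes[prefix]}.{local}"
        return LIBRARY[key]
    if name in LIBRARY:
        return LIBRARY[name]
    matches = [f"{family}.{name}" for family in imports if f"{family}.{name}" in LIBRARY]
    if len(matches) != 1:
        raise LookupError(f"cannot resolve {name!r} unambiguously")
    return LIBRARY[matches[0]]


print(resolve("math:sin", {"math": "math"})(0.0))
```

The sketch rejects an abbreviated name that matches more than one imported family; whether that should be an error or resolved by search order is one of the open design questions.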

Issue #741 closed #closed-741

17 Oct at 16:20:42 GMT

QT4CG-048-03: Fix copy and paste errors in describing type patterns

Issue #739 closed #closed-739

17 Oct at 16:18:53 GMT

Apply review comment changes to the HTML DOM XDM mapping.

Issue #618 closed #closed-618

17 Oct at 15:49:48 GMT

Symmetry: fn:html-doc, fn:csv-doc

Issue #756 created #created-756

17 Oct at 14:39:13 GMT
JSON serialization - number formatting

We get a lot of complaints about the use of exponential notation when formatting large numbers with the JSON serializer.

I propose adding an option such as number-format="picture" to control this, where the picture is a subset of what is allowed for format-number().
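The complaint is easy to reproduce outside XQuery. This Python sketch contrasts a serializer's default exponential output with a plain decimal rendering. The helper `plain_number` is hypothetical and is not an implementation of the proposed picture syntax; it only shows the kind of output users are asking for.

```python
import json
from decimal import Decimal


def plain_number(value: float) -> str:
    """Render a float without exponential notation (hypothetical helper)."""
    # Going through repr() keeps the shortest round-trip digits of the float.
    return format(Decimal(repr(value)), "f")


default_output = json.dumps(1e21)   # Python's JSON serializer switches to exponent form
plain_output = plain_number(1e21)
print(default_output, plain_output)
```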

And perhaps we should bring xml-to-json() into line. The current capability of providing a callback function is more powerful, but difficult to align with serialization.

Issue #131 closed #closed-131

17 Oct at 11:16:22 GMT

Expression for binding the Context Value

Issue #755 created #created-755

17 Oct at 11:15:53 GMT
Expression for binding the Context Value

We have no expression yet to bind a value to the context value. Such an expression would be useful, among other things, to extend the focus function to sequences (fn { . }, see #129).

Here are 3 possible constructs for that, ordered by my personal preference:

1. Value Map Expression

ValueExpr      ::=  ValidateExpr | ExtensionExpr | ValueMapExpr
ValueMapExpr   ::=  SimpleMapExpr ("~" SimpleMapExpr)*
SimpleMapExpr  ::=  PathExpr ("!" PathExpr)*

(: Example :)
//flower ~ (count(.) || ' flowers: ' || string-join(name, ', '))

The expression would be similar to the simple map expression (which we could rename to item map expression). The following equivalents would then exist for simple FLWOR expressions:

for $i in (1 to 5) return string($i)  ≍  (1 to 5) ! string(.)
let $i := (1 to 5) return count($i)   ≍  (1 to 5) ~ count(.)

fn { E } could be rewritten to fn($c) { $c ~ E }.
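The difference between the two operators can be modelled in a few lines of Python. This is only an analogy under stated assumptions: sequences are modelled as lists, the context value is passed as an explicit argument, and the names `item_map` and `value_map` are hypothetical.

```python
from typing import Any, Callable, List


def item_map(seq: List[Any], body: Callable[[Any], List[Any]]) -> List[Any]:
    """Model of E1 ! E2: evaluate the body once per item and concatenate the results."""
    out: List[Any] = []
    for item in seq:
        out.extend(body(item))
    return out


def value_map(seq: List[Any], body: Callable[[List[Any]], List[Any]]) -> List[Any]:
    """Model of the proposed E1 ~ E2: evaluate the body once, with the whole value as context."""
    return body(seq)


numbers = list(range(1, 6))
per_item = item_map(numbers, lambda i: [str(i)])     # like (1 to 5) ! string(.)
whole_value = value_map(numbers, lambda v: [len(v)])  # like (1 to 5) ~ count(.)
print(per_item, whole_value)
```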

2. Context Value Declaration

ContextExpr  ::=  "context" "{" Expr "}" EnclosedExpr

(: Example :)
context { //flower } {
  count(.) || ' flowers: ' || string-join(name, ', ')
}

The result of the first expression defines the context value, the second expression can reference the context.

fn { E } could be rewritten to fn($c) { context { $c } { E } }.

3. Enhanced FLWOR expression (for the sake of completeness)

Similar to variables, the dot could be used to bind and reference the context:

LetBinding  ::=  ("." | ("$" VarName)) TypeDeclaration? ":=" ExprSingle
ForBinding  ::=  ("." | ("$" VarName)) TypeDeclaration? AllowingEmpty? PositionalVar? "in" ExprSingle

(: Example :)
let . := //flower
return count(.) || ' flowers: ' || string-join(name, ', ')

fn { E } could be rewritten to fn($c) { let . := $c return E }.

Assessment

  • The first solution looks most appealing to me. I like the analogy with the existing syntax for single items.
  • We could choose the second solution if we believe that the expression will be rarely used.
  • I’ve backed away from the third solution; I think it would be too pervasive.

Issue #754 created #created-754

17 Oct at 07:22:21 GMT
fn:transitive-closure: signature; remarks; too specific?

I have problems grasping why fn:transitive-closure returns a function. Wouldn’t it be more consistent with the remaining function set, and easier, to pass on the input as the first argument and directly create the result?

fn:transitive-closure(
  $node  as node(),
  $step  as function(node()) as node()*,
  $min   as xs:nonNegativeInteger?       := 1,
  $max   as xs:positiveInteger?          := ()
) as node()*

It would also be easier then to use the function within chains:

$nodes =!> transitive-closure(fn { * }) => count()

Issue #744 closed #closed-744

16 Oct at 14:32:04 GMT

XQFO Examples: minor fixes, formatting

Pull request #753 created #created-753

16 Oct at 14:05:35 GMT
65: Allow xmlns="xxx" to NOT change the default namespace for NameTests

Fix issue #65. Basically, fix the bug whereby xmlns="xxx" changes the default namespace for element NameTests, while retaining bug-compatibility.

QT4 CG meeting 050 draft agenda #agenda-10-17

16 Oct at 12:55:00 GMT

Draft agenda published.

Issue #649 closed #closed-649

16 Oct at 12:33:58 GMT

xsl:fallback

Issue #650 closed #closed-650

16 Oct at 12:33:56 GMT

649: fix an xsl:fallback problem

Pull request #752 created #created-752

16 Oct at 09:21:18 GMT
706: Fix "for member" grammar problems

Fix #706

Pull request #751 created #created-751

16 Oct at 06:41:40 GMT
QT4CG-048-01: xsl:mode/@as with built-in templates

Fix #750.

Issue #750 created #created-750

15 Oct at 22:56:41 GMT
xsl:mode/@as and built-in template rules

See ACTION QT4CG-048-01

The question arose, if an xsl:mode declaration specifies an expected type in @as, then all template rules in that mode are expected/required to deliver a value conforming to that type. But what about the default/fallback template rules? Surely they need to deliver a value of that type as well?

For example, suppose the mode specifies as="xs:boolean". Regardless of the value of xsl:on-no-match, none of the built-in template rules is going to deliver a boolean. You're expecting a boolean result from xsl:apply-templates, and if none of the template rules match, you're going to get something other than a boolean.

I think the answer is to say that you get a type error if the built-in template rule for the mode returns a value that's not of the required type.

I don't think this error should ever be reported statically, because the compiler has no way of knowing whether the set of explicit template rules is sufficient to cover all cases that will actually arise in source documents.

Issue #571 closed #closed-571

15 Oct at 22:29:49 GMT

XSLT: xsl:for-each-group/@break-when

Pull request #749 created #created-749

15 Oct at 20:24:23 GMT
653: Add string literals E".." and L".." to control entity expansion

Allows for expressions interoperable between XPath and XQuery. Fix #653.

Issue #748 created #created-748

15 Oct at 17:40:14 GMT
Parse functions: consistency

The functions for parsing input have been defined by different people, and the current state is quite inconsistent:

Function | Parameters
--- | ---
fn:parse-xml | $value as xs:string?
fn:doc | $href as xs:string?
fn:parse-json | $value as xs:string?, $options as map(*)
fn:json-doc | $href as xs:string?, $options as map(*)
fn:parse-html | $html as union(xs:string, xs:hexBinary, xs:base64Binary)?, $options as map(*)
fn:parse-csv | $csv as xs:string?, $options as map(*)

I believe there’s some need to unify the functions, and we could at least:

  • introduce a fn:XYZ-doc($href, $options) function for each input format (with at least one encoding option), and
  • restrict the type of the input parameter of fn:parse-XYZ to xs:string? and always name it $value.

And I wonder if we should tag all fn:XYZ-doc functions as ·nondeterministic· (if it’s not too late)?

Issue #747 created #created-747

15 Oct at 09:27:50 GMT
QName literals

It's quite common to want to write a constant QName; I found myself doing this a lot, for example, in examples and test cases for the elements-to-maps() function. It's clumsy having to call xs:QName() or fn:QName() or parse-QName() for this purpose. It's particularly clumsy with map constructors where you want to write many QNames.

I propose we introduce the syntax Q"prefix:local".

The quotes can be either single or double; the prefix is optional. If there is no prefix, the result is a no-namespace QName.

The prefix (if present) must be bound to a namespace in the static context.

Character and entity references are not allowed.

Note that Q{uri}local is a NameTest, not a QName literal.
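The resolution rules listed above can be sketched in Python as follows. This is an illustration only: the `QName` tuple, the `qname_literal` function, and the sample namespace binding are all hypothetical, and the sketch does not check that the parts are valid NCNames.

```python
from typing import Dict, NamedTuple


class QName(NamedTuple):
    uri: str      # empty string models "no namespace"
    prefix: str
    local: str


def qname_literal(text: str, namespaces: Dict[str, str]) -> QName:
    """Resolve the content of a Q"..." literal against static namespace bindings."""
    prefix, sep, local = text.partition(":")
    if not sep:
        # No prefix: the result is a no-namespace QName.
        return QName("", "", text)
    if prefix not in namespaces:
        # The prefix must be bound in the static context.
        raise LookupError(f"prefix {prefix!r} is not bound to a namespace")
    return QName(namespaces[prefix], prefix, local)


# Hypothetical static context for the example.
static_namespaces = {"my": "http://example.com/ns"}
print(qname_literal("my:item", static_namespaces))
```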

Issue #746 created #created-746

14 Oct at 18:16:31 GMT
break-when -> split-when in fn:partition

We decided to rename xsl:for-each-group/@break-when as @split-when. We should make the same change to the name of the second argument of fn:partition.

Issue #745 created #created-745

12 Oct at 11:21:39 GMT
Support for inline (anonymous) xslt functions

I propose adding support for inline XSLT functions.

While XPath supports this, XPath functions are limited in what they can do and in how they look; for example, returning newly constructed elements isn't possible without parse-xml-fragment.

I would suggest the syntax would be basically the same as for xsl:function except with the name omitted, e.g.

    <xsl:template name="apply-function" as="xs:integer">
        <xsl:param name="input" as="xs:integer"/>
        <xsl:param name="function" as="function(xs:integer) as xs:integer"/>
        <xsl:sequence select="$function($input)"/>
    </xsl:template>

    <xsl:template ....>
        <xsl:call-template name="apply-function">
            <xsl:with-param name="input" select="1"/>
            <xsl:with-param name="function">
                <xsl:function as="xs:integer">
                    <xsl:param name="value" as="xs:integer"/>
                    <result>
                        <xsl:sequence select="$value * 2"/>
                    </result>
                </xsl:function>
            </xsl:with-param>
        </xsl:call-template>
    </xsl:template>

Benefits

  • less syntactic "noise" than with named functions
  • the ability to embed XSLT functions inline inside maps (and other data types)
  • functional parity with XPath (and more)
  • natural generalisation to the local function proposal

Alternatives

  • use a reference to an explicitly named XSLT function
  • use XPath (though problematic when constructing new nodes)

Pull request #744 created #created-744

11 Oct at 11:48:45 GMT
XQFO Examples: minor fixes, formatting

Editorial: Some XQuery equivalents were buggy, and the formatting was unified.

Issue #743 created #created-743

10 Oct at 23:33:06 GMT
Extend enumeration types to allow values other than strings

In reviewing and accepting the spec for enumeration types, it was suggested that it might be useful to allow values other than strings.

  • There's a difficulty in that not all atomic values can be represented by literals. We have the same problem with function annotations; perhaps we need to bite the bullet and define some kind of "constant atomic expression" construct.
  • Aside from that, there don't seem to be any major obstacles.

We change

An EnumerationType has a value space consisting of a set of xs:string values. When matching strings against an enumeration type, strings are always compared using the Unicode codepoint collation.

to

An EnumerationType has a value space consisting of a set of atomic values. When matching values against an enumeration type, values are always compared using the fn:atomic-compare() function (as used for comparing map keys).

The subtyping rules (newly defined in terms of unions of singleton enumeration sets) seem to work in their current form, without change. enum("red", "green") is still a subtype of xs:string, because all the enumerated values are instances of xs:string.

QT4 CG meeting 049 draft minutes #minutes-10-10

10 Oct at 17:15:00 GMT

Draft minutes published.

Issue #742 created #created-742

10 Oct at 16:30:07 GMT
xsl:function-library: keep, drop, or refine?

The draft XSLT 4.0 specification (§5.3.2) proposes a new declaration xsl:function-library as a solution to the problem of having to qualify all function names except those in the core namespace. We have not reviewed this proposal.

Issue #688 closed #closed-688

10 Oct at 16:08:04 GMT

Coercion rules for union types and enumeration types

Issue #691 closed #closed-691

10 Oct at 16:08:03 GMT

688 Semantics of local union types, enumeration types, etc

Issue #372 closed #closed-372

10 Oct at 16:07:13 GMT

Separate default namespace for elements from the default namespace for types

Issue #715 closed #closed-715

10 Oct at 16:07:12 GMT

372 Rollback the default namespace changes

Issue #725 closed #closed-725

10 Oct at 16:06:55 GMT

Clarification to load-xquery-module

Issue #727 closed #closed-727

10 Oct at 16:06:54 GMT

725 Add clarification note for load-xquery-module

Issue #52 closed #closed-52

10 Oct at 16:06:35 GMT

Allow record(*) based RecordTests

Issue #728 closed #closed-728

10 Oct at 16:06:34 GMT

52 Allow record(*)

Issue #731 closed #closed-731

10 Oct at 16:06:07 GMT

Capturing accumulators: a couple of minor errors/omissions

Issue #732 closed #closed-732

10 Oct at 16:06:06 GMT

731 Capturing accumulators: Add error conditions, revise streaming rules

Pull request #741 created #created-741

10 Oct at 14:16:30 GMT
QT4CG-048-03: Fix copy and paste errors in describing type patterns

Fulfils Action QT4CG-048-03.

Pull request #740 created #created-740

10 Oct at 12:00:57 GMT
QT4CG-047-01: Rename break-when to split-when, plus minor editorial cleanup

Fulfils action QT4CG-047-01 : the CG decided to rename break-when as split-when. Also applies a few minor editorial corrections in the same general area.

Pull request #739 created #created-739

10 Oct at 11:56:42 GMT

Apply review comment changes to the HTML DOM XDM mapping.

QT4 CG meeting 049 draft agenda #agenda-10-10

09 Oct at 11:10:00 GMT

Draft agenda published.

Issue #738 created #created-738

08 Oct at 22:04:56 GMT
FO: Why is fn:op under section "17.3 Dynamic loading"

FO: Why is fn:op under section "17.3 Dynamic loading" ?

  1. Lexical substitution has little, if anything at all, to do with (dynamic) loading. Nothing is loaded from some external resource, as in the case of fn:load-xquery-module and fn:transform.

  2. There is nothing dynamic about having a predefined function that has a predefined set of possible values. In fact this could be defined as a strictly/statically defined map(xs:string, function(item()*, item()*) as item()*) with all allowed possible keys and as their values - the corresponding functions.

Taking this into account it is suggested to move fn:op to a section where it truly belongs. Maybe have in this section also other features of the language that are merely lexical substitution, as for example, function(s) for the creation of type-aliases.

Pull request #737 created #created-737

07 Oct at 22:21:13 GMT
295: Boost the capability of recursive record types

Fix issue #295

The main changes are:

  • In place of the special self-reference syntax "..", we now allow recursive use of type aliases, allowing types to be mutually recursive
  • We generalise the places that recursive references are allowed, for example the record type used by fn:random-number-generator is now legal
  • Subtyping rules for recursive record types are now defined (this was previously a gap in the specification). Acknowledgements to a John Snelson blog post for pointing me in the right direction.

Pull request #736 created #created-736

06 Oct at 21:33:18 GMT
730: Clarify (and correct) rules for maps as instances of function types

Fix issue #730

Note: the issue led to a wide-ranging discussion about possible enhancements to the type system, for example adding types for empty maps and arrays. I have ignored most of this, and have focussed on fixing the issue as raised (arising originally on the test suite), namely the incorrect use of V? to define a type that allows either an instance of the sequence type V or an empty sequence.

QT4 CG meeting 048 draft minutes #minutes-10-03

04 Oct at 14:50:00 GMT

Draft minutes published.

Issue #735 created #created-735

02 Oct at 09:21:28 GMT
Local functions in XSLT

I propose that we should add local functions to XSLT: specifically, allowing an xsl:function declaration to appear within a sequence constructor, declaring a named function that is available for use only within the sequence constructor.

At present this can be achieved by declaring a local variable bound to an anonymous function, but it's clumsy to have to use completely different syntax for local and global functions, and functions defined in this way cannot be mutually recursive.

I propose that such functions should shadow any global functions with the same name, in the same way as happens with local variable declarations. I have an open mind as to whether shadowing of functions in reserved namespaces should be allowed.

The main difficulty is the scoping rules. We don't want the problems JavaScript has with "hoisting". I propose that (a) all local function declarations must appear before any instructions (or local variable declarations, but not params) within the sequence constructor, and (b) these function declarations are in scope throughout the sequence constructor, including forward references from the body of other functions declared earlier within the same sequence constructor.

Pull request #734 created #created-734

30 Sep at 19:15:42 GMT
517: fn:chain

Added fn:chain.

Took much effort to ensure Unix-style line-endings are used.

Issue #733 closed #closed-733

30 Sep at 19:13:04 GMT

517: fn:chain

Pull request #733 created #created-733

30 Sep at 02:08:42 GMT
517: fn:chain

I added fn:chain, which has been discussed in https://github.com/qt4cg/qtspecs/issues/517

Pull request #732 created #created-732

29 Sep at 16:55:09 GMT
731 Capturing accumulators: Add error conditions, revise streaming rules

Minor tweaks to the spec for capture=yes accumulator rules.

Fix #731

Issue #731 created #created-731

29 Sep at 15:57:22 GMT
Capturing accumulators: a couple of minor errors/omissions

Two little things in the spec for capturing accumulators that we agreed last week:

(a) We should define an error code for use when the capture attribute is present but phase="start".

(b) The streamability rules are too strict. They say that the select attribute must be motionless or consuming, but this is not necessary, because the select attribute is applied to a snapshot tree, which is instantiated in memory and therefore does not need to be streamable.

QT4 CG meeting 048 draft agenda #agenda-10-03

29 Sep at 13:20:00 GMT

Draft agenda published.

Issue #211 closed #closed-211

29 Sep at 12:55:47 GMT

XSLT streaming: capturing accumulators

Issue #717 closed #closed-717

29 Sep at 12:55:46 GMT

211: add capturing accumulators to XSLT

Issue #730 created #created-730

28 Sep at 17:43:11 GMT
Equivalence of map and function types

It is stated in XPath §3.6.4.2, and probably elsewhere, that

The function signature of a map matching type map(K, V), treated as a function, is function(xs:anyAtomicType) as V?

But V is a sequence type, not an item type, so you can't just tag a '?' onto the end of it. What is intended here by V? is a sequence type that is the union of V and empty-sequence().
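A small Python model may help show why the result type has to be "V or empty". The assumptions are that sequences are modelled as lists and that `map_as_function` is a hypothetical helper: a key that is present yields its value, which is an instance of V, and an absent key yields the empty sequence, whatever occurrence indicator V itself carries.

```python
from typing import Any, Callable, Dict, Hashable, List


def map_as_function(m: Dict[Hashable, List[Any]]) -> Callable[[Hashable], List[Any]]:
    """Treat a map as a function from key to value, with [] for absent keys."""
    return lambda key: m.get(key, [])


# Here V is modelled as "one or more integers" (like xs:integer+).
lookup = map_as_function({"a": [1, 2], "b": [3]})
present = lookup("a")   # an instance of V
absent = lookup("z")    # the empty sequence, which V alone would not allow
print(present, absent)
```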

Issue #729 created #created-729

28 Sep at 11:50:08 GMT
xsi:schemaLocation

The specifications (XQuery and XSLT) should say something about the effect of requesting validation on a document that contains xsi:schemaLocation and/or xsi:noNamespaceSchemaLocation attributes. At present XQuery says nothing, and XSLT says very little.

XQuery 3.1 says: A validate expression can be used to validate a document node or an element node with respect to the [in-scope schema definitions], using the schema validation process defined in [[XML Schema 1.0]] or [[XML Schema 1.1]]. This doesn't really answer the question. The "with respect to" phrase could be read as implying that ONLY the in-scope schema definitions are used. Particular problems occur if xsi:schemaLocation refers to a schema document that attempts to override or redefine the schema components that have been statically imported.

XSLT 3.0 says nothing of interest about what schema (=set of schema components) is used when validation is requested, though it does mention in passing that xsi:schemaLocation attributes might be interpreted in some way by a schema processor.

If we look to the behaviour of Saxon as a reference implementation, then we'll quickly find fault. There's a configuration option to control whether xsi:schemaLocation attributes are considered or ignored; if they are considered, then the schema components referenced are added to a global pool of schema components which are used not only to validate the document in question, but to validate any subsequent documents. We're in the process of redesigning this to do something that makes more sense.

Pull request #728 created #created-728

28 Sep at 11:24:54 GMT
52 Allow record(*)

Fix #52. Implements decision made at meeting 046.

Pull request #727 created #created-727

27 Sep at 17:38:18 GMT
725 Add clarification note for load-xquery-module

Add a note to clarify the behaviour of load-xquery-module.

Fix #725

Issue #724 closed #closed-724

27 Sep at 16:13:41 GMT

PR 717 with merge conflicts resolved

Issue #723 closed #closed-723

27 Sep at 15:30:33 GMT

Updated PR for capturing accumulators

Issue #726 closed #closed-726

27 Sep at 09:41:52 GMT

PR 723 with merge conflicts resolved

Pull request #726 created #created-726

27 Sep at 09:22:03 GMT

PR 723 with merge conflicts resolved

Issue #725 created #created-725

27 Sep at 09:14:59 GMT
Clarification to load-xquery-module

Add a clarification note to load-xquery-module, to correct a misunderstanding by a (very knowledgeable) user: see https://saxonica.plan.io/issues/6209

The function load-xquery-module does not modify the static or dynamic context in any way. In particular, the variables and functions that are loaded from the query module are not added to the static or dynamic context of the calling code. They are accessible only via the map that is returned from the function call.

Pull request #724 created #created-724

27 Sep at 08:50:42 GMT

PR 717 with merge conflicts resolved

QT4 CG meeting 047 draft minutes #minutes-09-26

26 Sep at 17:30:00 GMT

Draft minutes published.

Issue #722 closed #closed-722

26 Sep at 17:06:57 GMT

This is a test. This is only a test.

Pull request #723 created #created-723

26 Sep at 16:56:11 GMT
Updated PR for capturing accumulators

Updated to take account of comments

Pull request #722 created #created-722

26 Sep at 16:50:50 GMT
This is a test. This is only a test.

Had this been a real emergency, we would have fled in terror and you would not have been informed.

DO NOT MERGE THIS! :-)

Issue #721 closed #closed-721

26 Sep at 16:49:02 GMT

Attempt to fix the problem with PRE elements in autodiffs

Pull request #721 created #created-721

26 Sep at 16:48:55 GMT
Attempt to fix the problem with PRE elements in autodiffs

Maybe I'm more cleverer today than I was last time I looked into this.

Issue #663 closed #closed-663

26 Sep at 16:13:20 GMT

Calling xsl:original() with keywords

Issue #674 closed #closed-674

26 Sep at 16:13:19 GMT

663: Describe how calls to xsl:original with keywords work

Issue #570 closed #closed-570

26 Sep at 16:12:44 GMT

XSLT: Built-in template rules for maps and arrays

Issue #718 closed #closed-718

26 Sep at 16:12:42 GMT

Add on-no-match="shallow-copy-all"

QT4 CG meeting 047 draft agenda #agenda-09-26

25 Sep at 10:55:00 GMT

Draft agenda published.

Issue #720 created #created-720

25 Sep at 09:47:50 GMT
From Records to Objects

It has become idiomatic to use maps, and record type definitions, to declare a collection of functions; so for example the random-number-generator object offers a "method" next() that can be called using the syntax $rng?next().

The problem is that it's not possible, within the XPath/XQuery language, to implement such a function with implicit access to the object on which it is invoked. The implementation of the function does not have access to any kind of $this variable.

This issue considers how we can move forwards from supporting simple records to introduce object capabilities, in an incremental and compatible way.

Here are four steps in that direction:

  1. Where a named record type is declared, also create a corresponding constructor function. So if you declare

declare item type my:loc as record(longitude as xs:double, latitude as xs:double)

you also get a constructor function, so that you can write my:loc(180, 180), using either positional or keyword arguments corresponding to the field names.

  2. Allow default values to be defined in the record type, which act as default values for the parameters in the constructor function.

  3. Allow functions that are defined as part of a record type access to a variable $this. The constructor function provides an implicit binding of this variable to the record/map/object that is being instantiated.

  4. Allow self-reference to a named record type (and its constructor function) within the record definition.

So you can now do:

declare type my:counter as record (
   value as xs:integer,
   increment := fn() as my:counter {my:counter($this?value + 1)}
)

and then

let $x := my:counter(0)
return $x?increment()?value

which returns 1.
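The intended behaviour can be approximated in Python with a closure, as a sketch only. The `make_counter` function plays the role of the generated constructor, the local name `this` stands in for the proposed $this binding, and records are modelled as plain dictionaries; none of this is part of the proposal's syntax.

```python
from typing import Any, Dict


def make_counter(value: int) -> Dict[str, Any]:
    """Model of the my:counter constructor: build the record, then bind `this` to it."""
    this: Dict[str, Any] = {"value": value}
    # The function field closes over the record being constructed,
    # and returns a new record rather than mutating the existing one.
    this["increment"] = lambda: make_counter(this["value"] + 1)
    return this


x = make_counter(0)
result = x["increment"]()["value"]
print(result)
```

Note that `x` itself is unchanged after the call, which matches the immutable nature of maps in XDM.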

Pull request #719 created #created-719

21 Sep at 20:21:29 GMT
413: Spec for CSV-related functions

This PR contains error fixes (typos, examples that contradicted the spec text), some (hopefully) improved language and one breaking change.

The current draft uses the type map(xs:integer, xs:string) for the column-names option to fn:csv-to-xdm and fn:csv-to-xml. This PR flips that to map(xs:string, xs:integer). It turns out that the examples were already using this, and it seems to me that having the names entry in the csv-columns-record record type be the transposed version of the column-names option that creates it, rather than be the same thing, is counterproductive.

I can think of some examples (a CSV split into several chunks, with only the first containing the headers) where being able to feed the names entry right back into another invocation of fn:csv-to-xdm would be useful. If nothing else it's confusing and not obvious, or I wouldn't have messed up the examples, and somebody would have noticed during the review process...

Pull request #718 created #created-718

21 Sep at 16:39:15 GMT
Add on-no-match="shallow-copy-all"

Enable recursive descent transformation with template rules for maps and arrays.

Fix #570

Pull request #717 created #created-717

21 Sep at 10:36:51 GMT
211: add capturing accumulators to XSLT

Adds the attribute capture="yes" to xsl:accumulator-rule. This has been available as a Saxon extension for some time and makes many accumulators much easier to implement.

Fix #211

Issue #716 created #created-716

20 Sep at 02:34:13 GMT
Generators in XPath

What is a generator?

Generators are well known and provided out of the box in many programming languages. Per Wikipedia:

“In computer science, a generator is a routine that can be used to control the iteration behaviour of a loop. All generators are also iterators. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.”


The goal of this proposal (major use-cases)

A generator in XPath should be a tool to easily implement the solutions to the following use-cases:

  1. Processing a huge collection whose members may not all be needed. A generator produces only the next member of the collection, and only on demand.

  2. Handling a collection containing an unknown or infinite number of members. When asked for the next member of the collection, the generator always produces it, provided the collection still contains any members. It is the caller's responsibility to request only as many members as are really needed.

What is achieved in both cases:

  • A (next) member is produced only on request. No time is spent on producing all members of the collection.
  • A (next) member is produced only on request. No memory is consumed to store all members of the collection.

A good problem based on these use-cases is to generate a collection of the first N members that have some wanted properties, drawn from other collection(s), when it is not known in advance how large the original input collections must be for the desired N members to be found.

For example: Produce the first 1 000 000 (1M) prime numbers.

Sometimes we may not even know if N such wanted members actually exist, for example: Produce the first 2 sequences of 28 prime numbers where the primes in each of the sequences form an arithmetic progression.


The Proposal

A generator is defined as (and is a synonym for):

let $generator as record
                   (initialized as xs:boolean,
                    endReached as xs:boolean,
                    getCurrent as function(..) as item()*,
                    moveNext as function(..) as .. ,
                    *  ) 

A generator is an extensible record.

It has four fixed-named keys, and any other map-keys, as required to hold the internal state of that specific generator.

Here is the meaning of the four fixed/named keys:

  • initialized is a boolean. When a generator $gen is initially instantiated, $gen?initialized is false(). Any call to $gen?getCurrent() raises an error. In order to get the first value of the represented collection, the caller must call $gen?moveNext().

  • endReached is a boolean. If, after a call to moveNext(), the value of the returned generator's endReached key is true(), then calling moveNext() and/or getCurrent() on this generator raises an error.

  • getCurrent is a function of zero arguments. It must only be called if the value of initialized is true() and the value of endReached is false(); otherwise an error must be raised. This function produces the current member of the collection after the last call to moveNext(), provided that call didn't return a generator whose endReached value was true().

  • moveNext is a function of zero arguments. When called on a generator whose endReached value is false(), it produces the next (state of the) generator, including a possibly true() value of endReached. If this value is still false(), then calling getCurrent() produces the value of the next member of the collection.
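As a rough illustration only (not part of the proposal), the protocol described by the four keys can be sketched in Python. The class name Generator, the fields current and advance, and the helpers count_from and to_list are all assumptions made for this sketch; the proposal itself only fixes initialized, endReached, getCurrent and moveNext.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Generator:
    """Immutable stand-in for the proposed generator record (names are illustrative)."""
    initialized: bool
    end_reached: bool
    current: Any = None                                  # value behind getCurrent
    advance: Optional[Callable[[], "Generator"]] = None  # transition behind moveNext

    def get_current(self) -> Any:
        # Legal only after a moveNext and before the end of the collection.
        if not self.initialized or self.end_reached:
            raise RuntimeError("getCurrent called in an invalid state")
        return self.current

    def move_next(self) -> "Generator":
        # Returns a new generator state; the receiver is never mutated.
        if self.end_reached:
            raise RuntimeError("moveNext called after the end was reached")
        return self.advance()


def count_from(start: int, stop: int) -> Generator:
    """Hypothetical generator over the integers start..stop (inclusive)."""
    def state(n: int) -> Generator:
        if n > stop:
            return Generator(initialized=True, end_reached=True)
        return Generator(True, False, n, lambda: state(n + 1))
    # Freshly constructed: not initialized, so the caller must moveNext first.
    return Generator(False, False, None, lambda: state(start))


def to_list(gen: Generator) -> list:
    """Drains a finite generator by following the protocol step by step."""
    out = []
    gen = gen if gen.initialized else gen.move_next()
    while not gen.end_reached:
        out.append(gen.get_current())
        gen = gen.move_next()
    return out
```

Because each moveNext returns a fresh record rather than mutating the old one, an earlier state can be kept and replayed, which is the property the functional operations on generators rely on.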

Examples of operations on generators

The following examples are written in pseudo-code, as at the time of writing there was no available implementation of records. Also, the code for recursion in pure XPath would make any such example longer than necessary for grasping its meaning.

The Empty Generator

emptyGenerator() {
             map{
                 initialized : true(),
                 endReached: true(),
                 getCurrent: function($this as map(*)) {error()},
                 moveNext: function($this as map(*)) {error()}
                }
              }

Take the first N members of the collection

take($gen as generator, $n as xs:integer) as generator
{
  let $gen := if(not($gen?initialized)) then $gen?moveNext()
                      else $gen
      return
         if( $gen?endReached or $n eq 0) then emptyGenerator()
            else map{
                     "initialized": true(),
                     "endReached": false(),
                     "getCurrent": $gen?getCurrent,
                     "moveNext": function() { take($gen?moveNext(), $n -1) }
                   }
}

Skip the first N members from the collection

skip($gen as generator, $n as xs:integer) as generator
{
  if($n eq 0) then $gen
     else
     {
         let $gen := if(not($gen?initialized)) then $gen?moveNext()
                       else $gen
           return
             if(not($gen?endReached)) then skip($gen?moveNext(), $n -1)
               else $gen
     }             
}

Subrange of size N starting at the M-th member

subrange($gen as generator, $m as xs:integer, $n as xs:integer) as generator
{
  take(skip($gen, $m -1), $n)
}

Head of a generator

head($gen as generator)
{
  take($gen, 1)?getCurrent()
}

Tail of a generator

tail($gen as generator)
{
  skip($gen, 1)
}

At index N

at($gen as generator, $ind as xs:integer)
{
  subrange($gen, $ind, 1)?getCurrent()
}

For Each

for-each($gen as generator, $fun as function(*))
{
   map:put($gen, "getCurrent", function() { $fun($gen?getCurrent()) }  )                              
}

For Each Pair

for-each-pair($gen1 as generator, $gen2 as generator, $fun as function(*))
{
   let $gen1 := if(not($gen1?initialized)) then $gen1?moveNext()
                  else $gen1,
       $gen2 := if(not($gen2?initialized)) then $gen2?moveNext()
                  else $gen2
    return
      if($gen1?endReached or $gen2?endReached) then map:put($gen1, "endReached", true())
        else map:put(map:put($gen1, "getCurrent", function() { $fun($gen1?getCurrent(), $gen2?getCurrent()) } ) ,
                     "moveNext", function() { for-each-pair(skip($gen1, 1), skip($gen2, 1), $fun)}
                    )                             
}

Filter

filter($gen as generator, $pred as function(item()*) as xs:boolean)
{
     let $getNextGoodValue := function($gen as map(*), $pred as function(item()*) as xs:boolean)
         {
            let $mapResult := iterate-while(
                                            $gen,
                                            function($gen) { not($pred($gen?getCurrent($gen))) },
                                            function($gen) { $gen?moveNext($gen) }
                                            )   
            return $mapResult?getCurrent($mapResult)                     
         },
       $gen := if($gen?initialized) then $gen 
                      else $gen?moveNext($gen)
        return
          map {
               "initialized": true(),
               "endReached":  $gen?endReached,
               "getCurrent": function($this as map(*)) { $getNextGoodValue($this?inputGen, $pred) },
               "moveNext":   function($this as map(*))
                             {    let $nextGoodValue := $getNextGoodValue($this?inputGen?moveNext($this?inputGen), $pred),
                                      $nextGen := iterate-while(
                                                                $this?inputGen?moveNext($this?inputGen),
                                                                function($gen) { not($pred($gen?getCurrent($gen))) },
                                                                function($gen) { $gen?moveNext($gen) }
                                                                )
                                    return
                                      map {
                                            "initialized": $nextGen?initialized,
                                            "endReached":  $nextGen?endReached,
                                            "getCurrent" : function($x) {$nextGoodValue},
                                            "moveNext" :   $this?moveNext,
                                            "inputGen" :   $nextGen
                                           }
                             },
               "inputGen" : $gen
              }
        }

Here are some other useful functions on generators -- with just their signature and summary:

  • concat($gen1 as generator , $gen2 as generator ) - produces a generator that behaves as $gen1 until $gen1?endReached becomes true(), and then behaves as $gen2

  • append($gen as generator, $value as item()*) - produces a generator that behaves as $gen until $gen?endReached becomes true(), and then as a generator that has only the single value $value.

  • prepend($gen as generator, $value as item()*) - produces a generator whose first value is $value and then behaves as $gen.

  • some($gen as generator) as xs:boolean - Produces true() if $gen has at least one value, and false() otherwise.

  • some($gen as generator, $pred as function(item()*) as xs:boolean) as xs:boolean - Produces true() if $gen has at least one value for which $pred($thisValue) is true(), and false() otherwise.

  • ofType($gen as generator, $type as type) - Produces a new generator from $gen that contains all values from $gen of type $type -- for this we need to have added to the language the type object.

  • skipWhile($gen as generator, $pred as function(item()*) as xs:boolean) - Produces a new generator from $gen by skipping all starting values for which $pred($theValue) is true().

  • takeWhile($gen as generator, $pred as function(item()*) as xs:boolean) - Produces a new generator from $gen which contains all starting values of $gen for which $pred($theValue) is true().

  • toArray($gen as generator) - Produces an array that contains all values that are contained in $gen.

  • toSequence($gen as generator) - Produces a sequence that contains all values that are contained in $gen. Values of $gen that are sequences themselves are flattened.

  • toMap($gen as generator) - If the values in $gen are all key-value pairs, produces a map that contains exactly all the key-value pairs from $gen.


These and many other useful functions on generators can and should be added to every generator upon construction.

Thus, it would be good to have an explicit constructor function for a generator:

     construct-generator($record as 
                               record( initialized as xs:boolean,
                                       endReached as xs:boolean,
                                       getCurrent as function(..) as item()*,
                                       moveNext as function(..) as .. ,
                                      )
                         ) as generator

Pull request #715 created #created-715

19 Sep at 23:12:33 GMT
372 Rollback the default namespace changes

Implements the CG decision to roll back the changes that introduced two separate default namespaces for elements and types.

Fix #372

Issue #714 created #created-714

19 Sep at 21:19:43 GMT
Function annotations in XSLT

I propose that the following attributes on an xsl:function should be accessible as annotations, for example in a call on function-annotations:

  • visibility
  • streamability
  • new-each-time
  • cache

plus any extension attribute in a user-defined namespace, for example <xsl:function saxon:debug="yes"/> should have the annotation %saxon:debug("yes"). The value is always a single string, the actual attribute value as written.

Issue #703 closed #closed-703

19 Sep at 16:26:11 GMT

129 (1): XPath and XQuery changes for introduction of context value

Issue #701 closed #closed-701

19 Sep at 16:26:00 GMT

fn:concat: Support for 0 or more arguments

Issue #702 closed #closed-702

19 Sep at 16:25:59 GMT

701: fn:concat: Support for 0 or more arguments

Issue #696 closed #closed-696

19 Sep at 16:25:42 GMT

566: Rework query parameters on build-uri/parse-uri

Issue #694 closed #closed-694

19 Sep at 16:23:58 GMT

XQFO minor edits, with new examples and notes, 2 through 4.6

Issue #687 closed #closed-687

19 Sep at 16:23:42 GMT

Constructor functions for user-defined types

Issue #690 closed #closed-690

19 Sep at 16:23:41 GMT

687 Clarify constructor functions for user-defined types

Issue #668 closed #closed-668

19 Sep at 16:23:30 GMT

Definition of HTML case-insensitive collation

Issue #680 closed #closed-680

19 Sep at 16:23:29 GMT

668 define case insensitive collation normatively

Issue #713 created #created-713

19 Sep at 16:22:04 GMT
Annotations: Editorial notes

Copied from https://github.com/qt4cg/qtspecs/pull/710#pullrequestreview-1630129066:

For avoidance of doubt, we should say in XQuery 4.6.2.4 (Named Function References) that the function created by a named function reference has its annotations taken from the function definition. There are other places where we are not explicit about the annotations of a function item, for example with partial function application. We should add a note that in XPath and XSLT, it is not possible to define function annotations, so this function will always return an empty result. (However, we should consider giving user-defined functions in XSLT annotations based on their attributes, e.g. visibility and streamability).

…and my complementary note in the PR thread:

I decided to merge the PR without changes, and not add the reference to XPath and XSLT, because the function may also return results for XQuery functions imported via fn:load-xquery-module.

Additional notes from today’s meeting:

  • For annotations without values, we could assign true() as a default value.
  • Examples could be added to fn:function-annotations to demonstrate how to check for annotations without values (or with values whose EBV is false(): 0, "", etc.).
  • Dimitre suggested restructuring the spec for features that are not available in XPath.
  • Maybe annotations would also be helpful in XPath.

Everyone, feel free to add more comments.

Issue #36 closed #closed-36

19 Sep at 16:12:54 GMT

fn:function-annotations (Allow support for user-defined annotations)

Issue #710 closed #closed-710

19 Sep at 16:12:51 GMT

36: fn:function-annotations

Issue #712 created #created-712

19 Sep at 13:50:37 GMT
array:sort: to be aligned with fn:sort

Related: #623. And an editorial note:

https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-sort

I think the type for the 3rd argument of fn:sort

$key  as (function(item()) as xs:anyAtomicType*)+  := fn:data#1

…should be changed to zero-or-more:

$key  as (function(item()) as xs:anyAtomicType*)*  := fn:data#1

While it will be rare to encounter queries with key := (), there seems to be no urgent reason to enforce at least one sort key. The change would also be in alignment with the corresponding rule (with the current signature, $key cannot be empty), which says:

The number of sort key definitions is determined by the number of function items supplied in the $key argument. If the argument is absent or empty, the default is a single sort key definition using the function data#1.
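The quoted rule can be sketched in Python to show why a zero-or-more signature fits it: an empty key list simply falls back to the default. This is an illustration under assumptions, not the specified algorithm; the function name sort and the identity lambda standing in for data#1 are hypothetical, and collations are ignored.

```python
def sort(items, keys=()):
    """Sorts by each key function in turn; an empty keys tuple uses the default key."""
    if not keys:
        keys = (lambda item: item,)   # plays the role of fn:data#1
    # Each item is compared on the tuple of its sort key values.
    return sorted(items, key=lambda item: tuple(k(item) for k in keys))
```

With this shape, sort([3, 1, 2], keys=()) and sort([3, 1, 2]) behave identically, which is what the rule describes for an absent or empty $key.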

Issue #711 created #created-711

19 Sep at 09:22:59 GMT
Using annotations for navigation of JSON trees

This issue develops ideas presented in issue #596, which itself is a continuation of ideas raised in issue #341, issue #350, and elsewhere. It's related to the requirements presented in issue #262 and issue #297.

Firstly, I propose a change to the data model so that annotations can be attached to any item [or perhaps any value?], not only to a function. The annotations are a map of type map{xs:QName, item()*}. Some general principles:

  • Annotations on an item do not affect the result of any operation on that item unless otherwise specified.
  • Operations that are described as returning a result that contains items that are present in one of the operation's operands retain the annotations of those items, unless otherwise specified. (So for example $a[C] returns a sequence of items from $a in which the annotations are preserved).
  • Operations that construct "new" items (for example $a + $b) return an item with no annotations, unless otherwise specified.
  • The function annotations($x) (replacing function-annotations($x)) returns the annotations of an item.
  • The function annotate($x, key, value) returns a "clone" of $x with an additional annotation. (A clone of an item differs from the original only in having different annotations. All operations other than annotation-sensitive operations produce exactly the same result on the clone and the original - including tests for node identity.)
  • To avoid confusion, the term "type annotation" is replaced by "type label".

Secondly, we use annotations to aid navigation of JTrees (a term I use to describe trees of arrays and maps such as might be produced by parsing JSON).

We introduce a component of the static context tracked=true|false, defaulting to false. The construct tracked{expression} evaluates an expression and its subexpressions in tracked mode. In tracked mode any operator or function that performs selection within an array or map (for example the lookup operator, the map:get and array:get functions, using the map/array as a function item, or the array:head() and array:foot() and map:find() functions) annotates the items in its result with two properties: "container" whose value is the map or array from which the item was selected, and "key" which is the key or array index of the selected value within that container. The effect is that if an item was found in a JTree using a tracked expression, the annotations on the resulting item can be used in effect to navigate upwards within the tree that was searched.

Note that this is not a new idea: in effect, the result of a tracked selection is a zipper data structure, as described in https://en.wikipedia.org/wiki/Zipper_(data_structure).

A further exploitation of the idea allows us to introduce deep update of JTree structures. For example, modify(root:=$a, selection:=fn{?x?y?z}, change:=fn{.+1}) can evaluate the selection argument in tracked mode, apply the change function to the resulting items, and then navigate back using the container annotation to create modified versions of all traversed containing JTrees, eventually returning a modified version of the root tree.
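To make the idea concrete, here is a minimal Python sketch of a tracked selection followed by a deep update. It is an assumption-laden illustration: the trail of (container, key) pairs stands in for the "container" and "key" annotations, the selection is reduced to a single fixed path, and the names tracked_lookup and modify are chosen for this sketch only.

```python
def tracked_lookup(root, path):
    """Follows path through nested dicts/lists, recording (container, key) per step."""
    trail, node = [], root
    for key in path:
        trail.append((node, key))   # what a tracked lookup would annotate
        node = node[key]
    return node, trail


def modify(root, path, change):
    """Returns a modified copy of root; containers off the path are shared, not copied."""
    value, trail = tracked_lookup(root, path)
    new_value = change(value)
    # Walk back up through the recorded containers, rebuilding each one.
    for container, key in reversed(trail):
        rebuilt = container.copy()
        rebuilt[key] = new_value
        new_value = rebuilt
    return new_value
```

Only the containers along the selected path are rebuilt; everything else is reused, and the original tree is left untouched, which is the zipper-like behaviour described above.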

QT4 CG meeting 046 draft agenda #agenda-09-19

18 Sep at 14:00:00 GMT

Draft agenda published.

Issue #673 closed #closed-673

18 Sep at 13:49:45 GMT

HTML namespace changes

Pull request #710 created #created-710

17 Sep at 21:08:48 GMT
36: fn:function-annotations

@michaelhkay Again, some hints might need to be added for XPath (?).

Issue #709 created #created-709

17 Sep at 18:24:36 GMT
(Un)Checked Evaluation

Based on https://github.com/qt4cg/qtspecs/issues/707#issuecomment-1721596055 and the comments following in that thread:

I've been thinking recently about adding checked{} and unchecked{} modes. For example

<xsl:apply-templates select="checked{.//item}"/>

would throw an error if there are no items. The mode of execution would propagate downwards, so checked{a/b/c/d} would be able to tell you that a b was found, but it had no c children.

checked{item[22]} would have the effect of making sequences behave more like arrays, with bound checking; conversely unchecked{item?22} would return an empty sequence instead of throwing an error.
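The difference between the two modes can be sketched in Python over nested dictionaries. This is only an approximation under stated assumptions: the function name select, the list-of-steps path, and the wording of the error message are hypothetical, and real path expressions select sets of nodes rather than a single child.

```python
def select(tree, path, checked=False):
    """Walks nested dicts along path; returns None (empty) or raises in checked mode."""
    node = tree
    for depth, step in enumerate(path):
        if not isinstance(node, dict) or step not in node:
            if checked:
                # Report how far the path matched before it ran dry.
                found = "/".join(path[:depth]) or "(root)"
                raise LookupError(f"{found} was found, but it has no '{step}' child")
            return None
        node = node[step]
    return node
```

In unchecked mode a missing step quietly produces an empty result; in checked mode the same step raises an error that names the last step that did match.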

Issue #708 created #created-708

17 Sep at 08:56:39 GMT
Toward a design for generators

Motivation

The motivations for this are to explore the creation of sequences where the next item in the sequence is determined by evaluating a function on the current state of a system. These sequence generators have an initial starting value and state. They can also stop when some condition is met.

The motivating example here is the fn:random-number-generator function, where:

  1. let $rnd := fn:random-number-generator() initializes a new random number sequence with the default seed as its state;
  2. let $value := $rnd?number returns the current value of the sequence;
  3. let $rnd := $rnd?next() returns the state and value of the next item in the sequence.

This has all the properties needed for a forward (left-to-right) generating sequence. To make it a generalized generator sequence, the number field should be renamed value.

NOTE: The fn:random-number-generator function defines an infinite sequence as it has no termination/end of sequence condition.

Sequence Generators as Record Types

Therefore, a forward generator sequence could look like this:

declare item-type sequence-generator as record(
    value as item(),
    next as function() as record(value, next, *)?,
    *
);

declare function generated-sequence() as sequence-generator?;

If calling ?next() on a sequence above returns the empty sequence then there are no more values in that sequence.

If the generated-sequence() returns the empty sequence then there are no items in the sequence.

NOTE: This does not currently define reversible sequence support. A reversed property could be provided that returns a sequence-generator that operates on the sequence from right to left. I haven't figured out exactly how this should look, but reversed generator sequences may be better investigated as a separate issue.

NOTE: An implementation can process this sequence iteratively if needed.

Analysis

While this allows generator sequence types to be created, they are -- like fn:random-number-generator() -- cumbersome to use. This does have the advantage of being backward compatible with fn:random-number-generator(), though.

The problem comes when trying to make these work like sequences. The subtype and function coercion rules should be doable. The other sequence operations like filtering are more complicated to define properly.

However, this has the same issues that allowing array() in fn:* functions has -- how do you differentiate the use cases where the user is working on a sequence of generators, or the generated sequences?

NOTE: You can't extend the functions to take a (sequence-type | item()*) parameter as a sequence-type is a subtype of item() which matches item()*. If the functions were to be extended to handle these, then fn:head(fn:random-number-generator()) would return a number instead of a map.

Sequence Generator Function

The Kotlin language has a generateSequence function that takes a next function, an optional seed value or construction function, and returns a lazy sequence over that. -- Internally, it is building a Java iterator that produces values from calling the next function. The sequence will terminate when the next value is null.

I propose that -- in addition to the sequence-generator type above -- XPath defines the following function:

declare function fn:sequence($generator as sequence-generator) as item()*;

This solves the issues in the Analysis section above, and is analogous to array:values. It is implementation defined how the sequence is constructed. -- This allows an implementer to appropriately map the generator to their internal sequence implementation in order to provide lazy evaluation and other operations.

There should also be the following helper function for random numbers:

declare function fn:random-numbers(
    $seed as xs:anyAtomicType? := ()
) as xs:double* {
    fn:random-number-generator($seed) => fn:sequence()
};

The user-defined sequences then become e.g.:

let $generator as sequence-generator := map {
    value : 1,
    next : function () { () }
}
return fn:sequence($generator)
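One possible reading of fn:sequence, sketched in Python: follow the value/next chain lazily until next returns the empty sequence. The dictionary representation, the use of None for the empty sequence, and the countdown helper are assumptions of this sketch; the proposal leaves the construction implementation-defined.

```python
from typing import Any, Dict, Iterator, Optional

# A sequence-generator is modelled as a dict with "value" and "next" entries;
# "next" returns the following generator, or None for the empty sequence.
SequenceGenerator = Dict[str, Any]


def sequence(generator: Optional[SequenceGenerator]) -> Iterator[Any]:
    """Lazily yields the values of a value/next generator chain."""
    while generator is not None:
        yield generator["value"]
        generator = generator["next"]()


def countdown(n: int) -> Optional[SequenceGenerator]:
    """Hypothetical generator producing n, n-1, ..., 1."""
    if n <= 0:
        return None
    return {"value": n, "next": lambda: countdown(n - 1)}
```

Since sequence is a Python generator function, next is only called when the consumer asks for another value, so an infinite chain can still be consumed partially.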

Issue #299 closed #closed-299

17 Sep at 00:02:16 GMT

Short-circuiting functions, function-arity guards and lazy hints

Issue #707 created #created-707

15 Sep at 16:59:47 GMT
Dynamic Function Calls: Processing Empty Sequences

A fundamental – and brilliant – property of XPath is that many operations tolerate empty sequences: Instead of throwing an error, the empty result is passed on unchanged to the next operation. While this is unrewardingly confusing for binary operations (() + 1, () eq 5), it’s wonderful for pipelines:

(: paths :)
$nodes / a / b / c
(: lookups :)
$data ? 1 ? 2 ? 3
(: simple map operators :)
$data ! do(.) ! something(.)
(: arrow operator works differently, but the syntax is similar:  :)
$data => do() => something()

As far as I can judge, it would be a very simple and user-friendly addition if we extended dynamic function calls to return an empty sequence (instead of raising an error) if the base expression is an empty sequence. This way, the following expressions would all run through:

let $map := map { 'giovanni': map { 'city': 'roma' } }
return $map('andrea')('city'),
let $data := ()
return $data(1)(2)(3),
()(123),
()()

Many people use parentheses instead of the lookup operator for accessing maps & arrays, and the proposed change would make the syntax more interchangeable. I believe it would also be useful for function items in general.
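The proposed behaviour can be mimicked in Python as a small sketch. The names EMPTY and call are assumptions, the empty tuple stands in for the empty sequence, and a missing map key is taken to yield the empty sequence, as it does for XPath maps.

```python
EMPTY = ()  # stands in for the XPath empty sequence


def call(base, *args):
    """Dynamic call that yields the empty sequence when the base is empty."""
    if base == EMPTY:
        return EMPTY                      # proposed: propagate instead of raising
    if isinstance(base, dict):            # maps act as functions of their keys
        return base.get(args[0], EMPTY)
    return base(*args)
```

With this rule, call(call(m, "andrea"), "city") returns EMPTY for the map in the example above instead of failing on the second call.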

Issue #706 created #created-706

15 Sep at 14:45:33 GMT
FLWOR: for member $m1 in $a1, member $m2 in $a2

Currently, the member keyword must always be placed directly after the for clause:

(: valid :)
for $a in 1 to 10
for member $m in $array

(: invalid :)
for $a in 1 to 10, member $m in $array

In addition, the keyword applies to all other bindings in the same for clause:

for member $m1 in $array1, $m2 in $array2

My feeling is that this syntax is a bit odd, as other keywords (allowing empty, at) only refer to the currently bound variable. Next, the member syntax would differ from the semantics of XQuery Full Text: The score keyword is placed before the variable name, and can be used more than once (or omitted):

let score $s1 := $data1, score $s2 := $data2

I think we should change this. It would also simplify the grammar:

InitialClause     ::=  ForClause | LetClause | WindowClause
ForClause         ::=  "for" ForBinding ("," ForBinding)*
ForBinding        ::=  (SimpleForBinding | ForMemberBinding) PositionalVar? "in" ExprSingle
SimpleForBinding  ::=  VarBinding AllowingEmpty?
ForMemberBinding  ::=  "member" VarBinding
AllowingEmpty     ::=  "allowing" "empty"
PositionalVar     ::=  "at" "$" VarName
VarBinding        ::=  "$" VarName TypeDeclaration?

And one more motivation for changing is that map bindings will be easier to define (see #31).

Issue #705 created #created-705

15 Sep at 07:08:45 GMT
Function Coercion: Function Arities

In 4.6.4 Function Coercion, a rule was added to support functions with an arity lower than the expected one:

If F has lower arity than the expected type, then F is wrapped in a new function that declares and ignores the additional argument; the following steps are then applied to this new function.

If I got it right, this is the resulting 4.0 behavior:

Spoiler: I probably got it wrong, see the next comment.

declare function local:function($a) { };
declare variable $function := function($a) { };

(: now legal :)
filter    (1984, true#0)
$function (1984, 'ignored')
fn { }    (1984, 'ignored')
map { }   (1984, 'ignored')
true#0    ('ignored')
sum(?, ())(1984, 'ignored')

(: still illegal :)
local:function(1984, 'ignored')

(: still legal: RHS items will be supplied one by one :)
map { }(1984, 'processed')     

Maybe some more examples should be added in the corresponding sections that refer to function coercion.

The new rule is powerful and allows for greater flexibility (see #516 and other issues), but the behavior may also be unexpected. We should probably document that:

  • It may go unnoticed that a passed on argument will be ignored. In other words, we reduce type safety by allowing users to supply more arguments than will be processed.
  • It makes a difference whether the invoked function is static or dynamic (dynamic functions will now provide less type safety than static functions).
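The wrapping step of the coercion rule can be illustrated with a short Python sketch. It is not the specified algorithm: arities are passed explicitly rather than inspected, type checks on arguments are omitted, and the name coerce is an assumption.

```python
def coerce(func, func_arity, expected_arity):
    """Wraps func so it accepts expected_arity arguments and ignores the extras."""
    if func_arity > expected_arity:
        raise TypeError("function has higher arity than expected")

    def wrapper(*args):
        if len(args) != expected_arity:
            raise TypeError(f"expected {expected_arity} arguments, got {len(args)}")
        # The additional arguments are declared by the wrapper but never used.
        return func(*args[:func_arity])

    return wrapper
```

The sketch also shows the type-safety concern raised above: coerce(lambda a: a, 1, 2)(1984, "ignored") succeeds, and nothing tells the caller that the second argument was dropped.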

Issue #704 created #created-704

14 Sep at 12:26:24 GMT
Context Value Expression → Context Value Reference

The specification defines Variable References for accessing values bound to a variable, and we should rename the equivalent operation for accessing the context from “Context Value Expression” to “Context Value Reference” (even more so if we should decide to introduce a Context Value Declaration later on, as discussed in #755).

Related: https://github.com/qt4cg/qtspecs/pull/703#issuecomment-1719345430

Pull request #703 created #created-703

14 Sep at 11:43:25 GMT
129 (1): XPath and XQuery changes for introduction of context value

Fix #129 This replaces the previous attempt from several months ago, which had too many conflicts to be salvageable.

This is a wide-ranging and pervasive change, and I would like the changes to be applied promptly and incrementally to reduce the risk of conflicts, even if further work is needed later. This first PR addresses the XQuery and XPath language specifications. Further changes (in subsequent PRs) are needed for F+O and for XSLT. There are also a couple of minor changes affecting Serialization (but none affecting the data model).

Issue #368 closed #closed-368

14 Sep at 11:38:46 GMT

129: Context item generalized to context value

Pull request #702 created #created-702

14 Sep at 09:29:55 GMT
701: fn:concat: Support for 0 or more arguments

Closes #701

Issue #701 created #created-701

14 Sep at 09:08:30 GMT
fn:concat: Support for 0 or more arguments

With #161, we plan to introduce support for variadic functions.

The scope of this issue is much smaller and can be seen as a preparatory one; it’s about making the first two arguments of the function optional. I’ll create a little PR for it.

Issue #700 created #created-700

14 Sep at 09:06:25 GMT
Operators for array mapping and filtering

With issue #129 generalising the context item to a context value, we have the opportunity to define context-based mapping and filtering operators for arrays that work in the same way as the A!B and A[B] operators for sequences.

I propose A!!B as a mapping operator for arrays. Unlike !, this does not flatten the result. The result is an array whose members correspond one-to-one with the members of A, each member of the result array being formed by evaluating B with the corresponding member of A as the context value.

For an array whose members are singletons, the expression A!!B has a similar effect to A?*!B, but (a) it is clearer, (b) it returns an array rather than a sequence, and (c) it performs no flattening. For the more general case, it can be used in place of the higher-order function call A => array:for-each(fn{B}).

I propose A?[B] as a filter operator for arrays. Unlike A[B], this is not overloaded to perform index-based selection. The result is an array containing those members of A for which B has an effective boolean value of true, when evaluated with the corresponding member of A as the context value. The expression $A?[B] is equivalent to $A => array:filter(fn{B}).

For example, $A?[exists(.)] filters the array $A to retain only those members that are non-empty.
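The difference between the proposed operators and the existing flattening idiom can be sketched in Python, modelling an array as a list of members and each member (a sequence) as a list. The function names are hypothetical, and index-based predicates are deliberately left out.

```python
def array_map(array, fn):
    """A!!B: one result member per input member, no flattening."""
    return [fn(member) for member in array]


def flattening_map(array, fn):
    """A?*!B: members are flattened to items first, and results are concatenated."""
    items = [item for member in array for item in member]
    return [out for item in items for out in fn(item)]


def array_filter(array, pred):
    """A?[B]: keeps the members for which the predicate holds."""
    return [member for member in array if pred(member)]
```

For the array [[1], [2], [3]], array_map with a member-doubling function gives [[1, 1], [2, 2], [3, 3]], while flattening_map with the per-item equivalent gives the flat list [1, 1, 2, 2, 3, 3]. array_filter([[1], [], [2, 3]], lambda m: len(m) > 0) keeps the non-empty members, as in the exists(.) example.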

Issue #645 closed #closed-645

14 Sep at 08:49:35 GMT

Editorial: Use `\n` instead of `\r\n` in XML documents

Issue #697 closed #closed-697

14 Sep at 08:49:33 GMT

645: Use \n instead of \r\n in XML documents

Issue #699 created #created-699

14 Sep at 08:44:58 GMT
GitHub: Signing

@ndw Sorry to keep you busy. I think we should disable the enforced signing of commits. We know the persons who send PRs, and currently we only have three persons (you, Reece, me) who sign their commits.

If signing is disabled, non-admins, for example, will also be able to merge those PRs.

Issue #698 created #created-698

14 Sep at 08:40:34 GMT
GitHub: Line Endings

@ndw I’ve copied your suggestion from https://github.com/qt4cg/qtspecs/issues/645#issuecomment-1657056816 to a new issue:

we can tell git to fix the line endings in comments. I'll see about getting that setup.

In addition, it would be great if line endings were not changed when editing files in place.

Due to my changes in #645, all relevant files should now use Unix-style line endings, so this would be a sane default (e.g. if the modification of files leads to a mixture of \n and \r\n).

Pull request #697 created #created-697

14 Sep at 08:35:01 GMT
645: Use \n instead of \r\n in XML documents

#645 (editorial)

Pull request #696 created #created-696

13 Sep at 15:17:38 GMT
566: Rework query parameters on build-uri/parse-uri

Completes action QT4CG-042-02.

  1. Rework the query segments so that they're a simple map of key/value pairs.
  2. Rename query-segments to query-parameters.

Issue #692 closed #closed-692

13 Sep at 14:30:52 GMT

Use sequences instead of arrays in fn:parse-uri output

Issue #695 created #created-695

12 Sep at 22:08:22 GMT
Step in RangeExpression

In the XPath specs, it seems that a simple modification from...

[34] RangeExpr ::= AdditiveExpr ( "to" AdditiveExpr )?

...to...

[34] RangeExpr ::= AdditiveExpr ( "to" AdditiveExpr ( "step" AdditiveExpr )? )?

...would be nonintrusive, and bring some nice benefits customary in other PLs, allowing expressions such as 1 to 9 step 2 and 100 to -100 step -4.
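
As a rough sketch of what such a range might evaluate to, here is a Python model. It assumes the end bound stays inclusive (as with the existing "to") and that a zero step is an error; the proposal itself does not say either, and the function name range_to is hypothetical.

```python
def range_to(start, end, step=1):
    """Model of `start to end step step`, with an inclusive end bound."""
    if step == 0:
        # Assumption: the proposal does not define a zero step.
        raise ValueError("step must be non-zero")
    stop = end + 1 if step > 0 else end - 1
    return list(range(start, stop, step))

odds = range_to(1, 9, 2)         # models 1 to 9 step 2
down = range_to(100, -100, -4)   # models 100 to -100 step -4
```

Under these assumptions the first expression yields the odd numbers from 1 to 9, and the second counts down from 100 to -100 in steps of 4.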

Thoughts?

Pull request #694 created #created-694

12 Sep at 21:57:37 GMT
XQFO minor edits, with new examples and notes, 2 through 4.6

Minor edits here are motivated by clarity or localized consistency. In one case, the change of "0 to 0" to "0 to 9" appears to address an important typo.

I have introduced select examples, to illustrate points in the corresponding rules.

The notes I have introduced need some context. Option 9 for the primary format token is somewhat vague, and the attached note of clarification stokes the imagination. I have trimmed that note for clarity (without substantive changes) but introduced a set of notes later, to help caution developers on unexpected behaviors with option 9, and secondarily to caution processor implementers on the challenges inherent in supporting this option. If I had my druthers, I would advocate deprecating option 9. It is--to use a technical term--squishy.

Spec editors: I am happy to pull back on any of this.

QT4 CG meeting 045 draft minutes #minutes-09-12

12 Sep at 17:15:00 GMT

Draft minutes published.

Issue #160 closed #closed-160

12 Sep at 16:13:36 GMT

Support named arguments on dynamic function calls

Issue #672 closed #closed-672

12 Sep at 16:12:34 GMT

XFO minor edits, chap. 1

Issue #671 closed #closed-671

12 Sep at 16:12:21 GMT

Switch expression without operand (analogous to XSLT choose)

Issue #678 closed #closed-678

12 Sep at 16:12:20 GMT

671 switch sans operand

Issue #669 closed #closed-669

12 Sep at 16:12:02 GMT

Typo in XSLT §26.4 - "appearing appearing"

Issue #679 closed #closed-679

12 Sep at 16:12:00 GMT

669 - fix typo "appearing appearing"

Issue #665 closed #closed-665

12 Sep at 16:11:48 GMT

Typo in fn:items-before and fn:items-ending-where

Issue #681 closed #closed-681

12 Sep at 16:11:46 GMT

665: Fix typos in fn:items-XX functions

Issue #637 closed #closed-637

12 Sep at 16:11:32 GMT

Annotation Values: Booleans

Issue #682 closed #closed-682

12 Sep at 16:11:31 GMT

637: allow true() and false() as function annotation values

Issue #90 closed #closed-90

12 Sep at 16:11:16 GMT

Simplified simplified stylesheets

Issue #599 closed #closed-599

12 Sep at 16:11:15 GMT

90: Simplified stylesheets with no xsl:version

Issue #93 closed #closed-93

12 Sep at 16:10:59 GMT

Support order by ascending/descending from a string value.

Issue #623 closed #closed-623

12 Sep at 16:10:57 GMT

93: sort descending

Issue #600 closed #closed-600

12 Sep at 16:10:22 GMT

fn:decode-from-uri: counterpart of fn-encode-to-uri

Issue #631 closed #closed-631

12 Sep at 16:10:20 GMT

600: fn:decode-from-uri

Issue #693 created #created-693

12 Sep at 14:30:37 GMT
QT4 Tests without counterpart in the specs

The following functions are not defined in the current spec:

  • fn:unparcel, fn:parcel → dropped
  • fn:xdm-to-json → #576
  • fn:concat() → see #701
  • fn:parts → see #463
  • codepoints-to-string(), etc. (sequence-valued arguments)

Pull request #692 created #created-692

12 Sep at 14:21:20 GMT
Use sequences instead of arrays in fn:parse-uri output

Completes action QT4CG-042-01 on NW.

There is a corresponding PR against the test suite.

QT4 CG meeting 045 draft agenda #agenda-09-12

11 Sep at 11:15:05 GMT

Draft agenda published.

Pull request #691 created #created-691

08 Sep at 19:40:26 GMT
688 Semantics of local union types, enumeration types, etc

Fix #688.

This PR fleshes out the detailed semantics of local union types, enumeration types, and type aliases. It fills a number of gaps in the current specification but doesn't aim to change the overall intent.

Pull request #690 created #created-690

08 Sep at 11:29:06 GMT
687 Clarify constructor functions for user-defined types

Clarifies the rules for constructor functions, especially for list and union types, and for types defined by means of type aliases rather than in an imported schema. Fix #687.

Issue #689 created #created-689

08 Sep at 08:36:45 GMT
fn:stack-trace: keep, drop, replace with $err:stack-trace ?

The current specification contains a diagnostic function called fn:stack-trace. Many other languages provide a similar function: the returned output can help to understand which function calls led to an error during the evaluation of the code.

Still, I have strong doubts that it is a good decision to include this function in the standard:

The specification gives you a vast amount of freedom in how to implement and optimize things. As a result, it’s completely feasible and reasonable to rewrite the following code…

declare function local:double($f) {
  $f * 2
};
(1 to 6) ! local:double(.)

…to (1 to 6) ! (. * 2) at compile time. If a user adds a fn:stack-trace call in the function body, s·he would expect to find the function invocation of the original code representation in the output. As always, there are technical solutions to achieve this (store additional information about the original query in the evaluation tree; suppress optimizations when fn:stack-trace is found), but all of them can affect the runtime behavior and lead to different evaluation trees, hiding possible bugs in the implementation (which can be a reason to call fn:stack-trace at all).
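
The effect of inlining on a stack trace can be sketched in Python, used here only as a stand-in for an XQuery processor. The helper frame_names plays the role of a stack-trace call, and run_inlined is a hand-written guess at what an optimizer might produce; both names are hypothetical.

```python
import traceback

def frame_names():
    """Names of the functions currently on the call stack."""
    return [frame.name for frame in traceback.extract_stack()]

def double(f, seen):
    seen.append(frame_names())   # stand-in for a stack-trace call in the function body
    return f * 2

def run_as_written():
    seen = []
    result = [double(n, seen) for n in range(1, 7)]
    return result, seen

def run_inlined():
    # What an optimizer might produce: the body of `double` is inlined,
    # so no frame for `double` exists when the trace is taken.
    seen = []
    result = []
    for n in range(1, 7):
        seen.append(frame_names())
        result.append(n * 2)
    return result, seen
```

Both variants compute the same values, but only the first records a frame named double, which is the kind of divergence between implementations described above.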

A standard should provide a minimum amount of assurance that a function behaves similarly across different implementations. At this time, I don’t believe we can give that guarantee.

Related: #55, #686

– As an alternative, a stack trace could optionally be created by an implementation when an error is triggered.

Issue #688 created #created-688

07 Sep at 17:33:43 GMT
Coercion rules for union types and enumeration types

The coercion rules for enumeration types have not been defined (there is a TODO in the spec).

For union types (including both schema-defined and locally-defined union types), the rules appear to need some further work. Given types R1 and R2 that are defined by restriction from B1 and B2, if an atomic value V is an instance of B1 that conforms to the rules of R1, then the relabelling coercion means that V will now be acceptable where the required type is R1. But if the required type is union(R1, R2), the relabelling coercion is not invoked. This feels inconsistent, since union(R1, R2) might be expected to accept anything that R1 accepts.
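
The inconsistency can be made concrete with a toy model in Python. Everything here is an assumption made for illustration: the classes Restricted and Typed, the choice of int and str as B1 and B2, the facets, and in particular coerce_union_consistent, which is just one conceivable rule and not a proposal taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Restricted:
    name: str
    base: type                        # plays the role of B1 / B2
    facet: Callable[[object], bool]   # the restriction rule

@dataclass(frozen=True)
class Typed:
    value: object
    label: object   # either a base type or a Restricted type

def relabel(v, required):
    """Relabelling coercion: a base-type instance that satisfies the facet is relabelled."""
    if v.label is required:
        return v
    if v.label is required.base and required.facet(v.value):
        return Typed(v.value, required)
    return None

def coerce_union_current(v, members):
    """Behaviour described above: no relabelling, V must already be an instance of a member."""
    return v if any(v.label is m for m in members) else None

def coerce_union_consistent(v, members):
    """One conceivable rule (an assumption): try relabelling against each member in order."""
    for m in members:
        r = relabel(v, m)
        if r is not None:
            return r
    return None

R1 = Restricted("R1", int, lambda n: 0 <= n <= 100)
R2 = Restricted("R2", str, lambda s: len(s) <= 3)
V = Typed(42, int)
```

In this model V is accepted where R1 is required, rejected by coerce_union_current for union(R1, R2), and accepted by the alternative rule.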

Test case FunctionCall-056 (currently failing) illustrates the problem.

Issue #687 created #created-687

07 Sep at 15:21:50 GMT
Constructor functions for user-defined types

This is a deficiency in the 3.1 F+O specification.

Constructor functions for user-defined types are very poorly described:

  1. It's unclear how anonymous types are handled. The spec says there is a constructor function for every simple type in the static context. That would include anonymous types. But constructor functions for anonymous types, if they exist, are essentially useless, because their names are not known.

  2. The semantics of constructor functions for user-defined list and union types are described very vaguely, by analogy with built-in types; and the analogy points to the section on built-in atomic types which doesn't cover union and list types.

  3. For a union type U, it says that the return type of the constructor U(x) is defined as xs:anyAtomicType. Why not define it as U? Perhaps this predates the ability to use union types as return types.

Issue #686 created #created-686

06 Sep at 21:28:53 GMT
XQFO presentation of diagnostic functions

From an informal discussion on Slack, I feel that clarity is needed in the Diagnostic tracing section. The problem is that fn:trace() and fn:log() introduce the term "trace output" with no definition or explanation; this is easily confused with the primary output defined by the function signatures, and it affects how readers think about the determinism of the functions. "The serialization of the trace output..." implies that the processor will necessarily serialize something, but I doubt that can or should be presumed. More needs to be said about the responsibilities of the processor in the contract for these functions.

In my opinion, this section would benefit from a brief preamble, providing context to set the stage for the rules. Some draft text for us to discuss:

Diagnostic tracing functions provide a transfer of information, either from the processor to the dynamic context, or vice versa.

The function that transfers information from the processor to the dynamic context, fn:stack-trace(), returns a string that can be further processed and used in the XPath expression and elsewhere in a host language.

The functions that transfer information from the dynamic context to the processor, fn:trace() and fn:log(), each have two effects. The first effect, the output, pertains to the returned values, defined by the function signature and essential to the XPath expression. Such output is always deterministic. The second effect, processor behavior, concerns the way the processor handles the values bound to the parameters, supplied for diagnostic tracing. Processor behavior is always directed toward the user or environment that invoked the processor. Actions may include sending messages, serializing the values and writing them to a log file or database, or something else. Unlike the output (the first effect), the results of processor behavior are implementation-defined and nondeterministic with respect to order of the parameter values.

The draft above attempts to avoid "output" to describe the processor-side diagnostics, so as to avoid potential confusion when dealing with the return-type defined in the signature. Where "trace output" appears in each of the rules, "processor behavior" can be used instead.

Questions:

  • Any objections, corrections, or suggestions?
  • Any other examples of how a processor might use trace diagnostics?
  • Should a paragraph be added to explain briefly how trace() and log() differ from xsl:message? (E.g., a serialized tree should not be presumed.)
  • In the informal Balisage birds-of-a-feather discussion this summer, reservations were expressed by participants about the name log(). Is it possible to drop the function and simply extend the arity of trace() with a parameter $return-input as xs:boolean? := true()?
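
To make the distinction between the two effects tangible, here is a minimal Python sketch of the single-function idea raised in the last question. It is an assumption-laden model: the sink parameter stands in for whatever the processor does with diagnostics, the message format is invented, and return_input mirrors the suggested $return-input parameter.

```python
def trace(value, label=None, return_input=True, sink=print):
    """First effect: a deterministic return value. Second effect: whatever the sink does."""
    sink(f"{label}: {value!r}" if label is not None else repr(value))
    return value if return_input else ()

messages = []
# Returns its input unchanged, as trace() does.
kept = trace([1, 2, 3], "items", sink=messages.append)
# Returns an empty result; only the processor-side effect remains.
dropped = trace([1, 2, 3], "items", return_input=False, sink=messages.append)
```

The returned values are fully determined by the arguments, while what ends up in messages (or in a log file, or nowhere) is the part the draft text calls processor behavior.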

Issue #685 closed #closed-685

06 Sep at 09:59:52 GMT

Style fixes

Pull request #685 created #created-685

06 Sep at 09:49:14 GMT
Style fixes
  1. Put the XSLT processor version comment at the end of the file instead of the beginning. Putting it before the <!DOCTYPE html> forces browsers into quirks mode.
  2. Improve the XPath Functions stylesheets so that they don't put div elements inside p elements when outputting examples.

Issue #684 closed #closed-684

06 Sep at 09:17:27 GMT

Ignore this PR

Issue #683 closed #closed-683

06 Sep at 09:14:34 GMT

XQFO context/focus in/dependent functions clarification note

Pull request #684 created #created-684

06 Sep at 08:37:43 GMT
Ignore this PR

This is just norm hacking about

Pull request #683 created #created-683

06 Sep at 02:16:02 GMT
XQFO context/focus in/dependent functions clarification note

Added note clarifying the relationship between context and focus, for the purpose of illustrating the relationships between focus/context in/dependent functions. Wrapped with companion <note>s in a parent <notes>.

Pull request #682 created #created-682

05 Sep at 20:12:37 GMT
637: allow true() and false() as function annotation values

Fix #637

Issue #658 closed #closed-658

05 Sep at 19:34:08 GMT

Constructor Function: Parameter Name, Zero-Arity

Pull request #681 created #created-681

05 Sep at 19:30:59 GMT
665: Fix typos in fn:items-XX functions

Fix #665

Pull request #680 created #created-680

05 Sep at 17:52:01 GMT
668 define case insensitive collation normatively

Fix #668

QT4 CG meeting 044 draft minutes #minutes-09-05

05 Sep at 17:20:00 GMT

Draft minutes published.

Pull request #679 created #created-679

05 Sep at 16:53:56 GMT
669 - fix typo "appearing appearing"

Fix #669

Pull request #678 created #created-678

05 Sep at 16:43:25 GMT
671 switch sans operand

Fix #671

Issue #619 closed #closed-619

05 Sep at 16:04:53 GMT

XDM ch. 6 minor edits

Issue #633 closed #closed-633

05 Sep at 16:04:28 GMT

Edits ch. 4.1 through 4.15

Issue #601 closed #closed-601

05 Sep at 15:15:13 GMT

fn:all → fn:every?

Issue #640 closed #closed-640

05 Sep at 15:15:08 GMT

601: fn:all → fn:every?

Issue #675 created #created-675

05 Sep at 14:55:54 GMT
XSLT streaming rules for new constructs

The XSLT spec has rules for the streamability of all system functions and XPath language constructs. These need updating for new 4.0 constructs.

Issue #664 closed #closed-664

05 Sep at 14:08:26 GMT

663 xsl:original keywords

Pull request #674 created #created-674

05 Sep at 14:07:29 GMT
663: Describe how calls to xsl:original with keywords work

Rework PR 664 (fix for 663) on new baseline

Fix #663.

Pull request #673 created #created-673

05 Sep at 13:53:36 GMT
HTML namespace changes

This PR applies my action items for updating the HTML XDM mapping around namespaces and local names.

Note: this currently makes dm:namespace-nodes return an empty sequence. I'm not currently sure what the best approach is here.

Pull request #672 created #created-672

04 Sep at 20:05:30 GMT
XFO minor edits, chap. 1

The most substantive change is the trimming of prose held over from before the revision of the diagrams.

Issue #671 created #created-671

02 Sep at 19:37:54 GMT
Switch expression without operand (analogous to XSLT choose)

By syntax analogy with switch,

choose
   test ($a < $b) return "lesser"
   test ($a > $b) return "greater"
   test ($a eq $b) return "equal"
   default return "Getting the default is hard to explain"

Something like

ChooseExpr ::= "choose" ChooseTestClause+ "default" "return" ExprSingle
ChooseTestClause ::= "test" "(" Expr ")" "return" ExprSingle

I know I can do this by stringing if-then-else together, but would greatly appreciate the cleaner and more manageable syntax for those times when many tests are inescapable.
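
A sketch of the evaluation order such an expression would presumably have, written in Python with thunks. It assumes, by analogy with xsl:choose, that tests are tried in order and that only the selected branch is evaluated; the function names choose and compare are hypothetical.

```python
def choose(clauses, default):
    """clauses: list of (test, result) thunks; the first test that is true wins."""
    for test, result in clauses:
        if test():
            return result()
    return default()

def compare(a, b):
    return choose(
        [
            (lambda: a < b, lambda: "lesser"),
            (lambda: a > b, lambda: "greater"),
            (lambda: a == b, lambda: "equal"),
        ],
        default=lambda: "Getting the default is hard to explain",
    )
```

With ordinary numbers one of the three tests always succeeds; in this Python model the default is reached only for values such as NaN, for which all three comparisons are false.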

Issue #670 created #created-670

02 Sep at 05:45:09 GMT
The trouble with XPath’s fn:fold-right. A fix and Proposal for fn:fold-lazy

The trouble with XPath’s fn:fold-right.
Laziness in XPath.

This article discusses the standard XPath 3.1 function fn:fold-right, its definition in the official Spec, its lack of apparent use-cases and its utter failure to reproduce the (lazy) behavior of Haskell’s foldr, which is presumed to be the motivation behind fn:fold-right.
The 2nd part of the article introduces implementations of short-circuiting and generators, which together, for the first time, provide laziness in XPath. Based on these, a new XPath function, fn:fold-lazy, is implemented that utilizes laziness, similar to Haskell’s foldr. This behavior is demonstrated in specific examples.

Introduction

Higher order functions were introduced into XPath starting with version 3.0 in 2014 and later in version 3.1 in 2017.
The definition of the standard function fn:fold-right closely mimics that of Haskell’s foldr, and anyone acquainted with foldr can be left with the impression that fn:fold-right would have identical behavior (and hence use-cases) as Haskell’s foldr.

Unfortunately, there is a critical difference between the definitions of these two functions. Whereas the definition of foldr explicitly defines its behavior when provided with a function that is lazy in its 2nd (right) argument – from Haskell’s definition of foldr:

“… Note that since the head of the resulting expression is produced by an application of the operator to the first element of the list, given an operator lazy in its right argument, foldr can produce a terminating expression from an unbounded list.”

The XPath definition of fn:fold-right does not mention any laziness.

There is no official concept of “laziness” in XPath, thus fn:fold-right doesn’t cover some of the most important use-cases of Haskell’s foldr, which can successfully produce a result when passed an infinite (unbounded) list.

This in fact makes fn:fold-right almost useless, and explains why even some of the members of the XPath 3.1 WG have stated on occasion that they do not see why the function was introduced.

fn:fold-right gone wrong – example

This Haskell code:

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..1000000])

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..10000000])

foldr (\x y -> (if x == 0 then 0 else x*y)) 1 (map (\x -> x - 15) [1 ..])

produces the product of all numbers in the following list, respectively:

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, 999985]

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, 9999985]

[-14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, …, ] -- up to infinity.

Because all 3 of these lists contain a zero as their 15th item, the expected result is 0 when evaluating any of these 3 expressions – even in the last case, where the list provided as argument is infinite. And this is indeed what happens:

image

Not only does Haskell produce the correct result in all cases but, regardless of the list’s length, the result is produced instantaneously!
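
The mechanism behind this can be imitated in a strict language by passing the rest of the fold as a thunk. The following Python sketch is an illustration of the idea, not a translation of Haskell's implementation; foldr_lazy and mult are hypothetical names.

```python
import itertools

def foldr_lazy(f, zero, items):
    """Right fold in which f receives the rest of the fold as a thunk."""
    it = iter(items)

    def rest():
        # One-shot thunk: each call consumes the next item of the iterator.
        try:
            x = next(it)
        except StopIteration:
            return zero
        return f(x, rest)

    return rest()

def mult(x, rest):
    # Lazy in its second argument: rest() is never called when x is 0.
    return 0 if x == 0 else x * rest()

infinite = (n - 15 for n in itertools.count(1))   # -14, -13, ..., 0, 1, 2, ...
result = foldr_lazy(mult, 1, infinite)
```

Because mult never asks for the rest of the fold once it sees 0, only the first 15 items of the unbounded input are ever produced.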

Now, let us evaluate this equivalent XPath expression with BaseX:

let $product := function($input as array(xs:integer)) as xs:integer
                         { 
                           array:fold-right($input, 1, function($x as xs:integer, $y as xs:integer) as  xs:integer 
                                                               {if($x eq 0) then 0 else $x * $y}) 
                         },
    $ar := array { (1 to 36) ! function($x as xs:integer) as xs:integer {$x -15}(.)}
  return
     $product($ar)

Here we are passing a list containing just 36 integers. The result is quite unexpected and spectacular:

image

Here is what happens:

  1. Even though the result is already 0 when processing the 15th member of the array, the XPath processor continues to evaluate the RHS (right-hand side) until the last (36th) member of the array, whose value is 21.

  2. On “its way back” the XPath processor multiplies: (21*20*19*18*17* …*6*5*4)*3, and the result of the right-most multiplication is bigger than the maximum integer (or decimal) that this XPath processor supports.

  3. C r r r a s s h … As seen in the screenshot above.
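
The arithmetic can be checked with a short Python sketch of an eager right fold. It assumes that the processor's integers are 64-bit signed values, which the text above does not state explicitly; the array members are the values (1 to 36) minus 15, that is -14 to 21.

```python
INT64_MAX = 2**63 - 1   # assumption: the processor's integers are 64-bit signed

def strict_fold_right_product(values, limit=INT64_MAX):
    """Eager right fold; reports the member at which the product first exceeds the limit."""
    acc = 1
    for x in reversed(values):
        acc = 0 if x == 0 else x * acc
        if abs(acc) > limit:
            return x, acc
    return None, acc

members = [n - 15 for n in range(1, 37)]   # -14 .. 21, with 0 as the 15th member
overflow_at, value = strict_fold_right_product(members)
```

Under that assumption the running product 21*20*…*4 still fits, and the multiplication by 3 is the first one to exceed the limit, long before the fold gets back to the member that is 0.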

The root cause for this unfortunate behavior is that the XPath processor doesn’t support short-circuiting and laziness. And thus, fn:fold-right is useless even in the normal/trivial case of a collection (array) with only 36 members. Not to speak of collections containing millions of members, or even infinite ones…

Let us see what happens when evaluating similar expressions with another XPath processor: Saxon.

Saxon seems to produce the correct result; however, the evaluation time grows steeply as the length of the passed array is increased, leading to this one:

image

It took 261 seconds for the evaluation to be done, but accessing the 15th member of the array and short-circuiting to 0 should be almost instantaneous…
So what happens in this case? The difference between BaseX and Saxon is that Saxon implements a “Big Integer” and thus can multiply almost 1 000 000 integers without getting a value that cannot be handled… But doing almost 1M multiplications of big integers obviously takes time …

What is common in these two examples? Obviously, neither BaseX nor Saxon detects and performs short-circuiting. What is the reason for this?

I asked a developer of BaseX if I could submit a bug about this behavior. His answer was shockingly unexpected: “This is not a bug, because no requirement in the Specification has been violated”.

Thus, the main cause of the behavior that both XPath processors show when evaluating these examples is the specification of the function, which blatantly allows such crap to happen.

Now that we see this, let us try to provide the wanted, useful behavior writing our own function.

The fix: Step 1 – fn:fold-right in pure XPath

Before going in depth with our pure XPath solution, we need as a base a pure-XPath implementation of fn:fold-right .

 let $fold-right-inner := function ($seq as item()*,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $self as function(*)
                                   ) as item()*
{
  if(empty($seq)) then $zero
    else
      $f(head($seq), $self(tail($seq), $zero, $f, $self))
},

    $fold-right := function ($seq as item()*,
                             $zero as item()*,
                             $f as function(item(), item()*) as item()* 
                            ) as item()*
{
  $fold-right-inner($seq, $zero, $f, $fold-right-inner)
},
               
   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y}
   
   return
     $fold-right(1 to 6, 1, $fMult)

When we evaluate the above with either of the two XPath processors, the correct result is produced:

720

And, with an example similar to the earlier one, we certainly do have exactly the same problems as with the built-in fn:fold-right:

image

The fix: Step 2 – $fold-right-sc detecting and performing short-circuiting

Now that we have $fold-right as a base, let us add code to it so that it will detect and perform short-circuiting. We will implement a function similar to $fold-right but having this signature:

    $fold-right-sc := function ($seq as item()*,
                                $zero as item()*,
                                $f as function(item(), item()*) as item()*,
                                $fGetPartial as function(*)
                               ) as item()*

The last of the function’s parameters $fGetPartial returns a new function that is the partial application of $f, when its 1st argument is set to the current member of the input sequence $seq. The idea is that whenever short-circuiting is possible, $fGetPartial returns not a function having one argument (arity 1), but a constant – a function with 0 arguments (arity 0).

If the arity of the partial application so produced is 0, then our code immediately returns the value obtained by calling it, without recursing further.

Here is the complete code of $fold-right-sc:

 let $fold-right-sc-inner := function ($seq as item()*,
                                       $zero as item()*,
                                       $f as function(item(), item()*) as item()*,
                                       $fGetPartial as function(*),
                                       $self as function(*)
                                      ) as item()*
{
  if(empty($seq)) then $zero
    else
      if(function-arity($fGetPartial(head($seq), $zero)) eq 0)
        then $fGetPartial(head($seq), $zero) ()
        else $f(head($seq), $self(tail($seq), $zero, $f, $fGetPartial, $self))
},

    $fold-right-sc := function ($seq as item()*,
                                $zero as item()*,
                                $f as function(item(), item()*) as item()*,
                                $fGetPartial as function(*)
                               ) as item()*
{
  $fold-right-sc-inner($seq, $zero, $f, $fGetPartial, $fold-right-sc-inner)
},
               
   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {if($x eq 0) then 0 else $x * $y},
   $fMultGetPartial := function($x, $y)
   {
     if($x eq 0)
       then function() {0}
       else function($z) {$x * $z}
   }
   
   return
     $fold-right-sc((1 to 1000000) ! function($x){$x - 3}(.), 1, $fMult, $fMultGetPartial)

Do note:

  1. If the current item (the head of the sequence) is 0, then $fMultGetPartial returns a function with 0 arguments (constant) that produces 0.

  2. $fold-right-sc (inner) treats a partial application of arity 0 differently from a partial application of arity 1. In the former case it simply produces the expected constant value without recursing further. Here is the relevant code fragment:

  if(empty($seq)) then $zero
    else
      if(function-arity($fGetPartial(head($seq), $zero)) eq 0)
        then $fGetPartial(head($seq), $zero) ()
        else $f(head($seq), $self(tail($seq), $zero, $f, $fGetPartial, $self))
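
For readers more at home in another language, the same arity-based dispatch can be sketched in Python. This is a loose model, not a port: arity is read with inspect.signature, the sequence is walked by index instead of head/tail, and Python's range already computes its items on demand, so it says nothing about how a processor materializes the input.

```python
import inspect

def arity(fn):
    return len(inspect.signature(fn).parameters)

def fold_right_sc(seq, zero, f, get_partial, i=0):
    """Right fold that stops recursing when get_partial returns a 0-argument function."""
    if i == len(seq):
        return zero
    partial = get_partial(seq[i], zero)
    if arity(partial) == 0:
        return partial()   # short-circuit: the rest of seq is never visited
    return f(seq[i], fold_right_sc(seq, zero, f, get_partial, i + 1))

def mult(x, y):
    return x * y

def mult_get_partial(x, _zero):
    if x == 0:
        return lambda: 0       # constant: arity 0
    return lambda z: x * z     # ordinary partial application: arity 1

seq = range(-2, 999_998)   # the same values as (1 to 1000000) minus 3
result = fold_right_sc(seq, 1, mult, mult_get_partial)
```

The recursion stops at the third item, where the value is 0, so the depth stays tiny even though the input has a million items.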

And now BaseX has no problems with the evaluation, even though the input sequence is of size 1M. The complete evaluation takes just a fraction of a millisecond (0.04 ms):

image

With Saxon things are not so good. Even though Saxon produces the correct result, evaluating the expression with an input sequence of size 1M takes 0.5 seconds (half a second), and evaluating the expression with an input sequence of 10M takes 5 seconds (10 times as long):

image

What is happening?

Even though Saxon now performs much faster than the previous 261 seconds, thanks to detecting the short-circuiting possibility and performing the short-circuit, it still processes all 10M items when evaluating this subexpression (which the more optimized BaseX obviously doesn’t do in advance):

(1 to 10000000) ! function($x){$x - 3}(.)

Therefore, we have one remaining problem: How to prevent long sequences (or arrays) from being fully materialized before starting the evaluation of $fold-right-sc ?

The fix: Step 3 – replacing collections with generators

Generators are well known and provided out of the box in many programming languages. Per Wikipedia:

“In computer science, a generator is a routine that can be used to control the iteration behaviour of a loop. All generators are also iterators. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.”

A full-fledged generator (such as implemented in C#) is an instance of a Finite State Machine (FSM), and implementing it in full generality goes beyond the topic and goals of this article. Expect another article soon that will provide this.

Here we will implement a simple kind of generator that, when passed an integer index $N, produces the $Nth item of a specific sequence. Although this is probably the simplest form of a generator, it can be useful in many cases and is a good illustrative solution to our current problem. The whole approach of replacing “something” with a function that must be called to produce this “something” is known as “lifting”.

First, we will add to our $fold-right just the use of generators, without the detection and performing of short-circuiting:

let $fold-right-lifted-inner := function ($seqGen as function(xs:integer) as array(*),
                                    $index as xs:integer,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $self as function(*)
                                   ) as item()*
                                {
                                  let $nextSeqResult := $seqGen($index),
                                      $isEndOfSeq :=  $nextSeqResult(1),
                                      $seqItem := $nextSeqResult(2)
                                    return
                                      if($isEndOfSeq) then $zero
                                        else
                                          $f($seqItem, $self($seqGen, $index+1, $zero, $f, $self))
                                },

    $fold-right-lifted := function ($seqGen as function(xs:integer) as array(*),
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* 
                                  ) as item()*
                                  {
                                    $fold-right-lifted-inner($seqGen, 1, $zero, $f, $fold-right-lifted-inner)
                                  },
                                  
   $NaN := xs:double('NaN'),
   
   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                },
   $fSeq-1-6 := $fSeq1ToN(?, 1, 6),
               
   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y}
   
   return
     $fold-right-lifted($fSeq-1-6, 1, $fMult) 

Here we see an example of a simple generator – the function $fSeq1ToN.

This function returns an array with two members: the 1st is a Boolean which, if true(), indicates the end of the sequence, and the 2nd is the current head of the simulated sequence.
Besides the index $ind, the generator has two other parameters, which are the (inclusive) values of the start-index and the end-index. Whenever the passed value of $ind is outside of this specified range, $fSeq1ToN returns a result array with its first member set to true(), which indicates end-of-sequence (the 2nd member of the result must be ignored in this case).
Otherwise it returns array{false(), $ind}. It is the responsibility of the caller to stop calling the generator:

   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                }
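
The same index-based protocol can be mirrored in Python as a rough sketch. A tuple stands in for the two-member array, functools.partial plays the role of the ? placeholder, and the names are hypothetical adaptations of the ones used in the XPath code.

```python
import math
from functools import partial

def seq_1_to_n(ind, ind_start, ind_end):
    """Return (is_end, item); item is NaN and must be ignored when is_end is True."""
    if ind < ind_start or ind > ind_end:
        return True, math.nan
    return False, ind

def fold_right_lifted(seq_gen, zero, f, index=1):
    """Right fold that asks the generator for one item at a time."""
    is_end, item = seq_gen(index)
    if is_end:
        return zero
    return f(item, fold_right_lifted(seq_gen, zero, f, index + 1))

seq_1_6 = partial(seq_1_to_n, ind_start=1, ind_end=6)   # like $fSeq1ToN(?, 1, 6)
product = fold_right_lifted(seq_1_6, 1, lambda x, y: x * y)
```

The fold never sees a collection, only a function it can call with the next index, which is the essence of the lifting described here.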

Evaluating the complete XPath expression above produces the correct result both in BaseX and in Saxon: the product of the integers 1 to 6:

image

Now that we have successfully implemented the last missing piece of our complete solution, let us put everything together:

The fix: Step 4 – putting it all together

Finally we can replace the input sequence in $fold-right-sc with a generator:

let $fold-right-sc-lifted-inner := function ($seqGen as function(xs:integer) as array(*),
                                    $index as xs:integer,
                                    $zero as item()*,
                                    $f as function(item(), item()*) as item()* ,
                                    $fGetPartial as function(*),
                                    $self as function(*)
                                   ) as item()*
                                {
                                  let $nextSeqResult := $seqGen($index),
                                      $isEndOfSeq :=  $nextSeqResult(1),
                                      $seqItem := $nextSeqResult(2)
                                    return
                                      if($isEndOfSeq) then $zero
                                        else
                                          if(function-arity($fGetPartial($seqItem, $zero)) eq 0)
                                            then $fGetPartial($seqItem, $zero) ()
                                            else $f($seqItem, $self($seqGen, $index+1, $zero, $f, $fGetPartial, $self))
                                },

    $fold-right-sc-lifted := function ($seqGen as function(xs:integer) as array(*),
                                       $zero as item()*,
                                       $f as function(item(), item()*) as item()*,
                                       $fGetPartial as function(*) 
                                      ) as item()*
                                      {
                                         $fold-right-sc-lifted-inner($seqGen, 1, $zero, $f, $fGetPartial, $fold-right-sc-lifted-inner)
                                      },
                                  
   $NaN := xs:double('NaN'),
   
   $fSeq1ToN := function($ind as xs:integer, $indStart as xs:integer, $indEnd as xs:integer) as array(*)
                {
                  if($ind lt  $indStart or $ind gt $indEnd)
                    then  array{true(), $NaN}
                    else array{false(), $ind}
                },
   $fSeq-1-6 := $fSeq1ToN(?, 1, 6),
   $fSeq-1-1M := $fSeq1ToN(?, 1, 1000000),
   $fSeq-1-1M-minus-3 := function($n as xs:integer)
   {
     array{$fSeq-1-1M($n)(1), $fSeq-1-1M($n)(2) -3}
   },
               
   $fAdd := function($x, $y)  {$x + $y},
   $fMult  := function($x, $y)  {$x * $y},
   $fMultGetPartial := function($x, $y)
   {
     if($x eq 0)
       then function() {0}
       else function($z) {$x * $z}
   }
   
   return
     $fold-right-sc-lifted($fSeq-1-1M-minus-3, 1, $fMult, $fMultGetPartial) 

Now this expression (and even one involving a sequence of 10M items) takes 0 seconds to evaluate in both BaseX and Saxon, producing the correct result 0:

image
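The same idea can be sketched outside XPath. The following Python translation is illustrative only and is not the article's code: the names seq_1_to_n, mult_get_partial, fold_right_lazy and multiply are invented here, and inspect.signature stands in for function-arity.

```python
import inspect
import math
from functools import partial


def seq_1_to_n(ind, ind_start, ind_end):
    """Generator step: (True, NaN) outside the range, else (False, ind)."""
    if ind < ind_start or ind > ind_end:
        return (True, math.nan)
    return (False, ind)


def mult_get_partial(x, _zero):
    """Return a zero-argument function when the result is already known."""
    if x == 0:
        return lambda: 0
    return lambda z: x * z


def multiply(x, y):
    return x * y


def fold_right_lazy(seq_gen, zero, f, get_partial, index=1):
    """Right fold over a generated sequence, with short-circuiting."""
    is_end, item = seq_gen(index)
    if is_end:
        return zero
    step = get_partial(item, zero)
    if len(inspect.signature(step).parameters) == 0:
        # Short circuit: the rest of the sequence is never generated.
        return step()
    return f(item, fold_right_lazy(seq_gen, zero, f, get_partial, index + 1))


seq_1_6 = partial(seq_1_to_n, ind_start=1, ind_end=6)
seq_1_1m = partial(seq_1_to_n, ind_start=1, ind_end=1_000_000)


def seq_1_1m_minus_3(n):
    """Shift every generated item down by 3, so the third item is 0."""
    is_end, item = seq_1_1m(n)
    return (is_end, item - 3)
```

With the shifted generator the fold stops at the third item, so the recursion depth stays tiny even though the nominal sequence has a million items.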


Summary

This article demonstrated the problems inherent to the standard XPath fn:fold-right and correctly determined the root causes for these problems: no short-circuiting and no collection generators.

Then a step-by-step solution was built that shows how to implement lazy evaluation in XPath based on short-circuiting and collection generators. This fixed the error raised by BaseX and dramatically reduced the evaluation time of Saxon from 261 seconds to 0 seconds.

The new function produced can be called $fold-lazy and is a good candidate for inclusion in the XPath 4.0 standard functions.

A complete design and implementation of a general collection-generator will be published in a separate article.

Issue #667 closed #closed-667

29 Aug at 14:49:38 GMT

XPath minor edits, 4.16 through end

Issue #642 closed #closed-642

29 Aug at 13:09:32 GMT

561: Editorial (abbreviation fn=function, drop lambda syntax)

Issue #646 closed #closed-646

29 Aug at 12:39:53 GMT

508: Editorial, examples revised (array:split, array:slice, others)

Issue #627 closed #closed-627

29 Aug at 12:38:11 GMT

624: Adjusted function category descriptions

Issue #662 closed #closed-662

29 Aug at 12:36:42 GMT

658b: changes to constructor functions

Issue #656 closed #closed-656

29 Aug at 12:36:24 GMT

Better return type for map pair

Issue #654 closed #closed-654

29 Aug at 12:36:03 GMT

Add covers-40 attribute to generated tests

Issue #644 closed #closed-644

29 Aug at 12:35:35 GMT

Adjusted CSS to target classes, not element + classes

Issue #643 closed #closed-643

29 Aug at 12:35:13 GMT

414, 546: Adjusted XDM description of xs:string, added coding

Issue #669 created #created-669

20 Aug at 22:20:54 GMT
Typo in XSLT §26.4 - "appearing appearing"

The word "appearing" is doubled.

Issue #668 created #created-668

18 Aug at 17:42:10 GMT
Definition of HTML case-insensitive collation

The semantics of the collation URI http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive (in F&O section 5.3.4) are described by reference to the HTML5 "living spec". The cross-reference to a changing spec is inevitably fragile and I suggest we make it non-normative. I also suggest that we define the ordering implied by this collation rather than leaving it implementation defined.

A sufficient definition is: the comparison of two strings A and B under this collation delivers the same result as the comparison of ascii-lower-case(A) to ascii-lower-case(B) under the Unicode codepoint collation, where the ascii-lower-case($S) function is translate($S, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz").

Perhaps we should also consider defining a collation URI that is unicode-case-blind, with the same definition except that ascii-lower-case(X) is replaced by fn:lower-case(X).
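A minimal sketch of the proposed definition, written in Python for illustration. The function names are invented here; Python string comparison is by code point, which is what the sketch relies on, and str.lower() is only an approximation of fn:lower-case.

```python
import string

# Maps A-Z to a-z and leaves every other character untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower_case(s):
    """Lower-case only the 26 ASCII capital letters."""
    return s.translate(_ASCII_LOWER)


def compare_html_ascii_ci(a, b):
    """Return -1, 0 or 1, comparing by code point after ASCII folding."""
    x, y = ascii_lower_case(a), ascii_lower_case(b)
    return (x > y) - (x < y)


def compare_unicode_case_blind(a, b):
    """Same shape, with full lower-casing (approximated by str.lower)."""
    x, y = a.lower(), b.lower()
    return (x > y) - (x < y)
```

Under the ASCII-only rule "ABC" and "abc" compare equal while "É" and "é" do not; the case-blind variant treats both pairs as equal.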

Pull request #667 created #created-667

18 Aug at 01:54:43 GMT
XPath minor edits, 4.16 through end

Various small edits. I added two simple examples to help readers quickly understand the difference between => and =!> before going to more complex examples.

Issue #666 created #created-666

17 Aug at 10:02:50 GMT
Polyfill function implementations

For transition purposes, it may be useful to provide or enable "polyfill" implementations of functions that are specified in QT40, but not yet available in all implementations. Currently this is not possible because the relevant namespaces are reserved.

I propose that we relax the rule on reserved function namespaces:

(a) in XQuery, if the function has an annotation along the lines of `%polyfill('http://www.w3.org/2005/xpath-functions')`, where the parameter indicates that the function should be injected into the specified namespace instead of the namespace of the containing module.

(b) in XSLT, if the attribute xsl:function/@override-extension-function="no" is present. A function is allowed to be in a reserved namespace if this attribute is present.

I've chosen syntax here that's already available in 3.0/3.1, to minimise the impact on existing processors. Of course, we can't retrospectively change the 3.0/3.1 specs to authorise older processors to make this work as intended. But we can suggest that they bend the rules.

In both cases, (a) if the annotation is present then the rules on reserved namespaces don't apply, and (b) the function declaration is ignored if the processor provides its own internal implementation of the function.

I'm proposing to publish polyfill implementations of many of the new functions (not the complex ones like parse-html!). But I don't intend to make this a QT4 deliverable, I'm thinking of doing it as an open source project in GitHub/Saxonica space.

Issue #665 created #created-665

17 Aug at 08:51:00 GMT
Typo in fn:items-before and fn:items-ending-where

The specifications refer to $seq in place of $input.

Pull request #664 created #created-664

16 Aug at 20:19:55 GMT
663 xsl:original keywords

Fix #663.

A simple fix, just specify that if xsl:original is called with keywords, the keywords used are those of the overridden function.

Issue #663 created #created-663

15 Aug at 22:17:21 GMT
Calling xsl:original() with keywords

We need to define what happens if xsl:original() is called with keyword arguments.

The answer isn't trivial, because (for 3.0 compatibility reasons) an overriding function isn't required to use the same parameter keywords as the function it overrides.

Perhaps we should recognize that there are functions that do not support argument keywords. This might be the case, for example, with Java or C# extension functions.

Pull request #662 created #created-662

15 Aug at 16:34:24 GMT
658b: changes to constructor functions

Changes argument name to "value", makes argument default to context item.

Issue #661 closed #closed-661

15 Aug at 16:32:32 GMT

658: constructor functions

Pull request #661 created #created-661

15 Aug at 16:16:09 GMT
658: constructor functions

Changes argument name of constructor functions to value, makes the argument optional, revises the way the proto markup is used to indicate emptyOk arguments.

Fix #658

Issue #638 closed #closed-638

15 Aug at 11:59:41 GMT

Editorial: Avoid e.g.

Issue #660 created #created-660

15 Aug at 10:03:37 GMT
Static functions, default parameters

In the current XQuery draft, in 5.18.3 Function Parameters, it’s stated that:

If a parameter is optional, then all subsequent parameters in the list must also be optional. In other words, the parameter list includes zero or more required parameters followed by zero or more optional parameters.

I would suggest raising XPST0017 if that’s not the case.
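The check being proposed is small. A sketch in Python, assuming a hypothetical representation of a parameter list as (name, has_default) pairs; the function name and error text are invented for illustration:

```python
def check_parameter_order(params):
    """Reject a required parameter that follows an optional one.

    params is a list of (name, has_default) pairs in declaration order.
    """
    seen_optional = False
    for name, has_default in params:
        if has_default:
            seen_optional = True
        elif seen_optional:
            raise ValueError(
                f"XPST0017: required parameter ${name} follows an optional parameter"
            )
```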

Pull request #659 created #created-659

14 Aug at 18:03:57 GMT
647: schema location hints

Fix #647

Note: Need to check that this builds successfully. In my working copy, the new text for "import schema" found its way into the "xquery-assembled.xml" file but not into the final HTML. I can't see any reason for this.

Issue #658 created #created-658

14 Aug at 09:11:55 GMT
Constructor Function: Parameter Name, Zero-Arity

The parameter name for constructor functions is $arg: https://qt4cg.org/specifications/xpath-functions-40/Overview.html#constructor-functions

We should change it to $value, in alignment with the XQFO functions:

fn:string(value := 123),
xs:string(value := 123)

Issue #657 created #created-657

10 Aug at 12:14:16 GMT
User-defined functions in main modules without `local` prefix

I wonder where this has been discussed before (but I can’t find it):

For simple scripts and main modules, the necessity to prefix all functions with a local prefix is cumbersome. Next, it’s counterintuitive as the prefix is not required for variable declarations:

declare function local:f() { 1 };
declare variable $x := 1;
$x + local:f()

It’s possible to use declare default function namespace '...';, but that doesn’t feel much easier:

declare default function namespace 'x';
declare function f() { 1 };
f()

I wonder if we can do one of the following things?

  1. Allow functions without namespace (declare function x() {}, declare function Q{}x() {}).
  2. Assign functions without namespace to the default function namespace.

Pull request #656 created #created-656

09 Aug at 09:29:15 GMT
Better return type for map pair

Uses a record type (record(key, value)) for the return type of map:pair, to give more precision and to align with map:pairs() and map:of-pairs()

Issue #655 created #created-655

09 Aug at 07:30:59 GMT
fn:sort-with: Comparators

See https://github.com/qt4cg/qtspecs/issues/93#issuecomment-1017937220:

One solution to more powerful sorting would be a variant of fn:sort that uses a comparator function. We've resisted this in the past because we can't trust a user-supplied comparator function to be well behaved (e.g. transitive). I wonder how serious an obstacle this is?

We should try to trust; there are many more hidden pitfalls in the existing language.

fn:sort could be extended with a comparator parameter with two arguments and returning xs:boolean:

(: returns John, Joe, Jim, Jack :)
sort(('Jack', 'Joe', 'Jim', 'John'), comparator := op('>'))

(: returns -1, 2, -3 :)
sort((-3, -1, 2), comparator := fn($a, $b) { abs($a) < abs($b) })

Obviously, if a comparator is supplied, other parameters (collation(s), key(s), ascending, see #623) must not be specified in parallel.
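To make the intended semantics concrete, here is a sketch in Python. It assumes the reading suggested by the two examples, namely that comparator($a, $b) returning true means $a sorts before $b; sort_with is a name invented for this illustration.

```python
from functools import cmp_to_key


def sort_with(items, comparator):
    """Sort using a boolean 'a comes before b' comparator."""

    def three_way(a, b):
        if comparator(a, b):
            return -1
        if comparator(b, a):
            return 1
        return 0  # neither precedes the other: keep input order

    return sorted(items, key=cmp_to_key(three_way))
```

Because sorted is stable, items that the comparator does not distinguish keep their original relative order. A comparator that is not transitive will still terminate here, but the resulting order is then not well defined.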

Pull request #654 created #created-654

09 Aug at 06:37:30 GMT
Add covers-40 attribute to generated tests

Add covers-40 attribute to the generated test set for function keywords, to avoid failures in app-CatalogCheck test Catalog014

Issue #653 created #created-653

07 Aug at 17:18:20 GMT
XQuery - option to suppress entity expansion

As an enabler for #652 (and for other reasons) I propose that XQuery should have an option to suppress recognition of entity references. This is appropriate for any context in which XQuery expressions are embedded in XML (including our own test suite), where entity expansion will have already been done before the XQuery text is parsed. With this change, any XPath expression becomes a valid and equivalent XQuery expression.

This could be done using a new prolog declaration such as

declare entity-expansion on|off;

If set to "off", & is treated as a normal character in contexts such as string literals and direct element constructors, rather than signalling an entity reference. The default remains "on".
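The effect on a string literal can be sketched as follows. This is a Python illustration only: string_literal_value is a hypothetical helper, it handles just the five predefined entities and numeric character references, and it does not model the error that a bare ampersand would raise when expansion is on.

```python
import re

_PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}
_REFERENCE = re.compile(r"&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z]+);")


def string_literal_value(body, entity_expansion=True):
    """Return the value of a string literal body under either setting."""
    if not entity_expansion:
        # "off": the ampersand is an ordinary character.
        return body

    def expand(match):
        ref = match.group(1)
        if ref.startswith("#x"):
            return chr(int(ref[2:], 16))
        if ref.startswith("#"):
            return chr(int(ref[1:]))
        if ref in _PREDEFINED:
            return _PREDEFINED[ref]
        raise ValueError(f"unknown entity reference: &{ref};")

    return _REFERENCE.sub(expand, body)
```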

Issue #652 created #created-652

07 Aug at 17:04:48 GMT
Defining a common function library for XPath, XSLT, and XQuery applications

This issue is motivated by Mary Holstege's talk at Balisage 2023: Adventures in Single-Sourcing XSLT and XQuery

https://www.balisage.net/Proceedings/vol28/html/Holstege01/BalisageVol28-Holstege01.html

It is also motivated by other issues that have been raised here proposing improved capabilities for writing applications in pure XPath.

I propose that we define a syntax for creating a module containing a set of function definitions that can be used to define a function library available to both XQuery and XSLT applications, and potentially also by pure XPath applications. A file should contain all the functions in one namespace. The format should be suitable for translation into other formats such as XSLT and XQuery, which means it should be in XML (though we could also consider JSON). We should provide XSLT stylesheets that convert libraries in this format into XSLT stylesheet modules or XQuery library modules.

Function signatures should be expressed in an XML syntax similar to xsl:function in XSLT.

Function bodies should generally be written in the form of a single XPath expression, though there should be fallback mechanisms to allow XSLT and XQuery implementations to be supplied in cases where XPath lacks sufficient expressive power.

There should be mechanisms to define the more important components of the static context, such as namespace bindings and dependencies on other function libraries.

We could consider using this format to publish "polyfill" implementations of many of the new XPath 4.0 functions.

Issue #651 created #created-651

07 Aug at 16:48:38 GMT
fn:log → fn:message

I think we've made a mistake in the choice of name for this function. Without the namespace prefix, it looks far too much like a function that computes logarithms. Also, it would be nice in the future to find some way of allowing the math functions to be called without a namespace prefix, and this choice scuppers that possibility.

I think my choice would be fn:message.

Pull request #650 created #created-650

06 Aug at 06:05:28 GMT
649: fix an xsl:fallback problem

Ensures that an xsl:fallback instruction is not processed in forwards compatibility mode, so that errors in the instruction are reported rather than being silently ignored; informally encourages adoption of the same rule in 3.0 and earlier processors where possible.

Fix #649

Issue #649 created #created-649

05 Aug at 19:06:07 GMT
xsl:fallback

I made the mistake of writing a test that said:

    <xsl:array version="4.0">
      <xsl:fallback select="array{1 to 10}"/>
      <xsl:sequence select="1 to 10"/>
    </xsl:array>

Now, the xsl:fallback instruction doesn't allow an @select attribute; and there's not much point in adding one, because xsl:fallback is only there for a processor implementing XSLT 1.0, 2.0, or 3.0, and such processors will ignore anything the XSLT 4.0 specification says.

However, we could supply some clarification of what such a processor is expected to do when it finds an xsl:fallback element with an unexpected @select attribute. At present, it seems that because we don't say anything else, the xsl:fallback element is itself evaluated in forwards compatibility mode, which means that the select attribute is ignored. Since the whole purpose of xsl:fallback is to provide code that an earlier XSLT processor can handle, I think it would make much more sense to say: "the effective version for an xsl:fallback element and its descendants, unless overridden with an explicit [xsl:]version attribute, is the version of the processor in use", so a 3.0 processor executing an xsl:fallback instruction in a 4.0 stylesheet reports a static error if it finds a construct like the above.

Although the 4.0 spec cannot dictate what a 3.0 processor does, we could add a non-normative note encouraging this interpretation.

Issue #648 created #created-648

03 Aug at 09:23:09 GMT
Schema for FN namespace should block extension and substitution

Weird things can happen if the user defines a schema that imports the schema for the FN namespace and then adds members to its substitution groups or extends its complex types. We can prevent this happening by blocking substitution and extension. We should also specify that when we validate the input to fn:xml-to-json, xsi:schemaLocation must be disabled.

Issue #647 created #created-647

01 Aug at 13:29:21 GMT
XQuery: import schema with multiple location hints

XQuery 3.1 clarified what is intended when an "import module" declaration provides multiple location hints - there's now a clear indication that the processor should expect to load multiple modules all with the same module namespace.

For "import schema" it's still completely vague what is intended, and there's no analogy in XSD or XSLT which only ever allow a single location URI to be supplied. Currently we say:

The URILiterals that follow the at keyword are optional location hints, and can be interpreted or disregarded in an implementation-dependent way. Multiple location hints might be used to indicate more than one possible place to look for the schema or multiple physical resources to be assembled to form the schema.

I propose changing this to:

The URILiterals that follow the at keyword are optional location hints, intended to allow a processor to locate schema documents containing definitions of the required schema components in the target namespace. Processors may interpret or disregard these hints in an implementation-dependent way. The recommended default strategy is as follows (but this may be varied through user options):

  • All the location hints are dereferenced, treating them as relative URIs
  • If any location hint cannot be dereferenced, or fails to resolve to a valid schema document with the required target namespace, then that location hint is disregarded (optionally with a warning); but if none of the location hints can be resolved to a valid schema document with the required target namespace, then a static error is reported.
  • If multiple location hints are dereferenced, yielding multiple schema documents A, B, and C, then they should be treated as if there were a single schema document containing xs:include declarations referencing A, B, and C. This implies that the schema documents must together comprise a valid schema, for example there cannot be two different type definitions with the same name.
  • Notwithstanding the previous rule, if a processor is able to establish that two or more location hints refer to identical or equivalent schema documents, then the duplicates should be ignored.
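The recommended default strategy above can be sketched as a small routine. This is Python for illustration, under stated assumptions: resolve_schema_hints and the load callback are hypothetical, load(uri) is assumed to return a (target namespace, schema document) pair or raise OSError or ValueError, and "equivalent" documents are approximated by equality.

```python
def resolve_schema_hints(hints, load, target_namespace):
    """Return (documents, warnings) for a list of location hints."""
    documents = []
    warnings = []
    for uri in hints:
        try:
            namespace, document = load(uri)
        except (OSError, ValueError) as exc:
            # Hint cannot be dereferenced or is not a valid schema document.
            warnings.append(f"{uri}: {exc}")
            continue
        if namespace != target_namespace:
            warnings.append(f"{uri}: wrong target namespace {namespace!r}")
            continue
        if document not in documents:
            # Identical documents reached through several hints count once.
            documents.append(document)
    if hints and not documents:
        raise LookupError("static error: no location hint yielded a usable schema document")
    # The caller would then treat the survivors as if one schema document
    # included them all via xs:include.
    return documents, warnings
```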

This text gives users and implementors alike a much better sense of what is intended, while still retaining flexibility for implementations to do something different.

Pull request #646 created #created-646

26 Jul at 11:12:02 GMT
508: Editorial, examples revised (array:split, array:slice, others)

@ndw @michaelhkay I wonder what we should do with minor edits like this (which fixes an example)? Should we merge them by ourselves, or wait for someone to approve and merge it?

Issue #645 created #created-645

26 Jul at 07:09:17 GMT
Editorial: Use `\n` instead of `\r\n` in XML documents

If we use a consistent newline encoding in all XML documents, it will be easier to perform cleanups and to edit documents in the GitHub frontend. Most documents already use Unix-style newlines. If I am correct, only three documents need to be updated:

specifications/xslt-xquery-serialization-40/src/errors.xml
specifications/xslt-xquery-serialization-40/src/ns-xslt-xquery-serialization.xml
specifications/xslt-xquery-serialization-40/src/xslt-xquery-serialization.xml
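One way to do the conversion, sketched in Python. The helper names are invented here; the files are handled as bytes so that the XML encoding is left untouched.

```python
from pathlib import Path


def to_unix_newlines(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path) -> bool:
    """Rewrite one file in place; return True if it changed."""
    target = Path(path)
    original = target.read_bytes()
    converted = to_unix_newlines(original)
    if converted != original:
        target.write_bytes(converted)
        return True
    return False
```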

Pull request #644 created #created-644

26 Jul at 03:02:44 GMT

Adjusted CSS to target classes, not element + classes

Pull request #643 created #created-643

26 Jul at 02:22:22 GMT
414, 546: Adjusted XDM description of xs:string, added coding

Per meeting today, made good on my comment in #546

Pull request #642 created #created-642

25 Jul at 18:29:09 GMT
561: Editorial (abbreviation fn=function, drop lambda syntax)

Old -> lambda syntax removed from various examples; minor unifications.

I believe it’s ready to merge; please jump in otherwise.

Issue #634 closed #closed-634

25 Jul at 18:12:45 GMT

471: Quotes (missing cases)

QT4 CG meeting 043 draft minutes #minutes-07-25

25 Jul at 16:50:00 GMT

Draft minutes published.

Issue #641 created #created-641

25 Jul at 16:04:25 GMT
Serialization fallback.

I propose that we drop some serialization errors in favour of producing a fallback representation of the supplied value.

The rationale is that (a) serialization is often used in contexts like xsl:message where the primary purpose is diagnostic, and the last thing you want when producing diagnostics is a secondary error; and (b) seeing a fallback representation of an inappropriate value often shows you much more clearly what you have done wrong than any error message can do.

Compare with the .toString() method in Java and similar languages, which always outputs something even if it's not quite what you wanted.

I'm not proposing to change the principle that the output should always be syntactically valid (e.g. well formed XML or JSON).

I think some of the specific error conditions we might drop are:

  • In sequence normalization rule 7, instead of raising an error when an attribute, namespace, or function (including a map or array) is encountered, serialize that item using the adaptive output method, treat the result as a text node, and insert the text node into sequence S6.

  • In the JSON output method: when a sequence of two or more items is encountered, instead of raising SERE0023, treat it as an array containing those items.
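The second proposal can be illustrated with a toy serializer. This is a Python sketch, not the serialization algorithm: serialize_json and its lenient flag are invented here, items are plain Python values, and mapping an empty sequence to null is an assumption of the sketch.

```python
import json


def serialize_json(items, lenient=False):
    """Serialize a sequence of items as JSON text."""
    items = list(items)
    if not items:
        return "null"  # assumption made for this sketch
    if len(items) == 1:
        return json.dumps(items[0])
    if not lenient:
        # Current behavior: two or more items is an error.
        raise ValueError("SERE0023: sequence of more than one item")
    # Proposed fallback: treat the sequence as an array of its items.
    return json.dumps(items)
```

In the lenient mode the output is still syntactically valid JSON, which matches the principle stated above.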

Closely related, and perhaps best considered together: should the fn:string() function accept anything as input, and never raise an error?

Issue #632 closed #closed-632

25 Jul at 15:58:44 GMT

SENR0001: Error description updated

Issue #630 closed #closed-630

25 Jul at 15:58:30 GMT

XPath spec ch. 3 minor edits

Issue #574 closed #closed-574

25 Jul at 15:58:14 GMT

fn:log: Trace and discard results

Issue #629 closed #closed-629

25 Jul at 15:58:11 GMT

574: fn:log: Trace and discard results

Issue #508 closed #closed-508

25 Jul at 15:57:55 GMT

New Map & Array Functions: Inconsistencies

Issue #609 closed #closed-609

25 Jul at 15:57:52 GMT

508: New Map & Array Functions: Inconsistencies

Issue #23 closed #closed-23

25 Jul at 15:57:37 GMT

Extending element and attribute tests to NameTest unions

Issue #606 closed #closed-606

25 Jul at 15:57:35 GMT

Allow element(A|B) and attribute(A|B)

Issue #602 closed #closed-602

25 Jul at 15:57:17 GMT

Semi-strict static typing: reporting implausible expressions

Issue #603 closed #closed-603

25 Jul at 15:57:15 GMT

602 Implausible Expressions

Issue #561 closed #closed-561

25 Jul at 15:56:49 GMT

Alias for `function` keyword; drop thin arrow syntax?

Issue #589 closed #closed-589

25 Jul at 15:56:48 GMT

561: abbreviation fn=function, drop lambda syntax

Issue #575 closed #closed-575

25 Jul at 15:55:35 GMT

359: fn:void: Absorb result of evaluated argument

Issue #414 closed #closed-414

25 Jul at 15:55:08 GMT

Lift character set restriction of xs:string

Issue #546 closed #closed-546

25 Jul at 15:55:02 GMT

414: Attempt to implement expanding the allowed character repertoire

Issue #533 closed #closed-533

25 Jul at 15:54:42 GMT

413: Spec for CSV parsing with fn:parse-csv()

Pull request #640 created #created-640

25 Jul at 15:38:51 GMT
601: fn:all → fn:every?

@ndw Should be ready to be merged

Issue #514 closed #closed-514

25 Jul at 15:30:03 GMT

Lambda expression: Annotations

Issue #639 created #created-639

25 Jul at 15:26:23 GMT
fn:void: Naming, Arguments

A new function fn:void was added to the spec (see #359 for details).

This issue can be used to discuss alternative names for the function, as was suggested by @dnovatchev.

Issue #638 created #created-638

25 Jul at 14:35:21 GMT
Editorial: Avoid e.g.

Occurrences of e.g. should be replaced with alternatives, such as (for example).

See https://github.com/qt4cg/qtspecs/pull/629#issuecomment-1649952964

Issue #637 created #created-637

25 Jul at 13:35:03 GMT
Annotation Values: Booleans

Function annotations in XQuery have become a popular feature to attach vendor-specific information (for unit testing, locking, RESTXQ, etc.) to functions.

Annotation values are limited to literals, though. It would often be helpful to supply boolean values, but we don’t have literals for that in the language.

I suggest enhancing the existing grammar…

Annotation  ::=  "%" EQName ("(" Literal ("," Literal)* ")")?

…and allowing the strings false() and true() as values:

Annotation  ::=  "%" EQName ("(" AnnotationValue ("," AnnotationValue)* ")")?
AnnotationValue  ::=  Literal | "false()" | "true()"

The suggestion is upward compatible if we should decide later on that we want to allow arbitrary expressions for annotation values.

Issue #636 closed #closed-636

25 Jul at 07:25:47 GMT

Ternary operator

Issue #636 created #created-636

25 Jul at 03:53:03 GMT
Ternary operator

Per #171 the WG decided to allow the ternary operator. I'm looking at 4.16 of the current version of XPath 4.0 and the ternary operator is presented as illustrative, but the operator does not appear to have been properly introduced and defined in the specs. The terms "ternary", "??", and "!!" appear only twice each, and in contexts that could be confused as being illustrative.

By my reading, the definition of [11] ExprSingle should be expanded to allow a new option, call it TernaryOption, and a new entry should be added to the grammar: TernaryOption ::= Expr "??" ExprSingle "!!" ExprSingle. Does such a definition allow any ambiguous constructs?

I propose that 4.16 be subdivided into two subsections, the first one briefly introducing the ternary operator and the second handling if/then statements. Slight aside: the latter should also include a note pointing out how to avoid a then branch (and an example), a courtesy to developers who are so accustomed to the unbraced approach that they have not given the braced approach much thought.

Pull request #635 created #created-635

24 Jul at 16:28:37 GMT
451: Schema compatibility

This PR addresses part (not all) of issue 451.

It recognises that an application may use more than one schema; for example in a pipeline using multiple stylesheets, it must be possible for the first stylesheet to produce valid output that is valid input to the second, without requiring that the two stylesheets have absolutely identical schema imports. It recognises that there are cases (for example involving substitution groups) where two schemas X and Y may both include the same type T, but produce different results when an element is validated against T. So it defines a concept of schema compatibility and defines its limitations, especially on the semantics of item types such as element(*,T) and schema-element(E). The rules for schema compatibility between different modules of a query and between different packages in a stylesheet are tightened up and brought into line with each other.

Pull request #634 created #created-634

24 Jul at 13:54:14 GMT
471: Quotes (missing cases)

See Matt’s comment in https://github.com/qt4cg/qtspecs/pull/533#issuecomment-1647945368

Issue #389 closed #closed-389

22 Jul at 15:38:19 GMT

The fn:build-uri function needs to perform URI encoding for path and query segments

Issue #556 closed #closed-556

22 Jul at 15:35:11 GMT

Serialization phase 5 note unclear

Issue #621 closed #closed-621

22 Jul at 13:41:38 GMT

Removed chapter 4 from XDM

Issue #626 closed #closed-626

22 Jul at 13:41:22 GMT

Adjusted serialization step 5 note

Issue #625 closed #closed-625

22 Jul at 13:40:42 GMT

XPath minor edits, chh. 1-2

Pull request #633 created #created-633

22 Jul at 03:35:01 GMT
Edits ch. 4.1 through 4.15

Note, this batch of edits includes a shift from "built-in" to "system" when describing functions, per edits made in XDM.

Pull request #632 created #created-632

21 Jul at 21:18:01 GMT
SENR0001: Error description updated

Observed by @line-o: https://xmlcom.slack.com/archives/C01GVC3JLHE/p1689946671105499

Pull request #631 created #created-631

21 Jul at 17:46:42 GMT
600: fn:decode-from-uri

I did my best to define rules for a counterpart of the fn:encode-for-uri function, including various edge cases.

I’m convinced that the function has been requested often enough to justify its inclusion in the spec. I’m also aware that users may have different expectations regarding the details of the conversion rules. On the other hand, this discussion can be observed for other languages as well, and that’s mostly due to the… heterogeneous history of URIs, not the actual implementations. For example, URLDecoder.decode in Java converts the plus character to a space, and JavaScript’s decodeURI adopts it unchanged. I decided to convert it as well, as fn:encode-for-uri encodes the plus sign to %2B.
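Python's standard library happens to offer both behaviors, which makes the trade-off easy to see. The sketch below uses urllib.parse only as a stand-in; it is not the proposed fn:decode-from-uri rule set, and the variable names are invented here.

```python
from urllib.parse import quote, unquote, unquote_plus

# Encoding in the style described above: a literal plus becomes %2B.
encoded = quote("a+b c", safe="")

# Decoding with "+" treated as a space (the choice made in this PR).
plus_as_space = unquote_plus("a+b%2Bc")

# Decoding with "+" left unchanged.
plus_unchanged = unquote("a+b%2Bc")

# Since the encoder never emits a bare "+", the round trip is lossless
# under either decoding rule.
round_trip = unquote_plus(encoded)
```

Here encoded is 'a%2Bb%20c', plus_as_space is 'a b+c', and plus_unchanged is 'a+b+c'.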

@ndw My rules have largely been inspired by your decoding rules for fn:parse-uri. I hope these rules can be dropped and replaced with a reference to this new function (analogous to fn:build-uri, which references fn:encode-for-uri).

QT4 CG meeting 043 draft agenda #agenda-07-25

21 Jul at 08:30:05 GMT

Draft agenda published.

Pull request #630 created #created-630

21 Jul at 02:53:50 GMT
XPath spec ch. 3 minor edits

Minor edits to ch. 3 of XPath spec. Note, this PR includes an adjustment to the XDM spec, in the form of a cross-reference, because the terms "string value" and "typed value" are heavily used in XDM but defined only in XPath.

Pull request #629 created #created-629

20 Jul at 16:36:56 GMT
574: fn:log: Trace and discard results
  • New function fn:log
  • Rules of fn:trace revised

Issue #620 closed #closed-620

20 Jul at 07:25:49 GMT

[616] Converted X Node to x node

Issue #622 closed #closed-622

20 Jul at 07:24:11 GMT

XDM minor edits, back material

Issue #628 created #created-628

20 Jul at 07:07:24 GMT
distinct-values and duplicate-values: order of results

I've noticed that a few tests have appeared in QT4tests distinct-values() that assume the order of results is "order of first appearance" (search for assert-deep-eq). We should either change the tests, or change the spec to require order of first appearance.

Since no implementors have objected to these tests, it seems likely that implementations are delivering results in "order of first appearance", and if that is the case, then I think it would be a convenience to users to guarantee this in the spec.

To allow for parallel implementations, we could say that the order is undefined if ordering mode is unordered.
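"Order of first appearance" is cheap to guarantee in a sequential implementation. A Python sketch, with function names borrowed from the issue title for readability; for duplicate values it assumes that results are ordered by where each value first occurs, which is only one possible reading.

```python
from collections import Counter


def distinct_values(values):
    """Distinct values in order of first appearance."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(values))


def duplicate_values(values):
    """Values occurring more than once, ordered by first appearance."""
    counts = Counter(values)
    return [v for v in dict.fromkeys(values) if counts[v] > 1]
```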

Pull request #627 created #created-627

20 Jul at 02:39:53 GMT
624: Adjusted function category descriptions

Attempted revision in light of #624. Perhaps not everything is exactly right, but it should be a step in the right direction.

Pull request #626 created #created-626

20 Jul at 01:47:50 GMT
Adjusted serialization step 5 note

This brief edit addresses issue #556, which I've chosen to handle apophatically.

Pull request #625 created #created-625

20 Jul at 01:09:33 GMT
XPath minor edits, chh. 1-2

All minor edits. Some edits are motivated by an attempt to address broken parallelism or some other form of localized inconsistency. Does not include questions about function descriptions, raised at #624.

Issue #624 created #created-624

19 Jul at 22:56:34 GMT
XPath function definition clarification

In the XPath specs, 2.2, function definitions, the reader is informed that every (statically known) function definition takes one of three mutually exclusive categories: application, system, or external.

[Definition: Application functions are function definitions written in a host language such as XQuery or XSLT whose semantics are defined in this family of specifications. Their behavior (including the rules determining the static and dynamic context) follows the rules for such functions in the relevant host language specification.]

The first sentence appears to point to user-written functions in the host languages, and the second sentence appears to point to functions defined by the host language specifications. I assume that this category is meant to include both, and I propose the language be tightened up to make that clear. If my assumption is incorrect, then some other type of revision is needed.

Later on, when the term “built-in function” is introduced, it is not as clearly stated as it should be how that term maps onto the three-way division. It seems that “built-in function” encompasses all system functions and only those application functions that are defined by the specifications, and not user-written functions. Whatever the case, and whatever the best path of revision, this paragraph would be most effective if moved up with the tripartite category discussion.

The term "external function" is introduced here in the static context, with language that is highly suggestive of a definition. But the definition proper is reserved for the dynamic context, and in shorter, different prose that I think loses some of the considerations provided in the static context. I propose that the two passages be consolidated and located in the static context, where the two other types of functions are defined, with an xref in the dynamic context to the meaning of external function.

Down in the dynamic context:

The dynamically known function definitions may include external functions.

This sentence is a bit puzzling, because of course they are allowed to include all three types of functions -- none are forbidden, since, after all, the dynamically known functions are a superset of statically known ones. Perhaps the point is to draw the reader's attention to those functions that are known dynamically but not statically? Perhaps this revision?: "Many of the function definitions known dynamically but not statically will be external functions, but they may include user-written application functions written in a host language, known only dynamically, e.g., through fn:transform."

Cumulatively the above points could go beyond mere minor touch-up, so comments are welcome before I attempt any edits.

Issue #573 closed #closed-573

19 Jul at 21:58:12 GMT

Node construction functions

Pull request #623 created #created-623

19 Jul at 16:51:53 GMT
93: sort descending

Enhances fn:sort to allow multiple major-to-minor sort keys each of which can independently specify a collation and an ascending/descending option.

Also includes infrastructure changes to allow occurrence indicators on function arguments or results that reference a named record type.

Similar changes will be needed for array:sort; to reduce the risk of rework I propose to make those changes after this PR has been reviewed and accepted.

Fix #93

Pull request #622 created #created-622

19 Jul at 00:51:45 GMT
XDM minor edits, back material

Note, the example was invalid because of a failure to access the included xsd file.

Pull request #621 created #created-621

18 Jul at 22:52:59 GMT
Removed chapter 4 from XDM

As noted in Slack, chapter 4 of the XDM specs is out of place. In this PR I have moved the very general material in chapter 4 to the preamble of chapter 6 (now 5), which provided the opportunity to orient the reader to the structure of that chapter.

Cross-references to chapter 4 have been searched for and dealt with. The CSS deletion comes from my observation that there is no such class infoset-mapping in the resultant HTML file; section 4's infoset-mapping is an id.

Pull request #620 created #created-620

18 Jul at 22:29:45 GMT
[616] Converted X Node to x node

Per #616 names of nodes have been set lowercase. This PR does not address the good suggestion that xrefs and styling be selectively applied. That is reserved for a future pass.

Pull request #619 created #created-619

18 Jul at 21:28:11 GMT

XDM ch. 6 minor edits

Issue #615 closed #closed-615

18 Jul at 16:17:42 GMT

Xdm minor edits, chh. 3-5

Issue #128 closed #closed-128

18 Jul at 16:17:24 GMT

fn:replace: Tweaks

Issue #612 closed #closed-612

18 Jul at 16:17:22 GMT

128: fn:replace: Tweaks

Issue #329 closed #closed-329

18 Jul at 16:17:02 GMT

Keyword parameters: Error codes

Issue #611 closed #closed-611

18 Jul at 16:16:59 GMT

329: Keyword parameters: Error codes

Issue #506 closed #closed-506

18 Jul at 16:16:43 GMT

fn:error: parameter names

Issue #610 closed #closed-610

18 Jul at 16:16:41 GMT

506: fn:error: parameter names

Issue #607 closed #closed-607

18 Jul at 16:16:19 GMT

XQFO Examples: Fixes, Formatting

Issue #21 closed #closed-21

18 Jul at 16:15:30 GMT

New reserved function names

Issue #605 closed #closed-605

18 Jul at 16:15:29 GMT

21: Revise appendix for reserved function names

Issue #39 closed #closed-39

18 Jul at 16:14:48 GMT

URILiteral is defined in the EBNF grammar but not used

Issue #604 closed #closed-604

18 Jul at 16:14:47 GMT

[Editorial] Drop the unused symbol URILiteral from the XPath grammar appendix

Issue #123 closed #closed-123

18 Jul at 16:14:22 GMT

fn:duplicate-values

Issue #614 closed #closed-614

18 Jul at 16:14:20 GMT

123: fn:duplicate-values

Issue #618 created #created-618

18 Jul at 14:07:30 GMT
Symmetry: fn:html-doc, fn:csv-doc

If we keep fn:html-parse and if we add fn:csv-parse, we should also add fn:html-doc and fn:csv-doc.

Issue #617 created #created-617

18 Jul at 07:42:09 GMT
Implicit constructor functions for record types and union types

See also Issue #397 and Issue #322, which this proposal may partially supersede.

I propose that when declaring a named record type in the static context, this should automatically create a constructor function definition for records of that type.

So in XQuery if you write

declare item type my:location as record(longitude: xs:double, latitude: xs:double);

then the static context acquires both a type (let $loc as my:location := ....) and a function which you can call with either positional or keyword arguments (let $loc := my:location(-2.03, 50.95) or := my:location(longitude := -2.03, latitude := 50.95)).

The semantics of the function are roughly map{longitude: -2.03, latitude: 50.95} treat as my:location, except that the function arguments are first coerced to the required types (in this case the decimals are coerced to double).

If the record type is extensible, the constructor function does not provide any capability to set values for extension fields. I'm not sure yet whether it will be possible to distinguish fields set to an empty sequence from fields that are absent.

This is consistent with user-defined atomic types where you automatically get a constructor function.

Similarly for named union types. If you declare

declare item type my:binary as union(xs:hexBinary, xs:base64Binary)

then you should automatically get an arity-1 constructor function my:binary($value) with the same semantics as if my:binary were an XSD-defined union type (that is, the same semantics as cast $value as union(xs:hexBinary, xs:base64Binary)).

Issue #616 created #created-616

18 Jul at 02:26:33 GMT
XDM: X Node vs. x node

At the risk of being branded a pedant....

Currently the XDM specs have a predilection for Attribute Node, Element Node, Comment Node, Document Node, Text Node, Processing Instruction Node, Namespace Node. But in a healthy minority of cases in the XDM specs, they are rendered lowercase.

To complicate matters further, the other specs, which rely upon the XDM specs for the definition of the terms, universally prefer the lowercase form, and they use the terms preponderantly more. A healthy sample of four of the terms across five of the specifications ("elective" excludes instances where that particular case is required, e.g., capitalization in headers):

Term | Spec | Capitalized | Electively Capitalized | Electively Noncapitalized
-- | -- | -- | -- | --
Attribute node | XDM | 71 | 41 | 18
Element node | XDM | 79 | 71 | 23
Comment node | XDM | 44 | 39 | 0
Document node | XDM | 55 | 48 | 10
Attribute node | Serialization | 0 | 0 | 15
Element node | Serialization | 0 | 0 | 39
Comment node | Serialization | 0 | 0 | 2
Document node | Serialization | 0 | 0 | 15
Attribute node | XQuery | 1 | 0 | 82
Element node | XQuery | 0 | 0 | 141
Comment node | XQuery | 0 | 0 | 9
Document node | XQuery | 2 | 0 | 50
Attribute node | XFO | 1 | 0 | 28
Element node | XFO | 1 | 0 | 67
Comment node | XFO | 1 | 0 | 4
Document node | XFO | 1 | 0 | 57
Attribute node | XSLT | 8 | 0 | 100
Element node | XSLT | 16 | 0 | 77
Comment node | XSLT | 1 | 0 | 6
Document node | XSLT | 12 | 0 | 144

Options:

  1. Do nothing.
  2. Change all specs to capitalize these terms universally.
  3. Change only XDM to capitalize these terms universally.
  4. Determine a principle that would differentiate the contexts in which the term should be capitalized or not within XDM only.
  5. Change all specs to uncapitalize these terms universally.

My personal preference is no. 5. I do not see any passages in XDM where the reader would be confused by the term being uncapitalized (the minority cases being witnesses). Lowercase would bring XDM into conformity with the other specifications, and with how the term is commonly used outside the specs. Further, the XDM specs do not capitalize other terms they specially define, e.g., accessor. And in all the specs, including XDM, the clear preference is to use the lowercase for the names in the raw (i.e., without "node"): element, attribute, namespace, etc.

I wanted to bring this to the group before doing any edits. I am happy to do the work, but do not want to do it if it is unwelcome, or if there is a clear preference for another option.

Pull request #615 created #created-615

17 Jul at 21:52:42 GMT

Xdm minor edits, chh. 3-5

QT4 CG meeting 042 draft agenda #agenda-07-18

17 Jul at 13:00:05 GMT

Draft agenda published.

Issue #491 closed #closed-491

17 Jul at 11:40:19 GMT

Fix more examples in the FO 4.0 spec

Pull request #614 created #created-614

17 Jul at 11:38:14 GMT
123: fn:duplicate-values

I decided to create a PR for the initial proposal of this function, as I came across at least two other use cases for it since the issue was created.

I believe that the group by clause is the best choice for more complex operations, such as advanced comparisons or creating histograms.

Issue #613 created #created-613

17 Jul at 08:02:54 GMT
Allow "union" as synonym for "|" everywhere

It seems silly to allow "union" as a synonym for "|" in some places and not others. For example it's not allowed in a "case" clause of a typeswitch, nor in a catch clause of try/catch.

We've introduced the ability to write child::(a|b) as a synonym for child::a | child::b, but we don't allow child::(a union b) as a synonym for child::a union child::b.

Pull request #612 created #created-612

16 Jul at 14:39:19 GMT

128: fn:replace: Tweaks

Pull request #611 created #created-611

16 Jul at 14:04:27 GMT

329: Keyword parameters: Error codes

Pull request #610 created #created-610

16 Jul at 13:51:52 GMT

506: fn:error: parameter names

Pull request #609 created #created-609

16 Jul at 13:36:53 GMT
508: New Map & Array Functions: Inconsistencies
  1. map:of renamed to map:of-pairs (as a hint that the input must match a specific format)
  2. array:of renamed to array:of-members (as a hint…)
  3. add map:pair for creating a single pair
  4. add array:split for decomposing arrays to singleton arrays

Issue #608 created #created-608

15 Jul at 14:25:12 GMT
Formatting Monospace (II)

Bugging you again, @ndw …

My feedback has been triggered by #607. I’ve moved various code examples into eg blocks. In principle, the result is promising, but it produces cases such as the following one (see https://qt4cg.org/pr/607/xpath-functions-40/Overview.html):

Maybe the result is better to read if single-line and multi-line monospace is formatted identically (i.e., without lines at the top and bottom, and with a transparent background)?

Next, it would be fine if we could get rid of the scrollbars. Often, it’s not apparent from the rendering that the presentation includes results at all:

We could possibly fix it by wrapping even more examples manually. That can be challenging, though, for example if long strings are used (such as "http://www.w3.org/2013/collation/UCA?lang=en;alternate=blanked;strength=primary").

Finally, the formatting on mobile devices (here: an Android tablet with a Chrome renderer) differs from the desktop view: code blocks are rendered much smaller than the rest. – That’s just to mention it; I’d probably be annoyed if I was tasked with fixing that.

Pull request #607 created #created-607

15 Jul at 11:57:54 GMT
XQFO Examples: Fixes, Formatting

This PR contains lots of fixes in the XQFO examples and improves the formatting of examples with nested arguments and multiline expressions.

It’s probably helpful to merge it as soon as possible to avoid conflicts with other PRs.

Pull request #606 created #created-606

14 Jul at 17:15:14 GMT
Allow element(A|B) and attribute(A|B)

Fix #23

Pull request #605 created #created-605

14 Jul at 15:12:10 GMT
21: Revise appendix for reserved function names

Fix #21

Pull request #604 created #created-604

14 Jul at 14:23:51 GMT
[Editorial] Drop the unused symbol URILiteral from the XPath grammar appendix

Fix #39

Pull request #603 created #created-603

13 Jul at 11:08:13 GMT
602 Implausible Expressions

Fix #602

Issue #602 created #created-602

13 Jul at 06:55:05 GMT
Semi-strict static typing: reporting implausible expressions

Strict static typing, as originally defined for XQuery 1.0, was not a success, because it prohibits too many constructs that are perfectly reasonable to write. However, the alternative, pure dynamic typing, prevents a processor reporting many obvious errors at compile time. A compromise is optimistic static typing, where the processor is allowed to report a type error statically in the cases where it can be shown that evaluation of an expression is bound to fail at run-time.

Optimistic static typing has proved a reasonably successful compromise, but there are a number of cases where things that are obviously user mistakes cannot be reported as static errors. I propose that a processor should be allowed (not required) to treat some of these conditions as static errors.

The first of these conditions is exemplified by passing an argument whose static type is xs:integer* to a function where the declared parameter type is xs:string*. Under optimistic static typing this cannot be reported as a static error, because it is not bound to fail; if the actual value at run-time turns out to be an empty sequence, the call will succeed. So the proposal is that where the inferred supplied type and the required sequence types are both emptiable (that is, occurrence indicator is "?" or "*"), but their respective item types are disjoint, the processor should be allowed to report a static error.
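
The first condition can be sketched as a three-way classification. The following Python is a toy model under stated assumptions, not the proposed spec rule: the type hierarchy is a small hypothetical table, "disjoint" is simplified to "neither item type is a subtype of the other", and coercion, union types, and cardinality mismatches between compatible types are ignored.

```python
from dataclasses import dataclass

# Hypothetical toy hierarchy: each type maps to its parent type.
PARENT = {
    "xs:anyAtomicType": None,
    "xs:string": "xs:anyAtomicType",
    "xs:decimal": "xs:anyAtomicType",
    "xs:integer": "xs:decimal",
}


@dataclass(frozen=True)
class SeqType:
    item_type: str
    occurrence: str = ""  # "", "?", "*", or "+"

    @property
    def emptiable(self) -> bool:
        return self.occurrence in ("?", "*")


def is_subtype(sub: str, sup: str) -> bool:
    """True if sub is sup or derives from it in the toy hierarchy."""
    t = sub
    while t is not None:
        if t == sup:
            return True
        t = PARENT[t]
    return False


def disjoint(a: str, b: str) -> bool:
    """Simplification: disjoint means neither type derives from the other."""
    return not is_subtype(a, b) and not is_subtype(b, a)


def classify(supplied: SeqType, required: SeqType) -> str:
    if not disjoint(supplied.item_type, required.item_type):
        return "ok"  # some non-empty value might match
    if supplied.emptiable and required.emptiable:
        return "implausible"  # succeeds only for the empty sequence
    return "bound to fail"  # already reportable under optimistic typing


print(classify(SeqType("xs:integer", "*"), SeqType("xs:string", "*")))  # implausible
print(classify(SeqType("xs:integer", "+"), SeqType("xs:string", "*")))  # bound to fail
```

The middle branch is the only new case: when the item types are disjoint, an empty sequence is the one value that can satisfy both types, so the call is legal but almost certainly a mistake.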

The second condition is what I call a void path expression. Specifically, if we know statically that the result of $A/B, or $A!B, or $A?B will be an empty sequence for any possible value of $A (given its inferred type), this almost certainly means the user has made a mistake, and we should be allowed to report a static error. This extends to the unary or implicit forms of these operators, based on the inferred type of the context item. This is most likely to occur with schema-aware code, where it should be possible to report a path such as A/B/C/D as incorrect if the schema does not allow such a path. But it also arises for example for $A?B if the inferred type of $A is a non-extensible record type and B is not one of its known fields; and it arises for inappropriate combinations of axes such as @code/text().

Note that I'm not proposing (as XQuery 1.0 static typing did) that any expression whose result is bound to be empty is a static error; the rule is confined to a few specific operators.

Perhaps, for backwards compatibility and interoperability, we should require processors to provide an option to switch this kind of static error detection off. (For example, XSLT 1.0 code sometimes deliberately uses /.. to represent an empty sequence, and this construct would be flagged under these rules.)

Issue #601 created #created-601

13 Jul at 06:46:10 GMT
fn:all → fn:every?

Feedback we got (translated):

There are the every and some keywords, and there are the new functions fn:some and fn:all. They appear to be more or less similar, so it would be consistent if fn:all was renamed to fn:every.

If fn:all is called this way because there’s also fn:all-different and fn:all-equal, we should also have fn:some-different and fn:some-equal.

Issue #585 closed #closed-585

12 Jul at 08:23:52 GMT

Editorial: dynamic function calls

Issue #597 closed #closed-597

11 Jul at 16:39:35 GMT

Editorial fixes from #566

Issue #588 closed #closed-588

11 Jul at 16:39:21 GMT

Incompleteness of xsl:sort specification

Issue #595 closed #closed-595

11 Jul at 16:39:20 GMT

588: (Editorial, XSLT) minor clarifications regarding xsl:sort

Issue #592 closed #closed-592

11 Jul at 16:39:07 GMT

XSLT §5.6 xsl:decimal-format - no explanation of exponent-separator

Issue #594 closed #closed-594

11 Jul at 16:39:06 GMT

592: (XSLT, Editorial) Add missing description of exponent-separator

Issue #591 closed #closed-591

11 Jul at 16:38:53 GMT

Show defaults in XSLT element templates

Issue #593 closed #closed-593

11 Jul at 16:38:52 GMT

591: [XSLT, editorial] Add defaults to XSLT element syntax summaries

Issue #343 closed #closed-343

11 Jul at 16:38:18 GMT

$collation argument: Unification

Issue #590 closed #closed-590

11 Jul at 16:38:17 GMT

343: make $collation uniformly optional

Issue #365 closed #closed-365

11 Jul at 16:38:06 GMT

switch, typeswitch: Optional braces

Issue #587 closed #closed-587

11 Jul at 16:38:05 GMT

365: Allow braces in switch and typeswitch expressions

Issue #586 closed #closed-586

11 Jul at 16:37:49 GMT

585: [Editorial] Rearrange text (and grammar) for dynamic function calls

Issue #584 closed #closed-584

11 Jul at 16:37:37 GMT

Editorial: Correction to map:filter examples

Issue #317 closed #closed-317

11 Jul at 16:37:26 GMT

fn:format-integer: $lang → $language ?

Issue #578 closed #closed-578

11 Jul at 16:37:24 GMT

317: fn:format-integer: $lang → $language

Issue #577 closed #closed-577

11 Jul at 16:37:12 GMT

Editorial: improve generator for keyword tests

Issue #555 closed #closed-555

11 Jul at 16:36:56 GMT

464: Revised narrative of normalization steps for serialization

Issue #547 closed #closed-547

11 Jul at 16:35:32 GMT

Action QT4CG-036-02: Further elaboration of the rules for function identity

Issue #539 closed #closed-539

11 Jul at 16:34:03 GMT

FLOWR where clause with a "do when false" option

Issue #600 created #created-600

10 Jul at 07:49:55 GMT
fn:decode-from-uri: counterpart of fn-encode-to-uri

Adopted from https://github.com/qt4cg/qtspecs/issues/566#issuecomment-1607397586:

The initial suggestion in #72 was to provide a function for decoding a URI. Maybe we should still think about adding a fn:decode-uri or fn:decode-from-uri function, in which we could tackle the open issues that need to be solved for fn:parse-uri. The current URI decoding rules could then be replaced with a reference to this new function.

Pull request #599 created #created-599

09 Jul at 23:25:20 GMT
90: Simplified stylesheets with no xsl:version

Fix #90

Issue #598 closed #closed-598

09 Jul at 23:24:25 GMT

Issue90 XSLT: simplified stylesheets with no xsl:version

Pull request #598 created #created-598

09 Jul at 23:05:46 GMT
Issue90 XSLT: simplified stylesheets with no xsl:version

Fix #90. Allow simplified stylesheets with no xsl:version attribute (and therefore no XSL namespace declaration)

QT4 CG meeting 041 draft agenda #agenda-07-11

07 Jul at 13:00:00 GMT

Draft agenda published.

Pull request #597 created #created-597

07 Jul at 12:30:51 GMT
Editorial fixes from #566

This PR addresses some minor comments from issue #566. There are more substantive comments but I think they warrant discussion first.

Issue #596 created #created-596

05 Jul at 07:45:57 GMT
Pinned values: Transforming Trees

Pinned Values: Transforming JTrees

This is a continuation of ideas raised in issue #341, issue #350, and elsewhere. It's related to the requirements presented in issue #262 and issue #297.

I have also presented ideas on transforming JSON trees at XML Prague and at Balisage, and I have tried out ideas over the years in Saxon extension functions. The proposal here owes a lot to those ideas, but consolidates them in a slightly different way.

I'll use the term JTree to refer to a tree structure of maps and arrays. The key difference between a JTree and a node tree (let's call it an XTree) is that the nodes in a JTree have no identity and no parent pointers.

As a result, some operations are remarkably difficult. Let's take one example:

Consider the JSON structure

{ "cities": [
    {  "name" : "Paris",
       "size" : 300
    },
    {  "name" : "Berlin",
       "size" : 300
    }
]}

and suppose we want to return a modified version of this in which the size of Berlin is changed to 400. It would be nice to be able to write something like:

modify($input, $input?cities[?name="Berlin"]?size, 400)

But of course, we can't do this. The result of the second argument is simply a number, 300, and we don't want to change all instances of the number 300 to 400, we only want to change one specific instance. Without value identity, the concept of "one specific instance" has no meaning.

I'll leave it as an exercise to the reader to work out how to do this transformation using our current XSLT and XQuery capabilities. It's far more difficult than it should be. In addition, the tree-walking approach in XSLT of applying template rules recursively is inefficient, because its cost typically depends on the size of the tree, not on the size of the modification. With immutable/persistent data structures underpinning XDM maps and arrays, it should be possible to perform this modification in constant time, regardless of the size of the tree.

My solution to this is that the expression in the second argument should return a pinned value. The pinned value behaves just like a plain integer 300 when used in operations such as arithmetic, but being pinned means that its location in the original JTree is retained, meaning that it becomes possible to replace it in the JTree with a different value.

Data Model

We need a change to the data model. Any value (any item or sequence) can have the property of being pinned. If a value is pinned, then it has a property called its locus which identifies its position within a JTree.

With a small number of exceptions, specifically noted below, the fact that a value is pinned and has a locus does not change the effect of any operations on the value. For example, the fact that an integer with value 300 is pinned does not change the result of any arithmetic or comparison operations on the number.

A locus may identify a value as being the root of a JTree, or it may identify its position within a JTree. In the latter case it has two basic properties called its container and its slot. The container is a pinned sequence, array, or map, and the slot identifies the value's position within the container: if the container is a sequence or an array, then the slot is an integer position; if the container is a map, then the slot is a key value.

Operations on Pinned Values

A value (any value) can be pinned as the root of a JTree using the function fn:pin(value). This returns a value that is in every way identical to the original (including node identity, if it is a node or contains nodes) except for being pinned.

Some selected operations have their definition changed so that if the input is a pinned value, then the result is a pinned value. These are of two kinds:

  • Operations that return an existing value unchanged generally retain the pinned property and the locus. For example if a value is bound to a variable, then the result of a variable reference will retain these properties.
  • Operations that select a value within a sequence, array, or map, when that sequence, array, or map is pinned, return a pinned value whose locus identifies the container and the value's slot within that container.

So in the above example, if $input is pinned, the expression $input?cities[?name="Berlin"]?size returns an integer (300) whose locus is (C, "size"), where C is the map having name="Berlin". C in turn has a locus (A, 2) where A is the array of cities, and A has a locus (R, "cities") where R is the root of the JTree (the original $input value).

It now becomes possible to define the modify function as follows: The first argument is a value which must be pinned. The second argument must return a pinned value which must be within the tree identified by the first argument (that is, recursively finding the container must lead to this root). The result of the modify() function is formed by recursively replacing each container, all the way up to the root, with a new container in which the contents of the relevant slot are replaced.
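
To make the locus idea concrete, here is a rough Python sketch of pinning and of modify(). It is an illustration under assumptions, not the Saxon implementation or a proposed API: Pinned, pin, get, and modify are hypothetical names, Python dicts and lists stand in for maps and arrays, list slots are 0-based (the text above uses 1-based positions), and the result is returned as a plain, unpinned value.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True, eq=False)
class Pinned:
    value: Any
    container: Optional["Pinned"] = None  # None marks the root of the JTree
    slot: Any = None                      # map key, or list index (0-based here)

    def get(self, slot: Any) -> "Pinned":
        """Downward selection: the result's locus is (self, slot)."""
        return Pinned(self.value[slot], self, slot)


def pin(value: Any) -> Pinned:
    """Pin a value as the root of a JTree."""
    return Pinned(value)


def modify(root: Pinned, target: Pinned, new_value: Any) -> Any:
    # Following containers upward from the target must lead to this root.
    node = target
    while node.container is not None:
        node = node.container
    if node is not root:
        raise ValueError("target is not within the tree rooted at root")

    # Replace each container on the path, leaving the original untouched.
    replacement = new_value
    node = target
    while node.container is not None:
        parent = node.container.value
        copy = dict(parent) if isinstance(parent, dict) else list(parent)
        copy[node.slot] = replacement
        replacement = copy
        node = node.container
    return replacement


data = {"cities": [{"name": "Paris", "size": 300},
                   {"name": "Berlin", "size": 300}]}
root = pin(data)
cities = root.get("cities")
berlin = next(cities.get(i) for i, city in enumerate(cities.value)
              if city["name"] == "Berlin")
result = modify(root, berlin.get("size"), 400)
print(result["cities"][1]["size"])  # 400
```

Only the containers on the path from the target to the root are copied; the Paris entry is shared between the old and new trees. The shallow copies here still cost time proportional to each copied container, whereas persistent map and array structures could avoid that.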

Feasibility

Let's pause to ask ourselves two questions: is this reasonably feasible to implement, and is it realistically possible to expect users to understand what's going on?

Implementation

I've had an implementation of something rather similar in Saxon for years, and I don't think it's especially difficult. In effect we have a Java class PinnedValue which extends Value and which delegates nearly all operations (including "instance of") to a contained Value. The one thing you need to be careful of is assuming (for example) that if a value represents an XDM array, then it will be an instance of a Java class such as XdmArray. (The terminology in the current Saxon implementation is quite different, so don't expect to find this in the current code).

Usability

I think that basic features like the modify() function won't be too difficult to explain. We just have to explain that (a) you can only modify a JTree if the root is first pinned, and (b) the expression used in the second argument must use a restricted set of operations: basically, those that do downward selection of values within a container.

Further Operations

I've illustrated the benefits with one particular operation, a modify() function, but the feature opens up many other possibilities as well. Here are a few:

  • For pinned values we can expose the container and slot properties through functions (or through custom syntax such as axes). For example this means that in XSLT, if you are doing a recursive traversal of a JTree, a template rule for processing a particular value has access to its ancestors in the same way as is possible for XTrees.
  • In addition derived properties of a pinned value can be exposed, for example the preceding and following "siblings" within an array (or indeed, within a sequence).

I'll explore some additional use cases in further posts.

Pull request #595 created #created-595

04 Jul at 21:31:43 GMT
588: (Editorial, XSLT) minor clarifications regarding xsl:sort

The main issue turned out to be spurious; there is in fact an explanation of the order attribute in the right place. But I made a couple of other minor editorial changes, including dropping the default for @data-type in the (non-normative) schema for XSLT 4.0 - the default of data-type="text" is inappropriate because it forces conversion, e.g. of dates to strings, and the default should be no conversion.

Fix #588.

Pull request #594 created #created-594

04 Jul at 21:05:59 GMT
592: (XSLT, Editorial) Add missing description of exponent-separator

Fix #592

Pull request #593 created #created-593

04 Jul at 15:40:45 GMT
591: [XSLT, editorial] Add defaults to XSLT element syntax summaries

Fix #591. Add default values to e:attribute entries in the XSLT syntax summaries; change the DTD to allow these; change the stylesheet to render them; add a paragraph to the Notation section to explain the conventions.

Issue #592 created #created-592

04 Jul at 15:38:28 GMT
XSLT §5.6 xsl:decimal-format - no explanation of exponent-separator

This problem is inherited from XSLT 3.0

Section 5.6 gives a brief description of the purpose of every attribute of xsl:decimal-format, with the exception of exponent-separator.

Issue #591 created #created-591

04 Jul at 14:06:33 GMT
Show defaults in XSLT element templates

This came out of issue #588, but is separable.

It would be useful in the element templates in the XSLT specification to show default values for attributes.

Pull request #590 created #created-590

04 Jul at 10:14:35 GMT
343: make $collation uniformly optional

Fix #343

Pull request #589 created #created-589

04 Jul at 09:04:05 GMT
561: abbreviation fn=function, drop lambda syntax

Allows "fn" as an abbreviation for "function", and drops the "thin-arrow" lambda function syntax. Fix #561.

Issue #588 created #created-588

03 Jul at 15:14:27 GMT
Incompleteness of xsl:sort specification

These are problems inherited from XSLT 3.0.

The effect of the attribute xsl:sort/@order is never explained in the text; it is assumed that the meaning is obvious.

The only explicit statement that the default is "ascending" is in the schema-for-xslt40, which is non-normative.

The schema also gives a default of "text" for the data-type attribute, which is incorrect: when data-type is set to text, values will be cast to string before comparison, which is not what happens if the actual sort key values are numeric.

We have added prose to define what we mean by "effective value". This clarifies that the effective value of the order attribute, if omitted, is the default value. But we don't have a systematic and formal way of saying what the default value is. This affects the result of test merge-021, which depends on deciding whether the two xsl:merge-source elements have the same effective value for "order", when one is omitted and the other is set to "ascending".

Pull request #587 created #created-587

02 Jul at 22:04:09 GMT
365: Allow braces in switch and typeswitch expressions

Fix #365

Pull request #586 created #created-586

02 Jul at 21:32:07 GMT
585: [Editorial] Rearrange text (and grammar) for dynamic function calls

This PR is purely editorial. It addresses the problems described in issue #585, in particular, the syntax of dynamic function calls is now described before the semantics.

Issue #159 closed #closed-159

02 Jul at 18:16:46 GMT

Support named arguments on static function calls

Issue #585 created #created-585

30 Jun at 15:18:45 GMT
Editorial: dynamic function calls

Section 4.4.2.1 describes the semantics of a dynamic function call, but there is no explanation of the syntax.

The syntax is defined in section 4.5.2.

It's rather odd for the syntax and semantics to be separated in this way, and in particular for the semantics to be explained first.

It's also rather odd that both sections claim to have definitions of the term "dynamic function call", and the definitions are different.

Furthermore the grammar productions in 4.5.2 reference, but don't include, PositionalArgumentList; instead they include ArgumentList which is not referenced. PositionalArgumentList is included under 4.23 Arrow Expressions.

Pull request #584 created #created-584

29 Jun at 16:42:22 GMT
Editorial: Correction to map:filter examples

Corrects the syntax of the lambda functions in these two examples. Also improves the formatting.

Issue #583 created #created-583

29 Jun at 14:44:42 GMT
array:replace(), etc

Some observations:

  1. array:replace would be more versatile if multiple positions could be specified, rather than just a single position.
  2. if all positions are selected, the function becomes identical to array:for-each. So perhaps we should scrap array:replace and instead add an optional parameter $positions as xs:integer* to array:for-each. However, that could be confusing: people might imagine that the items at positions not present in the list are discarded, rather than being returned unchanged in the result. So I propose we don't do that.
  3. if the function is useful on arrays, then it's also useful on sequences. But fn:replace does something completely different.
  4. We have a similar function on maps called map:substitute (but it's not quite the same, because it processes every entry in the map).

In the interests of alignment, I propose we have three functions:

fn:substitute($input as item()*, $positions as xs:positiveInteger*, $action as function(item()) as item())

equivalent to for $it at $pos in $input return if ($pos = $positions) then $action($it) else $it

array:substitute($array as array(*), $positions as xs:positiveInteger*, $action as function(item()*) as item()*)

equivalent to array{for member $it at $pos in $array return if ($pos = $positions) then $action($it) else $it}

map:substitute($map as map(*), $keys as xs:anyAtomicValue*, $action as function(anyAtomicValue, item()*) as item()*)

For the first two functions, we don't really need to allow the second argument to be omitted, because it would then be equivalent to the corresponding for-each() function. Unfortunately that's not quite true of map:for-each(), because it doesn't return a map. However if you want to do a functional replacement of every entry in a map, it can be done easily enough with

map:build(map:key-value-pairs($map), function{?key}, function{$action(?value)})

so we're not really losing anything.
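
The sequence variant proposed above can be illustrated outside XPath. The following is a minimal Python sketch, not spec text: the name `substitute`, the use of a Python list to stand in for a sequence, and the sample data are all assumptions made for the example.

```python
from typing import Any, Callable, Iterable, List


def substitute(items: Iterable[Any],
               positions: Iterable[int],
               action: Callable[[Any], Any]) -> List[Any]:
    """Apply `action` to the items at the given 1-based positions.

    Items at positions not listed are returned unchanged, not discarded.
    """
    selected = set(positions)
    # XPath positions are 1-based, hence start=1.
    return [action(item) if pos in selected else item
            for pos, item in enumerate(items, start=1)]


# Hypothetical usage: upper-case the 2nd and 4th items only.
result = substitute(["a", "b", "c", "d"], [2, 4], str.upper)
```

With an empty position list the input comes back unchanged, which is the behaviour the issue worries users might not expect from a `$positions` parameter on array:for-each.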

Issue #582 closed #closed-582

29 Jun at 13:16:47 GMT

Fix examples to be consistent with spec

Pull request #582 created #created-582

29 Jun at 13:04:57 GMT
Fix examples to be consistent with spec

I know there are open comments on the actual spec, but in the short term, let's at least make the examples correct.

Issue #581 closed #closed-581

29 Jun at 12:44:23 GMT

Fix schema error

Pull request #581 created #created-581

29 Jun at 12:44:19 GMT
Fix schema error

I can't explain why this didn't turn up initially in my local testing. :-(

Issue #580 closed #closed-580

29 Jun at 12:34:22 GMT

Automatically generate app/fo-spec-examples on build

Pull request #580 created #created-580

29 Jun at 12:26:05 GMT
Automatically generate app/fo-spec-examples on build

This PR should update the build so that the test-suite app/fo-spec-examples.xml file is automatically built when the spec is published. The PR test won't tell us anything, so we'll have to merge it and see what happens...

Issue #579 closed #closed-579

29 Jun at 10:49:45 GMT

Support role=wide on fos:examples

Pull request #579 created #created-579

29 Jun at 10:37:26 GMT
Support role=wide on fos:examples

This is a purely style-related change. It supports role=wide on fos:example elements. If an example is identified as "wide" then the presentation is in sequential rows of the table rather than adjacent columns. See, for example, parse-uri().

Pull request #578 created #created-578

28 Jun at 15:03:24 GMT
317: fn:format-integer: $lang → $language

Parameter name aligned with other functions (fn:format-dateTime, fn:lang, others). Closes #317.

Pull request #577 created #created-577

28 Jun at 14:50:40 GMT
Editorial: improve generator for keyword tests

Improves the stylesheet that generates the BuiltInKeywords.xml test set, using example values for arguments where appropriate, taking these from the function catalog.

Issue #576 created #created-576

27 Jun at 20:03:19 GMT
JSON serialization: Sequences, INF/NaN, function items

As far as possible, the JSON serialization method should be aligned with the new fn:items-to-json() (aka xdm-to-json) function.

Firstly, the results should be the same when the input consists entirely of atomic values, or sequences, maps, and arrays consisting entirely of atomic values. This only requires one change: serializing a sequence of length >1 should output the sequence as if it were an array, rather than raising an error. This is a compatible change.

Secondly, the new fn:items-to-json() function should have an option to output elements (for example elements appearing within maps or arrays) in the same way as the JSON serialization method does, by serializing the nodes to lexical XML or HTML contained within a JSON character string.

Finally, we should align the rules on how to output a function item (other than a map or array). We could either adopt the serialization approach (raise an error), or the items-to-json() approach (output some kind of placeholder saying "here be dragons") but they should be aligned.

Note that the two operations are still very different. items-to-json() is primarily about converting XML to JSON, while JSON serialization is primarily about turning maps and arrays into their JSON lexical form.

Issue #361 closed #closed-361

27 Jun at 16:10:11 GMT

Named arguments: $input vs. $value

Issue #562 closed #closed-562

27 Jun at 16:10:08 GMT

361: Named arguments: $input vs. $value

Issue #567 closed #closed-567

27 Jun at 16:09:46 GMT

Errors in schema for XSLT

Issue #568 closed #closed-568

27 Jun at 16:09:45 GMT

567: schema for xslt40

Issue #569 closed #closed-569

27 Jun at 16:09:28 GMT

Minor editorial corrections, XDM chh. 1, 2

Issue #106 closed #closed-106

27 Jun at 16:08:46 GMT

Decorators' support

Issue #175 closed #closed-175

27 Jun at 16:08:31 GMT

In XQuery, allow a semicolon at the end of the module

Issue #457 closed #closed-457

27 Jun at 16:07:59 GMT

Support parsing numeric, alphabetic, and additive number systems.

Pull request #575 created #created-575

27 Jun at 14:05:56 GMT
359: fn:void: Absorb result of evaluated argument

#359: fn:void: Absorb result of evaluated argument

Issue #574 created #created-574

27 Jun at 14:02:01 GMT
fn:log: Trace and discard results

See https://github.com/qt4cg/qtspecs/issues/359#issuecomment-1465971781

fn:dump

Summary

Outputs trace information and discards the result.

Signature

fn:dump(
  $value  as item()*,	
  $label  as xs:string?  := ()
) as empty-sequence()

Properties

This function is ·deterministic·, ·context-independent·, and ·focus-independent·.

Rules

Similar to fn:trace, the values of $value, converted to an xs:string, and $label (if supplied and non-empty) may be directed to a trace data set. The destination of the trace output is ·implementation-defined·. The format of the trace output is ·implementation-dependent·, as is the ordering of the output.

In contrast to fn:trace, the function returns an empty sequence.
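
The contrast with fn:trace can be sketched in a few lines. This is an illustration in Python under stated assumptions, not the proposed function: the name `dump`, the choice of standard error as the trace destination, and the `label: value` output format are all hypothetical, since the proposal leaves destination and format to the implementation.

```python
import sys
from typing import Any, Optional


def dump(value: Any, label: Optional[str] = None) -> tuple:
    """Write a trace line and return an empty tuple (standing in for the empty sequence)."""
    text = str(value)
    # The label is only used if supplied and non-empty.
    line = f"{label}: {text}" if label else text
    # Assumption: stderr plays the role of the trace data set.
    print(line, file=sys.stderr)
    return ()


# Hypothetical usage: the value is traced, and nothing is passed on.
outcome = dump([1, 2, 3], "intermediate")
```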

Issue #573 created #created-573

26 Jun at 16:53:46 GMT
Node construction functions

I propose introducing a set of functions that allow node construction in XPath. The basic functions are

new-document($children as node()*)
new-element($name as QName, $content as node()*)
new-attribute($name as QName, $value as xs:string)
new-namespace($prefix as xs:string, $uri as xs:string)
new-comment($content as xs:string)
new-processing-instruction($name as xs:string, $content as xs:string)
new-text($content as xs:string)

The semantics would be essentially identical to the current node constructors in XQuery and/or the corresponding instructions in XSLT (there are very few differences: a few minor ones, like if there are multiple attributes with the same name, XSLT takes the last, while XQuery throws an error).

As always, it's difficult to know where to draw the line in functionality between XPath and XQuery. One of the guidelines I use is that if an addition to XPath is likely to be useful to XSLT users, then it's worth including. Clearly these functions are not strictly necessary in XSLT (they could easily be user-supplied as wrappers around XSLT instructions). But to take advantage of some of the other capabilities we're introducing in XPath, node construction functions are increasingly handy. Consider for example:

<xsl:copy-of select="interleave(para, new-element(xs:QName('xhtml:br')))"/>

That's a one-liner that replaces

<xsl:for-each select="para">
  <xsl:if test="position() ne 1"><br/></xsl:if>
  <xsl:copy-of select="."/>
</xsl:for-each>
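
The equivalence between the two forms rests on what interleave does. A minimal Python sketch, assuming (as the xsl:for-each version suggests) that the separator is placed before every item except the first; the function body and the sample strings are illustrative only:

```python
from typing import Any, Iterable, List


def interleave(items: Iterable[Any], separator: Any) -> List[Any]:
    """Return the items with `separator` inserted between adjacent items."""
    result: List[Any] = []
    for index, item in enumerate(items):
        # Mirrors test="position() ne 1": no separator before the first item.
        if index > 0:
            result.append(separator)
        result.append(item)
    return result


# Hypothetical usage: strings stand in for para elements and the br element.
mixed = interleave(["para1", "para2", "para3"], "br")
```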

The new functions are also useful in XQuery because although they duplicate existing syntax, the fact that they are functions rather than custom syntax makes them more versatile.

As with existing constructs in XQuery and XSLT, a naive implementation that follows the semantics literally (which involves copying a subtree when adding an element to a new parent) would be rather inefficient. However, I think that the same established optimizations are equally applicable, for example lazy tree construction and/or push-mode evaluation.

Issue #572 created #created-572

26 Jun at 16:36:41 GMT
fn:evaluate-xpath() function

XSLT 3.0 introduced an instruction, xsl:evaluate, for dynamic evaluation of XPath expressions. But there is no way of doing this in XQuery.

It was done as an instruction in XSLT because functions at the time were not flexible enough to cope with a range of optional parameters. This situation has changed.

An fn:evaluate() function might operate successfully with the following parameters:

  • xpath - the XPath expression as a string.
  • namespaces - the namespace bindings as a map from prefix to URI, defaulting to the namespace bindings in the static context of the caller
  • parameters - parameter values as a map from QName to value, defaulting to an empty map
  • context-item - the context item for evaluation, defaulting to the context item of the caller
  • options:
    • default collation
    • base URI
    • schema-aware = true/false
    • allow-external-access = true/false
    • cache = true/false (whether to cache compiled expressions for reuse)

Before xsl:evaluate was introduced, there was some concern in the WG about security and the risk of injection attacks. I think that the allow-external-access switch is sufficient for this: if you don't trust the expression, set it to false, and all dangerous things like access to external documents and other resources, use of extension functions, etc., are disabled.

QT4 CG meeting 040 draft agenda #agenda-06-27

23 Jun at 10:55:00 GMT

Draft agenda published.

Issue #571 created #created-571

23 Jun at 09:05:20 GMT
XSLT: xsl:for-each-group/@break-when

The new @break-when attribute on xsl:for-each-group

(a) has not been reviewed by the CG

(b) is not mentioned in the changes appendix

(c) is not included in the schema-for-XSLT40

Issue #570 created #created-570

23 Jun at 07:59:29 GMT
XSLT: Built-in template rules for maps and arrays

The current built-in template rules for maps and arrays do not work well for a recursive-descent traversal of a JSON-like tree of maps and arrays. The effect of on-no-match="shallow-copy" is to do a deep copy, and the effect of on-no-match="shallow-skip" is to do a deep skip. I therefore propose two new values for on-no-match, provisionally "shallow-copy-all" and "shallow-skip-all".

For more details see my Balisage 2022 paper: https://balisage.net/Proceedings/vol27/html/Kay01/BalisageVol27-Kay01.html

We do in fact have some test cases that use this syntax, see attr/mode-4001 et seq, but it has never found its way into the spec.

Pull request #569 created #created-569

23 Jun at 03:40:55 GMT
Minor editorial corrections, XDM chh. 1, 2

Minor corrections to XDM chapters 1, 2.

  • expanded-QName versus expanded QName. The latter outnumbered the former ca. 3:2, and is better, so I went with it.
  • some language pointing to tables & lines, language rendered obsolete by @ndw 's nice new graphs, excised
  • other minor edits for clarity, consistency

Pull request #568 created #created-568

22 Jun at 20:15:58 GMT
567: schema for xslt40

This PR applies the errata to the XSLT 3.0 version of the schema, and extends the schema to support new syntax that has been introduced in XSLT 4.0.

Fix #567.

Issue #567 created #created-567

22 Jun at 17:24:17 GMT
Errors in schema for XSLT

Erratum E10 for XSLT 3.0 notes a number of errors in the schema for XSLT 3.0. The errata can be found in the repository at https://github.com/w3c/qtspecs/blob/master/errata/xslt-30/errata.xml but they are not published or linked from the spec; they were produced by the XSL WG after publishing 3.0 and before the group disbanded.

The version of the schema found in the 3.0 repository does not include these corrections; therefore, neither does the one in the 4.0 repository.

There is another version of the schema in the xslt30test repository, and this one does include the corrections (as far as I can see).

However, new additions for changes needed for 4.0 have been applied inconsistently to the two versions.

There's therefore a pressing need to bring everything back into line. A good step would be to cut out the duplication: once we've got it clean, the build process should copy qtspecs/specifications/xslt-40/src/schema-for-xslt40.xsd to xslt40-test/tests/misc/catalog/schema-for-xslt40.xsd and we should only maintain the former.

Issue #566 created #created-566

22 Jun at 11:31:04 GMT
fn:parse-uri, fn:build-uri: Feedback

Feedback on fn:parse-uri (thanks, @ndw, for the comprehensible rules in the spec):

  1. With the current rules, the port is not detected in http://x:80. Maybe x: is misinterpreted as the beginning of a Windows path?
  2. The URI decoding should be revised. It’s currently possible to create strings with invalid Unicode characters: parse-uri('%FF'). With the current rules, the following expression returns df83 and dc00:
parse-uri('%FF00')
=!> map:get('filepath')
=!> string-to-codepoints()
=!> format-integer('16^XX')

More to come (or not).
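
The decoding concern in item 2 can be reproduced with any percent-decoder: the byte FF is never valid in UTF-8, so a decoder has to either reject it or substitute something. The sketch below uses Python's standard `urllib.parse.unquote` purely as an illustration; it says nothing about how fn:parse-uri is or should be specified, and `decode_strict` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import unquote


def decode_strict(text: str) -> Optional[str]:
    """Percent-decode `text`, returning None if the bytes are not valid UTF-8."""
    try:
        return unquote(text, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


invalid = decode_strict("%FF00")   # FF cannot start a UTF-8 sequence
valid = decode_strict("%C3%BC")    # C3 BC is the UTF-8 encoding of U+00FC
lenient = unquote("%FF00")         # default errors="replace" substitutes U+FFFD
```

Either behaviour (rejecting the input, or substituting U+FFFD) avoids producing a string that contains invalid characters.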

Issue #560 closed #closed-560

22 Jun at 11:23:18 GMT

Formatting Monospace

Issue #565 closed #closed-565

22 Jun at 09:29:31 GMT

More typographic changes

Pull request #565 created #created-565

22 Jun at 09:21:14 GMT
More typographic changes

Further to #560, I'll just merge this if it passes.

  1. Make the <pre> font the same size as the other monospaced environments
  2. Tighten up the spacing in function signatures
  3. Make table-formatted examples use valign=top
  4. Add subtle shading to table-formatted examples so the rows are easier to distinguish.

Issue #564 created #created-564

22 Jun at 07:54:18 GMT
Sorted maps

Based on a requirement from Michael Müller-Hillebrand on xsl-list.

We could define a variant of map:build() that constructs a sorted map. If a map is sorted, then:

  • map:is-sorted() returns true.
  • map:keys() returns the keys in sorted order.
  • map:range(map, min, max) returns a sorted sub-map whose keys lie within a particular range
  • map:get-by-prefix() returns the values whose keys start with a given substring
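
The operations listed above become cheap once keys are kept in sorted order. A minimal Python sketch, assuming string keys, codepoint ordering, and inclusive range bounds (none of which the issue specifies); the class `SortedMap` and its method names are hypothetical:

```python
import bisect
from typing import Any, Iterable, List, Tuple


class SortedMap:
    """Toy sorted map: a dict plus a sorted list of its keys."""

    def __init__(self, entries: Iterable[Tuple[str, Any]]):
        self._data = dict(entries)
        self._keys = sorted(self._data)

    def keys(self) -> List[str]:
        # Keys come back in sorted order.
        return list(self._keys)

    def range(self, low: str, high: str) -> "SortedMap":
        # Assumption: both bounds are inclusive.
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return SortedMap((k, self._data[k]) for k in self._keys[start:end])

    def get_by_prefix(self, prefix: str) -> List[Any]:
        # Keys sharing a prefix are contiguous in sorted order.
        start = bisect.bisect_left(self._keys, prefix)
        values = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            values.append(self._data[key])
        return values


# Hypothetical usage.
fruit = SortedMap([("pear", 1), ("apple", 2), ("apricot", 3), ("banana", 4)])
```

Both range and prefix lookups locate their starting point by binary search rather than by scanning every key.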

Issue #563 closed #closed-563

21 Jun at 12:15:38 GMT

Style and other editorial fixes

Pull request #563 created #created-563

21 Jun at 11:47:26 GMT
Style and other editorial fixes

This PR mostly attempts to fix #560 by making the formatting of code more consistent.

  1. I removed the reference to the W3C CSS and made a local copy
  2. Added CSS variables for code parameters and tried to make use of them everywhere
  3. I removed the use of "font:small" in index tables, so they're a little easier to read
  4. I noticed and fixed two places where we had old-style type hierarchy diagrams

These fixes aren't going to be visible in the PR build, so I'm just going to merge it.

Pull request #562 created #created-562

21 Jun at 09:03:26 GMT
361: Named arguments: $input vs. $value

#361: Just an editorial one to get one more issue closed.

QT4 CG meeting 039 draft minutes #minutes-06-20

20 Jun at 17:15:00 GMT

Draft minutes published.

Issue #526 closed #closed-526

20 Jun at 16:49:48 GMT

load-xquery-module() needs changes to account for functions with an arity range

Issue #548 closed #closed-548

20 Jun at 16:49:09 GMT

Space separation in lambda expressions

Issue #561 created #created-561

20 Jun at 16:44:54 GMT
Alias for `function` keyword; drop thin arrow syntax?

See https://github.com/qt4cg/qtspecs/issues/548#issuecomment-1591023928 and the subsequent comments:

[…] I would still be in favor of having fn or f as a plain alias for function (and optionally dropping the thin arrow syntax), simply because function items and declarations can get so frequent:

fn($a) { $a + 1 }
fn { . + 1 }

Issue #552 closed #closed-552

20 Jun at 16:41:06 GMT

Editorial: Updates to back matter and status section of F+O spec

Issue #559 closed #closed-559

20 Jun at 16:39:03 GMT

Minor editorial edits

Issue #558 closed #closed-558

20 Jun at 16:38:50 GMT

Added fn:items-X cross-references

Issue #316 closed #closed-316

20 Jun at 16:38:26 GMT

Function fn:differences

Issue #551 closed #closed-551

20 Jun at 16:38:25 GMT

316: Drop the fn:differences function

Issue #550 closed #closed-550

20 Jun at 16:38:12 GMT

548: require parens around lambda arguments

Issue #549 closed #closed-549

20 Jun at 16:37:57 GMT

526 load xquery module

Issue #82 closed #closed-82

20 Jun at 16:36:39 GMT

Should the mode attribute for apply-templates in templates of enclosed modes default to #current?

Issue #112 closed #closed-112

20 Jun at 16:36:29 GMT

Abbreviate `map:function($someMap)` to `$someMap?function()`

Issue #331 closed #closed-331

20 Jun at 16:36:12 GMT

Extend fn:path to support arrays and maps.

Issue #376 closed #closed-376

20 Jun at 16:35:59 GMT

add documentation prefix attribute to xsl:stylesheet

Issue #399 closed #closed-399

20 Jun at 16:35:49 GMT

fn:deep-equal: Using Multilevel Hierarchy and Abstraction when designing and specifying complex functions

Issue #425 closed #closed-425

20 Jun at 16:35:31 GMT

Structural proposal (ThinLayer:tm:) : Add a layer of thin spec between XPath and the XPath Derived Language

Issue #560 created #created-560

20 Jun at 09:23:16 GMT
Formatting Monospace

It would be great if we could further tweak and unify the representation of monospaced text. On my machine, I see…

  • monospace (13px) for the function signature (using <code> instead of <pre>, with increased line heights)
  • monospace (11.7px) for <code>
  • Menlo, Consolas, "DejaVu Sans Mono", ... (13px) for <pre>
  • Menlo, Consolas, "DejaVu Sans Mono", ... (14.4px) for <pre> in examples

…and I guess there are other cases. Would it be possible to use the same font style, size & line height for monospaced text? Maybe a friendlier light grey color for all <pre> blocks, similar to the GitHub rendering.


Pull request #559 created #created-559

19 Jun at 20:27:44 GMT
Minor editorial edits

The submitted edits are based upon a fresh reading of the serialization specs, and should all be relatively minor. Review would be appreciated, as small errors can have significant consequences.

Pull request #558 created #created-558

19 Jun at 19:22:46 GMT
Added fn:items-X cross-references

This PR provides cross-references in the form of notes between members of the pairs items-before() & items-ending-where() and items-after() & items-starting-where(). The goal is a light editorial intervention to enhance function discovery. (Anticipating a user who, at any one of these spec descriptions, wonders how they might exclude/include the first matching item.)

QT4 CG meeting 039 draft agenda #agenda-06-20

19 Jun at 10:55:00 GMT

Draft agenda published.

Issue #512 closed #closed-512

19 Jun at 10:06:19 GMT

256: Context for default function parameter expressions

Issue #3 closed #closed-3

19 Jun at 09:55:21 GMT

Allow tokens in xsl:mode/@name

Issue #557 created #created-557

17 Jun at 12:31:14 GMT
fn:unparsed-binary: accessing and manipulating binary types

Dear All,

When working with binary types, one currently has to fall back to string conversions and/or extensions. A few ideas for nice-to-have functions that operate on binary types:

  • accessing a single byte at a specific position
  • splitting binary data at a byte boundary (aka binary-subsequence)
  • converting to/from a sequence of bytes
  • joining binary data together
  • (optionally) loading data directly as (base64)binary (some extensions use unparsed-text with the proprietary decoding 'x-binarytobase64' to retrieve a base64-castable string)
  • standard bitwise operators: and, or, xor, not, rshift, lshift

If considered, each would need to be discussed in detail separately (e.g. which types to support: signed/unsigned, 8-bit, etc.). I merely intend this as a conversation starter in case others have encountered similar issues or limitations.

p.s.: Not sure if this is the right channel to raise this issue, so feel free to close/move/split accordingly.

Issue #556 created #created-556

16 Jun at 02:53:49 GMT
Serialization phase 5 note unclear

Section 4 of the Serialization spec ends with this note:

Serialization is only defined in terms of encoding the result as a stream of octets. However, a serializer MAY provide an option that allows the encoding phase to be skipped, so that the result of serialization is a stream of Unicode characters.

What is a stream of Unicode characters? AFAIK, Unicode characters cannot be streamed as such but require some encoding. And how would such a stream not consist of bytes/octets of one sort or another? A stream of what exactly?

Pull request #555 created #created-555

16 Jun at 00:02:29 GMT
464: Revised narrative of normalization steps for serialization

This PR acts on #464 by revising the description of steps involved in normalizing a sequence that is input to serialization. I normally would wield a light editorial hand, but the issues raised in #464 as well as closer reading of the prose convinced me that a wholesale revision would be beneficial. For example, new sequences were described as if the reader already knew about them, but they are really only introduced as the last sentence in many steps.

I have capitalized on the original version of step 1's appeal to array:flatten to abbreviate the description for two of the steps.

Issue #518 closed #closed-518

15 Jun at 02:10:51 GMT

transitive-closure() function

Issue #554 created #created-554

15 Jun at 01:49:25 GMT
The Transitive Closure function produces an incomplete result, completeness/success and number of actual iterations must also be returned

@michaelhkay and CG members,

I initially reopened the original issue, because there is some useful and needed functionality that the PR and the text of the FO specification do not provide, and we did not get to discuss this at the June 13th meeting.

I believe that typically a developer would need to know if the "whole" TC was completely produced, or not, and maybe how many iterations were needed.

This can be provided if the result of the function was for example:

map{ 
        "TC" : $thefinalTCNodeset,
        "WasTCComplete": $boolForTCCompletion,
        "NumberOfIterations": $someInteger
       }

One could argue that if $result?NumberOfIterations < max, then we know that the complete TC was produced and thus we don't need the member "WasTCComplete".

However, this is not the case when $result?NumberOfIterations eq max. In this case both outcomes are possible: the complete TC was produced in exactly max iterations (so the value of "WasTCComplete" must be true()), or max iterations were performed and the (max + 1)th iteration would have produced additional nodes, in which case "WasTCComplete" must be false().

Failure to produce the complete TC in many cases will be regarded as error, and the developer needs to be sure that this was (or this wasn't) the case.

Thus, the current specification of this function needs additional work to accommodate the full needed functionality.
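
The ambiguity described above (iteration count equal to max, with either outcome possible) can be made concrete. The following Python sketch is an illustration under assumptions, not the fn:transitive-closure specification: the function name, the iteration cap parameter, the choice to exclude start nodes unless they are reachable, and the extra probing step used to decide completeness are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


def transitive_closure(start: Iterable[Hashable],
                       step: Callable[[Hashable], Set[Hashable]],
                       max_iterations: int) -> Dict[str, object]:
    """Bounded transitive closure that also reports whether it completed."""
    closure: Set[Hashable] = set()
    frontier: Set[Hashable] = set(start)
    iterations = 0
    while frontier and iterations < max_iterations:
        reached: Set[Hashable] = set()
        for node in frontier:
            reached |= step(node)
        new_nodes = reached - closure
        closure |= new_nodes
        frontier = new_nodes
        iterations += 1
    # Probe one further step: complete only if it would add nothing new.
    complete = all(step(node) <= closure for node in frontier)
    return {"TC": closure,
            "WasTCComplete": complete,
            "NumberOfIterations": iterations}


# Hypothetical graph: a chain 1 -> 2 -> 3 -> 4.
def next_in_chain(n: int) -> Set[int]:
    return {n + 1} if n < 4 else set()


exact = transitive_closure({1}, next_in_chain, 3)      # finishes in exactly 3
truncated = transitive_closure({1}, next_in_chain, 2)  # cut off after 2
```

In both calls the iteration count equals the cap, yet only the first result is complete, which is why the count alone cannot replace the flag.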

Issue #553 created #created-553

15 Jun at 00:09:27 GMT
New function fn:substitute()

The discussion on the parse-csv() use cases suggests there would be value in a function

fn:substitute($value as item()*, $pos as xs:positiveInteger, $mangler as function(item()) as item()) as item()*

whose effect is to return the input sequence $value with the item at position $pos replaced by the result of invoking $mangler on that item.

This should be aligned with similar functions for maps and arrays.

Pull request #552 created #created-552

14 Jun at 21:32:51 GMT
Editorial: Updates to back matter and status section of F+O spec

Mainly updates to the changes appendix. Also improve the status section, and correct the bibref to the EXSLT specs.

Issue #256 closed #closed-256

14 Jun at 20:10:37 GMT

Function declarations: static and dynamic context for default parameter values

Issue #275 closed #closed-275

14 Jun at 20:08:52 GMT

Problems with nt/xnt links to grammar terms

Pull request #551 created #created-551

14 Jun at 20:04:07 GMT
316: Drop the fn:differences function

Fix #316

The draft specification for fn:differences was added to the spec before we had a review process in place. The spec is complex and incomplete; issue #316 points out some of the difficulties. This PR removes it from the spec.

Note that fn:deep-equal() now has a debug option which outputs diagnostic information in implementation-defined format; although not interoperable, this meets the main use case, which is to discover for diagnostic purposes why and in what way two values are not deep-equal to each other.

Pull request #550 created #created-550

14 Jun at 08:47:08 GMT
548: require parens around lambda arguments

Require parentheses around the parameter list in a lambda expression, even when there is only one parameter, to avoid the problem of needing whitespace before the arrow. Resolves issue #548.

Issue #327 closed #closed-327

14 Jun at 00:03:05 GMT

Tokenisation

Issue #333 closed #closed-333

13 Jun at 23:55:42 GMT

Equality of function items

Issue #382 closed #closed-382

13 Jun at 23:49:43 GMT

Improve whitespace handling in deep-equal

Issue #536 closed #closed-536

13 Jun at 23:49:01 GMT

Re: Mathematical Operator Symbols

Issue #513 closed #closed-513

13 Jun at 23:48:03 GMT

Arrow operator: Inline functions without parens

Pull request #549 created #created-549

13 Jun at 23:46:03 GMT
526 load xquery module

Updates the spec of fn:load-xquery-module to handle functions with optional parameters. Resolves issue #526.

QT4 CG meeting 038 draft minutes #minutes-06-13

13 Jun at 17:30:00 GMT

Draft minutes published.

Issue #521 closed #closed-521

13 Jun at 16:34:26 GMT

518: Add transitive-closure() function

Issue #545 closed #closed-545

13 Jun at 16:33:38 GMT

513: after arrow operator, inline function no longer needs parens

Issue #544 closed #closed-544

13 Jun at 16:33:17 GMT

536: disallow mixing of symbols in operator tokens

Issue #543 closed #closed-543

13 Jun at 16:32:52 GMT

382 simplify rules for whitespace in fn:deep-equal

Issue #542 closed #closed-542

13 Jun at 16:32:35 GMT

Fixes a simple error in the description of XSLT error XTSE4020

Issue #541 closed #closed-541

13 Jun at 16:32:19 GMT

Fix typo in XPath §2.4.5 - E1 should be tagged as code not as var.

Issue #548 created #created-548

13 Jun at 07:55:05 GMT
Space separation in lambda expressions

Currently the expression $a -> {$a+1} requires a space before the arrow, because the hyphen is otherwise tokenized as part of the variable name.

Not only is this a very easy trap to fall into, it also tends to result in poor diagnostics because we're in lookahead territory where we are also considering an alternative parse along the lines $a- > EXPR (where the variable name is a-), and this parse only gets disqualified because an expression can't begin with {.
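
The trap comes from longest-match tokenization of the variable name. A small Python sketch, using a simplified ASCII-only pattern as a stand-in for the real NCName rule (an assumption; the actual grammar admits many more characters):

```python
import re
from typing import Optional

# Simplified stand-in for "$" followed by an NCName: hyphens and dots are
# allowed after the first character, so a trailing "-" is swallowed.
VARIABLE_REF = re.compile(r"\$[A-Za-z_][A-Za-z0-9_.\-]*")


def first_variable(expression: str) -> Optional[str]:
    """Return the variable reference a longest-match tokenizer sees first."""
    match = VARIABLE_REF.match(expression)
    return match.group() if match else None


without_space = first_variable("$a->{$a+1}")    # hyphen joins the name
with_space = first_variable("$a -> {$a+1}")     # space ends the name
```

Without the space, the tokenizer is left with a variable named a- followed by >, which is the alternative parse the issue describes.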

If we later introduce expressions that start with {, writing $a->{3} with no space would no longer be an error, it would just mean something completely different from what you intended.

One solution, not a very pretty one, is to require parentheses around the argument list even when there is only a single argument. Almost as inconvenient as requiring the space, but with better diagnostics if you forget.

Another solution might be to restrict the parameter name so it can't contain a hyphen. The problem with that is that the tokenization becomes sensitive to the grammatical context, which is not totally unexplored territory, but complicates the fact that we're already in lookahead territory where we are looking at alternative parses. (The Saxon implementation already does the lookahead by retokenizing, so it's not impossible.)

Would it be too horrendous to contemplate a backwards-incompatible change to say variable names cannot end in a hyphen, anywhere? Perhaps with a mode bit for compatibility? After all, all this complication is caused by the need to cater for something which no sane user ever does.

Pull request #547 created #created-547

12 Jun at 15:24:30 GMT
Action QT4CG-036-02: Further elaboration of the rules for function identity

Following review and acceptance of the proposal introducing the concept of function identity (PR #525, Issue #520) this PR makes some refinements in response to comments raised during the review, especially in the following areas:

  • clarification as regards named function references to context-dependent functions
  • relationship to (in)determinacy of a function
  • avoiding the phrase "new function item"
  • stating that the identity of a function item such as fn:count#1 applies even across execution scopes, e.g. calls to fn:transform.

Pull request #546 created #created-546

12 Jun at 13:28:46 GMT
414: Attempt to implement expanding the allowed character repertoire

Fix #414

This PR addresses ACTION QT4CG-036-01 on me.

QT4 CG meeting 038 draft agenda #agenda-06-13

11 Jun at 18:20:00 GMT

Draft agenda published.

Pull request #545 created #created-545

09 Jun at 20:52:44 GMT
513: after arrow operator, inline function no longer needs parens

Resolves issue #513 by removing the requirement for an inline function expression on the RHS of an arrow operator to be enclosed in parentheses.

Pull request #544 created #created-544

09 Jun at 16:51:23 GMT
536: disallow mixing of symbols in operator tokens

As proposed in issue #536, this change disallows mixing of ordinary and full-width angle brackets in the same token.

Pull request #543 created #created-543

09 Jun at 15:24:46 GMT
382 simplify rules for whitespace in fn:deep-equal

My previous attempt to make a pull request for this change got lost somewhere in the process; here is a renewed attempt. The changes are in response to comments made during the review of the original deep-equal() proposal, recorded in issue #382.

Issue #497 closed #closed-497

09 Jun at 11:44:36 GMT

https://qt4cg.org/specifications/xpath-functions-40/Overview-diff.html#func-map-pairs has wrong function syntax order

Pull request #542 created #created-542

09 Jun at 11:36:39 GMT
Fixes a simple error in the description of XSLT error XTSE4020

A comment to issue #82 identified this typo.

Pull request #541 created #created-541

09 Jun at 11:34:40 GMT
Fix typo in XPath §2.4.5 - E1 should be tagged as code not as var.

A trivial markup error that leads to the meta-variable E1 being wrongly rendered.

Issue #66 closed #closed-66

09 Jun at 11:27:11 GMT

ThinArrowTarget should use FunctionBody

Issue #78 closed #closed-78

09 Jun at 11:26:16 GMT

Specify strict order of evaluation for a subexpression

Issue #98 closed #closed-98

09 Jun at 10:22:35 GMT

Support ignoring whitespace/indentation differences in fn:deep-equal.

Issue #125 closed #closed-125

09 Jun at 10:20:26 GMT

array:partition → fn:partition: empty results; examples

Issue #384 closed #closed-384

09 Jun at 09:20:58 GMT

Definition of "effective value" in XSLT

Issue #418 closed #closed-418

09 Jun at 09:18:13 GMT

array and map attribute in xsl:iterate and xsl:for-each-group

Issue #503 closed #closed-503

09 Jun at 09:14:48 GMT

Reinstate focus functions

Issue #381 closed #closed-381

09 Jun at 09:13:34 GMT

Deep-equal comparisons without errors

Issue #520 closed #closed-520

09 Jun at 09:11:45 GMT

Function identity

Issue #540 created #created-540

08 Jun at 09:24:38 GMT
Add fn:system-property() to XQuery

XSLT has specific additions to the XPath function library to facilitate identifying the running implementation:

  • https://www.w3.org/TR/xslt-30/#func-system-property
  • https://www.w3.org/TR/xslt-30/#func-available-system-properties

These would be useful for XQuery too. There should be something better than the fragile sadness of https://github.com/AndrewSales/XQS/blob/461a90a8e2f49d9ef646ff6940c6962f18c0f43a/port.xqm#L3-L12

Issue #539 created #created-539

06 Jun at 20:20:06 GMT
FLOWR where clause with a "do when false" option

This is a request for an enhancement.

Fairly often, I'll have a query arranged as

let $step1 := do some processing
where exists($step1)
let $step2 := processing based on step1
where exists($step2)

and so on.

This is a convenient pattern until I want to emit some sort of message about where the process stops.

It would be convenient to have

where expression else expression

with the else as an optional extension of the where clause to allow emitting information about which where clause the FLWOR expression stopped on.

It might be more congruent to the style of the language as

where expression return expression

but then again having multiple return keywords isn't obviously a good thing.
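
As a rough illustration of the requested behaviour (a Python sketch, not XQuery, and not part of the proposal), the pattern can be modelled as a chain of steps that stops at the first empty result and reports which step it stopped on. The names run_pipeline and the step labels are hypothetical.

```python
def run_pipeline(value, steps):
    """Run (label, function) steps in order and stop at the first empty result.

    Returns (result, None) when every step produced something, or
    (None, label) naming the step whose result was empty. The second case
    plays the role of the proposed `where ... else ...` branch.
    """
    for label, step in steps:
        value = step(value)
        if not value:              # corresponds to `where exists($stepN)` failing
            return None, label
    return value, None


steps = [
    ("step1", lambda xs: [x for x in xs if x > 0]),
    ("step2", lambda xs: [x for x in xs if x % 2 == 0]),
]

completed = run_pipeline([1, 2, 3], steps)
stopped = run_pipeline([1, 3], steps)
```

Here the second call stops at step2, which is the information the optional else branch would make available.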

QT4 CG meeting 037 draft minutes #minutes-06-06

06 Jun at 17:10:00 GMT

Draft minutes published.

Issue #531 closed #closed-531

06 Jun at 16:06:45 GMT

grammar production LambdaParams has "(" and ")" incorrectly under the choice

Issue #532 closed #closed-532

06 Jun at 16:06:44 GMT

fix error in LambdaParams rule

Issue #534 closed #closed-534

06 Jun at 16:06:30 GMT

530: escape solidus in JSON

Issue #535 closed #closed-535

06 Jun at 16:06:18 GMT

Editorial: add an entry to the changes appendix

Pull request #538 created #created-538

06 Jun at 13:25:12 GMT
480: Attempt to allow xs:string to be 'promoted to' xs:anyURI

I think it might be a little cheeky to call it "promotion" in both directions, but this really has more to do with a kind of conversion, so I'm willing to let it slide.

If accepted, this PR resolves ACTION QT4CG-035-03.

Issue #537 closed #closed-537

05 Jun at 16:27:07 GMT

Editorial: present F&O examples as tables.

Pull request #537 created #created-537

05 Jun at 14:28:56 GMT
Editorial: present F&O examples as tables.

This is a purely editorial change to the stylesheet that formats examples in the F&O spec; it changes the presentation to be a two-column table containing expressions and results. The intention is to reduce clutter and to improve the readability where code samples (either expressions or results) need multi-line rendition.

In the vast majority of cases the change is clearly (IMHO) an improvement, but further tweaking is possible:

  • There may be scope for tailoring the CSS (for example, I don't like the fact that table cells are centred vertically).
  • Some of the tables (e.g. parse-uri examples) take too much horizontal space; the code should be edited to reduce the line length
  • Examples are sometimes introduced with a free-standing paragraph tag rather than with fo:preamble, which separates the introduction from the code into a separate table row.
  • Some of the examples that were designed for inline rendition could usefully take advantage of the opportunity to turn them into multi-line code samples.

QT4 CG meeting 037 draft agenda #agenda-06-06

05 Jun at 09:20:00 GMT

Draft agenda published.

Issue #536 created #created-536

03 Jun at 15:41:40 GMT
Re: Mathematical Operator Symbols

#460

=!> is not mentioned in A 3.3

Also, I do not think it makes sense to allow mixing both characters in one operator, like in <<. It combines the disadvantages of << and << without any advantages

Pull request #535 created #created-535

03 Jun at 14:22:06 GMT

Editorial: add an entry to the changes appendix

Pull request #534 created #created-534

31 May at 17:27:30 GMT
530: escape solidus in JSON

Add escape-solidus serialization parameter for the JSON output method.

Pull request #533 created #created-533

31 May at 15:58:57 GMT
413: Spec for CSV parsing with fn:parse-csv()

This is a spec proposal for fn:parse-csv() from #413.

I've tried to cover off most of what was discussed in that issue, but I have avoided dealing with backslash escapes (per @ChristianGruen's early comment), sticking with the RFC 4180 quoting approach.

There are some issues with the structure where I tried to follow the existing structure of chapter 15, but that leaves the function definition in 15.4 separated by a lot of text before the wider format discussion in 15.7. The split between function def and context affects the JSON and HTML parsing functions too, so I have avoided trying to fix that as well in this PR.

If this meets with approval, I'll squash commits and rebase before merging.
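
For readers unfamiliar with the RFC 4180 quoting approach mentioned above, here is a small Python sketch using the standard csv module. It only illustrates the quoting rule (a quote inside a quoted field is written as two quotes, and a backslash has no special meaning); it is not the proposed fn:parse-csv() and says nothing about its options or result structure. The helper name parse_csv_rows is hypothetical.

```python
import csv
import io


def parse_csv_rows(text):
    """Parse CSV text with RFC 4180 style quoting and no backslash escapes."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=",",
        quotechar='"',
        doublequote=True,   # "" inside a quoted field means one literal quote
        escapechar=None,    # backslash is an ordinary character
    )
    return [row for row in reader]


sample = 'name,quote\r\n"Smith, J.","He said ""hi"""\r\nplain,a\\b\r\n'
rows = parse_csv_rows(sample)
```

The quoted comma stays inside its field, the doubled quotes collapse to one, and the backslash in the last row is kept literally.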

Pull request #532 created #created-532

31 May at 07:30:03 GMT
fix error in LambdaParams rule

fixes https://github.com/qt4cg/qtspecs/issues/531#issue-1733539612

LambdaParam | "(" | (LambdaParam ("," LambdaParam)*)? | ")"

should be:

LambdaParam | ( "(" (LambdaParam ("," LambdaParam)*)? ")" )

Issue #531 created #created-531

31 May at 07:14:11 GMT
grammar production LambdaParams has "(" and ")" incorrectly under the choice

The grammar production

LambdaParams ::= LambdaParam | "(" | (LambdaParam ("," LambdaParam)*)? | ")"

should be

LambdaParams ::= LambdaParam | ("(" (LambdaParam ("," LambdaParam)*)? ")")
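
To see why the grouping matters, the corrected production can be approximated with a regular expression. This is a Python sketch under simplifying assumptions: NAME is a stand-in for VarName, and is_lambda_params is a hypothetical helper. Under the erroneous rule a lone "(" or ")" would be a complete LambdaParams; under the corrected rule it is not.

```python
import re

NAME = r"[A-Za-z_][\w.-]*"           # simplified stand-in for VarName
PARAM = rf"\$\s*{NAME}"              # LambdaParam ::= "$" VarName

# LambdaParams ::= LambdaParam | ("(" (LambdaParam ("," LambdaParam)*)? ")")
LAMBDA_PARAMS = re.compile(
    rf"\s*(?:{PARAM}|\(\s*(?:{PARAM}(?:\s*,\s*{PARAM})*)?\s*\))\s*"
)


def is_lambda_params(text):
    """Return True if text matches the corrected LambdaParams production."""
    return LAMBDA_PARAMS.fullmatch(text) is not None
```

With this sketch, "$x", "()" and "($x, $y)" are accepted, while "(", ")" and "$x, $y" are rejected.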

Issue #530 created #created-530

31 May at 06:00:48 GMT
Escaping of forward slash in JSON output method

The fact that we escape forward slash in the JSON output method has proved unpopular with quite a few users.

The rationale for doing it is discussed at https://stackoverflow.com/questions/1580647/json-why-are-forward-slashes-escaped

The short summary is that it's only needed when JSON is inserted into an HTML script element, and then only when immediately followed by >.

There's a workaround using character maps but it's really clumsy.

I don't want to escape forward slashes by default in the xdm-to-json() function because they appear so often in namespaces and it adds an awful lot of visual clutter. That means that our JSON formatter needs an option to suppress this escaping, which means we might as well provide user control over it as an output property...

Adding a new output property just for this purpose is rather heavyweight both in the specs and in our implementation, but I can't think of a better solution. So I propose adding escape-solidus, values yes/no, default (for compatibility) yes.

Yes, someone will ask about the name. Surely no-one calls it a solidus? Well, Unicode does, and I think we should use the official name. And it avoids arguing about whether it should be slash or forwards-slash or forward-slash. At least I'm not proposing virgule.
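
A minimal Python sketch of what such a switch does (this is an illustration only, not the serializer; the function name to_json and its keyword are hypothetical stand-ins for the proposed escape-solidus property). Python's json.dumps never escapes the solidus, so the escaped form is produced by a replacement on the serialized text, which is safe because "/" can only occur inside JSON strings.

```python
import json


def to_json(value, escape_solidus=True):
    """Serialize value as JSON, optionally writing "/" as the escape "\\/"."""
    text = json.dumps(value)
    if escape_solidus:
        text = text.replace("/", "\\/")
    return text


data = {"ns": "http://example.com/"}
escaped = to_json(data)                        # default: yes
plain = to_json(data, escape_solidus=False)    # visual clutter removed
```

Both forms parse back to the same value; only the textual representation differs.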

Issue #367 closed #closed-367

30 May at 22:19:55 GMT

Focus for RHS of thin arrow expressions

QT4 CG meeting 036 draft minutes #minutes-05-30

30 May at 17:15:00 GMT

Draft minutes published.

Issue #524 closed #closed-524

30 May at 16:17:45 GMT

503: reinstate focus functions

Issue #525 closed #closed-525

30 May at 16:17:00 GMT

520: add function identity and use it in deep-equal

Issue #519 closed #closed-519

30 May at 16:16:26 GMT

237: Revise tokenisation appendix

Issue #527 closed #closed-527

30 May at 16:16:02 GMT

Editorial: more corrections to F+O examples

Pull request #529 created #created-529

29 May at 11:22:35 GMT
528 fn:elements-to-maps

Revises the detail of the proposed json() function, including a name change to xdm-to-json().

The proposed changes give us a starting point for implementation, but I would expect there might be further tweaks to the spec once we try applying the function to real examples.

QT4 CG meeting 036 draft agenda #agenda-05-30

29 May at 07:53:00 GMT

Draft agenda published.

Issue #528 created #created-528

28 May at 20:08:08 GMT
fn:elements-to-maps (before: Review of the fn:json() function)

I've been writing tests for the fn:json function, whose spec I haven't read for quite a while, so it's an opportunity (a) to request WG review of the spec, and (b) for some minor comments.

  1. I think a better name for the function might be fn:to-json. Any other suggestions?
  2. Where we specify that a JSON object should be output with particular properties, I think we should be consistent about whether or not we prescribe the order. Writing tests is a lot easier if the order is always prescribed!
  3. Document nodes: it would be better to output both the document URI and the base URI where available.
  4. Under "The children of the element are processed as follows" there are four rules. In the case where an element has just one element node child, I think rule 4 should apply rather than rule 3.
  5. Under Processing-Instruction nodes: typo "A JSON object with a two properties".
  6. The section starting "Strings are escaped as follows" should be promoted up a level.
  7. Representing functions: I propose a different set of rules. (a) for a function that is a reference to a built-in or user-defined function definition, output "Q{uri}local#arity". (b) for an anonymous function, output "#anonymous-function". The rationale is that the JSON output here isn't going to be useful except as a placeholder to indicate that a function item is present.
  8. We might want to be more prescriptive about how numbers are formatted (or to provide user options)
  9. Like many XML-to-JSON libraries, there's the problem that two instances of the same element type might be output very differently depending on which children are present. For example the representation of a book with two authors might be very different from a book with one author. I would suggest that rather than the boolean element-map option, we allow the options to include a list of element names for which object representation rather than array representation is to be used.

Pull request #527 created #created-527

28 May at 10:24:27 GMT
Editorial: more corrections to F+O examples

Fixes errors in the fn:replace examples, updates elsewhere to reflect changes to the syntax of lambda expressions and focus functions.

Issue #526 created #created-526

26 May at 21:22:22 GMT
load-xquery-module() needs changes to account for functions with an arity range

The spec of load-xquery-module() assumes that each function declaration in a query module has a single integer arity; this doesn't allow for default parameters which mean it now has an arity range.

Because the returned map contains function items, which always have a fixed arity, I think it needs to contain one entry for each arity in the arity range. This involves evaluating the defaults for any parameters that have a default value defined; if the default value is context dependent, this is going to have to use the context of the load-xquery-module() function call, which isn't very meaningful, but I can't see what else to do.

An alternative is to only include one function item in the result, corresponding to the maximum arity.

If we introduce sequence-variadic functions, the arity range becomes infinite, which makes both of these ideas problematic. But presumably sequence-variadic functions will be callable with all the values supplied in a single array?
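
The first option (one entry for each arity in the arity range) can be sketched in Python as follows. This is only an analogy: Python evaluates defaults when the function is defined, so the question of which context to use for default values does not arise here. The names entries_per_arity and greet are hypothetical.

```python
import inspect


def entries_per_arity(func):
    """Return {arity: callable}, one entry for each arity in func's arity range."""
    params = list(inspect.signature(func).parameters.values())
    required = sum(1 for p in params if p.default is inspect.Parameter.empty)

    def fixed_arity(arity):
        def call(*args):
            if len(args) != arity:
                raise TypeError(f"expected {arity} arguments, got {len(args)}")
            return func(*args)   # omitted trailing parameters take their defaults
        return call

    return {arity: fixed_arity(arity) for arity in range(required, len(params) + 1)}


def greet(name, greeting="Hello", punctuation="!"):
    return f"{greeting}, {name}{punctuation}"


table = entries_per_arity(greet)
```

The alternative mentioned above would keep only the entry for the maximum arity, here table[3].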

Pull request #525 created #created-525

26 May at 09:27:33 GMT
520: add function identity and use it in deep-equal

I believe this PR resolves issues:

  • issue #520 - function identity
  • issue #333 - equality of function items
  • issue #381 - deep-equal comparison without errors

The PR introduces a concept of function identity in the data model, and for all expressions that create functions, explains what the identity of the returned function is.

The concept of function identity is used initially in two places: in fn:deep-equal(), when the operands include function items; and in the F+O prose defining the concept of determinism, which in turn is relied on by the definition of memo functions in XSLT.

I had hoped to go further and address issue #119, generalising what kinds of values are allowed as keys in maps, but as explained in a comment on that issue, I hit obstacles.

Pull request #524 created #created-524

25 May at 10:09:47 GMT
503: reinstate focus functions

This PR reinstates "focus functions", using the syntax function{EXPR} rather than ->{EXPR}. If accepted, this resolves issue #503.

Issue #523 created #created-523

24 May at 23:28:50 GMT
Dealing with component name conflicts with library packages

Override with visibility='hidden'

<override>
   <template name='foo' visibility='hidden'/>
</override>

This change allows the using package to override a component without running into a potential naming conflict with another component in the using package or in another used package. Because the visibility is hidden, the component is not invokable from the using package.

Accept with alias

<accept component='template' names='foo' aliases='fu'/>

This change allows the using package to accept components but with a different name.

Issue #522 closed #closed-522

24 May at 07:48:58 GMT

function-catalog.xml: Original line endings reverted (modified by GitHub’s 'direct edit' feature)

Pull request #522 created #created-522

24 May at 07:48:39 GMT
function-catalog.xml: Original line endings reverted (modified by GitHub’s 'direct edit' feature)

I learned it’s NOT advisable to use GitHub’s features to directly edit files in the browser. The newlines of the original function-catalog.xml seem to have been changed.

This PR restores the original newlines. Sorry for that.

Pull request #521 created #created-521

23 May at 18:50:56 GMT

518: Add transitive-closure() function

QT4 CG meeting 035 draft minutes #minutes-05-23

23 May at 17:20:00 GMT

Draft minutes published.

Issue #504 closed #closed-504

23 May at 16:18:56 GMT

Merge map:keys and map:keys-where

Issue #515 closed #closed-515

23 May at 16:18:54 GMT

504: Merge map:keys and map:keys-where

Issue #396 closed #closed-396

23 May at 16:15:20 GMT

333: Deep-equal, no failure when comparing functions

Issue #520 created #created-520

23 May at 11:58:37 GMT
Function identity

To make deep-equal error-free for all arguments (issue #333), and to support the introduction of sets (issue #34), we need to be able to test whether two functions are "the same function". This is a proposed pragmatic solution.

We change the data model for functions so that functions, like nodes, have an identity that is acquired when the function is created; two functions are identical if and only if they have the same identity.

In general any expression that returns a new function allocates it an identity that is different from all other existing functions (as with nodes). However:

  • Repeated evaluation of a function reference such as count#1 returns the same function each time, provided that the target function is context-free.
  • Optimizers are allowed to rewrite expressions (for example by loop-lifting, etc) so that expressions that would in principle return distinct functions actually return the same function, provided the optimizer can determine that the two functions are equivalent in all respects other than their identity. For example if the expression contains(?, 'xxx') appears in a loop, the expression can be lifted out of the loop so there is no requirement that it returns different functions each time (as there is with nodes)

Benefits of this approach:

  • identical($x, $x) is always true (function identity survives binding to variables)
  • functions obtained by repeated evaluation of the same expression in the same context are likely to return identical results in cases that are simple enough for an optimizer to analyse
  • the results are likely to be reasonably intuitive
  • optimisers aren't constrained by rules on identity to restrict the rewrites they can attempt

This does mean that expressions that return functions become a little impure - but only in the same way that expressions that create nodes are a little impure. The impurity is well understood and tolerated.

Maps and arrays do not have identity as a property separate from their content.
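
Python function objects behave in a broadly similar way, which may help in picturing the proposal. The sketch below is an analogy only: it says nothing about what an XPath optimizer may or may not merge, and make_adder is a hypothetical example function.

```python
def make_adder(n):
    """Each call creates a new function object with its own identity."""
    return lambda x: x + n


f = make_adder(1)
g = make_adder(1)
alias = f

survives_binding = alias is f    # identity survives binding to another variable
distinct_creation = f is g       # separate evaluations give distinct functions
named_reference = len is len     # a reference to a named function is stable
```

The functions f and g are equivalent in every respect other than identity, which is the situation in which the proposal would allow an optimizer to return the same function.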

Pull request #519 created #created-519

23 May at 10:50:52 GMT
237: Revise tokenisation appendix

This PR is an extensive revision to the rules for tokenisation that corrects a number of errors:

  • The problem mentioned in issue 237, namely the lack of clarity in the "longest token rule". This PR fixes this by clarifying what this rule means and where it applies. In particular it tackles the issue of "complex terminals" such as element constructor expressions and string templates where a symbol that is a single token at one level (in the sense that whitespace is constrained) also contains enclosed expressions.
  • Some productions/tokens were misclassified or omitted from the relevant lists of tokens in the appendix. This has been fixed in part by using general rules in the grammar2spec stylesheet to generate lists of tokens, rather than relying on annotations in the grammar file.

The PR includes changes to the grammar2spec stylesheet.

Issue #518 created #created-518

22 May at 16:45:24 GMT
transitive-closure() function

I've just found myself writing, yet again, a transitive closure function, and I feel we could add this to the spec.

I'm afraid it's another case where we really need set operations and therefore a universal equality operator. For the moment I'll just define it over nodes, which shelves the problem.

fn:transitive-closure($start as node()*, $step as function($node as node()) as node()*) as node()*

returns the set of all nodes reachable from a node in $start by zero or more applications of the $step function, in document order with duplicates removed.

Can probably define it formally something like

let $next-iteration := $start =!> $step()
return if (empty($next-iteration except $start))
           then $start
           else transitive-closure($start | $next-iteration, $step)
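
The same idea as a Python sketch over hashable values rather than nodes (so there is no document order, and a set is returned). Unlike the recursive formulation above, this version applies the step function only to newly reached items, which gives the same result. The graph used in the example is hypothetical.

```python
def transitive_closure(start, step):
    """Return every item reachable from start by zero or more applications of step."""
    reached = set(start)
    frontier = set(start)
    while frontier:
        next_items = set()
        for item in frontier:
            next_items.update(step(item))
        frontier = next_items - reached   # keep only items not seen before
        reached |= frontier
    return reached


edges = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
closure = transitive_closure(["a"], lambda node: edges.get(node, []))
```

The cycle a, b, c terminates because the frontier becomes empty once no new items are found.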

Issue #517 created #created-517

21 May at 01:43:18 GMT
fn:chain (before: fn:multi-compose)

FO: fn:multi-compose : Evaluate a chain of functions

As per Wikipedia:

"In mathematics, function composition is an operation ∘ that takes two functions f and g, and produces a function h = g ∘ f such that h(x) = g(f(x)).

In this operation, the function g is applied to the result of applying the function f to x. That is, the functions f : X → Y and g : Y → Z are composed to yield a function that maps x in domain X to g(f(x)) in codomain Z.

Intuitively, if z is a function of y, and y is a function of x, then z is a function of x.

The resulting composite function is denoted g ∘ f : X → Z, defined by:

(g ∘ f)(x) = g(f(x)) for all x in X"


In this Proposal we generalize function composition to the case when a sequence of functions are composed together, so that the last one is applied on an argument $x, then the last-but-one is applied on the result of this application, and so on … until finally the first function in the sequence is applied on the result produced so far.

This is an effective way of chaining a sequence of functions together, and we don’t need to invent or use any special operators or syntax, but we just pass this sequence of functions as argument to fn:multi-compose.

fn:multi-compose := function($funs as function(*)*, $x)

Here is an XPath 3.0 implementation of fn:multi-compose:

let $apply := function($f, $x) {fn:apply($f, [$x])},
    $multi-compose := function($funs as function(*)*, $x)
                  {
                    fold-right($funs, $x, $apply)
                  },
    (: The functions $incr and $times are needed just to show this example :) 
    $incr := function($x) {op("+")(?, $x)},
    $times := function($y) {op("*")(?, $y)}                 
                  
 return
   $multi-compose(($times(5), $incr(1)), 2)

As wanted, the result of evaluating this is

15: (2 + 1) * 5

Remarks

  1. In this implementation the type of the 2nd (last) argument of $multi-compose and $apply is item()* (any) and as such it is omitted. If the function to be applied first needs more than one argument, all of its arguments must be presented in the function call as a single sequence, and are passed (in order) as the members of a single array, as already implemented by the standard fn:apply.

  2. It is a dynamic error if any of the function applications produces a result which does not belong to the Domain of the function immediately preceding it in the function sequence.
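
For comparison, the same right-to-left chaining can be sketched in Python with functools.reduce. This is an illustration of the idea rather than of the proposed function; multi_compose, times and incr are Python stand-ins for the names used above.

```python
from functools import reduce


def multi_compose(funs, x):
    """Apply the functions in funs to x from right to left."""
    # Folding over the reversed list applies the last function first.
    return reduce(lambda acc, f: f(acc), reversed(funs), x)


def times(y):
    return lambda v: v * y


def incr(x):
    return lambda v: v + x


result = multi_compose([times(5), incr(1)], 2)   # (2 + 1) * 5
```

An empty list of functions returns the argument unchanged, which corresponds to the identity composition.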

Issue #181 closed #closed-181

20 May at 18:32:46 GMT

HOF Sequence Functions with Positional Arguments

Issue #516 created #created-516

20 May at 18:12:06 GMT
Add position argument to HOF callbacks

The coercion rules now allow a supplied function item to have lower arity than the signature of the declared type; the effect is that the information supplied in the additional arguments is ignored.

One of the intended use cases for this was to allow existing higher-order functions to be extended while retaining backwards compatibility. For example, in fn:filter, we can change the required type of the predicate function from function(item()) as xs:boolean to function(item(), xs:positiveInteger) as xs:boolean, with the second argument supplying the position of the item being tested. A function that isn't interested in the position can just ignore it, so existing calls will continue to work.

I propose that we add a position argument to the callbacks for:

fn:filter
fn:for-each
fn:for-each-pair
fn:partition
fn:items-after
fn:items-before
fn:items-starting-where
fn:items-ending-where
array:filter
array:for-each
array:for-each-pair

Other candidates include

fn:all
fn:some
fn:index-where
array:index-where

but I suggest we leave these unless someone can think of a use case.
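
A Python sketch of the backwards-compatibility argument: the caller always has both the item and its position available, and a callback that declares only one parameter simply never sees the position. Python has no function coercion, so the sketch inspects the callback's arity explicitly; filter_with_position is a hypothetical name.

```python
import inspect


def filter_with_position(items, predicate):
    """Filter items, passing the 1-based position if the predicate accepts it."""
    arity = len(inspect.signature(predicate).parameters)
    result = []
    for position, item in enumerate(items, start=1):
        args = (item, position)[:arity]   # drop the position for arity-1 callbacks
        if predicate(*args):
            result.append(item)
    return result


old_style = filter_with_position([10, 20, 30, 40], lambda item: item > 15)
new_style = filter_with_position(["a", "b", "c", "d"], lambda item, pos: pos % 2 == 1)
```

Existing one-argument predicates keep working, while two-argument predicates can select by position.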

QT4 CG meeting 035 draft agenda #agenda-05-23

20 May at 10:15:00 GMT

Draft agenda published.

Issue #471 closed #closed-471

20 May at 10:04:45 GMT

Unify formatting (function calls, code blocks, quotes) in the specification

Issue #511 closed #closed-511

20 May at 10:04:43 GMT

471: <code> elements, simple/typographic quotes

Pull request #515 created #created-515

20 May at 09:57:04 GMT

504: Merge map:keys and map:keys-where

Issue #514 created #created-514

20 May at 08:44:03 GMT
Lambda expression: Annotations

Edit 2023-05-21: Feedback was incorporated.

In the current grammar rules, there are subtle differences in the InlineFunctionExpr and LambdaExpr rules that we should dissolve.

Annotations are not supported in lambda expressions, which I believe is an unnecessary restriction:

(: currently legal :)
let $delete-texts := %updating function($nodes) { delete nodes $nodes//text() }
return $delete-texts(//city)

(: currently illegal :)
let $delete-texts := %updating ($nodes) -> { delete nodes $nodes//text() }
return $delete-texts(//city)

It should suffice to extend one rule in the grammar:

(: old :)
LambdaExpr  ::=  LambdaParams "->" EnclosedExpr
(: new :)
LambdaExpr  ::=  Annotation* LambdaParams "->" EnclosedExpr

We could also allow type declarations (as @michaelhkay has indicated below, though, this might not be as simple to realize as I hoped):

(: currently legal :)
let $find-john := function($node as node()) as xs:boolean { contains($node, 'john') }
return $find-john($node)

(: currently illegal :)
let $find-john := ($node as node()) as xs:boolean -> { contains($node, 'john') }
return $find-john($node)

The type declarations cannot be allowed if parentheses are omitted (unless we make them mandatory):

(: without parens :)
$i -> { ... }

(: parens :)
($i as xs:int) as xs:int -> { ... }

These are the current grammar rules:

FunctionItemExpr    ::=  NamedFunctionRef | InlineFunctionExpr | LambdaExpr
InlineFunctionExpr  ::=  Annotation* "function" FunctionSignature FunctionBody
FunctionSignature   ::=  "(" ParamList? ")" TypeDeclaration?
ParamList           ::=  Param ("," Param)*
Param               ::=  "$" EQName TypeDeclaration?
FunctionBody        ::=  EnclosedExpr
LambdaExpr          ::=  LambdaParams "->" EnclosedExpr
LambdaParams        ::=  LambdaParam | "(" | (LambdaParam ("," LambdaParam)*)? | ")"
LambdaParam         ::=  "$" VarName

As the InlineFunctionExpr and LambdaExpr both generate anonymous functions, we shouldn’t make a difference, and this is what I would recommend:

FunctionItemExpr    ::=  NamedFunctionRef | InlineFunctionExpr
InlineFunctionExpr  ::=  Annotation* (InlineFunction | LambdaFunction) FunctionBody
InlineFunction      ::=  "function" FunctionSignature
LambdaFunction      ::=  (Param | FunctionSignature) "->"
FunctionSignature   ::=  "(" ParamList? ")" TypeDeclaration?
ParamList           ::=  TypedParam ("," TypedParam)*
TypedParam          ::=  Param TypeDeclaration?
Param               ::=  "$" VarName

Disclaimer: I could have raised this earlier, but I didn’t want to prolong the ongoing discussion on the open pull requests.

Issue #513 created #created-513

20 May at 08:00:24 GMT
Arrow operator: Inline functions without parens

See also https://github.com/qt4cg/qtspecs/issues/435#issuecomment-1508228624: If an inline function expression is used as the right-hand operand of the arrow operators, parentheses must be used:

(: now :)
$seq => (function($x) { ... })()
(: desirable :)
$seq => function($x) { ... }()

This could be changed by adding the InlineFunctionExpr to the ArrowDynamicFunction rule:

[115]  SequenceArrowTarget     ::=  "=>" ((ArrowStaticFunction ArgumentList) | (ArrowDynamicFunction PositionalArgumentList))
[151]  ArrowStaticFunction     ::=  EQName
[152]  ArrowDynamicFunction    ::=  VarRef | ParenthesizedExpr | InlineFunctionExpr
[142]  ArgumentList            ::=  "(" ((PositionalArguments ("," KeywordArguments)?) | KeywordArguments)? ")"
[143]  PositionalArgumentList  ::=  "(" PositionalArguments? ")"

It will be best to tackle this after we’ve resolved #503, and we’ll have to check if the simplification doesn’t cause ambiguities.

Issue #53 closed #closed-53

19 May at 20:01:03 GMT

Allow function keyword inline functions without parameters

Issue #436 closed #closed-436

19 May at 20:00:53 GMT

Allow inline function expressions in arrow operator call chains

Issue #435 closed #closed-435

19 May at 20:00:35 GMT

Remove the inlined function expression variant of the thin arrow operator

Pull request #512 created #created-512

19 May at 14:47:33 GMT
256: Context for default function parameter expressions

This is a renewed attempt to tackle issue 256, which concerns how to define the static and dynamic context for default value expressions for optional function parameters in XQuery and XSLT. The resolution is to define the static and dynamic context for these expressions in detail.

To make this work, some refinement of the static and dynamic context definitions is needed:

  • default collation is moved from the static context to the dynamic context, with a note that it is always known statically except in the case when defining the default for a function parameter.
  • static base URI (in the static context) and executable base URI (in the dynamic context) are now formally separated; previously we fudged this by saying they could be different, but without recognizing separate context components
  • the base URI for resolving relative collation URIs is now implementation defined. This allows implementors to use either the compile-time or run-time base URI, or some other URI defined using a processor API.

Pull request #511 created #created-511

19 May at 13:10:45 GMT
471: <code> elements, simple/typographic quotes

That was a work-intensive one, as expected, but I’m optimistic that the PR improves the overall situation. I’ll be happy to see subsequent PRs if I missed something (e.g., I didn’t touch ebnf.xml).

Closes #471

Issue #509 closed #closed-509

19 May at 09:08:43 GMT

471 (2): Remove more fn: prefixes

Issue #510 closed #closed-510

19 May at 09:08:28 GMT

471 (3): Render false/true/NaN/INF/-INF/+INF as code

Issue #375 closed #closed-375

19 May at 08:48:02 GMT

256: Context for default parameter values

Issue #507 closed #closed-507

19 May at 08:06:13 GMT

125: Rename array:partition as fn:partition

Issue #505 closed #closed-505

18 May at 17:25:58 GMT

418: Correct and expand an XSLT example

Issue #447 closed #closed-447

18 May at 17:24:32 GMT

435, 53, 436: lambda expressions, thin arrows

Issue #410 closed #closed-410

18 May at 16:57:55 GMT

Converting doubles to decimals, fractional digits

Issue #455 closed #closed-455

18 May at 16:57:52 GMT

410: Converting doubles to decimals, fractional digits

Issue #483 closed #closed-483

18 May at 16:53:32 GMT

452: window: make 'start' and 'when' optional

Pull request #510 created #created-510

18 May at 16:14:22 GMT
471 (3): Render false/true/NaN/INF/-INF/+INF as code

NaN, INF, -INF and +INF were easy; boolean values were trickier:

  • I used <code> for “The result/value/option/property is true/false”, “is set to true/false” and similar.
  • I didn’t tag “This is true/false”, “The condition is true/false” and similar.

I hope there won’t be too many conflicts if this is directly merged after #509.

Pull request #509 created #created-509

18 May at 12:38:02 GMT
471 (2): Remove more fn: prefixes

I’ve removed additional fn: prefixes from examples and eg code blocks. I have kept prefixes in the rules and formal code snippets untouched.

In the initial comment of #471, I have listed the remaining cleanups for which I want to prepare PRs. I’ll wait until this and possibly some other PRs have been merged.

Issue #508 created #created-508

17 May at 12:53:56 GMT
New Map & Array Functions: Inconsistencies

XQFO 3.1

…provides the following functions/constructs for creating and accessing maps & arrays:

Maps | Singleton Maps
:--- | :---
Decompose | –
Compose | map:merge($maps)
Create single | map:entry($key, $value)<br>map { $key: $value }
Extract keys | map:keys($map)
Extract values (flat) | $map?*

Arrays | Singleton Arrays
:--- | :---
Decompose | –
Compose | array:join($arrays)
Create single | [ $value ]
Extract values (flat) | –

XQFO 4.0 Draft

…provides new functions for singletons and map representations:

Maps | Singleton Maps | Pairs (Key-Value Pair Maps)
:--- | :--- | :---
Decompose | map:entries($map) | map:pairs($map)
Compose | map:merge($maps) | map:of($pairs)
Create single | map:entry($key, $value)<br>map { $key: $value } | –<br>map { 'key': $key, 'value': $value }
Extract keys | map:keys($map) | $pairs?key
Extract values (flat) | map:values($map)<br>$map?* | $pairs?value

Arrays | Singleton Arrays | Members (Value Maps)
:--- | :--- | :---
Decompose | – | array:members($array)
Compose | array:join($arrays) | array:of($members)
Create single | [ $value ] | array { 'value': $value }
Extract values (flat) | array:values($array) | $members?value

The following terminology can be derived from the function names:

  • A key-value pair map with a single map pair is called a Pair.
  • A value map with a single array member is called a Member.
  • A singleton map is called an Entry (due to map:entry)
  • We have no name for a singleton array.

Complete the Picture

I believe we should:

  1. rename map:of to map:of-pairs or map:merge-pairs (as a hint that singletons are not the expected input)
  2. rename array:of to array:of-members or array:join-members (as a hint…)
  3. add map:pair for creating a single pair
  4. add array:split (array:tokenize, …?) for decomposing arrays to singleton arrays

I’m not sure if we should add array functions for creating singletons or value maps; we also have array:build.
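
To make the distinction between entries and pairs concrete, here is a Python sketch using dictionaries as stand-ins for maps. The function names mirror the draft functions (map:entries, map:pairs, map:merge, map:of) but are hypothetical Python helpers, not the specified functions.

```python
def entries(m):
    """Decompose a map into singleton maps, one per key (cf. map:entries)."""
    return [{key: value} for key, value in m.items()]


def pairs(m):
    """Decompose a map into key-value pair maps (cf. map:pairs)."""
    return [{"key": key, "value": value} for key, value in m.items()]


def merge(singletons):
    """Compose a map from singleton maps (cf. map:merge)."""
    result = {}
    for single in singletons:
        result.update(single)
    return result


def of_pairs(pair_maps):
    """Compose a map from key-value pair maps (cf. map:of)."""
    return {pair["key"]: pair["value"] for pair in pair_maps}


source = {"a": 1, "b": 2}
```

Each decomposition has exactly one composing function that reverses it, which is the symmetry the renaming suggestions above are meant to make visible.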

Pull request #507 created #created-507

17 May at 11:38:06 GMT
125: Rename array:partition as fn:partition

Reworked this PR to deal with merge conflicts. Technical change was already accepted. Made a correction to the "equivalent expression" published in the spec, which has now been tested (and was found wanting...)

Issue #454 closed #closed-454

17 May at 09:23:54 GMT

125: array:partition

Issue #506 created #created-506

17 May at 09:08:06 GMT
fn:error: parameter names

We should rename the $error-object parameter to $value, as it will be bound to $err:value later on:

try {
  error(value := 123)
} catch * {
  $err:value
}

Next, “object” is rarely used in the specs.

Issue #467 closed #closed-467

17 May at 08:20:49 GMT

map:keys-where: Return Keys That Match a Predicate

Pull request #505 created #created-505

17 May at 08:14:09 GMT
418: Correct and expand an XSLT example

Makes a further correction to an example identified in issue #418, and extends the example with an alternative solution.

Issue #504 created #created-504

17 May at 07:46:32 GMT
Merge map:keys and map:keys-where

I propose that we merge map:keys and map:keys-where into a single function, with map:keys#1 behaving like it does now, and map:keys#2 taking over from map:keys-where#2. Effectively the default for the second argument becomes true#0.
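
The effect of the proposed merge can be sketched in Python. This is only a model under stated assumptions: a map is a dict, the function name map_keys is hypothetical, and the default predicate accepts every value (the proposal expresses this default as true#0). Python preserves insertion order, which the XDM map does not promise.

```python
def map_keys(m, predicate=lambda value: True):
    """Return the keys whose values satisfy predicate; all keys by default."""
    return [k for k, v in m.items() if predicate(v)]

numbers = {0: "zero", 1: "one", 2: "two", 3: "three"}

# One-argument call: behaves like today's map:keys.
print(map_keys(numbers))

# Two-argument call: takes over the role of map:keys-where.
print(map_keys(numbers, lambda s: s == "two"))
```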

Issue #30 closed #closed-30

17 May at 06:16:28 GMT

Improve the discoverability and parseability of the mathematical operator symbols

Issue #204 closed #closed-204

17 May at 06:16:02 GMT

Non-ascii alternative operator symbols

Issue #460 closed #closed-460

17 May at 06:15:24 GMT

Mathematical Operator Symbols

Issue #443 closed #closed-443

17 May at 06:12:59 GMT

@select on xsl:matching-substring and xsl:non-matching-substring

Issue #32 closed #closed-32

17 May at 06:12:16 GMT

try/catch: New variable for all error information

Issue #452 closed #closed-452

17 May at 06:11:12 GMT

window: make 'start' and 'when' optional

Issue #503 created #created-503

16 May at 23:09:58 GMT
Reinstate focus functions

As a result of accepting PR #447, we have lost the ability to write simple "focus functions" that take the context item as an implicit argument, for example sort(//emp, (), ->{@salary}).

The new status quo is that people have to write sort(//emp, (), $e->{$e/@salary}) which feels clumsy in comparison.

This issue examines options for reinstating such a capability, and perhaps making it more powerful.

A reason for dropping the syntax was that it didn't play well with the "thin arrow" operator in pipelines, but we have now changed the symbol for that to =!> so the objection no longer applies so strongly.

Ideally we want something that not only replaces focus functions (arity-one functions accepting an argument of type item()), but also meets some or all of the following additional requirements:

  • Works well on the RHS of the => and =!> operators, in a construct that we might write as $list =!> {.+1}().

  • Also allows arity-one functions whose argument is a sequence (item()*)

This becomes a lot easier if we can solve issue #129 which generalises the context item to a context value. Let's assume we do that, and keep an open mind for the moment as to whether the generalized context value is referenced as . or ~. I'll use ~ for now. So we want a compact notation for functions of arity one in which the function body refers to the argument value as ~. For aesthetic reasons, because it's going to be used on the RHS of an arrow operator, we really don't want to introduce it with a leading arrow like the previous syntax ->{.+1}. Use of "bare braces" (simply {~+1}) is very tempting, but I think there is a good argument for leaving that part of the syntactic space unused, for extensibility and for diagnostics. I think my preference is for fn{~+1}. Using a keyword (such as map, array, validate) before a braced expression is a uniform device and keeps the grammar coherent.

So in a callback such as fn:sort, we can write sort(//emp, (), fn{@salary}), and in a pipeline we can write $list =!> fn{.+1}(). (To allow this, all we need to do is to generalise what's allowed as an ArrowDynamicFunction).
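
The idea of a function whose single argument serves as the context value can be modelled in Python. This is a rough sketch, not the proposed semantics: the employee records are made-up stand-ins for //emp, the helper map_arrow is hypothetical, and it assumes that =!> applies the function to each item of the left-hand sequence in turn.

```python
# Hypothetical records standing in for //emp elements with a salary attribute.
emps = [
    {"name": "A", "salary": 300},
    {"name": "B", "salary": 100},
    {"name": "C", "salary": 200},
]

def salary(ctx):
    """Model of fn{@salary}: the single argument plays the role of the context value."""
    return ctx["salary"]

def add_one(ctx):
    """Model of fn{.+1}."""
    return ctx + 1

def map_arrow(items, f):
    """Model of `$list =!> f()`, assuming f is applied to each item in turn."""
    return [f(item) for item in items]

sorted_emps = sorted(emps, key=salary)       # cf. sort(//emp, (), fn{@salary})
incremented = map_arrow([1, 2, 3], add_one)  # cf. $list =!> fn{.+1}()

print([e["name"] for e in sorted_emps])
print(incremented)
```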

A separate question is whether we can (and should) allow the empty argument list to be omitted. I think I'm persuaded by the arguments that it's better to keep it, as a visual signal that the function is being applied, not just returned.

QT4 CG meeting 034 draft minutes #minutes-05-16

16 May at 17:33:00 GMT

Draft minutes published.

Issue #478 closed #closed-478

16 May at 16:17:23 GMT

467: map:keys-where

Issue #481 closed #closed-481

16 May at 16:15:48 GMT

When we have array:build and map:build, then why do we also need array:of and map:of ?

Issue #466 closed #closed-466

16 May at 16:12:50 GMT

460: Fix math symbols

Issue #487 closed #closed-487

16 May at 16:12:38 GMT

485: Predeclare the prefixes math, map, array, and err

Issue #489 closed #closed-489

16 May at 16:12:24 GMT

443: Allow select attribute on xsl:[non-]matching-substring

Issue #492 closed #closed-492

16 May at 16:12:06 GMT

Fix examples, change filepath definition slightly

Issue #493 closed #closed-493

16 May at 16:11:53 GMT

32: try/catch: New variable for all error information

Issue #500 closed #closed-500

16 May at 09:15:16 GMT

Fix errant typographic quotes in XPath Data Model

Issue #502 closed #closed-502

16 May at 09:15:15 GMT

Fix typographic quotes in XPath Data Model

Pull request #502 created #created-502

16 May at 08:22:47 GMT
Fix typographic quotes in XPath Data Model

Close #500

On closer inspection, there were only a few places where typographic quotes were not used in prose. I've fixed those. I think the DM spec could probably use an editorial pass to add code around some literals, but I'm not doing that in this PR.

I've left typographic quotes around code and literals because I don't think straight quotes would be an improvement: the literal “<code>3</code>” doesn't need straight quotes because the quotes are not part of the literal.

Issue #501 created #created-501

15 May at 09:05:31 GMT
Error handling: Rethrow errors; finally block

Re-throw errors

In https://github.com/qt4cg/qtspecs/pull/493, a function/expression was suggested to re-throw errors:

try {
  (: wild stuff :)
} catch * {
  module:log($err:description),
  rethrow($err:map)
}

Alternatives

  • Use and extend the existing error function: fn:error(rethrow := $err:map)
  • Use an expression: throw $err:map

In principle, the error information can also be constructed by the user. If we extend fn:error to also accept a map, it could be used to both throw and re-throw errors:

try {
  1 + <empty/>
} catch * {
  (: ... :)
  fn:error($err:map)
}

Missing information in the map could be added as if fn:error is raised.

let $map := map { 'column-number': 12, 'line-number': 3 }
return fn:error(xs:QName('oob'), 'Out of bounds', map := $map)

Finally clause

It can be helpful to have a code block that is always executed, even if errors occur:

let $tmp := file:create-temp-file()
return try {
  (: I/O stuff :)
} finally {
  file:delete($tmp)
}
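
For comparison, both ideas exist in other languages. The following Python sketch shows a handler that logs and then re-raises the original error unchanged, together with a cleanup block that always runs. The function process is a hypothetical placeholder for the I/O work and always fails here; nothing in the sketch is a proposal for the XQuery syntax.

```python
import logging
import os
import tempfile

created = []  # records temp file paths so the cleanup can be inspected afterwards

def process(path):
    """Hypothetical placeholder for the I/O work; always fails in this sketch."""
    raise ValueError("Out of bounds")

def run():
    fd, tmp = tempfile.mkstemp()
    os.close(fd)
    created.append(tmp)
    try:
        process(tmp)
    except Exception as err:
        logging.error("failed: %s", err)  # cf. module:log($err:description)
        raise                             # re-throw with the original error information
    finally:
        os.remove(tmp)                    # always executed, cf. finally { file:delete($tmp) }

try:
    run()
except ValueError as err:
    print("caller still sees:", err)
print("temp file removed:", not os.path.exists(created[0]))
```

The bare raise keeps the original error object, which corresponds to passing the complete $err:map rather than rebuilding the error from selected fields.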

Issue #500 created #created-500

15 May at 08:46:49 GMT
Fix errant typographic quotes in XPath Data Model

In a comment on #471, @ChristianGruen observes that there are some errant typographic quotes in code examples in the XPath Data Model specification. I assume these are errors, most likely on my part, and should be corrected.

Issue #499 closed #closed-499

15 May at 08:21:01 GMT

Use natural language sort order for glossary

Pull request #499 created #created-499

14 May at 18:50:41 GMT
Use natural language sort order for glossary

This is a small stylesheet change which has the effect that the glossary in the XQuery specification (and elsewhere) now uses natural language sort order, so upper-case terms like Gregorian and NaN now appear in their proper alphabetic sequence.
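
The difference between the two orderings can be illustrated in Python. This is only an approximation: the actual change is in an XSLT stylesheet, the term list below is made up for illustration, and case folding is used here as a simple stand-in for a natural-language collation.

```python
# Illustrative glossary terms; not the actual glossary content.
terms = ["atomic value", "NaN", "Gregorian", "item", "node"]

# Codepoint order sorts all upper-case initials before lower-case ones.
codepoint_order = sorted(terms)

# Case-insensitive order places upper-case terms in their alphabetic position.
natural_order = sorted(terms, key=str.casefold)

print(codepoint_order)
print(natural_order)
```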

QT4 CG meeting 034 draft agenda #agenda-05-12

12 May at 17:10:00 GMT

Draft agenda published.

Issue #48 closed #closed-48

12 May at 12:20:36 GMT

Create a schema-for-xslt40.xsd file for the current draft spec.

Issue #494 closed #closed-494

12 May at 09:39:10 GMT

Remove legacy materials from the working master branch

Issue #495 closed #closed-495

12 May at 09:07:55 GMT

separator example in https://qt4cg.org/specifications/xslt-40/Overview-diff.html#for-each-separator has xsl:sequence-of instead of xsl:sequence element

Issue #498 closed #closed-498

12 May at 09:07:54 GMT

Fix typo, replace sequence-of with sequence

Pull request #498 created #created-498

12 May at 08:07:34 GMT
Fix typo, replace sequence-of with sequence

Fix #495

Issue #497 created #created-497

12 May at 08:03:21 GMT
https://qt4cg.org/specifications/xpath-functions-40/Overview-diff.html#func-map-pairs has wrong function syntax order

In https://qt4cg.org/specifications/xpath-functions-40/Overview-diff.html#func-map-pairs the explanation of the new function map:pairs is given as:

map:for-each($map, ($k, $v) -> {map{"key":$k, "value":$v}})

I think the right syntax would be map:for-each($map, -> ($k, $v) {map{"key":$k, "value":$v}}).

Issue #496 closed #closed-496

12 May at 08:01:03 GMT

Ignore legacy directories entirely

Pull request #496 created #created-496

12 May at 08:00:49 GMT
Ignore legacy directories entirely

This PR is supposed to fix the action that builds PRs so that it ignores directories we never edit.

Issue #495 created #created-495

12 May at 07:37:30 GMT
separator example in https://qt4cg.org/specifications/xslt-40/Overview-diff.html#for-each-separator has xsl:sequence-of instead of xsl:sequence element

While looking through the XSLT 4 draft spec, I have found the following example in https://qt4cg.org/specifications/xslt-40/Overview-diff.html#for-each-separator:

<xsl:for-each select="6, 3, 9" separator="|">
   <xsl:sort select="."/>
   <xsl:sequence-of select="., .+1"/>
</xsl:for-each>

xsl:sequence-of should be xsl:sequence.

Pull request #494 created #created-494

11 May at 17:41:22 GMT
Remove legacy materials from the working master branch

This is intended to be an entirely uninteresting change. This PR removes a whole bunch of historical artifacts from the master branch, things like the requirements and use-cases documents that we aren't maintaining for QT4, the errata which don't apply to QT4, etc.

I will push a separate branch, legacy-documentation, to the repository that contains all of the files removed by this PR so that they aren't lost and can easily be recovered. (I won't do that as a PR, I'll just push it to the repository.)

In the meantime, I think this trimmed down master branch works just fine and it's a lot simpler and easier to explain.

This PR does remove support for the legacy ant builds, but I doubt they've worked for a while now.

Pull request #493 created #created-493

11 May at 13:24:24 GMT

32: try/catch: New variable for all error information

Pull request #492 created #created-492

11 May at 12:09:45 GMT
Fix examples, change filepath definition slightly

This PR fixes the examples in the parse-uri() function. It also makes a small change to the filepath property, eliding it when the scheme is known not to be file.

Pull request #491 created #created-491

11 May at 11:37:09 GMT
Fix more examples in the FO 4.0 spec

Further corrections to example code in the F+O specification, found by testing (app-spec-examples in the test suite).

Issue #490 created #created-490

10 May at 22:03:16 GMT
Control over schema validation in parse-xml(), doc(), etc.

I'm struggling with a problem with the stylesheet that generates QT4 tests from the examples in the function catalog, and I think it's an example of a more general problem in schema-aware processing.

The spec gives this example (for json-to-xml):

The expression json-to-xml('{"x": "\\", "y": "\u0025"}', map{'escape': true()}) returns 
(with whitespace added for legibility):

<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string escaped="true" key="x">\\</string>
  <string key="y">%</string>
</map>

But the test we actually generate expects the result:

<map xmlns="http://www.w3.org/2005/xpath-functions">
    <string escaped="true" key="x" escaped-key="false">\\</string>
    <string key="y" escaped="false" escaped-key="false">%</string>
</map>

and the test is failing because the result produced by Saxon correctly excludes the escaped-key="false" attributes which the test is expecting. How did the attributes get there?

The answer is that the stylesheet is doing parse-xml() followed by some transformation to normalise whitespace, followed by serialize(). The parse-xml() call is invoking schema validation, which adds default attributes.

We probably don't want schema validation here; if we do want it, we probably don't want default attribute values to be expanded. But parse-xml() doesn't give us the choice. It says it's implementation-defined and it gives no options for the user to control it. Saxon provides configuration-level options but they aren't fine-grained enough to use here.

Without being able to control this, the only option seems to be for the stylesheet to transform the result to take out the defaulted attributes that the schema processor has added.

We need options on functions like doc() and parse-xml() to control whether and how schema validation is performed.

One of the options we need whenever we do validation is probably "validate+strip" - validate the input, report errors if it's invalid, but return the untyped data that was supplied to the validator, not the type-annotated data with expanded defaults.
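
The workaround of removing the defaulted attributes after the fact can be sketched in Python with the standard library. This is a minimal sketch, not the test-generation stylesheet: it assumes the only schema defaults involved are escaped="false" and escaped-key="false", and the helper name strip_defaults is hypothetical.

```python
import xml.etree.ElementTree as ET

# The result after validation has expanded default attributes (from the example above).
validated = (
    r'<map xmlns="http://www.w3.org/2005/xpath-functions">'
    r'<string escaped="true" key="x" escaped-key="false">\\</string>'
    r'<string key="y" escaped="false" escaped-key="false">%</string>'
    r'</map>'
)

# Assumption: these are the only attributes the schema processor adds by default.
DEFAULTS = {"escaped": "false", "escaped-key": "false"}

def strip_defaults(root):
    """Remove attributes whose value equals the assumed schema default."""
    for elem in root.iter():
        for name, default in DEFAULTS.items():
            if elem.get(name) == default:
                del elem.attrib[name]
    return root

root = strip_defaults(ET.fromstring(validated))
for child in root:
    print(child.attrib, child.text)
```

The sketch also shows the weakness of the workaround: it cannot tell an attribute that the author wrote explicitly as escaped="false" from one that validation added, which is one reason an option on the parsing function itself would be preferable.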

Pull request #489 created #created-489

10 May at 10:50:01 GMT

443: Allow select attribute on xsl:[non-]matching-substring

Issue #488 closed #closed-488

10 May at 10:46:30 GMT

433: Allow select attribute on xsl:[non-]matching-substring

Pull request #488 created #created-488

10 May at 09:26:40 GMT
433: Allow select attribute on xsl:[non-]matching-substring

Allow a select attribute on xsl:[non-]matching-substring in place of the contained sequence constructor.

Issue #484 closed #closed-484

10 May at 08:01:11 GMT

Update FO test generation stylesheet

Issue #486 closed #closed-486

10 May at 08:00:44 GMT

Fix some errors in examples, as revealed by testing

Pull request #487 created #created-487

10 May at 07:59:28 GMT
485: Predeclare the prefixes math, map, array, and err

In 3.1 XQuery processors were allowed to predeclare these prefixes; in 4.0 they are now required to do so.

Pull request #486 created #created-486

09 May at 23:22:28 GMT
Fix some errors in examples, as revealed by testing

Corrects errors in examples; changes other examples to make them testable. Further test failures remain to be investigated (some may be bugs in the Saxon implementation; others require improvements to the test generation mechanism).

Issue #485 created #created-485

09 May at 20:30:27 GMT
Predeclared namespaces in XQuery

XQuery defines that the prefixes xml, xs, xsi, fn, and local are predeclared, and states:

Additional predeclared namespace prefixes may be added to the statically known namespaces by an implementation.

I propose that we add map, array, and math to this list, so that these can be used interoperably without pre-declaring them. It is already permitted for an implementation to do this, but it is not required. The change is backwards compatible, because user-defined namespace declarations override predeclared declarations.

Pull request #484 created #created-484

09 May at 19:56:48 GMT
Update FO test generation stylesheet

Updates the stylesheet that generates tests from examples in the FO spec; plus supply a missing record definition in the function catalog so that it becomes ID/IDREF valid.

Issue #63 closed #closed-63

09 May at 18:31:51 GMT

fn:slice, array:slice: Signatures, Examples

Issue #477 closed #closed-477

09 May at 18:31:20 GMT

63: array:slice (editorial)

Issue #29 closed #closed-29

09 May at 18:31:12 GMT

array:values (resolved: map:values, map:entries)

Issue #473 closed #closed-473

09 May at 18:27:21 GMT

NaN ne NaN

Issue #321 closed #closed-321

09 May at 18:26:18 GMT

relax $input in fn:serialize

Issue #325 closed #closed-325

09 May at 18:25:46 GMT

Operator precedence table needs updating

Issue #482 closed #closed-482

09 May at 18:24:30 GMT

473: NaN Comparisons (bug fix)

Issue #476 closed #closed-476

09 May at 18:24:05 GMT

29: array:values

Issue #475 closed #closed-475

09 May at 18:23:52 GMT

471: fn: prefix removed from function calls in the examples

Issue #472 closed #closed-472

09 May at 18:23:41 GMT

321: Add new note and examples demonstrating adaptive serialization method

Issue #468 closed #closed-468

09 May at 18:23:21 GMT

325 Update operator precedence table

Issue #462 closed #closed-462

09 May at 18:22:25 GMT

434: Added examples for parse-integer()

Pull request #483 created #created-483

08 May at 09:05:31 GMT

452: window: make 'start' and 'when' optional

Pull request #482 created #created-482

07 May at 20:05:18 GMT
473: NaN Comparisons (bug fix)

Drops the incorrect statement suggesting that NaN xx NaN is always false, for all six operators xx. In fact NaN ne NaN is true, as statements elsewhere in the spec make clear. Specifically, the operator mapping appendix of the XPath/XQuery language spec makes clear that X ne Y maps to not(op:numeric-equal(X, Y)).
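
The same behaviour can be observed in any language whose floating-point comparisons follow IEEE 754. A short Python check, offered only as an analogy to the XPath operators (the operator names in the dictionary are labels chosen to match the XPath keywords):

```python
import math
import operator

nan = math.nan

# The six value comparison operators, applied to NaN on both sides.
comparisons = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}
results = {name: op(nan, nan) for name, op in comparisons.items()}
print(results)

# ne is the negation of eq, mirroring not(op:numeric-equal(X, Y)).
print(results["ne"] == (not results["eq"]))
```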

Issue #481 created #created-481

07 May at 18:43:24 GMT
When we have array:build and map:build, then why do we also need array:of and map:of ?

Looking at the current specification of the pairs of functions: (array:build, array:of) and (map:build, map:of), it is impossible not to notice that the second function in each of these pairs is a weak duplicate of the first.

Also, the examples provided for array:build and array:of, seem to have a good deal of common content / duplication / overlap.

Another issue is that array:of requires as input a sequence of value records, whose definition is challenging to understand (and whose meaning seems to be solely to represent a sequence of sequences), and what is also really challenging is how to construct this argument to array:of. If this is unnatural and challenging, one would probably prefer to use just array:build.

Is there an example where it is possible to construct an array (or a map) with array:of (or with map:of) but it is impossible (or significantly more difficult) to construct the same array/map with the function array:build (or with map:build)?

If there are no such significant and convincing examples, then why do we need the xxx:of functions?

Thus the question naturally arises: "Why is the function xxx:of necessary at all?"

Issue #480 created #created-480

07 May at 18:22:51 GMT
Allow type promotion of xs:string to xs:anyURI

If it hasn't already been considered and ruled out, I'd like to propose adding a type promotion rule to XPath 4 that would allow xs:string to be type-promoted to xs:anyURI, so that functions with parameters whose types are declared as xs:anyURI can directly take xs:string values, without having to first cast these to xs:anyURI.

This would empower function authors to select xs:anyURI as a type - signaling that they’re expecting a URI - without forcing users of the function into explicitly casting their string-typed URIs.

The motivation behind this proposal is that many eXist users are frustrated when using the eXist extension functions that properly declare parameters as xs:anyURI. If this proposal isn’t adopted, that’s ok; we’ll just eliminate the use of xs:anyURI in our functions, as proposed in https://github.com/eXist-db/exist/issues/4632. But this would be a bit unfortunate for authors of functions who see the use of xs:anyURI as a proper expression of intent in their functions.

The change would be to https://www.w3.org/TR/xpath-31/#promotion - and I guess would be a 3rd item, called "String type promotion", saying:

A value of type xs:string can be promoted to the type xs:anyURI. The result of this promotion is created by casting the original value to the type xs:anyURI.

Issue #479 created #created-479

07 May at 11:33:05 GMT
fn:deep-equal: Input order

#383 is about the specific order of children of element nodes.

I think we should also provide an option to ignore the top-level order of the input items:

(: returns true: both input arguments contain the same items, but in a different order :)
deep-equal(
  (1 to 10),
  reverse(1 to 10),
  map { 'unordered': true() }
)

(: returns false: the compared elements are different :)
deep-equal(
  <a><b/><c/></a>,
  <a><c/><b/></a>,
  map { 'unordered': true() }
)

(: returns false: the second sequence contains duplicates :)
deep-equal(
  (1, 2),
  (2, 1, 1),
  map { 'unordered': true() }
)
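
The intended semantics of the three examples amount to a multiset comparison of the top-level items. A minimal Python sketch, assuming items can be modelled as hashable values and elements as nested (name, children) tuples in which child order still matters; the function name and the 'unordered' behaviour shown are a model of the proposal, not an existing API:

```python
from collections import Counter

def deep_equal_unordered(seq1, seq2):
    """Compare two sequences as multisets: top-level order is ignored, duplicates count."""
    return Counter(seq1) == Counter(seq2)

# Elements modelled as (name, children) tuples; the order of children is significant.
a1 = ("a", (("b", ()), ("c", ())))
a2 = ("a", (("c", ()), ("b", ())))

print(deep_equal_unordered(range(1, 11), reversed(range(1, 11))))  # same items, other order
print(deep_equal_unordered([a1], [a2]))                            # elements differ
print(deep_equal_unordered((1, 2), (2, 1, 1)))                     # duplicates differ
```

Counter relies on Python equality and hashing, which is a simplification of the item comparison that fn:deep-equal performs.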

Pull request #478 created #created-478

07 May at 10:53:10 GMT

467: map:keys-where

Pull request #477 created #created-477

07 May at 10:04:42 GMT

63: array:slice (editorial)

Pull request #476 created #created-476

07 May at 09:59:33 GMT

29: array:values

Issue #423 closed #closed-423

05 May at 17:49:05 GMT

[XSLT 4.0] 2.2 Notation is incomplete

Pull request #475 created #created-475

05 May at 16:02:17 GMT
471: fn: prefix removed from function calls in the examples

#471: I’ve removed the fn: prefixes from the function calls in the examples.

I have left pretty much all true/false strings untouched, since I’m not sure what would be the most consistent approach to clean them up. It will be better anyway to create a separate PR for that.

Issue #474 closed #closed-474

05 May at 13:14:02 GMT

Per comments on #465, improve presentation of multi-line expressions

Issue #465 closed #closed-465

05 May at 12:16:12 GMT

80: fn:iterate-while: Examples revised

Pull request #474 created #created-474

05 May at 11:52:32 GMT
Per comments on #465, improve presentation of multi-line expressions

This PR addresses points raised in the comments on #465.

If an fos:expression element is a code block, nest an eg inside it:

               <fos:expression><eg><![CDATA[let $input := 3936256
return fn:iterate-while(
  $input,
  function($result) { abs($result * $result - $input) >= 0.0000000001 },
  function($guess) { ($guess + $input div $guess) div 2 }
)]]></eg></fos:expression>

I found four examples where an fos:expression contained more than one newline and I added eg wrappers in those cases.

There are many more fos:expression elements that contain a single newline, but automatically formatting them as code blocks was often unsuccessful. Many of those cases seem to be just newlines entered for authoring convenience.

I also fixed the CSS for code blocks and attempted to remove trailing newlines from code blocks.

Issue #473 created #created-473

04 May at 17:18:56 GMT
NaN ne NaN

It seems that ever since 2.0, the section in Functions and Operators "Comparison Operators on Numeric Values" (currently §4.3) has stated "If either, or both, operands are NaN, false is returned."

This is incorrect. If the operator is ne, then the correct result is true.

(And editorially, the first two commas in this sentence should be dropped).

Pull request #472 created #created-472

04 May at 15:21:17 GMT
321: Add new note and examples demonstrating adaptive serialization method

Per Issue 321, I've added a new note and two additional simple examples noting the adaptive serialization method to draw attention to this feature in the existing specs.

Issue #471 created #created-471

04 May at 13:03:07 GMT
Unify formatting (function calls, code blocks, quotes) in the specification

Todos (2023-05-18):

  • [x] Initial cleanup of fn: prefixes → #475
  • [x] Remove more fn: prefixes → #509
    • fn: prefixes in examples that raise an error.
    • fn: prefixes in eg code blocks
    • other documents: expressions.xml, query-examples.xml, …
  • [x] Render false, true, NaN, INF, +INF as code → #510
  • [x] Render string values as code and use quotes: "yes", "true", "0" → #511
  • [x] Omit quotes for single characters: \b, \f, … → #511
  • [x] Rewrite simple to typographic quotes → #511

Inspired by https://github.com/qt4cg/qtspecs/pull/454#issuecomment-1534633089 ff.

The syntax of the examples in the XQFO specification is inconsistent. Sometimes, functions in the standard function namespace have an fn prefix…

  • fn:fold-right(1 to 5, "", fn:concat(?, ".", ?))
  • fn:substring("motor car", 6)

…sometimes they don’t…

  • data(123)
  • concat("http://www.example.com/", encode-for-uri("~bébé"))

…sometimes it’s both:

  • fn:fold-right(1 to 5, "$zero", concat("$f(", ?, ", ", ?, ")"))
  • fn:concat(01, 02, 03, 04, true())
  • fn:tokenize(fn:unparsed-text($href), '\r\n|\r|\n')[not(position()=last() and .='')]

Should we drop or keep the prefix – or doesn’t it really matter? If there’s interest, I can create a PR (I’d tend to drop the prefixes).

In addition, there doesn’t seem to be a consistent rule for representing booleans. We have:

Syntax | Comment
--- | ---
…returns false | mostly used in the rules (seems appropriate to me)
…returns false | used in the rules; maybe we should replace it with the first syntax?
…the result is fn:false() | used in the rules; maybe we should replace it with the first syntax?
…returns false() | mostly used in the examples (seems appropriate to me)

Pull request #470 created #created-470

04 May at 11:34:57 GMT
369: add fixed-prefixes attribute in XSLT

A solution to some of the problems identified in issue #369. This proposal affects XSLT only.

Issue #469 created #created-469

04 May at 07:09:33 GMT
array:of-members, map:of-pairs: Signatures, Examples

Just trivia:

a) The parameter name of array:of-members is $input. $members may be a better choice (or we should change map:of-pairs($pairs) to map:of-pairs($input)).

b) The type of $pairs is record(key as xs:anyAtomicType, value as item()*, *)*. Shouldn’t it be record(key as xs:anyAtomicType, value as item()*)* (without the trailing , *)? If the current syntax is correct, an explanatory comment could be helpful.

c) ~One map:of example needs to be fixed: map:of((map:entry(0, "no"), map:entry(1, "yes"))).~ See #607

See #508 for the proposal to rename map:of to map:of-pairs.

Pull request #468 created #created-468

03 May at 21:59:06 GMT
325 Update operator precedence table

Add "otherwise" and thin arrow to the table. Editorial.

Issue #467 created #created-467

03 May at 11:09:26 GMT
map:keys-where: Return Keys That Match a Predicate

Edit, 23/05/17: Reopened to discuss map:keys($map, $predicate) as an alternative.


Motivation

We have fn:index-where and array:index-where to locate items/members in a sequence/an array that match a specific predicate, and we could introduce an equivalent function for maps. A recent use case can be found in https://github.com/qt4cg/qtspecs/issues/413#issuecomment-1531288167d.

Proposal

Summary

Returns keys of map entries for which the value matches a supplied predicate.

Signature

map:keys-where(
  $map        as map(*),
  $predicate  as function(item()*) as xs:boolean
) as xs:anyAtomicType*

Properties

This function is ·deterministic·, ·context-independent·, and ·focus-independent·.

Rules

The function takes any ·map· as its $map argument and applies the supplied function to the value of each map entry. The function supplied as $predicate takes the value of the corresponding map entry as an argument, and the result is a sequence containing the keys of those entries for which the function returns true.

More formally, the function returns the result of the expression:

map:for-each(
  $map,
  function($key, $value) {
    if($predicate($value)) then $key else ()
  }
)

Examples

let $numbers := map { 0: 'zero', 1: 'one', 2: 'two', 3: 'three' }
return map:keys-where($numbers, function($string) { $string = 'two' })

Comments

  • Edit (2023-05-04): Renamed from map:key-where to map:keys-where.
  • Similar functions (index-of, index-where) use the singular form.
  • An alternative would be to add an optional $predicate function argument to map:keys.
  • If we decide to introduce a shorter syntax (see #129 and #436), we could have:
map:keys-where($numbers, { . = 'two' })

Pull request #466 created #created-466

02 May at 23:36:04 GMT
460: Fix math symbols

(1) drops the mathematical operator symbols appendix, which allowed an extensive range of non-ASCII characters as synonyms for language keywords, (2) retains × and ÷ as synonyms for multiplication and division, (3) allows full-width < and > in operator symbols in place of the usual ASCII characters, to avoid the need for XML escaping.

QT4 CG meeting 033 draft minutes #minutes-05-02

02 May at 17:20:00 GMT

Draft minutes published.

Issue #449 closed #closed-449

02 May at 16:27:49 GMT

Actions from review of PR #420

Issue #456 closed #closed-456

02 May at 16:26:55 GMT

Revises numeric literal syntax

Issue #458 closed #closed-458

02 May at 16:26:42 GMT

Update parse-integer and format-integer following review

Issue #224 closed #closed-224

02 May at 16:25:46 GMT

Infrastructure changes/improvements

Issue #461 closed #closed-461

02 May at 16:25:45 GMT

Make code more visually distinct

Pull request #465 created #created-465

02 May at 12:37:49 GMT
80: fn:iterate-while: Examples revised

This PR is editorial. I’ve reformatted the examples for fn:iterate-while to make them more readable.

QT4 CG meeting 033 draft agenda #agenda-05-02

01 May at 17:10:00 GMT

Draft agenda published.

Issue #464 created #created-464

29 Apr at 21:01:45 GMT
Serialization sequence normalization step 3 needs clarification

The specifications currently read:

If the item-separator serialization parameter is absent, then for each subsequence of adjacent strings in S2, copy a single string to the new sequence equal to the values of the strings in the subsequence concatenated in order, each separated by a single space. Copy all other items to the new sequence. Otherwise, copy each item in S2 to the new sequence, inserting between each pair of items a string whose value is equal to the value of the item-separator parameter. The new sequence is S3.

As written ("If...then.... Otherwise...."), this implies that the process whereby a block of adjacent strings is joined into a single string is performed only when the item-separator parameter is absent. I.e., if the item-separator parameter is not absent, it will not be used to string-join adjacent groups of strings. Perhaps that is as intended, but I want to make sure.

Also, as written, this implies that when the parameter is absent, the sequence begins by finding all subsequences of adjacent strings, performing concatenation. Then all non-strings are added to the sequence. That seems wrong, because it appears to advise the processor to rearrange the sequence of input items.

I propose a revision along these lines:

Copy each item in S2 to a new sequence. If a given pair of adjacent items are both strings, then separate them with a string whose value is equal to the value of the item-separator parameter or is a single space if the item-separator parameter is absent. If a given pair of adjacent items are not both strings, insert between each pair of items a string whose value is equal to the value of the item-separator parameter. Once this is finished, take each adjacent group of strings and concatenate them into a single string. The new sequence is S3.

My revision is based upon what I imagine happens, but implementers will know better than I.

I am working on some editorial touch-ups of the Serialization specifications, and can incorporate comments/suggestions made in this thread in that larger enterprise.
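
One possible reading of the current wording quoted above can be sketched in Python. This is an interpretation, not the normative algorithm: strings are modelled as str, every other item as an opaque object, and the function name normalize_step3 is hypothetical. In this reading the input order is preserved, and adjacent strings are merged only when the item-separator parameter is absent.

```python
def normalize_step3(s2, item_separator=None):
    """Build S3 from S2 under one reading of the current step 3."""
    s3 = []
    if item_separator is None:
        run = []  # the current run of adjacent strings
        for item in s2:
            if isinstance(item, str):
                run.append(item)
            else:
                if run:
                    s3.append(" ".join(run))  # close the run with single spaces
                    run = []
                s3.append(item)               # copy every other item unchanged
        if run:
            s3.append(" ".join(run))
    else:
        for i, item in enumerate(s2):
            if i > 0:
                s3.append(item_separator)     # separator between each pair of items
            s3.append(item)
    return s3

node = object()  # stands in for any non-string item
print(normalize_step3(["a", "b", node, "c"]))
print(normalize_step3(["a", "b", node], item_separator="|"))
```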

Issue #463 created #created-463

28 Apr at 20:09:30 GMT
fn:parts() - extract the parts of a (not-really) atomic value

We have a whole raft of functions to extract the parts of date, time, and duration values: month-from-dateTime(), etc etc.

These aren't particularly convenient to use, for example getting multiple components of a duration is clumsy; and there are gaps, for example there are no functions to extract the parts of a gMonthDay.

I propose a general-purpose function fn:parts() which turns any of these composite atomic values into a map, enabling you to replace a call on month-from-dateTime($value) with parts($value)?month.

So far this duplicates existing functionality, perhaps with a bit of added convenience. However, the mechanism is much more extensible and flexible than what we have now:

  • we can apply it to atomic types that currently have no decomposition operators, such as gMonthDay
  • we can easily add additional components such as day-of-week or quarter or day-of-year or julian-day that are currently not available, or only available clumsily using format-dateTime().
  • the parts() function is polymorphic, so the same code can be used to get the year (say) from a date, a dateTime, or a gYearMonth.
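
To make the idea concrete, here is a rough Python analogue of such a function. It is only a sketch under stated assumptions: the key names ("day-of-week", "quarter", and so on) are illustrative rather than anything proposed in the issue, and Python's date and datetime stand in for xs:date and xs:dateTime.

```python
from datetime import date, datetime


def parts(value):
    """Decompose a date-like value into a map of named components (illustrative keys)."""
    # datetime is a subclass of date, so it has to be tested first.
    if isinstance(value, datetime):
        result = parts(value.date())
        result.update(
            {"hours": value.hour, "minutes": value.minute, "seconds": value.second}
        )
        return result
    if isinstance(value, date):
        return {
            "year": value.year,
            "month": value.month,
            "day": value.day,
            # Derived components that are clumsy to obtain today.
            "day-of-week": value.isoweekday(),          # Monday = 1 ... Sunday = 7
            "day-of-year": value.timetuple().tm_yday,
            "quarter": (value.month - 1) // 3 + 1,
        }
    raise TypeError(f"no decomposition defined for {type(value).__name__}")
```

The same lookup, parts(v)["year"], works for both a date and a dateTime, which is the polymorphism the last bullet describes.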

Pull request #462 created #created-462

27 Apr at 23:19:43 GMT
434: Added examples for parse-integer()

Supplemental examples related to PR #434.

Pull request #461 created #created-461

27 Apr at 11:56:50 GMT
Make code more visually distinct

Close #224

Thanks @ChristianGruen for the reminder that this was still open!

Issue #460 created #created-460

27 Apr at 10:57:22 GMT
Mathematical Operator Symbols

Appendix B.3 of the specification proposes a set of non-ASCII symbols that can be used in place of language keywords, for example "∃" for "some" and "∀" for "every".

I haven't detected a great deal of enthusiasm for this idea, and I can see it causing some confusion, partly because Unicode offers such a wide choice of symbols some of which are visually very similar.

I propose retaining a much smaller set of these symbols:

  • "÷" (xF7) for "div" because the symbol is widely recognised and "div" here is pretty ugly

  • "≺" (x227A) and "≻" (x227B) as alternatives to "<" and ">" in all operator symbols (other than XML markup contexts) that use these characters: for readability in contexts, especially XSLT, where the "<" and ">" characters need to be escaped

Issue #4 closed #closed-4

27 Apr at 10:32:00 GMT

[XPath] [XQuery] Better names for ThinArrowTarget and FatArrowTarget

Issue #59 closed #closed-59

27 Apr at 10:27:07 GMT

[FO] fn:replace no longer has the 3 and 4 argument variants

Issue #459 created #created-459

27 Apr at 07:29:30 GMT
Eager and lazy evaluation

In #359, different approaches were discussed for eager and lazy evaluation. This issue could be used to

  1. clarify if we have the same notion of eagerness and laziness, and
  2. define language constructs for eager/lazy evaluation.

Pull request #458 created #created-458

26 Apr at 17:35:11 GMT
Update parse-integer and format-integer following review

Following review and acceptance of the parse-integer and format-integer functions, make changes suggested during the review. See actions QT4CG-032-03 to -06.

Issue #457 created #created-457

26 Apr at 06:32:41 GMT
Support parsing numeric, alphabetic, and additive number systems.

This proposal is based on the work done in https://www.w3.org/TR/css-counter-styles-3/ when defining CSS rules for formatting the numbers in list items.

The idea is to define 3 parsing strategies:

  1. numeric -- number-like systems such as decimal;
  2. alphabetic -- alphabetical-like systems such as spreadsheet columns (A, B, ..., Z, AA, AB, ...);
  3. additive -- systems like Roman and Hebrew, where each symbol represents a fixed value and the values are added together.

Parsing these, we have 3 properties:

  1. system as enum("numeric", "alphabetic", "additive") := "numeric" -- which of the parsing strategies (number systems) to use;
  2. symbols as xs:string := "0123456789" -- the list of characters used to represent a digit;
  3. additive-symbols as map(xs:integer, xs:string) := map {} -- a map of the symbols in an additive system with the corresponding value of that symbol.

Consideration 1 -- Should these also allow any whitespace and optional "+"/"-" symbols like the radix-based parse-integer?

Consideration 2 -- Should we define decimal format options for these, so the decimal format name can format/represent other number systems (binary, hex, Hebrew, Tamil, Roman numerals, etc.)? -- Note: this would make system, symbols, and additive-symbols properties of the decimal format object with the above defaults. The formatting would work in the same way as it is defined in the CSS Counter Styles specification.

Design 1 -- Separate functions

fn:parse-numeric-integer($value as xs:string,
                         $symbols as xs:string := "0123456789") as xs:integer

fn:parse-alphabetic-integer($value as xs:string,
                            $symbols as xs:string := "ABCDEFGHIJKLMNOPQRSTUVWXYZ") as xs:integer

fn:parse-additive-integer($value as xs:string,
                          $additive-symbols as map(xs:integer, xs:string)) as xs:integer

Design 2 -- Combined functions

fn:parse-integer($value as xs:string,
                 $system as xs:string := "numeric",
                 $symbols as xs:string := "0123456789",
                 $additive-symbols as map(xs:integer, xs:string) := map {}) as xs:integer

fn:parse-integer($value as xs:string,
                 $radix as xs:integer) as xs:integer
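
The three strategies can be sketched in Python as follows. This is an illustration under assumptions, not the proposed definition: the function names mirror Design 1 but the behaviour is guessed from the CSS Counter Styles systems, the ROMAN map is sample data, signs and whitespace (Consideration 1) are not handled, and the additive parser accepts non-canonical spellings such as "IIII".

```python
def parse_numeric(value, symbols="0123456789"):
    """Positional system: each symbol's index is its digit value."""
    if not value:
        raise ValueError("empty input")
    base = len(symbols)
    total = 0
    for ch in value:
        total = total * base + symbols.index(ch)  # ValueError on unknown symbol
    return total


def parse_alphabetic(value, symbols="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Bijective system with no zero digit: A=1 ... Z=26, AA=27, AB=28."""
    if not value:
        raise ValueError("empty input")
    base = len(symbols)
    total = 0
    for ch in value:
        total = total * base + symbols.index(ch) + 1
    return total


def parse_additive(value, additive_symbols):
    """Additive system: consume symbols from the largest weight down, summing weights."""
    if not value:
        raise ValueError("empty input")
    total = 0
    pos = 0
    for weight in sorted(additive_symbols, reverse=True):
        symbol = additive_symbols[weight]
        while symbol and value.startswith(symbol, pos):
            total += weight
            pos += len(symbol)
    if pos != len(value):
        raise ValueError(f"unparsed input at offset {pos}")
    return total


# Sample additive map (weight -> symbol), in the shape of the proposed map(xs:integer, xs:string).
ROMAN = {1000: "M", 900: "CM", 500: "D", 400: "CD", 100: "C", 90: "XC",
         50: "L", 40: "XL", 10: "X", 9: "IX", 5: "V", 4: "IV", 1: "I"}
```

With these definitions, parse_additive("MCMXCIV", ROMAN) walks the weights 1000, 900, 90 and 4 in turn.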

Pull request #456 created #created-456

25 Apr at 18:55:27 GMT
Revises numeric literal syntax

Following actions from review on 25 April 2023 (QT4CG-032-02), revises the new syntax of numeric literals to disallow trailing underscores. Also adds more notes and examples.

QT4 CG meeting 032 draft minutes #minutes-04-25

25 Apr at 17:30:00 GMT

Draft minutes published.

Issue #429 closed #closed-429

25 Apr at 16:32:44 GMT

Hexadecimal and binary literals

Issue #241 closed #closed-241

25 Apr at 16:32:09 GMT

Functions integer-to-string and string-to-integer with radix

Issue #434 closed #closed-434

25 Apr at 16:31:49 GMT

Functions to parse and format hex integers

Issue #433 closed #closed-433

25 Apr at 16:31:09 GMT

429 Add hex and binary literals and allow underscores

Pull request #455 created #created-455

25 Apr at 13:40:00 GMT
410: Converting doubles to decimals, fractional digits

@michaelhkay In this PR, I tried to undo the changes that were introduced to make comparisons transitive. I haven’t made any changes to distinct-values and group by, because I am uncertain if I have spotted all the relevant parts of the specification. Maybe/hopefully we can address them in a next step.

Any feedback is welcome.

Issue #293 closed #closed-293

24 Apr at 12:48:20 GMT

Error in fn:doc-available specification

Issue #430 closed #closed-430

24 Apr at 12:48:18 GMT

fn:doc et al, error handling: inconsistencies. Closes #293

Pull request #454 created #created-454

24 Apr at 12:42:18 GMT
125: array:partition

This PR revisits array:partition, with extra editorial clarification of the spec; including but not confined to fixing issue #125.

I suggest we schedule this PR for discussion since we have not previously discussed it.

One question for the group is what the name of the function should be (including the choice of namespace).

Another is whether the polarity of the callback function should be changed (from break-when to continue-when or similar).

We could also consider returning an array of sequences rather than a sequence of arrays. (But in my view sequences of arrays are rather easier to manage at the moment.)
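
For readers weighing these questions, the following Python sketch shows one way a break-when style partition can behave. It rests on assumptions not stated in this summary: the callback is taken to receive the current partition and the next item, and lists stand in for both arrays and sequences.

```python
def partition(items, break_when):
    """Split items into consecutive groups; start a new group when break_when is true.

    break_when(current_group, next_item) is an assumed callback signature.
    """
    groups = []
    for item in items:
        if groups and not break_when(groups[-1], item):
            groups[-1].append(item)     # continue the current group
        else:
            groups.append([item])       # first item, or the callback asked for a break
    return groups
```

Switching the polarity to continue-when would only negate the callback's result; the grouping logic itself stays the same.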

Issue #453 closed #closed-453

24 Apr at 11:17:13 GMT

Fix issue #86 (incorrect default timezone format)

Pull request #453 created #created-453

24 Apr at 11:00:58 GMT
Fix issue #86 (incorrect default timezone format)

Trivial bug fix.

Issue #89 closed #closed-89

24 Apr at 10:49:53 GMT

[XQuery] DirPIConstructor permits ':' in the PI name.

QT4 CG meeting 032 draft agenda #agenda-04-25

24 Apr at 09:10:00 GMT

Draft agenda published.

Issue #452 created #created-452

21 Apr at 23:09:53 GMT
window: make 'start' and 'when' optional

Every time I use a tumbling window, I write start when true(). If start were optional and defaulted to true(), that would be easier.

Issue #450 closed #closed-450

20 Apr at 13:50:36 GMT

Fix issue #418 (editorial corrections)

Issue #438 closed #closed-438

20 Apr at 13:49:59 GMT

What are the "non-whitespace control characters"?

Issue #442 closed #closed-442

20 Apr at 13:49:57 GMT

Attempt to clarify XML serialization of control characters

Issue #451 created #created-451

20 Apr at 06:51:32 GMT
Multiple Schemas

There are many situations in which a single transformation wants to deal with multiple schemas: for example when transforming from v1 of some industry standard to v2 of the same standard, or when processing a collection of input documents each of which references its own schema using xsi:schemaLocation.

This is currently possible only if the schemas are compatible (that is, if the union of the schemas is itself a valid schema). And even where it is possible, validation against the union of S1 and S2 may produce a different outcome from validation against S2, for example because a strict wildcard allows content that S2 would not allow. Substitution groups are a particular problem: if v1 and v2 have elements with different substitution group membership, then validating against the union of v1 and v2 allows the union of the substitution groups, which means that you haven't actually verified that the result document is valid against v2.

The problem is confounded by considerations that are outside the scope of the spec. What happens when you run two different stylesheets against the same source document? If the source document has been validated against S1, this means that both stylesheets must use schemas that are supersets of S1. The way this requirement is managed in Saxon is to introduce the concept of a Configuration in which transformations run; a Configuration has a single schema, and all source documents and stylesheets within the Configuration must use compatible subsets of this schema. A source document validated using one Configuration cannot be used in a different Configuration, because the type annotations would be meaningless against a different schema.

My proposal is to introduce the idea of a named schema (that is, a named collection of schema components). When we do xsl:import-schema, we can give the imported schema a name, and there is no requirement that the components in this schema should be compatible with the components in any other schema. When we refer to a schema type (for example in $s cast as QName) we should be able to qualify the type name with a schema name (we can postpone discussions of syntax, let's say cast as my:part-number§v1 for now). When we request validation, we should be able to nominate the schema to be used for validation, for example <xsl:element name="e" validation="strict" schema="v2">.

The trickiest part is handling source documents, mainly because validation of source documents (especially those read using doc() or collection()) is at present almost entirely implementation-defined. I believe that we need explicit options to request validation of source documents against a specific schema. There should also be an option to validate a document against the schema identified in its own xsi:schemaLocation, in which case there should be no requirement that that schema is compatible with any schema known statically to the stylesheet.

Issue #49 closed #closed-49

19 Apr at 15:09:01 GMT

[XQuery] The 'member' keyword is still present on ForMemberBinding

Issue #74 closed #closed-74

19 Apr at 15:06:00 GMT

[FO] Support parsing HTML

Issue #87 closed #closed-87

19 Apr at 15:04:34 GMT

[XSL] Support for "master files"

Issue #109 closed #closed-109

19 Apr at 15:02:26 GMT

[xslt4] xsl:note for structured documentation

Issue #113 closed #closed-113

19 Apr at 14:59:58 GMT

[xslt] Constructing arrays

Issue #239 closed #closed-239

19 Apr at 14:55:54 GMT

Terminology concerning function items and their access to static and dynamic context

Issue #373 closed #closed-373

19 Apr at 14:52:39 GMT

apparent copy/paste error in annotation documentation of simple type yes-or-no-or-maybe

Pull request #450 created #created-450

19 Apr at 14:50:00 GMT

Fix issue #418 (editorial corrections)

Pull request #449 created #created-449

19 Apr at 13:52:08 GMT
Actions from review of PR #420

Actions from review of PR #420 (QT4CG-031-01, -02); new functions map:entries() and map:values() from issue #29

Issue #445 closed #closed-445

19 Apr at 07:59:56 GMT

Editorial updates to XSLT spec

Issue #448 created #created-448

19 Apr at 05:43:07 GMT
Support extended dateTime formats of ISO-8601:2019?

The ISO 8601:2019 standard supports extended dateTime formats including support for uncertain or approximate times and new quantifiers. Apparently, the extensions are documented in the Extended Date/Time Format (EDTF) Specification from the US LoC.

Pull request #447 created #created-447

18 Apr at 23:24:02 GMT
435, 53, 436: lambda expressions, thin arrows

Addresses issue 436 by introducing syntax similar to Java, C#, and JS for anonymous inline functions (lambda expressions). This involves finding a new symbol for the existing "thin arrow" operator; it also gives an opportunity to show how lambda expressions can be used in pipelines.

Some points for WG consideration:

(a) Do we really want the curly braces around the function body to be mandatory?

(b) What symbol should we use for the mapping arrow? I've used =!> as it suggests to me the combination of function application and sequence mapping.

(c) Should we reinstate the special syntax for arity-one "focus functions" (->{@salary}), which is dropped in this proposal?

(d) I haven't necessarily worked through all the changes to examples needed, e.g. in the XSLT and F+O specs.

Issue #437 closed #closed-437

18 Apr at 19:33:53 GMT

xsl:where-populated and table with header

QT4 CG meeting 031 draft minutes #minutes-04-18

18 Apr at 18:03:00 GMT

Draft minutes published.

Issue #357 closed #closed-357

18 Apr at 17:10:08 GMT

Representing key-value pairs

Issue #420 closed #closed-420

18 Apr at 17:10:07 GMT

Issue 357 Map composition and decomposition

Issue #446 closed #closed-446

18 Apr at 17:10:06 GMT

Fix merge conflicts in PR #420

Pull request #446 created #created-446

18 Apr at 17:01:40 GMT
Fix merge conflicts in PR #420

Close #420 Close #357

Pull request #445 created #created-445

18 Apr at 16:54:05 GMT
Editorial updates to XSLT spec

This PR fixes editorial issues in the XSLT 4.0 spec: issue #373, issue #384, issue #423. It also updates the XSD schema for XSLT 4.0 to incorporate most of the syntax changes that have been made to date (though further checking is needed), and updates some 3.0/3.1 references to 4.0 references.

Issue #444 closed #closed-444

18 Apr at 16:31:04 GMT

Resolve merge conflict for PR 420

Pull request #444 created #created-444

18 Apr at 16:29:39 GMT
Resolve merge conflict for PR 420

Close #420 Close #357

Issue #443 created #created-443

18 Apr at 15:40:00 GMT
@select on xsl:matching-substring and xsl:non-matching-substring

In the spirit of making @select or <sequence constructor> the norm, I think the children of xsl:analyze-string have perhaps been overlooked.

Pull request #442 created #created-442

18 Apr at 12:42:54 GMT
Attempt to clarify XML serialization of control characters

Fix #438

Clarify that the control characters #x1 through #x1F and #x7F through #x9F must be output as character references except for the whitespace characters #x9, #xA, #xD, and #x85.

Issue #439 closed #closed-439

18 Apr at 10:45:47 GMT

ExprSingle no longer allows OrExpr

Issue #440 closed #closed-440

18 Apr at 10:45:30 GMT

Fix bug #439 - grammar for ExprSingle

Issue #441 closed #closed-441

18 Apr at 10:45:07 GMT

Make XSLT function formatting consistent with F&O formatting

Pull request #441 created #created-441

18 Apr at 10:44:58 GMT
Make XSLT function formatting consistent with F&O formatting

Resolves action QT4CG-023-01, I believe. This is a minimal sort of fix, not an attempt to refactor everything.

Pull request #440 created #created-440

18 Apr at 10:13:54 GMT
Fix bug #439 - grammar for ExprSingle

Simple bug fix, shouldn't need any meeting time.

Issue #439 created #created-439

18 Apr at 09:03:37 GMT
ExprSingle no longer allows OrExpr

The grammar for ExprSingle seems to have been accidentally changed so it no longer allows an OrExpr as one of the alternatives.

Issue #438 created #created-438

15 Apr at 08:56:18 GMT
What are the "non-whitespace control characters"?

In Section 5, XML Serialization, we find:

A consequence of this rule is that certain characters MUST be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically, CR, NEL and LINE SEPARATOR characters in text nodes MUST be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents; while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes MUST be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents. In addition, the non-whitespace control characters #x1 through #x1F and #x7F through #x9F in text nodes and attribute nodes MUST be output as character references. (The reference to "non-whitespace control characters" appears in a few other places as well, but for basically the same purpose.)

But what are the "non-whitespace control characters"? The spec doesn't say. I think it means all of the C0 and C1 control characters except CR, NL, TAB, and NEL. The fact that vertical tab and line feed might be considered "white space" doesn't really matter, since none of the other C0 control characters are allowed in XML 1.0 anyway (encoded or otherwise).

XML 1.0 doesn't actually care about the C1 control characters. There's no reason to encode them, but it does no harm, I suppose. You'd have to encode the C0 and C1 control characters for an XML 1.1 parser, but none of those exist.

I wonder if it might be a little clearer to say

A consequence of this rule is that certain characters MUST be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically, CR, NEL and LINE SEPARATOR characters in text nodes MUST be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents; while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes MUST be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents. In addition, the other control characters #x1 through #x1F (except #x9, #xA, and #xD) and #x7F through #x9F (except #x85) in text nodes and attribute nodes MUST be output as character references.
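
Read literally, the suggested wording amounts to a per-character test. The Python sketch below illustrates that reading only: the function name is hypothetical, it ignores the ordinary escaping of markup characters such as "&" and "<", and it does not check whether a character is permitted in XML 1.0 at all.

```python
def escape_control_characters(text, in_attribute=False):
    """Replace the characters named in the suggested wording with character references."""
    always = {0x0D, 0x85, 0x2028}       # CR, NEL, LINE SEPARATOR: text and attribute nodes
    attribute_only = {0x0A, 0x09}       # NL, TAB: attribute nodes only
    out = []
    for ch in text:
        cp = ord(ch)
        # "The other control characters": C0 minus TAB/NL/CR, and DEL..C1 minus NEL.
        other_control = (
            (0x01 <= cp <= 0x1F and cp not in (0x09, 0x0A, 0x0D))
            or (0x7F <= cp <= 0x9F and cp != 0x85)
        )
        if cp in always or (in_attribute and cp in attribute_only) or other_control:
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)
```

Spelling out the exceptions this way makes it visible that #x85 is escaped by the first rule rather than the last one.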

QT4 CG meeting 031 draft agenda #agenda-04-18

14 Apr at 10:50:00 GMT

Draft agenda published.

Issue #437 created #created-437

13 Apr at 14:26:24 GMT
xsl:where-populated and table with header

Dear all,

I just discovered xsl:where-populated, but was surprised that it covers only the narrow use case of a single level of wrapping.

My use case is about tables: I have to wrap things in a table with a header.

<table ...> <!-- bunch of attributes -->
    <thead ...>...</thead>
    <xsl:for-each ...>
        <tr>
            <xsl:for-each ...>
                <td>...</td>
            </xsl:for-each>
        </tr>
    </xsl:for-each>
</table>

How can I do that with xsl:where-populated ?

Issue #436 created #created-436

11 Apr at 17:19:01 GMT
Allow inline function expressions in arrow operator call chains

It can be useful to create inline functions for simple operations (e.g. adding 1 to a number) to be used in arrow operator call chains.

The current proposal uses -> { ... } for just the thin arrow operator.

This proposal is split into two parts:

  1. restructure the grammar to make the function call usage simpler to follow;
  2. introduce the ability to use inline functions in thin/fat arrow contexts.

Part 1 -- Simplify the Grammar

I suggest changing the grammar to:

FatArrowTarget           ::= "=>" ( ArrowFunctionCall | ArrowDynamicFunctionCall )
ThinArrowTarget          ::= "->" ( ArrowFunctionCall | ArrowDynamicFunctionCall )
ArrowFunctionCall        ::= EQName ArgumentList
ArrowDynamicFunctionCall ::= ( VarRef | ParenthesizedExpr ) PositionalArgumentList

That is, I've grouped the function name/reference with the argument list.

Part 2 -- Allow Inline Functions

FatArrowTarget          ::= "=>" ( ArrowFunctionCall | ArrowDynamicFunctionCall | ArrowInlineFunctionCall )
ThinArrowTarget         ::= "->" ( ArrowFunctionCall | ArrowDynamicFunctionCall | ArrowInlineFunctionCall )
ArrowInlineFunctionCall ::= ( "function" | "fun" ) EnclosedExpr

Note: here, "fun" is a placeholder for whichever name/symbol we choose in https://github.com/qt4cg/qtspecs/issues/53.

This then allows expressions like (1, 2, 3) -> function { . + 1 } and (1, 2, 3) => fun { ~ = 1 } (see also https://github.com/qt4cg/qtspecs/issues/129) without overloading the meaning of ->.

QT4 CG meeting 030 draft minutes #minutes-04-11

11 Apr at 17:17:00 GMT

Draft minutes published.

Issue #435 created #created-435

11 Apr at 16:33:00 GMT
Remove the inlined function expression variant of the thin arrow operator

This proposal is to remove the third bullet/variant from the thin arrow operator so that the new inline function syntax (-> { ... }) cannot be used within the arrow expressions.

This makes the thin/fat arrows consistent in behaviour with each other, with the exception of how they pass the value to the expressions:

  1. thin arrow operators pass the values in the sequence one at a time to the associated function;
  2. fat arrow operators pass all the values in the sequence to the associated function in a single call.
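
The difference between the two operators can be modelled in a few lines of Python. This is an analogy only: lists stand in for XDM sequences, the helper names are made up for illustration, and argument lists beyond the piped value are left out.

```python
def thin_arrow(sequence, function):
    """Model of U -> f(): call f once per item and concatenate the results."""
    results = []
    for item in sequence:
        outcome = function(item)
        # Sequences are flat, so a list result is spliced in rather than nested.
        results.extend(outcome if isinstance(outcome, list) else [outcome])
    return results


def fat_arrow(sequence, function):
    """Model of U => f(): call f once with the whole sequence as its argument."""
    return function(sequence)
```

So thin_arrow([1, 2, 3], f) makes three calls, while fat_arrow([1, 2, 3], f) makes one.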

Changes

  1. Update the syntax:
[46] FatArrowTarget  ::= "=>" ((ArrowStaticFunction ArgumentList) | (ArrowDynamicFunction PositionalArgumentList))
[47] ThinArrowTarget ::= "->" ((ArrowStaticFunction ArgumentList) | (ArrowDynamicFunction PositionalArgumentList))
  2. Remove the text for the inline function variant:

If the arrow is followed by an EnclosedExpr:

Given a UnaryExpr U, and an EnclosedExpr {E}, the expression U -> {E} is equivalent to the expression (U) ! (E).

For example, the expression $x -> {.+1} is equivalent to ($x)!(.+1).

  3. Remove/update the associated examples, e.g. to use let $f := function ($x) { $x + 1 } return $x -> f() -> $f().

Issue #390 closed #closed-390

11 Apr at 16:16:37 GMT

Should parsing and building URIs attempt to special case Windows URIs for UNC names?

Issue #415 closed #closed-415

11 Apr at 16:16:36 GMT

Revise parse/build URI functions for UNC names

QT4 CG meeting 030 draft agenda #agenda-04-11

09 Apr at 19:35:00 GMT

Draft agenda published.

Pull request #434 created #created-434

07 Apr at 11:57:37 GMT
Functions to parse and format hex integers

Addresses issue #241 by providing functions to parse and format integers in any number base from 2 to 36.
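
As a rough illustration of what radix-based parsing and formatting involve, here is a Python sketch. The function names and the lower-case digit alphabet are assumptions made for the example; the functions in the PR may differ in naming, case handling, and treatment of signs or whitespace.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def format_integer_radix(value, radix):
    """Render an integer in the given base (2 to 36) using lower-case digits."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    n = abs(value)
    digits = []
    while n:
        n, remainder = divmod(n, radix)
        digits.append(DIGITS[remainder])
    return sign + "".join(reversed(digits))


def parse_integer_radix(text, radix):
    """Parse a string of base-`radix` digits.

    Delegates to Python's int(), which is more liberal than a spec function
    might be: it also accepts surrounding whitespace, a sign, and underscores.
    """
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36")
    return int(text, radix)
```

The upper limit of 36 follows from the digit alphabet: ten decimal digits plus twenty-six letters.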

Pull request #433 created #created-433

07 Apr at 09:56:14 GMT
429 Add hex and binary literals and allow underscores

Addresses issue #429. The grammar is extended to allow hex and binary integer literals, and all numeric literals may contain underscores for readability.
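
The Python sketch below shows one possible reading of such literals. It is not the grammar from the PR: the regular expression is an assumption that allows underscores only between digits (so neither leading nor trailing underscores are accepted), and it covers integer literals only.

```python
import re

# Assumed shape: optional 0x/0b prefix, then digit groups separated by underscores.
_INTEGER_LITERAL = re.compile(
    r"0[xX](?P<hex>[0-9a-fA-F]+(?:_+[0-9a-fA-F]+)*)"
    r"|0[bB](?P<bin>[01]+(?:_+[01]+)*)"
    r"|(?P<dec>[0-9]+(?:_+[0-9]+)*)"
)


def parse_integer_literal(text):
    """Return the integer value of a decimal, hexadecimal, or binary literal."""
    match = _INTEGER_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    for group, base in (("hex", 16), ("bin", 2), ("dec", 10)):
        digits = match.group(group)
        if digits is not None:
            # Underscores are purely visual, so they are dropped before conversion.
            return int(digits.replace("_", ""), base)
    raise AssertionError("unreachable: one alternative always matches")
```

For instance, parse_integer_literal("0X00Ff") and parse_integer_literal("0B11111111") denote the same value.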

Issue #432 closed #closed-432

07 Apr at 07:43:16 GMT

fix attribute name : it is diff instead of role

Pull request #432 created #created-432

06 Apr at 20:01:36 GMT

fix attribute name : it is diff instead of role

Issue #417 closed #closed-417

06 Apr at 15:51:43 GMT

Fix residual reference to op:A2S which is no longer defined

Issue #315 closed #closed-315

06 Apr at 15:51:10 GMT

fn:transform inconsistency: initial-mode

Issue #427 closed #closed-427

06 Apr at 15:51:09 GMT

Change fn:transform to use the stylesheet's default mode

Issue #280 closed #closed-280

06 Apr at 15:50:59 GMT

Why is resolve-uri forbidden from resolving against a URI that contains a fragment identifier?

Issue #426 closed #closed-426

06 Apr at 15:50:58 GMT

Resolve #280 by allowing a fragid

Issue #428 closed #closed-428

06 Apr at 12:05:43 GMT

Fix problem in rendering empty <xnt> elements

Issue #431 closed #closed-431

06 Apr at 12:05:42 GMT

Fix problem rendering xnt elements

Pull request #431 created #created-431

06 Apr at 12:05:21 GMT
Fix problem rendering xnt elements

Close #428

Hi @michaelhkay. I took a slightly different approach. For unknown reasons long since lost in the mists of time, the 'etc' files that act as databases for cross-spec references stored nt elements. That's weird because the nt elements are supposed to point to prod elements. I expect someone (let's be candid, probably me) got confused by the fact that nt elements have a def attribute and thought they were definitions. They're not. I've changed things so that the prod elements are now stored in the database. There's only ever going to be one of those.

I had to tidy up a few things to make that work, and we can't abandon support for nt files in 'etc' documents because we have existing files that don't get regenerated.

I also cleaned up the cross-reference error to StringLiteral in the XSLT spec and patched over a problem with a few links in the XQuery specifications.

There's no useful information from PR formatting of PRs that change the stylesheets, so I'm just going to cross my fingers and merge this. Please pull the latest and let me know if you see any problems!

Pull request #430 created #created-430

06 Apr at 09:32:47 GMT
fn:doc et al, error handling: inconsistencies. Closes #293

Resolves action item QT4CG-029-04.

Issue #429 created #created-429

05 Apr at 17:04:05 GMT
Hexadecimal and binary literals

Without wanting to challenge our weekly burn-down chart too much, I wonder whether it would be a big deal to support hexadecimal and binary literals in XPath? Examples:

(: decimal :)     1, 255,
(: hexadecimal :) 0x1, 0X00Ff,
(: binary :)      0b1, 0B11111111

The main question is probably if it conflicts with the existing grammar?

Pull request #428 created #created-428

05 Apr at 16:21:17 GMT
Fix problem in rendering empty <xnt> elements

In the F&O spec, function parse-QName, there is a link to BracedURILiteral repeated 6 times. This happens when an <xnt> element is written with empty content. There are 6 entries in the /etc/ XP40 file for the relevant grammar symbol, and each of them is output. I haven't tried to eliminate the redundancy in the /etc/XP40 file, I have simply changed the code for processing <xnt> so it only considers the first one.

Issue #411 closed #closed-411

05 Apr at 14:19:14 GMT

Remove the note from the parse-html unparsed-entity sections.

Pull request #427 created #created-427

05 Apr at 14:17:12 GMT
Change fn:transform to use the stylesheet's default mode

Close #315

Pull request #426 created #created-426

04 Apr at 21:01:57 GMT
Resolve #280 by allowing a fragid

Fix #280

(Based off the right branch this time. I hope.)

Issue #424 closed #closed-424

04 Apr at 21:01:15 GMT

Allow fn:resolve-uri to resolve against a base URI that includes a fragment identifier

Issue #425 created #created-425

04 Apr at 20:09:45 GMT
Structural proposal (ThinLayer™) : Add a layer of thin spec between XPath and the XPath Derived Language

XPath is ubiquitous and is used even in places we have no idea about. On the other hand, XPath is very useful to us as the centerpiece of XSLT and XQuery.

To allow people to make broader use of XPath without having to take on the whole XSLT/XQuery story, it is perhaps time to consider adding a thin layer of spec in order to have

  1. XPath
  2. Some typing definition mechanism
  3. Some function definition mechanism

The idea is also to integrate this better with all validation technologies (XSD, Relax NG, NVDL, JSON Schema, etc.) and to give EXPath, EXQuery and others a standard way to use all of this.

We also want people who want to use only XPath (for example in LINQ or inside SQL) to have a broader capacity to interact with XML (instead of being limited to XPath 1.0 with namespaces).

I will try to make this proposal more precise over time, but I feel it is good enough as a first stepping stone to let people help drive this initiative.

For the moment, the name of this new beast is XPathWithCustomizableTypesAndFunctions; in our proposal it is called XPath Next: https://github.com/XPath-Next/XPath-Next/blob/first-draft/spec.md

QT4 CG meeting 029 draft minutes #minutes-04-04

04 Apr at 17:30:00 GMT

Draft minutes published.

Pull request #424 created #created-424

04 Apr at 17:12:08 GMT
Allow fn:resolve-uri to resolve against a base URI that includes a fragment identifier

Fix #280

Issue #423 created #created-423

04 Apr at 14:24:37 GMT
[XSLT 4.0] 2.2 Notation is incomplete

"language" and "prefixes" are used in the definition of element but not defined here

On the other hand, "nmtokens" is defined but not used

Issue #22 closed #closed-22

04 Apr at 11:32:31 GMT

[XPath] Allowing multiple let clauses in LetExpr and for clauses in ForExpr

Issue #416 closed #closed-416

04 Apr at 10:23:13 GMT

NCName is usually lowercase in attribute type for the rest of the spec

Issue #422 closed #closed-422

04 Apr at 10:22:50 GMT

Fix syntax in examples

Pull request #422 created #created-422

04 Apr at 09:58:49 GMT
Fix syntax in examples

  • attribute namespace-uri corrected to namespace
  • xsl:sequence-of corrected to xsl:sequence

Issue #419 closed #closed-419

04 Apr at 09:37:12 GMT

fix few syntax issues in the XSLT 4.0 examples

Issue #421 created #created-421

04 Apr at 09:36:07 GMT
Make sure the build system syntax checks the syntax of examples

Apparently the code is lying around somewhere...

Pull request #420 created #created-420

04 Apr at 09:26:51 GMT
Issue 357 Map composition and decomposition

This PR addresses the issues concerned with map composition and decomposition in issue #357. It adds a function to decompose a map into key-value pairs (map:key-value-pairs), and its inverse (map:of). It adds explanatory material to F+O to explain how these functions relate to each other, and it adds examples to the XSLT spec to show how they interwork with the xsl:array, xsl:map, and xsl:for-each instructions. Note: the map functions have been sorted alphabetically, so the changes will appear more extensive than they are.

Pull request #419 created #created-419

04 Apr at 08:29:09 GMT

fix few syntax issues in the XSLT 4.0 examples

Issue #418 created #created-418

04 Apr at 08:22:46 GMT
array and map attribute in xsl:iterate and xsl:for-each-group

It seems there are still some places in the spec where we can spot remnants of the array and map attributes.

The following sentence appears in two places:

, or constructed from the expressions in the array or map attributes.

There are also some examples:

  • Example: Grouping entries in a Map
  • Example: Processing an array using xsl:iterate

Pull request #417 created #created-417

03 Apr at 21:56:17 GMT
Fix residual reference to op:A2S which is no longer defined

The changes made to redefine array functions in terms of array:members and array:of rather than op:A2S and op:S2A weren't applied to array:get because that was the subject of a separate PR to add the fallback option. This PR corrects the omission.

Issue #403 closed #closed-403

03 Apr at 21:25:17 GMT

Michaelhkay actions 2023 02 01

Pull request #416 created #created-416

03 Apr at 17:48:42 GMT

NCName is usually lowercase in attribute type for the rest of the spec

Pull request #415 created #created-415

03 Apr at 16:39:42 GMT
Revise parse/build URI functions for UNC names

Fix #398 Fix #390

This PR attempts to address the questions raised in issues 389 and 390:

  1. It adds a unc-path option that is used to guide the parsing and construction of URIs that represent Windows UNC paths
  2. It adds a filepath property to the result of parse-uri. This property represents the local path part of the URI. For file: URIs, this is the local path.
  3. It addresses the use of "|" in URIs to represent the ":" in Windows filenames
  4. It clarifies that percent-decoding a path also involves interpreting the result as a UTF-8 sequence
  5. It clarifies that percent-decoding and encoding apply to the query parts of a URI as well.
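
Points 3 and 4 can be illustrated with a short Python sketch. It is an assumption-laden example rather than the behaviour defined in the PR: decode_file_path is a hypothetical helper, the "|" rewrite is applied only to a single drive letter at the start of the path, and the rewrite is done before percent-decoding.

```python
import re
from urllib.parse import unquote


def decode_file_path(path, windows_drive_bar=True):
    """Turn the path part of a file: URI into a local path string (hypothetical helper)."""
    if windows_drive_bar:
        # "/C|/dir" is a legacy spelling of "/C:/dir"; only a leading drive letter is rewritten.
        path = re.sub(r"^(/?[A-Za-z])\|", r"\1:", path)
    # Percent-decoding yields bytes, which are then interpreted as UTF-8;
    # errors="strict" makes malformed sequences raise instead of being replaced.
    return unquote(path, encoding="utf-8", errors="strict")
```

For example, the two escaped bytes %C3%A9 decode to the single character "é" because they are read together as one UTF-8 sequence.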

Issue #300 closed #closed-300

02 Apr at 12:58:37 GMT

[F+O] Ambiguity regarding Unicode normalization (editorial)

QT4 CG meeting 029 draft agenda #agenda-04-04

31 Mar at 16:35:00 GMT

Draft agenda published.

Issue #414 created #created-414

31 Mar at 08:04:08 GMT
Lift character set restriction of xs:string

Adopted from https://github.com/qt4cg/qtspecs/issues/413#issuecomment-1491469514

I guess that raises the question of whether it is still appropriate to restrict the character set of xs:string to that of XML 1.0. Are there any benefits in doing so?

I believe that would simplify things a lot, in particular when working with input/output functions.

QT4 CG meeting 028 draft minutes #minutes-03-28

30 Mar at 16:50:00 GMT

Draft minutes published.

Issue #404 closed #closed-404

30 Mar at 15:52:47 GMT

Rework changes from action-qt4cg-019-01 to resolve persistent conflicts.

Issue #398 closed #closed-398

30 Mar at 15:52:19 GMT

User-defined functions clashing with constructor functions

Issue #406 closed #closed-406

30 Mar at 15:51:58 GMT

Revise xsl:array instruction and examples

Issue #408 closed #closed-408

30 Mar at 15:51:37 GMT

Fix issue #398 (clash with constructor functions)

Issue #413 created #created-413

30 Mar at 14:08:28 GMT
New function: parse-csv()

I propose a new function parse-csv() that accepts a CSV string (such as might be read from a CSV file using unparsed-text()). CSV is as defined in RFC 4180; implementations may be liberal in what they accept, and may define additional options.

An options parameter includes the option header=true|false to indicate whether the first line should be taken as containing column headings.

The result of the function is a sequence of maps, one map per row of the CSV file (excluding the header). Each map contains one entry per column, the key being taken from the column header if present, or an integer 1...n if not.
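
The shape of the proposed result can be sketched in Python. This is only an illustration of the description above, built on Python's csv module; the function name parse_csv and its header keyword are assumptions that mirror the proposal, not a definition of the eventual function.

```python
import csv
import io

def parse_csv(text, header=False):
    """Return one dict per CSV row (hypothetical analogue of parse-csv())."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if header and rows:
        # The first line supplies the keys and is excluded from the result.
        keys, rows = rows[0], rows[1:]
        return [dict(zip(keys, row)) for row in rows]
    # Without a header, columns are keyed by the integers 1..n.
    return [{i: value for i, value in enumerate(row, start=1)} for row in rows]

sample = 'name,city\r\nAlice,"London, UK"\r\nBob,Paris\r\n'
with_header = parse_csv(sample, header=True)
without_header = parse_csv(sample)
print(with_header)
print(without_header[0])
```

With header=True the sample yields two maps keyed by "name" and "city"; without it, three maps keyed by 1 and 2, the first of which holds the heading text itself.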

Pull request #412 created #created-412

28 Mar at 22:12:29 GMT
409, QT4CG-027-01: xsl:next-match

Clarifies the rules for xsl:next-match, especially for 4.0 type patterns, but also clarifying the exposition of rules unchanged since 3.0 or 2.0 (see issue 409)

Pull request #411 created #created-411

28 Mar at 11:51:25 GMT
Remove the note from the parse-html unparsed-entity sections.

This applies the review action:

RD to remove the note in 15.5.15 of functions and operators.

QT4 CG meeting 028 draft agenda #agenda-03-28

27 Mar at 13:00:00 GMT

Draft agenda published.

Issue #410 created #created-410

27 Mar at 10:37:57 GMT
Converting doubles to decimals, fractional digits

Adopted from a previous discussion on Slack: The result of the following computation…

<x>2</x> + .1

…is serialized as 2.1. If the result is cast to a decimal via xs:decimal(<x>2</x> + .1), 2.100000000000000088817841970012523233890533447265625 is returned, which feels counterintuitive.

Can we possibly change the conversion rules without compromising backward-compatibility?
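
The same effect can be reproduced outside XPath. The sketch below uses Python floats (IEEE 754 doubles) and the decimal module as an analogy; it is not a statement about how any XPath processor implements the cast. Converting the double exactly exposes the binary approximation, while converting via the shortest round-tripping string gives the value a user would expect.

```python
from decimal import Decimal

as_double = 2 + 0.1             # a double, like <x>2</x> + .1

# Exact conversion: every binary digit of the double is preserved.
exact = Decimal(as_double)

# Conversion via the shortest string that round-trips to the same double.
via_string = Decimal(repr(as_double))

print(exact)
print(via_string)
```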

Issue #409 created #created-409

26 Mar at 11:13:36 GMT
XSLT: xsl:next-match and xsl:apply-imports interaction with on-multiple-match

(This is an oversight in the XSLT 3.0 specification.)

It is possible for xsl:next-match or xsl:apply-imports to encounter a conflict - two template rules with the same precedence and priority. In this situation it should do exactly what xsl:apply-templates does when it encounters a conflict, for example it should follow the rules of xsl:mode/@on-multiple-match.

This is all fairly obvious, but it should be stated explicitly (and tested). The spec is written as if conflicts can only occur when finding the first matching rule, and not when finding next-match rules.

Issue #392 closed #closed-392

24 Mar at 11:00:07 GMT

Partial function application: Placeholders with keywords

Pull request #408 created #created-408

24 Mar at 10:29:55 GMT
Fix issue #398 (clash with constructor functions)

Add clarification to XSLT and XQuery specs to say that a user-defined function must not clash with a constructor function for an imported atomic type.

Issue #407 created #created-407

23 Mar at 11:39:34 GMT
XSLT-specific context properties used in function items

I just stumbled across the fact that current-group#0 doesn't work: see the note in 14.2.1 that says:

Like other XSLT extensions to the dynamic evaluation context, the [current group] is not retained as part of the closure of a function value. This means that the expression current-group#0 is valid and returns a function value, but any invocation of this function will fail with a dynamic error [see ERR XTDE1061].

This restriction is unnecessary and we should remove it. As people become more accustomed to using function items, they don't want to hit restrictions like this, and there's really no good implementation reason for it.

Pull request #406 created #created-406

22 Mar at 18:03:20 GMT
Revise xsl:array instruction and examples

This PR revises the design of the xsl:array instruction to align it with the recently agreed specs for the functions array:members and array:of. The composite attribute and the xsl:array-member instruction are dropped; instead the instruction takes a use attribute which is an expression used to compute each array member value from the corresponding item in the value of the select expression or sequence constructor.

Issue #360 closed #closed-360

21 Mar at 18:12:12 GMT

Issue 314 array composition and decomposition

Issue #405 closed #closed-405

21 Mar at 18:12:11 GMT

MK PR #360 with merge conflicts resolved (array composition and decomposition)

Pull request #405 created #created-405

21 Mar at 18:01:30 GMT
MK PR #360 with merge conflicts resolved (array composition and decomposition)

Close #360

This PR is the same as 360 but fixes merge conflicts.

Accepted at meeting 027

Close https://github.com/qt4cg/qtspecs/issues/314

Issue #400 closed #closed-400

21 Mar at 17:19:13 GMT

Priorities for type-based patterns

Issue #401 closed #closed-401

21 Mar at 17:18:48 GMT

Issue 400: ranking of type patterns

Issue #395 closed #closed-395

21 Mar at 17:17:23 GMT

Make the (non-)hierarchical nature of URIs explicit

Issue #394 closed #closed-394

21 Mar at 17:17:05 GMT

Minor correction to fn:parse-uri

Issue #393 closed #closed-393

21 Mar at 17:16:47 GMT

Clarify explanations of functions/function items

Issue #391 closed #closed-391

21 Mar at 17:16:14 GMT

addressed typographical errors; adjusted Unicode character discussion…

QT4 CG meeting 027 draft agenda #agenda-03-21

21 Mar at 10:30:00 GMT

Draft agenda published.

Issue #336 closed #closed-336

20 Mar at 16:00:45 GMT

Action QT4CG-019-01 (type of $pattern in fn:tokenize())

Pull request #404 created #created-404

20 Mar at 15:59:39 GMT
Rework changes from action-qt4cg-019-01 to resolve persistent conflicts.

I reworked these (purely editorial) changes based on the current master to try and resolve the conflicts once and for all.

Pull request #403 created #created-403

20 Mar at 15:26:46 GMT

Michaelhkay actions 2023 02 01

Issue #402 created #created-402

20 Mar at 00:31:10 GMT
XSLT patterns: intersect and except

I would like to propose making an incompatible change to the semantics of XSLT patterns using the "except" and "intersect" operators, so that they have their intuitive meaning.

Consider the pattern p except appendix//p. Anyone writing this probably imagines that this will match any p element that does not have an appendix as an ancestor. The intuitive meaning of A except B is to match anything that matches A unless it also matches B.

The actual meaning in the XSLT 3.1 specification is that it matches any node $N that has an ancestor $A such that the result of the XPath expression $A//(p except appendix//p) includes $N.

Consider the XML

<appendix>
  <div>
     <p>...</p>
  </div>
</appendix>

The <p> element here has an ancestor (the <div> element) where the result of $A//(p except appendix//p) includes the <p> element. So despite having an ancestor appendix this element matches the pattern p except appendix//p. This is not only a counter-intuitive result, it also makes such patterns useless in practice.

Patterns using intersect suffer the same problem, though it is much harder to construct a plausible example.

Patterns that only use the child or attribute axis, for example @* except @code, or * except note, don't suffer from this problem and will retain the same meaning as in 3.1.

The required effect can be achieved by writing p except p[ancestor::appendix]. Because the pattern p[ancestor::appendix] is equivalent to appendix//p, people are very likely to imagine that p except p[ancestor::appendix] is equivalent to p except appendix//p.

Making any incompatible change to the language semantics should be done only with a very strong justification, but I believe that it is justified in this instance. The existing semantics are not only counter-intuitive, they are also sufficiently useless that it is extremely unlikely anyone has existing working code, other than artificial test cases, that relies on the current semantics.
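
The difference between the two readings of p except appendix//p can be checked with a small simulation. The following Python sketch is an illustration only: it hand-codes the one XPath expression involved instead of using an XPath engine, it models element ancestors only (ElementTree has no document node), and all function names are invented for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<appendix><div><p>...</p></div></appendix>")

# ElementTree has no parent pointers, so build a child -> parent map.
PARENT = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """Yield the element ancestors of node, nearest first."""
    while node in PARENT:
        node = PARENT[node]
        yield node

def p_except_appendix_p(context):
    """Evaluate `p except appendix//p` with `context` as the context node."""
    p_children = [c for c in context if c.tag == "p"]
    excluded = [p for a in context if a.tag == "appendix" for p in a.iter("p")]
    return [p for p in p_children if all(p is not e for e in excluded)]

def matches_as_described(node):
    """True if some ancestor $A has node in $A//(p except appendix//p)."""
    for a in ancestors(node):
        for ctx in a.iter():  # $A and all of its descendants
            if any(n is node for n in p_except_appendix_p(ctx)):
                return True
    return False

def matches_intuitive(node):
    """True if node is a p element with no appendix ancestor."""
    return node.tag == "p" and all(a.tag != "appendix" for a in ancestors(node))

p = root.find("div/p")
print(matches_as_described(p))  # True: the div ancestor selects the p
print(matches_intuitive(p))     # False: the p has an appendix ancestor
```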

Issue #387 closed #closed-387

19 Mar at 11:10:26 GMT

Add compatibility notes for fn:namespace-uri-for-prefix

Issue #385 closed #closed-385

19 Mar at 11:09:25 GMT

Actions QT4CG-025-07 / -08

Issue #378 closed #closed-378

19 Mar at 11:06:41 GMT

Update the localName and unparsed entity reference notes for parse-html

Pull request #401 created #created-401

15 Mar at 22:02:24 GMT
Issue 400: ranking of type patterns

Proposes how to handle type-based match patterns (record tests, in particular) in the absence of explicit priorities, basing the decision on the type hierarchy. Note: user-defined priorities are always considered before any inferred selectivity rules. Also fixes some grammar problems with type patterns. See Issue #400.

Issue #400 created #created-400

15 Mar at 08:13:04 GMT
Priorities for type-based patterns

XSLT §6.6 currently has a big TODO:

TODO: define default priorities for type patterns, as suggested in https://www.saxonica.com/papers/xmlprague-2020mhk.pdf section 6.5.1

We need to plug this gap. [Note: it's worth reading that cited section as it points out some of the difficulties].

I'm going to suggest an alternative approach. Rather than allocating a numeric priority to patterns such as record(lat, long), we allocate them a relative priority -- called their selectivity -- based on the subtype relationship among types. This is a partial ordering. So we extend the rule that currently orders patterns by (1) import precedence, (2) priority, (3) declaration order, to become instead (1) import precedence, (2) selectivity, (3) priority, (4) declaration order.

Type-based patterns (such as type(xs:integer), record(lat, long)) are defined to have higher selectivity than any non-type-based pattern; all the latter (that is, all XSLT 3.1 patterns) are defined to have equal selectivity, which means the rules for discriminating among 3.1 patterns are unchanged.

For type-based patterns, we define that a pattern based on type T has higher selectivity than a pattern based on type U if T is a subtype of U. If neither is a subtype of the other, then they have equal selectivity.

The type pattern type(T) followed by one or more predicates is deemed to have higher selectivity than type(T) with no predicates, but apart from this, the predicates are ignored. Explicit numeric priorities can be used to define an ordering among type patterns that have the same selectivity.
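
The proposed ordering can be sketched as a pairwise comparison. The Python below is a rough illustration under stated assumptions: the type hierarchy is a toy dictionary, the Rule fields are invented, and "declaration order" is taken to mean that the later rule wins, which the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

# Toy type hierarchy (type -> supertype); a stand-in for the real one.
SUPERTYPE = {
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:string": "xs:anyAtomicType",
}

def is_subtype(t, u):
    while t is not None:
        if t == u:
            return True
        t = SUPERTYPE.get(t)
    return False

@dataclass
class Rule:
    name: str
    precedence: int            # import precedence, higher wins
    type_name: Optional[str]   # None for non-type-based patterns
    has_predicates: bool
    priority: float
    position: int              # declaration order (assumed: later wins)

def sign(x):
    return (x > 0) - (x < 0)

def compare_selectivity(a, b):
    """Return 1, -1 or 0; selectivity is only a partial ordering."""
    if a.type_name is None or b.type_name is None:
        # Type-based patterns outrank all non-type-based patterns.
        return sign((a.type_name is not None) - (b.type_name is not None))
    if a.type_name == b.type_name:
        return sign(a.has_predicates - b.has_predicates)
    if is_subtype(a.type_name, b.type_name):
        return 1
    if is_subtype(b.type_name, a.type_name):
        return -1
    return 0

def preferred(a, b):
    """Apply (1) precedence, (2) selectivity, (3) priority, (4) order."""
    checks = (
        sign(a.precedence - b.precedence),
        compare_selectivity(a, b),
        sign(a.priority - b.priority),
        sign(a.position - b.position),
    )
    for result in checks:
        if result:
            return a if result > 0 else b
    return a

int_rule = Rule("integer", 1, "xs:integer", False, 0.0, 1)
dec_rule = Rule("decimal", 1, "xs:decimal", False, 5.0, 2)
print(preferred(int_rule, dec_rule).name)  # integer: subtype beats priority
```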

Issue #399 created #created-399

15 Mar at 03:31:46 GMT
fn:deep-equal: Using Multilevel Hierarchy and Abstraction when designing and specifying complex functions

Whenever a function is too complex, specifying it precisely and clearly becomes problematic: the complexity results in a huge volume of text that is difficult to understand, and whose correctness becomes less and less obvious.

Solving this problem would benefit all groups of readers, be they future implementors or just curious XPath enthusiasts.

Here I present one well-known solution that has been tried successfully in practice, which the Romans summarized in the phrase "Divide et Impera" ("Divide and conquer").

Below is one possible splitting of the functionality of fn:deep-equal into different smaller and simpler functions on 5 levels, each level possibly dispatching to a function on another level. The intermediate-level functions each have their own value and could be used independently of fn:deep-equal and of each other. Even though there is the possibility of recursion, we can still get a simple picture and immediate understanding of this functionality, just playing with the following collapsible/expandable representation (click on the corresponding arrow), which fits on a single screen:

(writing the full specification from this animated picture is left as an exercise for the reader 😂)

deep-equal-sequence
  deep-equal-item
    deep-equal-atomic
    deep-equal-map
      deep-equal-atomic
      deep-equal-sequence
    deep-equal-array
      deep-equal-sequence
    deep-equal-node
      deep-equal-document
      deep-equal-element
        deep-equal-attribute
        deep-equal-NS
        deep-equal-PI
        deep-equal-comment
        deep-equal-text-node
      deep-equal-attribute
      deep-equal-NS
      deep-equal-PI
      deep-equal-comment
      deep-equal-text-node
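
The dispatching idea can be sketched in a few lines of Python. This is a toy model and not the fn:deep-equal specification: sequences are represented as lists, maps as dicts whose values are sequences, arrays as tuples whose members are sequences, atomic comparison is plain Python equality, and the node branch is omitted entirely.

```python
def deep_equal_sequence(a, b):
    """Compare two sequences (lists) item by item."""
    return len(a) == len(b) and all(
        deep_equal_item(x, y) for x, y in zip(a, b)
    )

def deep_equal_item(x, y):
    """Dispatch on the kind of item."""
    if isinstance(x, dict) and isinstance(y, dict):
        return deep_equal_map(x, y)
    if isinstance(x, tuple) and isinstance(y, tuple):
        return deep_equal_array(x, y)
    if isinstance(x, (dict, tuple)) or isinstance(y, (dict, tuple)):
        return False  # different kinds of item are never equal
    return deep_equal_atomic(x, y)

def deep_equal_atomic(x, y):
    """Simplified atomic comparison (plain Python equality)."""
    return x == y

def deep_equal_map(x, y):
    """Same keys, and each pair of values equal as sequences."""
    return x.keys() == y.keys() and all(
        deep_equal_sequence(x[k], y[k]) for k in x
    )

def deep_equal_array(x, y):
    """Same length, and each pair of members equal as sequences."""
    return len(x) == len(y) and all(
        deep_equal_sequence(m, n) for m, n in zip(x, y)
    )

left = [1, {"a": [1, 2]}, ([1], [2, 3])]
right = [1, {"a": [1, 2]}, ([1], [2, 3])]
print(deep_equal_sequence(left, right))
```

Each function is short enough to specify on its own, and the recursion runs only through deep_equal_sequence and deep_equal_item.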

Issue #398 created #created-398

14 Mar at 23:39:13 GMT
User-defined functions clashing with constructor functions

There is no explicit rule in either XQuery or XSLT that a user-defined function must not clash (in name and arity) with a constructor function for an imported atomic type. It's implicit in the rule that you can't have two conflicting functions in the static context, but it would be helpful to say so explicitly and define an error code.

I have added tests to the XSLT3 and XQuery3 test suites.

See also Saxon bug 5921 - https://saxonica.plan.io/issues/5921

Issue #397 created #created-397

14 Mar at 21:46:13 GMT
Type names

The draft specifications propose the introduction of item type declarations that can associate a name with an item type. The feature probably still needs some work, which this issue aims to explore.

The main purpose of introducing named item types is that the ItemType for a record structure or a function signature can become quite complex and lengthy, and you don't want to have to repeat them every time they are used because it means you have to make the same change everywhere when a change occurs. Another motivation is to allow type definitions (for example, of records or functions) to be recursive.

I considered allowing named sequence types rather than just item types, but the rules for where you can and can't have an occurrence indicator get complicated, so I pulled back from that.

It seems natural to say:

  • Item type names are QNames
  • In XPath, type names (and their mapping to item types) appear in the static context
  • In XQuery, type names follow the conventions for global variables and function declarations. That suggests they can appear either in the main module or a library module; in a library module they must be in the namespace of the module; they can be annotated as %public or %private; an import module declaration makes the name visible in the importing module.
  • In XSLT, a name declared in a module is automatically available throughout the stylesheet package, and can be exposed to other packages using the same visibility mechanisms as other stylesheet components. However, I don't think it makes sense to allow a type name to be overridden, either using import precedence or using xsl:override.

The question then arises, should item type names be in the same "symbol space" as named atomic and union types? There seem to be several options here:

(a) Item type names are in a different symbol space from atomic types; there are no rules barring the same name being used for a named item type and an atomic type, and they are disambiguated by requiring item type names to be distinguished using some kind of marker syntax such as type(name), rather than just a bare name.

(b) Item type names are in the same symbol space as atomic types, which means there must be a rule that an item type name must not be the same as an atomic type name that is visible in the same place. We could try and define this rule for individual names, or at the level of namespaces (if there are any atomic/union types in a particular namespace in the static context of any module, then there must be no declared type names in that namespace in that module, either declared in that module or imported from another module).

(c) Atomic type names "shadow" item type names, or vice versa: if the same name is used for both, then one of them takes precedence. Probably not a good idea.

I'm inclined to go for (b). Note that a simple rule that item type names can't be in a reserved namespace will prevent conflict for all non-schema-aware applications, since those applications only access atomic types in the xs namespace.

Now, what about circular definitions?

There are legitimate circular definitions, like declare item type LIST = record(payload as item()*, next? as LIST), and there are "impossible" definitions, like declare item type THING = THING. Do we have to define the rules needed to ban "impossible" definitions, or can we just leave it that the determination of whether something is an instance of THING is non-terminating? I think we probably need to define the rules, which will require careful thought.

Where can item type names be used? The simple answer is: anywhere an ItemType is allowed. But what about contexts that only allow some ItemTypes and not others? For example, (a) "cast as", (b) as arguments of a LocalUnionType, (c) as the key type in a map type. (The solution in the current draft is that the syntax allows any ItemType to be used in these contexts, and there are semantic rules to constrain what kind of item types are allowed).

If we allow $v cast as my:X where my:X is a declared item type name, should we also allow the constructor function my:X($v)? That would presumably also mean that item type names and function names cannot overlap.

Should we define any "built-in" item type names? We've been defining built-in functions (such as build-uri and parse-uri) whose signatures use record type definitions. Should we define built-in names for these record definitions?

An editorial issue: I think it's becoming increasingly difficult to get away with overloading the word ItemType to mean both the abstract concept of an item type, and the specific BNF construct used to define it. Same for SequenceType. I think we should probably move to having a defined term "item type" and a BNF construct such as ItemTypeDesignator to represent the two separate meanings.

QT4 CG meeting 026 draft minutes #minutes-03-14

14 Mar at 17:20:00 GMT

Draft minutes published.

Pull request #396 created #created-396

13 Mar at 18:04:50 GMT
333: Deep-equal, no failure when comparing functions

Refines the spec of fn:deep-equal so that it no longer fails when comparing function items; instead it returns a result that is in general implementation-dependent, though it must be false unless the functions are provably equivalent.

Pull request #395 created #created-395

13 Mar at 11:50:35 GMT
Make the (non-)hierarchical nature of URIs explicit

The fn:parse-uri function will parse hierarchical or non-hierarchical URIs, however, the parse cannot be reversed if the fn:build-uri function doesn't know whether the scheme is hierarchical. Consider fn:parse-uri("querty:abc"):

map {
  "path":"abc",
  "scheme":"querty",
  "path-segments":["abc"],
  "uri":"querty:abc"
}

When fn:build-uri parses that map, it produces: querty://abc because the scheme is not known to be non-hierarchical.

This PR changes fn:parse-uri so that it records whether or not the URI was hierarchical and fn:build-uri to use that information.

It's possible that we could finesse this by setting the authority to the empty string for hierarchical URIs, but it seems clearer to be explicit.

This PR also fixes a bug. Previously, if the scheme was not present when building a URI, the URI began with //. That's an error. If the scheme isn't present, there should be no scheme separator.

Issue #320 closed #closed-320

13 Mar at 11:07:57 GMT

Issue 98 - add options parameter to fn:deep-equal

Pull request #394 created #created-394

13 Mar at 10:08:22 GMT
Minor correction to fn:parse-uri

The fn:parse-uri() function recognizes "URIs" of the form c:/path/to/thing as implicitly being file: URIs. This small change adds a leading "/" to make the fact that it is a path explicit.

Issue #25 closed #closed-25

13 Mar at 09:54:10 GMT

[XPath] `%variadic("sequence")` does not allow specifying some argument values in the variadic sequence, and in one case even not the variadic sequence itself

Issue #26 closed #closed-26

13 Mar at 09:52:54 GMT

[XPath] A value in the last row (for "sequence-variadic" functions) of the table "Number of Arguments allowed in a Function Call" is incorrect

Issue #54 closed #closed-54

13 Mar at 09:51:12 GMT

[XPath] [XQuery] Keyword arguments don't work with all parameters/keys in static functions.

Issue #47 closed #closed-47

13 Mar at 09:48:51 GMT

[XPath] [XQuery] Allow argument placeholders on keyword arguments

QT4 CG meeting 026 draft agenda #agenda-03-14

13 Mar at 08:45:00 GMT

Draft agenda published.

Issue #386 closed #closed-386

13 Mar at 09:26:50 GMT

Action QT4CG-025-05 (markup typo)

Pull request #393 created #created-393

12 Mar at 22:07:08 GMT
Clarify explanations of functions/function items

This PR is purely editorial in the sense that it does not attempt to make any changes that would affect an implementation. It's intended to clear up ambiguity and lack of clarity in the description of operations on functions, in particular the way that a function item captures static and dynamic context. It addresses issues #239 and #392.

Issue #392 created #created-392

12 Mar at 18:56:05 GMT
Partial function application: Placeholders with keywords

It's clear that the following is allowed:

format-date(current-date(), '[Y]-[M]-[D]', place:=?, language:=?, calendar:="AD")

The resulting function item takes two arguments (place and language) but in what order? Is it the order of parameters in the original function definition, or the order in which they appear in the partial function application?

I think it should be the latter, but this needs to be made explicit in the spec.
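
The "latter" option (parameters ordered as they appear in the partial application) can be modelled in Python. This is a sketch by analogy: PLACEHOLDER, partial_apply and describe are hypothetical names, and the parameter list of describe only loosely imitates format-date.

```python
PLACEHOLDER = object()  # plays the role of "?"

def partial_apply(fn, *positional, **keywords):
    """Bind the given arguments; placeholders become positional parameters
    of the result, in the order they were written in the call."""
    slots = [name for name, value in keywords.items() if value is PLACEHOLDER]
    bound = {name: value for name, value in keywords.items()
             if value is not PLACEHOLDER}

    def applied(*args):
        if len(args) != len(slots):
            raise TypeError(f"expected {len(slots)} arguments, got {len(args)}")
        return fn(*positional, **bound, **dict(zip(slots, args)))

    return applied

def describe(value, picture, language="en", calendar="AD", place="UTC"):
    return f"{value}|{picture}|{language}|{calendar}|{place}"

# Placeholders written as (place, language), although the declaration
# lists language before place.
f = partial_apply(describe, "2023-03-12", "[Y]-[M]-[D]",
                  place=PLACEHOLDER, language=PLACEHOLDER, calendar="AD")
print(f("Europe/London", "de"))
```

Here the first argument of f supplies place and the second supplies language, following the order of the placeholders and not the order of the declaration.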

Note that this doesn't only apply to optional parameters as in the above example, it applies equally, for example to

starts-with(substring := ?, value := ?)

While we're on the subject, we should also ask whether

concat(value83 := ?)

is legal, and if so, what it means.

Pull request #391 created #created-391

11 Mar at 05:25:50 GMT
addressed typographical errors; adjusted Unicode character discussion…

… for internal local consistency, clarity

This being my first PR, I opted to include, beyond the typos I noted in #289, another small block of hopefully uncontroversial edits as a trial balloon.

Issue #278 closed #closed-278

10 Mar at 18:04:19 GMT

array bound checking

Issue #289 closed #closed-289

10 Mar at 18:03:47 GMT

Proposal to add fallback behaviour to map:get and array:get

Issue #390 created #created-390

10 Mar at 17:59:35 GMT
Should parsing and building URIs attempt to special case Windows URIs for UNC names?

Depending on the platform and language APIs involved, we see file: URIs encoded in a variety of different ways. It doesn't help that there's no official RFC for file: URIs.

  • file:/path/part is a file: URI with no host and a path of /path/part.
  • file:///path/part is a file: URI with an explicitly empty host and a path of /path/part.
  • file://path/part is a file: URI with an authority of path and a path of /part. I think one common way to interpret this is as if it was file:/part. That is, in file: URIs, although a different host is possible, it's often just ignored.
  • c:\path\part is most usefully interpreted as file:/c:/path/part, a file: URI with no host and a path of /c:/path/part. These are only going to be useful on a Windows system, so it isn't a problem to treat them the same way on all platforms. (Aside: I don't actually know if the path part should be c:/path/part instead, but it's currently got the leading slash in fn:parse-uri().)

And then there's this: file:////name/path/part.

One interpretation is, "look, we accept file:/ and file:/// so let's just accept file:// and file://///////, etc. as the same." And I think that's generally right, with the single special exception of file:////. The problem is that on Windows, this is a very common way to encode the URI for a UNC path, that is: \\name\path\part which is a Windows UNC path for \path\part on a host named name (via whatever networking protocol backs UNC).

You'd think that this should be file://name/path/part, but I think because browsers and maybe other tools just discard the authority part of a file: URI (or maybe because these are paths in some Windows sense?), that's not how they're encoded.

Aside: Yes, I'm sure you also see file:\\\\name\path\part and file:c:\path\part and other forms as well. Those are out of scope, they're simply, flatly, completely wrong. You can't use \ as a delimiter in a URI. RFC 3986 is authoritative on this point. Step one of dealing with random strings we think should be URIs is replacing all \ with / because RFC 3986.

It's problematic to deal with file://// as a special case, but it's also problematic to leave out support for a common pattern on a widely deployed operating system.

Recognizing four slashes after file: and treating that specially isn't hard. The hard part is how do we encode this in the map that fn:parse-uri produces bearing in mind that the result should round-trip if you push it back through fn:build-uri.

Consider file:////uncname/path/part

Today, that is parsed as:

map {
  "uri": "file:////uncname/path/part",
  "scheme": "file",
  "authority": "uncname",
  "host": "uncname",
  "path": "/path/part",
  "path-segments": array { "", "path", "part" }
}

and that doesn’t round trip. If you feed that to fn:build-uri, you get file://uncname/path/part and that absolutely doesn’t mean the same thing on a Windows machine.

We could encode the slashes in the authority, in which case we also have to encode them in the host because, in the presence of host, the authority isn’t used by fn:build-uri():

map {
  "uri": "file:////uncname/path/part",
  "scheme": "file",
  "authority": "////uncname",
  "host": "////uncname",
  "path": "/path/part",
  "path-segments": array { "", "path", "part" }
}

It kind of works, but it’s really ugly and it means we have a host value that is a complete kludge. It doesn’t match the RFC rules for hostnames at all.

The other option that occurs to me is to add a “unc-path” property to the map:

map {
  "unc-path": true(),
  "uri": "file:////uncname/path/part",
  "scheme": "file",
  "authority": "uncname",
  "host": "uncname",
  "path": "/path/part",
  "path-segments": array { "", "path", "part" }
}

That works but it introduces all sorts of possibilities for incoherent data, such as an https: URI with a unc-path flag set to true().
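
To make the round-trip requirement concrete, here is a minimal Python sketch of the "unc-path" property idea. It is not fn:parse-uri or fn:build-uri: parse_file_uri and build_file_uri are hypothetical helpers that handle only file: URIs, and the treatment of other slash counts is a simplifying assumption.

```python
def parse_file_uri(uri):
    """Split a file: URI into a dict, flagging the four-slash UNC form."""
    uri = uri.replace("\\", "/")  # backslash is never a URI delimiter
    if not uri.startswith("file:"):
        raise ValueError("only file: URIs are handled in this sketch")
    rest = uri[len("file:"):]
    body = rest.lstrip("/")
    slashes = len(rest) - len(body)
    parts = {"uri": uri, "scheme": "file"}
    if slashes in (2, 4):
        # Two slashes introduce an authority; four are read as a UNC name.
        host, _, path = body.partition("/")
        parts["authority"] = parts["host"] = host
        parts["path"] = "/" + path
        if slashes == 4:
            parts["unc-path"] = True
    else:
        parts["path"] = "/" + body
    parts["path-segments"] = parts["path"].split("/")
    return parts

def build_file_uri(parts):
    """Rebuild the URI, restoring the four slashes for UNC paths."""
    host = parts.get("host")
    if host is None:
        return "file:" + parts["path"]
    prefix = "////" if parts.get("unc-path") else "//"
    return "file:" + prefix + host + parts["path"]

unc = parse_file_uri("file:////uncname/path/part")
print(unc["host"], unc["path"], unc["path-segments"])
print(build_file_uri(unc))
```

The flag is only consulted when building a file: URI here, which sidesteps (but does not resolve) the question of what the flag would mean on other schemes.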

What’s the right answer?

  1. Ignore the UNC path special case, it’s the user’s problem to deal with them.
  2. Recognize them, encode the details in the authority and host.
  3. Recognize them, use a special property like unc-path.
  4. Recognize them, and do this other much better idea I have: ________________

Issue #389 created #created-389

09 Mar at 09:31:07 GMT
The fn:build-uri function needs to perform URI encoding for path and query segments

The fn:parse-uri function describes decoding, but the fn:build-uri function fails to encode.
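
A short Python sketch shows why encoding matters when a path is built from decoded segments. It uses urllib.parse as a stand-in for the behaviour being requested; the segment values are invented.

```python
from urllib.parse import quote, unquote

# Decoded path segments, as a parser might report them.
segments = ["", "my docs", "a/b", "100%"]

# Without encoding, the "/" inside "a/b" would be read as a separator
# and the "%" as the start of an escape sequence.
path = "/".join(quote(segment, safe="") for segment in segments)

# Splitting and decoding the encoded path restores the original segments.
round_tripped = [unquote(part) for part in path.split("/")]

print(path)
print(round_tripped == segments)
```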

Issue #388 closed #closed-388

08 Mar at 16:35:31 GMT

Update the example background color in serialization

Pull request #388 created #created-388

08 Mar at 16:25:12 GMT
Update the example background color in serialization

This PR completes my action to fix the dark blue background in examples in the serialization spec. I've made them the same as the examples in the XSLT spec which seem to have been satisfactory.

Issue #328 closed #closed-328

07 Mar at 21:08:45 GMT

Switch Cases: Lift single-item restriction on operands

Issue #28 closed #closed-28

07 Mar at 21:05:42 GMT

[XPath] Support multiple clauses in ForExpr and LetExpr.

Pull request #387 created #created-387

07 Mar at 21:03:39 GMT
Add compatibility notes for fn:namespace-uri-for-prefix

Action QT4CG-024-01

Pull request #386 created #created-386

07 Mar at 20:32:18 GMT

Action QT4CG-025-05 (markup typo)

Pull request #385 created #created-385

07 Mar at 20:15:34 GMT
Actions QT4CG-025-07 / -08

Improves termdef markup; adds error code; updates change history appendix.

Issue #344 closed #closed-344

07 Mar at 17:22:44 GMT

Issue 22: allow "for"/"let" keyword to be repeated in XPath

Issue #307 closed #closed-307

07 Mar at 17:20:21 GMT

Parsing and building URIs comments and queries

Issue #347 closed #closed-347

07 Mar at 17:20:19 GMT

Attempt to clarify fn:parse-uri and fn:build-uri

Issue #355 closed #closed-355

07 Mar at 17:19:47 GMT

Action QT4CG-022-02 - add to imp-def-feature appendix

Issue #370 closed #closed-370

07 Mar at 17:19:15 GMT

Bump XSLT version

Issue #345 closed #closed-345

07 Mar at 17:18:52 GMT

Missing rule for matching atomic values against atomic types

Issue #363 closed #closed-363

07 Mar at 17:18:36 GMT

Fix issue #345 - missing rules for type matching

Issue #364 closed #closed-364

07 Mar at 17:18:13 GMT

Generalize switch expressions in XQuery (issue #328)

Issue #371 closed #closed-371

07 Mar at 17:17:34 GMT

Issue 370: forwards and backwards compatibility for 4.0

QT4 CG meeting 025 draft minutes #minutes-03-07

07 Mar at 17:15:00 GMT

Draft minutes published.

Issue #147 closed #closed-147

07 Mar at 17:13:54 GMT

Terse syntax for map entries

Issue #60 closed #closed-60

07 Mar at 17:13:38 GMT

[FO] fn:namespace-uri-for-prefix no longer supports passing a prefix by string

Issue #45 closed #closed-45

07 Mar at 17:13:13 GMT

Second parameter of fn:sum must be neutral element for +

Issue #384 created #created-384

06 Mar at 16:24:20 GMT
Definition of "effective value" in XSLT

The term "effective value" is defined in XSLT with a rather narrow definition in the context of attribute value templates. The term is used throughout the spec (sometimes hyperlinked, sometimes not) in a much more general sense, for example the "effective value" of an attribute is the explicit value given to the attribute, or the value after basic normalization such as whitespace stripping, or the default value if the attribute is not present.

This affects the determination of the correct result for test merge-021, where it is a little ambiguous whether two xsl:merge-source/@order attributes have the same "effective value" given that one is defaulted.

QT4 CG meeting 025 draft agenda #agenda-03-07

05 Mar at 11:15:00 GMT

Draft agenda published.

Issue #383 created #created-383

28 Feb at 17:21:13 GMT
fn:deep-equal: Order of child elements (unordered-elements)

At meeting 024 where PR https://github.com/qt4cg/qtspecs/pull/320 was accepted, there remained an open question of how best to specify that in some circumstances the comparisons should be made without regard to the order of (some) children.

Can the name of the option be improved?

Should the option support wildcard names?

Issue #382 created #created-382

28 Feb at 17:19:32 GMT
Improve whitespace handling in deep-equal

At meeting 024 where PR https://github.com/qt4cg/qtspecs/pull/320 was accepted, there remained an open question of how to deal with whitespace.

The current options can be seen as having somewhat overlapping domains. Can this be improved?

Issue #381 created #created-381

28 Feb at 17:18:00 GMT
Deep-equal comparisons without errors

At meeting 024 where PR #320 was accepted, there remained an open question of how to deal with errors.

On the one hand, in order for fn:deep-equal to be most easily used as a comparison function in the many contexts where a comparison function is required, it would be best if it simply returned false() rather than raising an error when incomparable items are encountered.

On the other hand, making "return false()" the default will mean that it is possible to construct items that are not equal to themselves, which will certainly violate the expectations of some users.

This conflict needs to be resolved somehow.
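The tension can be illustrated with a small Python model (not XPath). It is only a sketch: callables stand in for items that cannot be compared, such as function items, and the names strict_deep_equal and lenient_deep_equal are hypothetical.

```python
def strict_deep_equal(a, b):
    """Compare two values; raise TypeError for items that cannot be compared."""
    if callable(a) or callable(b):
        raise TypeError("items are not comparable")
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            strict_deep_equal(x, y) for x, y in zip(a, b)
        )
    return a == b


def lenient_deep_equal(a, b):
    """Same comparison, but report False instead of raising."""
    try:
        return strict_deep_equal(a, b)
    except TypeError:
        return False


# 'len' models a function item sitting inside a sequence.
item = [1, len]
print(lenient_deep_equal(item, item))  # False: the item is not equal to itself
```

With the lenient variant as the default, the comparison is easy to pass around as a callback, but reflexivity is lost for any value that contains an incomparable item.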

QT4 CG meeting 024 draft minutes #minutes-02-28

28 Feb at 17:15:00 GMT

Draft minutes published.

Issue #377 closed #closed-377

28 Feb at 14:49:28 GMT

Published XQuery 4.0 spec renders XML predefined entities instead of literal characters

Issue #380 closed #closed-380

28 Feb at 14:49:27 GMT

Removed CDATA sections around markup

Pull request #380 created #created-380

28 Feb at 14:23:20 GMT
Removed CDATA sections around markup

Fix #377

I took a minimal approach here. I've removed CDATA sections where the section contained an & but not a <.

  1. If the section contains <, then it's (presumably) necessary to escape the markup
  2. If the section does not contain an &, then it's irrelevant. But not removing it limits the number of places changed by the script
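The selection rule described above can be sketched in Python as follows. This is not the script used for the PR; the function name unwrap_safe_cdata and the regular expression are assumptions made for illustration.

```python
import re

# Matches one CDATA section and captures its body (non-greedy, across lines).
CDATA = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def unwrap_safe_cdata(xml_text):
    """Remove the CDATA wrapper only where the body has an '&' but no '<'."""

    def replace(match):
        body = match.group(1)
        if "&" in body and "<" not in body:
            return body            # drop the wrapper, keep the body unchanged
        return match.group(0)      # leave every other section untouched

    return CDATA.sub(replace, xml_text)


print(unwrap_safe_cdata("<eg><![CDATA[a &lt; b]]></eg>"))
```

Sections containing a literal < keep their wrapper because the markup would otherwise need escaping, and sections without & are left alone to keep the diff small.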

Issue #379 created #created-379

28 Feb at 12:09:11 GMT
Namespace handling in parse-html

The HTML5/"Living Standard" specification has two modes when it comes to handling namespaces:

  1. For XHTML content the document is parsed as XML with full namespace support.
  2. For HTML content, it has pseudo-namespace support.

For example, the HTML parsing algorithm:

  1. places html, svg, and mathml elements in their corresponding namespaces.
  2. allows certain element/attribute tag names (e.g. xlink:href) to be parsed as QNames.

From the XSLT/XQuery perspective, this affects the data model. Specifically, how to model and specify the node-names and the set of namespaces associated with a given element node.

Pull request #378 created #created-378

28 Feb at 08:55:01 GMT
Update the localName and unparsed entity reference notes for parse-html

This PR applies the following changes:

  • [x] QT4CG-021-03: RD to change must to will in DOM notes about lowercase
  • [x] QT4CG-021-04: RD to revise and move the note about unrecognized entities

Issue #377 created #created-377

28 Feb at 07:22:53 GMT
Published XQuery 4.0 spec renders XML predefined entities instead of literal characters

When rendered in the browser, XML examples in the XQuery 4.0 specification show, for example, '&lt;' instead of '<':

Screenshot 2023-02-28 at 07 21 49

Issue #376 created #created-376

27 Feb at 17:16:37 GMT
add documentation prefix attribute to xsl:stylesheet

Although the addition of xsl:note is very welcome, I had been hoping for something like the xsl:stylesheet attribute extension-element-prefixes, e.g. ignored-element-prefixes.

The specification would be something like, Elements and attributes associated with an ignored element prefix are not treated as direct constructors, and are removed when the stylesheet is compiled. For such an element, this is equivalent to having an xsl:use-when attribute with value false on the element; for attributes, they are simply discarded along with their value.

It is not an error for a prefix to be listed both as an ignored element prefix and as an extension element prefix; the result is implementation dependent in this case, but it MUST result in either an extension being invoked or the element or attribute being ignored.

Ignored elements may appear anywhere in the input tree, and ignored attributes may appear on any element.

Example:

<xsl:template match="city/park" css:module="main">
  <css:rule>
    color: green;
    trees: tall;
  </css:rule>
  <div class="park">
    <xsl:apply-templates />
  </div>
</xsl:template>
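The effect of the proposed attribute can be approximated outside XSLT. The Python sketch below strips every element and attribute in one "ignored" namespace from a parsed tree; the namespace URI, the sample document, and the function name strip_ignored are all hypothetical, and a real processor would do this at stylesheet compile time.

```python
import xml.etree.ElementTree as ET

CSS_NS = "http://example.com/css"  # hypothetical namespace bound to an ignored prefix

SOURCE = """
<template xmlns:css="http://example.com/css" css:module="main">
  <css:rule>color: green;</css:rule>
  <div class="park"/>
</template>
"""


def strip_ignored(root, ignored_ns):
    """Drop elements and attributes whose name is in the ignored namespace."""
    marker = "{" + ignored_ns + "}"
    # Snapshot the tree first so that removals do not disturb the iteration.
    for parent in list(root.iter()):
        for name in [a for a in parent.attrib if a.startswith(marker)]:
            del parent.attrib[name]
        for child in [c for c in parent if c.tag.startswith(marker)]:
            parent.remove(child)
    return root


root = strip_ignored(ET.fromstring(SOURCE), CSS_NS)
print([child.tag for child in root], root.attrib)
```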

Pull request #375 created #created-375

27 Feb at 17:03:04 GMT
256: Context for default parameter values

This is an attempt to resolve issue #256 by providing details of the static and dynamic context for evaluating default parameter values, including providing a mechanism for accessing parts of the static and dynamic context of the caller.

If this PR is accepted we will need to follow up with (a) similar changes to XSLT, and (b) use of the new notation in the signatures of standard functions and operators that have context-dependent default values for parameters.

Note that the PR also breaks up the rather unwieldy sections for Function Declarations and Variable Declarations into more manageable subsections, which has involved some re-ordering; some of the change marking may therefore be spurious.

Issue #374 created #created-374

27 Feb at 10:11:31 GMT
Can't view the XSD for XSLT in the browser

If you attempt to open https://qt4cg.org/specifications/xslt-40/schema-for-xslt40.xsd in the browser (in Firefox), you'll get:

Error loading stylesheet: An unknown error has occurred (805303f4)
http://www.w3.org/2008/09/xsd.xsl

In a Chrome-derived browser I get a blank screen on which even the context menu doesn't work. Digging about in the inspect window leads me to

Unsafe attempt to load URL http://www.w3.org/2008/09/xsd.xsl from frame with URL
https://qt4cg.org/specifications/xslt-40/schema-for-xslt40.xsd. Domains, protocols and ports must match.

I conclude that the problem is trying to load the XSL for XSD from a different domain. Boo. I guess we should copy those stylesheets to qt4cg.org, or remove the stylesheet PI, or ignore the whole thing on the assumption that we'll eventually publish these specifications in some W3C location and the problem will go away. Maybe.

Issue #373 created #created-373

27 Feb at 09:29:57 GMT
apparent copy/paste error in annotation documentation of simple type yes-or-no-or-maybe

The XSD 1.1 schemas for XSLT 3 and for XSLT 4 (the latter at https://qt4cg.org/specifications/xslt-40/schema-for-xslt40.xsd) have an error in the annotation/documentation section of the simple type yes-or-no-or-maybe: it says One of the values "yes" or "no" or "omit". I think that should be One of the values "yes" or "no" or "maybe". The error probably arose because someone copied the text from the yes-or-no-or-omit type declaration and forgot to adapt the description.

Issue #372 created #created-372

26 Feb at 23:06:13 GMT
Separate default namespace for elements from the default namespace for types

Currently the static context provides a "default namespace for elements and types". It's not at all clear why these should be the same. For types, the vast majority of QNames representing types are in the XML Schema namespace, which is never used for elements.

In the current 4.0 drafts the two default namespaces are separated; but this has not been reviewed or agreed by the CG. This issue is raised for discussion of the change, and I will also review the design to see whether it still makes sense.

Some observations on the current text for XQuery:

  1. In section 2.2.1 (static context) it would be good to give a bit more detail (if only as a forwards reference) about the circumstances in which the default element namespace and the default type namespace are used.
  2. In 3.4 Sequence Types the sentence "[Lexical QNames] appearing in a [sequence type] have their prefixes expanded to namespace URIs by means of the [statically known namespaces] and (where applicable) the [default element namespace] or [default type namespace]" is rather inelegantly worded. If there is a prefix, then the statically known namespaces are used; if there is none, then the relevant default namespace is used, and it would be nice to explain more clearly which one applies.
  3. In 3.6 Item Types, we need to be clearer about references to named/declared item types, and about how the names are resolved. Do we really want these names to be in the same symbol space as atomic types? Perhaps we should have a rule that Item Types (like functions) must be in a namespace and this must not be the same as an imported schema namespace.
  4. In 5.14, Default namespace declaration, there seems to be duplication between the two paragraphs starting "for backwards compatibility reasons".
  5. Appendix C.1 (much though I dislike it) should say something about the initialisation of the default namespaces for elements and for types.

Note that issue #65 talks of the need for different default namespaces for input and output elements. I think that's a separate issue.

Observations on the current text for XSLT:

  1. In 5.1.2.1 the paragraph "The [xsl:]xpath-default-namespace attribute must be in the [XSLT namespace] if and only if its parent element is not in the XSLT namespace" needs to be generalised to [xsl:]default-element-namespace. In fact, this rule should move to the parent section 5.1.2, which needs an introduction.

Pull request #371 created #created-371

24 Feb at 16:59:12 GMT
Issue 370: forwards and backwards compatibility for 4.0

This is essentially editorial; it updates the XSLT rules for forwards and backwards compatible processing to acknowledge the fact that the current version is now 4.0.

Issue #370 created #created-370

24 Feb at 16:18:07 GMT
Bump XSLT version

There are various places where the XSLT spec refers to XSLT 3.0 where it should now refer to 4.0.

QT4 CG meeting 024 draft agenda #agenda-02-28

24 Feb at 08:25:00 GMT

Draft agenda published.

Issue #19 closed #closed-19

22 Feb at 21:34:21 GMT

[xslt] annotation-prefixes

Issue #84 closed #closed-84

22 Feb at 21:32:36 GMT

Proposal : allow ignorable <xsl:div> wrapper for documentation or organize the code

Issue #189 closed #closed-189

22 Feb at 21:29:24 GMT

Adopt the coercion rules for variables in XQuery

Issue #352 closed #closed-352

22 Feb at 21:24:40 GMT

The @array attribute of xsl:for-each-group is no more

Issue #354 closed #closed-354

22 Feb at 10:08:23 GMT

Combine multiple signatures of XSLT functions to use defaults

Issue #353 closed #closed-353

22 Feb at 10:07:34 GMT

Issue109 xsl note

Issue #362 closed #closed-362

22 Feb at 10:06:31 GMT

Drop obsolete note in XSLT regarding for-each-group/@array

QT4 CG meeting 023 draft minutes #minutes-02-21

22 Feb at 09:53:00 GMT

Draft minutes published.

Issue #369 created #created-369

21 Feb at 17:44:13 GMT
Namespaces for Functions

What problem are we trying to solve? Essentially, I think "namespace clutter".

Namespace clutter manifests itself in several different ways.

  • Firstly, declaration clutter in source code. Here's the start of a module in an XSLT Stylesheet of medium complexity:
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    exclude-result-prefixes="#all"
    version="3.0" 
    xmlns="http://ns.saxonica.com/xslt/export" 
    xmlns:doc="http://www.saxonica.com/ns/documentation"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map" 
    xmlns:ex="http://ns.saxonica.com/xslt/export" 
    xmlns:f="MyFunctions" 
    xmlns:t="MyTypes"
    expand-text="true">

Eight namespace declarations here, of which 3 are concerned with functions; and it can get a lot worse than that.

  • Secondly, namespace clutter in the static and dynamic context. The namespace bindings shown above don't disappear when the code is compiled; even with exclude-result-prefixes="#all", they have to hang around at run-time just in case someone tries to resolve a QName dynamically. Preserving the namespace context in the expression tree through optimization rewrites is a significant cost that has no user benefit; very rarely are they actually going to use the namespace context at run time.
  • Thirdly, prefix clutter in the executable code. Writing math:cos(math:cos($x)) is just so clumsy compared with cos(cos($x)).

I think there are a number of things we can do to reduce this.

First, separate out the namespace context for static resolution of function names as a separate part of the static context, used only for this purpose. Ensure that there is no functionality that depends on knowing this part of the static context at run time, so it can be discarded by the compiler as soon as function names are resolved. Then provide source syntax for binding function prefixes to function namespaces in XSLT and XQuery to populate this part of the static context; there is no reason this has to be done using XML namespace declarations. There is also no reason for having different bindings in force in different parts of a single module. And once we've separated these declarations from XML namespace declarations, there's no reason why we can't provide default bindings. We could also allow bindings to have cross-module scope to reduce duplicated code. Note: the xsl:function-library proposal in the current XSLT 4.0 draft tries to achieve some of these things.

Second, allow functions to be referenced by local name alone where the reference is unambiguous; and perhaps provide some aliasing mechanisms to make more existing names unambiguous.
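A minimal Python sketch of "resolve by local name where unambiguous" follows. The registry contents and the names build_index and resolve are illustrative assumptions; arity and aliasing are deliberately ignored.

```python
from collections import defaultdict

# A tiny registry of (namespace URI, local name) pairs, for illustration only.
FUNCTIONS = [
    ("http://www.w3.org/2005/xpath-functions/math", "cos"),
    ("http://www.w3.org/2005/xpath-functions/map", "size"),
    ("http://www.w3.org/2005/xpath-functions/array", "size"),
]


def build_index(functions):
    """Group the known namespaces by local function name."""
    index = defaultdict(list)
    for namespace, local in functions:
        index[local].append(namespace)
    return index


def resolve(local, index):
    """Return the expanded name if exactly one function has this local name."""
    namespaces = index.get(local, [])
    if len(namespaces) == 1:
        return (namespaces[0], local)
    if not namespaces:
        raise LookupError(f"unknown function: {local}")
    raise LookupError(f"ambiguous function name: {local}")


index = build_index(FUNCTIONS)
print(resolve("cos", index))
```

Because the lookup is purely static, the index can be thrown away once all names are resolved, which is the property the first suggestion asks for. In this model size stays ambiguous, which is where an aliasing mechanism or a combined size() function would come in.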

We've explored a third idea, which is to introduce some kind of polymorphism where function names have local scope and are distinguished by the types of objects to which they are applied. I think that given our type system, this is very hard to achieve and I haven't seen any very satisfactory proposals. We also need to remember that there are considerable costs if we start resolving function names dynamically at run time. I wouldn't rule out making progress in this direction, but I'm not optimistic of coming up with a workable solution. There might be some simple things we could do, like having a single function size() that performs the work of both map:size() and array:size() depending on the argument.

Issue #318 closed #closed-318

21 Feb at 17:02:29 GMT

Serialization HTML/XHTML output methods: meta elements and the charset attribute

Pull request #368 created #created-368

21 Feb at 10:34:36 GMT
129: Context item generalized to context value

This is a first cut proposal to generalize the context item to a context value, allowing (for example) array predicates.

The proposal covers XPath and XQuery only at this stage; it doesn't address the consequences for XSLT.

Careful review requested!

Addresses issue #129 and issue #367.

Issue #367 created #created-367

21 Feb at 00:06:18 GMT
Focus for RHS of thin arrow expressions

We define A -> F(B, C) as being equivalent to A ! F(., B, C) which means that B and C are evaluated with a focus based on the current item in A, not with the outer focus. This is different from the => operator. For example if $E is an element E, with several children called F, then

namespace-uri(.) -> fn:QName(name())

has a different effect from

namespace-uri(.) => fn:QName(name())

whereas it might reasonably be expected that in the case where the LHS produces a single value, the two operators are equivalent. We can't change the meaning of => because it's defined in 3.1. So should we change the meaning of -> to fall into line?

We could do this easily enough by defining A -> F(B, C) as equivalent to for $a in A return F($a, B, C). I think that as well as being more consistent with =>, the result is probably more intuitive. (We could also define it as equivalent to let $f := F(?, B, C) return A ! $f(.))
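The difference between the two readings can be modelled in Python by passing the focus explicitly. This is only a sketch: expressions are modelled as functions of the focus, and all names (arrow_inner_focus, arrow_outer_focus, the dictionary standing in for an element) are hypothetical.

```python
def arrow_inner_focus(lhs_items, f, *arg_exprs):
    """A -> F(B, C) read as A ! F(., B, C): B and C see each item of A as focus."""
    return [f(item, *(expr(item) for expr in arg_exprs)) for item in lhs_items]


def arrow_outer_focus(lhs_items, f, outer_focus, *arg_exprs):
    """A -> F(B, C) read as for $a in A return F($a, B, C): B and C keep the outer focus."""
    return [f(item, *(expr(outer_focus) for expr in arg_exprs)) for item in lhs_items]


# Model of: namespace-uri(.) -> fn:QName(name())
element = {"name": "E", "namespace": "http://example.com/ns"}


def namespace_uri(focus):
    return focus["namespace"]


def name(focus):
    return focus["name"]  # only works when the focus is a node


def make_qname(uri, local):
    return (uri, local)


lhs = [namespace_uri(element)]
print(arrow_outer_focus(lhs, make_qname, element, name))
```

Under the inner-focus reading, name() is evaluated with the namespace URI string as its focus, which fails in this model just as name() applied to a non-node would be a type error.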

For the expression A -> {B}, and for the proposed A => {B}, I don't think we have any choice other than evaluating B with an inner focus based on A. But at least we can do it consistently for both operators.

Issue #366 created #created-366

20 Feb at 18:43:25 GMT
Support xsl:use-package with xsl:package-location

Unless I am misreading the specs (which I do commonly enough), there is currently no way for an XSLT writer using xsl:use-package to indicate where the package is to be found, except outside the XSLT environment. I propose to allow xsl:use-package to contain zero or more xsl:package-location children. I propose the addition of an element and not an attribute, because a package may be in multiple locations, and may need nuance, as noted below.

Attributes:

  • @href, on the model of xsl:import and xsl:include, would specify by relative or absolute URI where the package is.
  • @priority (default 0) would provide a mechanism to indicate whether the specified xsl:package-location should override (value greater than 0), or simply provide a fallback for (less than or equal to 0), the preconfigured place the package should be retrieved from.
  • @use-when would allow a developer to manage different versions of a package for different cases.

Other attributes given to xsl:package-location would need discussion.

Issue #365 created #created-365

20 Feb at 16:41:34 GMT
switch, typeswitch: Optional braces

The indentation of switch expressions is often a mess. Now that we allow curly braces for if, it would be nice to also allow optional braces for switch and typeswitch:

typeswitch($item) {
  case xs:numeric return 'number'
  default return '...'
},
switch($item) {
  case 0 to 9 return 'single digit'
  default return '...'
}

The current syntax is:

typeswitch($item)
  case xs:numeric return 'number'
  default return '...',

switch($item)
  case 0 to 9 return 'single digit'
  default return '...'

Pull request #364 created #created-364

20 Feb at 15:43:59 GMT

Generalize switch expressions in XQuery (issue #328)

Issue #337 closed #closed-337

20 Feb at 14:38:11 GMT

Local union and enum types: and the definition of generalised atomic types

Pull request #363 created #created-363

20 Feb at 13:07:43 GMT

Fix issue #345 - missing rules for type matching

Pull request #362 created #created-362

20 Feb at 12:12:08 GMT
Drop obsolete note in XSLT regarding for-each-group/@array

Fixes issue #352

Issue #361 created #created-361

20 Feb at 11:38:00 GMT
Named arguments: $input vs. $value

Great effort has been made in unifying the parameter names of the XQFO standard; thanks for that!

If I remember correctly:

  • $value, $values, $value1, etc. is used for atomic/atomized arguments, whereas
  • $input, $input1, etc. is used for input, mostly of type item(), that is processed unchanged.
  • $uri is used for arguments that could have been defined as items of type xs:anyURI.

I believe the following argument names need to be double-checked (if not, it may be that I haven’t fully grasped how the naming rules are supposed to work):

Function | Currently | Presumably | Justification
--- | --- | --- | ---
array:slice | $input | $array | Alignment with array:size et al.
trace | $value | $input | Argument is not atomized
json | $input | $value | Argument is atomized
string | $item | $value | $item is used nowhere else
expanded-QName | $qname | $value | Alignment with prefix-from-QName et al.
resolve-QName | $qname | $value | Alignment with prefix-from-QName et al.
parse-QName | $eqname | $value | Alignment with parse-xml et al.
parse-json | $json | $value | Alignment with parse-xml et al.
json-to-xml | $json | $value | Alignment with parse-xml et al.
char | $name | $value | Input may also be codepoint values, etc.
namespace-uri-for-prefix | $prefix | $value | $prefix is used nowhere else
resolve-uri | $relative | $uri | Absolute URIs are legal as well
array:append | $add | $member | Alignment with array:put

And we should probably pay particular attention to the naming conventions when adding new functions.

Pull request #360 created #created-360

20 Feb at 11:24:36 GMT
Issue 314 array composition and decomposition

This PR addresses parts of issue 29, issue 113, and issue 314 relating to the composition and decomposition of arrays.

It introduces two functions, array:of for array composition and array:members for decomposition, and defines all other array functions in terms of these two primitives (replacing the internal functions op:A2S and op:S2A). The items in the decomposed form of an array are called "value records": singleton maps of the form map{'value': $value}.
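As a rough Python model of the two primitives (not the specification text): arrays are modelled as lists, sequences as tuples, and a value record as a one-entry dict. The function names below are hypothetical stand-ins for array:members, array:of, and one derived function.

```python
def array_members(array):
    """Decompose an array into a sequence of value records."""
    return [{"value": member} for member in array]


def array_of(records):
    """Compose an array from a sequence of value records."""
    return [record["value"] for record in records]


def array_reverse(array):
    """A derived function defined purely in terms of the two primitives."""
    return array_of(reversed(array_members(array)))


# Members may be empty or multi-item sequences; the record keeps each one intact.
sample = [1, (), (2, 3)]
print(array_members(sample))
print(array_reverse(sample))
```

Wrapping each member in a record is what lets an array of sequences travel through a flat sequence without its members merging.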

The function array:from-sequence is renamed array:build to reflect its symmetry with map:build.

Question for the group: should we have a new function for constructing a "value record", or is the syntax map{'value': $value} adequate for the purpose?

Issue #359 created #created-359

20 Feb at 09:53:03 GMT
fn:void: Absorb result of evaluated argument

Summary

Absorb the result of the evaluated argument.

Signature

fn:void(
  $input as item()*
) as empty-sequence()

Motivation

Developers tend to get creative if they want to suppress the result of an expression. The reason is that there is no simple solution to do this properly. Some constructs I have seen in practice:

let $unused := EXPRESSION
return 'ok'

EXPRESSION[position() = 10000], 'ok'

let $result := 'ok'
return if(exists(EXPRESSION)) then $result else $result

Cases like this are frequent in nondeterministic code. Think e.g. of side-effecting functions of the EXPath HTTP-Client and File Modules: The function results are not always relevant for the invoking application, or already known.

The function is also helpful during development and for testing code. fn:void#1 and fn:identity#1 can both be passed on to functions to either return or ignore the result of their arguments. The function can potentially be used to measure the runtime performance of an expression (but an implementation should not be prevented from discarding the function call if the argument expression is deterministic).
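The idea of passing either fn:void#1 or fn:identity#1 to a caller can be modelled in Python. This is an analogy only; run and side_effect are hypothetical names, and Python's empty tuple stands in for the empty sequence.

```python
def void(_input=()):
    """Absorb the argument and return the empty sequence."""
    return ()


def identity(value):
    """Return the argument unchanged."""
    return value


calls = []


def side_effect():
    """Stands in for a side-effecting call such as an HTTP request."""
    calls.append("called")
    return "response body"


def run(action, handle_result):
    """Evaluate the action, then let the caller decide what to do with the result."""
    return handle_result(action())


print(run(side_effect, void))      # ()
print(run(side_effect, identity))  # response body
```

In both calls the side effect still happens; only the visibility of the result differs.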

Issue #358 created #created-358

19 Feb at 23:55:31 GMT
serialization indent whitespace

There could be an option to control whether the serialization indents with spaces or tabs, and how many of them (e.g. 2 or 4 spaces).
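For comparison, Python's standard library already exposes this kind of control. The sketch below uses xml.etree.ElementTree.indent (Python 3.9 or later); it illustrates the requested behaviour and is not a proposal for the option's name or values. The helper name serialize is hypothetical.

```python
import xml.etree.ElementTree as ET


def serialize(xml_text, indent_unit):
    """Parse, indent with the given unit string, and serialize back to text."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space=indent_unit)
    return ET.tostring(root, encoding="unicode")


print(serialize("<a><b><c/></b></a>", "  "))   # two spaces per level
print(serialize("<a><b><c/></b></a>", "\t"))   # one tab per level
```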

Related: https://github.com/qt4cg/qtspecs/issues/101

A user request: https://github.com/benibela/xidel/issues/100

Issue #357 created #created-357

18 Feb at 16:07:01 GMT
Representing key-value pairs

A map can be decomposed into, or composed from, a sequence of key-value pairs (KVPs).

There are two natural representations of a key-value pair (K, V): it can be represented as a singleton map (map{ K: V }) or as a "doubleton" map (map{ 'key': K, 'value': V}).

This issue examines how well either of these representations is currently supported, which of them is preferable, and how this support should be improved.

I'll consider the following basic operations: constructing a KVP from a key and a value, assembling a map from a set of KVPs, decomposing a map into a sequence of KVPs, extracting the key from a KVP, and extracting the value from a KVP.

Singleton Representation

Constructing a KVP from a key and a value:

map{ $key : $value }
map:entry($key, $value)
<xsl:map-entry key="$key" select="$value"/>

Assembling a map from a set of KVPs

map:merge($kvps)
<xsl:map>

Decomposing a map into a sequence of KVPs:

map:for-each($map, map:entry#2)

Extracting the key from a KVP:

map:keys($kvp)

Extracting the value from a KVP:

$kvp?*

Doubleton Representation

Constructing a KVP from a key and a value:

map{ 'key': $key, 'value': $value }

Assembling a map from a set of KVPs

map:build($kvps, ->{?key}, ->{?value})

Decomposing a map into a sequence of KVPs:

map:for-each($map, ->($key, $value){map{ 'key': $key, 'value': $value }})

Extracting the key from a KVP:

$kvp?key

Extracting the value from a KVP:

$kvp?value
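The two representations can be compared side by side in a Python model, with dicts standing in for maps. All function names are hypothetical; they mirror the five basic operations listed above.

```python
# Singleton representation: {K: V}
def singleton(key, value):
    return {key: value}


def singleton_key(kvp):
    (key,) = kvp.keys()      # fails unless the map has exactly one entry
    return key


def singleton_value(kvp):
    (value,) = kvp.values()
    return value


def map_from_singletons(kvps):
    return {singleton_key(kvp): singleton_value(kvp) for kvp in kvps}


def singletons_from_map(m):
    return [singleton(k, v) for k, v in m.items()]


# Doubleton representation: {'key': K, 'value': V}
def doubleton(key, value):
    return {"key": key, "value": value}


def map_from_doubletons(kvps):
    return {kvp["key"]: kvp["value"] for kvp in kvps}


def doubletons_from_map(m):
    return [doubleton(k, v) for k, v in m.items()]


source = {"a": 1, "b": 2}
print(singletons_from_map(source))
print(doubletons_from_map(source))
```

The model shows the same trade-off: the doubleton accessors are plain lookups, while the singleton accessors need a step that insists on exactly one entry.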

Analysis

The singleton representation is better supported at present, and it makes sense therefore to fill in the gaps that currently make it awkward. The main attraction of the doubleton representation is the ease of extracting the key and the value using $kvp?key and $kvp?value. The equivalents for the singleton representation (map:keys($kvp) and $kvp?*) feel clumsy and unintuitive; however, it's not at all obvious what would be better, short of introducing new custom syntax, which seems over-the-top. The best idea I can come up with is to have two functions map:key($kvp) and map:value($kvp) which require $kvp to be a singleton map. But I hate the namespace prefixes...

The other thing needed to "fill the gaps" is a function map:entries($map) equivalent to map:for-each($map, map:entry#2).

What if we chose to go the other way, and improve support for the doubleton representation?

We could add map:key-value-pair($key, $value) to create a KVP, and map:of($kvps) to build a map from a set of KVPs, and map:key-value-pairs($map) to decompose a map. The trickiest problem is what to do about XSLT, where the 3.0 instructions <xsl:map> and <xsl:map-entry> use the singleton representation.

Issue #356 created #created-356

18 Feb at 00:21:40 GMT
array:leaves

1. Issues

There are at least two issues with the definition of the function array:flatten:

  1. Unlike most other functions on arrays (such as array:put, array:replace, array:append, array:slice, array:subarray, array:remove, array:insert-before, array:tail, array:trunk, array:reverse, array:join, array:for-each, array:filter, array:for-each-pair, array:sort, array:partition), which produce an array as their result, this function produces only a sequence.

  2. This function is not lossless -- any members that are the empty sequence or the empty array are not represented in the returned result.

2. Suggested solution(s)

We want to have a function that is similar to the wrongly defined one, but produces its contents as an array, and is lossless. There are two obvious ways to do this:

  1. Correct the specification of array:flatten so that its result is an array and so that empty sequences and empty arrays are preserved as members of its result.

  2. Add to the Specification a new function: array:leaves that produces an array as its result and that is lossless. array:leaves returns an array whose members are exactly all the leaves of the input array, by the order of their appearance. By definition leaves are all, and at any depth, members that are not an array except when they are the empty array. Thus () (the empty sequence) and [] (the empty array) are leaves by definition.

Solution 2. will not cause any compatibility issues.

3. Examples

The expression array:leaves([1, (), [4, 6], 5, 3]) returns [1, (), 4, 6, 5, 3].

The expression array:leaves([[1, 2, 5], [[10, 11], 12], [], 13]) returns [1, 2, 5, 10, 11, 12, [], 13].
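A Python sketch of the proposed semantics, assuming arrays are modelled as lists and sequences as tuples. The function names are hypothetical; array_flatten is included only to contrast the lossy behaviour described under "Issues".

```python
def array_leaves(array):
    """Return an array of all leaves: non-array members at any depth, plus empty arrays."""
    leaves = []
    for member in array:
        if isinstance(member, list) and member:
            leaves.extend(array_leaves(member))   # descend into non-empty arrays
        else:
            leaves.append(member)                 # atomic values, (), and [] are leaves
    return leaves


def array_flatten(items):
    """Lossy counterpart: returns a flat sequence, so () and [] vanish."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(array_flatten(item))
        else:
            flat.append(item)
    return tuple(flat)


print(array_leaves([1, (), [4, 6], 5, 3]))
print(array_flatten([1, (), [4, 6], 5, 3]))
```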

QT4 CG meeting 023 draft agenda #agenda-02-21

17 Feb at 17:49:00 GMT

Draft agenda published.

Pull request #355 created #created-355

16 Feb at 21:41:38 GMT
Action QT4CG-022-02 - add to imp-def-feature appendix

Adds entries to the implementation-defined-features appendix of the serialization spec, corresponding to the option to generate <meta charset="XXX"> for HTML5.

Pull request #354 created #created-354

16 Feb at 18:30:51 GMT
Combine multiple signatures of XSLT functions to use defaults

This PR addresses issue 69, by modifying those XSLT built-in functions that currently have multiple signatures, to use a single signature with parameter defaults instead.

The changes however don't currently render correctly. The XSLT processing pipeline needs to be changed to pick up the changes that were made to the F+O stylesheets to render parameter defaults correctly. I haven't yet managed to work out where this is done.

Pull request #353 created #created-353

16 Feb at 17:22:48 GMT
Issue109 xsl note

Addresses issue #109 and issue #87. Unfortunately the PR also includes the unrelated commits for issue 22.

Issue #352 created #created-352

16 Feb at 14:58:30 GMT
The @array attribute of xsl:for-each-group is no more

There is a note in XSLT §14.2 concerning the @array attribute of xsl:for-each-group, but this attribute has been dropped.

Issue #351 closed #closed-351

16 Feb at 08:48:37 GMT

Another attempt to build off the merge-base branch

Pull request #351 created #created-351

16 Feb at 08:48:31 GMT

Another attempt to build off the merge-base branch

Issue #341 closed #closed-341

16 Feb at 00:17:45 GMT

[XPath] Error-free selection operator for maps or arrays, or finite-domain functions

Issue #350 created #created-350

16 Feb at 00:11:51 GMT
CompPath (Composite-objects path) Expressions

As initially discussed in issue #341, we were exploring different ways to provide an XPath-like language to traverse in depth composite objects such as maps and arrays and select their members at any depth. While working on this, the idea of an XPath-like language for composite items started to emerge and here we present this idea in a more or less crystalized form.

1. Root Component

Any CompPath expression must start from a composite item (of type map or array, or of some future composite item type, perhaps a set). This can be a literal composite item or a reference to a variable whose value is a composite item.

Examples:


(: Literal composite items: :)
[1, 2, 3]

[1, [2,  3]]?2

{"x":1, "y" : map{ "z": 2}}

{"x":1, "y" : map{ "z": 2}} ?y

(: Variables containing composite items: :)
let $comp1 := [1, [2, 3]],
 $comp2 :=$comp1 ?2,
 $comp3 := {"x":1, "y" : map{ "z": 2}},
 $comp4 := $comp3 ?y

In the above examples all literal expressions and all variables ($comp1, $comp2, $comp3, $comp4) may serve as the root component for a CompPath expression.

2. The component-path operator (\)

The component-path operator "\" is used to build expressions for locating members at any depth within component trees. Its left-hand side expression must return a result that is a composite item or else this result is represented as such by wrapping it into an array.

The operator returns an array whose members are either composite items themselves or non-composite "leaves" of the root-component tree.

Each operation E1\E2 is evaluated as follows: Expression E1 is evaluated, and the result is wrapped in an array A1. If any member of A1 is not a composite item, a type error is raised. Each member of A1 serves in turn to provide an inner "composite-focus" for the evaluation of E2. The composite-focus consists of: the member itself as the "composite-context-item" (or .); its index in A1 as the "composite-context-position" (or index()); the set of keys of the composite-context-item as the "composite-keyset" (or keys()); and the size of this member as the "composite-context-size" (specified as one of size(), array-size(), or key-size()). The result of each evaluation of E2, if it isn't a single composite item, is wrapped in a single array. The arrays resulting from all the evaluations of E2 are wrapped in a single array and this single array is the result of the evaluation.

E2 is typically a function over the context-focus and its results will be the set of the next step composite-context-items (used as the left-hand-side of the next in chain composite-step-expression (see below)), or these results would be the final results of evaluation if this is the last-in chain composite-step-expression.

3. Composite-Steps

A composite-step is a part of a composite-path-expression that generates an array and filters its members by zero or more predicates. A composite-step-expression is either a CompositeAxisStep or a CompositePostfixExpression.

4. Composite-Axes

The following axes are defined for traversing a composite-item tree:

  • The child-member:: axis contains the members of the composite-context-item.
  • The value-member:: axis contains the members of the composite-context-item that are not composite themselves.
  • The node-member:: axis contains the members of the composite-context-item that are nodes.
  • The descendant-member:: axis is defined as the transitive closure of the child-member:: axis; it contains the descendant-members of the composite-context-item (the child members of the composite-context-item, and their child-members, ... and so on).
  • The self:: axis contains just the composite-context-item.
  • The descendant-member-or-self:: axis contains the composite-context-item and all of its descendant-members.
  • The following-sibling-member:: axis contains the members of the immediate container of the composite-context-item that follow it. For any two members mem1 and mem2 of a composite item Comp, by definition mem2 follows mem1 if and only if Comp is an array and the index of mem2 in Comp is greater than that of mem1, or if Comp is a map, then the key of mem2 is greater than that of mem1.
  • The preceding-sibling-member:: axis contains the members of the immediate container of the composite-context-item that precede it. For any two members mem1 and mem2 of a composite item Comp, by definition mem1 precedes mem2 if and only if Comp is an array and the index of mem2 in Comp is greater than that of mem1, or if Comp is a map, then the key of mem2 is greater than that of mem1.

For example, following-sibling-member::5 means all members of the composite-context-item with index > 5, and preceding-sibling-member::5 means all members of the composite-context-item with index < 5.

Note: If the immediate container of the composite-context-item is a map whose key-values cannot be ordered, then specifying either the following-sibling-member:: or the preceding-sibling-member:: axis on this composite-context-item must raise a type error. (These two axes are meaningful only for composite items whose members are ordered, such as arrays.)

If the composite-axis name is omitted from a composite-axis step, the default axis is the child-member:: axis.
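
The axis definitions above can be read as simple tree walks. The following is a minimal Python sketch, not part of the proposal: it assumes maps are modelled as dicts and arrays as lists, it ignores nodes and the sibling axes, and it relies on Python's dict insertion order where the proposal says nothing about the order of map members. All function names are hypothetical.

```python
from typing import Any, Iterator


def is_composite(item: Any) -> bool:
    # Assumption: a composite item is a dict (map) or a list (array).
    return isinstance(item, (dict, list))


def child_members(item: Any) -> Iterator[Any]:
    # child-member:: axis: the direct members of a map or array.
    if isinstance(item, dict):
        yield from item.values()
    elif isinstance(item, list):
        yield from item


def value_members(item: Any) -> Iterator[Any]:
    # value-member:: axis: the direct members that are not composite.
    return (m for m in child_members(item) if not is_composite(m))


def descendant_members(item: Any) -> Iterator[Any]:
    # descendant-member:: axis: transitive closure of child-member::.
    for member in child_members(item):
        yield member
        yield from descendant_members(member)


def descendant_members_or_self(item: Any) -> Iterator[Any]:
    # descendant-member-or-self:: axis: the item itself, then its descendants.
    yield item
    yield from descendant_members(item)


data = {"a": [1, 2, {"b": 3}], "c": 4}
print(list(value_members(data)))
print(list(descendant_members(data)))
```

A non-composite item simply has no members on any of these axes, so the walks terminate at the leaves without special cases.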

5. Composite Axis Steps

A composite axis step completely resembles the ordinary axis step in XPath. It consists of three parts:

  1. The composite axis (child-member::, descendant-member::, value-member::, node-member::, following-sibling-member::, preceding-sibling-member::, self::, or the descendant-member-or-self:: axis)
  2. The member test
  3. The composite-predicates

6. Member Tests

A member test is a condition on the key-name, index, or kind (composite, map, array or value, node, or (any) member). A member test determines which members contained by a composite-axis are selected by a composite-step.

As such, a member test is either an identifier-test (key-name or index) or a kind-test (composite, map, array, value, or member).

Examples of member identifiers:

  • A string specifies a name of a key, whose value will be selected. For example: \child-member::X selects from the composite-context-item the value corresponding to its key which has the name "X".

  • \child-member::3 selects from the composite-context-item the value of its 3rd member, if it is an array or the value corresponding to its key 3, if it is a map.

  • following-sibling-member::3 selects from the composite-context-item (which is most likely an array) all of its members having index greater than 3.

  • preceding-sibling-member::3 selects from the composite-context-item (which is most likely an array) all of its members having index less than 3.

  • \descendant-member-or-self::X selects from the composite-context-item (that must be a map) and from all its descendant-members, the values corresponding to their key named "X", if these descendants have a key named "X".

  • Similarly \5 is equivalent to \child-member::5 and selects from the composite-context-item that is an array the value of its 5th member. This will also select the value corresponding to the key 5 from the composite-context-item if it is a map, because on the child-member:: axis both maps and arrays may be selected.

  • \X is equivalent to \child-member::X and selects from the composite-context-item (that must be a map), the value corresponding to its key which has the name "X".

    There is also the pseudo-operator \\ . This is an abbreviation for:

    \descendant-member-or-self::member()\

    Thus, \\X means: "(Deep) Select all members of the root-component that are the corresponding values of keys equal to 'X' "

  • We may use a kind test as part of the previous example, if we want to select only a specific kind of members of the composite-context-item. \array() In this example, although we are on the child-member:: axis, we want to select only members of the composite-context-item that are arrays.

  • \map() In this example, although we are on the child-member:: axis, we want to select only members of the composite-context-item that are maps.

  • \value() In this example we want to select only members of the composite-context-item that are not composite items themselves.

  • \node() In this example we want to select only members of the composite-context-item that are nodes.

  • \member() In this example we want to select all members of the composite-context-item, regardless of whether they are maps, arrays, or values.

6.1 Wildcards

The * wildcard can be used instead of a member identifier. It selects all existing members of the composite-context-item that lie on the specified axis and satisfy the member kind-test, if one is given.

Examples:

  • \* (: (Shallow) Selects all members of the composite-context-item :)
  • \map()\* (: Selects from the composite-context-item all values that correspond to a key of any map-member of the composite-context item :)
  • \array()\* (: Selects from the composite-context-item all members of all its members that are arrays :)
  • \\* (: (Deep) Select all members of the composite tree rooted by the root-component :)
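
To make the identifier tests and the \\ abbreviation concrete, here is a small Python sketch under the same modelling assumptions as before (dicts for maps, lists for arrays, no nodes). It is an illustration only; select_child, descendants_or_self and select_deep are hypothetical names, and the traversal order for maps is simply Python's insertion order.

```python
from typing import Any, List, Union


def select_child(item: Any, identifier: Union[str, int]) -> List[Any]:
    # Models \identifier, i.e. child-member:: with an identifier test.
    if isinstance(item, dict):
        return [item[identifier]] if identifier in item else []
    if isinstance(item, list) and isinstance(identifier, int):
        # Array indexes are 1-based; a missing index selects nothing here.
        return [item[identifier - 1]] if 1 <= identifier <= len(item) else []
    return []


def descendants_or_self(item: Any) -> List[Any]:
    # Models descendant-member-or-self::member().
    found = [item]
    if isinstance(item, dict):
        members = list(item.values())
    elif isinstance(item, list):
        members = item
    else:
        members = []
    for member in members:
        found.extend(descendants_or_self(member))
    return found


def select_deep(root: Any, identifier: Union[str, int]) -> List[Any]:
    # Models \\identifier: apply the child step to the root and every descendant.
    results: List[Any] = []
    for item in descendants_or_self(root):
        results.extend(select_child(item, identifier))
    return results


doc = {"X": 1, "list": [{"X": 2}, {"Y": {"X": 3}}, 7]}
print(select_deep(doc, "X"))
```

Note that an integer identifier matches both a 1-based array position and an integer map key, which mirrors the \5 example above.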

7. Predicates

As defined above, a composite-step has three parts: composite-axis (can be omitted and then a default axis is used), member test, and an optional list of composite-predicates.

A composite-predicate in a composite-step is an expression used as a filter applied on the members of the composite-context-item that are already selected by the axis and member tests of the axis step, and not filtered out by any preceding composite-predicates in the composite-predicates-list. The composite-predicate may be any XPath expression and is written within double square brackets.

Examples:

  • \*[[3]] (: Selects any member of the composite-context-item, that is an array and has a 3rd member or any member of the composite-context-item, that is a map and has a key 3 :) This is a shorthand for: \*[[array-size() ge 3 or 3 = keys()]]
  • \array()[[3]] (: Selects those array members of the composite-context-item that have a 3rd member :) This is a shorthand for: \*[[size() ge 3]]
  • \*[[size() eq 7]] (: Selects those members whose array-size() or key-size() is exactly 7:) This is a shorthand for: \composite::*[[self::map() and key-size() eq 7 or self::array() and array-size() eq 7]]
  • \*[[X]] (: Selects any member of the composite-context-item, that is a map and has a key X :)
  • \map()[[X]] (: Selects any map member of the composite-context-item, that has a key X :) The above two expressions are a shorthand for: \*[['X' = keys()]]
  • \value()[[. gt 0]] (: Selects any value (non-composite member) of the composite-context-item, that is a positive number :)

8. Mixing CompPath and XPath expressions

CompPath and XPath expressions can be used as parts of a single expression:

  • A CompPath expression may be appended at the end of any XPath expression that produces a composite-object.

  • An XPath expression may be appended at the end of any CompPath expression. When doing this,

    CompPathExpr / XPathExpr

    is equivalent to:

    CompPathExpr\node::* / XPathExpr

    And this:

    CompPathExpr ! XPathExpr (: Note: also causes ordering and deduplication of the nodes! :)

    is equivalent to:

    CompPathExpr\value::* ! XPathExpr (: Note: No ordering or deduplication, can be applied on any item, not just on nodes :)

  • A CompPath expression may be substituted for the expected argument of any XPath expression, for example: count(MyCompPathExpr)

  • Any XPath expression that produces a composite item can be used as the composite-root for any CompPath expression

Example:

let $myBooks := 
<books>
 <book name="Tom Sawyer">
   <author>Mark Twain</author>
 </book>
 <book name="Wuthering Heights">
   <author>Emily Brontë</author>
 </book>
 <book name="Jane Eyre">
   <author>Charlotte Brontë</author>
 </book>
 <book name="Adventures of Huckleberry Finn">
   <author>Mark Twain</author>
 </book>
</books>,
$map1 := map {"science-works": map{"Einstein": "Special Theory of relativity",
                                  "Darwin" : "On the Origin of Species"
                                 },
             "literature" : map{"19th Century": $myBooks}
            }
return
  $map1\literature\\*/book[author eq 'Mark Twain']

Evaluating this mixed CompPath and XPath expression produces the correct result:

<book name="Tom Sawyer">
  <author>Mark Twain</author>
</book>
<book name="Adventures of Huckleberry Finn">
  <author>Mark Twain</author>
</book>

Issue #349 closed #closed-349

15 Feb at 18:04:32 GMT

Revert PR change; it doesn't work in this context

Pull request #349 created #created-349

15 Feb at 18:04:27 GMT

Revert PR change; it doesn't work in this context

Issue #348 closed #closed-348

15 Feb at 17:46:33 GMT

Attempt to build PR with merge-base version of master

Pull request #348 created #created-348

15 Feb at 17:46:07 GMT
Attempt to build PR with merge-base version of master

This PR changes the CI build-pr.yml script so that it checks out the version of master that the branch started from, rather than the current version of master, for building the specifications.

  • Pro: we won't get build failures when the current master can't build the old version (for example, when images have been removed)
  • Con: we won't get any features from the current master, such as stylesheet updates

Since failing builds are more troublesome than formatting issues, I'm going to say the pros outweigh the cons.

Pull request #347 created #created-347

15 Feb at 14:06:34 GMT
Attempt to clarify fn:parse-uri and fn:build-uri

Fix #307

Issue #346 closed #closed-346

15 Feb at 10:11:01 GMT

Remove dagger from record cross-references

Pull request #346 created #created-346

15 Feb at 10:10:36 GMT
Remove dagger from record cross-references

Record types are better supported by the stylesheets so the dagger is simply a distraction.

Issue #345 created #created-345

14 Feb at 22:52:25 GMT
Missing rule for matching atomic values against atomic types

In XPath §3.6.2 we have forgotten to state the obvious rule:

"An Atomic Value AV matches a generalized atomic type GAT if the type annotation of AV (call it T) satisfies the condition derives-from(T, GAT)."

At the same time it would be a good idea to clarify whether locally-declared union and enum types fall within the definition of "schema types" (I think they should do so.)
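
As an illustration of the proposed rule, the following Python sketch checks a type annotation against a generalized atomic type by walking base types. It is a toy model, not spec text: the BASE_TYPE table is a small hand-picked fragment of the XSD hierarchy, a pure union type is modelled (as an assumption) by a frozenset of member type names, and the function names are hypothetical.

```python
from typing import FrozenSet, Union

# Fragment of the built-in type hierarchy: type name -> base type name.
BASE_TYPE = {
    "xs:int": "xs:long",
    "xs:long": "xs:integer",
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:string": "xs:anyAtomicType",
}


def derives_from(actual: str, expected: str) -> bool:
    # True if `actual` is `expected` or reaches it by following base types.
    current = actual
    while current is not None:
        if current == expected:
            return True
        current = BASE_TYPE.get(current)
    return False


def matches(type_annotation: str, gat: Union[str, FrozenSet[str]]) -> bool:
    # An atomic value matches a generalized atomic type GAT if its type
    # annotation derives from GAT, or from a member type when GAT is a union.
    members = gat if isinstance(gat, frozenset) else frozenset({gat})
    return any(derives_from(type_annotation, member) for member in members)


print(matches("xs:int", "xs:decimal"))
print(matches("xs:string", frozenset({"xs:integer", "xs:decimal"})))
```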

Issue #342 closed #closed-342

14 Feb at 17:24:35 GMT

Issue318 meta elements

QT4 CG meeting 022 draft minutes #minutes-02-14

14 Feb at 17:12:00 GMT

Draft minutes published.

Issue #338 closed #closed-338

09 Feb at 17:30:02 GMT

Add ednote per action QT4CG-016-02

Pull request #344 created #created-344

09 Feb at 17:06:56 GMT
Issue 22: allow "for"/"let" keyword to be repeated in XPath

Addresses the proposal in issue 22 to allow repetition of the "let" or "for" keyword in a ForExpr or LetExpr. (It does not, however, allow "for" and "let" to be mixed).

Issue #343 created #created-343

09 Feb at 12:47:25 GMT
$collation argument: Unification

In the function set of the current XQFO specification, the type of the $collation parameter is sometimes xs:string and sometimes xs:string?, depending on the position of the parameter. Examples:

Mandatory

fn:distinct-values($values as xs:anyAtomicType*, $collation as fn:default-collation()) as xs:anyAtomicType*
fn:index-of($input as xs:anyAtomicType*, $search as xs:anyAtomicType, $collation as xs:string) as xs:integer*

Optional

fn:sort($input as item()*, $collation as xs:string?, $key as function(item()) as xs:anyAtomicType*) as item()*
fn:lowest($input as item()*, $collation as xs:string?, $key as function(item()) as xs:anyAtomicType*) as item()*

I think we should always allow an empty sequence.

Pull request #342 created #created-342

09 Feb at 12:27:01 GMT
Issue318 meta elements

Revises the rules for serializing meta elements to take account of new HTML5 syntax.

Resolves issue #318

Issue #18 closed #closed-18

09 Feb at 11:01:40 GMT

[DM31] Function types do not form a hierarchy

Issue #58 closed #closed-58

08 Feb at 16:20:16 GMT

[XQuery] String Value Templates

Issue #107 closed #closed-107

08 Feb at 16:11:07 GMT

Allow self::(a|b|c)

Issue #234 closed #closed-234

08 Feb at 11:45:42 GMT

If Without Else

Issue #330 closed #closed-330

08 Feb at 08:35:16 GMT

Update fn:parse-html to apply review feedback.

Issue #341 created #created-341

08 Feb at 01:50:08 GMT
[XPath] Error-free selection operator for maps or arrays, or finite-domain functions

In March 2021 Jarno Elovirta raised on the #general channel of the XML.com Slack the problem that the existing map or array lookup operator "?" prevents a free traversal of a nested map/array object. For example, this expression results in an error:

[
  map {"k0": 1}, 
  map{"k0": [1, 2, 3]}
]  ?* ?("k0")  ?*

[XPTY0004] Input of lookup operator must be map or array: 1.


There are three possible types of reaction to this problem:

  1. Do nothing

  2. Relax the semantics of the map/array lookup operator "?" so that it can be applied on items of non-map/non-array type and in such case produce the empty sequence.

  3. Introduce a new operator that behaves like "?", but produces the empty sequence instead of an error when applied to items of non-map/non-array type.

Obviously, we are not advocating the 1st choice above, or otherwise we wouldn't be raising any issue 😄

Choice 2 could be implemented, but this would have a few drawbacks:

  • it would bring a certain degree of backwards incompatibility
  • "silently returning nothing" makes unexpected results really difficult to debug or even notice, as pointed out by @michaelhkay

This proposal is to choose alternative 3. above.

Why is it better than the 2nd one?

  • No incompatibility can be introduced, as this is a new operator.
  • The user has intentionally chosen this operator over the "?" operator, and this means that the user is well aware of the new, sometimes tricky to observe/explain/debug behavior, but the user doesn't mind these effects and is ready to deal with them.

Definition

By definition the operator "->" with left-hand-side any expression E and right-hand-side a literal string X:

   E -> X

is lexically expanded to:

   E[. instance of map(*) or . instance of array(*)]?X

Example

With the original expression provided by Jarno Elovirta, but now using the "->" operator:

[
  map {"k0": 1}, 
  map{"k0": [1, 2, 3]}
]  ->* ->("k0")  ->*

its evaluation produces the expected result (all the values within just one of the leaves of the tree), and no error:

1, 2, 3

That is, 1 ->* produces the empty sequence and no error.
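
The difference between "?" and the proposed "->" can be sketched in Python. This is only a model under stated assumptions: an XPath sequence is a Python list of items, a map is a dict, an array is a list appearing as an item, and array members are single items. strict_lookup and lenient_lookup are hypothetical names standing in for the two operators.

```python
from typing import Any, List

WILDCARD = "*"


def strict_lookup(sequence: List[Any], key: Any) -> List[Any]:
    # Models "?": every item of the input sequence must be a map or an array.
    result: List[Any] = []
    for item in sequence:
        if isinstance(item, dict):
            if key == WILDCARD:
                result.extend(item.values())
            elif key in item:
                result.append(item[key])
        elif isinstance(item, list):
            if key == WILDCARD:
                result.extend(item)
            elif isinstance(key, int) and 1 <= key <= len(item):
                result.append(item[key - 1])  # array indexes are 1-based
            else:
                raise LookupError(f"No array member at {key!r}")
        else:
            raise TypeError(f"Input of lookup operator must be map or array: {item!r}")
    return result


def lenient_lookup(sequence: List[Any], key: Any) -> List[Any]:
    # Models "->": E[. instance of map(*) or . instance of array(*)]?X
    composites = [item for item in sequence if isinstance(item, (dict, list))]
    return strict_lookup(composites, key)


# The input is a sequence holding one array of two maps.
seq = [[{"k0": 1}, {"k0": [1, 2, 3]}]]
step1 = lenient_lookup(seq, WILDCARD)
step2 = lenient_lookup(step1, "k0")
print(lenient_lookup(step2, WILDCARD))
```

Replacing the last lenient_lookup call with strict_lookup raises a TypeError on the atomic item 1, which corresponds to the XPTY0004 error quoted at the top of this issue.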

Note:

Of course, the above example can be rewritten to this equivalent XPath 3.1 expression and will get the wanted result, but literally no one, myself included, will ever write this:

[
 map {"k0": 1}, 
 map{"k0": [1, 2, 3]}
] [. instance of map(*) or . instance of array(*)]      ?*
           [. instance of map(*) or . instance of array(*)]      ?k0
                                [. instance of map(*) or . instance of array(*)]   ?*

Thus this is all about making it possible/feasible and empowering our users!

Issue #340 created #created-340

07 Feb at 22:02:07 GMT
fn:format-number: Specifying decimal format

It would be nice if the decimal format for fn:format-number could also be supplied via an additional argument. The current syntax is:

(: result: 12.345,67 :)
declare decimal-format de decimal-separator = ',' grouping-separator = '.';

format-number(
  value := 12345.67,
  picture := '#.##0,00',
  decimal-format-name := 'de'
)

The syntax could be enhanced as follows:

format-number(
  value := 12345.67,
  picture := '#.##0,00',
  format := map { 'decimal-separator': ',', 'grouping-separator': '.' }
)

If both decimal-format-name and format are supplied, an error should be raised.
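
A rough Python sketch of the suggested behaviour follows. It is not an implementation of fn:format-number: it does not parse picture strings, it only handles the two separators used in the example, and the NAMED_FORMATS table merely stands in for decimal-format declarations in the prolog. The function and parameter names are assumptions made for the illustration.

```python
from typing import Dict, Optional

# Stand-in for "declare decimal-format de ..." declarations in the prolog.
NAMED_FORMATS: Dict[str, Dict[str, str]] = {
    "de": {"decimal-separator": ",", "grouping-separator": "."},
}


def format_number(
    value: float,
    decimal_places: int = 2,
    decimal_format_name: Optional[str] = None,
    format_map: Optional[Dict[str, str]] = None,
) -> str:
    # Supplying both a named format and an inline map is an error.
    if decimal_format_name is not None and format_map is not None:
        raise ValueError("supply either decimal-format-name or format, not both")
    if decimal_format_name is not None:
        settings = NAMED_FORMATS[decimal_format_name]
    else:
        settings = format_map or {}
    decimal_sep = settings.get("decimal-separator", ".")
    grouping_sep = settings.get("grouping-separator", ",")
    # Format with Python's default separators, then substitute the chosen ones.
    integer_part, _, fraction = f"{value:,.{decimal_places}f}".partition(".")
    grouped = integer_part.replace(",", grouping_sep)
    return grouped + (decimal_sep + fraction if fraction else "")


print(format_number(12345.67, format_map={"decimal-separator": ",", "grouping-separator": "."}))
```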

Edit 2023-05-02, adopted from a comment further below:

Next, language-specific default settings would be sensible. The existing syntax could be used:

format-number(12345.67, '#.##0,00', 'de')

As known from the other functions for formatting numbers and dates, it could be up to the implementation to decide which languages are supported. The defaults could be overwritten by custom decimal-format declarations in the prolog to ensure that a setting is applied, even if an implementation does not support it.

QT4 CG meeting 021 draft minutes #minutes-02-07

07 Feb at 17:20:00 GMT

Draft minutes published.

Issue #339 created #created-339

07 Feb at 13:46:57 GMT
The constraints on document-uri are too...constraining

The XPath data model imposes the following constraints on the document-uri property:

If the document-uri is not the empty sequence, then the following constraint must hold: the node returned by evaluating fn:doc() with the document-uri as its argument must return the document node that provided the value of the document-uri property.

In other words, for any Document Node $arg, either fn:document-uri($arg) must return the empty sequence or fn:doc(fn:document-uri($arg)) must return $arg.

This constraint turns out to be inconvenient whenever the larger environment doesn’t enforce a 1:1 mapping between URIs and documents.

For example, in a browser context, a JavaScript function that returns different versions of the same document over time cannot identify those documents with the same document-uri.

In XProc, a p:add-attribute step that returns a copy of its input document with one additional attribute, cannot identify the output document with the same document-uri as the input document.

Given that the document URI is often necessary to evaluate relative URI references within a document, the constraints imposed in the data model are too strict.

Pull request #338 created #created-338

07 Feb at 13:33:27 GMT
Add ednote per action QT4CG-016-02

This is a purely editorial change. Unless someone objects over the next few days, I'm just going to merge it in.

QT4 CG meeting 021 draft agenda #agenda-02-07

04 Feb at 15:09:00 GMT

Draft agenda published.

Issue #337 created #created-337

02 Feb at 00:52:31 GMT
Local union and enum types: and the definition of generalised atomic types

We need to review the proposed specs for local union and enum types, and decide whether or not to proceed with them.

I note that the definitions of generalized atomic type and pure union type say they must be "schema-defined", which appears to exclude locally-defined union and enum types.

I wonder if the definition of local enum types should be aligned more closely with an XSD type derived from xs:string by restricting with an enum facet. Now that we allow down-casting in the coercion rules, the objections to this seem to disappear.

cast and castable should also probably pay more attention to these types.

Pull request #336 created #created-336

01 Feb at 12:48:05 GMT
Action QT4CG-019-01 (type of $pattern in fn:tokenize())

Also, update the fos:history record for a number of functions.

Issue #308 closed #closed-308

01 Feb at 12:13:58 GMT

Improve the legends in the diagrams

Issue #335 closed #closed-335

01 Feb at 12:13:57 GMT

Rework type hierarchy diagrams as styled lists

Pull request #335 created #created-335

01 Feb at 12:13:50 GMT
Rework type hierarchy diagrams as styled lists

Close #308

This proposal was accepted at meeting 020 on 31 January 2023.

The PR won't format correctly because there are style changes, so I'm just going to merge this. I have fixed the diagrams in both the data model specification and f&o.

Issue #205 closed #closed-205

01 Feb at 10:18:56 GMT

Make higher-order-function support mandatory

Issue #221 closed #closed-221

01 Feb at 10:18:37 GMT

Expose op:same-key() as a user-visible function

Issue #324 closed #closed-324

01 Feb at 10:17:52 GMT

Proposed syntax and semantics for string templates

Issue #326 closed #closed-326

01 Feb at 10:17:34 GMT

Issue 205: make support for higher-order functions mandatory

Issue #319 closed #closed-319

01 Feb at 10:17:21 GMT

Issue 221: op:same-key becomes fn:atomic-equal

Issue #334 created #created-334

01 Feb at 09:17:02 GMT
Transient properties: a new approach to deep selection and update in maps and arrays

After exploring many alternatives, I have come to the conclusion that we can't solve the problem of deep navigation and transformation of JSON structures without a data model change.

Most of the problems boil down to this: JSON trees do not have parent pointers, therefore after navigating down to a leaf node of the tree, we cannot get any information from higher up the tree. The solution to this (the "zipper" model) is to retain transient information about how a particular node in the tree was reached, so that we can retrace our steps and revisit nodes that were passed en route.

The change I propose is quite minor, but powerful: Any XDM value can be augmented with a set of transient properties represented as a set of key-value pairs. These properties are ignored (and typically dropped) by all operations on a value, except where otherwise specified. For the purpose of exposition, I'll use the syntax $value¶name to refer to the transient name property of $value.

We'll change the semantics of map:get() and array:get(), and the associated lookup operators, so that the resulting values have transient properties indicating how they were selected. For example, given

let $name := $person?firstName

the resulting value (perhaps the string "Michael") will be augmented with transient properties

  • ¶parent - the map from which the value was selected (retaining its own transient properties if any)
  • ¶key - the key used to make the selection, here "firstName"

and derived properties:

  • ¶ancestors - the transitive closure of ¶parent
  • ¶root - the last ¶ancestor
  • ¶path - a string representation of the path used to select the value

We can also define other "downward selection" operations such as map:find, and array:foot to retain these transient properties. So for example map:find($json, 'firstname')[.='Michael']¶parent?surname now finds the surnames of anyone named 'Michael', at any depth of the tree.
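
The idea can be prototyped outside XDM. The Python sketch below is only an analogy: a hypothetical Selected wrapper carries the value together with the parent and key properties, lookup plays the role of the modified map:get/array:get, and find is a simplified stand-in for a property-retaining map:find. Maps are dicts, arrays are lists, and the path format is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Optional


@dataclass
class Selected:
    value: Any
    parent: Optional["Selected"] = None  # models the parent property
    key: Any = None                      # models the key property

    def ancestors(self) -> List["Selected"]:
        # Transitive closure of parent, nearest ancestor first.
        chain, current = [], self.parent
        while current is not None:
            chain.append(current)
            current = current.parent
        return chain

    def path(self) -> str:
        # String form of the keys used to reach this value from the root.
        steps = [s for s in reversed([self] + self.ancestors()) if s.parent is not None]
        return "".join(f"?{s.key}" for s in steps)


def lookup(container: Selected, key: Any) -> Selected:
    # Select one member, remembering where it came from (arrays are 1-based).
    source = container.value
    value = source[key] if isinstance(source, dict) else source[key - 1]
    return Selected(value, parent=container, key=key)


def find(container: Selected, key: Any) -> Iterator[Selected]:
    # Deep search for map entries with the given key, at any depth.
    value = container.value
    if isinstance(value, dict):
        for k, v in value.items():
            child = Selected(v, container, k)
            if k == key:
                yield child
            yield from find(child, key)
    elif isinstance(value, list):
        for index, v in enumerate(value, start=1):
            yield from find(Selected(v, container, index), key)


doc = Selected({"people": [{"firstName": "Michael", "surname": "Smith"},
                           {"firstName": "Ann", "surname": "Lee"}]})
hits = [h for h in find(doc, "firstName") if h.value == "Michael"]
print([lookup(h.parent, "surname").value for h in hits])
```

The last two lines correspond to the surname query in the paragraph above: the search goes down to the leaf, then steps back up through the retained parent to read a sibling entry.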

If we turn back to the use cases in my 2016 paper on transforming JSON

https://www.saxonica.com/papers/xmlprague-2016mhk.pdf

The first use case (bulk update) relied on matching items expressed in XML as

match="map[array[@key='tags']/string='ice']/number[@key='price']/text()"

which couldn't be done in JSON because of the inability to match based on ancestor context. With the new transient properties we can match this as

match="type(xs:integer)[¶key = 'price'][¶parent?tags?* = 'ice']"

In the second use case (hierarchic inversion), we can again get properties of parent or ancestor maps

$students ! map:put("course", ¶parent?name)

I think we can also use this to define deep update operations. But I'll leave that investigation until later.

Note: transient properties potentially have many other applications, for example we might use them to solve our problems with document-uri(). But exploring that would be a distraction here. The nice thing about transient properties is that they give a lot of potential for augmenting existing functionality with full backwards compatibility, because we can define existing operations to return results with additional transient properties that all existing operations will ignore. If we were so minded, for example, we could have different functions/operators return "quiet NaN" and "signalling NaN" by adding a transient property to the NaN value returned.

Issue #333 created #created-333

01 Feb at 00:34:41 GMT
Equality of function items

The question of equality of function items arises in the discussion of determinism of functions and memo functions in XSLT - see F&O 3.1 section 1.7.4, and came up again today in the context of fn:deep-equal.

1.7.4 makes a brave attempt to describe situations under which two functions are "identical", though leaving implementations room for flexibility. I think we can build on this and improve it, by describing more situations in which the result is predictable.

The data model describes the properties of a function item, and we can say that two function items are equivalent if all their properties are the same.

The properties that cause problems are the "implementation" and the "closure", and in both cases I think we can find ways of doing a comparison.

For the implementation, we can define this by reference to the way in which the implementation property is set. Function items constructed by reference to static functions (e.g. my:func#3 or function-lookup(my:func, 3)) have the same implementation if and only if they are constructed by reference to the same static function. Similarly for function items constructed by evaluating an inline function expression. Other ways of constructing a function item, such as partial application, essentially create a new function with the same implementation as an existing function and a different closure.

For the closure (ignoring for the moment functions that include parts of the dynamic context in their closure), this is essentially just a set of variable bindings and it's not too difficult to say that functions are identical if these sets of variable bindings are identical.
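
By way of analogy only, Python's functools.partial exposes the same two ingredients: the underlying function (the "implementation") and the bound arguments (a stand-in for the "closure"). The sketch below compares function objects on that basis. It says nothing about how an XPath processor would do this, and same_function is a hypothetical name.

```python
import functools
from typing import Any, Callable, Dict, Tuple


def parts(fn: Callable) -> Tuple[Callable, Tuple[Any, ...], Dict[str, Any]]:
    # Split a callable into (implementation, bound positional, bound keyword).
    if isinstance(fn, functools.partial):
        return fn.func, fn.args, fn.keywords
    return fn, (), {}


def same_function(f: Callable, g: Callable) -> bool:
    # Equivalent if built from the same implementation with equal bindings.
    impl_f, args_f, kw_f = parts(f)
    impl_g, args_g, kw_g = parts(g)
    return impl_f is impl_g and args_f == args_g and kw_f == kw_g


def add(a: int, b: int) -> int:
    return a + b


inc_a = functools.partial(add, 1)
inc_b = functools.partial(add, 1)
add_two = functools.partial(add, 2)

print(same_function(inc_a, inc_b))
print(same_function(inc_a, add_two))
```

Two partial applications of the same function with equal bindings compare as equivalent even though they are distinct objects, while two separately written lambdas do not, since each inline expression yields its own implementation.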

Issue #332 created #created-332

31 Jan at 18:51:03 GMT
Add a namespace uris option to fn:path

The output of fn:path using namespaces is very verbose as it is specified to use the Q{uri}name syntax. It would be useful if it was extended to take a map from namespace prefixes to URIs.

  1. Add a second $namespaces parameter that has the type map(union(xs:NCName, enum('')), xs:anyURI) (the same as fn:in-scope-namespaces) -- this will have a default value of map{} to preserve the existing behaviour.
  2. If the namespace uri is in the map, use the given prefix. If that prefix is "" then just use the local name.
  3. If the namespace uri is not in the map, use the Q{uri}name syntax.

This allows for things like fn:path($e, namespaces := fn:in-scope-namespaces()).
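
The three rules can be sketched in Python as follows. This is a simplified model rather than fn:path itself: it only renders element steps, each given as a (namespace URI, local name, position) triple, and render_path is a hypothetical name. If several prefixes map to the same URI, this sketch arbitrarily uses the last one.

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str, int]  # (namespace URI, local name, position)


def render_path(steps: List[Step], namespaces: Optional[Dict[str, str]] = None) -> str:
    # `namespaces` maps prefix -> URI; the default keeps the Q{uri}name form.
    prefix_for = {uri: prefix for prefix, uri in (namespaces or {}).items()}
    parts = []
    for uri, local, position in steps:
        if uri in prefix_for:
            prefix = prefix_for[uri]
            # An empty prefix means: use just the local name.
            name = local if prefix == "" else f"{prefix}:{local}"
        else:
            name = f"Q{{{uri}}}{local}"
        parts.append(f"{name}[{position}]")
    return "/" + "/".join(parts)


steps = [("http://example.com/one", "p", 1), ("http://example.com/two", "r", 2)]
print(render_path(steps))
print(render_path(steps, {"one": "http://example.com/one", "": "http://example.com/two"}))
```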

Issue #331 created #created-331

31 Jan at 18:43:14 GMT
Extend fn:path to support arrays and maps.

Currently, fn:path is defined for nodes. This means it is not possible to use it with arrays or maps (e.g. to determine the path to a JSON item when a comparison fails).

As such, I recommend:

  1. changing the type to item()
  2. If the value is a node use the current logic.
  3. If the value is in an array, use ?n where n is the nth item of the array where the item is located.
  4. If the value is in a map, use ?name or ?"name" where name is the key name of the map where the item is located.
  5. If the value is an atomic item, or the root of a map/array structure, use "." (a single dot).

Example: .?4?user?name
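
A Python sketch of how such paths could be produced is given below, with dicts standing in for maps and lists for arrays. Because the values carry no parent pointers, the sketch searches down from a known root and reports the path of every value satisfying a predicate; that search strategy, the rough NCName pattern, and the names key_step and find_paths are all assumptions made for the illustration.

```python
import re
from typing import Any, Callable, List

# Rough approximation of an NCName, good enough for the sketch.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def key_step(key: Any) -> str:
    # ?name for simple names and integers, ?"name" otherwise.
    if isinstance(key, int) or (isinstance(key, str) and NCNAME.fullmatch(key)):
        return f"?{key}"
    return '?"' + str(key).replace('"', '""') + '"'


def find_paths(item: Any, predicate: Callable[[Any], bool], path: str = ".") -> List[str]:
    # Collect the paths of all values, at any depth, that satisfy the predicate.
    found = [path] if predicate(item) else []
    if isinstance(item, dict):
        for key, member in item.items():
            found += find_paths(member, predicate, path + key_step(key))
    elif isinstance(item, list):
        for index, member in enumerate(item, start=1):
            found += find_paths(member, predicate, f"{path}?{index}")
    return found


data = [0, 0, 0, {"user": {"name": "Ann", "full name": "Ann Lee"}}]
print(find_paths(data, lambda v: v == "Ann"))
```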

QT4 CG meeting 020 draft minutes #minutes-01-31

31 Jan at 17:12:00 GMT

Draft minutes published.

Pull request #330 created #created-330

31 Jan at 12:56:33 GMT
Update fn:parse-html to apply review feedback.

This PR applies the following review comments:

  • [x] QT4CG-016-03: RD to add a note clarifying “known character encoding”
  • [x] QT4CG-016-04: RD to add a note clarifying the “”/”” html/version combination
  • [x] QT4CG-016-05: RD to add a “todo” noting the dependency on keyword arguments
  • [x] QT4CG-016-06: RD to reword the introduction to mapping to clarify who’s doing the mapping
  • [ ] QT4CG-016-08: RD to clarify how namespace comparisons are performed.
  • [x] QT4CG-016-09: RD to add a note stating that the local name should always be lowercase
  • [x] QT4CG-016-10: RD to consider how to clarify parsed entity parsing.

Issue #329 created #created-329

30 Jan at 11:45:03 GMT
Keyword parameters: Error codes

I’ve read the current specification twice, and I have checked the existing qt4 tests, but I’m still confused by the exact meaning of the new error codes for keyword arguments, XPST0141 and XPST0142. Things are getting particularly tricky if we consider partial function applications.

My proposal would be to stick with the existing error code XPST0017 for functions that cannot be matched.

Initial suggestion (obsolete):

  1. use the existing error code XPST0017 for all cases in which a function cannot be chosen as the available arguments (both positional and keyword-based) don’t match the function definition, and
  2. only raise a new error code (XPST0141, possibly) if a keyword argument has been specified more than once (as this can be done without checking the function definitions).

Issue #328 created #created-328

30 Jan at 10:05:00 GMT
Switch Cases: Lift single-item restriction on operands

Motivation

XQuery switch cases have a peculiar restriction: The operand of a single case must yield an empty sequence or a single item. There seem to be no (obvious) reasons why this restriction exists, so I believe we should lift it and allow arbitrary sequences.

A similar extension is planned for Java 12 (JEP 325: Switch Expressions). The required changes in XQuery are simpler, though, as the 3.1 grammar already supports arbitrary expressions as operands.

Examples

switch($value)
  case 1
  case 2
  case 3
  case 4
  case 5
    return 'small'
  default
    return 'big'

Proposed syntax:

switch($value)
  case 1 to 5
    return 'small'
  default
    return 'big'

Required Changes

The current matching rules could be rephrased as follows:

  1. The SwitchCaseOperand is evaluated.
  2. The resulting value is atomized.
  3. The case matches if the value is empty and if the value of the switch expression is empty as well.
  4. Otherwise, the atomized value of the switch operand expression is compared with each item of the atomized value of the SwitchCaseOperand using fn:deep-equal, with the default collation from the static context.
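
The rephrased rules can be modelled in a few lines of Python. In this sketch atomized values are Python lists (the switch value holds zero or one item), and Python's == merely stands in for fn:deep-equal, so collations and XPath type promotion are not modelled. case_matches and switch are hypothetical names.

```python
from typing import Any, List, Tuple


def case_matches(switch_value: List[Any], case_operand: List[Any]) -> bool:
    # Rule 3: an empty case operand matches only an empty switch value.
    if not case_operand:
        return not switch_value
    if not switch_value:
        return False
    # Rule 4: compare the switch value with each item of the case operand.
    return any(switch_value[0] == item for item in case_operand)


def switch(value: List[Any], cases: List[Tuple[List[Any], Any]], default: Any) -> Any:
    # Return the result of the first matching case, else the default.
    for operand, result in cases:
        if case_matches(value, operand):
            return result
    return default


cases = [(list(range(1, 6)), "small")]  # models: case 1 to 5 return 'small'
print(switch([3], cases, "big"))
print(switch([6], cases, "big"))
```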

References

  • Original Proposal: https://github.com/expath/xpath-ng/pull/12
  • Discussion on Slack: https://xmlcom.slack.com/archives/C011NLXE4DU/p1675006336963479

QT4 CG meeting 020 draft agenda #agenda-01-31

30 Jan at 08:47:00 GMT

Draft agenda published.

Issue #327 created #created-327

30 Jan at 08:42:45 GMT
Tokenisation

The rule in A.2

When tokenizing, the longest possible match that is consistent with the EBNF is used.

needs clarifying. It could be read as suggesting that if taking the longest match turns out to lead to a syntax error, the tokenisation should be re-attempted using a shorter match. I don't think that has ever been intended. So what exactly does the qualifier "that is consistent with the EBNF" actually mean?

Possibly related, A.2.2 Terminal Delimitation states:

Terminal symbols that are not used exclusively in [/* ws: explicit */] productions are of two kinds: delimiting and non-delimiting.

But (at least in the XQuery version) the list of delimiting tokens includes a number that are indeed used exclusively in ws:explicit productions, for example a number of tokens containing back-ticks, and ]]>.

I think we need to be clearer that tokens used in ws:explicit productions are recognised only when parsing the production that uses them. For example given the expression A[B[C]]>3, we should not recognise ]]> as a token under the longest-token rule. I think that's probably what the "consistent with the EBNF" rule is intended to convey.
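
The A[B[C]]>3 case is easy to reproduce with a naive tokenizer. The Python sketch below applies a pure longest-match rule over a fixed token set, with no knowledge of the grammar; it is a toy written for this example, not the tokenizer of any real processor, and tokenize is a hypothetical name.

```python
from typing import Iterable, List


def tokenize(text: str, tokens: Iterable[str]) -> List[str]:
    # Context-free longest match: always prefer the longest known token.
    ordered = sorted(tokens, key=len, reverse=True)
    result: List[str] = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = next((t for t in ordered if text.startswith(t, pos)), None)
        if match is None:
            # Not a delimiter: read a run of letters and digits instead.
            end = pos
            while end < len(text) and text[end].isalnum():
                end += 1
            if end == pos:
                raise ValueError(f"Unexpected character {text[pos]!r}")
            match = text[pos:end]
        result.append(match)
        pos += len(match)
    return result


print(tokenize("A[B[C]]>3", {"[", "]", ">", "]]>"}))
print(tokenize("A[B[C]]>3", {"[", "]", ">"}))
```

With ]]> in the global token set the first call swallows the two closing brackets and the comparison operator as one token, which no production can then use. Leaving ]]> out of the set, as if it were recognised only inside the production that needs it, gives the intended tokens.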

Pull request #326 created #created-326

29 Jan at 22:45:46 GMT

Issue 205: make support for higher-order functions mandatory

Issue #325 created #created-325

29 Jan at 22:03:56 GMT
Operator precedence table needs updating

The otherwise and -> operators (and maybe others) are missing from the non-normative precedence table in Appendix A.4.

Pull request #324 created #created-324

29 Jan at 18:23:31 GMT
Proposed syntax and semantics for string templates

See issue #58.

I would recommend reviewing the XQuery version of the spec first, since it contains additional notes contrasting string templates and the existing string constructors. The section on string constructors has moved, but is unchanged except for the addition of this note.

Issue #323 created #created-323

27 Jan at 21:56:32 GMT
add select attribute to xsl:text

Although xsl:text select="socks" would be the same as xsl:value-of select="socks" in implementation terms, users of XSLT 2 and later, even people who have been using XSLT 2 or 3 for some time, are often surprised to learn that xsl:value-of makes a text node, and that they need to use xsl:sequence to return something else.

So it'd be great to have them use xsl:text instead of xsl:value-of, where text nodes are wanted, because then introducing xsl:sequence is a small step.

Of course, beginners also often use value-of where they should be using apply-templates, e.g. to handle mixed content! But again, using xsl:text reduces that temptation.

We do have value templates now, <xsl:text>{ .... }</xsl:text>, which mitigates the need slightly, but I think only slightly, because the select= analogy is very compelling.

Issue #322 created #created-322

26 Jan at 10:10:12 GMT
Map construction in XSLT: xsl:record instruction

Constructing maps in XSLT often involves code rather like this:

            <xsl:map>
               <xsl:map-entry key="'author'" select="string(AUTHOR)"/>
               <xsl:map-entry key="'title'" select="string(TITLE)"/>
               <xsl:map-entry key="'price'" select="xs:decimal(PRICE)"/>
               <xsl:map-entry key="'publisher'" select="string(../@name)"/>
           </xsl:map>

The alternative using XPath is also rather ugly:

<xsl:sequence select="map{'author': string(AUTHOR),
                                                 'title':string(TITLE), 
                                                 'price': xs:decimal(PRICE), 
                                                 'publisher':string(../@name)}"/>

(the fact that it is creating a map doesn't stand out; the xsl:sequence is a distraction because there's no sequence involved; and many users dislike long multi-line XPath expressions because of formatting problems in their editing tools)

I propose a new instruction xsl:record which allows:

            <xsl:record author="string(AUTHOR)"
                                title="string(TITLE)" 
                                price="xs:decimal(PRICE)" 
                                publisher="string(../@name)"/>

This is rather like literal result elements in that the attributes are user-defined rather than system-defined. Unlike LREs, the values are general expressions rather than AVTs, because the values are not necessarily strings. The instruction can only be used where the keys (field names) take the form of NCNames.

If variable entries are required, or entries whose keys are not NCNames, they can appear as child instructions:

            <xsl:record author="string(AUTHOR)"
                        title="string(TITLE)"
                        price="xs:decimal(PRICE)"
                        publisher="string(../@name)">
               <xsl:if test="@private">
                  <xsl:map-entry key="'private entry'" select="true()"/>
               </xsl:if>
            </xsl:record>

Following the tradition of LREs, duplicates are resolved as "last one wins".
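
The "last one wins" rule can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that entries written as attributes are processed before entries produced by child instructions, which the proposal does not state explicitly.

```python
def build_record(attribute_entries, child_entries):
    """Combine (key, value) pairs; a later duplicate key replaces an earlier one."""
    record = {}
    # Assumption: attribute entries come first, then child-instruction entries.
    for key, value in list(attribute_entries) + list(child_entries):
        record[key] = value  # last one wins
    return record
```

Under that assumption, a child xsl:map-entry whose key matches an attribute name would override the attribute's value.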

If "standard attributes" such as [xsl:]version are required, they must be in the XSLT namespace, as with LREs.

Issue #321 created #created-321

26 Jan at 04:12:06 GMT
relax $input in fn:serialize

Relevant specifications: https://qt4cg.org/specifications/xpath-functions-40/Overview-diff.html#func-serialize

Would it be possible to relax the strictures on $input (first parameter) of fn:serialize()?

  1. The specifications do not explicitly forbid map(*) or array(*) as input, but in practice, when these are supplied, Saxon rejects them. Developers (or at least this one) who work with arrays and maps often need to render them in string output or messages, if only for diagnostics. If there is something really prohibitively wrong with those two items as input to fn:serialize(), then the specifications should say so.
  2. Attributes are forbidden, but it is unclear why. They get serialized fine in the context of a parent, why not alone?
  3. Namespace nodes are forbidden; see previous point.

(No doubt there must have been discussion on points 2-3, but the rationale is not clear from the specs.)

Perhaps the problem is that the details of what the serialization should look like are contestable. I think the answer there is simply: pick one. I think we'll happily live with whatever is chosen.

For the serialization of maps and arrays, I'll point to my tan:map-to-xml() and tan:array-to-xml() as one possible model; they have been indispensable for daily troubleshooting.

Pull request #320 created #created-320

25 Jan at 18:58:47 GMT
Issue 98 - add options parameter to fn:deep-equal

This proposal adds an options parameter to fn:deep-equal, giving much more detailed control over how the comparison is performed (while remaining backwards compatible by default).

This proposal is a first draft and I would request careful review; it's not one to pass through "on the nod".

Pull request #319 created #created-319

25 Jan at 16:20:34 GMT
Issue 221: op:same-key becomes fn:atomic-equal

The proposal renames op:same-key as fn:atomic-equal, thus making it directly available to applications.

Issue #294 closed #closed-294

25 Jan at 14:56:17 GMT

fn:remove removing multiple items

Issue #318 created #created-318

25 Jan at 11:36:27 GMT
Serialization HTML/XHTML output methods: meta elements and the charset attribute

HTML5 introduces the ability to write

<meta charset="utf-8"/>

in place of

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

The serialization spec (for HTML and XHTML output methods) ignores this.

(a) it requires the serializer to add a meta element in the second form rather than the first.

(b) when removing existing meta elements, it requires the second form to be deleted, but not the first. This may result in invalid (X)HTML in which both elements are present.
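
One way a serializer could avoid the duplicate is to treat both forms as encoding declarations when removing existing meta elements. The Python sketch below illustrates that idea only; it is not what the serialization spec currently says. Each meta element is modelled as a dict of its attributes, the function names are hypothetical, and emitting the short form is an assumption.

```python
def is_charset_meta(attributes):
    """Return True if a meta element's attributes declare the encoding in either form."""
    lowered = {name.lower(): value for name, value in attributes.items()}
    if "charset" in lowered:
        return True  # HTML5 short form: <meta charset="utf-8"/>
    # Older form: <meta http-equiv="Content-Type" content="..."/>
    return lowered.get("http-equiv", "").lower() == "content-type"


def replace_charset_meta(meta_elements, encoding):
    """Drop every existing encoding declaration, then add a single new one."""
    kept = [attrs for attrs in meta_elements if not is_charset_meta(attrs)]
    return [{"charset": encoding}] + kept
```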

Issue #309 closed #closed-309

24 Jan at 17:20:48 GMT

Drop ternary conditionals, as agreed on 2023-01-17

Issue #310 closed #closed-310

24 Jan at 17:20:32 GMT

Fix outstanding issues from PR 304

Issue #312 closed #closed-312

24 Jan at 17:20:13 GMT

Minor editorial improvements

Issue #313 closed #closed-313

24 Jan at 17:19:16 GMT

Issue 294: fn:remove()

Issue #317 created #created-317

24 Jan at 17:16:34 GMT
fn:format-integer: $lang → $language ?

A minor inconsistency in the XQFO specification: The third parameter of fn:format-integer is named $lang

https://qt4cg.org/specifications/xpath-functions-40/Overview-diff.html#func-format-integer

…whereas all other language parameters are named $language.

QT4 CG meeting 019 draft minutes #minutes-01-24

24 Jan at 17:11:00 GMT

Draft minutes published.

Issue #316 created #created-316

23 Jan at 16:23:37 GMT
Function fn:differences

I didn't see any issues thread devoted to fn:differences(), so am opening this one. Please respond with xrefs to anything relevant.

Draft here: https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-differences

IMO, this function seems overly complicated for both users and implementors. The specs provide difficult reading. But it is the first function to try to address the desideratum for differencing. Something like it is needed methinks.

My suggestion would be to simplify the function as a straightforward string comparison, i.e., change the signature to something like fn:differences($input1 as xs:string, $input2 as xs:string) as OUTPUT where OUTPUT is either a tree structure (like the output of fn:analyze-string()) or a sequence of records (e.g., (is-in-1 as xs:boolean, is-in-2 as xs:boolean, fragment as xs:string)).

Such a change would make the function more tractable for both users and implementers. The user would need to cast each sequence to a string, and in so doing will be able to (be compelled to) make fine-grained decisions on things such as normalization. Processor implementers have far simpler input, and they can choose the difference algorithm that makes best sense at the moment.

One counterargument might be that the resultant output would be difficult to correlate to the original sequences. Ostensibly, one wants to do things such as decide whether to drop certain items in sequence 1 or sequence 2. My response is that the current draft results in output that suffers from the same problem. Navigating the map to correlate it to the original sequence sounds daunting. With my suggestion, there are ways around this, through auxiliary functions or arity expansions that normalize the output.

But I don't want to get into postprocessing output here, which would be tangential to the main question, i.e., how fn:differences() should be constructed in a way conducive to both users and implementers.
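
To make the "sequence of records" output shape concrete, here is a minimal Python sketch of a string-only differences function. It uses difflib.SequenceMatcher from the Python standard library as a stand-in algorithm; the record layout (is-in-1, is-in-2, fragment) follows the suggestion above, and everything else is an assumption.

```python
from difflib import SequenceMatcher


def differences(input1, input2):
    """Return (is_in_1, is_in_2, fragment) records describing two strings."""
    records = []
    matcher = SequenceMatcher(a=input1, b=input2, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            records.append((True, True, input1[i1:i2]))
            continue
        # A "replace" opcode contributes both a deletion and an insertion.
        if i2 > i1:
            records.append((True, False, input1[i1:i2]))
        if j2 > j1:
            records.append((False, True, input2[j1:j2]))
    return records
```

Whatever algorithm is chosen, concatenating the fragments flagged as being in the first input should reproduce that input, and likewise for the second.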

Issue #311 closed #closed-311

23 Jan at 11:46:17 GMT

Stylesheet fix to mark optional fields in record definitions

Issue #62 closed #closed-62

21 Jan at 00:16:24 GMT

[FO] The parameter types for fn:unique and array:partition are incorrectly specified.

Issue #71 closed #closed-71

21 Jan at 00:14:10 GMT

[XSLT] Use of multiple predicates: order of evaluation

Issue #171 closed #closed-171

20 Jan at 23:44:17 GMT

XPath ternary conditional operator

Issue #315 created #created-315

18 Jan at 14:58:00 GMT
fn:transform inconsistency: initial-mode

The fn:transform specification in F+O says that if no initial-mode is supplied, the unnamed mode is used.

The XSLT 3.0 specification says that if no initial mode is supplied, then the default mode is used if one has been specified, or the unnamed mode is used if not.

I think the XSLT 3.0 spec should win here: it makes more sense if a default has been declared that it should actually be used.
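
The XSLT 3.0 rule amounts to a three-step fallback, sketched here in Python. The function name, parameter names, and the "#unnamed" marker are illustrative only.

```python
UNNAMED_MODE = "#unnamed"  # illustrative marker for the unnamed mode


def resolve_initial_mode(supplied_initial_mode=None, declared_default_mode=None):
    """Pick the initial mode following the XSLT 3.0 rule described above."""
    if supplied_initial_mode is not None:
        return supplied_initial_mode
    if declared_default_mode is not None:
        return declared_default_mode  # the step the fn:transform text skips
    return UNNAMED_MODE
```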

(Thanks to Amanda Galtman for pointing this out.)

Issue #314 created #created-314

18 Jan at 09:24:52 GMT
Basic Operations on Maps and Arrays

Manipulating Arrays and Maps

This is an outline of proposed new facilities designed to make processing of maps and arrays easier. The basic facilities needed for transformation of maps and arrays are the ability to decompose them into their parts, manipulate the parts, and then compose new arrays and maps from these parts.

Further background is in my 2022 Balisage paper, https://balisage.net/Proceedings/vol27/html/Kay01/BalisageVol27-Kay01.html

This proposal considers only the "shallow" operations on maps and arrays. Further proposals for deep search and update of nested structures are to be expected.

Basics

A map entry is an item used to represent a key-value pair in a map; it is an item of type record(key as xs:anyAtomicType, value as item()*), aliased in this proposal as type(map-entry).

Note: an alternative representation for key-value pairs is as a singleton map, and that's the representation used by the existing map:entry() function and by the xsl:map-entry instruction. A representation as record(key, value) is rather more convenient to enable extraction of the key and value, but does create some compatibility issues...

An array entry is an item used to represent a member of an array; it is an item of type record(value as item()*), aliased in this proposal as type(array-entry).

Decomposing Maps and Arrays

The function map:entries($map) returns a sequence of map entries, in unpredictable order, representing the contents of the supplied map. It is equivalent to map:for-each($map, ->($k, $v){map:entry($k, $v)}).

The function array:entries($array) returns a sequence of array entries, in array order, representing the members of the supplied array. It is equivalent to array:for-each($array, ->($v){array:entry($v)}).

Constructing Maps and Arrays

The function map:of($entries as type(map-entry)*) as map(*) constructs a map from a sequence of map entries. A second parameter, $options, is available to control handling of duplicates, as with map:merge(). The function is equivalent to map:merge($entries!map{'key':?key, 'value':?value}).

The function array:of($entries as type(array-entry)*) as array(*) constructs an array from a sequence of array entries. It is equivalent to array:fold-left($entries, [], array:append#2).

The function map:entry($key, $value) as type(map-entry) is equivalent to map{'key':$key, 'value':$value}. Problem: we already have a function map:entry() in 3.1 that does something different. Need to change the terminology...

The function array:entry($value) as type(array-entry) is equivalent to map{'value':$value}.

Filtering Maps and Arrays

The construct $map?[PREDICATE] is equivalent to map:of(map:entries($map)[PREDICATE]). For example, given a map in which the keys are dates, $map?[year-from-date(?key)=2023] returns a map containing those entries in which the key is a date in 2023.

The construct $array?[PREDICATE] is equivalent to array:of(array:entries($array)[PREDICATE]). For example, $array?[1] selects the first item in the array (as a single-member array), while $array?[exists(?value)] returns an array containing all those entries in the input array that are not empty. If $array is an array of maps, then $array?[?value?name='John'] selects those members of the array that are maps having ?name='John'.

Mapping Maps and Arrays

The construct $map!!EXPR evaluates EXPR once for each entry in $map and returns the result as a flattened sequence. For example map:of($map!!map:entry(?key, ?value+1)) returns a map in which each value has been incremented by one.

The construct $array!!EXPR evaluates EXPR once for each entry in $array, and returns the result as a flattened sequence. For example, array:of($array!!array:entry(?value+1)) returns an array in which every value has been incremented by one.
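
The decompose, filter or map, then recompose pattern described so far can be modelled roughly in Python, with dicts standing in for maps, lists for arrays, and small dicts for entries. This is a simplification: XDM values are sequences, positional predicates such as $array?[1] are not modelled, and resolving duplicate keys as "last one wins" in map_of is an assumption.

```python
from datetime import date


def map_entries(m):
    """Model of map:entries: one {'key', 'value'} record per map entry."""
    return [{"key": k, "value": v} for k, v in m.items()]


def map_of(entries):
    """Model of map:of: rebuild a map from entry records (last duplicate wins)."""
    return {e["key"]: e["value"] for e in entries}


def array_entries(a):
    """Model of array:entries: one {'value'} record per member, in array order."""
    return [{"value": v} for v in a]


def array_of(entries):
    """Model of array:of: rebuild an array from entry records."""
    return [e["value"] for e in entries]


sales = {date(2022, 12, 31): 5, date(2023, 1, 1): 7}

# $map?[year-from-date(?key)=2023]
sales_2023 = map_of(e for e in map_entries(sales) if e["key"].year == 2023)

# map:of($map!!map:entry(?key, ?value+1))
incremented = map_of(
    {"key": e["key"], "value": e["value"] + 1} for e in map_entries(sales)
)

# array:of($array!!array:entry(?value+1))
bumped = array_of({"value": e["value"] + 1} for e in array_entries([1, 2, 3]))
```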

FLWOR Expressions

The for-member clause for member $m in $array is equivalent to for $sys:var in array:entries($array) let $m := $sys:var?value.

The for-entry clause for entry ($k, $v) in $map is equivalent to for $sys:var in map:entries($map) let $k := $sys:var?key, $v := $sys:var?value.

XSLT

Iteration over maps and arrays is achieved using <xsl:for-each select="array:entries()"> and <xsl:for-each select="map:entries()"> directly.

Construction of maps uses the existing instructions <xsl:map> and <xsl:map-entry>. There is an inconvenience here in that the <xsl:map-entry> instruction returns a singleton map (map{key:value}) rather than a map entry as defined in this proposal (map{'key':key, 'value':value}).

Construction of arrays uses the new instructions <xsl:array> and <xsl:array-entry>. The xsl:array-entry instruction is defined to construct an array entry as defined in this proposal.

Use Cases

To be supplied.

Pull request #313 created #created-313

17 Jan at 22:53:06 GMT
Issue 294: fn:remove()

Allow remove() to remove several items, aligning it with array:remove() and map:remove()
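
As a rough model of the extended behaviour, the Python sketch below removes the items at several 1-based positions in one call. Ignoring out-of-range positions is an assumption carried over from the single-position fn:remove.

```python
def remove(items, positions):
    """Return items without those at the given 1-based positions."""
    unwanted = set(positions)
    # Positions that do not exist in items simply never match.
    return [item for index, item in enumerate(items, start=1) if index not in unwanted]
```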

Pull request #312 created #created-312

17 Jan at 22:25:54 GMT
Minor editorial improvements
  • Issue 300 (clarification about results being normalized)
  • Action QT4CG-018-02 (explaining signature notation)
  • Action QT4CG-018-04 (explaining numeric predicates on ancestor unions)

Pull request #311 created #created-311

17 Jan at 21:39:32 GMT

Stylesheet fix to mark optional fields in record definitions

Pull request #310 created #created-310

17 Jan at 21:27:45 GMT
Fix outstanding issues from PR 304

See https://github.com/qt4cg/qtsp…ecs/pull/304#issuecomment-1378532583 - but excluding item 3 because that's a stylesheet change.

Pull request #309 created #created-309

17 Jan at 19:20:00 GMT
Drop ternary conditionals, as agreed on 2023-01-17

We agreed today to drop ternary conditional expressions from the proposal; this PR implements that change.

QT4 CG meeting 018 draft minutes #minutes-01-17

17 Jan at 16:15:00 GMT

Draft minutes published.

Issue #286 closed #closed-286

17 Jan at 17:14:46 GMT

Spec changes to allow child::(a|b|c) - Issue 107

Issue #290 closed #closed-290

17 Jan at 17:14:01 GMT

Fix issue #18 (function type hierarchy)

Issue #35 closed #closed-35

17 Jan at 17:13:37 GMT

[FO]The `union ( | )`, `itersect`, `except` and `combine (,)` operators are not mentioned in the F & O. Have not the best categorization in the XPath spec.

Issue #288 closed #closed-288

17 Jan at 17:13:36 GMT

Error in fn:path specification

Issue #257 closed #closed-257

17 Jan at 17:13:36 GMT

Improving the styling/presentation/prepresentation of the record types in the F&O spec

Issue #70 closed #closed-70

17 Jan at 17:13:35 GMT

[FO] Built-in function changes to support default values

Issue #291 closed #closed-291

17 Jan at 17:13:35 GMT

DTD validity of F&O spec

Issue #304 closed #closed-304

17 Jan at 17:13:34 GMT

Mike's content changes from PR 292

Issue #284 closed #closed-284

17 Jan at 17:12:56 GMT

Add grammar for "if (test) then {expr}" with no else

Pull request #308 created #created-308

15 Jan at 17:05:08 GMT
Improve the legends in the diagrams

This PR completes my action QT4CG-015-03: NW to make sure the direction of the arrow is in the legends

I also made sure the legends aren't too wide. I still have more work to do for the other actions.

Issue #259 closed #closed-259

15 Jan at 16:46:57 GMT

Issue #74 - add the fn:parse-html function

QT4 CG meeting 018 draft agenda #agenda-01-17

15 Jan at 15:30:00 GMT

Draft agenda published.

Issue #306 closed #closed-306

15 Jan at 16:17:10 GMT

fn:char - editors actions from 2023-01-10

Issue #307 created #created-307

15 Jan at 15:42:30 GMT
Parsing and building URIs comments and queries
  1. fn:build-uri states:

If the scheme key is present in the map, the URI begins with the value of that key concatenated with //, otherwise it begins //.

a. Shouldn't the concatenation be :// so that, e.g., http becomes http://?

b. How are non-hierarchical schemes such as urn and mailto handled?

  2. RFC 3986 allows IPv6 and IPvFuture addresses that contain : characters, e.g. http://[::1]:80.

My understanding of fn:parse-uri is that this will fail to parse.

  3. RFC 3986 states that for userinfo, the user:password form is deprecated.

Browsers will reject this due to the security risk, and the RFC suggests that applications should not render the password (the part after the :) in clear text. Should fn:build-uri follow suit, or (along with fn:parse-uri) have an option to control the behaviour (keep, remove, invalid), where if the option is invalid, it will throw an fn:error?

  4. RFC 3986 suggests that the port should be omitted if it matches the default for the scheme.

Should fn:build-uri have this behaviour?
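
The behaviour these questions point towards can be sketched in Python. This illustrates one possible answer, not the fn:build-uri specification: the host, port and path keys, the DEFAULT_PORTS table, and the function name are assumptions. urlsplit from the standard library is imported to show that a bracketed IPv6 authority can be parsed.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # illustrative subset


def build_uri(parts):
    """Assemble a URI from a dict of parts (hypothetical key names)."""
    scheme = parts.get("scheme")
    host = parts.get("host")
    port = parts.get("port")
    path = parts.get("path", "")

    uri = f"{scheme}:" if scheme else ""
    if host is not None:
        # Bracket IPv6 literals, as RFC 3986 requires.
        authority = f"[{host}]" if ":" in host else host
        # Omit the port when it is the default for the scheme.
        if port is not None and port != DEFAULT_PORTS.get(scheme):
            authority += f":{port}"
        uri += "//" + authority
    return uri + path


ipv6 = urlsplit("http://[::1]:80")
```

Emitting // only when a host is present lets non-hierarchical schemes such as urn and mailto come out as scheme:path.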

Pull request #306 created #created-306

10 Jan at 21:23:26 GMT
fn:char - editors actions from 2023-01-10

Changes to the new fn:char function (issue #121) as follows:

  • Action QT4CG-017-01 clarifies the definition of formats #nnn and #xnnn.
  • Action QT4CG-017-02 changes the order of the rules
  • In discussion it was asked whether any HTML5 entity names refer to strings comprising more than one character. On investigation it appears that they do, and the spec has been revised to allow for this.
  • added history/status information
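
The observation about multi-character entity names can be checked against the HTML5 entity table that ships with Python's standard library (html.entities.html5). This is offered only as a quick way to reproduce the investigation, not as the source the editors used.

```python
from html.entities import html5

# Entity names whose replacement text is more than one codepoint long.
multi_character = {
    name: value for name, value in html5.items() if len(value) > 1
}

# One such entry: U+2242 followed by U+0338 (combining long solidus overlay).
not_equal_tilde = html5["NotEqualTilde;"]
```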

Issue #121 closed #closed-121

10 Jan at 17:20:20 GMT

[FO] fn:nl, fn:tab, fn:cr

QT4 CG meeting 017 draft minutes #minutes-01-10

10 Jan at 16:20:00 GMT

Draft minutes published.

Issue #261 closed #closed-261

10 Jan at 17:19:36 GMT

Proposed fn:char function - see issue 121

Issue #305 created #created-305

09 Jan at 17:50:13 GMT
parse-xml() and whitespace stripping

There seems to be nothing in either the XSLT spec or in F+O that says explicitly whether stylesheet-defined space stripping rules (xsl:strip-space and xsl:preserve-space) apply to documents loaded using fn:parse-xml (or, by extension, parse-html).

The spec says that these rules apply to "source trees" defined as "any tree provided as input to the transformation. This includes the document containing the [global context item] if any, documents containing nodes present in the [initial match selection], documents containing nodes supplied as the values of [stylesheet parameters], documents obtained from the results of functions such as [document], [doc], and [collection]...".

I guess one reasonable interpretation is that the "such as" includes parse-xml(). But it goes rather against the grain that the behaviour of parse-xml() should be affected by the containing stylesheet declarations, when there is no mention of such a context-dependency in the function specification; in this, parse-xml() is rather different from doc() which deliberately says very little about how the XDM instance returned relates to the URI supplied as input.

Issue #292 closed #closed-292

09 Jan at 09:43:27 GMT

Merge signatures with optional params

Pull request #304 created #created-304

09 Jan at 09:32:41 GMT
Mike's content changes from PR 292

I teased apart some of the omnibus PR #292. I've committed the schema and stylesheet changes. This PR covers the remaining prose changes.

Mike writes:

I regret that this has turned into a bit of an omnibus PR. The main changes are:

  • Fix validity issues with the function catalog and its schema (Issue 291)
  • Convert all functions to use a single signature with optional parameters (Issue 70)
  • Extend the function catalog to handle record definitions (Issue 257)
  • Fix the (trivial) bug with properties of fn:path (Issue 288)
  • Add introductory text concerning the handling of operators (Issue 35)

Fix #291 Fix #70 Fix #257 Fix #288 Fix #35

Issue #303 closed #closed-303

09 Jan at 07:38:47 GMT

Mike's proposed schema and stylesheet changes

Pull request #303 created #created-303

09 Jan at 07:32:02 GMT
Mike's proposed schema and stylesheet changes

These are the schema and stylesheet changes from PR #292. They don't break the build and on casual inspection they seem fine, so I'm just going to accept them.

QT4 CG meeting 017 draft agenda #agenda-01-10

06 Jan at 16:02:00 GMT

Draft agenda published.

Issue #300 created #created-300

06 Jan at 16:19:33 GMT
[F+O] Ambiguity regarding Unicode normalization (editorial)

In §1.7.1 the paragraph

Unless explicitly stated, the xs:string values returned by the functions in this document are not normalized in the sense of [Character Model for the World Wide Web 1.0: Fundamentals].

is a little bit ambiguous for my taste. By "are not normalized" it means "no action is taken to normalize the strings"; it doesn't mean "the strings will not be in normalized form".

I suggest: "Unless explicitly stated, the functions in this document operate on strings as sequences of codepoints and do not attempt to convert input strings, or produce output strings, in Unicode normalized form. Unicode normalization occurs only when explicitly requested, for example by use of the fn:normalize-unicode function."
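
The distinction is easy to demonstrate in Python, used here only as an analogy for the F+O behaviour: two strings that render identically differ as codepoint sequences until normalization is explicitly requested. The helper names are hypothetical.

```python
import unicodedata

composed = "\u00e9"      # é as a single codepoint
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def codepoint_equal(a, b):
    """Compare strings as sequences of codepoints, with no normalization."""
    return [ord(c) for c in a] == [ord(c) for c in b]


def nfc_equal(a, b):
    """Compare strings after explicitly requesting NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```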

At the same time we might update the reference to point to "Character Model for the World Wide Web: String Matching", revised in 2021, though it is still only a Working Group Note. See https://www.w3.org/TR/charmod-norm/#unicodeNormalization

Issue #281 closed #closed-281

02 Jan at 18:12:42 GMT

XPath: Short-circuiting Functions and Lazy Evaluation Hints

Issue #299 created #created-299

02 Jan at 18:08:24 GMT
Short-circuiting functions, function-arity guards and lazy hints

I. Shortcutting and lazy hints

Let us have this expression:

let $f := function($arg1 as item()*, $arg2 as item()*) as function(item()*) as item()*
             {  (: Some code here :) }
  return
    $f($x) ($y)

Evaluating $f($x) produces a function. The actual arity of this resulting function can be any number N >= 0 :

  • If N > 1 there would be arity mismatch error, as only one argument $y is provided in the expression.

  • If N = 1 the final function call can be evaluated, and the argument $y must be evaluated, or

  • If N = 0, then $y is unneeded and can safely be ignored according to the updated Coercion Rules / Function Coercion in XPath 4.0.

Because a possibility exists to be able to ignore the evaluation of $y, it is logical to delay the evaluation of $y until the actual arity of $f($x) is known.

The current XPath 4.0 evaluation rules do not require an implementation to base its decision whether or not to evaluate $y on the actual arity of the function produced by $f($x), thus at present an implementation could decide to evaluate $y regardless of the actual arity of the function produced by $f($x).

This is where a lazy hint comes in: it indicates to the XPath processor that it is logical to make the decision about evaluation of $y based on the actual arity of the function returned by $f($x).

A rewrite of the above expression using a lazy hint looks like this:

let $f := function($arg1 as item()*, $arg2 as item()*) as function(item()*) as item()*
             {  (: Some code here :) }  
  return
    $f($x) (lazy $y)

Here is one example of a function with short-cutting and calling it with a lazy hint:

let $fAnd := function($x as xs:boolean, $y as xs:boolean) as xs:boolean
             {
                let $partial := function($x as xs:boolean) as function(xs:boolean) as  xs:boolean
                                {
                                  if(not($x)) then ->(){false()}
                                              else ->($t) {$t}
                                }
                 return $partial($x)($y)
             }
  return
     $fAnd($x (: possibly false() :), lazy $SomeVeryComplexAndSlowComputedExpression)

Without the lazy hint in the above example, it is perfectly possible that an XPath implementation, unrestricted by the current rules, would evaluate $SomeVeryComplexAndSlowComputedExpression - something that is unneeded and could be avoided completely.
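
In a language with strict evaluation, the effect of the lazy hint can be imitated by passing a zero-argument function (a thunk) in place of the value. The Python sketch below models $fAnd that way. It is an analogy only: the names are hypothetical, and the proposal itself is about a hint to the XPath processor rather than explicit thunks.

```python
calls = []  # records every time the expensive argument is actually evaluated


def slow_computation():
    """Stands in for $SomeVeryComplexAndSlowComputedExpression."""
    calls.append("evaluated")
    return True


def f_and(x, lazy_y):
    """Short-circuiting 'and'; lazy_y is a thunk, forced only when needed."""
    if not x:
        return False   # models ->(){false()}: the argument is never forced
    return lazy_y()    # models ->($t){$t}: the argument is forced here


first = f_and(False, slow_computation)
calls_after_first = list(calls)
second = f_and(True, slow_computation)
```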

Formal syntax and semantics

  1. The lazy keyword should immediately precede any argument in a function call. If specified, it means that it is logical to make the decision about evaluation of this argument based on the actual arity of the function in this function call.

    Based on this definition, it follows that lazy $argK implies lazy for all arguments following $argK in the function call. Thus specifying more than one lazy hint within a given function call is redundant and an implementation may report this redundancy to the user.

    The scope of a lazy keyword specified on an argument is this and all following arguments of (only) the current function call.

  2. It is possible to specify a lazy keyword that is in force for the respective argument(s) of all function calls of the given function. To do this, the lazy keyword must be specified immediately preceding a parameter name in the function definition of that function.

    For example, if the function $f is specified as:

    let $f := function($arg1 as item()*, lazy $arg2 as item()*, $arg3 as item()*, $arg4 as item()* ) 
              { (: some code here:) }
      return
         $someExpression
    

    Then any call of $f in its definition scope that has the form:

    $f($x, $y, $z, $t)
    

    is equivalent to:

    $f($x, lazy $y, $z, $t)
    
  3. It is possible to specify the lazy keyword immediately preceding a function definition. This instructs the XPath processor that any call of this function is only necessary to be evaluated if the function is actually called during the evaluation of the expression that contains this function call.

    For example:

    let $complexComputation := lazy function($x, $y) {$x + $y}, (: Make it as complex as you want ... :)
         $someCondition := function()
            {
                let $date := current-date()
                  return
                      month-from-date($date) eq 2
                    and 
                     day-from-date($date) eq 29 
           }
      return if($someCondition()) 
               then $complexComputation(2, 3)
               else 0
    

    Specifying the lazy keyword in the function definition for $complexComputation can save significant computing resources, because the programmer knows that $someCondition() is true during only a single day in any 4-year period.

II. fn:lazy

Summary

Applied to a single argument that can be any expression. Lazily returns its argument expression.

Signature

lazy fn:lazy( 
        $expression as item()*
) as item()*

Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The semantics of the function is strictly defined below:

let $lazyFunction := lazy fn:identity#1
   return
      (: AnyExpression here :)

Any expression Q of the form:

Q(E1, lazy(E2))

where E1 and E2 are subexpressions of Q, must be evaluated by the Processor in two steps:

  1. Substitute the expression

    Q(E1, lazy(E2))

    with:

    Q(E1, ?) (lazy E2)

  2. Evaluate the latter according to the rules for a lazy argument

Example

We can use almost the same example as above, but here $complexComputation is defined without the lazy keyword and thus is not a lazy function. To have $complexComputation evaluated lazily, we call the lazy() function, passing $complexComputation to it:

let $complexComputation := (: no lazy here :) function($x, $y) {$x + $y}, (: Make it as complex as you want ... :)
     $someCondition := function()
        {
            let $date := current-date()
              return
                  month-from-date($date) eq 2
                and 
                 day-from-date($date) eq 29 
       }
  return 
        $someCondition() and lazy( $complexComputation(2, 3))

Here the expression Q is:

$someCondition() and lazy( $complexComputation(2, 3))

This is the same as:

fn:op("and")($someCondition(), lazy( $complexComputation(2, 3)))

According to the Rules above, the processor converts this to:

fn:op("and")($someCondition(), ?) (lazy( $complexComputation(2, 3)) )

$someCondition() is evaluated and if its value is false(), then the expression to be evaluated is:

fn:op("and")(false(), ?) (lazy( $complexComputation(2, 3)) )

Since fn:op("and")(false(), ?) is by definition function() {false()}, the final result false() is produced and the unnecessary argument $complexComputation(2, 3) is not evaluated at all.

III. A function's arity is a guard for its arguments

Let us have a function $f defined as below:

let $f := function($arg1 as item()*, $arg2 as item()*, …, $argN as item()*)
   as function(item()*, item()*, …, item()*) as item()*
     {
       if($cond0($arg1))       then -> () { 123 }
        else if($cond1($arg1)) then -> ($Z1 as item()*) {$Z1}
        else if($cond2($arg1)) then -> ($Z1 as item()*, $Z2 as item()*) {$Z1 + $Z2}
        (:    .        .        .        .         .        .        .         .  :)
        else if($condK($arg1)) then -> ($Z1 as item()*, $Z2 as item()*, …, $Zk as item()*)
                                       {$Z1 + $Z2 + … + $Zk}
        else ()
     }
  return
     $f($y1, $y2, …, $yN) ($z1, $z2, …, $zk)

A call to $f returns a function whose arity may be any of the numbers: 0, 1, …, K.

Depending on the arity of the returned function (0, 1, …, K), the last (K, K-1, K-2, …, 2, 1, 0) arguments of the function call:

$f($y1, $y2, . . . , $yN) ($z1, $z2, . . . , $zk)

are unneeded and it is logical that they would not need to be evaluated.

So, the actual arity of the result of calling $f is a guard for the arguments of a call to this function-result.

Thus, one more bullet needs to be added to 2.4.5 Guarded Expressions (https://qt4cg.org/specifications/xquery-40/xpath-40.html#id-guarded-expressions), specifying an additional guard-type:

  • In an expression of the type E(A1, A2, ..., AN) any of the arguments AK is guarded by the condition actual-arity(E) ge K. This rule has the consequence that if the actual arity of E() is less than K, then evaluating any argument Am (where m >= K) must not raise a dynamic error. An implementation may base its decision on whether to evaluate the arguments on the actual arity of E().
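
The arity-guard idea can be modelled in Python by supplying arguments as thunks and forcing only as many as the callee's actual arity requires. This is a sketch of the concept, with hypothetical helper names; it uses inspect.signature to discover the arity at call time.

```python
import inspect

evaluated = []  # records which argument values were actually computed


def thunk(value):
    """Wrap a value so that its 'evaluation' is observable and deferred."""
    def force():
        evaluated.append(value)
        return value
    return force


def call_with_arity_guard(function, *thunks):
    """Call function, forcing only the arguments its actual arity needs."""
    arity = len(inspect.signature(function).parameters)
    if arity > len(thunks):
        raise TypeError("arity mismatch: too few arguments supplied")
    # Arguments beyond the actual arity are never forced.
    return function(*(t() for t in thunks[:arity]))


zero_arity_result = call_with_arity_guard(lambda: 123, thunk(1), thunk(2))
evaluated_after_zero = list(evaluated)
one_arity_result = call_with_arity_guard(lambda z1: z1, thunk(5), thunk(7))
```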

Issue #298 created #created-298

02 Jan at 15:36:58 GMT
Abstract supertype for map and array

I've been wondering whether there would be any mileage in introducing an abstract supertype for map() and array(), perhaps called lookup(). This would basically treat an array as a map with integer keys.

This would allow a cleaner type signature for map:find() and any future functions such as xx:search() that work both on maps and arrays. It might simplify the description of the lookup operator "?". For functions that already exist in both the map and array namespaces, such as get(), we could introduce a unified function in the fn namespace with the cosmetic benefit of reducing the need for namespace prefixes and namespace declarations.
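
A unified get() of the kind suggested here might behave like the following Python sketch, in which a list is treated as a map from 1-based integer positions to members. The names lookup_get and lookup_keys are hypothetical, and raising an error for a missing key is an assumption rather than part of the suggestion.

```python
def lookup_get(container, key):
    """Unified lookup: dicts by key, lists by 1-based integer position."""
    if isinstance(container, dict):
        return container[key]
    if isinstance(container, list):
        is_position = isinstance(key, int) and not isinstance(key, bool)
        if not is_position or not 1 <= key <= len(container):
            raise KeyError(key)
        return container[key - 1]
    raise TypeError("expected a map (dict) or an array (list)")


def lookup_keys(container):
    """Keys of a map, or the positions 1..n of an array."""
    if isinstance(container, dict):
        return list(container)
    return list(range(1, len(container) + 1))
```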

I'm still keen to find a better way of doing iteration, filtering, mapping, and construction of maps and arrays, and I think this might be a useful stepping stone.