@qt4cg statuses in 2026
This page displays status updates about the QT4 CG project from 2026.
See also recent statuses.
QT4 CG meeting 153 draft minutes #minutes-02-17
Draft minutes published.
Issue #2457 closed #closed-2457
Improved use of fos:result
Issue #2456 closed #closed-2456
Stylesheet handling of fos:result/@narrative
Issue #2234 closed #closed-2234
Replace `a/get(XX)` with `a/?(XX)`
Issue #2427 closed #closed-2427
Node construction in XPath
Issue #2446 closed #closed-2446
2427 Add computed node constructors to XPath
Issue #2459 closed #closed-2459
What are "invalid XML characters" in the XPath file read functions?
Issue #2385 closed #closed-2385
The XML version of the XPath spec isn't the XML version of the spec, it's HTML
Pull request #2467 created #created-2467
Harmonize the fn: and file: functions that read text
Close #2460
This PR harmonizes the functions fn:unparsed-text, fn:unparsed-text-lines, file:read-text, and file:read-text-lines with respect to handling non-permitted characters. Each function has an options parameter and that parameter may contain a fallback function to remap non-permitted characters.
This leaves unresolved the question of what to do about permitted characters not allowed in XML, but that’s orthogonal. Strings containing such characters might arise from any of these functions, but equally, might arise from other operations.
Issue #2466 created #created-2466
format-number() precision
The specification of format-number() says:
If there are several such values that are numerically equal to the mantissa (bearing in mind that if the mantissa is an xs:double or xs:float, the comparison will be done by converting the decimal value back to an xs:double or xs:float), the one that is chosen should be one with the smallest possible number of digits not counting leading or trailing zeroes (whether significant or insignificant).
The parenthetical "bearing in mind" note needs updating, because comparison of a decimal to a double is no longer done by converting the decimal back to a double.
Background: XSLT test case format-number-044a, which I have extricated from the composite test format-number-044, formats the xs:double obtained as 1E100 div 3. This is giving me a result with sixteen 3s on Java, fifteen 3s on C#. I am trying to work out which is correct, or at any rate which one should be produced according to the above rules.
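One way to explore the question is to search for the shortest mantissa that still converts back to the same double, which is the round-trip reading described by the parenthetical note. The Python sketch below assumes that Python floats are IEEE 754 doubles; the helper name `shortest_round_trip` is hypothetical, and this is not the format-number() algorithm itself.

```python
def shortest_round_trip(x: float) -> str:
    """Return x in scientific notation with the fewest significant digits
    that still convert back to exactly the same double."""
    # 17 significant digits are always enough to identify a finite double.
    for digits in range(1, 18):
        candidate = format(x, f".{digits - 1}e")
        if float(candidate) == x:
            return candidate
    raise ValueError("no round-trip representation found (NaN?)")


value = 1e100 / 3
print(shortest_round_trip(value))
```

Under the newer rule, where a decimal is compared to a double without converting it back, only the exact decimal expansion of the double is numerically equal to it, so the two readings can pick different digit counts.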
QT4 CG meeting 153 draft agenda #agenda-02-17
Draft agenda published.
Issue #2465 created #created-2465
Error description of FODC0006 should be more generic
The error FODC0006 can be raised by fn:parse-xml and fn:parse-xml-fragment.
It is currently described as
err:FODC0006, String passed to fn:parse-xml is not a well-formed XML document.
Raised by [fn:parse-xml](https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-parse-xml)
if the supplied string is not a well-formed and namespace-well-formed XML document;
or if DTD validation is requested and the document is not valid against its DTD.
I propose to alter the description to
err:FODC0006, String cannot be parsed as XML.
Raised by [fn:parse-xml](https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-parse-xml) or [fn:parse-xml-fragment](https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-parse-xml-fragment)
if the supplied string is not a well-formed and namespace-well-formed XML document;
or if DTD validation is requested and the document is not valid against its DTD;
or if it was passed to parse-xml-fragment and is not a well-formed external general parsed entity, if it contains entity references other than references to predefined entities, or if a document that incorporates this well-formed parsed entity would not be namespace-well-formed.
Issue #2464 created #created-2464
add method to use path()
The description of fn:path() says "Returns a path expression that can be used to select the supplied node relative to the root of its containing document", but gives no guidance on how to use such a path expression to select a node. One can use eval(), or could write an invisible XML grammar, but that'd be like saying “if and do-while can be used to write a program to solve chess problems”.
Easiest fix - remove the text saying people can use the result of path() to select nodes, since they can’t.
Slightly harder - add a path-to-node($root, $path) as node()? function.
I’d prefer the function—unless there is one and i missed it, and then i’d prefer the obvious editorial change :-) —but reducing the reader’s expectations would be OK too.
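To make the idea of a path-to-node function concrete, here is a minimal Python sketch using the standard library's ElementTree. It assumes paths made only of element steps in the `/Q{uri}local[position]` style that fn:path() produces; the name `path_to_node` mirrors the proposal but the implementation is hypothetical and ignores attribute, text, comment and other step kinds.

```python
import re
import xml.etree.ElementTree as ET

# One element step in the style of fn:path() output, e.g. /Q{urn:a}x[1]
STEP = r"/Q\{([^}]*)\}([^\[/]+)\[(\d+)\]"


def path_to_node(root, path):
    """Follow an element-only path starting at the document element `root`.

    Returns the selected element, or None if no such element exists.
    """
    if not re.fullmatch(f"(?:{STEP})+", path):
        raise ValueError(f"unsupported path: {path!r}")
    # ElementTree spells namespaced names as {uri}local.
    steps = [
        (f"{{{uri}}}{local}" if uri else local, int(pos))
        for uri, local, pos in re.findall(STEP, path)
    ]
    # The first step names the document element itself.
    tag, pos = steps[0]
    if root.tag != tag or pos != 1:
        return None
    node = root
    for tag, pos in steps[1:]:
        same_name = [child for child in node if child.tag == tag]
        if not 1 <= pos <= len(same_name):
            return None
        node = same_name[pos - 1]
    return node


doc = ET.fromstring('<root><item/><item><x xmlns="urn:a"/></item></root>')
found = path_to_node(doc, "/Q{}root[1]/Q{}item[2]/Q{urn:a}x[1]")
print(found.tag)
```

Returning None for a path that selects nothing corresponds to the `node()?` return type in the suggested signature.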
Issue #2463 created #created-2463
Add map:apply() or third argument to fn:apply() as a map of names to parameter values.
With keyword arguments being available, it would make sense to have a map:apply($function, $map), in which keys in the map are mapped to argument names of the function.
Admittedly since fn:apply() takes an array and is not called array:apply(), maybe an optional 3rd argument to fn:apply() would work better, with the same rules as for static function application.
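As an analogy only, Python's keyword-argument unpacking behaves much like the proposed map-based application: each key in the mapping is bound to the parameter of the same name. The names `apply_map` and `substring` below are hypothetical and do not come from any specification.

```python
def apply_map(function, arguments):
    """Call `function`, binding each map key to the parameter of that name."""
    return function(**arguments)


def substring(value, start, length):
    # 1-based start position, as in XPath's fn:substring.
    return value[start - 1 : start - 1 + length]


print(apply_map(substring, {"value": "metadata", "start": 4, "length": 3}))  # prints "ada"
```

A key that matches no parameter name raises a TypeError in Python, which is one possible answer to the question of how unknown keys should be treated.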
Issue #2462 created #created-2462
Revert dynamic function calls on sequences
I would like to question the decision to allow function calls on sequences.
While I favored the change in the past, repeated user feedback indicates that the new behavior is confusing and complicates debugging. First, it legalizes cryptic code like ()(). Second, seemingly simple function calls like $add(1, 2) return an empty sequence if $add turns out to be empty.
I believe we should make type safety and readability a priority over convenience. If a short syntax is needed, one can still write $add ! .(1, 2).
Related: #2219 (can be closed if we revert dynamic function calls).
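The debugging concern can be modelled in Python, treating a sequence as a list. This sketch assumes that a dynamic call on a sequence applies every function in it and concatenates the results; the helper names `call_each` and `call_one` are hypothetical.

```python
def call_each(functions, *args):
    """Model of a dynamic call on a sequence of functions:
    call every function and collect the results."""
    return [function(*args) for function in functions]


def call_one(functions, *args):
    """Model of the stricter rule: exactly one function is required."""
    if len(functions) != 1:
        raise TypeError(f"expected exactly one function, got {len(functions)}")
    return functions[0](*args)


def add(a, b):
    return a + b


print(call_each([add], 1, 2))  # prints [3]
print(call_each([], 1, 2))     # prints [], the missing function goes unnoticed
```

With `call_one`, the empty case fails immediately at the call site instead of silently propagating an empty result.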
Issue #2461 created #created-2461
Unparsed entities
Support for unparsed entities and notations is something of a minority interest, but recent correspondence with a Saxon user reminds me that there are applications that depend on them quite heavily. There is limited support for them in the data model and in XSLT, but none in XPath or XQuery.
There is no logical reason for the two functions unparsed-entity-uri() and unparsed-entity-public-id() to be XSLT-only, other than to save XQuery implementors the trouble of implementing them.
At the same time, the functions are incomplete and inadequate. For example, there is no way of obtaining a complete list of declared entities, and there is no way of getting information about the notations that they refer to.
Many XML parsers do not expose information about entities and notations, so anything we define should be capable of returning a result that indicates "information not available for this document (or for this implementation)".
I propose a new function unparsed-entities($doc) which returns a map along the lines:
{ "entities": {
"e-name-1": {
"system-id": ....
"public-id": ....
"notation": ....
}, ....
},
"notations":{
"n-name-1": "system-id", ...
}
}
and which is allowed to return an empty sequence if information about unparsed entities is not available (making it possible to provide a trivial fallback implementation).
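For illustration, a map of this shape can be assembled in Python from the declaration callbacks that the standard library's expat binding exposes. This is only a sketch of the proposed result structure, not of any XPath implementation; the function name `unparsed_entities` is borrowed from the proposal and the sample document is invented.

```python
import xml.parsers.expat


def unparsed_entities(xml_text):
    """Collect unparsed entity and notation declarations from a document."""
    result = {"entities": {}, "notations": {}}

    def on_entity(name, base, system_id, public_id, notation):
        result["entities"][name] = {
            "system-id": system_id,
            "public-id": public_id,
            "notation": notation,
        }

    def on_notation(name, base, system_id, public_id):
        # The proposal maps each notation name to its system identifier.
        result["notations"][name] = system_id

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_entity
    parser.NotationDeclHandler = on_notation
    parser.Parse(xml_text, True)
    return result


sample = """<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>"""
print(unparsed_entities(sample))
```

A parser that does not report these declarations would correspond to the case where the proposed function returns an empty sequence.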
Issue #2460 created #created-2460
file:read-text and invalid XML characters
file:read-text() should be aligned with fn:unparsed-text() to remove the restriction regarding valid XML characters.
Incidentally, bin:decode-string() has never had such a restriction even in the EXPath 1.0 version.
Issue #2459 created #created-2459
What are "invalid XML characters" in the XPath file read functions?
Are we applying XML 1.0 rules or XML 1.1 rules? Does the user get to decide? How?
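The difference between the two rule sets is easy to see when the Char productions of XML 1.0 and XML 1.1 are written out as code point tests. The Python function names below are hypothetical; note that XML 1.1 additionally requires some of these characters to appear only as character references, which this sketch does not model.

```python
def is_xml10_char(cp: int) -> bool:
    """Char production of XML 1.0."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_xml11_char(cp: int) -> bool:
    """Char production of XML 1.1."""
    return (0x1 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


for cp in (0x0, 0x1, 0x9, 0xFFFE):
    print(f"U+{cp:04X}: XML 1.0 {is_xml10_char(cp)}, XML 1.1 {is_xml11_char(cp)}")
```

Control characters such as U+0001 are therefore "invalid" under XML 1.0 but matched by Char under XML 1.1, while U+0000 and U+FFFE are excluded by both.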
Issue #2458 created #created-2458
NodeTests: Unify jnode(X) and get(X)
I can currently write $map/jnode("key") or $map/get("key"), and both have the same meaning.
Both imply the child axis: I can also write $map/child::jnode("key") or $map/child::get("key")
There are differences:
- jnode also allows a content type to be specified
- get allows the key to be an arbitrary expression, while jnode requires a Constant
- jnode allows an NCName-valued key to be written without quotes
- get allows multiple keys to be selected
- jnode allows all keys to be selected
- get also works with XNodes
I propose that we unify these constructs.
To do this we distinguish jnode as an item type from jnode as a selector.
When used as a selector, the first argument of jnode should accept a KeySpecifier rather than a Constant. KeySpecifier is the subset of Expression that we allow after the lookup operator ?. (We should extend it to generalise Literal to Constant) A KeySpecifier allows an arbitrary expression in parentheses; we can debate what focus should be used to evaluate it.
This means, for example, that $array/get($i) becomes $array/jnode($i)
We could apply the same treatment to element() and attribute(), allowing $doc//element($N).
The second optional argument of jnode(), element(), and attribute() is unaffected.
QT4 CG meeting 152 draft minutes #minutes-02-10
Draft minutes published.
Issue #2437 closed #closed-2437
SimpleNodeTest: TypeTest → RegularItemType?
Issue #2434 closed #closed-2434
`fn:has-children`: buggy examples?
Issue #2441 closed #closed-2441
2434 Fix inconsistencies with GNode tests in axis steps
Issue #2445 closed #closed-2445
fn:element-to-map - ignore `xsi:type` and similar attributes
Issue #2449 closed #closed-2449
2445 Add rules for xsi namespace elements in element-to-map
Issue #2453 closed #closed-2453
XSLT Patterns: the "child-or-top" adjustment
Issue #2444 closed #closed-2444
XSLT Patterns for matching JNodes
Issue #2451 closed #closed-2451
2444 Make match="*" and match="N" match element nodes only
Issue #2450 closed #closed-2450
JNode types: matching root JNodes
Issue #2452 closed #closed-2452
2450 Add jnode((), *) to match root JNodes
Issue #2435 closed #closed-2435
Incorrect namespace prefixes in EXPath Binary example
Issue #2439 closed #closed-2439
Fix prefix on bin:int-octets example function
Issue #2436 closed #closed-2436
`jnode` type: arguments (spec vs. tests)
Issue #2432 closed #closed-2432
Constructor Functions: conversions
Issue #2440 closed #closed-2440
2432 Clarify effect of coercion on constructor functions
Pull request #2457 created #created-2457
Improved use of fos:result
Changes the main function catalog to make improved use of the fos:result element. Specifically:
- Uses fos:error-result for error examples
- Makes greater use of explicit results rather than narrative results where possible
- Uses fos:result narrative="true" to get auto-checking of examples having no definitive result, and improved rendition.
Pull request #2456 created #created-2456
Stylesheet handling of fos:result/@narrative
Following a schema change that allows the function catalog to contain code examples annotated <fos:result narrative="true">, this PR makes stylesheet changes allowing such examples to be rendered.
- In generating the specification documents, narrative results are simply rendered as prose, without containing <code> tags
- In generating the QT4 tests, the generated test case ensures that the example can be successfully compiled and run, but the test always succeeds so long as the result is not a static or dynamic error.
Issue #2399 closed #closed-2399
Canonical JSON Serialization: edge cases
Issue #2418 closed #closed-2418
2399b Add rules and advice for JSON output of special numerics
Issue #2455 created #created-2455
`file:copy`: creating targets
The current rules of the file:copy function say:
Copies a file or a directory given a source and a target path/URI. The following rules apply if $source points to a file:
- if $target does not exist, it will be created.
I think we should refine this rule and prevent this function from creating arbitrary directory structures if simple files are copied.
QT4 CG meeting 152 draft agenda #agenda-02-10
Draft agenda published.
Issue #2454 created #created-2454
Grammar: literals & constants, negative numbers
I think we could easily tweak the grammar by changing…
Constant ::= StringLiteral | ("-"? NumericLiteral) | QNameLiteral | ("true" "(" ")") | ("false" "(" ")")
Literal ::= NumericLiteral | StringLiteral | QNameLiteral
…to:
Constant ::= Literal | ("true" "(" ")") | ("false" "(" ")")
Literal ::= ("-"? NumericLiteral) | StringLiteral | QNameLiteral
As a result, a negative number would be returned by the parser as a literal instead of a unary expression (weird constructs like - - -2344 will still be possible), and the key specifier of a lookup expression could be a negative number.
Issue #2453 created #created-2453
XSLT Patterns: the "child-or-top" adjustment
This issue identifies a bug in XSLT 3.0 (retained in XSLT 4.0).
There is a special rule in XSLT 3.0 designed to ensure that the pattern match="a" will match an a element even if it is parentless. Without the special rule, it would not do so, because "a" expands to child::a which would otherwise only match an element that is a child of something.
The rule is written (in §5.5.3):
If any PathExprP in the Pattern is a RelativePathExprP, then the first StepExprP PS of this RelativePathExprP is adjusted to allow it to match a parentless element... If PS uses the child axis (explicitly or implicitly), and if the NodeTest in PS is not document-node() (optionally with arguments), then the axis in step PS is replaced by child-or-top, which is defined as follows. If the context node is a parentless element, comment, processing-instruction, or text node then the child-or-top axis selects the context node; otherwise it selects the children of the context node. It is a forwards axis whose principal node kind is element.
Now consider the pattern match="*/(b union c)". Clearly b is a PathExprP and a RelativePathExprP, so the adjustment applies to its first StepExprP, namely to b. So the pattern should expand to match="*/(child-or-top::b union child-or-top::c)", and a literal reading of the semantics of patterns, in conjunction with the semantics of path expressions, means that this should match a parentless b or c element, whereas the clear intention (and the Saxon implementation) is that it matches a b or c element only if it has an element node parent.
Similarly, consider the pattern match="b[b or c]". Again, the rules suggest that this should be adjusted to match="child-or-top::b[child-or-top::b or child-or-top::c]" which would technically match a parentless b element having no b or c child. This is clearly not intended.
Saxon in fact is not applying any syntactic adjustment to the pattern at all. Rather, it is evaluating the pattern steps from right to left, and if a sub pattern has no preceding "/" or "//" operator then it is assuming there are no constraints on the element's ancestry.
I can think of a couple of ways the bug might be fixed.
Firstly, we could try to define more precisely the StepExprP subexpressions that are subjected to this adjustment. It's basically any StepExprP that is not the right-hand operand of "/" or "//", and that is not an operand of a union, intersect, or except expression that is the right hand operand of "/" or "//", and is not contained within a predicate.
Alternatively, we could try to amend the "equivalent expression" rule. We say that a pattern P matches a node N if N has an ancestor-or-self node $A such that the path expression $A//(P) selects N. We could amend this rule to say that if N is parentless, then this rule is evaluated "as if" N had an imaginary parent.
Neither solution is very elegant.
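The definition of the child-or-top axis quoted above can be sketched in Python over a deliberately minimal node model. The `Node` class and the `child_or_top` function are hypothetical illustrations, not part of any XSLT processor.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False, repr=False)
class Node:
    kind: str                              # "document", "element", "text", ...
    name: str = ""
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def append(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


# Node kinds that the rule allows to stand in for themselves when parentless.
TOP_KINDS = {"element", "comment", "processing-instruction", "text"}


def child_or_top(context: Node) -> List[Node]:
    """Select the context node if it is a parentless element, comment,
    processing instruction or text node; otherwise select its children."""
    if context.parent is None and context.kind in TOP_KINDS:
        return [context]
    return list(context.children)


orphan = Node("element", "b")
document = Node("document")
a = document.append(Node("element", "a"))
print([n.name for n in child_or_top(orphan)])    # prints ['b']
print([n.name for n in child_or_top(document)])  # prints ['a']
```

Applying this function at every step, as a literal reading of the rule suggests for patterns like `*/(b union c)`, is what lets a parentless b slip through.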
Pull request #2452 created #created-2452
2450 Add jnode((), *) to match root JNodes
Fix #2450
Pull request #2451 created #created-2451
2444 Make match="*" and match="N" match element nodes only
Fix #2444
The effect of the change is that simple patterns like match="*" and match="order" will only match element nodes; they will no longer match JNodes.
This is motivated firstly by implementation experience: the current rules cause a performance regression compared with XSLT 3.0 because it's harder to do precise static type inferencing, and some streaming use cases are no longer streamable for the same reason.
However, I think there is also a usability benefit. These simple match patterns are instinctively understood by the entire XSLT user population, and extending their meaning so they match things unexpectedly could be a debugging nightmare. Consider the case of a mature stylesheet with hundreds of template rules designed to process XML elements, which is then extended with a new module to handle a JSON representation of the same data; it's very unlikely the user will actually want or intend the same template rules to process both, and if this is what they want, it's clearer to make this explicit by using a union pattern.
Issue #2450 created #created-2450
JNode types: matching root JNodes
In the description of JNode types in 3.2.9, there is no explicit statement of what a type like jnode(*, map()) means. In particular, it isn't clear (except perhaps from studying examples) whether it matches a JNode whose selector property is absent (i.e. the root of a JTree). The examples suggest that it does.
For pattern matching in XSLT, it would be useful to have a pattern that ONLY matches the root of a JTree. Perhaps the syntax jnode(/, map()) might serve this purpose.
Pull request #2449 created #created-2449
2445 Add rules for xsi namespace elements in element-to-map
Fix #2445
Issue #2448 created #created-2448
Change title of XPath specification
The title of the XPath specification is
XML Path Language (XPath) 4.0
No-one actually calls it "XML Path Language" -- and it's now a path language for JSON as well.
I propose we change the title to
XPath 4.0
Issue #2447 created #created-2447
Drop string-literals for names in computed constructors
XQuery 3.1 allowed element foo { "bar" }.
And then we discovered syntax ambiguities, so we allowed element "foo" {"bar"}
And then we introduced QName literals, allowing element #foo {"bar"}
We then dropped string literals from the grammar, but there are incorrect examples that use it:
XQuery 4.0 allows the node name to be written in quotation marks (for example, element "book" {}
in 4.12.3
Pull request #2446 created #created-2446
2427 Add computed node constructors to XPath
Fix #2427
Issue #2445 created #created-2445
fn:element-to-map - ignore `xsi:type` and similar attributes
Following discussion in #1948, test element-to-map-017 has been changed so it is treated as if the xsi:type attribute were not present. However, I can't see anything in the current spec to justify this behaviour.
I propose changing the spec to say that all attributes in the xsi namespace should be ignored (except to the extent that when the input is schema-validated, they may have affected the choice of a type annotation, which itself may affect the outcome). This means (a) the attribute itself is not included in the result of the conversion, and (b) if all the attributes of an element are in this namespace, the element is treated as having no attributes.
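Point (a) and point (b) amount to filtering the attribute set before conversion. A minimal Python sketch with the standard library's ElementTree, where the helper name `attributes_without_xsi` and the sample elements are hypothetical:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def attributes_without_xsi(element):
    """Return the element's attributes, dropping any in the xsi namespace."""
    prefix = f"{{{XSI}}}"  # ElementTree spells namespaced names as {uri}local
    return {
        name: value
        for name, value in element.attrib.items()
        if not name.startswith(prefix)
    }


size = ET.fromstring(
    f'<size xmlns:xsi="{XSI}" xsi:type="xs:integer" unit="cm">42</size>'
)
print(attributes_without_xsi(size))  # prints {'unit': 'cm'}
```

When the filtered result is empty, the element would be treated as having no attributes at all.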
Issue #2444 created #created-2444
XSLT Patterns for matching JNodes
A pattern such as match="order" currently matches both an element named "order" and a JNode whose selector is "order".
This makes it much more difficult to do type inferencing on the body of the template rule, and it greatly complicates streamability analysis. While usability rightly takes priority over implementation concerns, I don't think this design has usability benefits either. I think in practice users will know whether their template rules are intended to match XNodes or JNodes, and using the same pattern syntax for both is more confusing than helpful. We already offer the syntax match="jnode(order)" to match JNodes, and I think that is clearer.
Note that the semantics of match="order" already depart from the semantics of the equivalent XPath expression child::order because the pattern will match a parentless element. So we only have to adapt the current magic rule:
If PS uses the child axis (explicitly or implicitly), and if the NodeTest in PS is not document-node() (optionally with arguments), then the axis in step PS is replaced by child-or-top, which is defined as follows. If the context node is a parentless element, comment, processing instruction, or text node then the child-or-top axis selects the context node; otherwise it selects the children of the context node. It is a forwards axis whose principal node kind is element.
so that the child-or-top axis only selects XNodes.
Issue #2443 created #created-2443
Naming: JNodes, selectors and contents
I am playing around with the new JNode functions, and my impression is that the resulting code does not look very catchy. It seems way too technical to me. For creating a map with element names and contents, we can write:
map:build($xnodes, name#1, data#1)
Similar code for JNodes would be:
map:build($jnodes, jnode-selector#1, jnode-content#1)
I wonder whether we really need to introduce so many completely new terms for rather straightforward concepts, instead of borrowing existing terminology. What about renaming the terms “Selector” to “JKey” and “content” to “JValue”? This would pretty much resemble the map terminology (even if we also use it for arrays), and it might help to understand that these terms are specific to JNodes.
map:build($jnodes, jkey#1, jvalue#1)
Issue #2177 closed #closed-2177
F+O: improve cross-referencing between functions
Issue #2404 closed #closed-2404
2403 Enhancements to fos.xsd
Issue #2442 closed #closed-2442
MK: 2403 enhancements to fos xsd
Pull request #2442 created #created-2442
MK: 2403 enhancements to fos xsd
This PR is #2404 with the addition of the compiled fos.scm.
Close #2404 Close #2403 Close #2177
The CG agreed to merge this PR at meeting 151
Pull request #2441 created #created-2441
2434 Fix inconsistencies with GNode tests in axis steps
Fix #2434 Fix #2437
- The axis step child::gnode() should be allowed.
- The axis step self::array(*) is used in an example but is invalid syntax and makes no sense
- Functions with an argument of type gnode() that accept the context item as a default should allow the context item to be any gnode, not just an XNode.
Issue #2422 closed #closed-2422
XSLT: drop 3.11 Embedded Stylesheet Modules
Pull request #2440 created #created-2440
2432 Clarify effect of coercion on constructor functions
Fix #2432
Pull request #2439 created #created-2439
Fix prefix on bin:int-octets example function
Fix #2435
QT4 CG meeting 151 draft minutes #minutes-02-03
Draft minutes published.
Issue #2407 closed #closed-2407
`fn:type-of`: function vs fn
Issue #2409 closed #closed-2409
2407 Change function to fn in type-of output
Issue #2398 closed #closed-2398
fn:highest documentation in F&O spec not up to date?
Issue #2410 closed #closed-2410
2398 Fix fn:highest to match fn:lowest
Issue #2406 closed #closed-2406
Rounding dates/times and durations
Issue #2416 closed #closed-2416
2406 Add fn:parts-of-dateTime and fn:build-dateTime functions
Issue #2365 closed #closed-2365
Record types: extensible and non-extensible pairs
Issue #1484 closed #closed-1484
Functions that expect a record type should make it extensible
Issue #2413 closed #closed-2413
2365 Drop extensible record types
Issue #2428 closed #closed-2428
2422 Drop XSLT section on embedded stylesheet modules
Issue #2421 closed #closed-2421
XSLT edge case incompatibility with simplified stylesheet
Issue #2423 closed #closed-2423
2421 document XSLT incompatibility with simplified stylesheets
Issue #2292 closed #closed-2292
The XSLT document() function
Issue #2419 closed #closed-2419
2292 XSLT document() function: options parameter
Issue #2397 closed #closed-2397
Additions for "Functions Defined in XSLT" section in F&O spec
Issue #2411 closed #closed-2411
2397 add to F&O list of functions defined in XSLT
Issue #2396 closed #closed-2396
Missing "New in 4.0" labels for functions in F&O Spec
Issue #2395 closed #closed-2395
The new fn:regex-groups function is not labelled "New in 4.0"
Issue #2412 closed #closed-2412
2395 2396 Add missing "new in 4.0" entries
Issue #2438 closed #closed-2438
Michaelhkay 2403 enhancements to fos xsd
Pull request #2438 created #created-2438
Michaelhkay 2403 enhancements to fos xsd
This is MK's PR with the compiled form of the schema added.
The CG agreed to merge this PR at meeting 151.
Close #2404 Close #2403 Close #2177
Issue #2426 closed #closed-2426
2408 editorial omnibus
Issue #2429 closed #closed-2429
Feature/2026 01 28 draft review
Issue #1962 closed #closed-1962
fn:map-to-element
Issue #2053 closed #closed-2053
Add fn:collection-available
Issue #2430 closed #closed-2430
Updates to schema for xslt
Issue #2437 created #created-2437
SimpleNodeTest: TypeTest → RegularItemType?
With the current grammar…
AxisStep ::= (AbbreviatedStep | FullStep) Predicate*
AbbreviatedStep ::= ".." | ("@" NodeTest) | SimpleNodeTest
FullStep ::= Axis NodeTest
NodeTest ::= UnionNodeTest | SimpleNodeTest
SimpleNodeTest ::= TypeTest | Selector
TypeTest ::= NodeKindTest | JNodeType
…type tests in axis steps are limited to node() and its subtypes as well as jnode(). The test cases include tests for additional types like gnode() or array(*).
Maybe TypeTest should be replaced by the RegularItemType:
RegularItemType ::= AnyItemTest | NodeKindTest | GNodeType | JNodeType | MapType | ArrayType | RecordType | EnumerationType
Issue #2436 created #created-2436
`jnode` type: arguments (spec vs. tests)
The current spec defines the following grammar for the jnode type:
JNodeType ::= "jnode" "(" (("*" | NCName | Constant) ("," ("*" | SequenceType))?)? ")"
Constant ::= StringLiteral | ("-"? NumericLiteral) | QNameLiteral | ("true" "(" ")") | ("false" "(" ")")
As far as I can judge, none of the current test cases seems to use this syntax. Instead, the tests I found expect a single sequence type argument, for example fn-jtree-006:
<test-case name="fn-jtree-006">
<description> JNode applied to an array - type of result</description>
<created by="Michael Kay" on="2025-06-16"/>
<test>fn:jtree([1,2,3]) instance of jnode(array(xs:integer))</test>
<result>
<assert-true/>
</result>
</test-case>
Is it the test suite or the spec that needs to be updated?
If this is still subject to discussion, my preference would be to disallow constants in the jnode syntax:
- It is not clear to me how fn:jtree({ 'a': 1, 'b': 2 }) instance of jnode(a) can be interpreted.
- Instance checks for linear hierarchies are generally more intuitive.
- With the presence of get(), the jnode constants should be redundant.
- We can only use atomic items in the node tests for which literals exist.
Issue #2435 created #created-2435
Incorrect namespace prefixes in EXPath Binary example
In EXPath Binary Module 4.0 some example functions have been redefined in a different namespace prefix (which was bin:) to avoid suggesting they were part of the supported library.
However, in 2.2 Example – reading and writing variable length ASN.1 integers, the definitions of asn:int-octets() and asn:encode-ASN-integer() still contain references to the original prefix definition bin:int-octets(), which should be asn:int-octets().
Issue #2434 created #created-2434
`fn:has-children`: buggy examples?
I believe that the recently added examples for fn:has-children need to be fixed (or it’s my brain that needs to be updated):
[1,2,3] => has-children()
[] => has-children()
The function signature expects gnode()? as input type, so I would expect both queries to return an error unless the arguments are explicitly wrapped into a JNode.
Issue #2433 created #created-2433
`fn:jtree`: Identity
The rules for fn:jtree say:
If two maps or arrays M1 and M2 have the same function identity, as determined by the function-identity function, then jtree(M1) is jtree(M2) MUST return true: that is, the same JNode must be delivered for both.
Note: It is to some extent implementation-defined whether two maps or arrays have the same function identity. Processors SHOULD ensure as a minimum that when a variable $m is bound to a map or array, calling jtree($m) more than once (with the same variable reference) will deliver the same JNode each time.
Shouldn’t SHOULD be MUST? The argument will always have the same function identity if jtree($m) is called more than once. Maybe the note can also be dropped, as fn:jtree is not about the identity of maps and arrays, but the identity of JNodes.
QT4 CG meeting 151 draft agenda #agenda-02-03
Draft agenda published.
Issue #2431 closed #closed-2431
Patch grammar explorer
Issue #2432 created #created-2432
Constructor Functions: conversions
The specification says in [22.1 Constructor functions for XML Schema built-in atomic types](https://qt4cg.org/specifications/xpath-functions-40/Overview.html#constructor-functions-for-xsd-types)…
If the value passed to a constructor is not in the lexical space of the datatype to be constructed, and cannot be converted to a value in the value space of the datatype under the rules in this specification, then an dynamic error is raised [err:FORG0001].
…but it is not clear which rules in the specification are meant.
Specifically, I think we should clarify whether queries like the following one are supposed to return an error or a duration:
xs:anyURI('P2000Y') => xs:yearMonthDuration()
Pull request #2431 created #created-2431
Patch grammar explorer
On all pages: Improve display of Headlines with slightly smaller font-size.
On rule detail pages: The Name of the Grammar is now visible in the back button of the ribbon instead of the H1 which only displays the rule name.
On the right hand side:
- Fix complement character class display.
- Sequence and choice items now are indented if they do not fit on one line
- occurrence indicators always stick to the item they belong to
- use spans to group items and mark any character that is displayed
- dashes in character ranges and the pipes in choices are now also recognized as part of EBNF syntax
- literals are never wrapped into the next line
Pull request #2430 created #created-2430
Updates to schema for xslt
Updates the schema for XSLT 4.0:
- Adds canonical to xsl:output and xsl:result-document
- Adds xsl:package-location to content model of xsl:use-package.
Pull request #2429 created #created-2429
Feature/2026 01 28 draft review
I'm suggesting two minor changes following my reading of the most recent draft specs.
Pull request #2428 created #created-2428
2422 Drop XSLT section on embedded stylesheet modules
This doesn't actually abolish the feature, it just de-emphasises it. AFAIK, no-one actually uses it.
Issue #2427 created #created-2427
Node construction in XPath
There was pushback on issue #573 which proposed a set of functions for constructing nodes, on the grounds that for XQuery users, this was unnecessary duplication.
A possible alternative is to add a subset of the XQuery syntax for node construction to XPath: specifically, computed node constructors, which are relatively free of hassles such as dependence on the namespace context, boundary space rules, etc.
Specifically we could add computed constructors:
ComputedConstructor ::= CompDocConstructor
| CompElemConstructor
| CompAttrConstructor
| CompNamespaceConstructor
| CompTextConstructor
| CompCommentConstructor
| CompPIConstructor
CompDocConstructor ::= "document" EnclosedExpr
CompElemConstructor ::= "element" CompNodeName EnclosedContentExpr
CompAttrConstructor ::= "attribute" CompNodeName EnclosedExpr
CompNamespaceConstructor ::= "namespace" CompNodeNCName EnclosedExpr
CompTextConstructor ::= "text" EnclosedExpr
CompCommentConstructor ::= "comment" EnclosedExpr
CompPIConstructor ::= "processing-instruction" CompNodeNCName EnclosedExpr
with the restriction that CompNodeName / CompNodeNCName are either expressions in curly braces, or use the new XQuery 4.0 form with a leading "#".
It is of course trivial to define a function library on top of this if someone wants the extra flexibility:
let $new-element := fn($name, $content) { element {$name} {$content} }
etc
(Incidentally, EnclosedContentExpr serves no useful purpose as it's identical to EnclosedExpr.)
We could common up the rules for "constructing simple content" and "constructing complex content" at the same time, putting them in XPath where both XQuery and XSLT can refer to them. I believe they are identical except for (a) error codes, (b) with duplicate attribute names, XQuery throws an error while XSLT takes the last.
Pull request #2426 created #created-2426
2408 editorial omnibus
Fixes nearly everything in #2408
Issue #2424 closed #closed-2424
More Explorer tweaks
Issue #2425 created #created-2425
Permanent diffs for PRs
Since we link to pull requests in the spec and in test cases, I wonder whether it would be possible to publish a permanent diff showing the effect of each PR?
Essentially, the idea would be to take the HTML diff as we currently publish it, and reduce it to those sections of the specs that actually contain changes.
I would find this very useful, for example, when a PR has been accepted but is still marked with "Tests needed" - it's not easy at present to see retrospectively what tests might be required. It would also be useful, of course, when implementing the PR. But I think that all readers of the specs might find this beneficial.
Pull request #2424 created #created-2424
More Explorer tweaks
Building on the awesome work from @ndw I just tweaked the grammar explorer layout a little more
- layout works better on bigger and smaller screens
- consistent navigation between all screens with back button always on the navigation at the top
- consistent sizes, paddings, colors set by CSS variables
- output as html5 which fixes small issues with whitespace in inline elements
- additional, minor layout improvements
Pull request #2423 created #created-2423
2421 document XSLT incompatibility with simplified stylesheets
Fix #2421
Issue #2422 created #created-2422
XSLT: drop 3.11 Embedded Stylesheet Modules
XSLT Section 3.11 describes embedded stylesheet modules - a stylesheet rooted at an element node which is not the outermost element of a document. There are no real conformance requirements associated with this feature and it isn't widely used. I propose we drop the section, while retaining the statement in §3.5 that a stylesheet module can be "all or part" of an XML document.
Issue #2421 created #created-2421
XSLT edge case incompatibility with simplified stylesheet
Simplified stylesheets have changed so that the implicit template rule now does match="." rather than match="/".
This creates a theoretical incompatibility when
(a) the stylesheet is invoked supplying a node other than a document node as the input. It will now execute the (only) template rule; previously it would execute the built-in template for the node kind
(b) the simplified stylesheet module is included/imported into another stylesheet. This is a highly unlikely scenario, but it is tested by test case include-0601.
The incompatibility should be documented.
Issue #2420 closed #closed-2420
Explorer tweaks
Pull request #2420 created #created-2420
Explorer tweaks
h/t @line-o
Plus a few other tweaks.
Pull request #2419 created #created-2419
2292 XSLT document() function: options parameter
Fix #2292
Issue #2414 closed #closed-2414
Diff markup issues
Pull request #2418 created #created-2418
2399b Add rules and advice for JSON output of special numerics
Fix #2399
Issue #2417 closed #closed-2417
2399 Add rules/advice for JSON output of special xs:double values
Pull request #2417 created #created-2417
2399 Add rules/advice for JSON output of special xs:double values
Fix #2399
Pull request #2416 created #created-2416
2406 Add fn:parts-of-dateTime and fn:build-dateTime functions
Fix #2406
Issue #2415 closed #closed-2415
Publish the grammar explorer pages
Pull request #2415 created #created-2415
Publish the grammar explorer pages
Issue #2414 created #created-2414
Diff markup issues
There seem to be two consistent errors in the diff markup that appears in PRs on the dashboard:
- When an inline <code> element is modified, the diff version shows the old code as deleted (red background), but does not show the new code.
For example:
- When a grammar entry is modified, the text gets duplicated.
For example:
Pull request #2413 created #created-2413
2365 Drop extensible record types
This PR drops the concept of extensible record types, replacing it with a rule that coercion to a record type drops any map entries that are not defined by the record type. In effect this means that a record type used when declaring a function parameter is implicitly extensible.
The benefits of the proposal are:
- It simplifies the spec, especially rules on type subsumption and on generation of implicit constructor functions
- It avoids the need to declare pairs of record types, one extensible and one not.
- It avoids all the awkward decisions about whether record types used in core functions should be extensible or not.
The rules for type patterns in XSLT are changed to invoke coercion.
Fix #1484 Fix #2365
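As a rough illustration of the coercion rule described in this PR, the following Python sketch models a map as a dict and a record type as a set of declared field names. The function name coerce_to_record and the dict-based model are assumptions made for illustration only; type checking of the individual fields and handling of optional fields are deliberately left out.

```python
from typing import Any


def coerce_to_record(value: dict[str, Any], field_names: set[str]) -> dict[str, Any]:
    """Hypothetical model of record coercion: entries whose keys are not
    declared by the record type are silently dropped."""
    return {key: val for key, val in value.items() if key in field_names}


# A map with an extra entry "z" is accepted; the extra entry is dropped.
point = coerce_to_record({"x": 1, "y": 2, "z": 3}, {"x", "y"})
```

Under this model, a function that declares a record-typed parameter accepts any map containing the declared fields, which is the sense in which such a parameter is implicitly extensible.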
Pull request #2412 created #created-2412
2395 2396 Add missing "new in 4.0" entries
Fix #2395 Fix #2396
Pull request #2411 created #created-2411
2397 add to F&O list of functions defined in XSLT
Fix #2397
Pull request #2410 created #created-2410
2398 Fix fn:highest to match fn:lowest
Fix #2398
Pull request #2409 created #created-2409
2407 Change function to fn in type-of output
Fix #2407
Issue #2195 closed #closed-2195
Editorial notes (incremental)
Issue #2408 created #created-2408
Editorial notes (incremental)
This issue summarizes the unresolved comments from #2195:
- [x] Oxford/serial comma should be used consistently at numerous places. Candidates:
- sine, cosine and tangent
- durations, dates and times
- year, month, day, hour, minute, second and timezone
- The month, day, hour and minute components
- the tokens w, W and Ww
- rules for overflow, underflow and approximation
- [ ] The comma may need to be removed at other places:
- This function is context-independent, and focus-independent.
- A dynamic error is raised [err:FODC0002] if a relative URI reference is supplied, and the base-URI property in the static context is absent.
- [x] fn:distinct-values: $coll → $collation
- [x] “If … is an empty sequence” vs. “If … is the empty sequence” (which one do we prefer?)
- MHK: I can't say I have a strong preference, but pedantically, the is probably more accurate.
- [x] Section 2.1.3 Values of the XPath 4.0 spec includes a change note highlighting that the terms XNode and JNode have been introduced but the section text only mentions XNode; there's no reference to JNode in that section other than in the change note itself.
- [ ] In the XPath 4.0 spec, the publoc URL appears to be incorrect (/spec/header/publoc/loc)
- [x] A number of examples in the function catalog should be annotated with spec="XQuery" so that the corresponding test cases are marked as inapplicable to XPath. Specifically:
fo-test-fn-count-001
fo-test-fn-deep-equal-005
fo-test-fn-every-010
fo-test-fn-function-annotations-002
fo-test-fn-function-annotations-003
fo-test-fn-hash-009
fo-test-fn-hash-010
fo-test-fn-serialize-004
fo-test-fn-sort-with-005
- [x] The serialization spec, in the section on the serialization parameter document, makes it rather hard to discover that the namespace prefix output is bound to the URI http://www.w3.org/2010/xslt-xquery-serialization
- [x] F+O The definition of fn:compare refers to a function fn:months-from-dateTime, this should be fn:month-from-dateTime
- [x] F+O The description of unparsed-text refers to the available text resources component of the dynamic context which has been dropped.
- [x] The serialization spec for Adaptive serialization of function items gives an example **fn:exists#1** is serialized as **function fn:exists#1** but the word function does not actually appear in the result.
- [x] fn:parse-html needs to define an error code for use when $encoding is an unknown or invalid encoding.
- [ ] Consistent rendition for term definitions. Most of the specs output definitions as [Definition: here is the definition]. F+O uses [Definition] here is the definition. XSLT puts the keyword "Definition" in small caps.
- [x] The function catalog contains entries for functions such as op:gMonthDay-equal that are no longer referenced.
- [x] Serialization, HTML5, Processing Instructions: Dashes in the name must be escaped as well. Example: <?a---b c---d?> should be serialized as <!--?a- - -b c- - -d?-->.
- [x] UTF-16le → UTF-16LE, UTF-16be → UTF-16BE (caused by #2239)
- [ ] fn:unparsed-text: Reference bin:infer-encoding, drop redundant rules
- [x] op:divide-dayTimeDuration-by-dayTimeDuration: An example could be simplified by using seconds(1)
Issue #2407 created #created-2407
`fn:type-of`: function vs fn
As we use the fn alias in (almost) all XQFO signatures, we could also return it by the fn:type-of function.
Issue #2355 closed #closed-2355
bin:infer-encoding error conditions
Issue #2362 closed #closed-2362
2355 bin:infer-encoding: further alignments
QT4 CG meeting 150 draft minutes #minutes-01-27
Draft minutes published.
Issue #2361 closed #closed-2361
Encoding parameters: upper/lower case, normalization
Issue #2394 closed #closed-2394
2361 Use upper case for encoding names; comparisons are case-blind
Issue #2349 closed #closed-2349
Revert `array:join`
Issue #2363 closed #closed-2363
2349 Revert array:join
Issue #2378 closed #closed-2378
HTML indenting
Issue #2391 closed #closed-2391
2378 HTML indenting: clarify the definition of inline elements
Issue #1944 closed #closed-1944
Try/Catch/Finally - order of evaluation
Issue #2127 closed #closed-2127
JNodes: Include atomic items
Issue #2159 closed #closed-2159
JNodes: Learning from JSONiq?
Issue #2351 closed #closed-2351
Current Drafts: What will we keep, what may be dropped?
Issue #2354 closed #closed-2354
`fn:append`
Issue #2360 closed #closed-2360
fn:root() vs. absolute path expressions
Issue #2384 closed #closed-2384
`fn:xsd-validator` - attribute nodes
Issue #2392 closed #closed-2392
2384 Clarify that fn:xsd-validator can validate attributes
Issue #2406 created #created-2406
Rounding dates/times and durations
The precision returned for dates, times, and durations is implementation-defined, and it has changed between Saxon releases. This leads one of our users to point out that there is no easy way to request a reduced precision (e.g. milliseconds) in order to ensure interoperability. The simplest approach we can offer seems to be current-dateTime() => format-dateTime("....") => xs:dateTime() which is pretty cumbersome and inefficient.
Rather than providing specific functions for rounding dates, times, and durations, the most versatile solution to this might be to provide functions that reduce a dateTime or duration to a record containing the numeric values of the components, allowing these to be manipulated as numbers, with a further function to reconstruct the dateTime or duration from the record: rather like the parse-uri()/build-uri() pair. Something like:
parts(current-dateTime()) ! map:put(., 'seconds', round(?seconds, 3)) ! build-dateTime()
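The decompose, adjust, rebuild idea can be sketched in Python. The names parts_of_datetime and build_datetime are hypothetical analogues of the functions suggested above, not their proposed signatures, and the record is modeled as a plain dict. Rebuilding goes through a timedelta so that a rounded seconds value of 60 carries into the minutes instead of failing.

```python
from datetime import datetime, timedelta, timezone


def parts_of_datetime(dt: datetime) -> dict:
    """Hypothetical: reduce a datetime to a dict of numeric components."""
    return {
        "year": dt.year, "month": dt.month, "day": dt.day,
        "hours": dt.hour, "minutes": dt.minute,
        "seconds": dt.second + dt.microsecond / 1_000_000,
        "timezone": dt.tzinfo,
    }


def build_datetime(parts: dict) -> datetime:
    """Hypothetical: reconstruct a datetime from the component dict."""
    base = datetime(parts["year"], parts["month"], parts["day"],
                    parts["hours"], parts["minutes"], tzinfo=parts["timezone"])
    # timedelta rounds to whole microseconds and handles any carry.
    return base + timedelta(seconds=parts["seconds"])


original = datetime(2026, 1, 27, 10, 30, 5, 123456, tzinfo=timezone.utc)
parts = parts_of_datetime(original)
parts["seconds"] = round(parts["seconds"], 3)  # reduce to millisecond precision
rounded = build_datetime(parts)
```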
Issue #2405 closed #closed-2405
The published XML for XPath and XQuery is incorrect
Pull request #2405 created #created-2405
The published XML for XPath and XQuery is incorrect
It’s not the specification XML, it’s the pre-fixed-up HTML as XML. h/t to @martian-a for noticing first!
Pull request #2404 created #created-2404
2403 Enhancements to fos.xsd
Schema enhancements to the function catalog for:
- Issue #2403 - allow non-testable results for examples to be labelled narrative="true"
- Issue #2177 - allow a fos:see-also element to make links to related functions
Note this PR is purely an enabler, it does not include changes to the function catalog to exploit these features, nor stylesheet enhancements to render them.
QT4 CG meeting 150 draft agenda #agenda-01-27
Draft agenda published.
Issue #2402 closed #closed-2402
This PR should fail to build
Issue #2403 created #created-2403
Testable examples in the file spec
In PR #2401 Norm introduced a temporary fix needed because the function catalog in the EXPath file spec doesn't conform to the fos.xsd schema.
I think we can fix this without a schema or stylesheet change by using existing mechanisms illustrated by this example from fn:collation:
<fos:test>
<fos:expression>collation({ 'lang': 'de', 'strength': 'primary' })</fos:expression>
<fos:result>"http://www.w3.org/2013/collation/UCA?lang=de;strength=primary"</fos:result>
<fos:test-assertion>
<result xmlns="http://www.w3.org/2010/09/qt-fots-catalog">
<any-of>
<assert-string-value>http://www.w3.org/2013/collation/UCA?lang=de;strength=primary</assert-string-value>
<assert-string-value>http://www.w3.org/2013/collation/UCA?strength=primary;lang=de;</assert-string-value>
</any-of>
</result>
</fos:test-assertion>
<fos:postamble>The order of query parameters may vary.</fos:postamble>
</fos:test>
The fos:result (or fos:error-result) element must always be present, and will always be rendered in the spec as the expected result. If the example will not always deliver this result, then <fos:test-assertion> can appear to give the result as it will appear in the generated test case.
But I suggest we add another attribute <fos:result narrative="true"/> to indicate that the result is given as explanatory prose, not as a testable XPath expression. This would allow another fn:collation example
<fos:example>
<p>The expression <code>collation({ 'lang': default-language() })</code>
returns a collation suitable for the default language in the
dynamic context.</p>
</fos:example>
to be rewritten as
<fos:example>
<fos:test>
<fos:expression>collation({ 'lang': default-language() })</fos:expression>
<fos:result narrative="true">A collation suitable for the default language in the
dynamic context.</fos:result>
<fos:test-assertion>
<result xmlns="http://www.w3.org/2010/09/qt-fots-catalog"><assert>true()</assert></result>
</fos:test-assertion>
</fos:test>
</fos:example>
which will (a) make it easier to fit the results into the tabular presentation of examples, and (b) cause a test case to be generated which will ensure that the example is syntactically valid.
(Or we could leave out <fos:test-assertion> in this example. If the supplied fos:result has narrative="true" and there is no test assertion, the generated test case can assume <assert>true()</assert>)
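The defaulting rule in the last paragraph could look roughly like this in a test generator. This is a Python sketch using the standard library's ElementTree, not the project's actual stylesheet; the fos namespace URI below is an assumption, and assertion_for is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the fos: prefix; adjust if the catalog differs.
FOS_NS = "http://www.w3.org/xpath-functions/spec/namespace"
QT_NS = "http://www.w3.org/2010/09/qt-fots-catalog"


def assertion_for(test: ET.Element) -> ET.Element:
    """Return the <result> assertion to emit for one fos:test element."""
    explicit = test.find(f"{{{FOS_NS}}}test-assertion/{{{QT_NS}}}result")
    if explicit is not None:
        return explicit  # an explicit assertion always wins
    result = test.find(f"{{{FOS_NS}}}result")
    if result is not None and result.get("narrative") == "true":
        # Narrative result without an assertion: assume <assert>true()</assert>.
        default = ET.Element(f"{{{QT_NS}}}result")
        ET.SubElement(default, f"{{{QT_NS}}}assert").text = "true()"
        return default
    # Deriving an assertion from a testable fos:result is out of scope here.
    raise ValueError("no explicit assertion and result is not narrative")
```

The generated test then only checks that the expression compiles and evaluates, which is the stated goal for narrative examples.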
Pull request #2402 created #created-2402
This PR should fail to build
We're never going to merge this, it's just a CI test.
Issue #2379 closed #closed-2379
Use exported schema to validate function catalogs
Issue #1948 closed #closed-1948
fn:element-to-map: Tests
Issue #2401 closed #closed-2401
Stopgap fix to get the status quo drafts built
Pull request #2401 created #created-2401
Stopgap fix to get the status quo drafts built
The content model of fos:test requires an fos:result or fos:error-result. For the EXPath File module, we have to work out what those should be or change the markup or change the schema.
In the short term, I’ve made bogus fos:results of FIXME:
Issue #2400 closed #closed-2400
Irrelevant whitespace change to nudge CI
Pull request #2400 created #created-2400
Irrelevant whitespace change to nudge CI
Issue #2399 created #created-2399
Canonical JSON Serialization: edge cases
RFC 8785 says:
Note: Since Not a Number (NaN) and Infinity are not permitted in JSON, occurrences of NaN or Infinity MUST cause a compliant JCS implementation to terminate with an appropriate error.
We have just decided to treat these cases more liberally, but I think we should continue to raise an error if canonical serialization is requested.
In our serialization spec, we also say:
Implementations may serialize an xs:double value using any lexical representation of a JSON number defined in [RFC 7159], but it is recommended to use the same representation as when the canonical parameter is true.
We may need to exclude the edge cases from this recommendation.
If we keep the recommendation, we may need to fix the test case Serialization-json-11, which expects -0 instead of 0 (what is returned for RFC8785).
Issue #2398 created #created-2398
fn:highest documentation in F&O spec not up to date?
The description for the new fn:highest function does not align with the description for the fn:lowest function; contrary to what I'd expect. In particular, the rules section makes no mention of the $key argument; so I suspect this is not up to date?
Issue #2397 created #created-2397
Additions for "Functions Defined in XSLT" section in F&O spec
The new XSLT 4.0 functions current-merge-key-array and regex-groups are missing from the "Functions Defined in XSLT" section.
Also I believe the function unparsed-text-available should be added saying "Originally XSLT 2.0; then XPath 3.0 and later".
Issue #2396 created #created-2396
Missing "New in 4.0" labels for functions in F&O Spec
Please add changes entries to say "New in 4.0" for the functions: function-identity and jnode-content.
Also I assume the first change entry for function-annotations should actually say "New in 4.0" (rather than be a duplicate of the second change entry).
Issue #2395 created #created-2395
The new fn:regex-groups function is not labelled "New in 4.0"
Please add a changes entry in the spec for the new XSLT 4.0 regex-groups function.
Pull request #2394 created #created-2394
2361 Use upper case for encoding names; comparisons are case-blind
Standardizes on upper case for encoding names, and mentions that comparisons are case-blind.
Fix #2361
Issue #2393 created #created-2393
Keep or drop `array:members` and `array:of-members`?
Adopted from #2351:
We have recently dropped map:pairs and map:of-pairs. With array:members, the members of arrays are returned as single-entry maps, which may confuse users. Thus, for the sake of reducing redundant functionality, do we want to keep array:members and array:of-members, or rather promote the use of for member $m and array:split/array:join instead?
If we keep the functions, we should add a dedicated record type for record(value).
Pull request #2392 created #created-2392
2384 Clarify that fn:xsd-validator can validate attributes
Fix #2384
Pull request #2391 created #created-2391
2378 HTML indenting: clarify the definition of inline elements
Fix #2378
Issue #2390 created #created-2390
methods and inheritance
Don’t panic, I am not suggesting a large change :)
But I would like to suggest a small change to the semantics of method calls, with the goal of third parties being able to build something much larger.
Today, we have,
A method call combines accessing a map M to look up an entry whose value is a function item F, and calling the function item F supplying the map M as the implicit value of the first argument.
I’d like to add,
If there is no such function, but there is in the map a key fn:fallback whose value is a function, then that function is called with the map, the function name, and the arity of the desired function as arguments.
In this way one could write a function that looked for an "isa" entry in the map whose value was a sequence of "class" maps, and find the function.
It’s limited in that there is no possibility of polymorphic functions, but we do not have those elsewhere in the language.
One practical benefit is that you can have a map with all your functions in it, and “instance maps” then do not need to have, say, 40 entries for all the methods that can be called. In an application in which a map gets updated a million times (I do have one of those), adding 40 extra entries to copy is a significant burden, even though of course it’s encapsulated in a single function.
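The proposed lookup can be modeled in Python with dicts standing in for maps. This is only a sketch under two assumptions that the proposal leaves open: that the fallback returns the function to call rather than the call's result, and that the arity passed to it counts the implicit map argument. The names call_method and isa_fallback are hypothetical.

```python
def call_method(obj: dict, name: str, *args):
    """Look up `name` in the map; if absent, consult the "fn:fallback" entry."""
    fn = obj.get(name)
    if callable(fn):
        return fn(obj, *args)  # the map is the implicit first argument
    fallback = obj.get("fn:fallback")
    if callable(fallback):
        # Assumption: the fallback resolves and returns the function to call.
        resolved = fallback(obj, name, len(args) + 1)
        return resolved(obj, *args)
    raise LookupError(f"no method {name!r}")


def isa_fallback(obj: dict, name: str, arity: int):
    """Search the "class" maps listed under "isa" for the method."""
    for cls in obj.get("isa", ()):
        fn = cls.get(name)
        if callable(fn):
            return fn
    raise LookupError(f"no method {name!r} with arity {arity}")


# One shared "class" map holds the methods; instances stay small.
shape = {"area": lambda self: self["w"] * self["h"]}
instance = {"w": 2, "h": 3, "isa": [shape], "fn:fallback": isa_fallback}
area = call_method(instance, "area")
```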
Issue #2344 closed #closed-2344
HTML Serialization: Processing Instructions
Issue #2372 closed #closed-2372
2344 Change rendition of PIs in HTML5
QT4 CG meeting 149 draft minutes #minutes-01-20
Draft minutes published.
Issue #2359 closed #closed-2359
Implicit conversion to JNodes with absolute path expressions
Issue #2373 closed #closed-2373
2359 No conversion to JNode in absolute paths
Issue #2337 closed #closed-2337
XSLT xsl:mode/@typed attribute
Issue #2376 closed #closed-2376
2337 Extend xsl:mode/@typed to handle JNodes etc
Issue #2387 closed #closed-2387
641 NaN/Infinity in JSON
Issue #2088 closed #closed-2088
File Module: Feedback, Observations
Issue #2364 closed #closed-2364
2088 File Module: Feedback, Observations
Issue #2185 closed #closed-2185
Request for an `fn:xproc` function
Issue #2383 closed #closed-2383
Attempt to resolve action QT4CG-148-01
Issue #2389 created #created-2389
Adaptive Serialization: more freedom?
The adaptive serialization method was introduced “for the purposes of debugging query results”. For our processor, it soon turned out that it does not satisfy the requirements of our users, which is why we have introduced a custom debugging method.
I wonder what others think: Shouldn’t we relax several of the rules and let the implementation decide what to output? Nor have we defined what the output of fn:trace needs to look like.
Some examples:
- The output of doubles often causes confusion. If parsed JSON is output, small integers will be output in exponential notation. For example, parse-json('{ "A" : 20 }') needs to be output as { "A": 2.0e1 }.
- xs:date("2001-01-01") is output as xs:date("2001-01-01"), while xs:token('x') is output as "x".
- fn() { 1 } is output as (anonymous-function)#0, whereas an implementation could prefer to use the output of fn:function-identity (see #2388), or output the original query string (if available), reproduce a string representation of the function body, etc.
I will be glad to create a PR.
Issue #2388 created #created-2388
Adaptive Serialization: function items
The serialization spec defines rules for creating a string representation for function items. Now that we have `fn:function-identity`, we should replace the rules and use this string instead.
QT4 CG meeting 149 draft agenda #agenda-01-20
Draft agenda published.
Issue #2386 closed #closed-2386
Add namespace declaration to environment for generated tests
Pull request #2387 created #created-2387
641 NaN/Infinity in JSON
Addresses part of issue #641
In the JSON serialization method, NaN is output as null, and infinity is output as ±1e9999.
The parse-json function adds recommendations on how to achieve round-tripping of these values.
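A small Python sketch of the mapping described in this PR, for illustration only. The function name serialize_double is hypothetical, and the formatting of ordinary values simply reuses Python's json module rather than the lexical form the serialization spec recommends.

```python
import json
import math


def serialize_double(value: float) -> str:
    """Hypothetical model of the JSON output rules for special doubles."""
    if math.isnan(value):
        return "null"
    if math.isinf(value):
        # A literal too large for a double, so many parsers read it back as infinity.
        return "1e9999" if value > 0 else "-1e9999"
    return json.dumps(value)


# Python's own JSON parser overflows 1e9999 to infinity, so the value round-trips.
round_tripped = json.loads(serialize_double(math.inf))
```

NaN does not round-trip this way, since null is read back as a null value; that is the kind of case the parse-json recommendations have to address separately.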
Pull request #2386 created #created-2386
Add namespace declaration to environment for generated tests
Changes the stylesheet for generating keyword and function signature tests so that the namespace prefix "output" is explicitly declared in the test environment. This prefix is used in one of the tests and it needs to be declared if the test is to work in XPath.
Issue #2385 created #created-2385
The XML version of the XPath spec isn't the XML version of the spec, it's HTML
Probably XQuery too. I have no idea why.
Issue #2384 created #created-2384
`fn:xsd-validator` - attribute nodes
The type signature of the xsd-validator function suggests that it can be used to validate attribute nodes (as well as documents and elements), and the prose description concurs with this. However the function summary says "can be invoked to validate a document or element node against this schema."
Test case xsd-validator-092 expects an attribute node to be rejected.
Also the Notes in the specification say
The validation process is explained in more detail in the XQuery ([[XQuery 4.0: An XML Query Language]] section [4.25 Validate Expressions] and XSLT ([[XSL Transformations (XSLT) Version 4.0]] section [25.4 Validation]
but the detailed description has since been moved to F&O 17.2.4.
Note that XSLT has always allowed validation of free-standing attribute nodes, but the validate expression in XQuery allows only document and element nodes.
Pull request #2383 created #created-2383
Attempt to resolve action QT4CG-148-01
Per #2315:
- Added ‘at-risk’ changes to the fn:insert-separator, array:members, and array:of-members
- Added ‘at-risk’ changes to the XPath/XQuery section on map and array filtering
- Added a note about what ‘at risk’ means to the status sections
Issue #2382 closed #closed-2382
Tool changes for action QT4CG-148-01
Pull request #2382 created #created-2382
Tool changes for action QT4CG-148-01
These should have no effect without additional commits, but they have to be merged into main in order to have a visible effect on my subsequent PR.
Issue #573 closed #closed-573
Node construction functions
Issue #2124 closed #closed-2124
573 Functions to Construct Trees
QT4 CG meeting 148 draft minutes #minutes-01-13
Draft minutes published.
Issue #2357 closed #closed-2357
element() vs element(*) in function signatures
Issue #2358 closed #closed-2358
2357 Standardize on element() rather than element(*)
Issue #2367 closed #closed-2367
Documentation for new main-module attribute of xsl:stylesheet
Issue #2366 closed #closed-2366
json-lines attribute for xsl:output and xsl:result-document in XSLT spec
Issue #2356 closed #closed-2356
Clarification on scope of variables in xsl:for-each-group/(@split-when|@merge-when)
Issue #2368 closed #closed-2368
2367 Misc XSLT editorial fixes
Issue #2369 closed #closed-2369
F+O section 11 is empty
Issue #2371 closed #closed-2371
2369 Add content for F&O section 11 (Processing binary values)
Issue #2375 closed #closed-2375
2195 Editorial Omnibus
Issue #1591 closed #closed-1591
Implausible filter expressions
Issue #1934 closed #closed-1934
Supporting RELAX NG validation
Issue #2377 closed #closed-2377
2195 F+O Editorial Corrections
Issue #2381 created #created-2381
Add facility to serialize binary values as url-safe base64 encoded strings
In XQuery 3.1 there are two XDM types to represent binary values: xs:base64Binary and xs:hexBinary.
The current draft adds binary literals as an additional option.
Thus, it is possible to base64 encode any value by casting an xs:base64Binary to an xs:string.
xs:base64Binary("+w==") => xs:string()
There is no standard way to serialize those binary values to the URL safe variant of that encoding described in section 5 of RFC 4648.
The simplest workaround is replacing the unsafe characters of the alphabet (+ and /) and dropping the padding at the end with
xs:base64Binary("+w==") => xs:string() => translate("+/=", "-_")
This of course will only work for relatively small binary values. In order for processors to offer a performant and efficient way I see several options.
- adding new type xs:base64BinaryUrlSafe whose string representation uses the adapted alphabet with - and _ and does not add padding at the end
- a new function in fn namespace fn:encode-base64-url-safe($data as (xs:string | xs:base64Binary | xs:hexBinary)) as xs:string
- a new function in bin namespace bin:encode-base64-url-safe($data as (xs:string | xs:base64Binary | xs:hexBinary)) as xs:string
- add an output option that will serialize all binary values to base64 url-safe when cast to strings
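For comparison, here is how the two approaches look in Python, whose standard library already offers the URL-safe alphabet from RFC 4648 section 5. The function names are hypothetical and mirror the options above only loosely; translate_workaround corresponds to the string-replacement workaround, while encode_base64_url_safe works directly on the bytes.

```python
import base64


def encode_base64_url_safe(data: bytes) -> str:
    """Encode bytes with the URL-safe alphabet and drop the padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def translate_workaround(standard_b64: str) -> str:
    """Post-process a standard base64 string: + becomes -, / becomes _, = is removed."""
    return standard_b64.translate(str.maketrans("+/", "-_", "="))


raw = base64.b64decode("+w==")           # the single byte 0xFB
direct = encode_base64_url_safe(raw)     # "-w"
patched = translate_workaround("+w==")   # "-w"
```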
Addendum
I am also wondering why binary values cannot be created from numeric literals. Especially now that we have the binary notation for integer literals and the xs:integer type is unbounded this would be a perfectly fine literal notation to create binary values from. At least as suitable as string literals that are currently allowed.
xs:hexBinary(0xfb) and xs:base64Binary(0b11111111)
Issue #2380 created #created-2380
Use Case for Generators: News Feeds Aggregation Using Generators
In response to:
QT4CG-147-02: NW to chase up DN and LQ about follow-up to the generator discussion
Use Case: News Feeds Aggregation Using Generators
Contents
Use Case: News Feeds Aggregation Using Generators
- Actors
- Goals
- Functional Requirements
- Constraints / Assumptions / Preconditions
- Proposed High-Level Solution
- Known Approaches that are Problematic
- Benefits of the Generators Approach
- End-to-End Flow
- Brief Description of the Core Processes in the Pipeline
- Notes on the Process Pipeline
- Why This Fits the Generator Datatype Extremely Well
- Alternative Flows
- Alternative Flow-1: A Feed Temporarily Stops Producing New Items
- Alternative Flow-2: Partial Consumption of the Pipeline
- Alternative Flow-3: Editor Inserts or Reorders Items
- Exception Flows
- Exception Flow-1: Feed Unreachable or Network Failure
- Exception Flow-2: Malformed Feed Data
- Exception Flow-3: Resource Exhaustion Risk
- Postconditions
- References
The Problem
Modern RSS/JSON aggregators must process hundreds of continuously updating feeds without excessive memory usage or latency, while supporting filtering, merging, and prioritization in real time.
Actors
- End-User
- Editor
- Administrator
- System components (internal processes acting as secondary actors)
- External services (RSS providers, APIs, social signals)
Goals
- End-User: “As a user, I want to get the latest, up-to-the-minute news from many important sources. I want each brief news item to be presented with a link to more detailed information from the original source.”
- Editor: “As an editor, I want to be alerted to any change in the aggregated news-stream, as it happens continuously, and to have powerful ways of inserting, reordering, appending, prepending or deleting one or more news-items.”
- Administrator: “As an administrator, I want to start, stop, or restart the system, manage the configured feeds, and monitor operational health and error conditions.”
Functional Requirements
- Consume RSS / Atom / JSON-LD feeds incrementally
- Filter items by topic or sensitivity
- Merge multiple feeds chronologically
- Produce continuously updated summaries
Constraints / Assumptions / Preconditions
Assumptions
- Feeds may be large or unbounded
- Items arrive over time
Constraint
- Memory usage must remain bounded
Preconditions
- At least one news feed is configured
- Feeds are RSS or JSON-LD and timestamped
- Items within a feed are presented in reverse-chronological order
- Each item contains a content link or, optionally, inline content
- Items may belong to multiple categories
Proposed High-Level Solution
Each feed is modeled as a generator producing yield values lazily.
The ordered set of values produced by successive, demand-driven calls to move-next() is called the yield of the generator.
A generator’s yield may be finite or infinite, and may be empty for a given generator instance without implying exhaustion of the underlying data source.
Known Approaches That Are Problematic
These approaches require full materialization in memory:
- Eager sequences (XPath)
- DOM-style loading
- Materialized feeds
Benefits of the Generators Approach
- Bounded memory usage
- Low latency
- Composability
- Deterministic control of evaluation
End-to-End Flow
+-------------------------------+
| 1. Feed Fetching |
| Input: external providers |
| Output: G_rawItems |
+---------------+---------------+
|
+---------------v---------------+
| 2. Normalization |
| Input: G_rawItems |
| Output: G_normalizedItems |
+---------------+---------------+
|
+---------------v---------------+
| 3. Filtering | <-- unwanted content removed
| Input: G_normalizedItems |
| Output: G_filteredItems |
+---------------+---------------+
|
+---------------v---------------+
| 4. Topic Classification |
| Input: G_filteredItems |
| Output: G_classifiedItems |
+---------------+---------------+
|
+---------------v---------------+
| 5. Clustering |
| Input: G_classifiedItems |
| Output: G_clusteredItems |
+---------------+---------------+
|
+---------------v---------------+
| 6. Ranking |
| Input: G_clusteredItems |
| Output: G_rankedItems |
+---------------+---------------+
|
+---------------v---------------+
| 7. Summary Page Generation |
| Input: G_rankedItems |
| Output: G_summaryPageItems, |
| HTML |
+---------------+---------------+
|
+---------------v---------------+
| 8. Detail Page Generation |
| Input: G_summaryPageItems |
| Output: HTML Detail Pages |
+-------------------------------+
Remarks
- The participating generator instances are named using the convention G_{name}.
- Every stage except the final one produces a new generator.
- Every stage except the very first uses a generator as its input.
- Arrow semantics: the output generator of one stage is the input for the next stage.
Brief Description of the Core Processes in the Pipeline
Process 1 — Feed Fetching & Acquisition
Goal:
Continuously pull RSS / Atom / JSON-LD feeds from CNN, Fox, NBC, BBC, etc.
Includes:
- Periodic polling (e.g., every 5 minutes)
- Detection of new items (GUID, URL hash, published timestamps)
- N-way merging to ensure the resulting yield is sorted in reverse-chronological order
- Basic sanity validation (e.g., XML schema validity)
Output:
A generator whose yield values are raw feed items (XML / JSON documents) → input to Process 2.
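The N-way merge and new-item detection described for Process 1 can be sketched in Python, whose built-in generators stand in here for the proposed generator datatype. This is a minimal sketch under stated assumptions: the field names `id` and `published`, and the helper names `merge_feeds` and `new_items`, are hypothetical and not part of any proposal. Each input feed is assumed to be sorted newest-first already.

```python
import heapq
from typing import Dict, Iterable, Iterator, Set


def merge_feeds(*feeds: Iterable[Dict]) -> Iterator[Dict]:
    """Lazily merge feeds that are each already sorted newest-first."""
    # heapq.merge pulls items on demand from each input; reverse=True makes
    # it expect (and produce) descending order of the key.
    return heapq.merge(*feeds, key=lambda item: item["published"], reverse=True)


def new_items(items: Iterable[Dict], seen: Set[str]) -> Iterator[Dict]:
    """Yield only items whose id has not been seen in an earlier poll."""
    for item in items:
        if item["id"] not in seen:
            seen.add(item["id"])
            yield item


# Hypothetical sample data; "published" is a plain number for brevity.
feed_a = iter([{"id": "a2", "published": 40}, {"id": "a1", "published": 10}])
feed_b = iter([{"id": "b2", "published": 30}, {"id": "b1", "published": 20}])

seen_ids = {"b1"}  # pretend b1 was delivered by a previous polling cycle
merged_ids = [item["id"] for item in new_items(merge_feeds(feed_a, feed_b), seen_ids)]
print(merged_ids)
```

Nothing is pulled from either feed until the consumer asks for the next item, so the merge holds at most one pending item per feed.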
Process 2 — Parsing & Normalization
Goal:
Convert heterogeneous raw feed items into a uniform internal format.
Normalized fields include:
- Title
- Description / Summary
- Full text (if available)
- URL
- Publication time (converted to UTC)
- Source
- Images, categories, tags
- Named entities (optional NLP-based enrichment)
Output:
A generator yielding clean, normalized NewsItem documents → input to Process 3.
Process 3 — Content Filtering & Exclusion Rules
Goal:
Remove unwanted items early using configurable rule sets.
Examples:
- Blocked topics: politics, celebrity gossip, violence, etc.
- Blocked entities: Donald Trump, Joe Biden, Kanye West, etc.
- Blocked publishers (optional)
- Expiration rules:
- Tech news stale after 48 hours
- Breaking news stale after 6 hours
Techniques:
- Keyword filtering
- Named Entity Recognition (NER)
- Sensitive-topic classifiers (ML-based)
- Freshness scoring
Output:
A generator yielding allowed, filtered NewsItem documents → input to Process 4.
Rejected items are stored separately for auditing.
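As an illustration of the filtering stage, the following Python sketch applies a list of exclusion rules lazily and diverts rejected items to an audit store. The rule functions, the field names `topic` and `age_hours`, and the in-memory `rejected` list are assumptions made for the example; a real deployment would presumably write rejected items to durable storage instead.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Rule = Callable[[Dict], bool]  # returns True when the item must be rejected

BLOCKED_TOPICS = {"celebrity gossip"}
MAX_AGE_HOURS = {"tech": 48, "breaking": 6}  # expiration rules from the text


def blocked_topic(item: Dict) -> bool:
    return item["topic"] in BLOCKED_TOPICS


def stale(item: Dict) -> bool:
    # Topics without an expiration rule never go stale in this sketch.
    return item["age_hours"] > MAX_AGE_HOURS.get(item["topic"], float("inf"))


def filter_items(items: Iterable[Dict], rules: List[Rule],
                 rejected: List[Dict]) -> Iterator[Dict]:
    """Yield allowed items; record rejected ones for auditing."""
    for item in items:
        if any(rule(item) for rule in rules):
            rejected.append(item)
        else:
            yield item


normalized = iter([
    {"id": 1, "topic": "tech", "age_hours": 12},
    {"id": 2, "topic": "celebrity gossip", "age_hours": 1},
    {"id": 3, "topic": "breaking", "age_hours": 7},
    {"id": 4, "topic": "tech", "age_hours": 50},
    {"id": 5, "topic": "world", "age_hours": 100},
])
audit_log: List[Dict] = []
allowed_ids = [item["id"] for item in filter_items(normalized, [blocked_topic, stale], audit_log)]
rejected_ids = [item["id"] for item in audit_log]
```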
Process 4 — Topic Classification
Goal:
Assign each item to one or more topics.
Example topics:
- Politics
- World
- Tech
- Health
- Sports
- Business
- Disasters / Urgent events
- Crime / Safety
- Entertainment
Approaches:
- Fine-tuned BERT classifier (preferred)
- TF-IDF + SVM (simpler)
- Feed-provided category tags (fallback)
Output:
A generator yielding categorized NewsItem documents → input to Process 5.
Process 5 — Similarity Analysis & Clustering
Goal:
Group news items from different sources describing the same event.
Techniques:
- Semantic vector embeddings (e.g., SBERT, Ada embeddings)
- Cosine similarity
- Hierarchical clustering or DBSCAN
Produces:
- Clusters of highly similar articles
- A primary (best) representative per cluster
Output:
A generator yielding clusters of related articles → input to Process 6.
Note:
To better match streaming behavior, clustering may operate within bounded windows (e.g., sliding windows) while still consuming the input generator.
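The bounded-window idea in the note can be illustrated with a deliberately simple technique: greedy single-pass threshold clustering over a sliding window, using cosine similarity. This is not hierarchical clustering or DBSCAN, and it labels each item with a cluster id rather than yielding whole clusters. The `vector` field, the window size, and the threshold are assumptions chosen for the example.

```python
import math
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_windowed(items: Iterable[Dict], window: int = 100,
                     threshold: float = 0.9) -> Iterator[Dict]:
    """Label each item with a cluster id, comparing only against recent items."""
    # The deque drops the oldest entry automatically, so memory stays bounded.
    recent: Deque[Tuple[int, List[float]]] = deque(maxlen=window)
    next_id = 0
    for item in items:
        match = next((cid for cid, vec in recent
                      if cosine(vec, item["vector"]) >= threshold), None)
        if match is None:  # no similar recent item: start a new cluster
            match = next_id
            next_id += 1
        recent.append((match, item["vector"]))
        yield {**item, "cluster": match}


classified = iter([
    {"id": "n1", "vector": [1.0, 0.0]},
    {"id": "n2", "vector": [0.99, 0.1]},
    {"id": "n3", "vector": [0.0, 1.0]},
    {"id": "n4", "vector": [0.1, 0.99]},
])
cluster_ids = [item["cluster"] for item in cluster_windowed(classified, window=10)]
```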
Process 6 — Ranking, Urgency, and Freshness Scoring
Goal:
Prioritize which news appears on the Summary Page.
Computed scores:
- Freshness score (more recent → higher)
- Urgency score (disasters, crises, violence)
- Coverage score (number of sources reporting)
- Engagement score (optional: social signals)
Weighted formula:
FinalScore = a*Urgency + b*Freshness + c*Coverage + d*EditorRules
Items with the highest scores per topic are selected.
This stage does not require a full total ordering; instead a partial ordering (e.g., top-K per topic) preserves bounded memory.
Editor-driven operations (insert, remove, reorder) are modeled as generator transformations applied downstream of ranking.
Output:
A generator yielding ranked clusters → input to Process 7.
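The weighted formula and the top-K-per-topic selection can be sketched with one small min-heap per topic, so that memory is bounded by K times the number of topics. The weight values and the field names are assumptions; the source does not specify a, b, c, or d. Note that this sketch reads its whole input before returning, so for an unbounded stream it would have to be applied per window.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical weights standing in for a, b, c, d in the formula above.
WEIGHTS = {"urgency": 0.4, "freshness": 0.3, "coverage": 0.2, "editor_rules": 0.1}


def final_score(item: Dict) -> float:
    return sum(weight * item[name] for name, weight in WEIGHTS.items())


def top_k_per_topic(items: Iterable[Dict], k: int) -> Dict[str, List[Dict]]:
    """Keep only the k best-scoring items per topic, best first."""
    heaps = defaultdict(list)  # topic -> min-heap of (score, sequence, item)
    for seq, item in enumerate(items):
        # seq is unique, so tuple comparison never reaches the dict itself.
        entry = (final_score(item), seq, item)
        heap = heaps[item["topic"]]
        if len(heap) < k:
            heapq.heappush(heap, entry)
        else:
            heapq.heappushpop(heap, entry)  # evicts the lowest score
    return {topic: [entry[2] for entry in sorted(heap, reverse=True)]
            for topic, heap in heaps.items()}


clusters = iter([
    {"id": "c1", "topic": "tech", "urgency": 0.1, "freshness": 0.9, "coverage": 0.5, "editor_rules": 0.0},
    {"id": "c2", "topic": "tech", "urgency": 0.9, "freshness": 0.5, "coverage": 0.5, "editor_rules": 0.0},
    {"id": "c3", "topic": "tech", "urgency": 0.2, "freshness": 0.2, "coverage": 0.2, "editor_rules": 0.0},
    {"id": "c4", "topic": "world", "urgency": 1.0, "freshness": 1.0, "coverage": 1.0, "editor_rules": 0.0},
])
ranked = top_k_per_topic(clusters, k=2)
ranked_ids = {topic: [item["id"] for item in items] for topic, items in ranked.items()}
```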
Process 7 — Summary Page Generation
This stage consumes the input generator and produces finite views intended for presentation.
Goal:
Build a continuously updated Summary Page (“Front Page”) containing:
- Top events per topic
- Short summaries
- Links to primary articles
- “Read similar news” (cluster siblings)
- Source icons
- Timestamp of most recent update
The page auto-refreshes and always reflects the newest items.
Process 8 — Detailed Pages & Cross-Links
This stage consumes its input generator and produces finite presentation views.
For each cluster:
- Canonical article (primary representative)
- Related articles across sources
- Timeline of developments
- Additional metadata (images, entities, tags)
Cross-links include:
- “More like this…”
- “Earlier developments…”
- “Follow-up stories…”
Notes on the Process Pipeline
- Feed Fetching typically wraps one or more data providers → produces G_rawItems lazily (RSS, JSON APIs, DB cursors, web services)
- Every stage is expressible as: for-each, filter, append, prepend, insert-at, remove-where, concat, or fold, etc., producing a new generator derived from the previous one
- No stage requires full materialization unless explicitly demanded (e.g., to-array, bounded sort, pagination)
- Infinite generators are valid until stage 6; stages 7–8 typically consume finite prefixes (take(n))
Why This Fits the Generator Datatype Extremely Well
- The pipeline is a composition of generator transformers
- Each box maps almost 1-to-1 to generator operations
- External data providers integrate naturally at Stage 1
- Sorting can be introduced in different ways:
- External merge-sort over generators
- Bounded-window ranking
- Top-K lazy ranking – e.g. using heaps.
Alternative Flows
Alternative Flow 1 — Feed Temporarily Stops Producing New Items
Condition:
A feed is reachable but has no new items since the last polling cycle.
Flow:
- The feed generator advances (move-next()).
- The data provider returns no new items.
- The feed-generator instance yields no items during this interval.
- Downstream generators remain operational.
- If all feeds are empty, no new items are added downstream.
Result:
The pipeline continues uninterrupted; no special handling is required.
Alternative Flow 2 — Partial Consumption of the Pipeline
Condition:
Only a finite prefix of the stream is required (e.g., top N items).
Flow:
- Downstream consumers apply take(N).
- Upstream generators are evaluated only as needed.
- Remaining potential yield values are never materialized.
Result:
Latency and memory usage remain bounded. The pipeline supports early termination naturally.
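The effect of partial consumption can be observed directly in a Python sketch, where `itertools.islice` plays the role of take(N). The `pulled` list exists only to record how many items the infinite upstream source was actually asked to produce; all names here are hypothetical.

```python
from itertools import count, islice
from typing import Dict, Iterable, Iterator, List

pulled: List[int] = []  # records every item the source was asked to produce


def raw_items() -> Iterator[Dict]:
    """An infinite upstream source."""
    for n in count(1):
        pulled.append(n)
        yield {"id": n}


def normalize(items: Iterable[Dict]) -> Iterator[Dict]:
    for item in items:
        yield {**item, "title": f"Item {item['id']}"}


def take(items: Iterable[Dict], n: int) -> Iterator[Dict]:
    return islice(items, n)


first_three = list(take(normalize(raw_items()), 3))
print(pulled)  # only the items that were demanded downstream
```

Although the source is infinite, it is advanced exactly as many times as the consumer demands.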
Alternative Flow 3 — Editor Inserts or Reorders Items
Condition:
An editor manually modifies the aggregated stream.
Flow:
- Editor operations are applied as generator transformations (append, prepend, insert-at, remove-at, remove-where).
- A new generator with the modified yield is produced.
- Downstream stages consume it transparently.
Result:
Editorial control integrates seamlessly without breaking the pipeline.
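A few of the editor operations can be sketched as lazy transformations, each returning a new iterator over the modified yield. The Python function names mirror the operation names in the text, but the signatures and the 0-based index are assumptions, not the proposed API.

```python
from itertools import chain, islice
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def prepend(items: Iterable[T], value: T) -> Iterator[T]:
    return chain([value], items)


def insert_at(items: Iterable[T], index: int, value: T) -> Iterator[T]:
    """Insert value before the item at the given 0-based index."""
    iterator = iter(items)
    yield from islice(iterator, index)  # items before the insertion point
    yield value
    yield from iterator                 # the untouched remainder


def remove_where(items: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    return (item for item in items if not predicate(item))


ranked_stream = iter(["a", "b", "c", "d"])
edited = insert_at(remove_where(ranked_stream, lambda x: x == "b"), 1, "editor-pick")
front_page = list(prepend(edited, "pinned"))
```

No operation copies the stream; each one wraps its input, so downstream stages cannot tell an edited generator from an unedited one.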
Exception Flows
Exception Flow 1 — Feed Unreachable or Network Failure
Condition:
A feed cannot be reached during polling.
Flow:
- The data provider reports an error or timeout.
- The next instance of the feed generator yields no items during this polling interval.
- The error is logged for monitoring.
- A retry policy (e.g., exponential backoff) is applied.
Result:
The system continues operating with remaining feeds.
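One way to picture this flow: a polling step that yields nothing when the provider fails, paired with a lazy sequence of exponential backoff delays. The `fetch` callable, the `log` list, and the delay parameters are assumptions for illustration; the source does not prescribe a retry policy beyond naming exponential backoff as an example.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def poll_feed(fetch: Callable[[], Iterable[Dict]], log: List[str]) -> Iterator[Dict]:
    """One polling interval: yield the fetched items, or nothing on failure."""
    try:
        items = fetch()
    except OSError as err:  # covers ConnectionError and TimeoutError
        log.append(str(err))
        return              # empty yield for this interval
    yield from items


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Infinite generator of retry delays in seconds, capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def unreachable() -> Iterable[Dict]:
    raise ConnectionError("feed unreachable")


error_log: List[str] = []
items_this_interval = list(poll_feed(unreachable, error_log))
first_delays = list(islice(backoff_delays(), 8))
```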
Exception Flow 2 — Malformed Feed Data
Condition:
A feed item is malformed (invalid XML/JSON or schema validation problems, e.g. missing required fields).
Flow:
- The normalization stage detects the issue.
- The item is discarded or quarantined.
- Processing continues with subsequent items.
Result:
Malformed data does not propagate downstream.
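A minimal sketch of this flow for XML items, using Python's standard `xml.etree.ElementTree` parser. The required field names (`title`, `link`) and the in-memory `quarantine` list are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator, List

REQUIRED_FIELDS = ("title", "link")  # hypothetical required fields


def normalize(raw_items: Iterable[str], quarantine: List[str]) -> Iterator[Dict]:
    """Yield normalized items; quarantine anything malformed or incomplete."""
    for raw in raw_items:
        try:
            element = ET.fromstring(raw)
        except ET.ParseError:
            quarantine.append(raw)  # not well-formed XML
            continue
        fields = {name: element.findtext(name) for name in REQUIRED_FIELDS}
        if any(value is None for value in fields.values()):
            quarantine.append(raw)  # a required field is missing
            continue
        yield fields


raw_feed = iter([
    "<item><title>A</title><link>http://example.com/a</link></item>",
    "<item><title>broken",
    "<item><title>No link</title></item>",
])
quarantined: List[str] = []
clean_items = list(normalize(raw_feed, quarantined))
```

Because the bad items are skipped inside the generator, downstream stages only ever see well-formed, complete items.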
Exception Flow 3 — Resource Exhaustion Risk
Condition:
A downstream operation risks exceeding memory limits.
Flow:
- Bounded strategies (windowing, top-K selection) are applied.
- Full materialization is avoided.
- If needed, the operation degrades gracefully (e.g., reduced clustering depth).
Result:
System stability is preserved under load.
Postconditions
Upon successful execution:
Functional Outcomes
- End users see an up-to-date Summary Page.
- Each summary item links to a Detailed Page.
- Editors can intervene using generator operations.
- Administrators retain full system control.
Technical Guarantees
- Memory usage remains bounded.
- Latency is minimized through lazy evaluation.
- Full materialization occurs only when explicitly requested.
System State
- All generators remain composable.
- Generator composition remains valid after alternative and exceptional flows.
- Empty generators correctly represent exhaustion.
- Infinite yields are supported up to stages that require finiteness.
References
- RSS 2.0 Specification, https://www.rssboard.org/rss-specification
- Atom Publishing Protocol (RFC 5023), https://www.rfc-editor.org/rfc/rfc5023
- JSON-LD Specification, https://json-ld.org/spec/
- TF-IDF, “Understanding TF-IDF (Term Frequency-Inverse Document Frequency)”, https://www.geeksforgeeks.org/machine-learning/understanding-tf-idf-term-frequency-inverse-document-frequency/
- TF-IDF + SVM, “Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT?”, https://arxiv.org/html/2411.12703v1
- Sentence-BERT (SBERT), Reimers, N. & Gurevych, I., 2019, https://arxiv.org/abs/1908.10084
- Fine-tuned BERT, “Fine-tuning a BERT model”, https://www.tensorflow.org/tfmodels/nlp/fine_tune_bert
- Ada Embeddings (OpenAI), Radford et al., 2021, https://arxiv.org/abs/2103.00020
- Cosine Similarity, https://en.wikipedia.org/wiki/Cosine_similarity
- Hierarchical Clustering, https://en.wikipedia.org/wiki/Hierarchical_clustering
- DBSCAN, Ester et al., 1996, https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf
Pull request #2379 created #created-2379
Use exported schema to validate function catalogs
This PR adds an exported SCM version of the fos.xsd schema with an embedded license. Updates to the build script use it to validate all function-catalog.xml files.
Pro: validation
Con: any change to fos.xsd must also be accompanied by an update to the exported schema, which probably only Mike or I can do.
Issue #2378 created #created-2378
HTML indenting
The spec says (in both 3.1 and 4.0):
The inline elements are those included in the %inline category of any of the HTML 4.01 DTDs or those elements defined to be phrasing elements in HTML5
This could be read as defining one set of inline elements for version="4.01" and a different set of inline elements for version="5.0", or it could be read as indicating that an element is an inline element if it satisfies either of these two conditions.
Issue #2370 closed #closed-2370
Add character-maps to the allowed context dependencies
Issue #2374 closed #closed-2374
Markup: Allow empty nt elements
Pull request #2377 created #created-2377
2195 F+O Editorial Corrections
F+O: editorial corrections to issues identified in #2195 (mainly corrections to examples), plus completion of missing change metadata.
Pull request #2376 created #created-2376
2337 Extend xsl:mode/@typed to handle JNodes etc
Fix #2337
Pull request #2375 created #created-2375
2195 Editorial Omnibus
Fixes a number of problems from issue #2195.
Pull request #2374 created #created-2374
Markup: Allow empty nt elements
Stylesheet change to allow empty NT elements, bringing them into line with other referencing elements such as termref and xnt. The markup <nt def="AxisStep"/> is treated as equivalent to <nt def="AxisStep">AxisStep</nt>. This removes a common source of error which tends to result in missing text in the spec rather than in any kind of build error.
Pull request #2373 created #created-2373
2359 No conversion to JNode in absolute paths
Fix #2359
Pull request #2372 created #created-2372
2344 Change rendition of PIs in HTML5
Fix #2344
Pull request #2371 created #created-2371
2369 Add content for F&O section 11 (Processing binary values)
Fix #2369
Pull request #2370 created #created-2370
Add character-maps to the allowed context dependencies
Currently function-catalog.xml is invalid against the schema fos.xsd. This PR updates the schema to allow "character-maps" in the enumeration of allowed context dependencies.
(Note, this should probably cause the build to fail. The problem was only spotted when using Oxygen to query the function catalog.)
Issue #2369 created #created-2369
F+O section 11 is empty
F+O section 11, Processing Binary Values, is currently empty
Has something gone wrong, or should we delete the section?
Pull request #2368 created #created-2368
2367 Misc XSLT editorial fixes
Most of the changes here are to bring the change log entries up to date. Also:
Fix #2356 Fix #2366 Fix #2367
Issue #2367 created #created-2367
Documentation for new main-module attribute of xsl:stylesheet
In the XSLT 4.0 spec, 3.6 Stylesheet Element says:
The optional main-module attribute is purely documentary. By including this attribute in every stylesheet module of a package, an XSLT editing tool may be enabled to locate the top-level module of the relevant package [...]
But what does "top-level module" mean? Should this say "principal stylesheet module" instead? I can see that top-level package is defined, but not "top-level module", so I'm confused.
Issue #2366 created #created-2366
json-lines attribute for xsl:output and xsl:result-document in XSLT spec
The new serialization parameter json-lines is documented at 26.2 Serialization parameters. But please add a "changes" entry for the new attribute json-lines in the sections 25.1 Creating Secondary Results and 26.1 The xsl:output declaration in the XSLT 4.0 spec. This is currently missing.
(Note that there is already a changes entry in the Serialization spec at 3 Serialization Parameters.)
Issue #2365 created #created-2365
Record types: extensible and non-extensible pairs
It is often useful in a function signature for an argument type to be an extensible record type (so additional fields are allowed, which the function can ignore, saving the need to check for their presence) while the return type is non-extensible (giving better static type checking for lookup expressions, for example).
Currently this requires two separate named record types to be declared, differing only in that one of them is extensible and the other not. This duplication is clearly undesirable.
One solution to this might be to have a single non-extensible definition of the name of the record type, with some way of indicating at the point where the record type is used that extensions are allowed.
For example (probably not viable syntax as written):
fn:element-to-map-plan(
$input as element()*
) as fn:element-to-map-conversion-plan
fn:element-to-map(
$node as element(),
$plan as extensible fn:element-to-map-conversion-plan
) as map(*)
Perhaps the syntax extensible(fn:element-to-map-conversion-plan) would work.
Pull request #2364 created #created-2364
2088 File Module: Feedback, Observations
Closes #2088
Issue #2250 closed #closed-2250
Function to detect/infer the string encoding from a binary
Issue #2092 closed #closed-2092
Drop map:pair, map:of-pairs, map:pairs, array:members, array:of-members
Issue #2194 closed #closed-2194
fn:transform sandbox=yes option
Pull request #2363 created #created-2363
2349 Revert array:join
Closes #2349
Pull request #2362 created #created-2362
2355 bin:infer-encoding: further alignments
Closes #2355
Issue #2361 created #created-2361
Encoding parameters: upper/lower case, normalization
The serializer spec says…
Serializers are required to support values of UTF-8 and UTF-16
…whereas the XQFO spec mentions utf-8 as default value for fn:serialize. Similarly, only the UTF lower-case variants are listed for fn:unparsed-text, and there may be other places.
I think we should mention the upper-case variants everywhere, and add notes that upper/lower case is ignored when processing the encoding string.
Issue #2360 created #created-2360
fn:root() vs. absolute path expressions
Is there a particular reason why the absolute slash / is defined as complicated as…
self::gnode()/(fn:root(.) treat as (document-node()|jnode()))/PP
…and wouldn’t it be helpful to simplify it and get rid of the treat as expression?
self::gnode()/fn:root(.)/PP
In many cases, the document node does not exist or is not really needed, and it would allow users to use the slash for nodes that would otherwise need to be wrapped in document nodes, for example:
let $as := analyze-string('abc', 'b')
return $as/fn:match[/fn:non-match]
Issue #2359 created #created-2359
Implicit conversion to JNodes with absolute path expressions
Section 4.7.1 discusses absolute path expressions.
The first part of the section concerns leading "/", and includes the note:
If the context value includes a map or array, it is not converted implicitly to a JNode; rather, a type error occurs.
The second part concerns leading "//", and includes the statement:
Any map or array that is present in the context value is first coerced to a JNode by applying the [fn:jtree] function.
It might be inferred that "/" doesn't do this conversion, but "//" does. However, this certainly isn't stated explicitly, and there would be no logical reason for treating the two cases differently.
We should either do the conversion for both cases, or for neither.
I'm inclined to do it for neither. Partly because an implicit conversion wouldn't do any upwards navigation to a different "root" node, as users might expect; partly because doing the conversion reduces the type information available to the compiler.
QT4 CG meeting 147 draft minutes #minutes-01-06
Draft minutes published.
Issue #407 closed #closed-407
XSLT-specific context properties used in function items
Issue #2274 closed #closed-2274
407 Function items capturing XSLT context components
Issue #1011 closed #closed-1011
fn:transform() improvements
Issue #2348 closed #closed-2348
1011 fn transform improvements
Issue #2339 closed #closed-2339
Default priority of match="element(A|B)"
Issue #2335 closed #closed-2335
Make `jnode()` like `element()`
Issue #2334 closed #closed-2334
XSLT: Parenthesized subexpressions within Patterns
Issue #2297 closed #closed-2297
XSLT pattern ambiguities with typed matches
Issue #2336 closed #closed-2336
2334 Revise XSLT pattern syntax and semantics
Issue #2048 closed #closed-2048
Untrusted execution, and security more generally
QT4 CG meeting 148 draft agenda #agenda-01-13
Draft agenda published.
Pull request #2358 created #created-2358
2357 Standardize on element() rather than element(*)
Fix #2357
Issue #2357 created #created-2357
element() vs element(*) in function signatures
We use element(*) and element() interchangeably in function signatures. I propose we standardise on the simpler form, element().
Ditto attribute().
Issue #2356 created #created-2356
Clarification on scope of variables in xsl:for-each-group/(@split-when|@merge-when)
A user experimenting with xsl:for-each-group/@split-when with my 4->3 source-code transformer inferred that the variable $group was available within the sequence constructor of the grouping instruction.
(Unfortunately due to an error in my transformer code $group was within scope in the sequence constructor, though with the wrong value ;-) - this has since been corrected.)
A close and detailed reading of the spec shows that $group and $next are implied as only in scope for the evaluation of the @split-when expression. Might I suggest that there is a small note emphasising this is the case? Similar clarification may be worthwhile for @merge-when too.
QT4 CG meeting 147 draft agenda #agenda-01-06
Draft agenda published.