Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: Specification in XML format using HTML5 vocabulary and XML function catalog.
Copyright © 2000 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document defines an API for XPath 4.0 to handle the manipulation of binary data. It defines extension functions to process data from, and generate data for, binary resources, including extracting subparts, searching, basic binary operations and conversion between binary and structured forms of XDM numbers and strings.
The document is an update of the original [EXPath Binary 1.0] specification, developed by the EXPath Community Group, defined for [XML Path Language (XPath) 2.0] and published in 2013.
The principal semantic alteration is the use of default values for function arguments, which are available in XPath 4.0.
These functions are defined for use in [XML Path Language (XPath) 4.0] and [XQuery 4.1: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0] and other related XML standards. The signatures and summaries of functions defined in this document are available at: https://qt4cg.org/specifications/EXPath/binary-40/.
A summary of changes since published version 1.0 (the XPath 2.0 version) is provided at F Changes since version 1.0.
This version of the specification is work in progress. It is produced by the QT4 Working Group, officially the W3C XSLT 4.0 Extensions Community Group. Individual functions specified in the document may be at different stages of review, reflected in their History notes. Comments are invited, in the form of GitHub issues at https://github.com/qt4cg/qtspecs.
Changes in 4.0 ⬇
Use the arrows to browse significant changes since the 1.0 version of this specification.
Sections with significant changes are marked Δ in the table of contents. New functions introduced in this version are marked ➕ in the table of contents.
The purpose of this document is to define functions to manipulate binary data for inclusion in XPath 4.0, XQuery 4.0, and XSLT 4.0. The binary data is represented by the type xs:base64Binary as defined by Section 3.2.16 base64BinaryXS2.
The exact syntax used to call these functions and operators is specified in [XML Path Language (XPath) 4.0], [XQuery 4.1: An XML Query Language] and [XSL Transformations (XSLT) Version 4.0].
This document defines several classes of functions:
Functions to create binary 'constants' and convert between the binary forms and sequences of octets.
Functions to perform basic operations on binary values, such as joining, selecting and searching.
Functions to perform bitwise operations.
Functions to pack or unpack numeric values from within or into binary data.
Functions to decode or encode strings from within or into binary data.
References to specific sections of some other specifications are indicated by cross-document links in this document. Each such link consists of a pointer to a specific section followed by a superscript specifying the linked document. The superscripts have the following meanings: FILE40 [EXPath File 4.0], FO40 [XQuery and XPath Functions and Operators 4.0] and XS2 [XML Schema Part 2: Datatypes Second Edition].
Error conditions are identified by a code (a QName). When such an error condition is reached in the evaluation of an expression, a dynamic error is thrown, with the corresponding error code (as if the standard XPath function error() had been called).
In this specification these codes use the http://expath.org/ns/binary namespace and a 'descriptive string' local part, e.g. bin:index-out-of-range, rather than the http://www.w3.org/2005/xqt-errors namespace and alpha-numeric local part, e.g. err:FOCH0004, used in [XQuery and XPath Functions and Operators 4.0]. These error codes were chosen originally in the 1.0 version of 2013.
Binary 'data' arguments to the functions are now declared to be either xs:hexBinary or xs:base64Binary, but binary function results remain of type xs:base64Binary. This should not cause any backward incompatibilities, as casting back and forth between the two representations has been possible since at least version 2.0.
The principal binary type within this module is xs:base64Binary as defined by Section 3.2.16 base64BinaryXS2.
In general, for the functions defined in this specification, if the result returned is binary data, it will always be of type xs:base64Binary. Any arguments to a function which are considered to be binary data can be either of type xs:base64Binary or xs:hexBinary.
Conversion to and from xs:hexBinary can be performed by casting with xs:hexBinary() and xs:base64Binary().
Note:
As these types are normally implemented as wrappers around byte array structures containing the data, and differ only when being serialized to or parsed from text, such casting in-process should not involve data copying.
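The relationship between the two lexical forms can be illustrated outside XPath. The following Python sketch is not part of this specification; the helper names hex_to_base64 and base64_to_hex are hypothetical. It shows that both forms describe the same octets, so converting between them changes only the textual representation.

```python
import base64

def hex_to_base64(hex_text: str) -> str:
    """Re-express octets written as hex digits in base64 form."""
    octets = bytes.fromhex(hex_text)  # the underlying octets
    return base64.b64encode(octets).decode("ascii")

def base64_to_hex(b64_text: str) -> str:
    """Re-express octets written in base64 form as upper-case hex digits."""
    octets = base64.b64decode(b64_text, validate=True)
    return octets.hex().upper()

print(hex_to_base64("11223F4E"))  # ESI/Tg==
print(base64_to_hex("ESI/Tg=="))  # 11223F4E
```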
An item of type xs:base64Binary can be empty, i.e. contain no data (in the same way that items of type xs:string can contain no characters). Where 'data' arguments to functions that return binary data are optional (i.e. $arg as type?) and any of those optional arguments is set to the empty sequence, in general an empty sequence is returned, rather than an empty item of type xs:base64Binary.
A suite of test-cases for all the functions defined in this module, in [QT3] format, is defined at [Test-suite].
This specification follows the general remarks on and terminology for conformance given in Section 1.2 ConformanceFO40.
In this document, text labeled as an example or as a note is provided for explanatory purposes and is not normative.
The functions defined in this document are contained exclusively in the namespace http://expath.org/ns/binary and referenced using an xs:QName binding to that namespace.
This document uses the prefix bin to refer to this namespace.
User-written applications can choose a different prefix to refer to the namespace, so long as it is bound to the correct URI. In accordance with current practice, it is recommended that the prefix bin be reserved for use with this namespace.
Each function (or group of functions having the same name) is defined in this specification using a standard proforma, full details of which can be found in Section 1.5 Function signatures and descriptionsFO40. In particular in this update (trailing) optional arguments for functions (introduced in XPath 4.0) are used where appropriate in the signatures, rather than multiple-arity signatures as previously.
Development of this specification was driven by requirements which some XML developers regularly encounter in examining or generating data which is presented in binary, or other non-textual forms. Some typical use cases include:
Getting the dimensions of an image file.
Extracting image metadata.
Processing images embedded as base64 encodings within a SOAP message.
Processing legacy text files which use different encodings in separate sections.
Generating PDF files from SVG graphical data.
As an example, the following code reads the binary form of a JPEG image file, searches for the 'Start of Frame/DCT' segment, and unpacks the relevant binary sections to integers of height and width:
<xsl:variable name="binary" select="file:read-binary(@href)" as="xs:base64Binary"/>
<xsl:variable name="location" select="bin:find($binary,0,bin:hex('FFC0'))"/>
<size width="{bin:unpack-unsigned-integer($binary,$location+5,2,'most-significant-first')}"
      height="{bin:unpack-unsigned-integer($binary,$location+7,2,'most-significant-first')}"/>
→ <size width="377" height="327"/>
(The 'most-significant-first' argument ensures the numeric conversion is 'big-endian', which is the format in JPEG.)
The functions in this example have been moved into a differing namespace prefix (asn:) to avoid suggesting that they are part of the supported function set.
[ASN.1] defines several formats for identifying and encoding arbitrary-sized telecommunications data as streams of octets. Many of these forms specify the length of data as part of their encoding. For example, in the Basic Encoding Rules, an integer is represented as the following series of octets:
Type – 1 octet – in this case the value 0x02
Length – ≥ 1 octet – the number of octets in the integer value. The length field itself can be variable in length – to accommodate VERY large integers (requiring more than 127 octets to represent, e.g. 2048-bit crypto keys.)
Payload – ≥ 0 octets – the octets of the integer value in most-significant-first order.
To generate such a representation for an integer from XSLT/XPath, the following code might be used:
<!-- Returns the significant octets of $value, most significant first; () for zero -->
<xsl:function name="asn:int-octets" as="xs:integer*">
  <xsl:param name="value" as="xs:integer"/>
  <xsl:sequence select="if($value ne 0) then (asn:int-octets($value idiv 256),$value mod 256) else ()"/>
</xsl:function>
<!-- Encodes $int as type octet (2), length octet(s) and payload octets -->
<xsl:function name="asn:encode-ASN-integer" as="xs:base64Binary">
  <xsl:param name="int" as="xs:integer"/>
  <xsl:variable name="octets" select="asn:int-octets($int)"/>
  <xsl:variable name="length-octets" select="let $l := count($octets) return (if($l le 127) then $l else (let $lo := asn:int-octets($l) return (128+count($lo),$lo)))"/>
  <xsl:sequence select="bin:from-octets((2,$length-octets,$octets))"/>
</xsl:function>
The function asn:int-octets returns a sequence of all the 'significant' octets of the integer (i.e. eliminating leading 'zeroes') in most-significant-first order. Examples of the encoding are:
asn:encode-ASN-integer(0) → "AgA="
asn:encode-ASN-integer(1234) → "AgIE0g=="
asn:encode-ASN-integer(123456789123456789123456789123456789) → "Ag8XxuPAMviQRa10ZoQEXxU="
asn:encode-ASN-integer(123456789.. 900 digits... 123456789) → "AoIBdgaTo....EBF8V"
The first example requires no octets to encode zero, hence its octets are 2,0. Both the second and third examples can be represented in fewer than 128 octets (2 and 15 respectively), so the length is encoded as a single octet. The first four octets of the result for the last example, which encodes a 900-digit integer, are 2,130,1,118, indicating that the length required 130 - 128 = 2 octets to encode and that the data is represented by 1 * 256 + 118 = 374 octets.
Decoding is a matter of compound use of the integer decoding function:
<xsl:function name="asn:decode-ASN-integer" as="xs:integer">
  <xsl:param name="in" as="xs:base64Binary"/>
  <xsl:sequence select="let $lo := bin:unpack-unsigned-integer($in,1,1,'BE')
    return (if($lo le 127)
            then bin:unpack-unsigned-integer($in,2,$lo,'BE')
            else (let $lo2 := $lo - 128,
                      $lo3 := bin:unpack-unsigned-integer($in,2,$lo2,'BE')
                  return bin:unpack-unsigned-integer($in,2+$lo2,$lo3,'BE')))"/>
</xsl:function>
(all numbers in ASN are 'big-endian') and the examples from above reverse:
asn:decode-ASN-integer(xs:base64Binary("AgA=")) → 0
asn:decode-ASN-integer(xs:base64Binary("AgIE0g==")) → 1234
asn:decode-ASN-integer(xs:base64Binary("Ag8XxuPAMviQRa10ZoQEXxU=")) → 123456789123456789123456789123456789
asn:decode-ASN-integer(xs:base64Binary("AoIBdgaTo....EBF8V")) → 123456789.. 900 digits... 123456789
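The encoding and decoding steps can also be cross-checked with a small model outside XSLT. The following Python sketch is an illustration only; the function names mirror the asn: functions above but are otherwise hypothetical, and binary values are exchanged as base64 strings.

```python
import base64

def int_octets(value: int) -> list[int]:
    """Significant octets of a non-negative integer, most significant first."""
    octets: list[int] = []
    while value != 0:
        octets.insert(0, value % 256)
        value //= 256
    return octets

def encode_asn_integer(value: int) -> str:
    octets = int_octets(value)
    n = len(octets)
    # Short form: one length octet. Long form: 128 + count, then the length octets.
    length_octets = [n] if n <= 127 else [128 + len(int_octets(n)), *int_octets(n)]
    return base64.b64encode(bytes([2, *length_octets, *octets])).decode("ascii")

def decode_asn_integer(b64_text: str) -> int:
    data = base64.b64decode(b64_text)
    lo = data[1]                       # first length octet
    if lo <= 127:
        return int.from_bytes(data[2:2 + lo], "big")
    n_len = lo - 128                   # number of octets holding the length
    size = int.from_bytes(data[2:2 + n_len], "big")
    return int.from_bytes(data[2 + n_len:2 + n_len + size], "big")

print(encode_asn_integer(1234))        # AgIE0g==
print(decode_asn_integer("AgIE0g=="))  # 1234
```

For the 900-digit example, the model produces the length octets 130, 1, 118, that is, a payload of 374 octets.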
The [XQuery and XPath Functions and Operators 4.0] function fn:binary-resource has been added to the list of useful functions.
This module defines no specific functions for reading and writing binary data from resources, but other specifications provide some suitable mechanisms.
[XQuery and XPath Functions and Operators 4.0] provides a function to retrieve binary resources:
fn:binary-resource($filesource)
The EXPath File Module [EXPath File 4.0] provides three functions suitable for use in file-based situations:
file:read-binary($file, $offset := 0, $size := ())
which reads binary data from an existing file, with an optional offset and size.
file:write-binary($file, $value)
which writes binary data into a new or existing file.
file:append-binary($file, $value)
which appends binary data onto the end of an extant file.
Users of the package may need to define binary 'constants' within their code or examine the basic octets. The following functions support these:
Returns the binary form of the set of octets written as a sequence of (ASCII) hex digits ([0-9A-Fa-f]).
bin:hex($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
$in will be effectively zero-padded from the left to generate an integral number of octets, i.e. an even number of hexadecimal digits.
Byte order in the result follows (per-octet) character order in the string.
If $in is an empty string, then the result will be an xs:base64Binary with no embedded data.
If $in is the empty sequence, the function returns an empty sequence.
[bin:non-numeric-character] is raised if $in cannot be parsed as a hexadecimal number.
When the input string has an even number of characters, this function behaves similarly to the double cast xs:base64Binary(xs:hexBinary($string)).
bin:hex('11223F4E') → "ESI/Tg=="
bin:hex('1223F4E') → "ASI/Tg=="
Returns the binary form of the set of octets written as a sequence of (8-wise) (ASCII) binary digits ([01]).
bin:bin($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
$in will be effectively zero-padded from the left to generate an integral number of octets (i.e. the number of characters in $in will be a multiple of 8).
Byte order in the result follows (per-octet) character order in the string.
If $in is an empty string, then the result will be an xs:base64Binary with no embedded data.
If $in is the empty sequence, the function returns an empty sequence.
[bin:non-numeric-character] is raised if $in cannot be parsed as a binary number.
bin:bin('1101000111010101') → "0dU="
bin:bin('1000111010101') → "EdU="
Returns the binary form of the set of octets written as a sequence of (ASCII) octal digits ([0-7]).
bin:octal($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
$in will be effectively zero-padded from the left to generate an integral number of octets.
Byte order in the result follows (per-octet) character order in the string.
If $in is an empty string, then the result will be an xs:base64Binary with no embedded data.
If $in is the empty sequence, the function returns an empty sequence.
[bin:non-numeric-character] is raised if $in cannot be parsed as an octal number.
bin:octal('11223047') → "JSYn"
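The left zero-padding shared by bin:hex, bin:bin and bin:octal can be modelled as follows. This Python sketch is an illustration under one assumed reading of the padding rule (each digit contributes 4, 1 or 3 bits and the total is rounded up to whole octets); the function name binary_constant is hypothetical, and results are shown as base64 strings.

```python
import base64
import math

_BITS_PER_DIGIT = {"hex": 4, "octal": 3, "bin": 1}
_DIGITS = {"hex": "0123456789abcdefABCDEF", "octal": "01234567", "bin": "01"}

def binary_constant(kind: str, text: str) -> str:
    """Model of bin:hex / bin:octal / bin:bin for a non-empty-sequence argument."""
    if any(ch not in _DIGITS[kind] for ch in text):
        raise ValueError("bin:non-numeric-character")
    bits = _BITS_PER_DIGIT[kind]
    # Round the total bit count up to a whole number of octets (left padding).
    n_octets = math.ceil(len(text) * bits / 8)
    value = int(text, 2 ** bits) if text else 0
    return base64.b64encode(value.to_bytes(n_octets, "big")).decode("ascii")

print(binary_constant("hex", "1223F4E"))        # ASI/Tg==
print(binary_constant("bin", "1000111010101"))  # EdU=
print(binary_constant("octal", "11223047"))     # JSYn
```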
Returns binary data as a sequence of integer octets.
bin:to-octets($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
If $in is zero-length binary data then the empty sequence is returned.
Octets are returned as integers from 0 to 255.
Converts a sequence of octets into binary data.
bin:from-octets($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Octets are integers from 0 to 255.
If $in is the empty sequence, the function returns zero-sized binary data.
[bin:octet-out-of-range] is raised if one of the octets lies outside the range 0 – 255.
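The round trip between binary data and octet sequences can be sketched in Python as follows. This is an illustration only; the function names are hypothetical stand-ins for bin:to-octets and bin:from-octets, binary values are exchanged as base64 strings, and the error code is carried in a Python ValueError.

```python
import base64

def to_octets(b64_text: str) -> list[int]:
    """Octets of the binary value as integers 0..255; [] for zero-length data."""
    return list(base64.b64decode(b64_text, validate=True))

def from_octets(octets: list[int]) -> str:
    """Binary value built from integer octets; zero-sized data for []."""
    if any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("bin:octet-out-of-range")
    return base64.b64encode(bytes(octets)).decode("ascii")

print(to_octets("ESI/Tg=="))          # [17, 34, 63, 78]
print(from_octets([17, 34, 63, 78]))  # ESI/Tg==
```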
Changes in 4.0 ⬆
The function find-all in the example for bin:find has been moved into a differing namespace prefix (f:) to avoid suggesting that it is part of the supported function set.
Returns the size of binary data, measured in octets.
bin:length($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Returns the size of binary data in octets.
The effect of the function is equivalent to the result of the following XPath expression.
count(bin:to-octets($in))
Returns a specified part of binary data.
bin:part($in, $offset, $size := ())
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Returns a section of binary data starting at the $offset octet. If $size is defined, the size of the returned binary data is $size octets. If $size is absent, all remaining data from $offset is returned.
The $offset is zero based.
The values of $offset and $size must be non-negative integers.
It is a dynamic error if $offset + $size is larger than the size of the binary data in $in.
If the value of $in is the empty sequence, the function returns an empty sequence.
[bin:index-out-of-range] is raised if $offset is negative or $offset + $size is larger than the size of the binary data of $in.
[bin:negative-size] is raised if $size is negative.
Note that fn:subsequence() and fn:substring() both use xs:double for offset and size; this is a legacy from XPath 1.0.
Testing whether $data starts with the octets 0x25 0x50 0x44 0x46, the US-ASCII encoding of the '%PDF' signature that begins a PDF file:
bin:part($data, 0, 4) eq bin:hex("25504446")
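The offset and size rules for bin:part can be modelled with byte slicing. The following Python sketch is an illustration only; the function name part is hypothetical, error codes are carried in Python exceptions, and rejecting an $offset beyond the end of the data when $size is absent is an assumption.

```python
from typing import Optional

def part(data: bytes, offset: int, size: Optional[int] = None) -> bytes:
    """Model of bin:part with a zero-based offset and an optional size."""
    if size is not None and size < 0:
        raise ValueError("bin:negative-size")
    end = len(data) if size is None else offset + size
    if offset < 0 or offset > len(data) or end > len(data):
        raise IndexError("bin:index-out-of-range")
    return data[offset:end]

# The PDF signature test from the example above.
print(part(b"%PDF-1.7", 0, 4) == bytes.fromhex("25504446"))  # True
```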
Returns the binary data created by concatenating the binary data items in a sequence.
bin:join($in)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
The function returns an xs:base64Binary created by concatenating the items in the sequence $in, in order.
If the value of $in is the empty sequence, the function returns a binary item containing no data bytes.
The effect of the function is equivalent to the result of the following XPath expression.
bin:from-octets($in ! bin:to-octets(.))
Inserts additional binary data at a given point in other binary data.
bin:insert-before($in, $offset, $extra)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Returns binary data consisting sequentially of the data from $in up to and including the $offset - 1 octet, followed by all the data from $extra, and then the remaining data from $in.
The $offset is zero based.
The value of $offset must be a non-negative integer.
If the value of $in is the empty sequence, the function returns an empty sequence.
If the value of $extra is the empty sequence, the function returns $in.
If $offset eq 0 the result is the binary concatenation of $extra and $in, i.e. equivalent to bin:join(($extra,$in)).
[bin:index-out-of-range] is raised if $offset is negative or $offset is larger than the size of the binary data of $in.
Note that when $offset gt 0 and $offset lt bin:length($in) the function is equivalent to:
bin:join((bin:part($in,0,$offset),$extra,bin:part($in,$offset)))
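The insertion rule can be checked against a small model. In this Python sketch (an illustration only; the function name insert_before is hypothetical), the leading slice holds exactly offset octets, i.e. the octets at positions 0 to offset - 1.

```python
def insert_before(data: bytes, offset: int, extra: bytes) -> bytes:
    """Model of bin:insert-before with a zero-based offset."""
    if offset < 0 or offset > len(data):
        raise IndexError("bin:index-out-of-range")
    # data[:offset] is the first `offset` octets; data[offset:] is the remainder.
    return data[:offset] + extra + data[offset:]

print(insert_before(b"abcd", 1, b"XY"))  # b'aXYbcd'
```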
Returns the binary data created by padding $in with $size octets from the left. The padding octet values are $octet or zero if omitted.
bin:pad-left($in, $size, $octet := 0)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
The function returns an xs:base64Binary created by padding the input with $size octets in front of the input. If $octet is specified, the padding octets each have that value, otherwise they are initialized to 0.
$size must be a non-negative integer.
If the value of $in is the empty sequence, the function returns an empty sequence.
The effect of the function is equivalent to the result of the following XPath expression, except in error cases.
bin:join((bin:from-octets((1 to $size) ! $octet), $in))
[bin:negative-size] is raised if $size is negative.
[bin:octet-out-of-range] is raised if $octet lies outside the range 0 – 255.
Returns the binary data created by padding $in with $size blank octets from the right. The padding octet values are $octet or zero if omitted.
bin:pad-right($in, $size, $octet := 0)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
The function returns an xs:base64Binary created by padding the input with $size blank octets after the input. If $octet is specified, the padding octets each have that value, otherwise they are initialized to 0.
$size must be a non-negative integer.
If the value of $in is the empty sequence, the function returns an empty sequence.
The effect of the function is equivalent to the result of the following XPath expression, except in error cases.
bin:join(($in, bin:from-octets((1 to $size) ! $octet)))
[bin:negative-size] is raised if $size is negative.
[bin:octet-out-of-range] is raised if $octet lies outside the range 0 – 255.
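Both padding functions can be modelled in a few lines. The following Python sketch is an illustration only; the function names pad_left and pad_right are hypothetical, and the error codes are carried in a Python ValueError.

```python
def _padding(size: int, octet: int) -> bytes:
    if size < 0:
        raise ValueError("bin:negative-size")
    if not 0 <= octet <= 255:
        raise ValueError("bin:octet-out-of-range")
    return bytes([octet]) * size

def pad_left(data: bytes, size: int, octet: int = 0) -> bytes:
    """Padding octets first, then the input."""
    return _padding(size, octet) + data

def pad_right(data: bytes, size: int, octet: int = 0) -> bytes:
    """The input first, then the padding octets."""
    return data + _padding(size, octet)

print(pad_left(b"\x01\x02", 2))         # b'\x00\x00\x01\x02'
print(pad_right(b"\x01\x02", 2, 0xFF))  # b'\x01\x02\xff\xff'
```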
Returns the first location in $in of $search, starting at the $offset octet.
bin:find($in, $offset, $search)
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
The function returns the first location of the binary search sequence in the input, or if not found, the empty sequence.
If $search is empty, $offset is returned.
The value of $offset must be a non-negative integer.
The $offset is zero based.
The returned location is zero based.
If the value of $in is the empty sequence, the function returns an empty sequence.
[bin:index-out-of-range] is raised if $offset is negative or $offset is larger than the size of the binary data of $in.
Finding all the matches can be accomplished with simple recursive application:
<xsl:function name="f:find-all" as="xs:integer*">
  <xsl:param name="data" as="xs:base64Binary?"/>
  <xsl:param name="offset" as="xs:integer"/>
  <xsl:param name="pattern" as="xs:base64Binary"/>
  <!-- exists() is used because a match at offset 0 has a false effective boolean value -->
  <xsl:sequence select="if(bin:length($pattern) = 0) then ()
    else let $found := bin:find($data,$offset,$pattern)
         return if(exists($found))
                then ($found, if($found + 1 lt bin:length($data)) then f:find-all($data,$found + 1,$pattern) else ())
                else ()"/>
</xsl:function>
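The search semantics, including the treatment of a match at offset 0 and of an empty search value, can be modelled as follows. This Python sketch is an illustration only; find and find_all are hypothetical counterparts of bin:find and f:find-all, and None stands in for the empty sequence.

```python
from typing import Optional

def find(data: bytes, offset: int, search: bytes) -> Optional[int]:
    """Zero-based location of the first match at or after offset, else None."""
    if offset < 0 or offset > len(data):
        raise IndexError("bin:index-out-of-range")
    pos = data.find(search, offset)  # an empty search value yields offset itself
    return None if pos == -1 else pos

def find_all(data: bytes, offset: int, pattern: bytes) -> list[int]:
    """All match locations, overlapping matches included."""
    if len(pattern) == 0:
        return []
    found: list[int] = []
    pos = find(data, offset, pattern)
    while pos is not None:           # 0 is a valid location, so test against None
        found.append(pos)
        if pos + 1 >= len(data):
            break
        pos = find(data, pos + 1, pattern)
    return found

print(find_all(b"\xff\xc0ab\xff\xc0", 0, b"\xff\xc0"))  # [0, 4]
```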
Decodes binary data as a string in a given encoding.
bin:decode-string($in, $encoding := 'utf-8', $offset := 0, $size := ())
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
If $offset and $size are provided, the $size octets from $offset are decoded. If $offset alone is provided, octets from $offset to the end are decoded, otherwise the entire octet sequence is used.
The $encoding argument is the name of an encoding. The values for this argument follow the same rules as for the encoding attribute in an XML declaration. The only values which every implementation is required to recognize are utf-8 and utf-16.
If $encoding is omitted, utf-8 encoding is assumed.
The values of $offset and $size must be non-negative integers.
If the value of $in is the empty sequence, the function returns an empty sequence.
$offset is zero based.
[bin:index-out-of-range] is raised if $offset is negative or $offset + $size is larger than the size of the binary data of $in.
[bin:negative-size] is raised if $size is negative.
[bin:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[bin:conversion-error] is raised if there is an error or malformed input while decoding the string. Additional information about the error may be passed through suitable error reporting mechanisms; this is implementation-dependent.
Testing whether $data starts with the characters '%PDF':
bin:decode-string($data, 'UTF-8', 0, 4) eq '%PDF'
The first four characters of a PDF file are '%PDF'.
Encodes a string into binary data using a given encoding.
bin:encode-string($in, $encoding := 'utf-8')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
The $encoding argument is the name of an encoding. The values for this argument follow the same rules as for the encoding attribute in an XML declaration. The only values which every implementation is required to recognize are utf-8 and utf-16.
If $encoding is omitted, utf-8 encoding is assumed.
If the value of $in is the empty sequence, the function returns an empty sequence.
[bin:unknown-encoding] is raised if $encoding is invalid or not supported by the implementation.
[bin:conversion-error] is raised if there is an error or malformed input while encoding the string. Additional information about the error may be passed through suitable error reporting mechanisms; this is implementation-dependent.
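String decoding and encoding can be modelled with Python's codec machinery. The sketch below is an illustration only; the function names are hypothetical, the set of encodings recognized is whatever the Python installation provides, and the error codes are carried in Python exceptions.

```python
import codecs
from typing import Optional

def _check_encoding(encoding: str) -> None:
    try:
        codecs.lookup(encoding)
    except LookupError as exc:
        raise ValueError("bin:unknown-encoding") from exc

def decode_string(data: bytes, encoding: str = "utf-8",
                  offset: int = 0, size: Optional[int] = None) -> str:
    if size is not None and size < 0:
        raise ValueError("bin:negative-size")
    end = len(data) if size is None else offset + size
    if offset < 0 or offset > len(data) or end > len(data):
        raise IndexError("bin:index-out-of-range")
    _check_encoding(encoding)
    try:
        return data[offset:end].decode(encoding)
    except UnicodeDecodeError as exc:
        raise ValueError("bin:conversion-error") from exc

def encode_string(text: str, encoding: str = "utf-8") -> bytes:
    _check_encoding(encoding)
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        raise ValueError("bin:conversion-error") from exc

print(decode_string(b"%PDF-1.7\n", "UTF-8", 0, 4))  # %PDF
```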
Packing and unpacking numeric values within binary data can be performed in 'most-significant-first' ('big-endian') or 'least-significant-first' ('little-endian') octet order. The default is 'most-significant-first'. The relevant functions have an optional parameter $octet-order whose string value controls the order. Least-significant-first order is indicated by any of the values least-significant-first, little-endian or LE. Most-significant-first order is indicated by any of the values most-significant-first, big-endian or BE.
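The mapping from $octet-order values to a byte order can be sketched as a lookup table. The following Python fragment is an illustration only; the function name octet_order is hypothetical, and the result is the "big" or "little" string that Python's own integer conversions accept.

```python
_ORDERS = {
    "most-significant-first": "big",
    "big-endian": "big",
    "BE": "big",
    "least-significant-first": "little",
    "little-endian": "little",
    "LE": "little",
}

def octet_order(name: str = "most-significant-first") -> str:
    """Translate an $octet-order value; unknown values are rejected."""
    try:
        return _ORDERS[name]
    except KeyError as exc:
        raise ValueError("bin:unknown-significance-order") from exc

print((0x0102).to_bytes(2, octet_order("BE")))  # b'\x01\x02'
print((0x0102).to_bytes(2, octet_order("LE")))  # b'\x02\x01'
```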
Integers within binary data are represented, or assumed to be represented, as an integral number of octets. Integers where $size is greater than 8 octets (and thus not representable as a long) might be expected in some situations, e.g. encryption. Whether the range of integers is limited to ±2^63 is implementation-dependentFO40.
Care should be taken with the packing and unpacking of floating point numbers (xs:float and xs:double). The binary representations are expected to correspond with those of the IEEE single/double-precision 32/64-bit floating point types [IEEE 754-1985]. Consequently they will occupy 4 or 8 octets when packed.
Positive and negative infinities are supported. INF maps to 0x7f80 0000 (float), 0x7ff0 0000 0000 0000 (double). -INF maps to 0xff80 0000 (float), 0xfff0 0000 0000 0000 (double).
Negative zero (0x8000 0000 0000 0000 double, 0x8000 0000 float) encountered during unpacking will yield negative zero forms (e.g. -xs:double(0.0)) and negative zeros will be written as a result of packing.
[XML Schema Part 2: Datatypes Second Edition] provides only one form of NaN, which corresponds to a 'quiet' NaN with zero payload of [IEEE 754-1985] with forms 0x7fc0 0000 (float), 0x7ff8 0000 0000 0000 (double). These are the bit forms that will be packed.
'Signalling' NaN values (0x7f80 0001 → 0x7fbf ffff or 0xff80 0001 → 0xffbf ffff, 0x7ff0 0000 0000 0001 → 0x7ff7 ffff ffff ffff or 0xfff0 0000 0000 0001 → 0xfff7 ffff ffff ffff) encountered during unpacking will be replaced by 'quiet' NaN. Any low-order payload in an unpacked 'quiet' NaN is also zeroed.
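The bit patterns quoted above can be reproduced with Python's struct module, which also uses the IEEE 754 binary32 and binary64 layouts. This sketch is an illustration only; the function names are hypothetical, and Python does not normalise NaN payloads in the way this specification requires, so it is not a complete model of unpacking.

```python
import math
import struct

def pack_float(value: float, big_endian: bool = True) -> bytes:
    """4-octet IEEE 754 single-precision form."""
    return struct.pack(">f" if big_endian else "<f", value)

def pack_double(value: float, big_endian: bool = True) -> bytes:
    """8-octet IEEE 754 double-precision form."""
    return struct.pack(">d" if big_endian else "<d", value)

def unpack_float(data: bytes, offset: int = 0, big_endian: bool = True) -> float:
    return struct.unpack_from(">f" if big_endian else "<f", data, offset)[0]

def unpack_double(data: bytes, offset: int = 0, big_endian: bool = True) -> float:
    return struct.unpack_from(">d" if big_endian else "<d", data, offset)[0]

print(pack_float(math.inf).hex())    # 7f800000
print(pack_double(-math.inf).hex())  # fff0000000000000
print(pack_double(-0.0).hex())       # 8000000000000000
```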
Returns the 8-octet binary representation of a double value.
bin:pack-double($in, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The binary representation will correspond with that of the IEEE double-precision 64-bit floating point type [IEEE 754-1985]. For more details see 7.1.3 Representation of floating point numbers.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
Returns the 4-octet binary representation of a float value.
bin:pack-float($in, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The binary representation will correspond with that of the IEEE single-precision 32-bit floating point type [IEEE 754-1985]. For more details see 7.1.3 Representation of floating point numbers.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
Returns the two's-complement binary representation of an integer value treated as $size octets long. Any 'excess' high-order bits are discarded.
bin:pack-integer($in, $size, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
Specifying a $size of zero yields empty binary data.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
[bin:negative-size] is raised if $size is negative.
If the integer being packed has a maximum precision of $size octets, then signed/unsigned versions are not necessary. If the data is considered unsigned, then the most significant bit of the bottom $size octets has a normal positive (2^(8 * $size - 1)) meaning. If it is considered to be a signed value, then the MSB and all the higher order, discarded bits will be '1' for a negative value and '0' for a positive or zero. If this function were to check the 'sizing' of the supplied integer against the packing size, then any values of MSB and the discarded higher order bits other than 'all 1' or 'all 0' would constitute an error. This function does not perform such checking.
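The truncating two's-complement behaviour can be modelled by masking the integer to the requested width. The following Python sketch is an illustration only; the function name pack_integer is hypothetical, and the byte order is given as Python's own "big" or "little" rather than an $octet-order value.

```python
def pack_integer(value: int, size: int, byteorder: str = "big") -> bytes:
    """Two's-complement form of value in size octets; excess high bits dropped."""
    if size < 0:
        raise ValueError("bin:negative-size")
    mask = (1 << (8 * size)) - 1   # keeps only the bottom 8 * size bits
    return (value & mask).to_bytes(size, byteorder)

print(pack_integer(1234, 2))      # b'\x04\xd2'
print(pack_integer(-1, 2))        # b'\xff\xff'
print(pack_integer(0x12345, 2))   # b'#E', i.e. octets 0x23 0x45
```

No range check is made, in line with the note above: 0x12345 loses its top bits without an error.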
Extracts the double value stored at a particular offset in binary data.
bin:unpack-double($in, $offset, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Extracts the double value stored in the 8 successive octets from the $offset octet of the binary data of $in.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The value of $offset must be a non-negative integer.
The $offset is zero based.
The binary representation is expected to correspond with that of the IEEE double-precision 64-bit floating point type [IEEE 754-1985]. For more details see 7.1.3 Representation of floating point numbers.
[bin:index-out-of-range] is raised if $offset is negative or $offset + 8 (octet-length of xs:double) is larger than the size of the binary data of $in.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
Extracts the float value stored at a particular offset in binary data.
bin:unpack-float($in, $offset, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Extracts the float value stored in the 4 successive octets from the $offset octet of the binary data of $in.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The value of $offset must be a non-negative integer.
The $offset is zero based.
The binary representation is expected to correspond with that of the IEEE single-precision 32-bit floating point type [IEEE 754-1985]. For more details see 7.1.3 Representation of floating point numbers.
[bin:index-out-of-range] is raised if $offset is negative or $offset + 4 (octet-length of xs:float) is larger than the size of the binary data of $in.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
Returns a signed integer value represented by the $size octets starting from $offset in the input binary representation. Necessary sign extension is performed (i.e. the result is negative if the high order bit is '1').
bin:unpack-integer($in, $offset, $size, $octet-order := 'most-significant-first')
This function is deterministicFO40, context-independentFO40, and focus-independentFO40.
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The values of $offset and $size must be non-negative integers.
$offset is zero based.
Specifying a $size of zero yields the integer 0.
[bin:index-out-of-range] is raised if $offset is negative or $offset + $size is larger than the size of the binary data of $in.
[bin:negative-size] is raised if $size is negative.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
For discussion on integer range see 7.1.2 Integer representation.
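The sign extension described above can be sketched in Python as follows. This is a hedged model, not an implementation of the specification: the name `unpack_integer`, the exception types, and the 'least-significant-first' value are assumptions, and the integer range limits discussed in 7.1.2 are ignored because Python integers are unbounded.

```python
def unpack_integer(data: bytes, offset: int, size: int,
                   octet_order: str = "most-significant-first") -> int:
    """Model of bin:unpack-integer: signed, sign-extended, zero-based offset."""
    if size < 0:
        raise ValueError("bin:negative-size")
    if offset < 0 or offset + size > len(data):
        raise IndexError("bin:index-out-of-range")
    octets = data[offset:offset + size]
    if octet_order == "least-significant-first":  # assumed alternative value
        octets = octets[::-1]
    elif octet_order != "most-significant-first":
        raise ValueError("bin:unknown-significance-order")
    value = 0
    for octet in octets:  # accumulate, most significant octet first
        value = (value << 8) | octet
    # Sign extension: a set high-order bit means the value is negative.
    if size > 0 and octets[0] & 0x80:
        value -= 1 << (8 * size)
    return value

print(unpack_integer(bytes.fromhex("00FFFE"), 1, 2))  # -2
```

With a $size of zero the loop never runs, so the sketch returns 0, matching the rule stated above.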
Returns an unsigned integer value represented by the $size octets starting from $offset in the input binary representation.
bin:unpack-unsigned-integer(
  $in as …,
  $offset as …,
  $size as …,
  $octet-order as … := 'most-significant-first'
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
Most-significant-octet-first number representation is assumed unless the $octet-order parameter is specified. Acceptable values for $octet-order are described in 7.1.1 Number 'endianness'.
The values of $offset and $size must be non-negative integers.
The $offset is zero based.
Specifying a $size of zero yields the integer 0.
[bin:index-out-of-range] is raised if $offset is negative or $offset + $size is larger than the size of the binary data of $in.
[bin:negative-size] is raised if $size is negative.
[bin:unknown-significance-order] is raised if the value of $octet-order is unrecognized.
For discussion on integer range see 7.1.2 Integer representation.
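For comparison with the signed case, a minimal Python model of the unsigned behaviour can lean on the built-in `int.from_bytes`. The name `unpack_unsigned_integer`, the exception types, and the mapping from $octet-order values to Python byte-order names are assumptions of this sketch rather than anything defined here.

```python
def unpack_unsigned_integer(data: bytes, offset: int, size: int,
                            octet_order: str = "most-significant-first") -> int:
    """Model of bin:unpack-unsigned-integer using int.from_bytes."""
    # Assumed mapping from $octet-order values to Python's byteorder names;
    # 7.1.1 lists the values actually accepted.
    orders = {"most-significant-first": "big",
              "least-significant-first": "little"}
    if octet_order not in orders:
        raise ValueError("bin:unknown-significance-order")
    if size < 0:
        raise ValueError("bin:negative-size")
    if offset < 0 or offset + size > len(data):
        raise IndexError("bin:index-out-of-range")
    # An empty slice (size 0) converts to 0, matching the rule above.
    return int.from_bytes(data[offset:offset + size], orders[octet_order],
                          signed=False)

print(unpack_unsigned_integer(bytes.fromhex("FFFE"), 0, 2))  # 65534
```

The same two octets, FF FE, yield -2 under the signed interpretation and 65534 here, which is the whole difference between the two functions.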
Returns the "bitwise or" of two binary arguments.
bin:or(
  $a as …,
  $b as …
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
Returns "bitwise or" applied between $a and $b.
If either argument is the empty sequence, an empty sequence is returned.
[bin:differing-length-arguments] is raised if the input arguments are of differing length.
Returns the "bitwise xor" of two binary arguments.
bin:xor(
  $a as …,
  $b as …
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
Returns "bitwise exclusive or" applied between $a and $b.
If either argument is the empty sequence, an empty sequence is returned.
[bin:differing-length-arguments] is raised if the input arguments are of differing length.
Returns the "bitwise and" of two binary arguments.
bin:and(
  $a as …,
  $b as …
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
Returns "bitwise and" applied between $a and $b.
If either argument is the empty sequence, an empty sequence is returned.
[bin:differing-length-arguments] is raised if the input arguments are of differing length.
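The three two-argument bitwise functions share the same shape: an empty sequence in gives an empty sequence out, the lengths must match, and the operation is applied octet by octet. A rough Python model, in which the helper name `bitwise`, the use of `None` for the empty sequence, and the exception type are all assumptions of the sketch:

```python
import operator

def bitwise(op, a, b):
    """Model of bin:or / bin:xor / bin:and over two octet strings."""
    if a is None or b is None:  # None stands in for the empty sequence
        return None
    if len(a) != len(b):
        raise ValueError("bin:differing-length-arguments")
    return bytes(op(x, y) for x, y in zip(a, b))

a, b = bytes.fromhex("F0F0"), bytes.fromhex("3C3C")
print(bitwise(operator.or_, a, b).hex())   # fcfc
print(bitwise(operator.xor, a, b).hex())   # cccc
print(bitwise(operator.and_, a, b).hex())  # 3030
```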
Returns the "bitwise not" of a binary argument.
bin:not(
  $in as …
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
Returns "bitwise not" applied to $in.
If the argument is the empty sequence, an empty sequence is returned.
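As a small illustration, "bitwise not" amounts to inverting every bit of every octet, which leaves the length unchanged. The Python name `bitwise_not` and the use of `None` for the empty sequence are assumptions of this sketch.

```python
def bitwise_not(data):
    """Model of bin:not: invert every bit of every octet."""
    if data is None:  # None stands in for the empty sequence
        return None
    return bytes(octet ^ 0xFF for octet in data)

print(bitwise_not(bytes.fromhex("00FF0F")).hex())  # ff00f0
```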
Shifts bits in binary data.
bin:shift(
  $in as …,
  $by as …
) as …
This function is deterministic, context-independent, and focus-independent, as those terms are defined in [XQuery and XPath Functions and Operators 4.0].
If $by is positive then bits are shifted $by times to the left.
If $by is negative then bits are shifted -$by times to the right.
If $by is zero, the result is identical to $in.
If |$by| is greater than the bit-length of $in then an all-zeros result, of the same length as $in, is returned.
|$by| can be greater than 8, implying multi-byte shifts.
The result always has the same size as $in.
The shifting is logical: zeros are placed into the vacated bits.
If the value of $in is the empty sequence, the function returns an empty sequence.
Bit shifting across byte boundaries implies 'big-endian' treatment, i.e. the leftmost (high-order) bit when shifted left becomes the low-order bit of the preceding byte.
Expression: | bin:shift(bin:hex("000001"), 17) |
---|---|
Result: | bin:hex("020000") |
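The shifting rules can be checked against a small Python model that treats the whole octet string as one big-endian number. The function name `shift` and the use of `None` for the empty sequence are assumptions of this sketch, which is meant only to reproduce the example above and is not a normative implementation.

```python
def shift(data, by):
    """Model of bin:shift: logical shift over the whole octet string."""
    if data is None:  # None stands in for the empty sequence
        return None
    bits = 8 * len(data)
    # Big-endian view, so bits cross octet boundaries as described above.
    value = int.from_bytes(data, "big")
    if by >= 0:
        # Left shift; the mask drops bits pushed past the high-order end.
        value = (value << by) & ((1 << bits) - 1)
    else:
        # Right shift; zeros enter at the high-order end.
        value >>= -by
    return value.to_bytes(len(data), "big")

print(shift(bytes.fromhex("000001"), 17).hex())   # 020000
print(shift(bytes.fromhex("020000"), -17).hex())  # 000001
print(shift(bytes.fromhex("FF"), 9).hex())        # 00
```

The last call shows the all-zeros result when |$by| exceeds the bit-length of the input; the result keeps the input's length in every case.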
The error text provided with these errors is non-normative.
Error in converting to/from a string.
The two arguments to a bitwise operation are of differing lengths.
Attempting to retrieve data outside the meaningful range of a binary data type.
Size of binary portion, required numeric size or padding is negative.
Wrong character in binary 'numeric constructor' string.
Attempting to pack binary value with octet outside range 0-255.
The specified encoding is not supported.
Unknown octet-order value.
This Appendix describes some sources of functions or operators that fall outside the scope of the function library defined in this specification. It includes both function specifications and function implementations. Inclusion of a function in this appendix does not constitute any kind of recommendation or endorsement; neither is omission from this appendix to be construed negatively. This Appendix does not attempt to give any information about licensing arrangements for these function specifications or implementations.
A number of W3C Recommendations make use of XPath, and in some cases such Recommendations define additional functions to be made available when XPath is used in a specific host language.
Of particular interest to this specification, [XQuery and XPath Functions and Operators 4.0] defines:
Comparison operators on xs:hexBinary and xs:base64Binary values, defining the semantics of the eq, ne, lt and ge operators applied to binary data. Each returns a boolean value.
A function to retrieve the value of a binary resource.
Function name | Availability | Notes |
---|---|---|
Section 11.1.1 op:binary-equal (FO40) | XPath4.0+ | Returns true if both binary values contain the same octet sequence. |
Section 11.1.2 op:binary-less-than (FO40) | XPath4.0+ | Returns true if the first argument is less than the second. |
fn:binary-resource (FO40) | XPath4.0+ | Returns a resource as xs:base64Binary. |
Of particular interest to this specification, [EXPath File 4.0] defines the following functions for input and output of xs:base64Binary values:
Function name | Availability | Notes |
---|---|---|
Section 4.5 file:read-binary (FILE40) | XPath4.0+ | Returns the content of a file in its Base64 representation. |
Section 4.9 file:write-binary (FILE40) | XPath4.0+ | Writes a Base64 item as binary data to a file. |
Section 4.2 file:append-binary (FILE40) | XPath4.0+ | Appends a Base64 item as binary data to a file. |
Binary 'data' arguments to the functions are now declared to be either xs:hexBinary or xs:base64Binary, but binary function results remain of type xs:base64Binary. This should not cause any backward incompatibilities, as casting back and forth between the two representations has been possible since at least XPath 2.0.
See 1.2 Binary type
The functions in this example have been moved into a different namespace prefix (asn:) to avoid suggesting that they are part of the supported function set.
See 2.2 Example – reading and writing variable length ASN.1 integers
The [XQuery and XPath Functions and Operators 4.0] function fn:binary-resource has been added to the list of useful functions.
The function find-all in the example for bin:find has been moved into a different namespace prefix (f:) to avoid suggesting that it is part of the supported function set.
The signatures of all the specified functions now use the 'optional argument' syntax of XPath 4.0 where appropriate, rather than giving several signatures of differing arity. Other than that, no change to the semantics of the functions is intended.
These changes are not highlighted in the change-marked version of the specification.
The example functions in 2.2 Example – reading and writing variable length ASN.1 integers have been moved into a different namespace prefix (asn:) to avoid suggesting that they are part of the supported function set. This is in accordance with the principle that the namespace http://expath.org/ns/binary is reserved solely for use in QNames for functions specified in this module.
This section summarizes the extent to which this specification is compatible with previous versions.
Version 4.0 of this function library is fully backwards compatible with version 1.0, except as noted below:
The use of optional arguments in the function signatures means that minor alterations to possible function calls, which would be invalid in 1.0, are now supported. For example:
bin:decode-string($string,'utf-8',0,())
would be invalid in 1.0, as the fourth argument $size is defined to be of type xs:integer. It is valid for 4.0 as the empty sequence denotes default behaviour, that is decoding all octets after $offset.
The functions bin:decode-string, bin:encode-string, bin:pack-double, bin:pack-float, bin:pack-integer, bin:pad-left, bin:pad-right, bin:part, bin:unpack-double, bin:unpack-float, bin:unpack-integer and bin:unpack-unsigned-integer all have similar incompatibilities.