binary-parser

An XSLT package for parsing binary data in XSLT.

Primary language: XSLT. License: MIT.

Description

This XSLT package provides functions that make it easy to parse binary data in XSLT. To use it, you need a processor that:

  • complies with XSLT 3.0
  • supports the EXPath File module
  • supports the EXPath Binary module

So far, the package has only been tested with Saxon PE 11.4.

Installation

Copy all files from the 'src' folder into your project. Then edit the Saxon configuration file so that Saxon can find the package: add a reference to the 'binary-parser' package in the <xsltPackages> element and set sourceLocation to match your project layout. Your Saxon configuration file may look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns="http://saxon.sf.net/ns/configuration"
    label="My super project using Binary Parser" edition="PE">
    
    <xsltPackages>
        <package name="binary-parser" version="1.0" sourceLocation="binary-parser/binaryParser.xsl"/>
    </xsltPackages>  
    
</configuration>

How to use

The binary-parser package provides several functions in the "urn://binary-parser" namespace.

Declare this namespace in your stylesheet and define its prefix as an extension prefix.

Your stylesheet version must be 3.0, because XSLT packages are an XSLT 3.0 feature.

Finally, add the <xsl:use-package> element to use the binary-parser package.

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:bp="urn://binary-parser"
    extension-element-prefixes="bp"
    version="3.0">
    
    <xsl:use-package name="binary-parser" package-version="1.0"/>
</xsl:stylesheet>

Initializing the parsing context

A parsing context can be created with:

  • bp:from-file($file as xs:string)
  • bp:from-data($data as xs:base64Binary)

<xsl:variable name="png-file" select="bp:from-file('simple.png')"/>

Parsing data

For parsing, binary-parser provides a single function: bp:read($context, $count as xs:integer). This function moves $count bytes from the data onto the results pile.

<!-- parse the PNG header (8 bytes) -->
<xsl:variable name="png-chunks" select="bp:read($png-file, 8)"/>
<!-- The header is now available in the results -->
<xsl:variable name="png-header" select="bp:result($png-chunks)"/>
<!-- Further calls to bp:read on $png-chunks will retrieve the chunk data -->
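
As an illustration of chaining bp:read, the sketch below reads the header of the first chunk that follows the PNG signature. It relies on the PNG layout (a 4-byte length stored most-significant byte first, then a 4-byte chunk type) and on bin:unpack-unsigned-integer and bin:decode-string from the EXPath Binary module. The variable names are illustrative, the bin prefix is assumed to be bound to http://expath.org/ns/binary, and bp:result($context, $pos) is assumed to count positions from the end of the results pile, as described under "Working with results".

```xml
<!-- Assumes xmlns:bin="http://expath.org/ns/binary" is declared on the stylesheet -->
<!-- Read the 8-byte signature, then the chunk length and chunk type (4 bytes each) -->
<xsl:variable name="first-chunk" select="bp:from-file('simple.png') => bp:read(8) => bp:read(4) => bp:read(4)"/>
<!-- The length is the second result from the end; decode it as an unsigned integer -->
<xsl:variable name="chunk-length" select="bin:unpack-unsigned-integer(bp:result($first-chunk, 2), 0, 4)"/>
<!-- The type is the last result; in a valid PNG file the first chunk is IHDR -->
<xsl:variable name="chunk-type" select="bin:decode-string(bp:result($first-chunk, 1))"/>
```

A further bp:read($first-chunk, $chunk-length) would then move the chunk data itself onto the results pile.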

Working with results

The parsing context holds a results pile that carries the data parsed so far. bp:results($context) returns the full pile, whereas bp:result($context, $pos as xs:integer) returns only the result at position $pos, counted from the end.

bp:pop($context) removes the last result; bp:push($context, $value) adds a new value to the results pile.

<!-- parse the PNG header (8 bytes); assumes xmlns:bin="http://expath.org/ns/binary" is declared -->
<xsl:variable name="header" select="bp:from-data(bin:hex('89504e470d0a1a0a'))"/>
<xsl:variable name="parsed" select="$header=>bp:read(1)=>bp:read(3)=>bp:read(2)=>bp:read(1)=>bp:read(1)"/>
<!-- bp:results($parsed) is (bin:hex('89'),bin:hex('504e47'),bin:hex('0d0a'),bin:hex('1a'),bin:hex('0a')), but we only want to keep the 3-byte field, converted to a string -->
<xsl:variable name="cleaned" select="$parsed=>bp:pop()=>bp:pop()=>bp:pop()=>bp:pop()=>bp:pop()=>bp:push(bp:result($parsed,4)=>bin:decode-string())"/>
<!-- bp:results($cleaned) is now ('PNG') -->

An alternative to the previous example is to pop each unwanted field as soon as it has been read:

<!-- parse the PNG header (8 bytes); assumes xmlns:bin="http://expath.org/ns/binary" is declared -->
<xsl:variable name="header" select="bp:from-data(bin:hex('89504e470d0a1a0a'))"/>
<xsl:variable name="parsed" select="$header=>bp:read(1)=>bp:pop()=>bp:read(3)=>bp:read(2)=>bp:pop()=>bp:read(1)=>bp:pop()=>bp:read(1)=>bp:pop()"/>
<!-- bp:results($parsed) is now (bin:hex('504e47')), so we only need to convert it to a string -->
<xsl:variable name="cleaned" select="$parsed=>bp:pop()=>bp:push(bp:result($parsed,1)=>bin:decode-string())"/>
<!-- bp:results($cleaned) is now ('PNG') -->
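
To show how the pieces fit together, here is a minimal sketch of a complete stylesheet. It assumes the Saxon configuration from the Installation section is in place and that the functions behave as described above. The signature element name and the use of the xsl:initial-template entry point are illustrative choices, not part of the package.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:bp="urn://binary-parser"
    xmlns:bin="http://expath.org/ns/binary"
    extension-element-prefixes="bp"
    exclude-result-prefixes="bin"
    version="3.0">

    <xsl:use-package name="binary-parser" package-version="1.0"/>

    <xsl:output method="xml" indent="yes"/>

    <!-- Entry point for running without a source document -->
    <xsl:template name="xsl:initial-template">
        <!-- The 8-byte PNG signature, supplied inline instead of read from a file -->
        <xsl:variable name="context" select="bp:from-data(bin:hex('89504e470d0a1a0a'))"/>
        <!-- Read and discard the leading byte, then read the 3-byte 'PNG' field -->
        <xsl:variable name="parsed" select="$context => bp:read(1) => bp:pop() => bp:read(3)"/>
        <!-- Hypothetical output element holding the last result decoded as a string -->
        <signature>
            <xsl:value-of select="bin:decode-string(bp:result($parsed, 1))"/>
        </signature>
    </xsl:template>
</xsl:stylesheet>
```

Because the remaining bytes of the signature are never read, nothing needs to be popped afterwards; the last result is the only value on the pile.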