Functional Clojure wrapper for Michael Kay's Saxon XSLT and XQuery Processor (from Saxonica Limited).
A jar is available on Clojars. Add the following dependency to your project.clj:
[clojure-saxon "0.9.4"]
Clojure version
The library has been tested against Clojure 1.3, 1.4, & 1.6.
The top-level function is query; it takes an XQuery or XPath expression, an
optional namespace map, and a node. The following returns a sequence of the names of
all elements in a document:
user=> (require '[saxon :as xml])
user=> (xml/query "distinct-values(//element()/local-name())" xmldoc)
("html" "head" "title" "style" "body" "p" "a")
Here's the same, limiting elements to those in the XHTML namespace:
user=> (xml/query "distinct-values(//xhtml:*/local-name())" {:xhtml ""} xmldoc)
("html" "head" "title" "style" "body" "p" "a")
Compiled XQuery expressions are cached:
user=> (def complex-exp "//element()[@type='text/css']//local-name(parent::element()) = 'head'")
user=> (time (xml/query complex-exp xmldoc))
"Elapsed time: 1.747 msecs"
user=> (time (xml/query complex-exp xmldoc))
"Elapsed time: 0.226 msecs"
The query function also accepts a compiled function (the result of compile-xquery
or compile-xpath, described below) as its first argument. Only compiled expressions
are cached; the results of a query are not.
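As a minimal sketch of passing a compiled function to query, the snippet below compiles an expression once and reuses it. The sample document and the name element-names are made up for illustration; the calls themselves are the ones described in this README.

```clojure
(require '[saxon :as xml])

;; A small made-up document, used only for this illustration.
(def doc (xml/compile-xml "<root><a/><b/><a/></root>"))

;; Compile the expression once...
(def element-names
  (xml/compile-xquery "distinct-values(//element()/local-name())"))

;; ...then hand the compiled function to query in place of a string.
(xml/query element-names doc)

;; Calling the compiled function directly should be equivalent.
(element-names doc)
```

This form is useful when the same expression is applied to many documents and you would rather hold on to the compiled function than rely on the cache.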
Compile-xquery, Compile-xpath
compile-xquery and compile-xpath are the lower-level functions behind query.
They take an expression and an optional namespace map, and return a function that applies
the compiled expression to a node.
user=> (xml/compile-xquery "distinct-values(//element()/local-name())")
#<saxon$compile_xquery__302$fn__314 saxon$compile_xquery__302$fn__314@48c5186e>
user=> ((xml/compile-xquery "distinct-values(//element()/local-name())") xmldoc)
("html" "head" "title" "style" "body" "p" "a")
compile-xquery and compile-xpath cache their query arguments as well.
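The optional namespace map can be sketched as follows. The document, the urn:example:demo namespace URI, the d prefix, and the name count-items are all invented for this example; the map is assumed to bind keyword prefixes to namespace URIs in the same way as the map passed to query.

```clojure
(require '[saxon :as xml])

;; Made-up document whose elements live in an invented namespace.
(def doc
  (xml/compile-xml "<r xmlns='urn:example:demo'><item/><item/></r>"))

;; Bind the prefix d to that namespace when compiling the XPath.
(def count-items
  (xml/compile-xpath "count(//d:item)" {:d "urn:example:demo"}))

;; Apply the compiled expression to the node.
(count-items doc)
```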
Compiling XML
Use compile-xml to produce the Saxon in-memory representation, an "XdmNode."
user=> (def xmldoc (xml/compile-xml ( "")))
compile-xml takes a File, URL, InputStream, Reader, raw String, or an XdmNode.
user=> (xml/compile-xml "<root/>")
#<XdmNode <root/>>
compile-xslt takes the same arguments as compile-xml and returns a function
that applies the compiled stylesheet to a node, with an optional map of parameters.
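A rough sketch of applying a stylesheet with a parameter map is shown below. The stylesheet, the greeting parameter, and the names stylesheet, greet, and result are invented for illustration, and the example assumes that parameters are passed as a map from keyword names to values.

```clojure
(require '[saxon :as xml])

;; A tiny made-up stylesheet with one parameter.
(def stylesheet
  "<xsl:stylesheet version='2.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
     <xsl:param name='greeting'/>
     <xsl:template match='/'>
       <out><xsl:value-of select=\"concat($greeting, ', ', /name)\"/></out>
     </xsl:template>
   </xsl:stylesheet>")

;; Compile the stylesheet once; the result is a function of a node.
(def greet (xml/compile-xslt stylesheet))

;; Apply it to a compiled document, supplying the parameter
;; (assumed here to be keyed by keyword).
(def result
  (greet (xml/compile-xml "<name>World</name>") {:greeting "Hello"}))

;; The transformation result is itself a node, so it can be queried.
(xml/query "string(/out)" result)
```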
When the result of a query is a single item, the query functions return that item itself instead of a sequence of one item, e.g.
user=> (xml/query "count(//element())" xmldoc)
I find this inconsistency convenient, but it might be a bad design choice. User feedback appreciated.
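If you would rather always work with a sequence, one possible workaround is a small normalizing helper. The function as-seq below is hypothetical and not part of this library; it is plain Clojure and also assumes that an empty result comes back as nil.

```clojure
;; Hypothetical helper, not part of the library: always return a sequence,
;; whether the query produced nothing, one item, or many items.
(defn as-seq
  [result]
  (cond
    (nil? result)        ()             ; assumed shape of an empty result
    (sequential? result) result         ; already a sequence of items
    :else                (list result))) ; single item: wrap it

;; Intended usage (xmldoc as defined elsewhere in this README):
;; (as-seq (xml/query "count(//element())" xmldoc))
```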
Traversal of nodes is somewhat lazy, though not strictly so. The Clojure code
realizes the first two items of the return sequence, and the Saxon Java processor
seems to keep a few items ahead as well. For example, in xmldoc, with 7 element nodes:
user=> (def returned (xml/query "for $e in //element() return trace(($e/local-name()), \"hit\")" xmldoc))
hit [1]: xs:string: html
hit [1]: xs:string: head
hit [1]: xs:string: title
user=> (nth returned 0)
user=> (nth returned 1)
user=> (nth returned 2)
hit [1]: xs:string: style
user=> (nth returned 3)
hit [1]: xs:string: body
user=> (nth returned 4)
hit [1]: xs:string: p
user=> (nth returned 5)
hit [1]: xs:string: a
user=> (nth returned 6)
As the trace shows, three items are realized when the query is first executed; from the third item onward, touching an item also realizes the one after it.
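For contrast, the sketch below shows what strictly lazy realization looks like in plain Clojure, with no look-ahead. It does not use this library and is not how the library is implemented; traced-seq and realized-log are names invented for the illustration.

```clojure
;; Illustration only: a strictly lazy sequence that records each item
;; in the atom `log` at the moment that item is realized.
(defn traced-seq
  [log items]
  (lazy-seq
    (when-let [[x & more] (seq items)]
      (swap! log conj x)
      (cons x (traced-seq log more)))))

(def realized-log (atom []))
(def returned (traced-seq realized-log ["html" "head" "title" "style"]))

@realized-log     ; nothing has been realized yet
(nth returned 1)  ; touching the second item realizes the first two only
@realized-log
```

Because query results may be realized ahead of where you are reading, it seems safest not to depend on side effects inside an expression (such as trace) firing at a particular moment.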
Helper Functions
node-path returns an absolute XPath to a node:
user=> (map xml/node-path (xml/query "//element()" xmldoc))
("/html" "/html/head[1]" "/html/head[1]/title[1]" "/html/head[1]/style[1]" "/html/body[1]" "/html/body[1]/p[1]" "/html/body[1]/a[1]")
with-default-ns adds a default namespace to an XQuery expression:
user=> (xml/query (xml/with-default-ns "" "//*/local-name()") xmldoc)
("html" "head" "title" "style" "body" "p" "a")