'if' in 'let' causes error
lueck opened this issue · 8 comments
Hi,
I have written the following schematron rule, which works in oXygen:
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"
xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
<sch:pattern>
<sch:rule context="*[matches(local-name(), '^h\d+$')]">
<sch:let
name="level"
value="number(replace(local-name(.), 'h', ''))"/>
<sch:let
name="preceding-head"
value="./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]"></sch:let>
<sch:let
name="preceding-level"
value="if (exists($preceding-head)) then number(replace(local-name($preceding-head), 'h', '')) else 0"/>
<sch:report
test="($level - $preceding-level > 1)"
>Missing headline level: Level <sch:value-of select="$level"/> follows on level <sch:value-of select="$preceding-level"/></sch:report>
</sch:rule>
</sch:pattern>
</sch:schema>
I use it to find missing headline levels in a flat structure like the following XML, where h3 follows h1:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<h1>Antibeispiel</h1>
<p>Ich bin ein Antibeispiel eines XML-Dokuments, das Konventionen der Hierarchisierung von Überschriften einfach missachtet. Denn ich gönne mir die Freiheit hier einfach eine</p>
<h3>Überschrift auf dritter Ebene</h3>
<p>einzufügen, obwohl der Konvention nach eigentlich nur eine</p>
<h2>Überschrift auf zweiter Ebene</h2>
<p>stehen sollte.</p>
<h3>Aufgabe</h3>
<p>ist es nun als <a href="https://www.data2type.de/xml-xslt-xslfo/schematron/">Einführung in Schematron</a> zu entwickeln, welches genau diesen Knoventionsbruch aufspürt.</p>
<p>Am Ende kann man eine<a href="https://www.oxygenxml.com/demo/Schematron_Validation.html">Validierung in Schematron im oXygen durchführen.</a></p>
<h1>Ende</h1>
</doc>
Using the ph-schematron-maven-plugin I get the following error:
[ERROR] /home/clueck/src/scdh/brownbag-coding/xslt-basics/sch/assert-nonsloppy.sch [0:0]: Failed to compile XPath expression in <report>: '(number(replace(local-name(.), 'h', '')) - if (exists(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1])) then number(replace(local-name(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]), 'h', '')) else 0 > 1)' with the following variables: {$preceding-level=if (exists(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1])) then number(replace(local-name(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]), 'h', '')) else 0, $preceding-head=./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1], $level=number(replace(local-name(.), 'h', ''))} - net.sf.saxon.trans.XPathException: Unexpected token "if(" at start of expression
net.sf.saxon.trans.XPathException: Unexpected token "if(" at start of expression
...
[ERROR] /home/clueck/src/scdh/brownbag-coding/xslt-basics/sch/assert-nonsloppy.sch [0:0]: Error creating bound schema - com.helger.schematron.pure.binding.SchematronBindException: Failed to precompile the supplied schema.
com.helger.schematron.pure.binding.SchematronBindException: Failed to precompile the supplied schema.
...
To me, it seems that after the expression from the second let-binding (i.e. $preceding-level) is expanded into the test expression, the result is not parsed correctly.
If I put parentheses around the if from the let-binding, i.e. value="(if ... else 0)", the error disappears. But then the rule is apparently not run, since there is no message that the validation has failed. This seems to be #88, then.
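Judging from the expanded expression in the error message, the pure engine appears to paste the text of each let value into the test. In XPath 2.0 a bare conditional cannot be the right-hand operand of `-`, which would explain the "Unexpected token" error. A minimal sketch of the parenthesized workaround follows; it is a fragment meant to replace the rule inside the schema above, and only the added parentheses differ from the original.

```xml
<sch:rule context="*[matches(local-name(), '^h\d+$')]">
  <sch:let name="level"
    value="number(replace(local-name(.), 'h', ''))"/>
  <sch:let name="preceding-head"
    value="./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]"/>
  <!-- The outer parentheses turn the conditional into a primary expression,
       so it stays a valid operand of '-' after textual substitution. -->
  <sch:let name="preceding-level"
    value="(if (exists($preceding-head)) then number(replace(local-name($preceding-head), 'h', '')) else 0)"/>
  <sch:report test="($level - $preceding-level > 1)">Missing headline level: Level <sch:value-of select="$level"/> follows on level <sch:value-of select="$preceding-level"/></sch:report>
</sch:rule>
```

Without the parentheses, the trailing `> 1` would also end up inside the else branch after substitution, so the parenthesized form is the safer spelling in any case.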
Here is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>de.wwu.scdh.bbc</groupId>
<artifactId>xml-transformation</artifactId>
<version>1.0-SNAPSHOT-0</version>
<name>BBC XSLT</name>
<url>https://zivgitlab.uni-muenster.de/SCDH/brownbag-coding/xslt-basics</url>
<properties>
<ph.schematron.version>5.6.3</ph.schematron.version>
</properties>
<dependencies>
<dependency>
<groupId>com.helger</groupId>
<artifactId>ph-schematron</artifactId>
<version>${ph.schematron.version}</version>
</dependency>
<dependency>
<groupId>com.helger.maven</groupId>
<artifactId>ph-schematron-maven-plugin</artifactId>
<version>${ph.schematron.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>com.helger.maven</groupId>
<artifactId>ph-schematron-maven-plugin</artifactId>
<version>${ph.schematron.version}</version>
<configuration>
<schematronProcessingEngine>pure</schematronProcessingEngine>
<schematronFile>sch/assert-nonsloppy.sch</schematronFile>
<xmlDirectory>xml</xmlDirectory>
<xmlIncludes>xml/sloppy-doc.xml</xmlIncludes>
<svrlDirectory>target/schematron-reports</svrlDirectory>
</configuration>
<executions>
<execution>
<goals>
<goal>validate</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
I have tried different versions from the latest back to 4.0.8. Neither pure, nor schematron, nor xslt works as I would expect.
Regards,
Chris
Basically I have no clue, but let's see where we get:
- "if" is a valid XPath2 expression: https://www.w3.org/TR/2010/REC-xpath20-20101214/#id-conditionals
- Also valid in XPath3: https://www.w3.org/TR/2014/REC-xpath-30-20140408/#id-conditionals
- The XPath version is chosen by the underlying Saxon version (which I try to keep up-to-date)
- The question is: why does Saxon not like the "if"? Does the "if" fail only in the expression (number (...)) - if ... or also standalone?
- Try not using the "pure" version, as the "let" evaluation is broken. Try schematron or xslt instead.
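One way to answer the "standalone" question is a stripped-down schema in which the conditional is the whole test rather than an operand. This is only a sketch: the rule context `/*` and the message text are chosen for illustration and do not come from the original schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <!-- Hypothetical probe rule: fires once on the document element. -->
    <sch:rule context="/*">
      <!-- The conditional stands alone here, not next to '-' or another operator. -->
      <sch:report test="if (exists(*)) then true() else false()">Standalone 'if' was compiled and evaluated.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

If this compiles while the original schema does not, the problem lies in how the conditional is combined with the surrounding expression, not in the conditional itself.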
Btw. you can remove this from the POM:
<dependency>
<groupId>com.helger.maven</groupId>
<artifactId>ph-schematron-maven-plugin</artifactId>
<version>${ph.schematron.version}</version>
</dependency>
Having it in the plugin declaration is sufficient.
Saxon is fine with the 'if' without parentheses.
I cloned the Schematron/schematron repo and used trunk/schematron/code/iso_svrl_for_xslt2.xsl to compile a stylesheet from my schematron file using Saxon-HE. Then I applied the resulting stylesheet to my xml sample using Saxon-HE again and the result is as I would expect:
...
<svrl:successful-report test="($level - $preceding-level > 1)" location="/doc[1]/h3[1]">
<svrl:text>Missing headline level: Level 3 follows on level 1</svrl:text>
</svrl:successful-report>
...
I tried Saxon-HE version 10.2 and 9.9.1-7, both successfully.
With the maven plugin I have no success, even when setting schematronProcessingEngine to schematron or xslt.
schXslt also works as expected.
In the test file I created, I also receive this output:
<?xml version="1.0" encoding="UTF-8"?>
<schematron-output xmlns="http://purl.oclc.org/dsdl/svrl" title="" schemaVersion="">
<active-pattern document="C:\dev\git\ph-schematron\ph-schematron\src\test\resources\issues\github108\test.xml" />
<fired-rule context="*[matches(local-name(), '^h\d+$')]" />
<fired-rule context="*[matches(local-name(), '^h\d+$')]" />
<successful-report location="/doc[1]/h3[1]" test="($level - $preceding-level > 1)">
<text>Missing headline level: Level 3 follows on level 1</text>
</successful-report>
<fired-rule context="*[matches(local-name(), '^h\d+$')]" />
<fired-rule context="*[matches(local-name(), '^h\d+$')]" />
<fired-rule context="*[matches(local-name(), '^h\d+$')]" />
</schematron-output>
When using the Maven plugin, please use
<schematronProcessingEngine>schematron</schematronProcessingEngine>
and check the created SVRL
Hm, using schematron as engine, there are no reports at all. So I conclude that no test is run.
The fact that the exit code of the maven command is 0 suggests that, too.
Well, after experimenting with the directory layout I can see more clearly:
- After deleting xmlIncludes there is a report, and the report is as I would expect it to be! Yes!
- The path to my xml file is xml/sloppy-doc.xml. xmlDirectory was set to xml.
- Setting xmlIncludes to xml/sloppy-doc.xml would include a file in xml/xml/sloppy-doc.xml only (tested), but not xml/sloppy-doc.xml
- While running the validation, maven logs errors about non-creatable folders. But in the target, the folders are present and the reports are in there, anyway. E.g.:
[ERROR] Failed to create parent directory of '/home/clueck/src/scdh/brownbag-coding/xslt-basics/target/schematron-reports/sloppy-doc.xml.svrl'!
- The validation of the xml with the sloppy headline levels gives the expected report. But nevertheless, maven ends with "BUILD SUCCESS" and its exit code is 0. IMO it should exit with a failure, so that the plugin can be used as a validator in an automatic CI/CD pipeline. How can I make it fail?
So this turned out to be a configuration problem! Sorry for taking your time! But thanks for your help!
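Based on the observations above, xmlIncludes seems to be resolved relative to xmlDirectory. A configuration along the following lines should therefore pick up xml/sloppy-doc.xml. Note that only the variant with xmlIncludes deleted was actually tested in this thread, so the pattern below is an assumption derived from the xml/xml/sloppy-doc.xml finding.

```xml
<configuration>
  <schematronProcessingEngine>schematron</schematronProcessingEngine>
  <schematronFile>sch/assert-nonsloppy.sch</schematronFile>
  <xmlDirectory>xml</xmlDirectory>
  <!-- Assumption: the include pattern is relative to xmlDirectory,
       so it must not repeat the leading "xml/". -->
  <xmlIncludes>sloppy-doc.xml</xmlIncludes>
  <svrlDirectory>target/schematron-reports</svrlDirectory>
</configuration>
```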
Thanks for the clarification - that somehow makes sense.
Regarding the Maven plugin: I checked and saw that it only checks for failed asserts but not for successful reports. That has been fixed.
The stupid error message has been fixed as well. I wonder why nobody has complained about it so far ;-) Building v5.6.4 now
Nice! I've built 5.6.4 locally and it works!
Thanks!
Chris