phax/ph-schematron

'if' in 'let' causes error

lueck opened this issue · 8 comments

lueck commented

Hi,

I have written the following Schematron rule, which works in oXygen:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">

    <sch:pattern>
        <sch:rule context="*[matches(local-name(), '^h\d+$')]">
            <sch:let
                name="level"
                value="number(replace(local-name(.), 'h', ''))"/>
            <sch:let 
                name="preceding-head"
                value="./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]"></sch:let>
            <sch:let
                name="preceding-level"
                value="if (exists($preceding-head)) then number(replace(local-name($preceding-head), 'h', '')) else 0"/>
            <sch:report
                test="($level - $preceding-level > 1)"
                >Missing headline level: Level <sch:value-of select="$level"/> follows on level <sch:value-of select="$preceding-level"/></sch:report>
        </sch:rule>
    </sch:pattern>
    
</sch:schema>

I use it to find missing headline levels in a flat structure such as the following XML, where h3 follows directly on h1:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <h1>Antibeispiel</h1>
    <p>Ich bin ein Antibeispiel eines XML-Dokuments, das Konventionen der Hierarchisierung von Überschriften einfach missachtet. Denn ich gönne mir die Freiheit hier einfach eine</p>
    <h3>Überschrift auf dritter Ebene</h3>
    <p>einzufügen, obwohl der Konvention nach eigentlich nur eine</p>
    <h2>Überschrift auf zweiter Ebene</h2>
    <p>stehen sollte.</p>
    <h3>Aufgabe</h3>
    <p>ist es nun als <a href="https://www.data2type.de/xml-xslt-xslfo/schematron/">Einführung in Schematron</a> zu entwickeln, welches genau diesen Knoventionsbruch aufspürt.</p>
    <p>Am Ende kann man eine<a href="https://www.oxygenxml.com/demo/Schematron_Validation.html">Validierung in Schematron im oXygen durchführen.</a></p>
    <h1>Ende</h1>
</doc>

Using the ph-schematron-maven-plugin I get the following error:

[ERROR] /home/clueck/src/scdh/brownbag-coding/xslt-basics/sch/assert-nonsloppy.sch [0:0]: Failed to compile XPath expression in <report>: '(number(replace(local-name(.), 'h', '')) - if (exists(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1])) then number(replace(local-name(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]), 'h', '')) else 0 > 1)' with the following variables: {$preceding-level=if (exists(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1])) then number(replace(local-name(./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]), 'h', '')) else 0, $preceding-head=./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1], $level=number(replace(local-name(.), 'h', ''))} - net.sf.saxon.trans.XPathException: Unexpected token "if(" at start of expression
net.sf.saxon.trans.XPathException: Unexpected token "if(" at start of expression
...
[ERROR] /home/clueck/src/scdh/brownbag-coding/xslt-basics/sch/assert-nonsloppy.sch [0:0]: Error creating bound schema - com.helger.schematron.pure.binding.SchematronBindException: Failed to precompile the supplied schema.
com.helger.schematron.pure.binding.SchematronBindException: Failed to precompile the supplied schema.
...

To me, it seems that after the expression from the second let-binding (i.e. $preceding-level) is expanded into the test expression, the result is not parsed correctly.

If I put parentheses around the if from the let-binding, value="(if ... else 0)", the error disappears. But then the rule is apparently not run, since there is no message that the validation has failed. This seems to be #88, then.
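The parenthesized variant mentioned above can be sketched as the complete schema below. It is only an illustration of the workaround, not a statement about how the engine is meant to be used. The likely reason it helps: the error message shows the let value being substituted textually into the test, and in the XPath 2.0 grammar an if-expression cannot be an operand of "-" unless it is wrapped in parentheses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:pattern>
        <sch:rule context="*[matches(local-name(), '^h\d+$')]">
            <sch:let name="level"
                value="number(replace(local-name(.), 'h', ''))"/>
            <sch:let name="preceding-head"
                value="./preceding-sibling::*[matches(local-name(.), '^h\d+$')][1]"/>
            <!-- The outer parentheses keep the conditional a single operand,
                 even if the value is pasted textually into the test below. -->
            <sch:let name="preceding-level"
                value="(if (exists($preceding-head)) then number(replace(local-name($preceding-head), 'h', '')) else 0)"/>
            <sch:report test="($level - $preceding-level > 1)">Missing headline level: Level <sch:value-of select="$level"/> follows on level <sch:value-of select="$preceding-level"/></sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

With this form, the expanded test reads "(A - (if ... then ... else 0) > 1)", which is a well-formed XPath 2.0 expression.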

Here is my pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.wwu.scdh.bbc</groupId>
    <artifactId>xml-transformation</artifactId>
    <version>1.0-SNAPSHOT-0</version>
    
    <name>BBC XSLT</name>
    <url>https://zivgitlab.uni-muenster.de/SCDH/brownbag-coding/xslt-basics</url>

    <properties>
	<ph.schematron.version>5.6.3</ph.schematron.version>
    </properties>
    
    <dependencies>
	<dependency>
	    <groupId>com.helger</groupId>
	    <artifactId>ph-schematron</artifactId>
	    <version>${ph.schematron.version}</version>
	</dependency>
	<dependency>
	    <groupId>com.helger.maven</groupId>
	    <artifactId>ph-schematron-maven-plugin</artifactId>
	    <version>${ph.schematron.version}</version>
	</dependency>
    </dependencies>

    <build>
	<plugins>
	    <plugin>
		<groupId>com.helger.maven</groupId>
		<artifactId>ph-schematron-maven-plugin</artifactId>
		<version>${ph.schematron.version}</version>
		<configuration>
		    <schematronProcessingEngine>pure</schematronProcessingEngine>
		    <schematronFile>sch/assert-nonsloppy.sch</schematronFile>
		    <xmlDirectory>xml</xmlDirectory>
		    <xmlIncludes>xml/sloppy-doc.xml</xmlIncludes>
		    <svrlDirectory>target/schematron-reports</svrlDirectory>
		</configuration>
		<executions>
		    <execution>
			<goals>
			    <goal>validate</goal>
			</goals>
		    </execution>
		</executions>
	    </plugin>
	</plugins>
    </build>
	
</project>

I have tried different versions from the latest back to 4.0.8. Neither pure, nor schematron nor xslt works as I would expect.

Regards,
Chris

phax commented

Basically I have no clue, but let's see where we get:

Btw. you can remove this from the POM:

	<dependency>
	    <groupId>com.helger.maven</groupId>
	    <artifactId>ph-schematron-maven-plugin</artifactId>
	    <version>${ph.schematron.version}</version>
	</dependency>

Having it declared as a plugin in the build section is sufficient.

lueck commented

Saxon is fine with the 'if' without parentheses.
I cloned the Schematron/schematron repo and used trunk/schematron/code/iso_svrl_for_xslt2.xsl to compile a stylesheet from my schematron file using Saxon-HE. Then I applied the resulting stylesheet to my xml sample using Saxon-HE again and the result is as I would expect:

...
<svrl:successful-report test="($level - $preceding-level &gt; 1)" location="/doc[1]/h3[1]">
      <svrl:text>Missing headline level: Level 3 follows on level 1</svrl:text>
</svrl:successful-report>
...

I tried Saxon-HE version 10.2 and 9.9.1-7, both successfully.

With the maven plugin I have no success, even when setting schematronProcessingEngine to schematron or xslt.
SchXslt also works as expected.

phax commented

In the test file I created, I also receive this output:

<?xml version="1.0" encoding="UTF-8"?>
<schematron-output xmlns="http://purl.oclc.org/dsdl/svrl" title="" schemaVersion="">
  <active-pattern document="C:\dev\git\ph-schematron\ph-schematron\src\test\resources\issues\github108\test.xml" />
  <fired-rule context="*[matches(local-name(), '^h\d+$')]" />
  <fired-rule context="*[matches(local-name(), '^h\d+$')]" />
  <successful-report location="/doc[1]/h3[1]" test="($level - $preceding-level > 1)">
    <text>Missing headline level: Level 3 follows on level 1</text>
  </successful-report>
  <fired-rule context="*[matches(local-name(), '^h\d+$')]" />
  <fired-rule context="*[matches(local-name(), '^h\d+$')]" />
  <fired-rule context="*[matches(local-name(), '^h\d+$')]" />
</schematron-output>

phax commented

When using the Maven plugin, please use

<schematronProcessingEngine>schematron</schematronProcessingEngine>

and check the created SVRL

lueck commented

Hm, using schematron as the engine, there are no reports at all. So I conclude that no test is run.
The fact that the exit code of the maven command is 0 suggests that, too.

lueck commented

Well, after experimenting with the directory layout I can see more clearly:

  • After deleting xmlIncludes there is a report, and the report is as I would expect it to be! Yes!
  • The path to my xml file is xml/sloppy-doc.xml.
  • xmlDirectory was set to xml.
  • Setting xmlIncludes to xml/sloppy-doc.xml would include a file in xml/xml/sloppy-doc.xml only (tested), but not xml/sloppy-doc.xml
  • While running the validation, maven logs errors about non-creatable folders. But in the target, the folders are present and the reports are in there, anyway. E.g.:
[ERROR] Failed to create parent directory of '/home/clueck/src/scdh/brownbag-coding/xslt-basics/target/schematron-reports/sloppy-doc.xml.svrl'!
  • The validation of the xml with the sloppy headline levels gives the expected report. But nevertheless, maven ends with "BUILD SUCCESS" and its exit code is 0. IMO it should exit with a failure, so that the plugin can be used as a validator in an automatic CI/CD pipeline. How can I make it fail?

So this turned out to be a configuration problem! Sorry for taking your time! But thanks for your help!
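For reference, the observations above suggest a plugin configuration along these lines. This is a sketch only: the xmlIncludes value shown here is an assumption inferred from the finding that the pattern is resolved relative to xmlDirectory. What was actually tested in this thread was removing xmlIncludes altogether.

```xml
<plugin>
    <groupId>com.helger.maven</groupId>
    <artifactId>ph-schematron-maven-plugin</artifactId>
    <version>${ph.schematron.version}</version>
    <configuration>
        <schematronProcessingEngine>schematron</schematronProcessingEngine>
        <schematronFile>sch/assert-nonsloppy.sch</schematronFile>
        <xmlDirectory>xml</xmlDirectory>
        <!-- Assumption: the include pattern is relative to xmlDirectory,
             so the file xml/sloppy-doc.xml is matched by "sloppy-doc.xml".
             Dropping this element entirely is the variant that was tested. -->
        <xmlIncludes>sloppy-doc.xml</xmlIncludes>
        <svrlDirectory>target/schematron-reports</svrlDirectory>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>validate</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```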

phax commented

Thanks for the clarification - that somehow makes sense.
Regarding the Maven plugin: I checked and saw that it only checked for failed asserts but not for successful reports - that was fixed.
Also the stupid error message was fixed. I wonder why nobody complained about it so far ;-) Building v5.6.4 now
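To pick up these fixes in the POM from the original report, the version property would presumably just need to be bumped. This is a minimal sketch, assuming 5.6.4 can be resolved from your repository (or has been built and installed locally):

```xml
<properties>
    <!-- 5.6.4 is the version named in this thread as containing the fixes;
         its availability in your repository is an assumption. -->
    <ph.schematron.version>5.6.4</ph.schematron.version>
</properties>
```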

lueck commented

Nice! I've built 5.6.4 locally and it works!

Thanks!
Chris