What can I do to improve performance?
The validation of a file takes at least 20 seconds on my local system...
What can I do to improve that?
What are you using? The Schematron files directly, or the precompiled XSLTs? Using the XSLTs is much quicker. You may have a look at https://github.com/phax/ph-bdve where I already created a nice Java wrapper around different Schematron rule sets.
Hi Philip,
I'm using the Schematron files directly now.
Going to take a look at ph-bdve.
Thank you for your prompt response!
Changed to using an XSLT instead of the Schematron files. That improves performance!
I can't get the pure mode working. When I try that with the Schematron file, my XML suddenly fails a lot of rules.
Yes, that is very likely, because the pure mode can handle XPath only. I assume we also use XSLT features, so you need to use the XSLT.
When I use the XSLT with pure-mode, the XSL is rejected as malformed.
I would just like to skip the "compilation" phase, as this takes most of the time and CPU.
The "compilation" phase is the conversion from "SCH" to "XSLT" - this is what takes forever. That's why we also deliver the "pre-built" XSLTs, e.g. for UBL in https://github.com/CenPC434/validation/tree/master/ubl/xslt
The "compilation" of the "XSLT" into a "Template" cannot be done by us. But maybe the object is serializable and you can just reuse this binary representation then - hope you know what I mean :D
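To illustrate that second "compilation" step: below is a minimal sketch using only the standard JAXP API from the JDK, not the validator's own code. It compiles a stylesheet into a `Templates` object once and reuses it for several documents in the same JVM. The class name `CompileOnceDemo` and the tiny inline stylesheet are made up for the example; a real precompiled Schematron XSLT would produce SVRL instead of plain text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CompileOnceDemo {
    // Hypothetical stand-in for a precompiled Schematron XSLT (assumption:
    // the real rule sets are far larger and emit SVRL, not plain text).
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:choose>"
      + "<xsl:when test='/Invoice/ID'>valid</xsl:when>"
      + "<xsl:otherwise>failed: Invoice/ID missing</xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // The expensive step: parse and compile the stylesheet. Do this once.
    static Templates compile(String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // The cheap step: create a Transformer from the compiled Templates
    // and apply it to a single XML document.
    static String validate(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates templates = compile(XSLT); // compiled once per JVM
        List<String> documents = List.of("<Invoice><ID>1</ID></Invoice>", "<Invoice/>");
        for (String xml : documents) {
            System.out.println(validate(templates, xml)); // reused for every document
        }
    }
}
```

The `Templates` object is safe to share between threads, while each `Transformer` created from it should be used by one thread at a time. The saving only applies within one running JVM, so a command-line tool that is started once per file pays the compile cost on every call.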
Hi Philip, I think I understand, a bit ;)
I already use an XSLT with the -xsl option, and then performance is within limits.
For this I use the XSL provided by Simpler Invoicing. For now we accept the speed of about 5 seconds per validation.
https://github.com/SimplerInvoicing/validation/tree/master/xsl
It would be nice if the validator supported multiple input files (*.xml or xml.lst).
Then the compilation only needs to be done once per session.
I know what you mean, but this validator is really just meant as an example of how to do it. The issue is that the "caching" of precompiled XSLTs can only happen in memory. So every time you invoke the validator, the XSLT is compiled again - this is definitely a pain in the ass.
What I did to resolve this is to build a "standalone validation server" in https://github.com/phax/phoss-validator/ which comes with an integrated application server, so it runs "in the background" and is therefore able to cache things. I must admit I need to update the documentation a bit, but in general it is meant as a downloadable "all in one" validator.
I created an up-to-date version for you: https://github.com/phax/phoss-validator/releases/tag/2018-10-25 Grab the ZIP, unpack it and call "start".
Then you can start validating locally with much better performance.
Wow great!!!!
I will do some extra tests next Monday. But this is perfect!