validation fails due to missing file
Both @teodorgeorgiev and @gsautter report that validation is failing due to a missing file.
from @teodorgeorgiev: failed to load external entity "../nlm/JATS-mathmlsetup1.ent" on line 226
from @gsautter in plazi/ggxml2taxpub#43 (comment)
... I encountered one error: JATS-mathmlsetup1.ent doesn't seem to exist in the repo folder you point me to, and none of its subfolders, either ... is this an oversight during prior-version cleanup, or a missing repo file?
Using oxygen, I am not getting this error when validating against a local copy of https://github.com/plazi/TaxPub/tree/v1.0-gamma using either the default engine (Xerces?) or xmllint (which does send warnings regarding duplicate models and the non-deterministic nomenclature with its clumsy use of x
In case some of the base JATS files in the repo might have been lost, one could simply download the official JATS 1.1 files at: https://ftp.ncbi.nih.gov/pub/jats/publishing/1.1/JATS-Publishing-1-1-MathML3-DTD.zip
and then simply place the files:
tax-treatment-NS0-v1.dtd
taxpubcustom-classes-NS0-v1.ent
taxpubcustom-elements-NS0-v1.ent
taxpubcustom-mixes-NS0-v1.ent
taxpubcustom-models-NS0-v1.ent
taxpubcustom-modules-NS0-v1.ent
from https://github.com/plazi/TaxPub/tree/v1.0-gamma
and then validate against tax-treatment-NS0-v1.dtd in that context.
This is probably the preferred method anyway, as it ensures that one is using the correct set of base JATS files, which TaxPub extends entirely through the files listed above.
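Before running a validator against a directory assembled this way, it can help to check for dangling references like the JATS-mathmlsetup1.ent one reported above. The following is a minimal sketch, not part of TaxPub or JATS: the function name `find_missing_entities` is hypothetical, it only looks at external parameter entity declarations in `.dtd` and `.ent` files, and it assumes relative system identifiers resolve against the file that declares them.

```python
import re
from pathlib import Path

# DTD comments may contain old declarations; strip them before scanning.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# A quoted literal, either "..." or '...'.
_QUOTED = r'(?:"[^"]*"|\'[^\']*\')'

# External parameter entity declarations:
#   <!ENTITY % name PUBLIC "pubid" "sysid">
#   <!ENTITY % name SYSTEM "sysid">
ENTITY_RE = re.compile(
    r"<!ENTITY\s+%\s+\S+\s+(?:PUBLIC\s+" + _QUOTED + r"\s+|SYSTEM\s+)(" + _QUOTED + r")"
)


def find_missing_entities(root):
    """Return (declaring file, system identifier) pairs that do not resolve.

    Hypothetical helper: scans every .dtd/.ent file under `root` and checks
    each relative system identifier against the declaring file's directory.
    """
    root = Path(root)
    files = sorted(p for p in root.rglob("*") if p.suffix in (".dtd", ".ent"))
    missing = []
    for path in files:
        text = COMMENT_RE.sub("", path.read_text(encoding="utf-8", errors="replace"))
        for match in ENTITY_RE.finditer(text):
            sysid = match.group(1)[1:-1]  # drop the surrounding quotes
            if "://" in sysid:
                continue  # remote URL; not checked in this sketch
            if not (path.parent / sysid).exists():
                missing.append((str(path.relative_to(root)), sysid))
    return missing
```

Calling `find_missing_entities` on the folder that holds the unzipped JATS files plus the TaxPub files should return an empty list if every declared module is present. A declaration that is never referenced would still be listed, so a non-empty result is a hint rather than proof that validation will fail.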
Doing this, again, I am not able to replicate the missing file error. Perhaps in other validation scenarios and environments it does not work. @teodorgeorgiev and @gsautter, how are you performing validation?
@tcatapano we are using the standard PHP DOMDocument::validate. It takes the DTD from the XML, which in our case we store locally:
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "../../nlm/tax-treatment-NS0.dtd">
@tcatapano also having a local copy of the DTD files from Pensoft available to the validator does fix the problem, and the server currently uses it this way ...
Mainly wanted to make sure I don't validate against any older and stricter versions of now-relaxed definitions (as with tp:material-citation) and thus tried to validate against https://github.com/plazi/TaxPub/tree/v1.0-gamma alone, which led to the JATS-mathmlsetup1.ent error ...
Could we make this repo self-contained, just to avoid similar scenarios with third-party TaxPub users?
OK, now I see ... so far I was trying to validate it against tax-treatment-NS0.dtd and the result was:
failed to load external entity "../nlm/JATS-mathmlsetup1.ent" on line 226
I did as you suggested above (downloaded the official JATS 1.1 and added all "-NS0-v1" files).
I validate the XML against tax-treatment-NS0-v1.dtd and voilà ... I did not get this one anymore :)
However, now although I think my XML is valid I get the following error:
validity error : Content model of nomenclature is not determinist: (sec-meta? , label? , tp:taxon-name , x? , tp:taxon-authority? , x? , tp:taxon-status? , x? , tp:taxon-identifier* , xref* , x? , tp:nomenclature-citation-list* , x? , (tp:type-genus | tp:type-species)? , x? , tp:taxon-type-location? , x?)
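For readers wondering why the parser objects: after tp:taxon-name, an `x` in the document could match the first `x?` or, because everything in between is optional, a later `x?`, and a parser looking only at the next element cannot tell which. The sketch below illustrates that check for a simplified case. It is an assumption-laden illustration, not what libxml2 actually runs: the function name `ambiguous_names` is hypothetical, and it only handles a flat sequence of named particles, so the `(tp:type-genus | tp:type-species)?` group in the real model is left out.

```python
def ambiguous_names(particles):
    """Find element names that could match more than one particle next.

    `particles` is a flat sequence of (name, occurrence) pairs, where
    occurrence is "", "?", "*" or "+". Nested groups and choices are not
    supported in this simplified sketch.
    """
    def nullable(occ):
        return occ in ("?", "*")

    def repeatable(occ):
        return occ in ("*", "+")

    def reachable_from(start):
        # Positions the next element could match, starting at `start`:
        # keep going while the particles can be skipped.
        positions = []
        for j in range(start, len(particles)):
            positions.append(j)
            if not nullable(particles[j][1]):
                break
        return positions

    # One candidate set for the start, and one for "just matched particle i".
    candidate_sets = [reachable_from(0)]
    for i, (_, occ) in enumerate(particles):
        candidates = reachable_from(i + 1)
        if repeatable(occ):
            candidates = [i] + candidates
        candidate_sets.append(candidates)

    clashes = set()
    for candidates in candidate_sets:
        seen = set()
        for j in candidates:
            name = particles[j][0]
            if name in seen:
                clashes.add(name)  # same name reachable at two positions
            seen.add(name)
    return sorted(clashes)


# Simplified opening of the nomenclature model from the error message.
fragment = [
    ("tp:taxon-name", ""),
    ("x", "?"),
    ("tp:taxon-authority", "?"),
    ("x", "?"),
    ("tp:taxon-status", "?"),
]
print(ambiguous_names(fragment))  # ['x']
```

Under these assumptions the repeated optional `x` is what gets flagged, which is consistent with the error message singling out the nomenclature model.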
Here is my test file
test_taxpub.zip
@teodorgeorgiev: Yes. It's a known issue. See: #52. In the meantime, if at all possible try using the Xerces parser (https://xerces.apache.org/index.html) which I do not think will report this error. I'll prioritize a patch for this. Hope to get it out this weekend.
Closing