plazi/ggxml2taxpub

validation of taxpub provided by GG service

Opened this issue · 3 comments

Note that the results of the transformations provided by the service at https://tb.plazi.org/GgServer/taxPubL1/ may be invalid. In fact, it is likely that many will be invalid due to other issues logged in this repository. This can be handled in two ways (not mutually exclusive):

  1. consumers perform validation against a copy of the TaxPub DTD provided at: https://github.com/plazi/TaxPub/releases/tag/v1.0.0-rc2
  2. as mentioned in #20 (comment), Plazi will add a step to the service at https://tb.plazi.org/GgServer/taxPubL1/ that performs DTD validation, passing through valid instances and producing an error for invalid instances
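For anyone taking option 1, here is a minimal sketch of consumer-side validation in Python. It assumes the third-party lxml package is installed and that the TaxPub DTD release has been unpacked locally. The function names `load_dtd` and `check_taxpub` and the path in the usage note are hypothetical, not part of any Plazi tooling. How the document is fetched from the service is left to the caller.

```python
from lxml import etree  # third-party dependency (assumption): pip install lxml


def load_dtd(dtd_source):
    """Load a DTD from a local file path or a file-like object."""
    return etree.DTD(dtd_source)


def check_taxpub(xml_bytes, dtd):
    """Return (is_valid, messages) for a single XML document.

    Valid documents yield (True, []). Documents that are not
    well-formed or that fail DTD validation yield (False, [reasons]).
    """
    # no_network=True keeps the parser from fetching remote resources.
    parser = etree.XMLParser(no_network=True)
    try:
        root = etree.fromstring(xml_bytes, parser)
    except etree.XMLSyntaxError as exc:
        return False, [f"not well-formed: {exc}"]

    if dtd.validate(root):
        return True, []

    # Collect the validator's messages so the caller can log or report them.
    return False, [f"line {e.line}: {e.message}" for e in dtd.error_log]
```

A caller might load the DTD once with `load_dtd("path/to/taxpub.dtd")` (placeholder path) and then run `check_taxpub` on each response body, keeping documents that come back valid and logging the messages for the rest.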

@tcatapano: I thought we agreed that the TaxPub created on demand on the website shouldn't be validated, and that validation should happen only on a push export from the back-end (once SiB provides us with a place to push to) and, should we introduce them, also on export to Zenodo and to TaxPub-formatted collection dumps.

@gsuatter: I think it's fine to perform the validation on the export and not through the on-demand service. That still leaves option 1 above, for consumers to handle the validation (if it is even necessary) on their end of the on-demand pulls.

I've posted a list of URLs to the known valid TaxPub files (i.e., those under level1/ in this repo) here:

https://github.com/plazi/ggxml2taxpub-treatments/blob/main/valid_level1.txt

I hope this will be helpful for development purposes.