plazi/ggxml2taxpub

remove line breaks in taxpub output

Opened this issue · 5 comments

As discussed with @gsautter, whitespace cleanup will be done after the XSLT conversion by the export service to be set up on the SRS server.

@tcatapano @gsautter can we please make this change and provide a new set of TaxPub files for @jgobeill? He is waiting for them (see the last tech meeting: https://docs.google.com/document/d/1mEACrbcjfGBaaHEB5qeZ9tESFBdsol98RHkUKxIiT-Y/edit#heading=h.61ni7e1dljbu).

@myrmoteras @jgobeill: I've applied XML "pretty print" to the level1 files. As mentioned above, we will eventually apply a similar pretty-printing post-processing step to the files provided by the GG to TaxPub service.

Note that whitespace between XML elements is generally not significant and should not be a factor for XML-aware downstream processing, but tools do exist to perform such "pretty printing." In oXygen, one can use "Format and Indent" to pretty print individual or multiple XML files. I believe that XML libraries also offer similar pretty-printing features, e.g. lxml in Python. Hope this helps.
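For anyone who wants to script this rather than use oXygen, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree` (Python 3.9+ for `ET.indent`) instead of lxml so that it has no dependencies. The function name `pretty_print_xml` and the sample document are made up for illustration; this is not the code used by the GG to TaxPub service.

```python
import xml.etree.ElementTree as ET


def pretty_print_xml(xml_text: str, indent: str = "  ") -> str:
    """Return xml_text re-serialized with one element per line.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    # ET.indent only rewrites text/tail values that are empty or
    # whitespace-only, so real text content is left as it is.
    ET.indent(root, space=indent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample; real TaxPub files are larger and use namespaces.
    sample = "<article><front><title>T</title></front></article>"
    print(pretty_print_xml(sample))
```

Two caveats with this approach: the default ElementTree parser drops comments, and namespace prefixes that have not been registered with `ET.register_namespace` are written out with generated names such as `ns0`. For namespaced files, a library like lxml that keeps the original prefixes may be the better choice.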

@myrmoteras @tcatapano I have now built a pretty-printing output writer that sits between the XSL Transformer output and the client-bound output ... I am currently integrating it into the code of the web front-end servlets.

Pretty printing is deployed now, see https://tb.plazi.org/GgServer/taxPubL1/0384433845052F6CFE68FD53FDA7FD8D (and any other treatment).