wendellpiez/JATSKit

Update to NISO JATS 1.1

Closed this issue · 19 comments

It seems that a new version of JATS is available.
Should we update all the schemas to it, or keep the 1.0 schemas and add the 1.1 schemas side by side, with catalog entries for both?

Radu,

Indeed, plus as you also saw we now have a BITS 2.0 release. Which is much to be celebrated.

I have been postponing asking your question until this happened. Now it has, it becomes urgent. For a number of reasons I'd be inclined to provide only JATS 1.1 and BITS 2.0 in the framework, without other schema files, and only with MathML 3. (So, in JATS terms, support for Blue and Orange tag sets, in JATS 1.1, open for adjustment in view of local variances from this.) However, I also think it should be easy to tweak all this, and the catalog etc. can show a user how to do this at the local level.

This is probably what I will proceed to do but if you (or anyone reading, thanks!) have other opinions or experiences to offer, they would be gratefully received.

I don't really have much experience with JATS, so I agree 👍 We'll probably also need some new file templates, and possibly some CSS selectors to handle the new elements.

Any idea how many users there are? I'm concerned that people might have started using 1.0 and would be impacted by dropping it.

I confess I don't know how version selection works in this framework. I've always intended to play with this, but haven't found the time yet.

Chris,

Thanks for weighing in, you read my mind. It is not only present users but also new users we have to think about. Likewise, of the new users, it is safe to imagine that there will be both those who are already on, or who can move easily to, the latest versions, and those who already have masses of older JATS or JATSish data (even nominal 'nlm' formats of various types), with which the framework might work perfectly well with only modest extension (but whose various DTDs we can't include). So 'the users' might make for quite a grab bag.

I am reluctant to say that JATSKit 'supports' anything but the latest mainly because I'm unable to test against everything, and limiting my data set to documents valid against the latest seems like a reasonable place to draw a clear boundary. However, as you know there is a paradox here, since the framework as delivered will indeed provide 90% of the capability that even the constituency of JATS 1.0 users (or older variants) needs; indeed given the need for customization around the edges in any case, I don't think this constituency really gets much less from the framework, on balance, than a user who has 100% valid current JATS. So I don't actually wish to discourage them. Most of the CSS, XSLT etc. etc. should work just fine for them.

Then too the installed base and numbers of users (if oXygen or anyone might know that) may also have differential abilities or interests to migrate from 1.0 to 1.1 ... which as you know ain't really all that difficult a migration, at least relatively ... so once again it isn't really an either/or question based on a stable set of factors.

To address this balance, I propose that we try to be "generous in what we accept and strict in what we emit" - so the framework will not attempt to support all of JATS/NLM through earlier versions (because we are not prepared to emit it), but yet, we will "do our best" with all the XSLTs and processing, document how to wire up other versions of DTDs etc, and provide means and documentation to help local developers or user-developers in extending the framework to cover them. Perhaps there can even be Schematron logic to detect this situation and report on it.

All this is a lot, so what it might come down to will be documentation in the catalog file showing how to wire up the older DTDs, and pointing to them on line - but not including them in the framework.
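
As a rough illustration of what "wiring up" an older DTD through a catalog could look like, here is a minimal Python sketch that generates an OASIS XML catalog with one `public` entry. The FPI string and the local path `local-dtds/jats-1.0/...` are assumptions for illustration only; check them against the DTD distribution you actually have.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(mappings):
    """Return an OASIS XML catalog (as a string) with one <public> entry per mapping."""
    ET.register_namespace("", CATALOG_NS)
    root = ET.Element(f"{{{CATALOG_NS}}}catalog")
    for public_id, uri in mappings.items():
        ET.SubElement(
            root,
            f"{{{CATALOG_NS}}}public",
            {"publicId": public_id, "uri": uri},
        )
    return ET.tostring(root, encoding="unicode")


# Assumption: an FPI for an older (1.0) DTD and a hypothetical local copy of it.
OLDER_DTDS = {
    "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN":
        "local-dtds/jats-1.0/JATS-journalpublishing1.dtd",
}

catalog_xml = build_catalog(OLDER_DTDS)
print(catalog_xml)
```

The generated file would live outside the framework (or in a framework extension), so the bundled DTD set stays limited to the current versions.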

Chris, are you writing as a user of the present framework, and if so, what have you done to deal with the issue of data valid to a no-longer-extant DTD version? (Or if this has never happened, what would you as a user expect and prefer if it did?)

Thanks again for pondering; more perspectives are welcome.

No idea how many users the framework has. If JATS 1.1 uses different DTD public and system IDs, they can be resolved to the 1.1 DTDs via the XML catalog. We could have new file templates for both 1.0 and 1.1, but since this is a minor release which probably only adds new elements, maybe we can just update the DTDs.

I will address these issues via comments in the catalog.xml file and/or docs, and see what you guys think then.

Meanwhile leaving this issue open until the update pass on the schemas is done.

One simple reason to avoid packaging multiple versions of DTDs is simply the size of the installed framework. I would naturally like it to remain as small as possible.

As of now, the DTDs are updated along with a couple of other features, such as DTD documentation related to tooltips.

For now, the toolkit will work seamlessly with any Blue, Orange (1.1) or BITS (2.0) document with MathML3, with or without OASIS tables.

Additionally, if you have data valid to any other variant of an NLM DTD (including JATS 1.0), you can add DTDs to validate them, either by adding a new catalog to the framework or by extending the framework (a clean solution).

Not all XSLTs and CSSs will support everything permitted by any DTD of course, but that's what extensibility is for.

This policy, along with a how-to for extending the framework, is given on the wiki page DTD Versions.

I have a single albeit major misgiving about this arrangement - it means that the JATSKit will turn on for NLM documents which it cannot parse (because their DTDs are not found, nor FPIs recognized), thereby creating errors before success for any new user who happens to have older DTDs. This is a problem even if the setup for workaround(s) is not difficult.

However, short of bundling any / all DTDs for NLM or NLM-like formats that anyone might ever try oXygen with, I don't see a remedy for this. Any suggestions or recommendations are welcome.

On further experimentation, I'm tightening things a bit on the framework's association rules.

Only documents explicitly marked with a known (current) FPI will be associated with the framework. Hence the framework will not turn on for document types for which it does not have the DTDs.
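
The association rule described here can be sketched in Python as a check of the DOCTYPE public identifier against a list of known FPIs. This is not how oXygen implements association rules; it only illustrates the logic. The FPI strings in `KNOWN_FPIS` are assumptions and the list is not complete (MathML3 and OASIS variants are omitted).

```python
import re

# Assumption: a partial, illustrative list of "current" FPIs.
KNOWN_FPIS = {
    "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN",
    "-//NLM//DTD JATS (Z39.96) Article Authoring DTD v1.1 20151215//EN",
    "-//NLM//DTD BITS Book Interchange DTD v2.0 20151225//EN",
}

# Matches <!DOCTYPE name PUBLIC "fpi" ...> with either quote style.
DOCTYPE_PUBLIC = re.compile(
    r"""<!DOCTYPE\s+[^\s>\[]+\s+PUBLIC\s+(?:"([^"]*)"|'([^']*)')"""
)


def public_id(xml_text):
    """Return the public identifier from the DOCTYPE declaration, or None."""
    match = DOCTYPE_PUBLIC.search(xml_text)
    if match is None:
        return None
    return match.group(1) if match.group(1) is not None else match.group(2)


def framework_applies(xml_text):
    """True only when the document declares one of the known FPIs."""
    return public_id(xml_text) in KNOWN_FPIS
```

A document with no DOCTYPE, or with an older FPI, simply yields `False`, which matches the "framework won't be there" behaviour described in this comment.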

This does mean you won't get any functionality for free if you aren't in sync with JATS 1.1/BITS 2.0 - the framework simply won't be there. However, it is still easy to extend the framework to cover any other document sets (whatever their DTDs) as long as you can provide a catalog - and when you do, XSLT CSS etc. will still "mostly work".

Since the DTDs are updated I am closing this Issue. I will however open a separate issue on remaining questions regarding bindings to document types.

Thanks, I will update, test a little bit on my side and if I have feedback I'll write it here.

We have an automated test which detects whether DTDs referenced via XML Catalogs are present. It reported some problems for the new JATS catalogs:

        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-journalpublishing1.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-journalpublishing-oasis-article1.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivecustom-mixes1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivecustom-models1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivecustom-classes1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivecustom-modules1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivearticle1.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archivearticle1-mathml3.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archive-oasis-custom-classes1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archive-oasis-custom-modules1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archive-oasis-article1.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-archive-oasis-article1-mathml3.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-articleauthoring1.dtd does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\JATS-mathmlsetup1.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml
        The file:D:\projects\eXml\frameworks\jats\lib\DTD\JATS-DTDs\BITS-add-attributes2.ent does not exist from file:/D:/projects/eXml/frameworks/jats/lib/DTD/JATS-DTDs/catalog-jats-v1-1-no-base.xml 
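
For readers who want to reproduce this kind of check, here is a minimal Python sketch, not the actual oXygen test. It assumes the catalog uses plain `public` and `system` entries with relative `uri` values and no `xml:base`; the function name `dangling_references` is hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "{urn:oasis:names:tc:entity:xmlns:xml:catalog}"


def dangling_references(catalog_path):
    """List the uri values of public/system entries whose target file is missing.

    Simplification: relative URIs are resolved against the catalog's own
    directory, and xml:base is ignored.
    """
    base_dir = os.path.dirname(os.path.abspath(catalog_path))
    missing = []
    for entry in ET.parse(catalog_path).getroot().iter():
        if entry.tag in (CATALOG_NS + "public", CATALOG_NS + "system"):
            uri = entry.get("uri")
            if uri and not os.path.exists(os.path.join(base_dir, uri)):
                missing.append(uri)
    return missing
```

Calling `dangling_references` on a catalog path returns an empty list when every referenced file is present.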

Yes, those dangling links are there by design -- that's the way JATS distributions work.

Is it actually an error if the catalog references files that are not there? If so, JATS will be distributing an erroneous catalog with each distribution, since all of them use the same catalog (while they configure different modules from it). None of these files, FWIW, will be referenced by DTD files actually packaged.

If it is an error, we can comment those lines out.

I am interested in others' take on this. Won't the catalogs as delivered by NCBI with the Blue and Green DTDs also show these issues?

It's just something one of our automated tests noticed after the framework was updated. I can "fix" the test to ignore the problems in the jats catalog.

So you can close this one if you want.

It's a reasonable test -- or at least, not an unreasonable one.

There may be other reasons to comment out the dangling references, however, for example to prevent interference when users extend to include other document types (which include the modules referenced but not included).

So I'm not going to close the issue just yet, while I think about it.

Yes, I would suggest that the catalogs should be fixed. Resolvers look for the first match, and then stop. So it wouldn't matter if another catalog in the chain had the same IDs and resolved them successfully: the resolver would never get there.
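
The first-match behaviour can be shown with a toy Python model in which each catalog is just a dictionary from public ID to URI. The identifiers and paths are hypothetical, and real resolvers handle much more (system IDs, delegation, `nextCatalog`), so this is only a sketch of the shadowing problem.

```python
def resolve(public_id, catalogs):
    """Return the URI from the first catalog that has an entry for public_id.

    The resolver never checks whether the target file exists, so a dangling
    entry in an earlier catalog shadows a valid entry in a later one.
    """
    for catalog in catalogs:
        if public_id in catalog:
            return catalog[public_id]  # first match wins
    return None


# Hypothetical FPI and paths, for illustration only.
GREEN_FPI = "-//EXAMPLE//DTD Green (illustrative)//EN"

# Framework catalog: the entry exists, but the file is not packaged (dangling).
framework_catalog = {GREEN_FPI: "JATS-DTDs/JATS-archivearticle1.dtd"}
# User's own catalog: same ID, pointing at a copy that really exists locally.
local_catalog = {GREEN_FPI: "/home/user/dtds/JATS-archivearticle1.dtd"}

resolved = resolve(GREEN_FPI, [framework_catalog, local_catalog])
print(resolved)  # the framework's dangling path, not the user's copy
```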

Indeed. And while one might suppose that the simplest way to include Green (for example) in JATSKit would be, after adding FPI bindings, simply to add the Green modules into the JATS-DTDs directory with all the others ... it is also likely that people will simply want to add their own paths to where they have the Green DTD (and a copy of the catalog), which would cause interference. (They will do this especially if that's what we tell them to do, which is what I've told them so far, hoping to stay out of the weeds.)

I wonder if it would be a bad idea for oXygen to keep looking in the next catalog when it gets errors for file not found from a catalog lookup? Or would such a resolver be non-conformant with relevant standards (OASIS XML Catalogs)?

There is now a new catalog, a level up and named jatskit-catalog.xml so it can be found.

This is better. Thanks.

👍