ubleipzig/intermediateschema

added missing fields that are utilised in span

zazi opened this issue · 2 comments

zazi commented

currently, these are:

  • x.indicator
  • x.packages
  • x.labels
  • x.oa
  • x.license

see here

miku commented

I re-add the comment from #11:

The x was something of an ad-hoc namespace that I added at the start of the project, meaning something like: not yet fully specified, to be specified in some later version. The intermediate schema is still at 0.9, and we have had several discussions about removing it altogether to simplify things (get rid of the blob server, get rid of extra conversion steps, and so on).

The intermediate schema was not complete from the beginning, so a few fields (open access, packages, ...) were added over time. Now, I believe, it is debatable whether we need another schema at all when we already have a relatively stable format in the target schema.xml.

If we manage to drop the intermediate schema, we might be able to drop the blob server (freeing infrastructure), get rid of the extra conversion steps to the intermediate format and to Solr (saving IO), and in general have an easier way forward with regard to collaboration. We should discuss this together, and I believe there will be an opportunity for this soon.

zazi commented

Thanks a lot for your comprehensive and interesting insights. Nevertheless, a search index schema (such as the finc Solr schema) differs somewhat in design and number of fields from a simple schema that holds the relevant information for describing bibliographic resources. Hence, I would still argue for an independent (as simple as possible) schema for describing bibliographic resources, e.g., as a JSON schema (instead of a very technology-dependent search index schema). Even the existing finc Solr schema can be simplified if we go for client-specific search index schemata (with fewer properties overall ;) ).
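
To make the idea of a small, technology-independent schema a bit more concrete, here is a rough sketch of how the x.* fields listed above could be described and checked. It is only an illustration: the field types, the helper name `check_record`, and the sample record are assumptions, since this thread names the fields but does not specify their types.

```python
import json

# Hypothetical field types: the thread names these fields but not their types.
X_FIELDS = {
    "x.indicator": str,
    "x.packages": list,
    "x.labels": list,
    "x.oa": bool,
    "x.license": str,
}


def check_record(record):
    """Return a list of type problems found in the x.* fields of a record."""
    problems = []
    for name, expected in X_FIELDS.items():
        if name not in record:
            continue  # every x.* field is treated as optional in this sketch
        value = record[name]
        if not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Made-up sample record, not taken from real data.
record = json.loads('{"x.oa": true, "x.packages": ["pkg-a"], "x.license": 42}')
print(check_record(record))
```

A real JSON schema would also cover required fields, nested structures and item types; the point here is only that such a description can stay independent of the search index.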