gbv/cocoda-sdk

Add MyCoRE classification provider

Closed this issue · 4 comments

MyCoRE is used by VZG and other organizations for digital collections (around 70 instances). The software includes support for classifications (see documentation in German) with

  • import, export, and API access for classifications in MyCoRE classification XML format
  • a web interface for editing classifications
  • classification mapping (in particular to those listed at http://mycore.de/classifications/)

The MyCoRE API includes three methods, with responses in XML or JSON:

  1. list all classifications
  2. get a classification
  3. get a category (aka concept) - does not seem to be available yet

Example API calls at https://bibliographie.schleswig-holstein.de:

I've added the MyCoRE Classification API to BARTOC, although for a particular vocabulary it's just a plain URL such as https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json

As most "standard" vocabularies seem to be imported in many MyCoRE instances, BARTOC should only list vocabularies original to a MyCoRE instance.

The JSON format based on MyCoRE classification data model is:

  • .labels
    • labels (when language code not starting with x-) and notation (if label starts with value of ID and a space)
    • uri (x-uri) or identifier if repeated
    • mappings (x-mapping). No support of mapping types
    • ??? x-topic
  • categories : list of top (root) or narrower concepts
  • description : scopeNote
  • url : url
  • ID : local identifier, used to construct full URIs
  • counter : optional (seems to be an occurrence count?) - ignore for now
  • service : optional (?) - ignore for now
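The label handling listed above could be sketched roughly as follows. This is only a sketch under assumptions, not the provider's actual code: it assumes each entry in `labels` is an object with `lang` and `text` fields (the list above does not spell out that shape), that the notation equals the ID, and that the ID prefix is stripped from the label. The function name `labels_to_jskos` and the sample data are hypothetical, and `x-mapping` and `x-topic` are simply skipped.

```python
def labels_to_jskos(category):
    """Turn the `labels` of one category into JSKOS-like fields.

    Assumption: each label looks like {"lang": "de", "text": "011000 Allgemeines"}.
    """
    concept_id = category["ID"]
    result = {"prefLabel": {}, "identifier": []}
    for label in category.get("labels", []):
        lang, text = label.get("lang", ""), label.get("text", "")
        if lang == "x-uri":
            # The first x-uri becomes the URI, repeated ones become identifiers.
            if "uri" not in result:
                result["uri"] = text
            else:
                result["identifier"].append(text)
        elif lang.startswith("x-"):
            # x-mapping, x-topic and other private codes are skipped in this sketch.
            continue
        else:
            prefix = concept_id + " "
            if text.startswith(prefix):
                # Label starts with the ID and a space: use the ID as notation
                # and keep the rest as the label (an assumption of this sketch).
                result["notation"] = [concept_id]
                text = text[len(prefix):]
            result["prefLabel"][lang] = text
    return result


# Hypothetical sample category, made up for illustration only.
example = {
    "ID": "011000",
    "labels": [
        {"lang": "de", "text": "011000 Allgemeines"},
        {"lang": "x-uri", "text": "http://example.org/a"},
        {"lang": "x-uri", "text": "http://example.org/b"},
        {"lang": "x-mapping", "text": "ddc:000"},
    ],
}
converted = labels_to_jskos(example)
```

Whether the ID prefix should really be removed from the label is a design choice that the list above leaves open.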

The full concept URIs must be built from the vocabulary URI and the concept ID (e.g. http://www.mycore.org/classifications/shbib_sachgruppen + 011000 should become http://www.mycore.org/classifications/shbib_sachgruppen/011000).
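As a minimal illustration of that rule (the helper name `concept_uri` is hypothetical, and the sketch assumes IDs need no percent-encoding):

```python
def concept_uri(scheme_uri, concept_id):
    """Join a vocabulary URI and a local concept ID with exactly one slash."""
    return scheme_uri.rstrip("/") + "/" + concept_id


uri = concept_uri(
    "http://www.mycore.org/classifications/shbib_sachgruppen", "011000"
)
```

Stripping a trailing slash first avoids a double slash if a vocabulary URI already ends with one.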

All queries except the list of vocabularies (not required for current use cases, as we use BARTOC to point to particular vocabularies) require us to load the full vocabulary in one HTTP request and cache it. Open question: for how long? Most vocabularies don't change quickly, so a cache lifetime of at least one day should be fine. The same approach could also be used in additional providers for small vocabularies:

  • JSKOS File provider (load JSKOS file with a vocabulary)
  • SKOS File provider (load SKOS file and convert to JSKOS)
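The load-once-and-cache idea could look roughly like this. It is a sketch only: the class name `CachedVocabulary`, the one-day default, and the `load_shbib` helper are assumptions for illustration. The loader and the clock are passed in so the same cache could sit in front of a MyCoRE URL or a JSKOS/SKOS file.

```python
import json
import time
from urllib.request import urlopen


class CachedVocabulary:
    """Load a whole vocabulary once and reuse it until the cache expires."""

    def __init__(self, loader, max_age_seconds=24 * 60 * 60, clock=time.monotonic):
        self._loader = loader            # callable returning the full vocabulary
        self._max_age = max_age_seconds  # default of one day is an assumption
        self._clock = clock              # injectable so expiry can be tested
        self._data = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._max_age
        if expired:
            # One request fetches everything; later calls reuse the result.
            self._data = self._loader()
            self._loaded_at = now
        return self._data


def load_shbib():
    """Example loader for the vocabulary URL mentioned above (not called here)."""
    url = (
        "https://bibliographie.schleswig-holstein.de"
        "/api/v2/classifications/shbib_sachgruppen.json"
    )
    with urlopen(url) as response:
        return json.load(response)
```

With this, `CachedVocabulary(load_shbib).get()` would fetch the vocabulary on first use and serve later calls from memory until a day has passed.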

Alternative: convert MyCoRE classification format to JSKOS and import into JSKOS server

"require to load the full vocabulary in one HTTP request and cache it"

As I see it, https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json already contains all required data, right?

I think as long as vocabularies don't get too big (since we need to keep it all in memory), it should be fine. I could start to implement it tomorrow. In theory, it shouldn't be a big thing as long as the API results are consistent.
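To make the in-memory part concrete, here is a rough sketch of how the loaded tree might be flattened into lookup tables so that single concepts and their ancestors can be answered without further requests. It assumes only the nested `categories` and `ID` fields from the format described above; the function names and the sample tree are hypothetical.

```python
def build_index(vocabulary):
    """Map every category ID to its node and to its parent's ID (None for top concepts)."""
    nodes, parents = {}, {}
    stack = [(category, None) for category in vocabulary.get("categories", [])]
    while stack:
        category, parent_id = stack.pop()
        nodes[category["ID"]] = category
        parents[category["ID"]] = parent_id
        for child in category.get("categories", []):
            stack.append((child, category["ID"]))
    return nodes, parents


def ancestors(parents, concept_id):
    """Return the IDs of all broader concepts, nearest first."""
    chain = []
    current = parents.get(concept_id)
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    return chain


# Hypothetical sample tree, made up for illustration only.
sample = {
    "ID": "demo",
    "categories": [
        {"ID": "01", "categories": [
            {"ID": "0110", "categories": [{"ID": "011000"}]},
        ]},
        {"ID": "02"},
    ],
}
nodes, parents = build_index(sample)
```

Memory use grows linearly with the number of categories, which fits the point that this works as long as vocabularies stay reasonably small.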

Stupid thing, but for completeness: It's "MyCoRe", not "MyCoRE". The provider itself is named correctly, but the documentation might not be fully consistent with it.

Currently this only works with one specified vocabulary per registry (not an issue, since it's used via BARTOC most of the time anyway). Listing of vocabularies is currently not supported. Will be released as experimental in v3.3.0.