Add MyCoRE classification provider
MyCoRE is used by VZG and other organizations for digital collections (around 70 instances). The software includes support for classifications (see documentation in German) with
- import, export and API access of classification in MyCoRE classification XML format
- classification editor web interface
- classification mapping (in particular to those listed at http://mycore.de/classifications/)
The MyCoRE API includes three methods, with responses in XML or JSON:
- list all classifications
- get a classification
- get a category (aka concept) - does not seem to be available yet
Example API calls at https://bibliographie.schleswig-holstein.de:
- https://bibliographie.schleswig-holstein.de/api/v2/classifications.json
- Sachgruppen der Schleswig-Holsteinischen Bibliographie
  - https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json
  - https://bibliographie.schleswig-holstein.de/api/v2/shbib_sachgruppen/011000.json (does not work as expected!)
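The two working calls above follow a simple URL pattern, which could be sketched as below. This is only an illustration: the function names are made up, the `/api/v2/classifications` path is taken from the example URLs, and it is an assumption that other MyCoRE instances expose the same path and return a JSON object.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Path as seen in the example URLs; assumed to be the same on other instances.
API_PATH = "/api/v2/classifications"


def classifications_url(base_url: str) -> str:
    """URL of the JSON list of all classifications of an instance."""
    return f"{base_url.rstrip('/')}{API_PATH}.json"


def classification_url(base_url: str, classification_id: str) -> str:
    """URL of one full classification in JSON."""
    safe_id = quote(classification_id, safe="")
    return f"{base_url.rstrip('/')}{API_PATH}/{safe_id}.json"


def fetch_classification(base_url: str, classification_id: str, timeout: float = 30.0) -> dict:
    """Download and parse one classification (performs a real HTTP request)."""
    url = classification_url(base_url, classification_id)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

The per-category call is left out on purpose, since it does not work as expected yet.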
I've added the MyCoRE Classification API to BARTOC, although for a particular vocabulary it's just a plain URL such as https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json
As most "standard" vocabularies seem to be imported in many MyCoRE instances, BARTOC should only list vocabularies original to a MyCoRE instance.
The JSON format, based on the MyCoRE classification data model, is:
- labels
  - labels (when the language code does not start with x-) and notation (if the label starts with the value of ID and a space)
  - uri (x-uri) or identifier if repeated
  - mappings (x-mapping). No support for mapping types
  - ??? x-topic
- categories: list of top (root) or narrower concepts
- description: scopeNote
- url: url
- ID: local identifier, used to construct full URIs
- counter: optional (seems to be an occurrence count?) - ignore for now
- service: optional (?) - ignore for now
The full concept URIs must be built from the vocabulary URI and the concept ID (e.g. http://www.mycore.org/classifications/shbib_sachgruppen + 011000 should become http://www.mycore.org/classifications/shbib_sachgruppen/011000).
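A rough sketch of the field mapping and the URI rule above, converting one MyCoRE category into a JSKOS-like concept. Several things here are assumptions rather than confirmed facts: that each entry in labels is an object with the keys `lang` and `text`, that the ID prefix should be stripped from the label once it has been used as notation, and that an explicit x-uri takes precedence over the constructed URI. The function name is made up; x-mapping, x-topic, description, url, counter and service are not handled.

```python
def convert_category(category: dict, scheme_uri: str) -> dict:
    """Convert one MyCoRE category (and its subtree) to a JSKOS-like concept."""
    concept_id = category["ID"]
    pref_label = {}
    uris = []
    notation = None

    for label in category.get("labels", []):
        # Assumed label layout: {"lang": "...", "text": "..."}
        lang = label.get("lang", "")
        text = label.get("text", "")
        if lang == "x-uri":
            uris.append(text)
        elif lang.startswith("x-"):
            continue  # x-mapping, x-topic, ...: skipped in this sketch
        else:
            prefix = f"{concept_id} "
            if text.startswith(prefix):
                # Label starts with the ID and a space: treat the ID as notation
                notation = concept_id
                text = text[len(prefix):]
            pref_label[lang] = text

    if not uris:
        # URI rule: vocabulary URI + "/" + concept ID
        uris.append(f"{scheme_uri.rstrip('/')}/{concept_id}")

    concept = {
        "uri": uris[0],
        "inScheme": [{"uri": scheme_uri}],
        "prefLabel": pref_label,
    }
    if len(uris) > 1:
        concept["identifier"] = uris[1:]  # repeated x-uri values
    if notation is not None:
        concept["notation"] = [notation]

    narrower = [convert_category(child, scheme_uri) for child in category.get("categories", [])]
    if narrower:
        concept["narrower"] = narrower
    return concept
```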
All queries except for the list of vocabularies (not required for current use cases, as we use BARTOC to point to particular vocabularies) require loading the full vocabulary in one HTTP request and caching it (open question: for how long? Most vocabularies don't change quickly, so a cache lifetime of at least one day should be ok). This may also be used in additional providers for small vocabularies:
- JSKOS File provider (load JSKOS file with a vocabulary)
- SKOS File provider (load SKOS file and convert to JSKOS)
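The load-once-and-cache idea could look roughly like the sketch below. The class name, the `loader` callback and the injectable `clock` are all hypothetical, and the one-day lifetime is just the value suggested above, not a settled decision.

```python
import time
from typing import Any, Callable, Dict, Tuple

ONE_DAY_SECONDS = 24 * 60 * 60


class VocabularyCache:
    """Keeps fully loaded vocabularies in memory and reloads them after a TTL."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = ONE_DAY_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # e.g. a function doing the single HTTP request
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, url: str) -> Any:
        """Return the cached vocabulary for url, loading it if missing or expired."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        data = self._loader(url)
        self._entries[url] = (now, data)
        return data
```

Because the loader is just a callable, the same cache could sit in front of a JSKOS file or a converted SKOS file as well as the MyCoRE API.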
Alternative: convert MyCoRE classification format to JSKOS and import into JSKOS server
> require to load the full vocabulary in one HTTP request and cache it
As I see it, https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json already contains all required data, right?
I think as long as vocabularies don't get too big (since we need to keep it all in memory), it should be fine. I could start to implement it tomorrow. In theory, it shouldn't be a big thing as long as the API results are consistent.
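As a sketch of what keeping it all in memory might mean in practice: once the full classification is loaded, the category tree can be flattened into a dictionary keyed by concept URI, so that lookups for a concept, its parent, or its children need no further requests. The function name and the shape of the index entries are made up for illustration, and it assumes the nested categories/ID structure described in the issue.

```python
from typing import Dict, List, Optional


def build_index(
    categories: List[dict],
    scheme_uri: str,
    parent_uri: Optional[str] = None,
    index: Optional[Dict[str, dict]] = None,
) -> Dict[str, dict]:
    """Flatten a MyCoRE category tree into {concept URI: entry}."""
    if index is None:
        index = {}
    base = scheme_uri.rstrip("/")
    for category in categories:
        uri = f"{base}/{category['ID']}"
        children = category.get("categories", [])
        index[uri] = {
            "category": category,  # the original, unconverted data
            "broader": parent_uri,  # None for top concepts
            "narrower": [f"{base}/{child['ID']}" for child in children],
        }
        # Recurse into the subtree, passing this concept as the parent
        build_index(children, scheme_uri, uri, index)
    return index
```

Memory use grows with the number of categories, which matches the caveat that this only works while vocabularies stay reasonably small.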
Stupid thing, but for completeness: It's "MyCoRe", not "MyCoRE". The provider is named correctly, but the documentation might not be fully consistent with it.
Currently only works with one specified vocabulary per registry (not an issue since it's used via BARTOC most of the time anyway). Listing of vocabularies is currently not supported. Will be released as experimental in v3.3.0.