mapping-commons/sssom

Align with JSKOS data format

nichtich opened this issue · 4 comments

What a surprise: two independent initiatives for terminology mapping, SSSOM and coli-conc (with its JSKOS format), have existed for years without knowing about each other.

Ontologies are a special case of general knowledge organization systems or terminologies (authority files, taxonomies, classifications, thesauri, gazetteers...), so coli-conc is more general, but its main use case is systems that rank below ontologies and semantic networks (see KOS typology).

A first comparison between SSSOM and JSKOS elements shows large overlap. Independent development of similar solutions to similar problems is a good indicator for a good solution.

| SSSOM | JSKOS |
|---|---|
| Mapping | Mapping |
| MappingSet | Concordance |
| MappingRegistry | Registry |
| mapping server | JSKOS Server |
| Compound concepts (not supported) | Concept Bundles |

JSKOS is defined as a JSON format with a JSON-LD context, so it can be mapped to RDF as well. In contrast to SSSOM, most properties are reused from other ontologies (e.g. Dublin Core and FOAF) instead of minting new URIs. SSSOM should include mappings to JSKOS classes and properties.

Most relevant fields of a mapping record:

A JSKOS mapping can only be transformed to SSSOM if from.memberSet.length == 1 && to.memberSet.length <= 1.

| SSSOM Mapping | JSKOS Mapping |
|---|---|
| record_id (proposed) | uri |
| predicate_id | type |
| subject_id | from.memberSet[0].uri |
| subject_label | from.memberSet[0].prefLabel (repeatable) |
| object_id | to.memberSet[0].uri or sssom:noTermFound |
| object_label | to.memberSet[0].prefLabel[] (repeatable) |
| confidence | mappingRelevance |
| creator_id | creator[].uri (repeatable) |
| creator_label | creator[].prefLabel[] (repeatable) |
| mapping_date | created |
| publication_date | issued |
| predicate_modifier | missing in JSKOS! |
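
To make the field correspondence concrete, here is a minimal Python sketch of the conversion, including the convertibility condition stated above. It is an illustration only, not the sssom-py API: the function names `jskos_to_sssom` and `pick_label` are hypothetical, the input shape (`type` as a list of URIs, `prefLabel` as a language map) is an assumption about the JSKOS record, and joining repeatable values with `|` and preferring an English label are arbitrary choices made for the example.

```python
from typing import Optional

# Sentinel for "no target found", spelled as in the table above.
NO_TERM_FOUND = "sssom:noTermFound"


def pick_label(item: Optional[dict], lang: str = "en") -> Optional[str]:
    """Return one label from a language map such as {"en": "..."} (assumed shape)."""
    labels = (item or {}).get("prefLabel") or {}
    if lang in labels:
        return labels[lang]
    # Fall back to any available language, or None if there is no label.
    return next(iter(labels.values()), None)


def jskos_to_sssom(mapping: dict) -> Optional[dict]:
    """Convert one JSKOS mapping to an SSSOM-style row, or None if not convertible."""
    sources = mapping.get("from", {}).get("memberSet", [])
    targets = mapping.get("to", {}).get("memberSet", [])
    # Convertible only with exactly one source and at most one target.
    if len(sources) != 1 or len(targets) > 1:
        return None
    subject = sources[0]
    target = targets[0] if targets else None
    types = mapping.get("type") or []
    creators = mapping.get("creator") or []
    return {
        "record_id": mapping.get("uri"),
        "predicate_id": types[0] if types else None,
        "subject_id": subject.get("uri"),
        "subject_label": pick_label(subject),
        # An empty target set becomes a null mapping.
        "object_id": target.get("uri") if target else NO_TERM_FOUND,
        "object_label": pick_label(target),
        "confidence": mapping.get("mappingRelevance"),
        "creator_id": "|".join(c["uri"] for c in creators if c.get("uri")),
        "creator_label": "|".join(filter(None, (pick_label(c) for c in creators))),
        "mapping_date": mapping.get("created"),
        "publication_date": mapping.get("issued"),
    }


# Made-up example record using example.org URIs.
example = {
    "uri": "http://example.org/mappings/1",
    "type": ["http://www.w3.org/2004/02/skos/core#exactMatch"],
    "from": {"memberSet": [{"uri": "http://example.org/a/1", "prefLabel": {"en": "Cats"}}]},
    "to": {"memberSet": [{"uri": "http://example.org/b/7", "prefLabel": {"de": "Katzen"}}]},
    "mappingRelevance": 0.8,
    "creator": [{"uri": "http://example.org/people/x", "prefLabel": {"en": "X"}}],
    "created": "2020-01-01",
}
row = jskos_to_sssom(example)
```

Mappings with several source or target concepts return `None` here rather than being split into several rows, since splitting would change their meaning.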

Open questions:

  • How are mappings beyond 1-to-1 expressed in SSSOM?
  • How are null-mappings expressed in SSSOM?

> How are mappings beyond 1-to-1 expressed in SSSOM?

There are many kinds of 1-to-1 mappings, and I generally recommend that the people I work with stop using that term.

If you mean:

  1. 3 specific terms in one space map to a single more general term in another space: there is not much to model. They are all broad matches, and if you like, you can use @gouttegd's much-beloved (I share his enthusiasm) mapping_cardinality field.
  2. 3 specific terms in one space have to be combined into a single, post-coordinated expression to map to another term (e.g. severe + Alzheimer --> Alzheimer). At the moment we call this a "complex mapping", and it is currently outside of SSSOM. Internally, we have developed a system for representing such expressions using JSON-URL; the specification can be found here: https://github.com/monarch-initiative/uri-expression-language. But this has no bearing on SSSOM, because you can simply use http://my.org/schema/0001/(disease1:'MONDO:0005148',disease2:'MONDO:0000960',excluded:'HP:0100758') as a subject_id in an SSSOM file. As I said, this is not SSSOM, and half the SSSOM team is against its existence.
  3. Data structure mappings: a person with a birthdate, a first name and a surname in one data model maps to a person with a birthyear and a single name string in another. This is totally out of scope for SSSOM and unlikely to be covered by the core model; see https://github.com/linkml/linkml-map for how we do this internally.
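
As a rough illustration of case 1, the sketch below derives a cardinality value for each row of a mapping set. It assumes rows are plain dicts with `subject_id` and `object_id`, and that the values follow the "1:1", "1:n", "n:1", "n:n" pattern, where the left side counts subjects per object and the right side counts objects per subject. The function name `add_mapping_cardinality` and the identifiers in the example are hypothetical, and this is not necessarily how sssom-py fills the field.

```python
from collections import defaultdict


def add_mapping_cardinality(rows):
    """Return copies of the rows with a derived mapping_cardinality value."""
    objects_per_subject = defaultdict(set)
    subjects_per_object = defaultdict(set)
    for row in rows:
        objects_per_subject[row["subject_id"]].add(row["object_id"])
        subjects_per_object[row["object_id"]].add(row["subject_id"])

    result = []
    for row in rows:
        # Left side: how many subjects map to this row's object.
        left = "1" if len(subjects_per_object[row["object_id"]]) == 1 else "n"
        # Right side: how many objects this row's subject maps to.
        right = "1" if len(objects_per_subject[row["subject_id"]]) == 1 else "n"
        result.append({**row, "mapping_cardinality": f"{left}:{right}"})
    return result


# Three specific terms that all map to one more general term.
rows = [
    {"subject_id": "A:1", "predicate_id": "skos:broadMatch", "object_id": "B:9"},
    {"subject_id": "A:2", "predicate_id": "skos:broadMatch", "object_id": "B:9"},
    {"subject_id": "A:3", "predicate_id": "skos:broadMatch", "object_id": "B:9"},
]
annotated = add_mapping_cardinality(rows)
```

In this example every row ends up with the value "n:1": each subject has a single object, while the object is shared by three subjects.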

> How are null-mappings expressed in SSSOM?

Another tough one to nail down unfortunately:

I personally have been using:

:A skos:exactMatch sssom:noTermFound, but this has not been approved into the standard, despite a sparsely attended vote in issue #28. Feel free to comment on the issue if you want us to move it into the standard faster.
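
A consumer of such rows could separate null mappings from regular ones along these lines. This is only a sketch under the assumption that the sentinel appears as the `object_id` of an otherwise normal row; the helper names are hypothetical. Because the sentinel is not yet standardized and is spelled with different capitalization in this discussion, the comparison is case-insensitive.

```python
# Lower-cased form of the sentinel, so both spellings are recognized.
NULL_OBJECT = "sssom:notermfound"


def is_null_mapping(row):
    """True if the row states that no matching term was found for the subject."""
    return str(row.get("object_id", "")).lower() == NULL_OBJECT


def split_null_mappings(rows):
    """Return (regular rows, subject ids for which no term was found)."""
    mapped = [r for r in rows if not is_null_mapping(r)]
    unmapped = [r["subject_id"] for r in rows if is_null_mapping(r)]
    return mapped, unmapped


rows = [
    {"subject_id": "A:1", "predicate_id": "skos:exactMatch", "object_id": "B:1"},
    {"subject_id": "A:2", "predicate_id": "skos:exactMatch", "object_id": "sssom:noTermFound"},
]
mapped, unmapped = split_null_mappings(rows)
```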

Thanks a ton for doing this @nichtich! It would be great if we could include a JSKOS parser in sssom-py when your mapping is complete!

Thanks for the clarification:

> "complex mapping", which is currently outside of SSSOM.

OK, so these won't be convertible from JSKOS to SSSOM (but they make up a minority of mappings anyway).

> sssom:NoTermFound

Looks good to me. Again, these mappings are not frequent.

Negative Mappings are not supported in JSKOS (yet).