INCATools/boomer

accept SSSOM format

wdduncan opened this issue · 14 comments

in addition to the ptable, also accept SSSOM mappings

cc @cmungall

Maybe let’s wait until we have actually agreed on anything.

I think we agree on the majority, which could still be implemented, though there is no rush

Would be good to think about boomer reqs, both for input and output

I commented on the SSSOM about whether complex mapping axioms will be supported.

Many of the cases in SSSOM (such as "Exact Match") are already covered by SKOS. It seems like we are reinventing the wheel in such cases. Why not use SKOS terms when what we are looking for is present in SKOS?

@balhoff I would like to allow complex mappings, but this requires allowing subjects and objects to be tuples that map to class expressions. While this is technically doable, it will make discussing this proposal with the community much harder. For now I would like to avoid defining a standard that looks so complicated that no one will adopt it. Technically speaking, all we need is:

  1. Allow subject and object to be a tuple like PATO:001|UBERON:003, and
  2. new metadata elements subject_pattern and object_pattern that allow the inclusion of parseable Manchester syntax expressions or PURLs to DOSDP patterns.

We can build this in right at the outset, but we can also agree to do it later, once the simpler format is agreed on (simple, but already quite extensive; more than I had hoped).
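As a rough illustration of point 1, a tuple-valued subject or object cell could be split on the `|` separator. This is only a sketch: the function name `parse_term_tuple` is hypothetical, and nothing like it is defined by SSSOM or boomer.

```python
def parse_term_tuple(cell: str) -> tuple[str, ...]:
    """Split a cell such as 'PATO:001|UBERON:003' into its individual CURIEs.

    Hypothetical helper: the '|' separator follows the example in this thread.
    """
    parts = tuple(part.strip() for part in cell.split("|"))
    if any(not part for part in parts):
        raise ValueError(f"empty term in tuple: {cell!r}")
    return parts


print(parse_term_tuple("PATO:001|UBERON:003"))
```

A plain single-term cell passes through as a one-element tuple, so simple mappings would not need special handling.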

@wdduncan most of the SSSOM match type classes that look like SKOS are actually only grouping classes for the concrete match types (like "match on label"). SKOS annotations are allowed as predicates, and the match type really qualifies the method by which the match was obtained. But it's not yet perfectly straightened out, I have to admit.

note: sssom-py can be used to translate sssom to ptables

@cmungall @matentzn is there a place in SSSOM to hold the probabilities? It doesn't seem like it fits, given that the 4 probabilities represent 4 different predicates. How should that be handled?

I use confidence for probability; however, this over-interprets the column, which doesn't have precise semantics in the spec

Also there would be a separate row for each interpretation of a pair of terms

@cmungall I agree that, if we are very formal, confidence and probability are different. However, how would you characterise the difference in your own words? If it's significant, we could add a field for probability. For me, confidence is a subjective measure that states "I am 70% sure this mapping holds". I can sense the difference, but I don't quite understand how it differs in practice from "This relationship holds with 70% probability". Maybe the latter is a statement about the domain: in 7 out of 10 cases the relation holds, and in 3 out of 10 it does not? Maybe that's it.

Or do you mean something like confidence interval?

Didn't even think of that...

Let's just have a convention:

  • confidence -> probability in boomer
  • 4 predicates:
    • owl:equivalentClass
    • rdfs:subClassOf -- treated as proper, by convention
    • sssom:superClassOf -- see mapping-commons/sssom#38
    • skos:relatedMatch

This allows SSSOM as an alternative to ptables
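A minimal sketch of this convention, collapsing the per-predicate SSSOM rows for each term pair into a single ptable-style row. Several things here are assumptions rather than anything stated in this thread: the SSSOM column names (`subject_id`, `predicate_id`, `object_id`, `confidence`), the order of the four probability columns, the function name `sssom_to_ptable`, and the choice to fill a missing predicate with 0.0. It does not use sssom-py, and it does not check that the four values sum to 1.

```python
import csv
import io
from collections import defaultdict

# Assumed order of the four probability columns in the output row.
PTABLE_PREDICATES = [
    "rdfs:subClassOf",      # treated as proper, by convention
    "sssom:superClassOf",
    "owl:equivalentClass",
    "skos:relatedMatch",
]


def sssom_to_ptable(sssom_tsv: str) -> list[list[str]]:
    """Group SSSOM rows by (subject, object) and emit one row per pair."""
    reader = csv.DictReader(io.StringIO(sssom_tsv), delimiter="\t")
    probs = defaultdict(dict)
    for row in reader:
        predicate = row["predicate_id"]
        if predicate not in PTABLE_PREDICATES:
            continue  # predicates outside the convention are skipped
        pair = (row["subject_id"], row["object_id"])
        # confidence is read as a probability, per the convention above
        probs[pair][predicate] = float(row["confidence"])

    table = []
    for (subject, obj), by_predicate in probs.items():
        # Assumption: a predicate with no row for this pair gets 0.0.
        values = [str(by_predicate.get(p, 0.0)) for p in PTABLE_PREDICATES]
        table.append([subject, obj] + values)
    return table


example = (
    "subject_id\tpredicate_id\tobject_id\tconfidence\n"
    "A:1\trdfs:subClassOf\tB:1\t0.1\n"
    "A:1\tsssom:superClassOf\tB:1\t0.05\n"
    "A:1\towl:equivalentClass\tB:1\t0.8\n"
    "A:1\tskos:relatedMatch\tB:1\t0.05\n"
)
for ptable_row in sssom_to_ptable(example):
    print("\t".join(ptable_row))
```

This reflects the point made earlier that there is a separate SSSOM row for each interpretation of a pair of terms: four input rows become one output row.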