provide URIs for synonym types within OMO
cmungall opened this issue · 8 comments
Before commenting, read the details on how many ontologies like GO, Mondo, Uberon, HPO, CL manage synonyms. We are lacking good central OMO docs on this, but for now you can read https://github.com/obophenotype/uberon/wiki/Using-uberon-for-text-mining
Briefly:
- there are 4 different predicates from the oio vocabulary used (oio:has{Exact,Broad,Narrow,Related}Synonym). These are sometimes called synonym scopes for historic reasons, but formally they can be better thought of as the primary predicate connecting entity and literal - synonym triples (in owl: annotation assertions) are constructed between an entity and a literal using one of these primary predicates
- each assertion can be optionally annotated (using standard owl annotation vocab) with any additional metadata. This could include info such as which PMID the syn came from or which curator provided the synonym. For our purposes here we are concerned with the oio:hasSynonymType predicate, which links the synonym assertion to a URI that lumps synonyms into broad categories, e.g.
  - abbreviations
  - layperson synonyms
  - ...
Effectively this means that synonym assertions are classified along two largely orthogonal axes. The primary predicate (scope) is used by most NLP/TM tools and search portals (see this doc), whereas the synonymType provides a second classification that is used for other purposes.
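A rough sketch of the two axes described above, in plain Python rather than an OWL library. The class name `SynonymAssertion`, its fields, the entity ID and the synonym strings are all hypothetical and only illustrate the shape of the data; they are not taken from any ontology or tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# The four primary predicates (scopes) from the oio vocabulary.
SCOPES = {
    "oio:hasExactSynonym",
    "oio:hasBroadSynonym",
    "oio:hasNarrowSynonym",
    "oio:hasRelatedSynonym",
}


@dataclass(frozen=True)
class SynonymAssertion:
    """One synonym triple plus its optional axiom annotation (hypothetical model)."""

    entity: str                         # subject of the annotation assertion
    scope: str                          # primary predicate, one of SCOPES
    literal: str                        # the synonym text
    synonym_type: Optional[str] = None  # object of oio:hasSynonymType, if annotated

    def __post_init__(self) -> None:
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope}")


# Made-up example data; the entity ID and strings are placeholders.
synonyms = [
    SynonymAssertion("EX:0000001", "oio:hasExactSynonym", "tummy ache", "obo:hp.obo#layperson"),
    SynonymAssertion("EX:0000001", "oio:hasExactSynonym", "abdominal pain"),
    SynonymAssertion("EX:0000001", "oio:hasRelatedSynonym", "AP", "obo:hp.obo#abbreviation"),
]

# Axis 1 is the scope, axis 2 is the synonym type; each can be tallied on its own.
by_scope = Counter(s.scope for s in synonyms)
by_type = Counter(s.synonym_type for s in synonyms)
```

A text-mining tool would typically only look at `by_scope`, while something like a layperson-facing view would filter on `synonym_type` regardless of scope.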
The synonym types are very important for some ontologies (e.g. layperson synonyms and HPO - see https://www.nature.com/articles/s41588-018-0096-x for background)
The purpose of this issue is not to discuss this overall data model, but purely to discuss the value of the synonymType axiom annotations.
Currently these have horrible hash URIs. This is a legacy of conventional usage of these in OBO format https://owlcollab.github.io/oboformat/doc/obo-syntax.html#5.0.2
Some examples:
- obo:chebi#BRAND_NAME
- obo:chebi#SMILES
- obo:hp.obo#abbreviation
- obo:hp.obo#layperson
- obo:po#Japanese
- obo:tax#common_name
- obo:tax#genbank_common_name
- obo:uberon/core#HUMAN_PREFERRED
- obo:uberon/core#INCONSISTENT
- obo:uberon/core#LATIN
- obo:uberon/core#MISSPELLING
Full list is attached
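To make the "hash URI" point concrete, here is a small sketch that expands a few of the CURIEs above and checks for a fragment. It assumes the obo: prefix expands to the standard OBO PURL base; the helper names `expand` and `is_hash_uri` are made up for this example.

```python
# Assumption: "obo:" abbreviates the standard OBO PURL base.
OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def expand(curie: str) -> str:
    """Expand an obo: CURIE to a full URI; other input is returned unchanged."""
    if curie.startswith("obo:"):
        return OBO_PREFIX + curie[len("obo:"):]
    return curie


def is_hash_uri(uri: str) -> bool:
    """True when the identifier carries a '#', as the legacy synonym types do."""
    return "#" in uri


examples = ["obo:chebi#BRAND_NAME", "obo:hp.obo#layperson", "obo:uberon/core#LATIN"]
for curie in examples:
    uri = expand(curie)
    print(uri, is_hash_uri(uri))
```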
Some of these reflect bad practice - e.g. inappropriate use to designate language synonyms or chemical formulae. But some reflect common use cases, and it would be good to use a common non-hash-based URI across multiple ontologies, and OMO seems a good home.
There are arguments for making some of these primary predicates, e.g. abbreviation. However, this can lead to predicate lattices, e.g. exactLaypersonSynonym is-a {LaypersonSynonym, ExactSynonym}.
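A quick illustration of how such a lattice grows if types were folded into the primary predicate. The combined predicate names are hypothetical (none of them exist in oio), and the two types are just an illustrative subset.

```python
from itertools import product

scopes = ["Exact", "Broad", "Narrow", "Related"]
types = ["Layperson", "Abbreviation"]  # illustrative subset only

# Each combined predicate would need two parents, one per axis.
lattice = {
    f"has{scope}{kind}Synonym": (f"has{kind}Synonym", f"has{scope}Synonym")
    for scope, kind in product(scopes, types)
}

for child, parents in lattice.items():
    print(child, "is-a", parents)
```

With 4 scopes, every additional type adds 4 more combined predicates, whereas the axiom-annotation approach only adds one new synonym type.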
For synonym types that are truly local to an ontology, we should come up with a scheme for providing identifiers for these that does not result in hashes. One possibility is to use the primary ID space from the ontology, but these have historically been used to denote entities within the ontology's domain rather than metadata elements.
> One possibility is to use the primary ID space from the ontology, but these have historically been used to denote entities within the ontology's domain rather than metadata elements.
I think it's fine, we also do this sometimes. I don't really believe that this should be necessary - how many synonym types only make sense in a local context?
I would even say such synonyms should not be annotated with the normal synonyms.
Can someone say which class each "synonym type" should go under as a subclass?
Not a class (a property), but to preserve current parsers etc. it should go under oboInOwl:hasSynonymType.
I don’t understand - shouldn’t a synonym type be the object of this property?
Haha sorry, forgot how confusing this is. This is a design pattern, and an anti-pattern, that snuck into our stack 10 years ago. Synonym types and subset declarations, both of which would conceptually be best represented as owl:Individuals, were instead represented as owl:AnnotationProperty. A part of @cmungall now regrets this decision, but that is how it was done. So synonym types are represented as OWL annotation properties - unless we change that entirely in the new OBO 1.6 spec, which could mean quite a bit of churn, but is possible.
alright thanks for the suggestion. See #124 for the first few additions I made and let me know if I'm on the right track
There are still some synonym types from Chris's list left, but I think the core idea from this issue has been addressed. It might be good to add something like an unstandardized synonym checker to the OQUAT dashboard (tracked in biopragmatics/oquat#10).
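A minimal sketch of what such a check might look like. This is not OQUAT's implementation; the function name is made up, and it assumes that standardized synonym types are exactly those minted under the OMO PURL prefix.

```python
from collections import Counter

# Assumption: standardized synonym types are minted in the OMO ID space.
OMO_PREFIX = "http://purl.obolibrary.org/obo/OMO_"


def unstandardized_types(type_uris):
    """Count synonym type URIs that are not in the assumed OMO ID space."""
    return Counter(uri for uri in type_uris if not uri.startswith(OMO_PREFIX))


# Placeholder input; OMO_0000000 is a made-up ID used only for illustration.
seen = [
    "http://purl.obolibrary.org/obo/OMO_0000000",
    "http://purl.obolibrary.org/obo/hp.obo#layperson",
    "http://purl.obolibrary.org/obo/hp.obo#layperson",
    "http://purl.obolibrary.org/obo/uberon/core#LATIN",
]
report = unstandardized_types(seen)
for uri, count in report.most_common():
    print(count, uri)
```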