information-artifact-ontology/ontology-metadata

Request for has_acronym

csbjohnson opened this issue · 23 comments

IRI

No response

Label

Acronym (has_acronym)

Definition of the property

I'd like to request the creation of has_acronym, to be able to specify when an acronym is being used.

Would truly appreciate your time and assistance. Thank you!

Best,
Claudia Sánchez-Beato Johnson

Parent property

No response

What is the range of the property in question?

xsd:string

Examples of use

DOID:0050214
Name: Lambert-Eaton myasthenic syndrome

Synonyms label: Eaton-Lambert syndrome [EXACT], Lambert-Eaton syndrome [EXACT], LEMS [EXACT]

Eaton-Lambert syndrome [EXACT] and Lambert-Eaton syndrome [EXACT] are synonyms that use different words, whereas LEMS does not provide a different word; it abbreviates the name given for the DOID. Representing an acronym as a synonym is therefore not accurate, and it would be truly beneficial to be able to classify it under its own label.

Motivation to add

In order to represent acronyms under their own label rather than as synonyms, since an acronym is not a different word but an abbreviation of the same term.

ORCID, ROR or Wikidata identifier of the contributor

/

OMO Checklist

  • I believe the property is generally useful beyond my specific ontology needs.
  • There is no other property in OMO that covers the same use case.

This would be a great addition, IMO. But what is the intended domain of the relation? It seems to me that it should be related as an annotation to the fully spelled out label. There's a question in my mind as to whether there should be a direct relation as an alternative label of the term. But to the extent that an acronym could be of a label that is any of a variety of types of synonyms, I don't think we want to replicate the various synonym properties specialized to acronyms.

Thank you for the request!

Generally, we consider acronyms, abbreviations, etc "synonym types".

There are several such synonym types in OMO already, and as discussed here, we should definitely add acronym to the list, and I am happy to do it.

However, you would not proceed to say:

DOID:0050214 "has acronym" LEMS

If you use this pattern, you would say (pseudocode, given a new synonym type OMO:123 "acronym"):

DOID:0050214 "has exact synonym" LEMS [oio:SynonymType=OMO:123]

Find some examples here:

https://api.triplydb.com/s/FIf2uoYo9
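As a rough illustration of the pattern above, here is a minimal Python sketch that keeps the synonym type as an annotation on the synonym assertion and then separates acronyms from other synonyms. It is only a sketch: `OMO:123` is the placeholder identifier from the pseudocode, and the `Synonym` class and `split_by_type` helper are hypothetical names, not part of OMO or any library.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder identifier from the pseudocode above; not a real OMO IRI.
ACRONYM_TYPE = "OMO:123"


@dataclass(frozen=True)
class Synonym:
    """One synonym assertion plus its optional synonym-type annotation."""
    term: str                           # e.g. "DOID:0050214"
    predicate: str                      # e.g. "has exact synonym"
    text: str                           # the synonym string itself
    synonym_type: Optional[str] = None  # annotation on the assertion


synonyms = [
    Synonym("DOID:0050214", "has exact synonym", "Eaton-Lambert syndrome"),
    Synonym("DOID:0050214", "has exact synonym", "Lambert-Eaton syndrome"),
    Synonym("DOID:0050214", "has exact synonym", "LEMS", ACRONYM_TYPE),
]


def split_by_type(items, synonym_type):
    """Return (matching, non_matching) synonym strings for one synonym type."""
    matching = [s.text for s in items if s.synonym_type == synonym_type]
    others = [s.text for s in items if s.synonym_type != synonym_type]
    return matching, others


acronyms, other_synonyms = split_by_type(synonyms, ACRONYM_TYPE)
print(acronyms)
print(other_synonyms)
```

The acronym stays an exact synonym, so anything that indexes exact synonyms still finds it, while the type annotation is what a consumer filters on.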

Another closely related issue is #122

Would you be ok with using this pattern as well?

Thank you for the pattern provided. Encoding it that way would still not allow a separation of true synonyms from acronyms. I'd like to keep my request for "has_acronym synonym", as I believe it is a direct and more accessible way to separate true synonyms from acronyms. Without it, it wouldn't be user friendly, since acronyms would lack visibility and be more complicated to query.

This addition would serve great value to the community.

Thank you for your time.

Best,
Claudia Marie

would still not allow a separation of true synonyms from acronyms

How so? They are clearly separated, albeit in a bit of a cumbersome manner..

it wouldn't be user friendly

This is true.. But I don't know then how exactly I should play my role here as a shepherd for OMO.

This is the dilemma:

  1. There is a pattern in OMO for synonym types
  2. A user (you) does not like the pattern (for a good reason!) and suggests to implement another parallel pattern (it is, after all, a pattern, as it would open the floodgates for "has abbreviation", "has layperson synonym" etc).

So either I violate my community-entrusted role of evolving a coherent OMO with a single way of doing things, or I violate my dedication to you, the user, both of which are equally important to me!

So unless you can convince me that an acronym is actually not a synonym at all, I am afraid you will have to find someone in OBO to muscle your request past me (which is definitely possible!). Or else tell me how to solve my conundrum. :( Sorry, sorry.

@csbjohnson - did you have any further thoughts on what I was saying? Disagreement with something specific? Ideas on how to move forward? I really don't like when community members such as yourself take the time to reach out to an ontology and make a request (which is really great), just to be rebuffed by technical people like myself for formal reasons.. I just need a good reason not to stick to the past design decision!

Hi @matentzn, thank you for your time. I believe that this implementation is valuable.

Please see following definitions of acronym and synonym:

-Acronym: An acronym is a word or name formed as an abbreviation from the initial components in a phrase or a word, usually individual letters (as in NATO or laser) and sometimes syllables (as in Benelux).

-Synonym: A synonym is a word or phrase that means exactly or nearly the same as another word or phrase in the same language. Some lexicographers claim that no synonyms have exactly the same meaning (in all contexts or social levels of language) because etymology, orthography, phonic qualities, ambiguous meanings, usage, and so on make them unique. Different words that are similar in meaning usually differ for a reason: feline is more formal than cat; long and extended are only synonyms in one usage and not in others (for example, a long arm is not the same as an extended arm).

-Examples in which the requested has_acronym addition would help prevent inaccuracies:

  • The gene symbol AHR (aryl hydrocarbon receptor) and the disease/phenotype AHR (airway hyperresponsiveness)

  • MEN1: Multiple endocrine neoplasia type 1 (a genetic disorder characterized by the development of tumors in multiple endocrine glands). MEN1: Menin 1 (gene associated with multiple endocrine neoplasia type 1)

  • NOD2: Nucleotide-binding oligomerization domain-containing protein 2 (gene associated with Crohn's disease and Blau syndrome). NOD2: NOD-like receptor family, pyrin domain containing 2 (a protein involved in innate immunity)

  • PARK2: Parkinson protein 2, E3 ubiquitin-protein ligase (gene associated with autosomal recessive early-onset Parkinson's disease). PARK2: Parkinson's disease 2 (a type of Parkinson's disease caused by mutations in the PARK2 gene)

  • APP: Amyloid precursor protein (gene associated with Alzheimer's disease). APP: Atrial natriuretic peptide precursor (a hormone involved in the regulation of blood pressure and fluid balance)

  • RET: Rearranged during transfection (gene associated with multiple endocrine neoplasia type 2 and familial medullary thyroid cancer). RET: Retinal (related to the retina)

  • TP53: Tumor protein p53 (gene associated with various cancers). TP53: T-cell prolymphocytic leukemia (a type of leukemia)

We should really have this in the FAQ. For better or worse, OMO/oboInOwl takes a liberal interpretation of "nearly the same as". Our "broad synonym" properties and so on don't make any sense with a stricter reading of "synonym".

However, these properties have been standard for some 25 years or so in ontologies like GO, DO, Uberon, etc.

@csbjohnson thanks for the details.

@cmungall (and others), what is your opinion? Is an acronym a synonym (thereby falling under the purview of the axiom annotating pattern) or is an acronym sufficiently different conceptually from a synonym to justify a separate property?

@bpeters42 I thought I made this clear above: because there are then two different ways to say that something is an acronym, see for example this comment: #135 (comment)

@csbjohnson thanks for the details.

@cmungall (and others), what is your opinion? Is an acronym a synonym (thereby falling under the purview of the axiom annotating pattern) or is an acronym sufficiently different conceptually from a synonym to justify a separate property?

One anecdotal response: An acronym is not a synonym (and vice-versa). They follow different grammatical, syntactical, and conceptual rules. When someone says "I need a synonym for X", I think very few people will answer with an acronym. (Of course, things change, maybe younger people would!)

Both acronyms and synonyms may be suitable labels for something, but so might codes or icons—it doesn't make them synonyms.

Alright, thank you all for the discussion. While I do not exactly agree with the line drawn by you all between synonym and acronym (IMO they are both literals used to refer to a conceptual entity, regardless of whether the word "acronym" is perceived as a "synonym" to the term "synonym" or not :-)), I do see the practical value in separating synonyms from acronyms, for example during QC time (as @csbjohnson points out in various examples, they can overlap significantly and we have numerous examples to support that assumption - and we want to check that we do not assign the same exact synonym to multiple terms).
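The QC check mentioned above (the same exact synonym should not be assigned to multiple terms, while acronyms are allowed to overlap) could look roughly like the following sketch. The term identifiers under the `EX:` prefix, the `OMO:123` acronym type, and the function name are all hypothetical; only the AHR collision comes from the examples earlier in this thread.

```python
from collections import defaultdict

ACRONYM_TYPE = "OMO:123"  # hypothetical identifier for the acronym synonym type

# Rows of (term, exact synonym text, synonym type or None).
# All EX: identifiers are made up for illustration.
exact_synonyms = [
    ("EX:gene_ahr", "AHR", ACRONYM_TYPE),
    ("EX:phenotype_ahr", "AHR", ACRONYM_TYPE),
    ("EX:term_a", "shared name", None),
    ("EX:term_b", "shared name", None),
    ("EX:term_c", "unique name", None),
]


def shared_exact_synonyms(rows, ignore_types=frozenset()):
    """Map each exact synonym used by more than one term to those terms.

    Synonyms whose type is in ignore_types are skipped, so acronyms can be
    excluded from the uniqueness check.
    """
    terms_by_text = defaultdict(set)
    for term, text, synonym_type in rows:
        if synonym_type in ignore_types:
            continue
        terms_by_text[text.lower()].add(term)
    return {
        text: sorted(terms)
        for text, terms in terms_by_text.items()
        if len(terms) > 1
    }


all_clashes = shared_exact_synonyms(exact_synonyms)
non_acronym_clashes = shared_exact_synonyms(exact_synonyms, {ACRONYM_TYPE})
print(all_clashes)
print(non_acronym_clashes)
```

Without the filter the AHR collision is reported alongside the genuine duplicate; with it, only the non-acronym clash remains.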

I have made a PR: #138

Please provide your feedback, and any orcids you want me to add as "contributors".

[Writing this fairly quickly, as these same arguments have been rehashed again and again; apologies for typos/repetition]

As always with any ontology concept, we all like to focus on the string used to describe the concept, rather than the concept itself, and this applies to a metadata ontology as much as a domain ontology.

I fully accept that "synonym" was a bad primary label to choose for the concept under discussion here. It leads people to overly focus on how that string is used in their community rather than the concept itself. I suggest for purposes here we focus on what concepts OMO needs, how they should be organized, and how they should be used by applications.

The oboInOwl synonym predicates are for relating a domain concept to a string that is used by humans as a name, where the relationship is either exact, similar but narrower in some contexts, similar but broader in some contexts, or otherwise related. Ontology tools SHOULD use all synonyms when implementing search (and of course they MAY use other predicates), and they SHOULD also use the synonym predicate in ranking search results and in providing information to the user on why something matched. Ontology tools SHOULD consider all synonyms in applications like NER, and MAY use the predicate to rank results. You can find further guidelines on places like the Uberon wiki.
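To make the search guidance concrete, here is a small sketch of a lookup that searches over all synonym predicates, ranks hits by predicate, and reports which predicate matched. The rank values are arbitrary assumptions, `EX:0000001` is a made-up term, and `search` is a hypothetical function, not the behaviour of any particular portal.

```python
# Lower rank means a stronger match. The numbers are arbitrary assumptions.
PREDICATE_RANK = {
    "label": 0,
    "hasExactSynonym": 1,
    "hasNarrowSynonym": 2,
    "hasBroadSynonym": 2,
    "hasRelatedSynonym": 3,
}

# (term, predicate, string). EX:0000001 is a made-up term for illustration.
index = [
    ("DOID:0050214", "label", "Lambert-Eaton myasthenic syndrome"),
    ("DOID:0050214", "hasExactSynonym", "LEMS"),
    ("EX:0000001", "hasRelatedSynonym", "LEMS"),
]


def search(query, entries):
    """Return (term, predicate) hits for an exact string match, best first."""
    needle = query.casefold()
    hits = [
        (PREDICATE_RANK[predicate], term, predicate)
        for term, predicate, text in entries
        if text.casefold() == needle
    ]
    hits.sort()  # orders by rank first, then by term identifier
    return [(term, predicate) for _, term, predicate in hits]


results = search("lems", index)
print(results)
```

Returning the predicate with each hit is what lets a tool tell the user why something matched.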

Metadata about how the string is constructed (is it an acronym, or a portmanteau of an acronym and a spelled-out term?) is an orthogonal concern.

Like any system, there are edge cases here. It could be argued that HGNC symbols are more like identifiers than synonyms, and indeed sometimes we see novel constructions like HGNC:BRCA.

The existing system has worked for decades. If we want to be ontologically fussy and come up with a complicated alternative, then the people proposing this need to do more than object on minor terminological grounds: they need to propose an alternative system, together with a description of how it will be rolled out and implemented in major software systems in a way that doesn't cause users to get incomplete results.

A valid way to do this is by having an open ended set of APs that inherit from a common parent, as @bpeters42 suggests. Note this kind of system is already used in some ontologies that don't follow oboInOwl, with APs inheriting from "alternative term". But if we open this gate, do we also introduce other APs that function as oboInOwl synonyms? E.g. "has symbol"? "has gene symbol"? "has code"? Or is acronym a special one-off?

  • how to distinguish acronyms that function as exact synonyms (e.g. VACTERL for a disease concept) from those that are acronyms of broad/narrow/related synonyms
  • how to deal with portmanteaus (e.g. VATER syndrome, ER visit, ER transport, MPS7)? Do we introduce new APs?
  • how to deal with strings that are both primary labels and acronyms. Do we have an AP with two parents? Two assertions?
  • How do we treat things like gene and protein symbols? These are often acronyms, acronym-like contractions, portmanteaus.
  • if we have APs for "has acronym" and "has symbol", do we have a lattice with bottom concepts like "has acronym and symbol" inheriting from "has acronym" and "has symbol"? Or do we have two assertions?

If the system involves an open ended lattice of APs connected by SubAnnotationPropertyOf, then the guidelines should explain how tool implementers should obtain this and use it to dynamically drive behavior in a robust way. I assume the intent is that the applications should do an initial lookup of OMO, obtain all the transitive subproperties of the root synonym/alternativeLabel AP, and use this to drive behavior of the tool (search, NER, etc).
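The lookup described above could be sketched as follows, assuming the subproperty assertions have already been extracted as (child, parent) pairs. The property names are the hypothetical ones floated earlier in this comment, the "editor note" entry is an unrelated property added only to show that it is excluded, and `transitive_subproperties` is an illustrative helper rather than an existing API.

```python
from collections import deque

# (child AP, parent AP) pairs, as they might be read from an ontology file.
# All of these property names are hypothetical examples.
SUBPROPERTY_EDGES = [
    ("has acronym", "alternative term"),
    ("has symbol", "alternative term"),
    ("has gene symbol", "has symbol"),
    ("editor note", "curation note"),  # unrelated branch, must not be returned
]


def transitive_subproperties(root, edges):
    """Collect every direct and indirect subproperty of root (breadth first)."""
    children = {}
    for child, parent in edges:
        children.setdefault(parent, []).append(child)

    found = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in found:  # also guards against accidental cycles
                found.add(child)
                queue.append(child)
    return found


searchable = transitive_subproperties("alternative term", SUBPROPERTY_EDGES)
print(sorted(searchable))
```

A tool would then treat every property in the returned set as a source of searchable strings, instead of hard-coding a fixed list of predicates.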

This is not unreasonable, but if this is the plan, someone needs to document it, coordinate with developers and ensure it is rolled out consistently.

As a data point, note that OBI and other IAO-based ontologies have employed this system for some time. In OBI you can see an AP hierarchy:

'alternative term'

  • 'FGED alternative term'
  • 'IEDB alternative term'
  • 'ISA alternative term'
  • 'NIAID GSCID-BRC alternative term'

But this system was never documented and there was no coordination with tool developers. As a result, if someone uses one of these subproperties for an alternative term, it is ignored by software for purposes of search, including the 3 main portal providers. This is despite this system being in use for well over a decade.

(funnily enough, many of these "alternative terms" are also acronyms)

In contrast, the existing system used by GO, Uberon, and many other ontologies cleanly separates concerns and has a simple implementation that is largely adhered to by most software. As far as I can tell there are no practical issues and it's purely a nomenclature choice, but just read "synonym" as "alternative term" and everything is fine.

See also:

I agree with pretty much all that Chris wrote. But I wanted to add one bit of background for why the sub-properties like 'IEDB alternative term' were created. That came about from having discussions about labels between different projects that contribute to OBI, which are much more varied than the typical GO derived communities. To reduce the need for those discussions, we allowed everyone to use their own 'alternative label'. This meant every project can replace the OBI rdfs:label that they didn't like with whatever their community prefers. In contrast, the OBI rdfs:label has to be precise, unique and distinguishable across all of OBI, which often means it is long and clunky. We wanted to make these alternative terms available for anyone wanting to do text mining, but we did not want to argue about if something is broad, narrow, synonym, acronym or whatever. This OBI practical solution has worked very well over a decade for its intended purpose. It does not separate the concerns that Chris mentions, but it does separate the concern I mentioned in my original comment that different people will have very different ideas on what labels are good / narrow / whatever, and that those discussions are often not productive.

cthoyt commented

I'm with Nico's original post - I think we should continue considering acronyms as a type of synonym (since in practice we want to use these the same way) and use the recent work in OMO to add new standard synonym types to mediate this. Having tons of properties to look for this stuff instead of in a single place with a well-defined data model will make it less easy and enjoyable to use ontologies.

I agree with @cthoyt. I don't support adding an acronym property; I do support standardizing synonym type identifiers since these are currently a mess of ontology-local hash IRIs.

Strongly agree with Chris' points with one exception - symbols.

For cell types, we have very strong (I would even say critical) use-cases for official symbols. Long labels are needed for disambiguation purposes, but no-one in the community uses them (similar issue: FlyBase has both gene symbols and official full names). Where consensus emerges on particular symbols as standard, we need to reflect this so that the tools we build have a single, official, compact way to refer to cell types that reflects dominant community usage. This is particularly important for atlases, where, for real-estate reasons, overlay annotation needs short symbols that users can understand.

We are following this approach in FBbt, PCL, and are starting to follow it in CL (currently using an IAO ID). In order to support default assumptions about indexing OBO ontologies for search, we follow an SOP that all symbols should also be exact synonyms - this also allows us to add supporting references.
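The SOP above (every symbol should also be an exact synonym) lends itself to a simple consistency check. The sketch below assumes symbols and exact synonyms have already been extracted into plain dictionaries; the `EX:` identifiers, the strings, and the function name are all made up for illustration.

```python
# term -> official symbol (all identifiers and strings are made up)
symbols = {
    "EX:0000010": "SYM-A",
    "EX:0000011": "SYM-B",
}

# term -> set of exact synonym strings
exact_synonyms = {
    "EX:0000010": {"SYM-A", "long descriptive name a"},
    "EX:0000011": {"long descriptive name b"},
}


def symbols_missing_exact_synonym(symbol_map, synonym_map):
    """Return the terms whose symbol is not also asserted as an exact synonym."""
    return sorted(
        term
        for term, symbol in symbol_map.items()
        if symbol not in synonym_map.get(term, set())
    )


violations = symbols_missing_exact_synonym(symbols, exact_synonyms)
print(violations)
```

Terms reported here would be invisible to tools that only index exact synonyms for search.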

In discussion at OFOC meeting 2023-10-31 including @csbjohnson and @lschriml, there was consensus that we should create a new 'acronym' synonym type (not a new property).

cthoyt commented

@balhoff was there discussion of whether we can include this as a subproperty of "abbreviation"?

cthoyt commented

Either way, I can take care of making this new property.

@balhoff was there discussion of whether we can include this as a subproperty of "abbreviation"?

No, this wasn't discussed. It's kind of a weird situation, since these things aren't really properties (i.e. relationships) but are just put under OWLAnnotationProperty out of convenience (visible in Protege, but no logical commitment).