mapping-commons/disease-mappings

Review general mapping rules for diseases and phenotypes


The idea is to figure out a clear recipe for determining a match between two phenotypes or between two diseases.

@sabrinatoro Can you help me with that? I would like to capture all the possible mapping rules that can lead to a mapping. This does not include your fine-grained work on distinguishing when to do "exact" vs "narrow" that you captured in your ICD10 work - just the general "thought processes" that can be applied to determine whether a mapping (exact or otherwise) holds.

Mapping diseases

When matching diseases, potentially across species, the following matching disease rules (MDR) can be applied:

  • MDR1: two diseases (across species) share phenotypic presentation
  • MDR2: two diseases (across species) share known genetic underpinnings
  • MDR3: two diseases share phenotypic presentation and genetic underpinnings
  • MDR4: two diseases share the same label
  • MDR5: two diseases share very similar textual descriptions that, from a curator's perspective, appear to be describing analogous concepts
  • MDR6: two diseases appear to be the same concept based on domain knowledge of the curator

Mapping phenotypes

  • MPR1: two phenotypes are associated with the exact same set of diseases
  • MPR2: two phenotypes inhere in homologous structures and exhibit the same quality (e.g. increased thickness)
  • MPR3: two phenotypes share very similar descriptions that, from a curator's perspective, appear to be describing analogous concepts
  • MPR4: two phenotypes are caused by the same set of (orthologous) genes

@ImkeTammen I am interested in your thoughts here, in particular whether your own mapping rules are reflected in any of these. Does this cover the choices you make when deciding on a mapping?

For diseases:
2 diseases are the same when:

  • They have the same “key” phenotypes/features* AND same etiology
    • Note that phenotypes always vary between patients: not all patients share all the phenotypes, and the severity of those phenotypes can differ.
    • If diseases share the “key” phenotypes/features but have different etiologies (e.g. variations in different genes are responsible), these diseases probably belong to the same “general” group (e.g. Phenotypic Series in OMIM; e.g. ‘Usher Syndrome’ has multiple types based on the gene involved)
  • The definitions of the diseases are the same or similar enough that they are describing the same concept. (this requires manual review).
    • Note that the definition might not have enough details to determine whether the diseases are actually the same, and this often requires discussion between curators and sometimes experts

The following conditions are not sufficient to say that 2 diseases are the same, but they give a clue that they could be (i.e. someone needs to review manually, and additional information is needed):

  • 2 diseases share the same label
  • 2 diseases share the same phenotype/feature
  • 2 diseases share the same etiology (e.g. variation in the same gene can lead to different diseases)
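
To make the decision logic above easier to discuss, here is a minimal Python sketch of it. Everything in it is an assumption for illustration: the `Disease` record, the `compare_diseases` function, the outcome names, and the placeholder gene and phenotype strings are hypothetical, and "same" is reduced to exact set equality, which is far stricter than a curator's judgement about "key" features. The definition-based rule is left out because it requires manual review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disease:
    """Hypothetical record; field names are illustrative only."""
    label: str
    key_phenotypes: frozenset[str]
    etiology: frozenset[str]  # e.g. identifiers of the causal genes


def compare_diseases(a: Disease, b: Disease) -> str:
    """Return a coarse outcome for a pair of diseases.

    Outcomes (names are assumptions, not an agreed vocabulary):
      "same"         same key phenotypes AND same etiology
      "same_group"   same key phenotypes, known but different etiologies
      "needs_review" only a clue is shared (label, a phenotype, or etiology)
      "no_evidence"  nothing is shared
    """
    same_key = bool(a.key_phenotypes) and a.key_phenotypes == b.key_phenotypes
    same_etiology = bool(a.etiology) and a.etiology == b.etiology

    if same_key and same_etiology:
        return "same"
    if same_key and a.etiology and b.etiology:
        # Etiologies are both known and differ: likely one general group.
        return "same_group"

    # A single shared signal is a clue, not a sufficient condition.
    has_clue = (
        a.label.casefold() == b.label.casefold()
        or bool(a.key_phenotypes & b.key_phenotypes)
        or bool(a.etiology & b.etiology)
    )
    return "needs_review" if has_clue else "no_evidence"


# Placeholder data, not real curation results.
type_1 = Disease("Syndrome X type 1",
                 frozenset({"hearing loss", "vision loss"}),
                 frozenset({"GENE_A"}))
type_2 = Disease("Syndrome X type 2",
                 frozenset({"hearing loss", "vision loss"}),
                 frozenset({"GENE_B"}))
print(compare_diseases(type_1, type_2))
```

Returning "needs_review" rather than a boolean keeps the distinction made above between conditions that establish sameness and conditions that only flag a pair for a curator.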

Diseases between species:
Note that one cannot say that (for example) a mouse model “has Usher Syndrome”, because “Usher Syndrome” is a human condition. One should say that the mouse “models” Usher Syndrome, or “has the same features as patients with Usher Syndrome”. This is very nitpicky, but it is an important distinction (and also, our jobs :-) ).
I would group diseases of different species under the same “species agnostic disease term” when:

  • They have similar “key” phenotypes/features (see notes about phenotypes/features above), AND the same or orthologous etiology
    • i.e. the genes affected should be orthologous
    • Phenotypes affect homologous structures.
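
The cross-species condition could be sketched along the following lines. This is only a rough illustration under stated assumptions: the orthology and homology tables are passed in as plain sets of pairs (in practice they would come from external resources), `Phenotype` and `can_group_across_species` are hypothetical names, and "similar key phenotypes" is simplified to "every key phenotype on each side has a counterpart with the same quality in a homologous structure".

```python
from typing import NamedTuple


class Phenotype(NamedTuple):
    """Hypothetical key phenotype: an affected structure plus a quality."""
    structure: str
    quality: str


def _equivalent(x: str, y: str, pairs: set[tuple[str, str]]) -> bool:
    """True if x and y are identical or listed as a pair in either order."""
    return x == y or (x, y) in pairs or (y, x) in pairs


def _has_counterpart(p, others, homolog_pairs) -> bool:
    """True if some phenotype in `others` matches p's quality in a homologous structure."""
    return any(
        p.quality == q.quality
        and _equivalent(p.structure, q.structure, homolog_pairs)
        for q in others
    )


def can_group_across_species(genes_a, genes_b, phenos_a, phenos_b,
                             ortholog_pairs, homolog_pairs) -> bool:
    """Check orthologous etiology AND matching key phenotypes."""
    genes_ok = any(
        _equivalent(g, h, ortholog_pairs) for g in genes_a for h in genes_b
    )
    phenos_ok = (
        bool(phenos_a) and bool(phenos_b)
        and all(_has_counterpart(p, phenos_b, homolog_pairs) for p in phenos_a)
        and all(_has_counterpart(q, phenos_a, homolog_pairs) for q in phenos_b)
    )
    return genes_ok and phenos_ok


# Placeholder identifiers, not real orthology or anatomy data.
orthologs = {("human:GENE1", "mouse:Gene1")}
homologs = {("human:structure_s", "mouse:structure_s")}
human_phenos = [Phenotype("human:structure_s", "degenerated")]
mouse_phenos = [Phenotype("mouse:structure_s", "degenerated")]
print(can_group_across_species({"human:GENE1"}, {"mouse:Gene1"},
                               human_phenos, mouse_phenos,
                               orthologs, homologs))
```

A positive result here would only suggest that the two diseases are candidates for the same species-agnostic term; the check does not replace curator review.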

(@matentzn, is this the information you are looking for?)

This is excellent. Thank you! This is what I wanted. Let's see what @ImkeTammen has to add!

Mapping phenotypes
2 phenotypes are the same when (I am assuming that this is between species?):

  • 2 phenotypes inhere in homologous structures and exhibit the same quality (e.g. increased thickness)
  • 2 phenotypes involve identical dysfunction of the same/orthologous process (Molecular Function in GO)
    • For example, dysfunction in the “Wnt signaling pathway”. This assumes that the genes involved in the “Wnt signaling pathway” were determined for each species and were shown to be orthologous between species
    • “Identical dysfunction” means that the disruption should be the same, or similar. For instance, a block in the Wnt signaling pathway (i.e. no output) is different from upregulation
    • Note that this should be in homologous structures when known
  • Orthologous genes are affected in the same way (e.g. both are downregulated)
    • Note that this should be in homologous structures when known
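
As a discussion aid, the three phenotype criteria above can be written as a function that reports which of them hold for a pair of phenotypes. This is a sketch under assumptions: `PhenotypeRecord`, `matching_criteria`, the criterion names, and the placeholder identifiers are hypothetical, the homology and orthology tables are plain sets of pairs supplied by the caller, and "the same, or similar" disruption is reduced to string equality. The "in homologous structures when known" notes are read as: reject only when both structures are known and are not homologous.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhenotypeRecord:
    """Hypothetical record; any field may be unknown (None)."""
    structure: Optional[str] = None
    quality: Optional[str] = None          # e.g. "increased thickness"
    process: Optional[str] = None          # e.g. "Wnt signaling pathway"
    process_change: Optional[str] = None   # e.g. "blocked", "upregulated"
    gene: Optional[str] = None
    gene_change: Optional[str] = None      # e.g. "downregulated"


def _equivalent(x, y, pairs) -> bool:
    """True if both values are known and identical or paired in either order."""
    if x is None or y is None:
        return False
    return x == y or (x, y) in pairs or (y, x) in pairs


def matching_criteria(a, b, homologous_structures,
                      orthologous_processes, orthologous_genes) -> list[str]:
    """List the criteria that hold for phenotypes a and b."""
    structures_known = a.structure is not None and b.structure is not None
    structures_ok = _equivalent(a.structure, b.structure, homologous_structures)
    # Known but non-homologous structures rule out the process and gene criteria.
    structure_veto = structures_known and not structures_ok

    fired = []
    if structures_ok and a.quality is not None and a.quality == b.quality:
        fired.append("structure_quality")
    if (not structure_veto
            and _equivalent(a.process, b.process, orthologous_processes)
            and a.process_change is not None
            and a.process_change == b.process_change):
        fired.append("process_dysfunction")
    if (not structure_veto
            and _equivalent(a.gene, b.gene, orthologous_genes)
            and a.gene_change is not None
            and a.gene_change == b.gene_change):
        fired.append("gene_change")
    return fired


# A block is not the same dysfunction as upregulation.
blocked = PhenotypeRecord(process="Wnt signaling pathway", process_change="blocked")
upregulated = PhenotypeRecord(process="Wnt signaling pathway", process_change="upregulated")
print(matching_criteria(blocked, blocked, set(), set(), set()))
print(matching_criteria(blocked, upregulated, set(), set(), set()))
```

Returning the list of criteria rather than a single yes/no keeps the justification visible, which is useful if the rule that led to a mapping is to be recorded alongside it.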

A slightly tangential but illuminating read on clinical mappings https://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:mapping