IHCC-cohorts/GECKO

Review placement of generic classes

Opened this issue · 13 comments

Based on the feedback from OBOFoundry/OBOFoundry.github.io#1253, @pbuttigieg has made some suggestions for moving some terms from GECKO to other ontologies to keep GECKO orthogonal:

  • GECKO:0000003 timeline - move to COB
  • GECKO:0000005 population data item (& children) - move to PCO
  • GECKO:0000055 unique identifier - move to IAO
  • GECKO:0000061 ethnicity - replace with 'ancestry category' (HANCESTRO)
  • GECKO:0000060 gender - move to PATO ?
  • GECKO:0000106 sample size (& children) - move to STATO

I'm not sure if 'ethnicity' and 'gender' really fall within PATO, since these are human-specific terms. @matentzn maybe you could comment on placement of these terms?

If these look OK, I will open term requests in the appropriate repos.

Additionally, these terms could be candidates for OBI requests:

  • GECKO:0000042 eQTL analysis
  • GECKO:0000040 microbiome sequencing assay

I'm not sure if 'ethnicity' and 'gender' really fall within PATO, since these are human-specific terms. @matentzn maybe you could comment on placement of these terms?

These are so general that it would be good to have them in a higher-level ontology. If I were looking for these for reuse (e.g. for the SDGIO / UN Semantics), I would hesitate to reuse content from GECKO, which appears to be for a very specific use (noting there's nothing wrong with the terms).

I didn't realise that PATO wasn't handling human-related terms. Is that the case?

Wondering if instead of ethnicity we should consider 'ancestry category'?

cc @zhengj2007 as I know Chris S had good advice on the topic, and @daniwelter

@mcourtot 'ancestry category' could be a good match here, looking at the aims of the project. It is however worth noting that this would require some rearrangements in the hierarchy as 'ancestry category' in the extended HANCESTRO hierarchy is a subclass of population, which is a material entity.
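
To make the hierarchy concern concrete, here is a minimal sketch that walks a toy subclass map. The map only encodes the two relationships described in the comment above, using labels rather than real term IDs; it is a simplification and is not loaded from HANCESTRO, and the `ancestors` helper is hypothetical.

```python
# Toy subclass map (child label -> parent label). Simplified from the comment
# above; intermediate classes in the real ontology, if any, are omitted.
PARENT = {
    "ancestry category": "population",
    "population": "material entity",
}


def ancestors(term, parent=PARENT):
    """Return the superclasses of `term`, nearest first."""
    chain = []
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain


print(ancestors("ancestry category"))
# ['population', 'material entity']
```

Anything placed under 'ancestry category' in this arrangement inherits 'material entity', which is the rearrangement issue raised above.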

GECKO:0000060 gender - move to PATO ?

Consider replacing it with OMRSE 'gender role':
http://purl.obolibrary.org/obo/OMRSE_00000007

I agree with Becky about gender and ethnicity being social entities, and I think education level is as well. OMRSE has 'gender identity information content entity', 'ethnic identity information content entity', and 'highest level of education data item'. The parent of the first two terms, 'social identity information content entity', explains "identifying" in three ways that could correspond to three different qualities, if that helps.

The Ontology for Biobanking (OBIB) used EFO:ethnic group (http://www.ebi.ac.uk/efo/EFO_0001799) and some of its subclasses. 'ethnic group' is a subclass of 'population'. Currently, the subclasses of EFO:ethnic group are deprecated in EFO. The OBIB developers are discussing which terms should be used but have not yet decided on replacements.
(biobanking/biobanking#71)

'ancestry category' looks like a good match to me. I think we could also consider using it and its subclasses in OBIB.

@zhengj2007 The subclasses of EFO:ethnic group were deprecated when EFO moved to importing HANCESTRO classes instead. EFO:ethnic group itself was retained for legacy reasons but all deprecated subclasses were replaced with subclasses of HANCESTRO:ancestry category.

@mcourtot Could we get your feedback on this, please?

I think there are 2 things:

  • the genetic make up that would justify putting ancestry category under material entity (I'm not sure that can be that clearly delineated but I'm no expert)
  • the social construct which is when participants are asked "are you caucasian? are you asian?"

I'm not sure the first one is that relevant for us, because the way this is handled in practice (and I think the same is true for GWAS, for which HANCESTRO was built) is that participants are asked. So the category recorded will be the one you self-identify as (or the one you assess your grandparents to be).

I'd prefer to use something like the OMRSE term but I'd also really like to get the EFO subclasses... I'll start a thread with the EFO developers.

@mcourtot
I wasn't involved in the GWAS/HANCESTRO work, but I see that the granular ancestry terms are defined as "...individuals that either self-report or have been described as...", so apparently both cases are covered. I'd see the genetic make-up aspect you list above as a material property, and the social construct as a self-reported trait (we have that in EFO, also a material property). Possibly, 'ancestry category' should not be a subclass of 'population', but an aspect of it... and a descendant of 'material property'. Then the HANCESTRO terms wouldn't be descendants of 'material entity' and you could accommodate the dual aspect of genetic and self-reported traits. (I would not advocate the creation of two distinct branches for self-reported vs. 'assessed' ethnicity, as the distinction may not always be unambiguous in a study.)
As I said I'm not an expert on the subject though.
Good luck!
Paola
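
A minimal sketch of the re-parenting suggested above, comparing two toy label maps. Both maps are illustrative assumptions built only from the relationships named in this thread; 'material property' is left without a parent here, and the `is_descendant` helper is hypothetical.

```python
# Toy subclass maps (child label -> parent label); labels only, no real IDs.
CURRENT = {
    "ancestry category": "population",
    "population": "material entity",
}
# Assumed result of the suggestion: 'ancestry category' moves under
# 'material property' instead of 'population'.
PROPOSED = {
    "ancestry category": "material property",
    "population": "material entity",
}


def is_descendant(term, ancestor, parent):
    """True if `ancestor` is reachable from `term` by following parents."""
    while term in parent:
        term = parent[term]
        if term == ancestor:
            return True
    return False


print(is_descendant("ancestry category", "material entity", CURRENT))   # True
print(is_descendant("ancestry category", "material entity", PROPOSED))  # False
```

Under the proposed map, 'population' keeps its place under 'material entity' while the ancestry terms no longer inherit from it.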

'data collection' - move to OBCS?
Suggested replacement:
Term label: “data collection”
Term ID: “OBCS_0000002”
Term definition: “a planned process that gathers and measures information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. Data collection results in a collection of data”.