Use of "Factor" as generic term
ahwagner opened this issue · 2 comments
I would like us to collectively consider and decide on the term we want to use for the concept we currently describe as "Factor". I think there are better terms we could use that align with the community and prior vocabulary. Here is an alignment piece where instead of "Factor" I propose we use the term "Genome Feature". However, there may be other more descriptive and/or better-aligned terms we can use in place of the current "Factor" label.
On the 10/17 call, @arpaddanos and @ahwagner led a discussion on design choices for `Factor` and `Genome Feature`.
@arpaddanos raised the following reasons `Factor` would be preferable:
- It is a succinct (one-word) description of a Feature category
- It covers a broader domain than Genome Feature, including potential for other features that are neither genome features nor sequence features
- Moving away from `Factor` would result in breaking changes
@ahwagner raised the following reasons `Genome Feature` would be preferable:
- It is the directly analogous concept to `Sequence Feature`, a root concept used in the Sequence Ontology
- It fits cleanly as a drop-in replacement for `Factor`, and covers all current concepts curated under `Factor` in CIViC
- This concept has been referred to as a "genomic feature" or "genome feature" in casual conversation, including use by community members from other institutions in recent GA4GH Cat-VRS calls
During discussion, the following additional points were raised:
- @acoffman raised that this would be a breaking change for data clients reading the API or TSVs for Factor data; @kkrysiak asked if Factors were already seeing frequent querying via API; @acoffman said this is something that could be learned from query logs
- @ahwagner suggested that "Genome" could be used on the UI to describe the `Feature.type` attribute, as it is implied by the context that this is a `Feature`; this would also fit naturally into concept descriptions (e.g. "genome variant" vs. "gene variant" vs. "fusion variant" vs. "region variant")
- @susannasiebert raised that, to a lay person, `Genome` as a feature type may be confused with related genome concepts, e.g. Genome build; @obigriffith rebutted that this may not be so confusing in context alongside other Feature types
- @obigriffith raised the concern that we might make this breaking change now, only for the community to settle on another concept. @ahwagner agreed to champion the use of "Genome Feature" in relevant community discussions if adopted, and did not foresee much pushback on the use of this term
General agreement was reached on the following points:
- there is no concern about the CIViC data model of feature types not reflecting the Sequence Ontology relation hierarchy
- there is minimal concern about the use of `Fusion` as both a feature type and variant type
I had to leave the conversation at this point, so I am uncertain whether a decision was reached or what actions were to follow prior to making a decision.
I think there was general support for the idea to keep Factor for those non-genome features (e.g., microbiome signature, viral expression, etc.) but add "Genome" as a new feature type. Most of what is currently under Factor could be moved seamlessly to Genome, and the two could be configured in the back and front end with almost identical structures. However, we first want to pilot curating some of these non-genome factors. Cam also mentioned having some real-world examples from POG cases. If we can convince ourselves that there is a real need for them and curate some variants/MPs/evidence, it will be easier to motivate creating the new feature type and reorganizing existing Factor entries.