pcingola/SnpEff

HGVS Nomenclature issues

RaquelRomeroF opened this issue · 0 comments

Describe the issue

I have identified several issues related to the calculation of HGVS nomenclature, which include:

  1. Synonymous variants

HGVS recommendations: amino acids that have been tested and found not changed (silent) are described as p.Cys123=

chr1 69270 . A G ANN=G|synonymous_variant|LOW|OR4F5|ENSG00000186092|transcript|ENST00000641515.2|protein_coding|3/3|c.243A>G|p.Ser81Ser|303/2618|243/981|81/326||

SnpEff HGVS annotation: p.Ser81Ser
Expected HGVS annotation: p.Ser81=
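
As a stopgap, the expected form can be produced by post-processing the HGVS.p field. The following is only a minimal sketch: `normalize_synonymous` is a hypothetical helper, not part of SnpEff, and it assumes three-letter amino acid codes as in the annotation above.

```python
import re

# Matches a simple three-letter protein substitution such as p.Ser81Ser.
# Assumption: three-letter amino acid codes, as in the ANN line above.
_SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def normalize_synonymous(hgvs_p: str) -> str:
    """Rewrite p.Ser81Ser as p.Ser81=; return anything else unchanged.

    Hypothetical post-processing helper, not SnpEff code.
    """
    match = _SUBSTITUTION.match(hgvs_p)
    if match and match.group(1) == match.group(3):
        # Reference and alternate residues are identical: use the "=" form.
        return f"p.{match.group(1)}{match.group(2)}="
    return hgvs_p


print(normalize_synonymous("p.Ser81Ser"))
```

Strings that are not a same-residue substitution (for example p.Ser81Thr or p.Trp2737fs) pass through untouched.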

  2. Frameshifts + stop_gained:

According to HGVS recommendations on frameshifts:

NOTE: the shortest frameshift variant possible contains fsTer2; variants which introduce an immediate translation termination (stop) codon are described as nonsense variant, e.g., p.Tyr4Ter (or p.Tyr4*) not p.Tyr4TerfsTer1 (see Substitution).

chr1 225346173 . C CCTAGA ANN=CCTAGA|frameshift_variant&stop_gained|HIGH|DNAH14|ENSG00000185842|transcript|ENST00000445597.6|protein_coding|48/61|c.8205_8209dupAGACT|p.Trp2737fs|8210/10524|8210/10524|2737/3507||INFO_REALIGN_3_PRIME,

SnpEff HGVS annotation: p.Trp2737fs
Expected HGVS annotation: p.Trp2737Ter

  3. Stop lost:

New stop should be indicated if it is known:

chr19 32879164 . A G ANN=G|stop_lost|HIGH|CEP89|ENSG00000121289|transcript|ENST00000305768.10|protein_coding|19/19|c.2350T>C|p.Ter784Glnext*?|2434/5673|2350/2352|784/783||

SnpEff HGVS annotation: p.Ter784Glnext*?
Expected HGVS annotation: p.(Ter784GlnextTer8)

  4. INDELS

HGVS recommendations: Syntax sequence_identifier ":" coordinate_type "." position_or_range "dup"
NOTE: the recommendation is not to describe the variant as c.20_23dupTAGA, i.e. describe the duplicated nucleotide sequence. This description is longer, it contains redundant information, and chances to make an error increases (e.g., c.20_23dupTGGA).

chr19 1775124 . T TCGCC ANN=TCGCC|intron_variant|MODIFIER|ONECUT3|ENSG00000205922|transcript|ENST00000382349.5|protein_coding|1/1|c.1193-16_1193-13dupCGCC||||||INFO_REALIGN_3_PRIME

SnpEff HGVS annotation: c.1193-16_1193-13dupCGCC
Expected HGVS annotation: c.1193-16_1193-13dup
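
For duplications, the redundant sequence can likewise be stripped in post-processing. Again, this is only a sketch under stated assumptions: `strip_dup_sequence` is a hypothetical helper, not SnpEff code, and it only handles a trailing run of A/C/G/T after `dup`.

```python
import re

# A trailing nucleotide sequence after "dup", e.g. "dupCGCC" at the end of HGVS.c.
# Assumption: only the bases A, C, G and T appear in the duplicated sequence.
_DUP_WITH_SEQUENCE = re.compile(r"dup[ACGT]+$")


def strip_dup_sequence(hgvs_c: str) -> str:
    """Rewrite c.1193-16_1193-13dupCGCC as c.1193-16_1193-13dup.

    Hypothetical post-processing helper, not SnpEff code. Descriptions that
    do not end in "dup" followed by bases are returned unchanged.
    """
    return _DUP_WITH_SEQUENCE.sub("dup", hgvs_c)


print(strip_dup_sequence("c.1193-16_1193-13dupCGCC"))
```

The same call applied to c.8205_8209dupAGACT from the frameshift example would give c.8205_8209dup.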

To Reproduce

  1. SnpEff version: SnpEff 5.2
  2. Genome version: Hg38

I hope all of this can be fixed. I find SnpEff an amazing tool.

Thanks a lot!

Raquel