HGVS Nomenclature issues
RaquelRomeroF opened this issue · 0 comments
Describe the issue
I have identified several issues related to the calculation of HGVS nomenclature, which include:
- Synonymous variants
Recommendations: amino acids that have been tested and found not changed (silent) are described as p.Cys123=
chr1 69270 . A G ANN=G|synonymous_variant|LOW|OR4F5|ENSG00000186092|transcript|ENST00000641515.2|protein_coding|3/3|c.243A>G|p.Ser81Ser|303/2618|243/981|81/326||
SnpEff HGVS annotation: p.Ser81Ser
Expected HGVS annotation: p.Ser81=
- Frameshifts + stop_gained:
According to HGVS recommendations on frameshifts:
NOTE: the shortest frameshift variant possible contains fsTer2; variants which introduce an immediate translation termination (stop) codon are described as nonsense variant, e.g., p.Tyr4Ter (or p.Tyr4*) not p.Tyr4TerfsTer1 (see Substitution).
chr1 225346173 . C CCTAGA ANN=CCTAGA|frameshift_variant&stop_gained|HIGH|DNAH14|ENSG00000185842|transcript|ENST00000445597.6|protein_coding|48/61|c.8205_8209dupAGACT|p.Trp2737fs|8210/10524|8210/10524|2737/3507||INFO_REALIGN_3_PRIME,
SnpEff HGVS annotation: p.Trp2737fs
Expected HGVS annotation: p.Trp2737Ter
- Stop lost:
The position of the new stop codon should be indicated if it is known:
chr19 32879164 . A G ANN=G|stop_lost|HIGH|CEP89|ENSG00000121289|transcript|ENST00000305768.10|protein_coding|19/19|c.2350T>C|p.Ter784Glnext*?|2434/5673|2350/2352|784/783||
SnpEff HGVS annotation: p.Ter784Glnext*?
Expected HGVS annotation: p.(Ter784GlnextTer8)
- INDELS
HGVS recommendations: Syntax sequence_identifier ":" coordinate_type "." position_or_range "dup"
NOTE: the recommendation is not to describe the variant as c.20_23dupTAGA, i.e. describe the duplicated nucleotide sequence. This description is longer, it contains redundant information, and chances to make an error increases (e.g., c.20_23dupTGGA).
chr19 1775124 . T TCGCC ANN=TCGCC|intron_variant|MODIFIER|ONECUT3|ENSG00000205922|transcript|ENST00000382349.5|protein_coding|1/1|c.1193-16_1193-13dupCGCC||||||INFO_REALIGN_3_PRIME
SnpEff HGVS annotation: c.1193-16_1193-13dupCGCC
Expected HGVS annotation: c.1193-16_1193-13dup
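In case it helps to illustrate what I mean, here is a minimal post-processing sketch in Python that rewrites two of the cases above (synonymous variants and duplications) directly on the ANN string. It is only a workaround under my own assumptions, not SnpEff code: the function names are hypothetical, and it assumes HGVS.c and HGVS.p are the 10th and 11th pipe-separated fields, as in the ANN entries quoted above. The frameshift + stop_gained and stop lost cases cannot be fixed this way, because they need the translated sequence.

```python
import re

# Same three-letter amino acid code on both sides, e.g. p.Ser81Ser
_SYNONYMOUS = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)\1$")
# Duplication followed by the explicit duplicated bases, e.g. ...dupCGCC
_DUP_WITH_BASES = re.compile(r"^(.*dup)[ACGTN]+$")


def normalize_hgvs_p(hgvs_p: str) -> str:
    """Rewrite p.Ser81Ser as p.Ser81=; leave anything else untouched."""
    m = _SYNONYMOUS.match(hgvs_p)
    if m:
        return f"p.{m.group(1)}{m.group(2)}="
    return hgvs_p


def normalize_hgvs_c(hgvs_c: str) -> str:
    """Drop the redundant bases after 'dup'; leave anything else untouched."""
    m = _DUP_WITH_BASES.match(hgvs_c)
    return m.group(1) if m else hgvs_c


def normalize_ann_entry(entry: str) -> str:
    """Normalize HGVS.c and HGVS.p inside one pipe-separated ANN entry."""
    fields = entry.split("|")
    # Assumption: HGVS.c is at index 9 and HGVS.p at index 10,
    # matching the ANN entries shown in this issue.
    if len(fields) > 10:
        fields[9] = normalize_hgvs_c(fields[9])
        fields[10] = normalize_hgvs_p(fields[10])
    return "|".join(fields)


if __name__ == "__main__":
    print(normalize_hgvs_p("p.Ser81Ser"))
    print(normalize_hgvs_c("c.1193-16_1193-13dupCGCC"))
```

Applied to the two ANN entries above, this changes only the HGVS fields and leaves the rest of the entry as it is.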
To Reproduce
- SnpEff version: SnpEff 5.2
- Genome version: Hg38
I hope all of this can be fixed. I find SnpEff an amazing tool.
Thanks a lot!
Raquel