protein change annotation difference between hgvs and vep
Which alignment are you using here? When I look this up in my genome browser, the alignment looks different. There is an indel disagreement between the transcript and the reference genome somewhere upstream of this position, which I suspect is causing the confusion. Does your alignment ^^^ incorporate that?
Thanks for answering my question. I'm using IGV for visualization. And you are right, there is a disagreement between the transcript and the reference genome.
When I looked upstream, I found this
while in NM_000348.4 (note: reverse strand), it is three Gs.
This is where the problem comes from.
My guess is that when they encounter this kind of difference, VEP and IGV ignore the problem codon, while hgvs uses the genome sequence to deduce the amino acid sequence.
FWIW, failure to account for indels in coding regions is the #1 reason that hgvs results differ from those of other tools. Our 2018 update paper provides lots of examples of issues like this: https://pubmed.ncbi.nlm.nih.gov/30129167/
Because Ensembl transcripts are defined on the genome, Ensembl mostly has no notion of transcript sequences that differ from the genomic sequence. In contrast, RefSeq transcript sequences are distinct from genomic sequences and are related to them through an alignment, which may contain indels. Therefore, using Ensembl to annotate RefSeq transcripts is a bit of a sticky wicket.
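To make the point about gapped alignments concrete, here is a toy sketch (not how hgvs or VEP is actually implemented). It maps a genomic offset to a transcript offset through a CIGAR-like alignment and compares that with a naive mapping that ignores the indel. The function name `genome_to_transcript`, the example alignment, and the forward-strand, single-exon setup are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# CIGAR-like operations, describing the transcript relative to the genome:
#   "M": aligned bases (present in both)
#   "I": bases present in the transcript but absent from the genome
#   "D": bases present in the genome but absent from the transcript
Cigar = List[Tuple[int, str]]


def genome_to_transcript(g_offset: int, cigar: Cigar) -> Optional[int]:
    """Map a 0-based genomic offset to a 0-based transcript offset.

    Returns None when the genomic base has no counterpart in the
    transcript (it falls inside a "D" segment).
    """
    g = 0  # genomic bases consumed so far
    t = 0  # transcript bases consumed so far
    for length, op in cigar:
        if op == "M":
            if g_offset < g + length:
                return t + (g_offset - g)
            g += length
            t += length
        elif op == "I":
            t += length  # only the transcript advances
        elif op == "D":
            if g_offset < g + length:
                return None
            g += length  # only the genome advances
        else:
            raise ValueError(f"unsupported operation: {op!r}")
    raise IndexError("genomic offset lies beyond the alignment")


# Hypothetical alignment: the transcript carries one extra base after
# the first 10 aligned bases (e.g. three Gs where the genome has two).
alignment: Cigar = [(10, "M"), (1, "I"), (10, "M")]

g_offset = 14
naive = g_offset                                    # ignores the indel
gapped = genome_to_transcript(g_offset, alignment)  # accounts for it

# Codon index and position within the codon, assuming the CDS starts
# at transcript offset 0.
print("naive :", naive, divmod(naive, 3))
print("gapped:", gapped, divmod(gapped, 3))
```

Downstream of the indel, the two mappings land in different codons and at different positions in the reading frame, which is the kind of discrepancy that produces different protein-level annotations.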
This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.
Hi - hgvs correctly handles alignment gaps here. The bug is in Ensembl VEP - I have already raised this issue with them.