biocommons/hgvs

protein change annotation difference between hgvs and vep


When converting the cDNA change NM_000348.4:c.653dup to a protein change, hgvs gives NP_000339.2:Ser220LeufsTer14, while VEP gives p.Ser219LeufsTer14, which seems to be right.
[image]
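
For anyone who wants to reproduce the hgvs side of this comparison, here is a minimal sketch. It assumes the `hgvs` package is installed and that the public UTA database is reachable over the network. The helper name `project_c_to_p` is made up for illustration and is not part of hgvs or of the original report.

```python
def project_c_to_p(hgvs_c: str) -> str:
    """Return the protein-level HGVS string that hgvs predicts for a c. variant.

    Hypothetical helper: requires the hgvs package and network access to UTA.
    """
    # Imported lazily so this module still loads where hgvs is not installed.
    import hgvs.dataproviders.uta
    import hgvs.parser
    import hgvs.variantmapper

    parser = hgvs.parser.Parser()
    hdp = hgvs.dataproviders.uta.connect()  # connects to the public UTA instance by default
    mapper = hgvs.variantmapper.VariantMapper(hdp)

    var_c = parser.parse_hgvs_variant(hgvs_c)
    var_p = mapper.c_to_p(var_c)  # protein accession is looked up from the transcript
    return str(var_p)
```

Calling `project_c_to_p("NM_000348.4:c.653dup")` should yield the hgvs prediction discussed in this thread, though the exact string may depend on the hgvs and UTA versions in use.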

Not sure what alignment you are using here? When I look this up in my genome browser, the alignment looks different. There is an (indel) disagreement between the transcript and the reference genome somewhere upstream of this position, which I suspect might be causing this confusion. Does your alignment above incorporate that?

[image: Screen Shot 2021-05-10 at 9 52 04 AM]

Thanks for answering my question. I'm using IGV for visualization. And you are right, there is a disagreement between the transcript and the reference genome.

When I looked upstream, I found this:
[image]
while in NM_000348.4 (note: reverse strand), it is three Gs.
[image]
This is where the problem comes from.

When they encounter this kind of difference, VEP and IGV ignore the problem codon, while hgvs uses the genome sequence to deduce the amino acid sequence, I guess.

reece commented

FWIW, failure to account for indels in coding regions is the #1 reason that hgvs differs from other tools. Our 2018 update paper provides lots of examples of issues like this. https://pubmed.ncbi.nlm.nih.gov/30129167/

Because Ensembl transcripts are defined on the genome, Ensembl mostly doesn't have a notion of transcript sequences that differ from the genomic sequence. In contrast, RefSeq transcript sequences are distinct from genomic sequences and are related to them through alignment, which may contain indels. Therefore, using Ensembl to annotate RefSeqs is a bit of a sticky wicket.
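
To make the off-by-one concrete, here is a toy sketch of how an in-frame indel between a transcript and the genome shifts protein numbering. The two sequences are invented for illustration and are not the real NM_000348.4 or genomic sequences. The direction of the indel (the transcript carrying one extra codon) is also an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, ignoring trailing partial codons."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def first_position(protein: str, residue: str) -> int:
    """Return the 1-based position of the first occurrence of a residue."""
    return protein.index(residue) + 1


# Toy data (assumption): the transcript has a GGG codon that the genome lacks.
transcript_cds = "ATGGCTGGGTCACTGTAA"
genome_cds = "ATGGCTTCACTGTAA"

transcript_protein = translate(transcript_cds)
genome_protein = translate(genome_cds)

# The same serine is numbered differently depending on which sequence is translated.
print(transcript_protein, first_position(transcript_protein, "S"))
print(genome_protein, first_position(genome_protein, "S"))
```

In this toy case the serine is residue 4 when numbered on the transcript sequence and residue 3 when numbered on the genome-derived sequence, which is the same kind of shift as Ser220 versus Ser219 above.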

@reece Thanks for the link and explanation, helps a lot.

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

This issue was closed because it has been stalled for 7 days with no activity.

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

Hi - HGVS correctly handles alignment gaps here. The bug is with Ensembl VEP - I have already raised this issue with them.