ANHIG/IMGTHLA

DRB3*03:57Q-related issue in the DRB_prot.txt file for releases 3.49.0 to 3.51.0

sjmack opened this issue · 1 comments

The nine-nucleotide insertion in DRB3*03:57Q results in confusing position numbering in the DRB_prot.txt file. Because of the three-residue insertion in this protein, an indel position in the alignment is labeled as position 67, when the column immediately to its right is the actual position 67. This makes it difficult to identify the correct position coordinates for that section of the alignment. I notice a similar issue in the HLA-A and HLA-B protein alignments. The ideal solution would be to start the numbering of positions after any initial indel (".") positions.

DRB_position_67_confusion

HLA-A_position_155_confusion

HLA-B_position_141_confusion

Thanks for the feedback on this. The numbering is based only on valid bases in the reference sequence. The '.' is not counted as a valid base, so the numbering resumes once the first valid base is encountered.
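
For anyone parsing these files, the rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code that generates the alignment files: the function names `number_columns` and `first_valid_column` are hypothetical, the reference row is a made-up toy string rather than real DRB sequence, and numbering is assumed to start at 1 with no special handling of leader-peptide (negative) positions.

```python
from typing import List, Optional


def number_columns(reference_row: str, start: int = 1) -> List[Optional[int]]:
    """Assign a position number to each alignment column of the reference row.

    Columns holding '.' (an indel in the reference) get None and do not
    advance the counter; every other column gets the next position number.
    """
    positions: List[Optional[int]] = []
    next_position = start
    for symbol in reference_row:
        if symbol == ".":
            positions.append(None)  # indel column: not a valid reference base
        else:
            positions.append(next_position)
            next_position += 1
    return positions


def first_valid_column(reference_row: str, column: int) -> Optional[int]:
    """Return the index of the first non-indel column at or right of `column`."""
    for index in range(column, len(reference_row)):
        if reference_row[index] != ".":
            return index
    return None


# Toy reference row (hypothetical): two residues, a three-column insertion
# present only in some other allele, then two more residues.
toy_reference = "AB...CD"
numbering = number_columns(toy_reference)

# A position marker that visually lands on column 2 (an indel column)
# really refers to the first valid column to its right.
resolved = first_valid_column(toy_reference, 2)
```

In this toy case the three '.' columns carry no position number, and a marker that falls on the first of them resolves to the column holding residue `C`, which is position 3. That mirrors the situation in the issue, where the marker for position 67 sits over an indel column and the actual position 67 is the column to its right.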