KolmogorovLab/hapdup

read length inference bug

Closed · 1 comment

Hi Mikhail, I found a bug in the read length inference step: it does not work when matches are encoded as X or = in the CIGAR string instead of M.
The original code at line 30 of https://github.com/KolmogorovLab/hapdup/blob/main/hapdup/filter_misplaced_alignments.py is:
for token in re.findall("[\d]{0,}[A-Z]{1}", cigar):
However, "=" is not captured by this pattern, so re.findall silently skips every "=" operation. When I change the code as follows, the program works:
for token in re.findall("[\d]{0,}[A-Z=]{1}", cigar):
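
To illustrate the difference, here is a minimal sketch. It assumes the read length is the sum of the CIGAR operations that consume read bases (M, I, S, =, X) plus hard clips (H). The helper `read_length` and the constants are hypothetical names for this example and are not hapdup's actual code. The patterns are written as raw strings and are equivalent to the two patterns quoted above.

```python
import re

# Hypothetical helper for illustration only (not hapdup's actual function).
# Assumption: read length = sum of operations that consume read bases
# (M, I, S, =, X) plus hard-clipped bases (H).
READ_OPS = set("MIS=XH")

OLD_PATTERN = r"\d*[A-Z]"   # equivalent to the original pattern; never matches "="
NEW_PATTERN = r"\d*[A-Z=]"  # equivalent to the proposed fix


def read_length(cigar, pattern):
    """Sum the lengths of read-consuming operations found by `pattern`."""
    total = 0
    for token in re.findall(pattern, cigar):
        op = token[-1]
        # Skip tokens without a numeric length so int() cannot fail.
        if op in READ_OPS and len(token) > 1:
            total += int(token[:-1])
    return total


# CIGAR string with matches encoded as "=" and mismatches as "X".
eqx_cigar = "5S10=1X4=2D3I7="
print(read_length(eqx_cigar, OLD_PATTERN))  # 9: all "=" operations are skipped
print(read_length(eqx_cigar, NEW_PATTERN))  # 30

# The same alignment written with "M" gives 30 with either pattern.
m_cigar = "5S15M2D3I7M"
print(read_length(m_cigar, OLD_PATTERN))  # 30
```

With the original pattern the "=" operations drop out of the sum without any error, which is why the inferred length is too short rather than the step crashing.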

Thanks for reporting, I will incorporate the fix into the next release!