lh3/ksw2

Optimal alignment out of bandwidth

Dmitry-Antipov opened this issue · 8 comments

When the optimal alignment falls outside the bandwidth, the CIGAR string can be nonsense (insertions followed by deletions, e.g. 1I1D15M1I1D15M1I1D15M).

It seems that such cases can be detected by a nonzero zdropped flag in ksw_extz_t, but I see no confirmation in the comments or README. Am I right?

lh3 commented

What is the scoring in use?

I've tried different scoring schemes. The last one was 1, -5, 5, 2 (match, mismatch, gap_open, gap_extend).

But after I increased the bandwidth the CIGAR became normal, so I do not think the scoring matters here.

lh3 commented

mismatch should be a positive number

Sorry, 1, 5, 5, 2. I used the same wrapper as the example in your README file.

lh3 commented

Z-drop may detect out-of-band, but doesn't always work. It may also drop alignments you want to keep. The better way is to check if CIGAR is close to the band boundary.
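One way to sketch the "close to the band boundary" check is to walk the CIGAR and track how far the path drifts from the main diagonal. This is a minimal illustration, not code from ksw2: it assumes the CIGAR is BAM-packed (length in the upper bits, operation in the low 4 bits, with 0=M, 1=I, 2=D), that the alignment starts at the origin, and that the band roughly covers diagonals with |target_pos - query_pos| <= w. The helper names and the `margin` parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest |target_pos - query_pos| reached along the CIGAR path.
 * Assumes BAM-style packing: len = c >> 4, op = c & 0xf (0=M, 1=I, 2=D). */
static int cigar_max_diag_offset(const uint32_t *cigar, int n_cigar)
{
    int i, d = 0, max_abs = 0;
    for (i = 0; i < n_cigar; ++i) {
        int op = (int)(cigar[i] & 0xf);
        int len = (int)(cigar[i] >> 4);
        if (op == 1) d -= len;      /* insertion: query advances, target does not */
        else if (op == 2) d += len; /* deletion: target advances, query does not */
        /* a match/mismatch run stays on the current diagonal */
        if (abs(d) > max_abs) max_abs = abs(d);
    }
    return max_abs;
}

/* Hypothetical helper: nonzero if the path comes within `margin`
 * diagonals of the assumed band edge at offset w. */
static int cigar_near_band(const uint32_t *cigar, int n_cigar, int w, int margin)
{
    assert(w >= 0 && margin >= 0);
    return cigar_max_diag_offset(cigar, n_cigar) >= w - margin;
}
```

If `cigar_near_band` reports a hit, rerunning the alignment with a larger bandwidth and comparing the scores is a reasonable follow-up. The exact band edge in ksw2 may differ by one from the assumption used here.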

By the way, whether you can have insertions followed by deletions is determined by the scoring. This is not "nonsense".
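To make the dependence on scoring concrete, here is a small sketch comparing the cost of an adjacent insertion plus deletion with the cost of a single mismatch. It assumes an affine gap model in which a gap of length l costs q + l*e, and a mismatch penalty b given as a positive number. The function names are illustrative only.

```c
#include <assert.h>

/* Cost of one gap of length len under the assumed affine model q + len*e. */
static int gap_cost(int q, int e, int len)
{
    assert(len > 0);
    return q + len * e;
}

/* Nonzero if a 1I + 1D pair is strictly cheaper than one mismatch
 * with (positive) penalty b, under the assumed model. */
static int indel_pair_beats_mismatch(int b, int q, int e)
{
    return 2 * gap_cost(q, e, 1) < b;
}
```

Under this assumption, the scheme mentioned in this thread (mismatch 5, gap open 5, gap extend 2) gives 2 * (5 + 2) = 14 for the indel pair against 5 for a mismatch, while a scheme with cheap gaps (for example q = 1, e = 1) would make the pair cheaper than the mismatch.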

I am still missing something. I was not talking about the z-drop optimization (it was switched off), but about the zdropped flag.
It seems to be used for more than z-drop. In particular, it is set to 1 here, and this place really looks like the band's boundary, doesn't it?

ez->zdropped = 1;

lh3 commented

That happens when the end position is not in the band, which means you can't find an alignment. It doesn't protect against general out-of-band cases.
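As a rough illustration of the "end position is not in the band" case: for an end-to-end alignment, the last cell lies on the diagonal given by the length difference of the two sequences, so it can be tested before aligning. The sketch below assumes the band covers diagonals with |tlen - qlen| <= w and that a negative w means no banding. Both are assumptions about the band shape, and `end_cell_in_band` is a hypothetical helper, not part of ksw2.

```c
#include <assert.h>
#include <stdlib.h>

/* Nonzero if the final cell (qlen-1, tlen-1) lies inside the assumed band. */
static int end_cell_in_band(int qlen, int tlen, int w)
{
    assert(qlen > 0 && tlen > 0);
    if (w < 0) return 1; /* assumption: negative w disables banding */
    return abs(tlen - qlen) <= w;
}
```

This only covers the end cell. As noted above, an alignment whose end cell is in the band can still leave the band somewhere in the middle, which this check does not detect.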

I see. Thank you!