How to interpret the alignment score in ksw2_extz and ksw2_extz_sse
giuliaguidi opened this issue · 1 comments
Dear Heng Li,
I'm using your ksw2 alignment algorithm. However, I cannot get the meaning of the alignment score it returns. For example, comparing ksw_extz (with zdrop) with SeqAn seed extension (which apply xdrop) function (same scoring matrix where a match, mismatch, and gap extension is 1/-1 and gap opening is 0; and same zdrop = zdrop = 3), I got the following results:
[ksw2] /* Left extension from a seed */
ez.score -10688
ez.max_t 183
ez.max_q 165
ez.mqe 21
ez.mqe_t 201
ez.mte -10688
ez.mte_q 186
time: 0.042834s
[ksw2] /* Right extension from a seed */
ez.score -1073741824
ez.max_t -1
ez.max_q -1
ez.mqe -1
ez.mqe_t 0
ez.mte -1073741824
ez.mte_q -1
time: 1e-06s
[seqan] /* Both extensions from a seed */
score: 18442
time: 1.09029s
Both ksw2 and SeqAn have received as input two long reads from a PacBio dataset (possibly already reversed and complemented if needed). SeqAn has received as input the starting positions of a shared seed between the two reads, while ksw_extz is called twice: (1) on the two sequences at the left of the seed (reversed, not complemented), and (2) on the two sequences at the right of the seed.
In ksw2.h, I added the bold line as I would like to the get the alignment score before the pair is dropped:
if (zdrop >= 0 && ez->max - H > zdrop + l * e) {
ez->zdropped = 1;
ez->score = ez->max;
return 1;
}
I cannot get the significant differences between ksw2 and SeqAn scores. I know that the should be different as zdrop is not the same as xdrop, but I guess I'm missing something about your algorithm inner workings.
Should I interpret the scores for the right extension as "there's no extension"?
Thanks,
Giulia
-
ksw2 might not work when gap open is set to 0. I have not confirmed this, though.
-
For seed extension, you need ez->max, not ez->score. Unless you specify KSW_EZ_EXTZ_ONLY, ez->score is global alignment score.
-
zdrop=3 is far too small for pacbio reads.