lh3/ksw2

50k test segmentation fault

hannespetur opened this issue · 2 comments

Hey,

Thank you for all your hard work.

The 50k test segfaults for me, while the other tests run fine:

$ ./ksw2-test -t extz2_sse test/t1.fa test/q1.fa -A 2 -B 2 -O 4 -E 1             
t1	q1	-1	4	1	1	2M1D
t2	q2	-2	6	14	10	2D7M2D4M4D
t3	q3	42	54	35	33	5M2D27M6D7M2D4M6D5M2D6M
t4	q4	-7	0	-1	-1	11D4M
t5	q5	10	10	4	4	34M
$ ./ksw2-test -t extz2_sse test/MT-human.fa test/MT-orang.fa -A 2 -B 2 -O 4 -E 1
MT_human	MT_orang	21616	22094	16568	16024	1M155D4M63D5M103D3M62D8M192D37M1D85M1D232M1I559M1D6M1I550M1D2M1D337M1I696M1D52M1I7M1I61M1I2061M3I4M3D230M1D59M1D156M1D31M1I98M1I26M14I1617M1D753M7I8M9I10M3I3897M1D49M1D157M3D5M3I3551M5I57M1I54M4D19M1I78M1I15M1D32M2I20M1D13M1D5M1I50M1D15M1I5M2D17M1D190M1I37M474I1M
$ ./ksw2-test -t extz2_sse test/t2.fa.gz test/q2.fa.gz -A 2 -B 2 -O 4 -E 1
[1]    9840 segmentation fault (core dumped)

The issue seems to be related to the traceback step, as the test finishes successfully when I pass the -s (score only) option:

$ ./ksw2-test -t extz2_sse test/t2.fa.gz test/q2.fa.gz -A 2 -B 2 -O 4 -E 1 -s
phage50k_ref	phage50k_mut90	77292	77333	49962	49999

KSW2 was compiled with GCC 6.3.0 on Ubuntu 16.04.

All the best,
Hannes

lh3 commented

For a pair of 50kb sequences, the backtrack matrix takes 50k * 50k = 2.5GB of RAM. This leads to an integer overflow and thus the segfault. We could fix this by doubling the memory of two large arrays. However, aligning two 50kb sequences without banding is not an intended use case of ksw2, so it is not worth increasing the peak memory for it.
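
To make the arithmetic above concrete, here is a minimal sketch in C. It is not ksw2 code: `bt_cells` and `bt_overflows_int32` are hypothetical helpers, and the sketch assumes one byte per backtrack cell and a 32-bit signed size or index, neither of which is spelled out in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of ksw2): number of cells in a full,
 * unbanded backtrack matrix. The product is taken in 64 bits so that
 * the result itself cannot wrap around. */
int64_t bt_cells(int32_t tlen, int32_t qlen)
{
    assert(tlen >= 0 && qlen >= 0);
    return (int64_t)tlen * (int64_t)qlen;
}

/* Returns 1 if the cell count no longer fits in a signed 32-bit integer
 * (assumption: that is the type the size or index is held in). */
int bt_overflows_int32(int32_t tlen, int32_t qlen)
{
    return bt_cells(tlen, qlen) > (int64_t)INT32_MAX;
}
```

Under these assumptions, `bt_cells(50000, 50000)` is 2,500,000,000, which is above `INT32_MAX` (2,147,483,647), while a pair of roughly 16kb sequences such as the mitochondrial test stays well below it.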

That said, ksw2 should give an error or an assertion failure instead of a segfault. I will consider this. Thanks.
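
One possible shape for such a check, as a sketch only: the names `BT_MAX_CELLS`, `bt_check_size` and `bt_require_size` are hypothetical and do not exist in ksw2, and the limit of `INT32_MAX` cells is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed upper bound on backtrack cells; illustrative, not a ksw2 constant. */
#define BT_MAX_CELLS ((int64_t)INT32_MAX)

/* Hypothetical guard: returns 0 if a full backtrack matrix for the given
 * lengths is within the assumed limit, or -1 (with a message on stderr)
 * if the lengths are negative or the matrix would be too large. */
int bt_check_size(int32_t tlen, int32_t qlen)
{
    int64_t cells;
    if (tlen < 0 || qlen < 0) {
        fprintf(stderr, "[bt_check_size] negative sequence length\n");
        return -1;
    }
    cells = (int64_t)tlen * (int64_t)qlen; /* 64-bit product cannot wrap */
    if (cells > BT_MAX_CELLS) {
        fprintf(stderr, "[bt_check_size] %lld cells exceed the limit of %lld; "
                        "use banding or score-only mode\n",
                (long long)cells, (long long)BT_MAX_CELLS);
        return -1;
    }
    return 0;
}

/* Assertion-style variant: aborts instead of returning an error code
 * (a no-op when compiled with NDEBUG). */
void bt_require_size(int32_t tlen, int32_t qlen)
{
    assert(bt_check_size(tlen, qlen) == 0);
}
```

A caller would run `bt_check_size(tlen, qlen)` before allocating the backtrack arrays and bail out on -1, so the 50k pair would produce an error message rather than a crash.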

hannespetur commented

Thank you for your explanation.

Best, Hannes