wfa2 crashes on ends-free alignment with the following options and the three sequences aligned one after another
iAvicenna opened this issue · 3 comments
Hello,
I have been using wfa2 as part of a script that analyses NGS data. I noticed that when I tweak the parameters in a certain (but not unreasonable) way, it crashes once in a while. I isolated the problem to three sequences which, when aligned one after the other, cause the program to crash. I stripped the whole thing down to a bare-minimum program that still crashes, given below:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include "../ext/WFA2/wavefront/wfa.h"
int main(){
int i;
char *ref = "GCTCAGGATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAACAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGAGTC";
char *seqs[3] =
{
"CTACGTAGAGCTCAGGATGGCGCTCAGGAAAAGCTATGCTTGCCTTAGGGGATCTGTTTTGAGTTTCTTTAGTAGATTGAATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGAGAGTCGCTACGTAGAGATCGGAAGAG",
"TCTTCCGACTACAGGCTCAGGATGGGAATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAGTAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGAGTCCTGTAGTCGAGATCGGAAGA",
"ACGAGAGATACGCTCAGGATGGGATCAGGATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAACAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGATCGCTACGTAGAGATCGGAAG"
};
  /* Configure gap-affine 2-piece penalties with a match bonus of -1. */
  wavefront_aligner_t* wf_aligner;
  wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
  attributes.distance_metric = gap_affine_2p;
  attributes.affine2p_penalties.mismatch = 1;
  attributes.affine2p_penalties.match = -1;
  attributes.affine2p_penalties.gap_opening1 = 4;
  attributes.affine2p_penalties.gap_extension1 = 4;
  attributes.affine2p_penalties.gap_opening2 = 4;
  attributes.affine2p_penalties.gap_extension2 = 4;
  attributes.alignment_form.span = alignment_endsfree;
  wf_aligner = wavefront_aligner_new(&attributes);
  /* Free ends: pattern begin/end = 12, text begin/end = 6. */
  wavefront_aligner_set_alignment_free_ends(wf_aligner, 12, 12, 6, 6);
  /* Reuse the same aligner for all three reads; the crash occurs in this loop. */
  for (i=0; i<3; i++){
    wavefront_align(wf_aligner, seqs[i], strlen(seqs[i]), ref, strlen(ref));
  }
  wavefront_aligner_delete(wf_aligner);
  return 0;
}
I ran this with valgrind. The part of the log that pertains to the error is below (more output follows it, because wf_aligner was never freed due to the crash):
==31428== Memcheck, a memory error detector
==31428== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==31428== Using Valgrind-3.21.0.GIT and LibVEX; rerun with -h for copyright info
==31428== Command: /home/avicenna/Dropbox/local_packages/merlign/src//ign_debug
==31428==
==31428== Invalid read of size 8
==31428== at 0x127214: wavefront_extend_matches_packed_kernel (wavefront_extend.c:182)
==31428== by 0x127214: wavefront_extend_matches_packed_endsfree (wavefront_extend.c:253)
==31428== by 0x1277AA: wavefront_extend_endsfree (wavefront_extend.c:435)
==31428== by 0x12BE70: wavefront_unialign (wavefront_unialign.c:424)
==31428== by 0x11C6A8: wavefront_align_unidirectional (wavefront_align.c:123)
==31428== by 0x11C6A8: wavefront_align (wavefront_align.c:170)
==31428== by 0x11C27C: main (ign_debug.c:36)
==31428== Address 0xffffffffc56f635b is not stack'd, malloc'd or (recently) free'd
==31428==
==31428==
==31428== Process terminating with default action of signal 11 (SIGSEGV)
==31428== Access not within mapped region at address 0xFFFFFFFFC56F635B
==31428== at 0x127214: wavefront_extend_matches_packed_kernel (wavefront_extend.c:182)
==31428== by 0x127214: wavefront_extend_matches_packed_endsfree (wavefront_extend.c:253)
==31428== by 0x1277AA: wavefront_extend_endsfree (wavefront_extend.c:435)
==31428== by 0x12BE70: wavefront_unialign (wavefront_unialign.c:424)
==31428== by 0x11C6A8: wavefront_align_unidirectional (wavefront_align.c:123)
==31428== by 0x11C6A8: wavefront_align (wavefront_align.c:170)
==31428== by 0x11C27C: main (ign_debug.c:36)
==31428== If you believe this happened as a result of a stack
==31428== overflow in your program's main thread (unlikely but
==31428== possible), you can try to increase the size of the
==31428== main thread stack using the --main-stacksize= flag.
==31428== The main thread stack size used in this run was 8388608.
==31428==
==31428== HEAP SUMMARY:
==31428== in use at exit: 8,645,464 bytes in 20 blocks
==31428== total heap usage: 28 allocs, 8 frees, 8,671,936 bytes allocated
I also ran this with gdb to see exactly where it crashes, and it reports the fault here:
Program received signal SIGSEGV, Segmentation fault.
0x0000555555573214 in wavefront_extend_matches_packed_kernel (offset=-1073741823, k=0, wf_aligner=0x7ffff761d020) at wavefront_extend.c:253
253 offset = wavefront_extend_matches_packed_kernel(wf_aligner,k,offset);
Printing the offset value produces this:
$1 = -1073741823
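A side observation on this value: -1073741823 is exactly INT32_MIN/2 + 1. As far as I can tell (an assumption, not verified against this exact wfa2 version), wfa2 marks unused wavefront offsets with a sentinel equal to INT32_MIN/2, so the offset above looks like a null offset that was incremented once and then used as an index. The sketch below only checks that arithmetic; ASSUMED_OFFSET_NULL, distance_from_null and report_offset are hypothetical names for illustration, not wfa2 identifiers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: stand-in for wfa2's null-offset sentinel, believed to be
 * INT32_MIN/2. Hypothetical name; check the wfa2 headers for the real one. */
#define ASSUMED_OFFSET_NULL ((int32_t)(INT32_MIN / 2))

/* How far an offset lies above the assumed sentinel (0 means "is null"). */
int64_t distance_from_null(int32_t offset) {
  return (int64_t)offset - (int64_t)ASSUMED_OFFSET_NULL;
}

/* Print the comparison for one offset, e.g. the value reported by gdb. */
void report_offset(int32_t offset) {
  printf("offset=%" PRId32 " assumed_null=%" PRId32 " distance=%" PRId64 "\n",
         offset, ASSUMED_OFFSET_NULL, distance_from_null(offset));
}
```

Calling report_offset(-1073741823) from main gives a distance of 1, which is consistent with (but does not prove) a null offset leaking into the extend step.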
- Changing the gap penalty parameters etc. sometimes prevents the crash; however, valgrind still reports the error above.
- Using a subset of the sequences above, or only one of them, does not result in a crash.
- Freeing wf_aligner at the end of every loop iteration and reinitializing it also remedies the problem, so it seems to be a problem with internal state carried over in wf_aligner between alignments.
- Perhaps also worth noting: this problem, and some other cases where I had problems, disappear when I set the mismatch penalty to 2. It also does not appear with mismatch=1 but match=0 (instead of -1). So it seems to happen when the mismatch penalty is not large enough relative to the match bonus (I seem to recall running into similar problems with bowtie when using problematic parameters such as mismatch penalty=0).
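The last observation can be written down as a small pre-flight check. This is only a hypothetical heuristic that encodes the three parameter combinations reported above (mismatch=1 with match=-1 crashed; mismatch=2 with match=-1 and mismatch=1 with match=0 did not). It is not a rule taken from wfa2, and penalties_look_risky and warn_if_risky are made-up names.

```c
#include <stdbool.h>
#include <stdio.h>

/* HYPOTHETICAL heuristic, not part of wfa2: flag a parameter set when the
 * mismatch penalty does not exceed the match bonus (-match). It is chosen
 * only because it separates the three cases reported in this issue. */
bool penalties_look_risky(int mismatch, int match) {
  return mismatch <= -match;
}

/* Print a warning to stderr for a flagged parameter set; returns the flag. */
bool warn_if_risky(int mismatch, int match) {
  bool risky = penalties_look_risky(mismatch, match);
  if (risky) {
    fprintf(stderr,
            "warning: mismatch=%d with match=%d resembles the crashing case\n",
            mismatch, match);
  }
  return risky;
}
```

With the values from this report, only mismatch=1, match=-1 is flagged; other combinations may still misbehave, since the check reflects three data points and nothing more.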
Note that I have also run valgrind on other input (for instance, only the first two sequences above), and it produces a clean log in those cases. For valgrind I had to compile wfa2 with MARCH_FLAG="" BUILD_WFA_PARALLEL="0"; otherwise valgrind produces some warnings specific to those build options.
Further info:
- my system is Ubuntu 20.04 focal
- my gcc/g++ version is 9.4.0
- compiled the code above with the following:
gcc -c wfa2_bug.c -o wfa2_bug.o -Wall -Wextra -Wno-unused-function -O3 -I../ext/WFA2 -DVERBOSE
g++ wfa2_bug.o -o wfa2_bug -fopenmp -O3 -I../ext/WFA2 -lwfa -lm -lrt -lz -L../ext/WFA2/lib
- compiled wfa2 with make clean all
Let me know if there are any other diagnostics I can produce; this is as far as I could go with gdb and valgrind, since I don't know the internal details of wfa2. Is there a function to reset the internal state of a wavefront_aligner_t object to its default state without deleting it? That might serve as a temporary workaround until a patch is available.
Thanks
Hi,
I am busy these days, but I will fix this bug as soon as possible.
Cheers,
Hi,
I apologize for the terrible delay in answering this issue. I am sorry, but using the latest version (dev), I couldn't reproduce the memory problem. I tried with the same compiler and OS.
What is the specific CPU ISA you are using (x86, ARM, ...)?
Let us try to reproduce this problem and fix it.