aquaskyline/16GT

Segmentation fault

Closed this issue · 5 comments

Hello, I tried to follow the tutorial in the README file, but the program bam2snapshot crashed with a segmentation fault. Here is the gdb output:

...
Processing ref_is350_to_b3v06_masked_div30.bam...

Program received signal SIGSEGV, Segmentation fault.
0x0000000000418670 in getAmbPos (chr_id=0, offset=1, ambiguityMap=ambiguityMap@entry=0x6582c0, 
    translate=translate@entry=0x7ffff7f4c010, dnaLength=982460096) at indexFunction.cpp:22
22          while (translate[approxValue].startPos > ambPos) {

The bam file is rather big, but privately I can share it if needed.

--- edit ---
Some technical details: it is a sorted BAM file; reads were mapped using bwa-mem and PCR duplicates were marked using samblaster. I am running it on CentOS 7, compiled using gcc version 6.1.1.

The error message suggests that you are using a reference different from the one used for generating the bam file.

I am quite sure that the reference is the same. The sequence names in the FASTA and BAM files match for all 31,712 scaffolds.
-- edit --
One thing that might cause trouble: there are some unknown nucleotides in the reference, and the reference is soft-masked.
-- edit 2 --
Unmasking the genome did not help.
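
Matching sequence names do not by themselves guarantee matching sequence lengths, so one further sanity check is to compare the scaffold lengths in the FASTA file with the `LN` values of the `@SQ` lines in the BAM header. The following is only a minimal sketch of such a comparison: the function names and the script layout are hypothetical, and it assumes the header has been exported as text beforehand (for example with `samtools view -H`).

```python
import sys


def fasta_lengths(lines):
    """Return {sequence name: length} for FASTA text given as an iterable of lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the name up to the first whitespace, as most aligners do.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def header_lengths(lines):
    """Return {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in lines:
        if line.startswith("@SQ"):
            fields = line.rstrip("\n").split("\t")[1:]
            tags = dict(field.split(":", 1) for field in fields)
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def mismatches(fasta, header):
    """Names that are missing on one side or whose lengths differ."""
    names = fasta.keys() | header.keys()
    return sorted(n for n in names if fasta.get(n) != header.get(n))


# Hypothetical usage: python compare_lengths.py reference.fa header.sam
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as hdr:
        for name in mismatches(fasta_lengths(fa), header_lengths(hdr)):
            print(name)
```

An empty output would mean that every scaffold has the same name and length on both sides; any printed name would point to a scaffold worth inspecting.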

Working directly with your BAM file and reference would make my debugging faster, but before that, could you please check one more thing? Is there any alignment for which the starting position plus the sum of the numbers in front of the M, D, =, and X operations in the CIGAR string exceeds the length of the corresponding scaffold?
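
The check requested above could be scripted roughly as follows. This is a hedged sketch, not part of 16GT: the function names are hypothetical, the input is assumed to be SAM text including the header (for example from `samtools view -h`), scaffold lengths are taken from the `@SQ` header lines, and an alignment is reported when its last aligned base (1-based `POS` plus the M/D/=/X span, minus one) lies beyond the scaffold end.

```python
import re
import sys

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_OPS = {"M", "D", "=", "X"}  # the operations named in the request above


def reference_span(cigar):
    """Sum the lengths of the M, D, = and X operations in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_OPS)


def find_overhangs(sam_lines):
    """Yield (read, scaffold, pos, end, scaffold_length) for alignments
    whose last aligned base lies beyond the end of the scaffold."""
    lengths = {}
    for line in sam_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                lengths[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # blank or malformed line
        name, rname, cigar = fields[0], fields[2], fields[5]
        if rname == "*" or cigar == "*":
            continue  # unmapped read
        pos = int(fields[3])  # 1-based leftmost position
        end = pos + reference_span(cigar) - 1
        if rname in lengths and end > lengths[rname]:
            yield (name, rname, pos, end, lengths[rname])


# Hypothetical usage: samtools view -h file.bam | python check_overhang.py -
if __name__ == "__main__" and len(sys.argv) > 1:
    source = sys.stdin if sys.argv[1] == "-" else open(sys.argv[1])
    for hit in find_overhangs(source):
        print(*hit, sep="\t")
```

Any line printed by this script would identify a read that extends past the end of its scaffold according to the BAM header.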

Oh, I am sorry, @aquaskyline. I totally forgot to respond until I got the notification about the issue being closed. I am afraid I no longer have the file (we updated the reference and redid all the mapping), but I suppose that if no one else has had this problem so far, it was a problem specific to my files anyway.

Thanks, @KamilSJaron, for the information.