Illumina/manta

Failed to complete command task: 'generateCandidateSV_0000', error code: 1

Boer223 opened this issue · 7 comments

Hi,

When I run Manta to call SVs on the BAM files of 125 samples, it fails with error code 1. The error log file is here:
workflow.error.log.txt

How can I solve this error? Thank you in advance!

jykr commented

I had a similar error in workflow.error.log.txt, and my task error log file (workspace/pyflow.data/logs/pyflow_tasks_stderr_log.txt) says:

[2020-10-02T18:40:36.040260Z] [compute-e-16-221.o2.rc.hms.harvard.edu] [1281_1] [generateCandidateSV_0000] FATAL_ERROR: 2020-Oct-02 14:40:36 /builder/src/c++/lib/manta/SVCandidateSetData.cpp(125): Throw in function void SVCandidateSetSequenceFragmentSampleGroup::add(const bam_header_info&, const bam_record&, bool, bool, bool)
[2020-10-02T18:40:36.042183Z] [compute-e-16-221.o2.rc.hms.harvard.edu] [1281_1] [generateCandidateSV_0000] Dynamic exception type: boost::exception_detail::clone_impl<illumina::common::GeneralException>
[2020-10-02T18:40:36.042925Z] [compute-e-16-221.o2.rc.hms.harvard.edu] [1281_1] [generateCandidateSV_0000] std::exception::what: Unexpected alignment name collision. Source: 'datapath/my.bam'
[2020-10-02T18:40:36.043714Z] [compute-e-16-221.o2.rc.hms.harvard.edu] [1281_1] [generateCandidateSV_0000]      Existing read: MG01HX02:343:HTCVYCCXX:6:2121:22577:33182/2 chrom:pos:strand MT:1:+ cigar: 31S120M templateSize: 370 SA: 1,77436909,-,97S49M5S,39,3; mateChrom:pos:strand MT:222:-
[2020-10-02T18:40:36.044374Z] [compute-e-16-221.o2.rc.hms.harvard.edu] [1281_1] [generateCandidateSV_0000]      New read: MG01HX02:343:HTCVYCCXX:6:2121:22577:33182/2 chrom:pos:strand MT:1:+ cigar: 31S120M templateSize: 370 SA: 1,77436909,-,97S49M5S,39,3; mateChrom:pos:strand MT:222:-

This says there is an exact duplicate alignment entry in the BAM file. When I checked my BAM file, it did contain two identical rows:

MG01HX02:343:HTCVYCCXX:6:2121:22577:33182       163     MT      1       60      31S120M =       222     370     ACACGTTCCCCTTAAATAAGACATCACGANGGATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGCATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC       =8=:6>>=>>=>????>???>=?>>==9>#>?>:>?=?????@@??@>@@@@?=@@>@@>@>@>:79>@?@?@?@@=??@?@@?@?@?@@@?:=?>?<@@95;@<<?@>:?;??@@?A@A@@;>A7;5@?@A@A@=-)<>>><7;@?;2:9       SA:Z:1,77436909,-,97S49M5S,39,3;        BD:Z:IIHHMNMLKIILJJKCJHKJJLHKLLHJLIIIJLLLHHNMLMJLHLLHMHLLHJJKJMKHLJLHJJIJKONJJJJKKKNNKJCJKLLHJCCJKKMJNKIIIIMNKKNNHJKKLLHMONKJJOLMLKMKLOOLKLPOLLLMQQKPKOOHKHM  MD:Z:71T48      BI:Z:KKJIKLLJJHHLJLJDKIJKKJIKKLHJLJJJKLKLHIMLLKJLIKLHKHLLIJLJJKKHLFLHJJHKKMNFJFJKKKLMKJEKKLKIJEFILLLKNLIIIINNLLMNIKLJMMJMNNLKLMJMLLKLMPOMMMONLMNNPPLNJMMJLJL    NM:i:1  AS:i:115     XS:i:84  RG:Z:LMG01HX02_343_HTCVYCCXX_6  PG:Z:MarkDuplicates
MG01HX02:343:HTCVYCCXX:6:2121:22577:33182       163     MT      1       60      31S120M =       222     370     ACACGTTCCCCTTAAATAAGACATCACGANGGATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGCATGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC       =8=:6>>=>>=>????>???>=?>>==9>#>?>:>?=?????@@??@>@@@@?=@@>@@>@>@>:79>@?@?@?@@=??@?@@?@?@?@@@?:=?>?<@@95;@<<?@>:?;??@@?A@A@@;>A7;5@?@A@A@=-)<>>><7;@?;2:9       SA:Z:1,77436909,-,97S49M5S,39,3;        BD:Z:IIHHMNMLKIILJJKCJHKJJLHKLLHJLIIIJLLLHHNMLMJLHLLHMHLLHJJKJMKHLJLHJJIJKONJJJJKKKNNKJCJKLLHJCCJKKMJNKIIIIMNKKNNHJKKLLHMONKJJOLMLKMKLOOLKLPOLLLMQQKPKOOHKHM  MD:Z:71T48      BI:Z:KKJIKLLJJHHLJLJDKIJKKJIKKLHJLJJJKLKLHIMLLKJLIKLHKHLLIJLJJKKHLFLHJJHKKMNFJFJKKKLMKJEKKLKIJEFILLLKNLIIIINNLLMNIKLJMMJMNNLKLMJMLLKLMPOMMMONLMNNPPLNJMMJLJL    NM:i:1  AS:i:115     XS:i:84  RG:Z:LMG01HX02_343_HTCVYCCXX_6  PG:Z:MarkDuplicates

I saw that Manta doesn't allow a read to be aligned to distinct positions without the secondary alignment flag, but is it problematic to have a duplicate entry in the BAM file? Is there a simple way to work around this without changing the BAM file?
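To see how widespread such identical rows are before deciding what to do, the SAM text can be scanned for records that occur more than once. The following is only a minimal sketch: the function name is hypothetical, it works on plain SAM lines (for example the output of samtools view), and it holds every record in memory, so it is best run on one contig or region at a time.

```python
from collections import Counter


def find_exact_duplicates(sam_lines):
    """Return {record: count} for alignment lines that occur more than once.

    sam_lines is any iterable of SAM text lines. Header lines (starting
    with '@') and blank lines are skipped. Two records count as duplicates
    only if the whole line is identical, tags included.
    """
    counts = Counter(
        line.rstrip("\n")
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
    return {record: n for record, n in counts.items() if n > 1}
```

One way to use it is to pipe samtools view output for a single contig (MT in the log above) into a small script that calls find_exact_duplicates(sys.stdin) and prints the read name, flag and position of each hit.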

jykr commented

It turned out that duplicate reads in the BAM file (produced by parallel alignment) caused the error. I ended up temporarily editing src/c++/lib/manta/SVCandidateSetData.cpp (from around line 151) so that Manta prints the duplicated read entries instead of throwing an error, and I check the duplicated reads after Manta finishes running.

Hi @jykr could you please share your edited code here?

Thanks

jykr commented

@gbdias Can you see this page? The forked repository with the altered Manta is here. The code is clumsy, but it worked for me. You can build Manta from source by following the instructions in the Manta build procedure.

This writes the error messages for the clashing reads to the output stream, so you can manually check how many and which reads are duplicated by reading $run_dir/workspace/pyflow.data/logs/pyflow_tasks_stdout_log.txt.

I get the same error on my BAM (WES):

[181942_1] [TaskManager] [ERROR] Failed to complete command task: 'generateCandidateSV_0000' launched from master workflow, error code: 1

I have even tried to break my BAM into individual chromosomes but the same error occurs and everything stops.

Would someone provide some help, please?

I solved this problem!

You can modify one parameter in this file:

~/manta-1.6.0.release_src/src/c++/lib/blt_util/qscore_cache.hpp

There you will find this code:

enum { MAX_QSCORE =70, MAX_MAP = 90 };

Then set MAX_QSCORE to 200 or more.

Finally, rebuild and reinstall Manta following https://github.com/Illumina/manta/blob/master/docs/userGuide/installation.md.
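Raising MAX_QSCORE presumably only matters when the BAM contains base qualities above the compiled-in limit of 70; that is a reading of the constant's name, not something confirmed in this thread. Under that assumption, a quick way to check whether a file is affected is to look at the largest quality value in the QUAL column. The sketch below uses a hypothetical function name and assumes the standard Phred+33 encoding of SAM text.

```python
def max_base_quality(sam_lines, offset=33):
    """Return the highest Phred base quality found in SAM text lines.

    Assumes QUAL (column 11) is Phred+33 encoded, as in standard SAM.
    Header lines and records without stored qualities ('*') are skipped.
    Returns None if no quality string was seen.
    """
    best = None
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[10] in ("", "*"):
            continue
        q = max(ord(c) for c in fields[10]) - offset
        if best is None or q > best:
            best = q
    return best
```

If the value returned for a sample of reads is at most 70, the quality limit is probably not what is causing the failure, and the duplicate-record explanation given earlier in the thread is worth checking first.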

Same error on my side. I use the pre-compiled binaries with Python 2.7.

Manta works after removing the duplicate reads from the BAM file:

samtools rmdup -S $bamfile $newbamfile
samtools index $newbamfile

Does it make sense to you to do this?
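One caveat: samtools rmdup removes reads it considers PCR duplicates, which goes further than dropping byte-identical records, and the samtools manual marks rmdup as obsolete in favour of markdup. If the only problem is the identical rows described earlier in the thread, a narrower filter may be enough. The sketch below uses a hypothetical function name and assumes coordinate-sorted SAM text, so identical records sit in the same position group and only that group has to be kept in memory.

```python
def drop_exact_duplicates(sam_lines):
    """Yield SAM lines, skipping records identical to an earlier one.

    Assumes coordinate-sorted input: exact duplicates share RNAME and POS,
    so the set of seen records is reset whenever the position changes.
    Header lines (starting with '@') are passed through unchanged.
    """
    current_pos = None
    seen = set()
    for line in sam_lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.split("\t", 4)
        pos = (fields[2], fields[3])  # (RNAME, POS)
        if pos != current_pos:
            current_pos = pos
            seen = set()
        record = line.rstrip("\n")
        if record in seen:
            continue  # identical to a record already emitted at this position
        seen.add(record)
        yield line
```

Wrapped in a script that writes drop_exact_duplicates(sys.stdin) to standard output, this could sit between samtools view -h on the input and samtools view -b -o on the output, followed by samtools index as above. Unlike rmdup, it leaves ordinary duplicate-marked reads untouched.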