Failed to complete command task: 'rm_individual_seg_files' launched from master workflow,
Dear author,
I recently built the Singularity environment from the Docker image. I got the error mentioned in the title when I ran it. The detailed error output is as follows:
[2020-11-17T17:07:15.734560] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Worklow terminated due to the following task errors:
[2020-11-17T17:07:15.735768] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Failed to complete command task: 'rm_individual_seg_files' launched from master workflow, error code: 1, command: 'rm'
[2020-11-17T17:07:15.736237] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Error Message:
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Last 2 stderr lines from task (of 2 total lines):
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975587] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] rm: missing operand
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975877] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] Try 'rm --help' for more information.
Do you have any clue why this error happened? Can you please help me to solve it?
Additional information: I was running the program on paired WGS data from canine tumour and normal tissue. The required reference files were prepared properly, except for the snp_sites.gz file. However, I removed the --callRegions option from the Strelka SNP-calling step in main.py, so the program can still run without the snp_sites file.
Regards,
Yun
Thanks for your response!
I attach the pyflow log files.
I have a SNP-sites file for dogs, but I decided to disable --callRegions because that file produced a strange error: chr33 in snp_sites.gz was not found in the reference genome. Maybe disabling that option was a bad idea. Do you have any idea why chr33 in particular cannot be found in the reference genome? I checked the reference genome file, and I believe chr33 is there.
Thanks for your help in advance!
Regards,
Yun
Did you check whether chr33 is in your genome index files (e.g. genome.dict)?
We saw this in the pyflow_tasks_stdout_log.txt, which suggests all reads in your bam fail to pass our filters:
Reading in genome coverage from "/home/WUR/yu052/DogWUR108_rh.dedup_st.reA.bam" ...
Reading and smoothing of coverage from "/home/WUR/yu052/DogWUR108_rh.dedup_st.reA.bam" is Done. 0 unique chromosomes, 902274498 reads.
Genome wide mean coverage is NaN
Reading in genome coverage from "/home/WUR/yu052/DogWUR115_rh.dedup_st.reA.bam" ...
Reading and smoothing of coverage from "/home/WUR/yu052/DogWUR115_rh.dedup_st.reA.bam" is Done. 0 unique chromosomes, 174036637 reads.
Genome wide mean coverage is NaN
Our Rust program contains these filters. Can you check your BAM to see why all reads fail to pass them?
if record.mapq()<30 {
    continue;
}
if record.is_paired() && ( record.insert_size()<0 || record.insert_size()>self.max_fragment_len as i64 ||
    !record.is_proper_pair() || record.is_mate_unmapped() || !record.is_first_in_template() ||
    record.is_secondary() || record.is_duplicate() || record.is_supplementary() ) {
    continue;
}
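To find out which condition rejects a given read, the filters above can be approximated on the raw SAM fields (MAPQ, FLAG, TLEN). The following is only a sketch, not the program's actual code: the function name first_failed_filter and the max_fragment_len value of 1000 are assumptions made for illustration, and the flag bits are the standard ones from the SAM specification.

```rust
// SAM flag bits, as defined in the SAM specification.
const PAIRED: u16 = 0x1;
const PROPER_PAIR: u16 = 0x2;
const MATE_UNMAPPED: u16 = 0x8;
const FIRST_IN_TEMPLATE: u16 = 0x40;
const SECONDARY: u16 = 0x100;
const DUPLICATE: u16 = 0x400;
const SUPPLEMENTARY: u16 = 0x800;

/// Hypothetical helper: returns None if the read would pass the filters,
/// otherwise the name of the first check that rejects it.
fn first_failed_filter(
    mapq: u8,
    flag: u16,
    tlen: i64,
    max_fragment_len: i64,
) -> Option<&'static str> {
    if mapq < 30 {
        return Some("mapq < 30");
    }
    // As in the filter shown above, the remaining checks apply to paired reads only.
    if (flag & PAIRED) != 0 {
        if tlen < 0 {
            return Some("negative insert size");
        }
        if tlen > max_fragment_len {
            return Some("insert size too large");
        }
        if (flag & PROPER_PAIR) == 0 {
            return Some("not a proper pair");
        }
        if (flag & MATE_UNMAPPED) != 0 {
            return Some("mate unmapped");
        }
        if (flag & FIRST_IN_TEMPLATE) == 0 {
            return Some("not first in template");
        }
        if (flag & SECONDARY) != 0 {
            return Some("secondary");
        }
        if (flag & DUPLICATE) != 0 {
            return Some("duplicate");
        }
        if (flag & SUPPLEMENTARY) != 0 {
            return Some("supplementary");
        }
    }
    None
}

fn main() {
    // (MAPQ, FLAG, TLEN) triples as they would appear in `samtools view` output.
    let reads = [(60u8, 99u16, 350i64), (60, 147, -350), (20, 99, 350), (60, 1123, 350)];
    for (mapq, flag, tlen) in reads.iter() {
        // max_fragment_len = 1000 is an assumed value, not the program's default.
        match first_failed_filter(*mapq, *flag, *tlen, 1000) {
            None => println!("mapq={} flag={} tlen={}: passes", mapq, flag, tlen),
            Some(why) => println!("mapq={} flag={} tlen={}: rejected ({})", mapq, flag, tlen, why),
        }
    }
}
```

Note that only the first mate of a properly paired, non-duplicate pair with a non-negative insert size can pass, so a large share of rejected reads is expected even for a good BAM. All reads failing at once points to something other than these per-read checks.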
I checked that chromosome 33 is indeed in the genome.fa, genome.dict, and genome.fa.fai.
It is weird that all the reads in the bam failed to pass the filters, isn't it? I confirmed that most reads have mapq 60.
Do you have any other clue why I got these errors?
Kind regards
Thanks for your response!
Actually, I am pretty sure my BAM files are good. I checked them in JBrowse and IGV. There are some mismapped reads, of course, but not many. It makes no sense that every read would fail the filters, right?
Strelka failed at the first step of your Accucopy pipeline, but I succeeded in running Strelka independently. This also confuses me.
You can copy your independent Strelka into the Docker container, overwriting the bundled version, and run it inside the container to see if anything strange happens.
Your BAM files identify chromosomes as "chr1", not "1", right? Our program assumes "chr1", not "1".
Is it possible to make your program compatible with both formats?
Sorry that I didn't make it clear. I mean the chromosome naming format: chr1 versus 1.
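One way to accept both naming styles would be to normalize chromosome names before comparing them. This is a minimal sketch of the idea in Rust, not code from the program; the helper name to_chr_prefixed is hypothetical.

```rust
/// Hypothetical helper: map "1" to "chr1" and leave "chr1" unchanged.
fn to_chr_prefixed(name: &str) -> String {
    if name.starts_with("chr") {
        name.to_string()
    } else {
        format!("chr{}", name)
    }
}

fn main() {
    // Print the normalized form of a few example names.
    for name in ["1", "chr1", "33", "X"].iter() {
        println!("{} -> {}", name, to_chr_prefixed(name));
    }
}
```

This only handles the "chr" prefix. Other naming differences between references, such as "MT" versus "chrM" for the mitochondrial genome, would need an explicit mapping.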