Error in rule apply_quality_filter
johnne opened this issue · 1 comments
johnne commented
- I checked and didn't find a related issue, e.g. while typing the title
- I got an error in the following rule(s):
apply_quality_filter
- I checked the log files indicated in the error message (and the cluster logs if submitted to a cluster)
Here is the relevant log output:
java -ea -Xmx51G -Xms51G -cp /crex/proj/snic2020-5-486/nobackup/SMS-23-6668-micegut/resources/conda_envs/7e5a7fea79fbf3816713ab3b8fc02eb3_/opt/bbmap-39.01-0/current
/ jgi.BBDuk in1=m.c-2/sequence_quality_control/m.c-2_deduplicated_R1.fastq.gz in2=m.c-2/sequence_quality_control/m.c-2_deduplicated_R2.fastq.gz ref=/proj/snic2020-5
-486/nobackup/SMS-23-6668-micegut/resources/adapters.fa interleaved=f out=m.c-2/sequence_quality_control/m.c-2_filtered_R1.fastq.gz out2=m.c-2/sequence_quality_cont
rol/m.c-2_filtered_R2.fastq.gz outs=m.c-2/logs/m.c-2_quality_filtering_stats.txt stats=m.c-2/logs/m.c-2_quality_filtering_stats.txt overwrite=true qout=33 trd=t hdi
st=1 k=27 ktrim=r mink=8 trimq=10 qtrim=rl threads=8 minlength=51 maxns=-1 minbasefrequency=0.05 ecco=t prealloc=t pigz=t unpigz=t -Xmx51G
Executing jgi.BBDuk [in1=m.c-2/sequence_quality_control/m.c-2_deduplicated_R1.fastq.gz, in2=m.c-2/sequence_quality_control/m.c-2_deduplicated_R2.fastq.gz, ref=/proj
/snic2020-5-486/nobackup/SMS-23-6668-micegut/resources/adapters.fa, interleaved=f, out=m.c-2/sequence_quality_control/m.c-2_filtered_R1.fastq.gz, out2=m.c-2/sequenc
e_quality_control/m.c-2_filtered_R2.fastq.gz, outs=m.c-2/logs/m.c-2_quality_filtering_stats.txt, stats=m.c-2/logs/m.c-2_quality_filtering_stats.txt, overwrite=true,
qout=33, trd=t, hdist=1, k=27, ktrim=r, mink=8, trimq=10, qtrim=rl, threads=8, minlength=51, maxns=-1, minbasefrequency=0.05, ecco=t, prealloc=t, pigz=t, unpigz=t,
-Xmx51G]
Version 39.01
Set INTERLEAVED to false
Set threads to 8
Initial size set to 506877058
maskMiddle was disabled because useShortKmers=true
Exception in thread "main" java.lang.AssertionError: Duplicate file m.c-2/logs/m.c-2_quality_filtering_stats.txt was specified for multiple output streams.
at shared.Tools.testOutputFiles(Tools.java:133)
at jgi.BBDuk.<init>(BBDuk.java:919)
at jgi.BBDuk.main(BBDuk.java:78)
Atlas version
2.15.0
Additional context
It seems that the stats file (specified as stats="{sample}/logs/{sample}_quality_filtering_stats.txt", in the qc.smk rules file) is used both for outs= and stats=. My guess is that this comes from
outputs=(
lambda wc, output: "out={0} out2={1} outs={2}".format(*output)
if PAIRED_END
else "out={0}".format(*output)
),
starting on line 372 in workflow/rules/qc.smk.
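To illustrate the guess, here is a minimal sketch of how positional formatting could end up passing the stats path to outs=. It assumes (not confirmed against the actual rule) that the rule's output list holds only the two read files followed directly by the stats file, so index 2 is the stats file rather than a singleton-reads file. The function name format_outputs and the list layout are hypothetical; the paths are taken from the log above.

```python
def format_outputs(output, paired_end=True):
    """Mimic the `outputs` lambda quoted above on a plain list of paths."""
    if paired_end:
        return "out={0} out2={1} outs={2}".format(*output)
    return "out={0}".format(*output)


# Hypothetical output list: two read files followed directly by the stats
# file, i.e. no singleton-read file in third position (an assumption).
output = [
    "m.c-2/sequence_quality_control/m.c-2_filtered_R1.fastq.gz",
    "m.c-2/sequence_quality_control/m.c-2_filtered_R2.fastq.gz",
    "m.c-2/logs/m.c-2_quality_filtering_stats.txt",
]
stats = output[2]

# Under this assumption, outs= and stats= both receive the same path.
command_args = format_outputs(output) + " stats=" + stats
print(command_args)
```

If the output list really looks like this, the same file name appears twice in the arguments, which matches the "Duplicate file ... was specified for multiple output streams" assertion in the log.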
johnne commented
Ok I found the error. I think it was due to me having the Reads_QC_R1 and Reads_QC_R2 columns in my sample file, but with empty values for my samples. Removing those columns seems to have fixed the issue.
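For anyone hitting the same thing, a rough sketch of the workaround in plain Python: drop the Reads_QC_R1 and Reads_QC_R2 columns from the sample table when they are empty for every sample. This assumes the sample file is tab-separated; the helper name drop_empty_columns and the example table contents are made up for illustration.

```python
import csv
import io


def drop_empty_columns(tsv_text, columns=("Reads_QC_R1", "Reads_QC_R2")):
    """Return the table with the named columns removed if they hold no values."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    fieldnames = reader.fieldnames or []
    # A column counts as empty when every row has a blank (or missing) value.
    empty = [
        c for c in columns
        if c in fieldnames and all(not (row[c] or "").strip() for row in rows)
    ]
    kept = [f for f in fieldnames if f not in empty]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, delimiter="\t",
        extrasaction="ignore", lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample table with empty QC columns.
example = (
    "sample\tReads_raw_R1\tReads_raw_R2\tReads_QC_R1\tReads_QC_R2\n"
    "m.c-2\traw_R1.fastq.gz\traw_R2.fastq.gz\t\t\n"
)
cleaned = drop_empty_columns(example)
print(cleaned)
```

Columns that contain a value for at least one sample are left alone, so a table with real QC reads is not changed.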