Extensive testing of subworkflows
Closed this issue · 13 comments
Please create sub-issues for each subworkflow.
dup_to_ins parameter does not exist
sample parameter into truth-id
The test_liftover.config file says:
nextflow run nf-core/benchmark -profile test_liftover,<docker/singularity> --outdir <OUTDIR>
It should be:
nextflow run nf-core/variantbenchmarking ....
Add info about truth samples and link the GIAB truth FTP directories.
I added normalization and deduplication preprocessing to test_somatic.config. When I ran
nextflow run . -profile test_somatic,docker --outdir test_results
I got this error:
Command error:
[W::vcf_parse_info] INFO 'STRANDS' is not defined in the header, assuming Type=String
REF_MISMATCH chr1 669501 N A
REF_MISMATCH chr1 964661 N G
[E::vcf_format] Invalid BCF, the INFO tag id=13 is too large at chr1:669501
[flush_buffer] Error: cannot write to 13059_2022_2816_MOESM4_ESM.rh.dedup.vcf.gz
Similar issue: samtools/bcftools#1720
I added filter_contig to test_stub.config and then ran
nextflow run . -profile test_stub,docker --outdir test_results -stub
Error message:
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/variantbenchmarking] Pipeline completed with errors-
WARN: Killing running tasks (9)
ERROR ~ Error executing process > 'NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY (freebayes)'
Caused by:
Process `NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY (freebayes)` terminated with an error exit status (1)
Command executed:
echo "" | gzip > freebayes.HG002.small.freebayes.roc.all.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.INDEL.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.INDEL.PASS.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.SNP.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.SNP.PASS.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.metrics.json.gz
echo "" | gzip > freebayes.HG002.small.freebayes.vcf.gz
touch freebayes.HG002.small.freebayes.vcf.gz.tbi
touch freebayes.HG002.small.freebayes.summary.csv
touch freebayes.HG002.small.freebayes.extended.csv
touch freebayes.HG002.small.freebayes.runinfo.json
cat <<-END_VERSIONS > versions.yml
"NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY":
hap.py: 0.3.14
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
.command.sh: line 14: freebayes.HG002.small.freebayes.vcf.gz: cannot overwrite existing file
Work dir:
/workspace/variantbenchmarking/work/41/4240508390b8f5c64ca437a44466ea
Container:
quay.io/biocontainers/hap.py:0.3.14--py27h5c5a3ab_0
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
Join mismatch for the following entries: key=[id:tiddit2, caller:tiddit, vartype:sv, subsample:HCC1395_HCC1395T, normshift:null, normdist:null, normsizediff:null, maxdist:null, pctsize:null, pctseq:null, pctovl:null, refdist:null, chunksize:null, dup_to_ins:null, typeignore:null, bpDistance:null, percentThreshold:null, absoluteThreshold:null, maxMatches:null, evaluationmode:null] values=
Join mismatch for the following entries: key=[id:freebayes, caller:freebayes, vartype:small, subsample:null, normshift:null, normdist:null, normsizediff:null, maxdist:null, pctsize:null, pctseq:null, pctovl:null, refdist:null, chunksize:null, dup_to_ins:null, typeignore:null, bpDistance:null, percentThreshold:null, absoluteThreshold:null, maxMatches:null, evaluationmode:null] values=
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
dup_to_ins parameter does not exist
It exists as a truvari parameter that is processed from ext.args; it should be defined in the samplesheet.
sample parameter into truth-id
That is resolved in #100.
The test_liftover.config file says:
nextflow run nf-core/benchmark -profile test_liftover,<docker/singularity> --outdir <OUTDIR>
It should be:
nextflow run nf-core/variantbenchmarking ....
This is fixed.
Add info about truth samples and link the GIAB truth FTP directories.
A specific .md file has been created for the truth files.
I added normalization and deduplication preprocessing to test_somatic.config. When I ran
nextflow run . -profile test_somatic,docker --outdir test_results
I got this error:
Command error:
[W::vcf_parse_info] INFO 'STRANDS' is not defined in the header, assuming Type=String
REF_MISMATCH chr1 669501 N A
REF_MISMATCH chr1 964661 N G
[E::vcf_format] Invalid BCF, the INFO tag id=13 is too large at chr1:669501
[flush_buffer] Error: cannot write to 13059_2022_2816_MOESM4_ESM.rh.dedup.vcf.gz
Similar issue: samtools/bcftools#1720
I will skip this error, since it originates from misformatting in the truth VCF.
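For anyone who wants to check a truth VCF for this kind of header problem before running the pipeline, here is a minimal sketch. It only covers the first warning in the log (an INFO key used in records but not declared in the header) and does not address the REF_MISMATCH lines. The file it builds is a synthetic stand-in, not the real truth VCF, and all variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Sketch: list INFO keys that appear in VCF records but have no ##INFO header line.
# The VCF below is synthetic; it only mimics the STRANDS warning from the log.
tmp=$(mktemp -d)
vcf="$tmp/example.vcf"

{
  printf '##fileformat=VCFv4.2\n'
  printf '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="SV type">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t669501\t.\tN\tA\t.\tPASS\tSVTYPE=DEL;STRANDS=+-\n'
  printf 'chr1\t964661\t.\tN\tG\t.\tPASS\tSTRANDS=+-\n'
} > "$vcf"

undefined=$(awk -F'\t' '
  # Collect every INFO ID declared in the header
  /^##INFO=<ID=/ {
    id = $0
    sub(/^##INFO=<ID=/, "", id)
    sub(/[,>].*/, "", id)
    declared[id] = 1
    next
  }
  # Skip all other header lines
  /^#/ { next }
  # For each record, report INFO keys (column 8) that were never declared
  {
    n = split($8, fields, ";")
    for (i = 1; i <= n; i++) {
      key = fields[i]
      sub(/=.*/, "", key)
      if (key != "." && !(key in declared) && !(key in reported)) {
        reported[key] = 1
        print key
      }
    }
  }' "$vcf")

echo "INFO keys missing from header: $undefined"
rm -rf "$tmp"
```

On the synthetic file this reports STRANDS. For a real, compressed VCF the same awk program could be fed from `gzip -dc file.vcf.gz` instead of reading the file directly.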
I added filter_contig to test_stub.config and then ran
nextflow run . -profile test_stub,docker --outdir test_results -stub
Error message:
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/variantbenchmarking] Pipeline completed with errors-
WARN: Killing running tasks (9)
ERROR ~ Error executing process > 'NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY (freebayes)'
Caused by:
Process `NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY (freebayes)` terminated with an error exit status (1)
Command executed:
echo "" | gzip > freebayes.HG002.small.freebayes.roc.all.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.INDEL.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.INDEL.PASS.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.SNP.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.roc.Locations.SNP.PASS.csv.gz
echo "" | gzip > freebayes.HG002.small.freebayes.metrics.json.gz
echo "" | gzip > freebayes.HG002.small.freebayes.vcf.gz
touch freebayes.HG002.small.freebayes.vcf.gz.tbi
touch freebayes.HG002.small.freebayes.summary.csv
touch freebayes.HG002.small.freebayes.extended.csv
touch freebayes.HG002.small.freebayes.runinfo.json
cat <<-END_VERSIONS > versions.yml
"NFCORE_VARIANTBENCHMARKING:VARIANTBENCHMARKING:SMALL_GERMLINE_BENCHMARK:HAPPY_HAPPY":
hap.py: 0.3.14
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
.command.sh: line 14: freebayes.HG002.small.freebayes.vcf.gz: cannot overwrite existing file
Work dir:
/workspace/variantbenchmarking/work/41/4240508390b8f5c64ca437a44466ea
Container:
quay.io/biocontainers/hap.py:0.3.14--py27h5c5a3ab_0
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
Join mismatch for the following entries: key=[id:tiddit2, caller:tiddit, vartype:sv, subsample:HCC1395_HCC1395T, normshift:null, normdist:null, normsizediff:null, maxdist:null, pctsize:null, pctseq:null, pctovl:null, refdist:null, chunksize:null, dup_to_ins:null, typeignore:null, bpDistance:null, percentThreshold:null, absoluteThreshold:null, maxMatches:null, evaluationmode:null] values=
Join mismatch for the following entries: key=[id:freebayes, caller:freebayes, vartype:small, subsample:null, normshift:null, normdist:null, normsizediff:null, maxdist:null, pctsize:null, pctseq:null, pctovl:null, refdist:null, chunksize:null, dup_to_ins:null, typeignore:null, bpDistance:null, percentThreshold:null, absoluteThreshold:null, maxMatches:null, evaluationmode:null] values=
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
This was resolved in #98.
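For context on the "cannot overwrite existing file" line in the log: that is the message bash prints when the noclobber option (`set -C`) is on and a plain `>` redirect targets a file that already exists. The sketch below reproduces it in isolation, assuming the stub script ran with noclobber enabled and that a file with the output's name was already present in the work directory. It is not a description of how #98 fixed the problem, and the file name is a hypothetical stand-in.

```shell
#!/usr/bin/env bash
# Sketch: reproduce bash's "cannot overwrite existing file" under noclobber.
# The file name is a hypothetical stand-in for the stub output.
tmp=$(mktemp -d)
out="$tmp/sample.vcf.gz"

touch "$out"                 # a file with the output's name already exists

set -C                       # noclobber: a plain '>' refuses to overwrite
if (echo "" | gzip > "$out") 2>/dev/null; then
    status="overwritten"
else
    status="refused"
fi

# '>|' explicitly overrides noclobber for a single redirection
if echo "" | gzip >| "$out"; then
    forced="ok"
else
    forced="failed"
fi
set +C

echo "plain redirect: $status, forced redirect: $forced"
rm -rf "$tmp"
```

Under these assumptions the plain redirect is refused and the `>|` redirect succeeds. In a stub block, avoiding an output name that collides with an existing file is probably the cleaner option than forcing the overwrite.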