Workflow that extracts UMIs from FASTQ files and generates consensus BAM files, then runs them through the mutect2 and combinevariants tasks.
java -jar cromwell.jar run umiConsensus.wdl --inputs inputs.json
Parameter | Value | Description |
---|---|---|
outputFileNamePrefix | String | Prefix to use for output file |
intervalFile | String | interval file to subset variant calls |
reference | String | the reference build of the genome |
Parameter | Value | Default | Description |
---|---|---|---|
fastqGroups | Array[fastqGroup]? | None | Array of fastq files to concatenate if a top-up |
sortedBam | File? | None | Bam file from bwamem |
sortedBai | File? | None | Bai file from bwamem |
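
A minimal sketch of an `inputs.json` for the Cromwell command above, written from the shell. It assumes Cromwell's usual `<workflow name>.<parameter>` key convention with the workflow named `umiConsensus`. Every path and value shown is a hypothetical placeholder, and the accepted values for `reference` are not listed in this README, so `hg38` is only an illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All values below are hypothetical placeholders; replace them with real ones.
# Keys assume the workflow is named "umiConsensus".
cat > inputs.json <<'EOF'
{
  "umiConsensus.outputFileNamePrefix": "SAMPLE_0001",
  "umiConsensus.intervalFile": "/path/to/targets.bed",
  "umiConsensus.reference": "hg38",
  "umiConsensus.sortedBam": "/path/to/SAMPLE_0001.sorted.bam",
  "umiConsensus.sortedBai": "/path/to/SAMPLE_0001.sorted.bam.bai"
}
EOF

# Count how many of the three required workflow parameters appear in the file
required_found=$(grep -c -E '"umiConsensus\.(outputFileNamePrefix|intervalFile|reference)"' inputs.json)
echo "required parameters present: ${required_found}"
```

This variant supplies an existing alignment through `sortedBam` and `sortedBai`; the structure of a `fastqGroup` entry is not described here, so `fastqGroups` is left out.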
Parameter | Value | Default | Description |
---|---|---|---|
align.consensusCruncherPy | String | "$CONSENSUS_CRUNCHER_ROOT/bin/ConsensusCruncher.py" | Path to consensusCruncher binary |
align.bwa | String | "$BWA_ROOT/bin/bwa" | Path to bwa binary |
align.samtools | String | "$SAMTOOLS_ROOT/bin/samtools" | Path to samtools binary |
align.threads | Int | 4 | Number of threads to request |
align.jobMemory | Int | 16 | Memory allocated for this job |
align.timeout | Int | 72 | Hours before task timeout |
mergeBams.additionalParams | String? | None | Additional parameters to pass to GATK MergeSamFiles. |
mergeBams.jobMemory | Int | 48 | Memory allocated to job (in GB). |
mergeBams.overhead | Int | 6 | Java overhead memory (in GB). jobMemory - overhead == java Xmx/heap memory. |
mergeBams.cores | Int | 1 | The number of cores to allocate to the job. |
mergeBams.timeout | Int | 8 | Maximum amount of time (in hours) the task can run for. |
mergeBams.modules | String | "gatk/4.1.6.0" | Environment module name and version to load (space separated) before command execution. |
consensus.consensusCruncherPy | String | "$CONSENSUS_CRUNCHER_ROOT/bin/ConsensusCruncher.py" | Path to consensusCruncher binary |
consensus.samtools | String | "$SAMTOOLS_ROOT/bin/samtools" | Path to samtools binary |
consensus.ccDir | String | basePrefix + ".consensuscruncher" | Placeholder |
consensus.cutoff | Float | 0.7 | Cutoff to use to call a consensus of reads |
consensus.threads | Int | 8 | Number of threads to request |
consensus.jobMemory | Int | 32 | Memory allocated for this job |
consensus.timeout | Int | 72 | Hours before task timeout |
hsMetricsRunDCSSC.collectHSmetrics_timeout | Int | 5 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunDCSSC.collectHSmetrics_maxRecordsInRam | Int | 250000 | Specifies the N of records stored in RAM before spilling to disk. Increasing this number increases the amount of RAM needed. |
hsMetricsRunDCSSC.collectHSmetrics_coverageCap | Int | 500 | Parameter to set a max coverage limit for Theoretical Sensitivity calculations |
hsMetricsRunDCSSC.collectHSmetrics_jobMemory | Int | 18 | Memory allocated to job |
hsMetricsRunDCSSC.collectHSmetrics_filter | String | "LENIENT" | Settings for picard filter |
hsMetricsRunDCSSC.collectHSmetrics_metricTag | String | "HS" | Extension for metrics file |
hsMetricsRunDCSSC.bedToBaitIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunDCSSC.bedToBaitIntervals_jobMemory | Int | 16 | Memory allocated to job |
hsMetricsRunDCSSC.bedToTargetIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunDCSSC.bedToTargetIntervals_jobMemory | Int | 16 | Memory allocated to job |
hsMetricsRunSSCSSC.collectHSmetrics_timeout | Int | 5 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunSSCSSC.collectHSmetrics_maxRecordsInRam | Int | 250000 | Specifies the N of records stored in RAM before spilling to disk. Increasing this number increases the amount of RAM needed. |
hsMetricsRunSSCSSC.collectHSmetrics_coverageCap | Int | 500 | Parameter to set a max coverage limit for Theoretical Sensitivity calculations |
hsMetricsRunSSCSSC.collectHSmetrics_jobMemory | Int | 18 | Memory allocated to job |
hsMetricsRunSSCSSC.collectHSmetrics_filter | String | "LENIENT" | Settings for picard filter |
hsMetricsRunSSCSSC.collectHSmetrics_metricTag | String | "HS" | Extension for metrics file |
hsMetricsRunSSCSSC.bedToBaitIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunSSCSSC.bedToBaitIntervals_jobMemory | Int | 16 | Memory allocated to job |
hsMetricsRunSSCSSC.bedToTargetIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunSSCSSC.bedToTargetIntervals_jobMemory | Int | 16 | Memory allocated to job |
hsMetricsRunAllUnique.collectHSmetrics_timeout | Int | 5 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunAllUnique.collectHSmetrics_maxRecordsInRam | Int | 250000 | Specifies the N of records stored in RAM before spilling to disk. Increasing this number increases the amount of RAM needed. |
hsMetricsRunAllUnique.collectHSmetrics_coverageCap | Int | 500 | Parameter to set a max coverage limit for Theoretical Sensitivity calculations |
hsMetricsRunAllUnique.collectHSmetrics_jobMemory | Int | 18 | Memory allocated to job |
hsMetricsRunAllUnique.collectHSmetrics_filter | String | "LENIENT" | Settings for picard filter |
hsMetricsRunAllUnique.collectHSmetrics_metricTag | String | "HS" | Extension for metrics file |
hsMetricsRunAllUnique.bedToBaitIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunAllUnique.bedToBaitIntervals_jobMemory | Int | 16 | Memory allocated to job |
hsMetricsRunAllUnique.bedToTargetIntervals_timeout | Int | 1 | Maximum amount of time (in hours) the task can run for. |
hsMetricsRunAllUnique.bedToTargetIntervals_jobMemory | Int | 16 | Memory allocated to job |
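
The `mergeBams.overhead` description states that the Java heap is `jobMemory - overhead`. As a worked example with the defaults listed above (48 GB and 6 GB), the arithmetic can be sketched in the shell. How the task actually passes this value to Java is not shown in this README, so the `-Xmx...g` string is an assumption used only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

job_memory=48   # mergeBams.jobMemory default (GB)
overhead=6      # mergeBams.overhead default (GB)

# Heap available to Java = jobMemory - overhead, per the parameter description
heap=$((job_memory - overhead))

# Assumed flag format; the real task may build this differently
java_opts="-Xmx${heap}g"
echo "${java_opts}"   # prints -Xmx42g
```

Raising `jobMemory` without touching `overhead` therefore gives the whole increase to the Java heap.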
Output | Type | Description | Labels |
---|---|---|---|
rawBam | File? | aligned bam file | vidarr_label: rawBam |
rawBamIndex | File? | aligned bam index | vidarr_label: rawBamIndex |
dcsScBam | File | DCS generated from SSCS + SC | vidarr_label: dcsScBam |
dcsScBamIndex | File | Index for DCS SC Bam | vidarr_label: dcsScBamIndex |
allUniqueBam | File | DCS (from SSCS + SC) + SSCS_SC_Singletons + remaining singletons | vidarr_label: allUniqueBam |
allUniqueBamIndex | File | Index for All Unique Bam | vidarr_label: allUniqueBamIndex |
sscsScBam | File | SSCS combined with corrected singletons (from both rescue strategies) | vidarr_label: sscsScBam |
sscsScBamIndex | File | Index for SSCS SC Bam | vidarr_label: sscsScBamIndex |
outputCCStats | File | Consensus sequence formation metrics | vidarr_label: outputCCStats |
outputCCReadFamilies | File | Family size and frequency from consensusCruncher | vidarr_label: outputCCReadFamilies |
ccFolder | File | output folder containing files not needed for downstream analysis; info on family size, QC metrics | vidarr_label: ccFolder |
dcsScHsMetrics | File | HS Metrics for duplex consensus sequences (DCS) | vidarr_label: dcsScHsMetrics |
sscsScHsMetrics | File | HS Metrics for single-strand consensus sequences (SSCS) | vidarr_label: sscsScHsMetrics |
allUniqueHsMetrics | File | HS Metrics for AllUnique | vidarr_label: allUniqueHsMetrics |
This section lists the command(s) run by the umiConsensus workflow.
- Running umiConsensus
Commands for running concat:
set -euo pipefail
zcat ~{sep=" " read1s} | gzip > ~{outputFileNamePrefix}_R1_001.fastq.gz
zcat ~{sep=" " read2s} | gzip > ~{outputFileNamePrefix}_R2_001.fastq.gz
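
The concat pattern above decompresses every input in the order given and recompresses the combined stream once. A small self-contained sketch of the same pattern on toy data follows; the lane file names and read contents are hypothetical stand-ins for the `read1s` inputs.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT

# Two hypothetical lanes, each holding one FASTQ record (4 lines)
printf '@read1\nACGT\n+\nIIII\n' | gzip > "${workdir}/lane1_R1.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip > "${workdir}/lane2_R1.fastq.gz"

# Same pattern as the concat task: decompress all inputs in order, recompress once
zcat "${workdir}/lane1_R1.fastq.gz" "${workdir}/lane2_R1.fastq.gz" \
  | gzip > "${workdir}/merged_R1_001.fastq.gz"

# The merged file should hold both records, i.e. 8 lines
line_count=$(zcat "${workdir}/merged_R1_001.fastq.gz" | wc -l | tr -d ' ')
echo "lines in merged file: ${line_count}"
```

Because the R1 and R2 files are built by separate commands, the inputs must be listed in the same order for both so that read pairs stay aligned.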
Commands for running align:
set -euo pipefail
~{consensusCruncherPy} fastq2bam \
--fastq1 ~{fastqR1} \
--fastq2 ~{fastqR2} \
--output . \
--bwa ~{bwa} \
--ref ~{bwaref} \
--samtools ~{samtools} \
--skipcheck \
--blist ~{blist}
# Required so that BAM files are named according to the merged library name
# Additionally, if ".sorted" isn't omitted here, file names from align include ".sorted" twice
mv bamfiles/*.bam bamfiles/"~{outputFileNamePrefix}.bam"
mv bamfiles/*.bai bamfiles/"~{outputFileNamePrefix}.bam.bai"
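
The two `mv` commands above rely on each glob matching exactly one file; with several matches, `mv` would treat the last argument as a directory and fail. A hedged sketch of the same rename with an explicit count check is shown below. It is not part of the workflow, and the prefix and stand-in file names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
mkdir "${workdir}/bamfiles"
prefix="SAMPLE_0001"   # hypothetical outputFileNamePrefix

# Stand-ins for the files the fastq2bam step would write
touch "${workdir}/bamfiles/input.sorted.bam" "${workdir}/bamfiles/input.sorted.bam.bai"

# Collect the matches in an array so the count can be checked before renaming
bams=("${workdir}"/bamfiles/*.bam)
if [ "${#bams[@]}" -ne 1 ] || [ ! -e "${bams[0]}" ]; then
  echo "expected exactly one BAM, found ${#bams[@]}" >&2
  exit 1
fi

mv "${bams[0]}" "${workdir}/bamfiles/${prefix}.bam"
mv "${workdir}"/bamfiles/*.bai "${workdir}/bamfiles/${prefix}.bam.bai"

ls "${workdir}/bamfiles"
```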
Commands for running consensus:
set -euo pipefail
~{consensusCruncherPy} consensus \
--input ~{inputBam} \
--output . \
--samtools ~{samtools} \
--cutoff ~{cutoff} \
--genome ~{genome} \
--bedfile ~{cytoband} \
--bdelim '|'
tar cf - ~{basePrefix} | gzip --no-name > ~{ccDir}.tar.gz
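
The final command packs the ConsensusCruncher output directory into a gzipped tarball; `gzip --no-name` leaves the original name and timestamp out of the gzip header. The sketch below reproduces that pattern on a placeholder directory and lists the archive without extracting it. The prefix and the file inside the directory are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
cd "${workdir}"

base_prefix="SAMPLE_0001"                   # hypothetical basePrefix
cc_dir="${base_prefix}.consensuscruncher"   # default value of consensus.ccDir

# Stand-in for the directory ConsensusCruncher would produce
mkdir "${base_prefix}"
echo "placeholder" > "${base_prefix}/stats.txt"

# Same pattern as the consensus task
tar cf - "${base_prefix}" | gzip --no-name > "${cc_dir}.tar.gz"

# List the archive contents without extracting them
listing=$(tar -tzf "${cc_dir}.tar.gz")
echo "${listing}"
```

To unpack the archive, `tar -xzf` followed by the file name restores the directory under its original name.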
For support, please file an issue on the GitHub project or send an email to gsi@oicr.on.ca.
Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)