/hrDetect

Workflow for the sigtools and HRdetect R packages

Primary LanguageR

hrDetect

Homologous Recombination Deficiency (HRD) Prediction Workflow using sig.tools

Overview

Dependencies

Usage

Cromwell

java -jar cromwell.jar run hrDetect.wdl --inputs inputs.json

Inputs

Required workflow parameters:

Parameter Value Description
outputFileNamePrefix String Name of sample matching the tumor sample in .vcf
structuralVcfFile File Input VCF file of structural variants (eg. from delly)
smallsVcfFile File Input VCF file of SNV and indels (small mutations) (eg. from mutect2)
smallsVcfIndex File Index file for smallsVcfFile
segFile File File for segmentations, used to estimate number of segments in Loss of heterozygosity (LOH) (eg. from sequenza)
reference String Reference genome version

Optional workflow parameters:

Parameter Value Default Description

Optional task parameters:

Parameter Value Default Description
filterStructural.modules String "bcftools/1.9" Required environment modules
filterStructural.structuralQUALfilter String "PASS" filter for filter calls to keep, eg. PASS
filterStructural.jobMemory Int 5 Memory allocated for this job (GB)
filterStructural.threads Int 1 Requested CPU threads
filterStructural.timeout Int 1 Hours before task timeout
filterINDELs.VAF Float 0.01 minimum variant allele frequency to retain variant
filterINDELs.QUALfilter String "FILTER~'haplotype' FILTER~'clustered_events'
filterINDELs.jobMemory Int 10 Memory allocated for this job (GB)
filterINDELs.threads Int 1 Requested CPU threads
filterINDELs.timeout Int 2 Hours before task timeout
filterSNVs.VAF Float 0.01 minimum variant allele frequency to retain variant
filterSNVs.QUALfilter String "FILTER~'haplotype' FILTER~'clustered_events'
filterSNVs.jobMemory Int 10 Memory allocated for this job (GB)
filterSNVs.threads Int 1 Requested CPU threads
filterSNVs.timeout Int 2 Hours before task timeout
hrdResults.modules String "sigtools/2.4.1 sigtools-data/1.0 sigtools-rscript/1.0" Required environment modules
hrdResults.sigtoolrScript String "$SIGTOOLS_RSCRIPT_ROOT/scripts/sigTools_runthrough.R" .R script containing sigtools
hrdResults.SVrefSigs String "$SIGTOOLS_DATA_ROOT/RefSigv0_Rearr.tsv" reference signatures for SVs
hrdResults.SNVrefSigs String "$SIGTOOLS_DATA_ROOT/COSMIC_v1_SBS_GRCh38.txt" reference signatures for SNVs
hrdResults.sigtoolsBootstrap Int 200 Number of bootstraps for sigtools
hrdResults.indelCutoff Int 10 minimum number of indels to run analysis
hrdResults.jobMemory Int 50 Memory allocated for this job (GB)
hrdResults.threads Int 1 Requested CPU threads
hrdResults.timeout Int 15 Hours before task timeout

Outputs

Output Type Description
hrd_signatures File JSON file of hrDetect signatures
SBS_exposures File JSON of single basepair substitution signatures
SV_exposures File JSON of structural variant signatures
ID_catalog File JSON cataloguing indels

This section lists command(s) run by hrDetect workflow

  • Running hrDetect
		set -euo pipefail


		$BCFTOOLS_ROOT/bin/bcftools view -f '~{structuralQUALfilter}' ~{structuralVcfFile} >> ~{outputFileNamePrefix}.structural.PASS.vcf

		awk '$1 !~ "#" {print}' ~{structuralVcfFile} | wc -l >~{outputFileNamePrefix}.structural.filteringReport.txt
		awk '$1 !~ "#" {print}' ~{outputFileNamePrefix}.structural.PASS.vcf | wc -l >>~{outputFileNamePrefix}.structural.filteringReport.txt

		set -euo pipefail

		$BCFTOOLS_ROOT/bin/bcftools norm --multiallelics - --fasta-ref ~{genome} ~{difficultRegions} ~{smallsVcfFile} | \
		$BCFTOOLS_ROOT/bin/bcftools filter -i "TYPE='~{smallType}'" | \
		$BCFTOOLS_ROOT/bin/bcftools filter -e "~{QUALfilter}" | \
		$BCFTOOLS_ROOT/bin/bcftools filter -i "(FORMAT/AD[0:1])/(FORMAT/AD[0:0]+FORMAT/AD[0:1]) >= ~{VAF}" >~{outputFileNamePrefix}.~{smallType}.VAF.vcf

		bgzip ~{outputFileNamePrefix}.~{smallType}.VAF.vcf
		tabix -p vcf ~{outputFileNamePrefix}.~{smallType}.VAF.vcf.gz

		zcat ~{smallsVcfFile} | awk '$1 !~ "#" {print}'  | wc -l >~{outputFileNamePrefix}.~{smallType}.filteringReport.txt
		zcat ~{outputFileNamePrefix}.~{smallType}.VAF.vcf.gz | awk '$1 !~ "#" {print}'  | wc -l >>~{outputFileNamePrefix}.~{smallType}.filteringReport.txt

		set -euo pipefail

		Rscript ~{sigtoolrScript} \
			--sampleName ~{outputFileNamePrefix} \
			--snvFile ~{snvVcfFiltered} \
			--indelFile  ~{indelVcfFiltered} \
			--SVFile ~{SV_vcf_location} \
			--LOHFile ~{lohSegFile} \
			--bootstraps ~{sigtoolsBootstrap} \
			--genomeVersion ~{genomeVersion} \
			--indelCutoff ~{indelCutoff} \
			--SVrefSigs ~{SVrefSigs} \
			--SNVrefSigs ~{SNVrefSigs}

## Support

For support, please file an issue on the [Github project](https://github.com/oicr-gsi) or send an email to gsi@oicr.on.ca .

_Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)_