/purple

Primary LanguageWDL

purple

performs purity and ploidy estimation

Overview

Dependencies

Usage

Cromwell

java -jar cromwell.jar run purple.wdl --inputs inputs.json

Inputs

Required workflow parameters:

Parameter Value Description
tumour_bam File Input tumor file (bam)
tumour_bai File Input tumor file index (bai)
normal_bam File Input normal file (bam)
normal_bai File Input normal file index (bai)
normal_name String Name of normal sample
tumour_name String Name of tumour sample

Optional workflow parameters:

Parameter Value Default Description
genomeVersion String "38" Genome Version, only 38 supported
doSV Boolean true include somatic structural variant calls, true/false
doSMALL Boolean true include somatic small (SNV+indel) calls, true/false

Optional task parameters:

Parameter Value Default Description
amber.amberScript String "java -Xmx32G -cp $HMFTOOLS_ROOT/amber.jar com.hartwig.hmftools.amber.AmberApplication" location of AMBER script
amber.threads Int 8 Requested CPU threads
amber.memory Int 32 Memory allocated for this job (GB)
amber.timeout Int 100 Hours before task timeout
cobalt.colbaltScript String "java -Xmx8G -cp $HMFTOOLS_ROOT/cobalt.jar com.hartwig.hmftools.cobalt.CobaltApplication" location of COBALT script
cobalt.gamma String 100 gamma (penalty) value for segmenting
cobalt.threads Int 8 Requested CPU threads
cobalt.memory Int 32 Memory allocated for this job (GB)
cobalt.timeout Int 100 Hours before task timeout
filterSV.vcf File? None VCF file for filtering
filterSV.gripssScript String "java -Xmx80G -jar $HMFTOOLS_ROOT/gripss.jar" location and java call for gripss jar
filterSV.hard_min_tumor_qual Int 500 Any variant with QUAL less than x is filtered
filterSV.filter_sgls String "-filter_sgls" include filtering of single breakends
filterSV.memory Int 80 Memory allocated for this job (GB)
filterSV.threads Int 1 Requested CPU threads
filterSV.timeout Int 100 Hours before task timeout
filterSMALL.vcf File? None VCF file for filtering
filterSMALL.vcf_index File? None index of VCF file for filtering
filterSMALL.bcftoolsScript String "$BCFTOOLS_ROOT/bin/bcftools" location for bcftools
filterSMALL.genome String "$HG38_ROOT/hg38_random.fa" reference fasta
filterSMALL.regions String "chr1,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr2,chr20,chr21,chr22,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chrX,chrY" regions/chromosomes to include
filterSMALL.difficultRegions String "--targets-file $HG38_DAC_EXCLUSION_ROOT/hg38-dac-exclusion.v2.bed" regions to exclude because they are difficult
filterSMALL.tumorVAF String "0.001" minimum variant allele frequency for tumour calls to pass filter
filterSMALL.modules String "bcftools/1.9 hg38/p12 hg38-dac-exclusion/1.0" Required environment modules
filterSMALL.threads Int 8 Requested CPU threads
filterSMALL.memory Int 32 Memory allocated for this job (GB)
filterSMALL.timeout Int 100 Hours before task timeout
runPURPLE.purpleScript String "java -Xmx8G -jar $HMFTOOLS_ROOT/purple.jar" location of PURPLE script
runPURPLE.threads Int 8 Requested CPU threads
runPURPLE.memory Int 32 Memory allocated for this job (GB)
runPURPLE.timeout Int 100 Hours before task timeout
LINX.linxScript String "java -Xmx32G -cp $HMFTOOLS_ROOT/linx.jar com.hartwig.hmftools.linx.LinxApplication" location of LINX script
LINX.threads Int 8 Requested CPU threads
LINX.memory Int 32 Memory allocated for this job (GB)
LINX.timeout Int 100 Hours before task timeout

Outputs

Output Type Description
purple_qc File QC results from PURPLE
purple_purity File tab seperated Purity estimate from PURPLE
purple_purity_range File tab seperated range of Purity estimate from PURPLE
purple_segments File tab seperated segments estimated by PURPLE
purple_cnv File tab seperated somatic copy number variants from PURPLE
purple_cnv_gene File tab seperated somatic gene-level copy number variants from PURPLE
purple_SV_index File? Structural Variant .vcf index edited by PURPLE
purple_SV File? Structural Variant .vcf edited by PURPLE
purple_SMALL_index File? SNV+IN/DEL .vcf index edited by PURPLE
purple_SMALL File? SNV+IN/DEL .vcf edited by PURPLE

Commands

This section lists commands run by the PURPLE workflow

Run AMBER

 ~{amberScript} \
   -reference ~{normal_name} -reference_bam ~{normal_bam} \
   -tumor ~{tumour_name} -tumor_bam ~{tumour_bam} \
   -output_dir ~{tumour_name}.amber/ \
   -loci ~{PON} \
   -ref_genome_version ~{genomeVersion}

Run COBALT

 ~{colbaltScript} \
   -reference ~{normal_name} -reference_bam ~{normal_bam} \
   -tumor ~{tumour_name} -tumor_bam ~{tumour_bam} \
   -output_dir ~{tumour_name}.cobalt/ \
   -gc_profile ~{gcProfile} \
   -pcf_gamma ~{gamma}

Filter structural variant VCFs using GRIPSS, this is optional but recommended

 ~{gripssScript} \
     -vcf ~{vcf}  \
     -sample ~{tumour_name} -reference ~{normal_name} \
     -ref_genome_version ~{genomeVersion} \
     -ref_genome ~{refFasta} \
     -pon_sgl_file ~{pon_sgl_file} \
     -pon_sv_file ~{pon_sv_file} \
     -known_hotspot_file ~{known_hotspot_file} \
     -repeat_mask_file ~{repeat_mask_file} \
     -output_dir gripss/ \
     -hard_min_tumor_qual ~{hard_min_tumor_qual} \
     ~{filter_sgls}

Filter small variant (SNV + in/del) VCFs using bcftools, this is optional but recommended

  ~{bcftoolsScript} view -f "PASS" -S samples.txt -r ~{regions} ~{difficultRegions} ~{vcf} |\
  ~{bcftoolsScript} norm --multiallelics - --fasta-ref ~{genome} |\
  ~{bcftoolsScript} filter -i "(FORMAT/AD[0:1])/(FORMAT/AD[0:0]+FORMAT/AD[0:1]) >= ~{tumorVAF}"  > ~{tumour_name}.PASS.vcf

Run PURPLE

 ~{purpleScript} \
   -ref_genome_version ~{genomeVersion} \
   -ref_genome ~{refFasta}  \
   -gc_profile ~{gcProfile} \
   -ensembl_data_dir ~{ensemblDir}  \
   -reference ~{normal_name} -tumor ~{tumour_name}  \
   -amber ~{tumour_name}.amber -cobalt ~{tumour_name}.cobalt \
   ~{"-somatic_sv_vcf " + SV_vcf} \
   ~{"-somatic_vcf " + smalls_vcf} \
   -output_dir ~{tumour_name}.purple \
   -no_charts

Run LINX, if structural variant calls from GRIDSS are included

 ~{linxScript} \
   -sample ~{tumour_name} \
   -ref_genome_version ~{genomeVersion} \
   -ensembl_data_dir ~{ensemblDir}  \
   -check_fusions \
   -known_fusion_file ~{fusions_file} \
   -purple_dir ~{tumour_name}.purple \
   -output_dir ~{tumour_name}.linx 

Support

For support, please file an issue on the Github project or send an email to gsi@oicr.on.ca .

Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)