A pipeline for joint variant calling on whole-genome sequencing (WGS) NGS data on the Dragen server.
- Calls SNPs/Indels, SVs, CNVs and Repeat Expansions
- Transfers data from the Dragen server to a separate long-term storage location
Configurations are available for use with either Human Reference Genome Version GRCh37 or GRCh38.
Dragen version: 4.2.7
The script should be run once per sample, from within a sample directory structured like this:
├── sample1/
│ ├── sample1_S1_L001_R1_001.fastq.gz
│ ├── sample1_S1_L001_R2_001.fastq.gz
│ ├── sample1_S2_L002_R1_001.fastq.gz
│ ├── sample1_S2_L002_R2_001.fastq.gz
│ └── sample1.variables
This directory can be found within the staging area FASTQ directory on the Dragen server, e.g. /staging/data/fastq/191010_D00501_0366_BH5JWHBCX3/Data/NexteraDNAFlex
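Before launching, it may help to confirm that a sample directory matches the layout above. The following is a minimal sketch, not part of DragenWGS.sh: the function name check_sample_dir is hypothetical, and it assumes that the .variables file is named after the directory and that FASTQ files follow the _R1_001/_R2_001 naming shown in the example.

```shell
# Hypothetical helper (not part of DragenWGS.sh): verify that a sample
# directory holds <sample>.variables and an R2 file for every R1 FASTQ.
check_sample_dir() {
    local dir="${1%/}"
    local sample
    sample="$(basename "$dir")"
    local status=0
    local found=0
    local r1 r2

    # Assumption: the variables file is named after the sample directory.
    if [ ! -f "$dir/$sample.variables" ]; then
        echo "missing $sample.variables" >&2
        status=1
    fi

    for r1 in "$dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue   # the glob matched nothing
        found=1
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        if [ ! -f "$r2" ]; then
            echo "missing R2 mate for $(basename "$r1")" >&2
            status=1
        fi
    done

    if [ "$found" -eq 0 ]; then
        echo "no R1 FASTQ files found in $dir" >&2
        status=1
    fi

    return "$status"
}
```

For example, check_sample_dir "$PWD" from inside sample1/ returns 0 when the layout is complete and prints the missing items to stderr otherwise.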
Once within this folder:
bash DragenWGS.sh
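Because the script is run once per sample, a run with several samples means repeating the command in each sample directory. The sketch below shows one way this could be looped; it is an illustration only. The function name run_per_sample and both of its arguments are hypothetical, and it assumes the script can be invoked by an absolute path and that samples may be launched one after another. Check how DragenWGS.sh is deployed locally before relying on it.

```shell
# Hypothetical wrapper (not part of the pipeline): run the script once in
# every sample directory under a run directory, stopping at the first failure.
#   $1  run directory containing one sub-directory per sample
#   $2  absolute path to DragenWGS.sh (assumed location, adjust as needed)
run_per_sample() {
    local run_dir="${1%/}"
    local script="$2"
    local sample_dir

    for sample_dir in "$run_dir"/*/; do
        [ -d "$sample_dir" ] || continue   # the glob matched nothing
        # Subshell keeps the caller's working directory unchanged.
        ( cd "$sample_dir" && bash "$script" ) || return 1
    done
}
```

The script path must be absolute because each invocation happens after changing into the sample directory.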
Once gVCF creation is complete for every sample, joint genotyping is run and produces the final joint VCF.
Produces results in:
Will produce:
Sample Level:
- BAM file
- QC Metrics
- Repeat Expansion VCF
Run Level:
- Joint VCF
- Joint VCF hard filtered
- Variant Calling Metrics
- Joint SV VCF
- Joint CNV VCF
Chris Medway and Joseph Halstead