This is an implementation of GATK Variant Quality Score Recalibration (VQSR) as a Snakemake pipeline, written by Sherine Awad.
You will need your cohort VCF file; you can change its name and path in the config file.
To run the pipeline, use:
snakemake -jn
where n is the number of cores. For example, for 10 cores use:
snakemake -j10
To avoid tool-version problems, use conda:
snakemake -jn --use-conda
For example, for 10 cores use:
snakemake -j10 --use-conda
This will automatically pull the same versions of the tools we used. Conda must be installed on the system, in addition to Snakemake.
For a dry run use:
snakemake -j1 -n
and to print the commands during a dry run use:
snakemake -j1 -n -p
You can keep a separate config file for each cohort and pass the appropriate one as follows:
snakemake -j1 --configfile config-WES.yaml
or:
snakemake -j1 --configfile config-WGS.yaml
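The dry-run and per-cohort options can be combined. The following is a minimal sketch, not part of the pipeline: run_cohort is a hypothetical helper name, and it assumes the config file names shown above. It does a dry run for one cohort's config and starts the real run only if the dry run succeeds.

```shell
#!/usr/bin/env bash
# run_cohort is a hypothetical helper for illustration; it is not shipped with the pipeline.
run_cohort() {
    local config="$1"       # cohort config file, e.g. config-WES.yaml
    local cores="${2:-1}"   # number of cores for the real run (defaults to 1)

    # Dry run first: -n executes nothing, -p prints the commands that would run.
    snakemake -j1 -n -p --configfile "$config" || return 1

    # Real run, using conda so the same tool versions are pulled.
    snakemake -j"$cores" --use-conda --configfile "$config"
}
```

For example, after sourcing the function, run_cohort config-WES.yaml 10 checks the WES cohort and then runs it on 10 cores.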
- Brouard, Jean-Simon, Flavio Schenkel, Andrew Marete, and Nathalie Bissonnette. "The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments." Journal of animal science and biotechnology 10, no. 1 (2019): 1-6.
- Van der Auwera, Geraldine A., Mauricio O. Carneiro, Christopher Hartl, Ryan Poplin, Guillermo Del Angel, Ami Levy‐Moonshine, Tadeusz Jordan et al. "From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline." Current protocols in bioinformatics 43, no. 1 (2013): 11-10.
- Poplin, R., Ruano-Rubio, V., DePristo, M. A., Fennell, T. J., Carneiro, M. O., Van der Auwera, G. A., ... & Banks, E. (2018). Scaling accurate genetic variant discovery to tens of thousands of samples. BioRxiv, 201178.
- https://gatk.broadinstitute.org/hc/en-us/articles/360035531612