grenepipe

A flexible, scalable, and reproducible pipeline to automate variant calling from sequence reads.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Snakemake pipeline for variant calling from raw sample sequences, with lots of bells and whistles.

Advantages:

  • One command to run the whole pipeline!
  • Many tools to choose from for each step
  • Simple configuration via a single file
  • Automatic download of tool dependencies
  • Resuming from failing jobs
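
As a rough illustration of the "one command" idea, the sketch below assembles a Snakemake invocation from Python. It is only a sketch under assumptions: the helper `build_snakemake_command` and the directory name `my-analysis` are hypothetical, and the exact options recommended for grenepipe are described in the Wiki. The flags used (`--use-conda`, `--cores`, `--directory`, `--rerun-incomplete`) are general Snakemake command line options.

```python
import shlex


def build_snakemake_command(analysis_dir, cores=4, resume=False):
    """Assemble a Snakemake invocation as an argument list (hypothetical helper)."""
    cmd = [
        "snakemake",
        "--use-conda",                     # let Snakemake set up per-tool conda environments
        "--cores", str(cores),             # number of cores Snakemake may use
        "--directory", str(analysis_dir),  # assumed to contain the configuration file
    ]
    if resume:
        # Re-run jobs whose outputs were left incomplete by an interrupted run.
        cmd.append("--rerun-incomplete")
    return cmd


# "my-analysis" is a placeholder directory name, not part of grenepipe.
command = build_snakemake_command("my-analysis", cores=8, resume=True)
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` from within the pipeline directory; building it as a list rather than a single string avoids shell quoting problems with paths that contain spaces.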

Getting Started

See --> the Wiki pages <-- for setup and documentation.

For bug reports and feature requests, please open an issue on our GitHub page.

Pipeline Overview

Minimal input:

  • Reference genome fasta file
  • Per-sample fastq files
  • Optionally, a vcf file of known variants to restrict the variant calling process
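
To make the "per-sample fastq files" input more concrete, here is a minimal sketch of how fastq files might be grouped by sample before a run. Everything in it is an assumption for illustration: the file naming scheme (`<sample>_1` / `<sample>_2`, optionally with an `R` prefix), the helper names, and the three-column table layout are hypothetical and not necessarily what grenepipe expects; see the Wiki for the actual input format.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <sample>_1.fastq.gz / <sample>_2.fastq.gz (or _R1 / _R2).
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+?)_R?(?P<mate>[12])\.(fastq|fq)(\.gz)?$")


def collect_samples(filenames):
    """Group fastq paths by sample name; files not matching the pattern are skipped."""
    samples = defaultdict(dict)
    for name in sorted(filenames):
        match = FASTQ_PATTERN.match(Path(name).name)
        if match is None:
            continue
        samples[match["sample"]]["fq" + match["mate"]] = name
    return dict(samples)


def to_table(samples):
    """Render the grouping as a tab-separated table (hypothetical layout)."""
    lines = ["sample\tfq1\tfq2"]
    for sample, files in sorted(samples.items()):
        lines.append(f"{sample}\t{files.get('fq1', '')}\t{files.get('fq2', '')}")
    return "\n".join(lines)


# Placeholder file names for illustration only.
files = ["reads/A_1.fastq.gz", "reads/A_2.fastq.gz", "reads/B_R1.fq", "notes.txt"]
table = to_table(collect_samples(files))
print(table)
```

Single-end samples simply end up with an empty second column, and unrelated files such as `notes.txt` are ignored.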

Process and available tools:

Typical output:

  • Variant calls as vcf files, both raw and filtered, optionally with annotations
  • MultiQC report (includes summaries of most other tools, and of the final vcf)
  • Snakemake report (optional)

Intermediate output files such as bam files are also kept by default, and mpileup files can optionally be created if needed. In addition to the above tools, there are some tools used as glue between the steps. If you are interested in the details, have a look at the snakemake rules for each step.

Citation

When using grenepipe, please cite:

grenepipe: A flexible, scalable, and reproducible pipeline
to automate variant calling from sequence reads.

Lucas Czech and Moises Exposito-Alonso. Bioinformatics. 2022.
doi:10.1093/bioinformatics/btac600

Furthermore, please do not forget to cite all tools that you selected to be run for your analysis. See our Wiki for their references.