LosicLab/ExoSeq is a bioinformatics pipeline that performs best-practice analysis of exome sequencing data. It is forked from nf-core/ExoSeq.
The pipeline follows the GATK best practices and is built with Nextflow, a bioinformatics workflow tool. Its main steps are the following (more information about the processes can be found here):
- Alignment - bwa
- Marking Duplicates - picard
- Recalibration - gatk 4
- Realignment - gatk 4
- Variant Calling (Somatic or SNP) - gatk 4
- Variant Filtration - gatk 4
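The stage order above can be written down as a small data structure, which is handy when scripting around the pipeline (for example, to label logs or reports). This is only an illustrative sketch: the stage and tool names are copied from the list above, while `Stage`, `PIPELINE_STAGES`, `describe` and `tools_used` are hypothetical names that are not part of the pipeline itself.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    """One pipeline step and the tool that performs it (hypothetical helper type)."""
    name: str
    tool: str


# Stage order and tool names are taken verbatim from the list above.
PIPELINE_STAGES: List[Stage] = [
    Stage("Alignment", "bwa"),
    Stage("Marking Duplicates", "picard"),
    Stage("Recalibration", "gatk 4"),
    Stage("Realignment", "gatk 4"),
    Stage("Variant Calling (Somatic or SNP)", "gatk 4"),
    Stage("Variant Filtration", "gatk 4"),
]


def describe(stages: List[Stage]) -> List[str]:
    """Return one numbered, human-readable line per stage."""
    return [f"{i}. {s.name} ({s.tool})" for i, s in enumerate(stages, start=1)]


def tools_used(stages: List[Stage]) -> List[str]:
    """Return the distinct tools in order of first use."""
    seen: List[str] = []
    for s in stages:
        if s.tool not in seen:
            seen.append(s.tool)
    return seen


if __name__ == "__main__":
    print("\n".join(describe(PIPELINE_STAGES)))
    print("Tools:", ", ".join(tools_used(PIPELINE_STAGES)))
```

The actual execution order and tool invocations are defined by the Nextflow workflow; this sketch only mirrors the summary given here.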
The LosicLab pipeline comes with the documentation forked from the original nf-core repository, found in the docs/ directory:
- Pipeline installation and configuration instructions
- Pipeline configuration
- Running the pipeline
- Output and how to interpret the results
- Troubleshooting
The pipeline now also has support for the MSSM Minerva HPC. Example run scripts can be found in the scripts/run_scripts folder.
The original nf-core/exoseq pipeline was initially developed by Senthilkumar Panneerselvam (@senthil10), with a little help from Phil Ewels (@ewels), at the National Genomics Infrastructure, part of SciLifeLab in Stockholm. It has been extended by Alex Peltzer (@apeltzer) and Marie Gauder (@mgauder) from QBIC Tuebingen/Germany, as well as Marc Hoeppner (@marchoeppner) from IKMB Kiel/Germany.
Many thanks also to others who have helped out along the way, including @pditommaso and @colindaven.