A Snakemake workflow for variant calling and lineage barcoding of Mycobacterium tuberculosis samples.
The usage of this workflow is described in the Snakemake Workflow Catalog. Alternatively, it can be installed as described below.
Use the Conda package manager and the Bioconda channel to install TBvar.
If you do not have Conda installed, do the following:
# Download Conda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Set permissions
chmod +x Miniconda3-latest-Linux-x86_64.sh
# Install
bash Miniconda3-latest-Linux-x86_64.sh
Set up channels:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
Get TBvar pipeline:
git clone https://github.com/dbespiatykh/TBvar.git
Install required dependencies:
conda install -c conda-forge mamba
mamba env create --file TBvar/environment.yml
Activate TBvar environment:
conda activate TBvar
cd TBvar
👉 In the config folder, edit config.yml and set the location of your samples.tsv table. The table should be formatted like this:
Run_accession | R1 | R2 |
---|---|---|
SRR2024996 | /path/to/SRR2024996_1.fastq.gz | /path/to/SRR2024996_2.fastq.gz |
SRR2024925 | /path/to/SRR2024925_1.fastq.gz | /path/to/SRR2024925_2.fastq.gz |
SRR12882189 | /path/to/SRR12882189.fastq.gz | |
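Run_accession - Run accession number or sample name;
R1 - Path to the first read file of the pair (or the only read file for a single-end sample);
R2 - Path to the second read file of the pair (left empty for a single-end sample, as in the last row above).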
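
Before running the pipeline, it can help to sanity-check the sample table. The following is a minimal sketch, not part of TBvar itself. It assumes that samples.tsv is tab-separated with the header Run_accession, R1, R2, and that R2 may be empty for single-end samples; the function name check_samples is hypothetical.

```python
import csv
import io

# Column names taken from the table description above.
REQUIRED_COLUMNS = ["Run_accession", "R1", "R2"]


def check_samples(tsv_text):
    """Return a list of problems found in the text of a samples table.

    Assumption: the table is tab-separated and R2 may be left empty
    for single-end samples.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    seen = set()
    for row in reader:
        line_no = reader.line_num  # line number in the source text
        name = (row["Run_accession"] or "").strip()
        r1 = (row["R1"] or "").strip()
        if not name:
            problems.append(f"line {line_no}: empty Run_accession")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate Run_accession {name}")
        seen.add(name)
        if not r1:
            problems.append(f"line {line_no}: R1 is required")
    return problems


example = (
    "Run_accession\tR1\tR2\n"
    "SRR2024996\t/path/to/SRR2024996_1.fastq.gz\t/path/to/SRR2024996_2.fastq.gz\n"
    "SRR12882189\t/path/to/SRR12882189.fastq.gz\t\n"
)
print(check_samples(example))
```

To check a real file, read it first, for example with `check_samples(open("samples.tsv").read())`. An empty list means no problems were found by these checks; it does not verify that the FASTQ files exist.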
Run pipeline:
snakemake --conda-frontend mamba --use-conda -j 48 -c 48 --max-threads 48 -k --rerun-incomplete
If you are running the pipeline for the first time, it is recommended to do a dry run first to check that everything is in working order. To do this, use the -n flag:
snakemake --conda-frontend mamba -np
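
The dry-run-then-run sequence can also be scripted. This is a rough sketch under the assumption that snakemake is on the PATH and the script is started from the TBvar directory; it only reuses the flags shown in the two commands above, and the helper names snakemake_command and run_pipeline are hypothetical.

```python
import subprocess


def snakemake_command(cores, dry_run=False):
    """Build the snakemake argument list using only the flags shown above."""
    cmd = ["snakemake", "--conda-frontend", "mamba"]
    if dry_run:
        # -n: dry run, -p: print the shell commands that would be executed
        cmd.append("-np")
    else:
        n = str(cores)
        cmd += ["--use-conda", "-j", n, "-c", n, "--max-threads", n,
                "-k", "--rerun-incomplete"]
    return cmd


def run_pipeline(cores):
    """Do a dry run first; start the real run only if the dry run succeeds."""
    # check=True raises CalledProcessError on a non-zero exit status,
    # so a failed dry run stops the function before the real run.
    subprocess.run(snakemake_command(cores, dry_run=True), check=True)
    subprocess.run(snakemake_command(cores), check=True)
```

Calling `run_pipeline(48)` would reproduce the two commands above in order, with the core count adjusted to whatever the machine provides.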