This workflow performs mapping of single-end and paired-end reads in fastq format against a reference genome to produce a deduplicated and recalibrated BAM file.
DiMA is part of solida-core, a collection of Snakemake-based pipelines developed and maintained at CRS4.
- Matteo Massidda (@massiddaMT)
- Rossano Atzeni (@ratzeni)
The usage of this workflow is described in the Snakemake Workflow Catalog.
If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and its DOI (see above).
Create a virtual environment with the command:
mamba create -c bioconda -c conda-forge --name snakemake snakemake=6.15 snakedeploy
and activate it:
conda activate snakemake
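Before deploying, it can help to confirm that the activated environment exposes both tools. The following is a minimal sketch and not part of the pipeline: `check_tools` is a hypothetical helper name, and it relies only on the POSIX `command -v` builtin.

```shell
# Hypothetical helper (not part of DiMA): report which of the given
# tools are available on PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The two tools installed by the mamba command above.
check_tools snakemake snakedeploy || echo "Activate the environment first: conda activate snakemake"
```

If either tool is reported as missing, the environment is probably not active in the current shell.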
Public data is available to test the pipeline. You can clone it directly into this folder from GitHub with:
git clone https://github.com/solida-core/test-data-DNA.git
You can then deploy the pipeline, specifying a directory my_dest_dir for analysis output and, with the --tag option, a pipeline tag for a specific version:
snakedeploy deploy-workflow https://github.com/solida-core/dima my_dest_dir --tag <pipeline_tag>
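After deployment, a quick check can confirm that the destination directory looks as expected. This sketch assumes the layout snakedeploy usually creates, a `workflow/Snakefile` plus a `config/` directory inside the destination. `check_deploy` is a hypothetical helper name, and the layout should be verified against your snakedeploy version.

```shell
# Hypothetical helper (not part of DiMA): verify that a deployed pipeline
# directory has the layout snakedeploy is assumed to create, i.e.
# workflow/Snakefile and config/ inside the destination.
check_deploy() {
    dest="$1"
    if [ -f "$dest/workflow/Snakefile" ] && [ -d "$dest/config" ]; then
        echo "ok: $dest looks like a deployed workflow"
    else
        echo "incomplete: expected $dest/workflow/Snakefile and $dest/config/"
        return 1
    fi
}

# my_dest_dir is the destination directory chosen at deploy time.
check_deploy my_dest_dir || echo "Re-run snakedeploy before launching snakemake."
```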
To run the pipeline, go inside the deployed pipeline folder and use the command:
snakemake --use-conda -p --cores all
You can generate an analysis report with the command:
snakemake --report report.zip --cores all