The project's new URL is https://gitbio.ens-lyon.fr/LBMC/Delattre/ChrSexebelari
SNP calling pipeline to find homozygous SNPs present in the JU2817 strain but not in JU2859.
These instructions will get you a working version of the SNP calling pipeline.
To run nextflow on your computer, you need to have Java (>= 1.8) installed.
java -version
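Java version strings follow two schemes: `1.8.0_292` for Java 8 and earlier, and `11.0.2` for later releases. The following is a minimal sketch of a version check, not part of this repository. The helper names `java_major` and `java_is_supported` are hypothetical.

```shell
# Hypothetical helper (not part of this repository): extract the major
# Java version from a version string such as "1.8.0_292" or "11.0.2".
java_major() {
  local version="$1"
  # Legacy scheme: "1.8.0_292" means Java 8, so drop the leading "1.".
  version="${version#1.}"
  # Keep only the digits before the first non-digit character.
  echo "${version%%[!0-9]*}"
}

# Hypothetical helper: succeed when the version string is Java 8 (1.8) or later.
java_is_supported() {
  [ "$(java_major "$1")" -ge 8 ]
}
```

Assuming the `java` launcher prints a line of the form `... version "X.Y.Z"`, the installed version could be checked with `java_is_supported "$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')"`.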
To easily test the tools already implemented for nextflow on your computer (see src/nf_modules/ for their list), you need to have Docker installed.
docker run hello-world
To install nextflow on your computer, simply run the following command:
src/install_nextflow.sh
Then initialize the necessary Docker tools with the following commands:
src/docker_modules/cutadapt/1.14/docker_init.sh
src/docker_modules/UrQt/d62c1f8/docker_init.sh
src/docker_modules/bioawk/1.0/docker_init.sh
src/docker_modules/Bowtie2/2.3.4.1/docker_init.sh
src/docker_modules/sambamba/0.6.7/docker_init.sh
src/docker_modules/GATK/4.0.8.1/docker_init.sh
src/docker_modules/SAMtools/1.7/docker_init.sh
src/docker_modules/bcftools/1.7/docker_init.sh
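The same initialization can be scripted as a loop. The following is a minimal sketch, not part of the repository. The names `DOCKER_MODULES`, `docker_init_scripts` and `run_docker_init` are hypothetical. The module list is copied from the commands above, and the sketch assumes each `docker_init.sh` is executable and run from the project root.

```shell
# Tool/version pairs copied from the commands listed above.
DOCKER_MODULES="cutadapt/1.14 UrQt/d62c1f8 bioawk/1.0 Bowtie2/2.3.4.1 sambamba/0.6.7 GATK/4.0.8.1 SAMtools/1.7 bcftools/1.7"

# Hypothetical helper: print the path of each docker_init.sh, one per line.
docker_init_scripts() {
  for module in $DOCKER_MODULES; do
    echo "src/docker_modules/${module}/docker_init.sh"
  done
}

# Hypothetical helper: run every init script in order and stop at the
# first one that fails.
run_docker_init() {
  for script in $(docker_init_scripts); do
    echo "Initializing ${script}"
    "$script" || return 1
  done
}
```

Calling `run_docker_init` from the project root would then run the scripts in order. Calling `docker_init_scripts` alone only lists the paths without running anything.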
Necessary R packages
R -e 'install.packages(c("tidyverse", "seqinr"), repos = "https://cloud.r-project.org")'
To launch the analysis, you can execute the content of the script src/1_JU28_59vs17_SNP_calling.sh.
The script has three sections: the first runs the pipeline locally with Docker on a training set (after generating the training set), the second runs it with Docker on the full data set, and the last runs it on the PSMN.
tool | nf module | docker module | psmn module |
---|---|---|---|
BEDtools | ok | ok | ok |
BCFtools | no | ok | ok |
bioawk | no | ok | ok |
Bowtie | ok | ok | no |
Bowtie2 | ok | ok | ok |
BWA | ok | ok | ok |
canu | ok | ok | ok |
cutadapt | ok | ok | ok |
deepTools | no | ok | ok |
FastQC | ok | ok | ok |
file_handle | no | ok | ok |
GATK | no | ok | ok |
HISAT2 | no | ok | no |
HTSeq | ok | ok | ok |
Kallisto | ok | ok | ok |
MACS2 | no | ok | ok |
MultiQC | ok | ok | ok |
MUSIC | ok | ok | ok |
picard | no | ok | ok |
pigz | no | ok | ok |
RSEM | ok | ok | ok |
sambamba | ok | ok | ok |
samblaster | ok | ok | ok |
SAMtools | ok | ok | ok |
SRAtoolkit | ok | ok | ok |
Salmon | no | ok | ok |
TopHat | no | ok | ok |
Trimmomatic | no | ok | ok |
UrQt | ok | ok | ok |
After running the src/SNP_calling.nf pipeline, the src/intersect_SNP.R R script formats the .vcf files into a .csv table.
The final output is filtered to keep only SNPs matching a list of enzymes and SNPs that are homozygous in one strain and absent from the other.
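The strain-specific part of this filter can be illustrated with a minimal sketch. It is not the code of src/intersect_SNP.R. The function name `filter_strain_specific` and the input layout (a CSV with the columns `chrom,pos,gt_JU2817,gt_JU2859` and VCF-style genotypes) are assumptions made for illustration. "Absent" is assumed to mean a `0/0` genotype, and the enzyme filter is not covered.

```shell
# Hypothetical input layout (NOT the actual output of src/intersect_SNP.R):
#   chrom,pos,gt_JU2817,gt_JU2859
# where genotypes are VCF-style: 1/1 = homozygous alternate,
# 0/1 = heterozygous, 0/0 = SNP absent (assumption).
filter_strain_specific() {
  awk -F ',' '
    # Keep the header line.
    NR == 1 { print; next }
    # Keep SNPs that are homozygous in one strain and absent from the other.
    ($3 == "1/1" && $4 == "0/0") || ($3 == "0/0" && $4 == "1/1") { print }
  '
}
```

The function reads the table on standard input, for example `filter_strain_specific < snps.csv`, where `snps.csv` is a placeholder file name. Heterozygous SNPs and SNPs homozygous in both strains are dropped.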
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
We use SemVer for versioning. For the versions available, see the tags on this repository.
- Laurent Modolo - Initial work
See also the list of contributors who participated in this project.
This project is licensed under the CeCILL License; see the LICENSE file for details.