Structural variation caller using third generation sequencing

Primary language: C++. License: MIT.

Sniffles

Sniffles is a structural variation (SV) caller for third-generation sequencing data (PacBio or Oxford Nanopore). It detects all types of SVs (10 bp or longer) using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note that the current version of Sniffles requires sorted output from BWA-MEM (use the -M and -x parameters), Minimap2 (SAM file with CIGAR and MD strings), or NGMLR. If you experience problems or have suggestions, please contact: fritz.sedlazeck@gmail.com

Please see our GitHub wiki for more information (https://github.com/fritzsedlazeck/Sniffles/wiki)
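The input requirements above (sorted alignments, with the MD string for Minimap2 or -M and -x for BWA-MEM) can be sketched as a small helper that prints the commands for review instead of running them. This is a minimal sketch, not the authors' recommended pipeline: the function name `sniffles_commands` and all file names are hypothetical, the `map-ont` and `pacbio` presets are assumptions about the data type, and the `sniffles -m ... -v ...` call assumes the Sniffles 1.x command line (run `./sniffles` without arguments to check the options of your build).

```shell
# Print (do not run) the commands that prepare a sorted BAM and call SVs.
# Arguments: mapper (minimap2|bwa), reference FASTA, reads, output prefix.
# Assumption: paths contain no spaces; all names here are placeholders.
sniffles_commands() {
    mapper=$1 ref=$2 reads=$3 prefix=$4
    case $mapper in
        # --MD adds the MD tag; use -x map-pb instead for PacBio CLR reads.
        minimap2) map="minimap2 -ax map-ont --MD $ref $reads" ;;
        # BWA-MEM needs an index first (bwa index "$ref").
        bwa)      map="bwa mem -M -x pacbio $ref $reads" ;;
        *) echo "unknown mapper: $mapper" >&2; return 1 ;;
    esac
    # Sniffles requires coordinate-sorted alignments.
    echo "$map | samtools sort -o $prefix.sorted.bam -"
    # Assumed Sniffles 1.x options: -m input BAM, -v output VCF.
    echo "sniffles -m $prefix.sorted.bam -v $prefix.vcf"
}

# Example: show the Minimap2-based commands for hypothetical files.
sniffles_commands minimap2 reference.fa reads.fastq sample
```

Once the printed commands look right for your data, they can be run by piping the output to `sh`, for example `sniffles_commands bwa reference.fa reads.fastq sample | sh`.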

How to build Sniffles

wget https://github.com/fritzsedlazeck/Sniffles/archive/master.tar.gz -O Sniffles.tar.gz
tar xzvf Sniffles.tar.gz
cd Sniffles-master/
mkdir -p build/
cd build/
cmake ..
make

cd ../bin/sniffles*
./sniffles

Note: Mac users often have to pass the compiler paths to the cmake command, for example:

cmake -D CMAKE_C_COMPILER=/opt/local/bin/gcc-mp-4.7 -D CMAKE_CXX_COMPILER=/opt/local/bin/g++-mp-4.7 .. 

NGMLR

Sniffles performs best with alignments produced by NGMLR, our novel long-read mapping method. Please see: https://github.com/philres/ngmlr
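As a rough illustration of producing NGMLR alignments in the sorted form Sniffles expects, the helper below prints a mapping command followed by a sort command. It is a sketch under assumptions: the function name `ngmlr_commands`, the file names, and the default of 4 threads are hypothetical, and the NGMLR options shown (`-t`, `-r`, `-q`, `-o`) should be checked against the NGMLR documentation for your version.

```shell
# Print (do not run) an NGMLR mapping step and a samtools sorting step.
# Arguments: reference FASTA, reads, output prefix, threads (default 4).
# Assumption: paths contain no spaces; all names here are placeholders.
ngmlr_commands() {
    ref=$1 reads=$2 prefix=$3 threads=${4:-4}
    # For Oxford Nanopore reads, NGMLR also offers the preset "-x ont".
    echo "ngmlr -t $threads -r $ref -q $reads -o $prefix.sam"
    # Sort by coordinate so the BAM can be passed to Sniffles.
    echo "samtools sort -o $prefix.sorted.bam $prefix.sam"
}

# Example: show the commands for hypothetical files.
ngmlr_commands reference.fa reads.fastq sample
```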


Citation:

Please see and cite our paper: https://www.nature.com/articles/s41592-018-0001-7


Poster & Talks:

Accurate and fast detection of complex and nested structural variations using long read technologies. Biological Data Science, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 26 - 29.10.2016

NGMLR: Highly accurate read mapping of third generation sequencing reads for improved structural variation analysis. Genome Informatics 2016, Wellcome Genome Campus Conference Centre, Hinxton, Cambridge, UK, 19.09.-2.09.2016


Datasets used in the manuscript:

We provide the NGMLR aligned reads and the Sniffles calls for the data sets used:

Arabidopsis trio:

Genome in the Bottle trio:

NA12878:

SKBR3: