A method for detecting SVs from long-read alignments

SVsearcher

Introduction

Structural variations (SVs) are genomic rearrangements, such as deletions, insertions, inversions, duplications, and translocations, that are larger than 50 bp. A number of long-read SV callers have been proposed, and they generally perform well. However, long reads generated by Oxford Nanopore Technologies (ONT) have a high error rate, which reduces the accuracy of long-read alignment, and existing long-read SV callers do not perform well on such data. We propose a novel method, SVsearcher, to address this issue. Compared with existing methods, SVsearcher achieves the highest recall, precision, and F1-score.


Installation

git clone https://github.com/kensung-lab/SVsearcher.git	

Dependencies

1. python3
2. pysam
3. cigar
4. numpy
5. pyfaidx
6. copy (Python standard library)
7. time (Python standard library)
8. argparse (Python standard library)
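
Before running the tool, it can help to confirm that the modules listed above are importable in the active Python environment. The following is a minimal sketch and not part of SVsearcher; the helper name `missing_modules` is hypothetical, and it assumes each package is imported under the same name as it is listed here.

```python
import importlib.util

# Third-party packages from the dependency list (assumed import names).
THIRD_PARTY = ["pysam", "cigar", "numpy", "pyfaidx"]
# These three ship with Python itself and need no separate installation.
STDLIB = ["copy", "time", "argparse"]


def missing_modules(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY + STDLIB)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only locates each module and does not import it, so it reports a missing package without triggering import-time side effects.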

Running

Sorted BAM files produced by NGMLR, Minimap, or Minimap2 can all be used as the input sorted BAM. The input reference.fa must be the same reference that was used to produce the BAM file.

cd dist
SVsearcher <input sorted bam> <input reference.fa>	
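
Because the reference must match the one used for alignment, a quick sanity check is to compare the contig names and lengths in the BAM header against the FASTA index. The sketch below is not part of SVsearcher. It assumes you have the header as text (for example, the output of `samtools view -H`) and the contents of the `.fai` index file. The function names and the toy contigs are hypothetical.

```python
def contigs_from_sam_header(header_text):
    """Collect {name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Every field after the record type has the form TAG:VALUE.
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def contigs_from_fai(fai_text):
    """Collect {name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def reference_mismatches(bam_contigs, fasta_contigs):
    """List BAM contigs that are absent from the FASTA or differ in length."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in fasta_contigs:
            problems.append(f"{name}: missing from reference")
        elif fasta_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM, {fasta_contigs[name]} in reference"
            )
    return problems


if __name__ == "__main__":
    # Toy inputs for illustration only; real headers come from your own files.
    header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:800\n"
    fai = "chr1\t1000\t6\t60\t61\nchr2\t900\t1030\t60\t61\n"
    for problem in reference_mismatches(
        contigs_from_sam_header(header), contigs_from_fai(fai)
    ):
        print(problem)
```

Matching names and lengths is a necessary check rather than a sufficient one: two different assemblies could share both, so an empty result does not prove the sequences are identical.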

Output format

The output format is as follows. CHROM is the chromosome name. POS is the SV start position. ID is the SV identifier. REF is the reference sequence and ALT is the alternate sequence. QUAL is the SV quality and FILTER is the filter status. INFO holds basic information about the SV, such as its type (SVTYPE), length (SVLEN), and end position (END).

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
chr1	10780	SVsearcher.INS.1	G	GAACACATGCTAGCGCGTCCGGGGGTGGAGGCGATAGCGCAGGCGCAGAGAGCGCCGCGCC	.	PASS	SVTYPE=INS;SVLEN=61;END=10780;RNAMES=NULL
chr1	30893	SVsearcher.DEL.1	catttctctctatctcatttctctctctctcgctatct	c	.	PASS	SVTYPE=DEL;SVLEN=-37;END=30930;RNAMES=NULL
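
For downstream analysis, each record can be split on tabs and its INFO column expanded into key/value pairs. The following sketch is not part of SVsearcher. The function name `parse_sv_record` is hypothetical, and it assumes every record has the eight tab-separated columns shown above, with SVLEN and END present in INFO.

```python
def parse_sv_record(line):
    """Parse one tab-separated SV record into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, sv_id, ref, alt, qual, filt, info = fields[:8]
    # INFO is a ';'-separated list of KEY=VALUE pairs.
    info_dict = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": sv_id,
        "ref": ref,
        "alt": alt,
        "qual": qual,
        "filter": filt,
        "svtype": info_dict.get("SVTYPE"),
        "svlen": int(info_dict["SVLEN"]),
        "end": int(info_dict["END"]),
        "info": info_dict,
    }


if __name__ == "__main__":
    # The deletion record from the example output above.
    record = (
        "chr1\t30893\tSVsearcher.DEL.1\t"
        "catttctctctatctcatttctctctctctcgctatct\tc\t.\tPASS\t"
        "SVTYPE=DEL;SVLEN=-37;END=30930;RNAMES=NULL"
    )
    sv = parse_sv_record(record)
    print(sv["svtype"], sv["chrom"], sv["pos"], sv["end"], sv["svlen"])
```

In the example records, a deletion has a negative SVLEN and END equal to POS plus the deleted length, while an insertion has END equal to POS.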

Contact

For advice, bug reports, or help, please contact yan.zheng@nwpu-bioinformatics.com.