Calling deletions using deep convolutional neural

DeepSV

Introduction

DeepSV is a deep-learning approach for calling long deletions from sequence reads. It is based on a novel method of visualizing sequence reads; the visualization is designed to capture multiple sources of information in the data that are relevant to long deletions. DeepSV also implements techniques for working with noisy training data. It trains a model from the visualized sequence reads and calls deletions based on this model. We demonstrate that DeepSV outperforms existing methods in terms of accuracy and efficiency of deletion calling on data from the 1000 Genomes Project. Our work shows that deep learning can potentially lead to effective calling of types of genetic variation that are more complex than SNPs.

[Figure: DeepSV workflow]

Requirements

  • Python 3.6, Jupyter Notebook, NumPy, SciPy, pandas, Matplotlib
  • CUDA 8.0, cuDNN
  • TensorFlow
  • DIGITS
  • pysam

Installation

Tools

bash Anaconda3-4.3.1-Linux-x86_64.sh

Jupyter Notebook

  • pip install jupyter
  • Configure
    jupyter notebook --generate-config
    Create a hashed password in Python: from notebook.auth import passwd; passwd()
  • Modify the default configuration file
    c.NotebookApp.ip = '*'
    c.NotebookApp.password = u'sha:...'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 8888
    c.NotebookApp.notebook_dir = u'/home/...'

CUDA & cuDNN

Installation guides for CUDA and cuDNN are available on the official NVIDIA website.

TensorFlow

  • pip install tensorflow-gpu

Digits

cd ~
git clone https://github.com/NVIDIA/DIGITS.git digits
cd digits
sudo apt-get install graphviz gunicorn
for req in $(cat requirements.txt); do sudo pip install $req; done 
pip install -r ~/digits/requirements.txt 
./digits-devserver

pysam

  • pip install pysam

Usage

Data

BAM file & VCF file
First, provide the BAM files and VCF files that the program will use as input.
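
As an illustration of what the VCF input contributes, the sketch below pulls deletion intervals out of VCF records using only the standard library. It assumes the VCF annotates structural variants with the `SVTYPE` and `END` INFO keys defined by the VCF specification; the function name `parse_deletions` and the example records are hypothetical and are not part of DeepSV.

```python
def parse_deletions(vcf_lines):
    """Yield (chrom, start, end) for every record whose INFO has SVTYPE=DEL.

    Assumption: deletions are annotated with SVTYPE=DEL and END=<pos>,
    as in the VCF specification's structural-variant conventions.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
        info_map = dict(
            item.split("=", 1) if "=" in item else (item, "")
            for item in info.split(";")
        )
        if info_map.get("SVTYPE") != "DEL" or "END" not in info_map:
            continue
        yield chrom, pos, int(info_map["END"])


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\tsv1\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2500",
    "1\t5000\tsnp1\tA\tG\t.\tPASS\tAC=2",
]
deletions = list(parse_deletions(example))
```

For real data, pysam (already listed in the requirements) would likely be the more robust way to read both the BAM and VCF files.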

Generating Candidates

Run Generate_Deletion_Image.py and Generate_Non_Deletion_Image.py in the custom path

  • python Generate_Deletion_Image.py --del_length
  • python Generate_Non_Deletion_Image.py --del_length
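
Read depth is one widely used signal for deletions, since coverage tends to drop inside a deleted region. The sketch below computes a per-base depth profile from read intervals. It is not DeepSV's image encoding, which is defined in the scripts above; `depth_profile` and the toy read coordinates are hypothetical and only illustrate the kind of information a read visualization can carry.

```python
def depth_profile(reads, window_start, window_end):
    """Return per-base read depth over the half-open window [start, end).

    reads: iterable of (read_start, read_end) half-open intervals.
    """
    depth = [0] * (window_end - window_start)
    for read_start, read_end in reads:
        # Clip each read to the window before counting.
        lo = max(read_start, window_start)
        hi = min(read_end, window_end)
        for pos in range(lo, hi):
            depth[pos - window_start] += 1
    return depth


# Toy reads: nothing covers positions 6 and 7, as inside a deletion.
reads = [(0, 4), (2, 6), (8, 10)]
profile = depth_profile(reads, 0, 10)
```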

Generating Image Paths

Generate the paths of all images used to train the network

  • python my_file_travel.py
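
A path-listing step of this kind can be sketched with `os.walk`. The example assumes one sub-directory per class and emits `path label_index` lines, a format DIGITS can import as a text-file dataset. The function name, the directory names, and the extension list are hypothetical; my_file_travel.py may organise its output differently.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def list_labeled_images(root, labels):
    """Return sorted 'path label_index' lines for images under root/<label>/.

    Assumption: images for each class live in a directory named after it.
    """
    lines = []
    for index, label in enumerate(labels):
        label_dir = os.path.join(root, label)
        for dirpath, _dirnames, filenames in os.walk(label_dir):
            for name in filenames:
                if name.lower().endswith(IMAGE_EXTENSIONS):
                    lines.append(f"{os.path.join(dirpath, name)} {index}")
    return sorted(lines)
```

Calling `list_labeled_images("images", ["deletion", "non_deletion"])` would then give one line per image, ready to be written to a text file.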

Training the CNN with DIGITS

Feed all the generated images to the network for training

  • Using the CNN architecture in CNN_Source.py

Calling deletions with a trained network

Generating whole-genome images

  • python Whole_genome_Image.py

Extracting deletion information from test results

  • python extract_breakpoint.py
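
One plausible way to turn per-image predictions into deletion intervals is to merge adjacent windows that the network scores as deletions. The sketch below assumes each test image corresponds to a genomic window with a deletion score. The input layout, the 0.5 threshold, and the name `merge_windows` are all hypothetical; extract_breakpoint.py may use a different procedure.

```python
def merge_windows(windows, threshold=0.5):
    """Merge touching or overlapping windows scored >= threshold.

    windows: iterable of (chrom, start, end, score), sorted by chrom, start.
    Returns a list of (chrom, start, end) candidate deletion intervals.
    """
    merged = []
    for chrom, start, end, score in windows:
        if score < threshold:
            continue  # window not predicted as a deletion
        same_run = merged and merged[-1][0] == chrom and start <= merged[-1][2]
        if same_run:
            # Extend the current interval to cover this window.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(interval) for interval in merged]


# Hypothetical predictions, for illustration only.
predictions = [
    ("1", 0, 50, 0.9),
    ("1", 50, 100, 0.8),
    ("1", 100, 150, 0.1),
    ("1", 150, 200, 0.7),
    ("2", 0, 50, 0.95),
]
calls = merge_windows(predictions)
```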

Generating VCF File

  • python generate_final_vcf.py
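
For reference, deletion calls can be serialised as VCF records with a symbolic `<DEL>` allele and the `SVTYPE`, `END`, and `SVLEN` INFO keys from the VCF specification. The sketch below is a minimal illustration of that layout, not the output format of generate_final_vcf.py; the function name, the `DEL<n>` record IDs, and the `N` placeholder for the reference base are assumptions.

```python
VCF_HEADER = [
    "##fileformat=VCFv4.2",
    '##ALT=<ID=DEL,Description="Deletion">',
    '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">',
    '##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant">',
    '##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def deletions_to_vcf(deletions):
    """Render (chrom, start, end) tuples as VCF text.

    Assumptions: input is already sorted; REF is written as 'N' because
    the reference base is not looked up in this sketch.
    """
    lines = list(VCF_HEADER)
    for number, (chrom, start, end) in enumerate(deletions, start=1):
        # SVLEN is negative for deletions.
        info = f"SVTYPE=DEL;END={end};SVLEN={start - end}"
        record = [chrom, str(start), f"DEL{number}", "N", "<DEL>", ".", ".", info]
        lines.append("\t".join(record))
    return "\n".join(lines) + "\n"


vcf_text = deletions_to_vcf([("1", 1000, 2500)])
```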