Classifier of the potential pathogenicity of human genomic truncations
# Perl DBI module
If you have not already installed this module, a tar.gz archive of DBI-1.633 is included in the SOFTWARE/ directory. To install it, follow these instructions:
tar -xvzf SOFTWARE/DBI-1.633.tar.gz -C SOFTWARE/
cd SOFTWARE/DBI-1.633/
perl Makefile.PL
make test
sudo make install
cd ../..
# To check the installation
perl -MDBI -e 'warn $DBI::VERSION'
If you have not installed and configured MySQL on your computer, follow these instructions:
sudo apt-get install mysql-server
sudo apt-get install libmysqlclient-dev
sudo service mysql restart
# check if mysql is running
sudo netstat -tap | grep mysql
# Set password
sudo dpkg-reconfigure mysql-server-5.5
# To test the password
mysql -u root -p
# DBD-mysql module
If you have not already installed this module, a tar.gz archive of DBD-mysql-4.031 is included in the SOFTWARE/ directory. To install it, follow these instructions:
tar -xvzf SOFTWARE/DBD-mysql-4.031.tar.gz -C SOFTWARE/
cd SOFTWARE/DBD-mysql-4.031/
perl Makefile.PL
make test
sudo make install
cd ../..
# To check the installation
perl -MDBD::mysql -e 'warn $DBD::mysql::VERSION'
# ggplot2 package in R
RStudio> install.packages("ggplot2")
# ROCR package in R
RStudio> install.packages("ROCR")
Download the repository (as a .zip file or with git clone) and move into its directory:
cd nutvar2-master
Export the following variables or add them to your .profile
export PERL5LIB=$HOME/src/ensembl/modules:$PERL5LIB
export PERL5LIB=$HOME/src/ensembl-variation/modules:$PERL5LIB
export PERL5LIB=$HOME/src/ensembl-compara/modules:$PERL5LIB
export PERL5LIB=$HOME/src/ensembl-funcgen/modules:$PERL5LIB
export PERL5LIB=$HOME/src/ensembl-tools/modules:$PERL5LIB
export PERL5LIB=$HOME/src/bioperl-1.6.1/:$PERL5LIB
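A quick way to confirm that the exported paths point at real directories is to walk PERL5LIB and report any entry that does not exist. The following is a minimal sketch, not something shipped with NutVar2; the function name `check_perl5lib` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with NutVar2): report PERL5LIB entries
# that do not exist on disk. Returns non-zero if any entry is missing.
check_perl5lib() {
    status=0
    old_ifs=$IFS
    IFS=:
    for dir in $PERL5LIB; do
        # Skip empty entries such as the trailing one left by "...:$PERL5LIB"
        [ -z "$dir" ] && continue
        if [ -d "$dir" ]; then
            echo "ok       $dir"
        else
            echo "MISSING  $dir"
            status=1
        fi
    done
    IFS=$old_ifs
    return $status
}
```

After sourcing the exports, running `check_perl5lib` lists each directory with its status, so a mistyped or not yet downloaded Ensembl module path shows up before the pipeline is started.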
During the installation of Ensembl VEP release 75, the installer asks whether to install a cached version of the genome or the FASTA files. Answer NO in both cases, because the GRCh37.75 genome is provided with NutVar2.
There are three different options to run NutVar2: using SnpEff as the only predictor for variant outcome, using VEP as the only predictor for variant outcome, and using both SnpEff and VEP.
**Note:** VEP is a very comprehensive and informative predictor of variant outcome, but its execution time is much longer than that of SnpEff. If VEP is going to be used on large input VCF files, split them into smaller files before running NutVar2.
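One possible way to split a large VCF is sketched below. It assumes an uncompressed VCF whose header lines all start with `#`, and it repeats that header in every piece so each piece remains a valid VCF. The function name `split_vcf` and the `.partNNN.vcf` naming are hypothetical and not part of NutVar2.

```shell
#!/bin/sh
# Hypothetical helper (not part of NutVar2): split a VCF into pieces of at
# most N variant records, repeating the header in every piece.
# Usage: split_vcf input.vcf records_per_piece output_prefix
split_vcf() {
    vcf=$1
    chunk=$2
    prefix=$3
    awk -v n="$chunk" -v prefix="$prefix" '
        /^#/ { header = header $0 "\n"; next }   # collect header lines
        {
            if (count % n == 0) {                # start a new piece
                if (out != "") close(out)
                part++
                out = sprintf("%s.part%03d.vcf", prefix, part)
                printf "%s", header > out
            }
            print > out
            count++
        }
    ' "$vcf"
}
```

For example, `split_vcf user.vcf 10000 user` would write `user.part001.vcf`, `user.part002.vcf`, and so on; the piece size of 10000 records is an arbitrary illustration and should be tuned to the available time and memory.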
cd nutvar2-master
nutvar2-master$ ./ ~/path-to-/nutvar2-master user.vcf data/final
Test: nutvar2-master$ ./ ~/Downloads/nutvar2-master example.vcf data/final
nutvar2-master$ ./ ~/path-to-/nutvar2-master user.vcf data/final
Test: nutvar2-master$ ./ ~/Downloads/nutvar2-master example.vcf data/final
nutvar2-master$ ./ ~/path-to-/nutvar2-master example.vcf data/final
ISSUE!! Create a relative path to setwd in R
Installation ISSUE!! The genomes are huge; where are we going to host them online for the user to download? Right now I retrieve them from my local disk.
# Search for a function that does not produce an error when ordering Chrom X and Chrom Y as if they were numbers (sort?)
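One way to approach the ordering problem, sketched here in shell rather than in the pipeline's own scripts, is to map X, Y and MT to numeric ranks before sorting and to drop the rank afterwards. The function name `sort_by_chrom` is hypothetical. The sketch assumes tab-separated input on stdin with the chromosome in column 1 and the position in column 2, and no header lines. Any other contig name is treated as 0 by the numeric sort and therefore ends up first.

```shell
#!/bin/sh
# Hypothetical helper (not part of NutVar2): sort tab-separated records by
# chromosome (column 1) and position (column 2), ranking X, Y and MT after
# the autosomes instead of treating them as numbers.
sort_by_chrom() {
    tab=$(printf '\t')
    awk -F '\t' -v OFS='\t' '
        {
            key = $1
            sub(/^chr/, "", key)          # accept both "chrX" and "X"
            if (key == "X") key = 23
            else if (key == "Y") key = 24
            else if (key == "MT" || key == "M") key = 25
            print key, $0                 # prepend the numeric sort key
        }
    ' | sort -t "$tab" -k1,1n -k3,3n | cut -f2-
}
```

Used as `sort_by_chrom < variants.tsv`, where `variants.tsv` is a placeholder file name, this lists chromosomes 1 to 22 first, then X, Y and MT, with positions ascending inside each chromosome.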
# Search for the cause of the messages (my own warnings)
splice_in_last_component and
This might be a quality control step in the initial parsing of the snpEff / VEP output
# Rethink the way we calculate NMD
NMD --> stop_gained
NMD ---> frameshifts
NMD ---> new script for splice donor/acceptor variants
# Eliminate from the stand alone tool and THEY ARE NOT USED
# echo the running of each script in the pipeline
# data/final should have the date of creation of the folder
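The last two notes could be handled with two small shell helpers, sketched below. Both names, `run_step` and `dated_dir`, are hypothetical, and neither exists in NutVar2 today. The first prints a timestamped line naming the command before running it; the second creates an output directory whose name ends with the creation date.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NutVar2) for the two notes above.

# Print a timestamped line naming the command, then run it.
run_step() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] running: $*" >&2
    "$@"
}

# Create and print an output directory whose name carries today's date,
# e.g. data/final_2015-06-30.
dated_dir() {
    dir="$1_$(date '+%Y-%m-%d')"
    mkdir -p "$dir" && echo "$dir"
}
```

A wrapper script could then call `out=$(dated_dir data/final)` once and prefix every pipeline command with `run_step`, so the log on stderr shows which script was running when a warning appeared.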