An integrative drug repositioning framework discovered a potential therapeutic agent targeting COVID-19
Published in Signal Transduction and Targeted Therapy.
Our code requires a GPU to facilitate neural network training. We trained the model on an Nvidia GeForce GTX 1080 Ti.
Our code is tested on Ubuntu 16.04.6 LTS with the following software dependencies:
- Python 2.7
- Tensorflow (tested on version 1.5.0)
- numpy (tested on version 1.16.4)
- sklearn (tested on version 0.20.3)
- tflearn
Install Python 2.7
sudo apt update
sudo apt upgrade
sudo apt install python2.7 python-pip
Install numpy
pip install numpy==1.16.4
Install sklearn
pip install scikit-learn==0.20.3
Install tensorflow
pip install tensorflow-gpu==1.5.0
Install tflearn
pip install tflearn
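After installation, it can help to confirm that the installed versions match the ones listed above. The following is a minimal sketch, not part of the repository: the helper name check_versions and the EXPECTED table are hypothetical, and it assumes each package exposes a __version__ attribute.

```python
import importlib

# Versions listed in this README, keyed by import name.
# tflearn has no pinned version here, so it maps to None.
EXPECTED = {
    "numpy": "1.16.4",
    "sklearn": "0.20.3",
    "tensorflow": "1.5.0",
    "tflearn": None,
}


def check_versions(expected):
    """Return a dict mapping each module name to a short status string."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        if wanted is None or found == wanted:
            report[name] = "ok ({})".format(found)
        else:
            report[name] = "mismatch (found {}, tested with {})".format(found, wanted)
    return report


def print_report(expected=EXPECTED):
    """Print one status line per dependency."""
    for name, status in sorted(check_versions(expected).items()):
        print("{}: {}".format(name, status))
```

Calling print_report() in the target environment prints one line per dependency. A mismatch does not necessarily mean the code will fail, only that the combination differs from the one tested above.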
To run our experiments:
- Unzip data.rar or data_original.rar in ./data. data_original.rar is the original dataset used in the manuscript; data.rar adds new virus protein-human protein interaction data [Gordon, D.E., Jang, G.M., Bouhaddou, M. et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature (2020)], which further improved the prediction performance.
- cd ./src, then execute python NeoDTI_for_COVID19.py to run a 10-fold cross-validation of NeoDTI and generate the prediction results. Options are:
-d: The embedding dimension d, default: 512.
-n: Global norm to be clipped, default: 1.
-k: The dimension of the projection matrices, default: 256.
-r: The number of repeats of 10-fold cross-validation to run, default: 1.
-e: The number of epochs for model training, default: 5000.
-l: L2 regularization strength, default: 0.
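For reference, the options above could be declared as in the sketch below. This is an illustration only: it uses argparse, the value types (int or float) are assumptions, and the actual script may parse its options differently. Only the flag names and defaults come from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented command-line options"
    )
    parser.add_argument("-d", type=int, default=512, help="embedding dimension d")
    parser.add_argument("-n", type=float, default=1.0, help="global norm to be clipped")
    parser.add_argument("-k", type=int, default=256, help="dimension of the projection matrices")
    parser.add_argument("-r", type=int, default=1, help="repeats of 10-fold cross-validation")
    parser.add_argument("-e", type=int, default=5000, help="number of training epochs")
    parser.add_argument("-l", type=float, default=0.0, help="L2 regularization strength")
    return parser


# Override two options and leave the rest at their defaults.
args = build_parser().parse_args(["-d", "256", "-r", "3"])
print(args.d, args.k, args.r, args.e)
```

Under the same assumption that each flag takes a value, a corresponding invocation would look like python NeoDTI_for_COVID19.py -d 256 -r 3.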
The results are stored in ./output.
- The typical run time for one repeat of 10-fold cross-validation on a Linux machine is 2 to 3 days.
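To make the term concrete, one repeat of 10-fold cross-validation partitions the samples into 10 disjoint folds and uses each fold once as the held-out test set. The sketch below shows the idea in plain Python. The function name k_fold_indices and the seeded shuffle are assumptions for illustration; the script itself may build its folds differently (for example, with a library splitter or with stratification).

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists for one repeat of k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Every sample appears in exactly one test fold across the 10 splits.
for train, test in k_fold_indices(25, n_folds=10):
    assert not set(train) & set(test)
```

Repeating the procedure with a different seed gives another repeat, which is what the -r option counts.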
- drug_iddict: python dictionary, key: drug InChI; value: index number.
- Drug_simi_net.npy: Drug structure similarity matrix (Tanimoto similarity of Morgan fingerprints).
- new_all_human_seq_iddict.pkl: python dictionary, key: human protein ID; value: index number.
- new_all_human_seq_seqdict.pkl: python dictionary, key: human protein ID; value: protein sequence.
- new_all_human_seq.npy: Human protein sequence similarity matrix (normalized Smith-Waterman alignment scores).
- PPI_net.npy: Human protein-protein interaction matrix.
- all_seq_virus_seqdict.pkl: python dictionary, key: virus protein ID; value: protein sequence.
- all_seq_virus_iddict.pkl: python dictionary, key: virus protein ID; value: index number.
- all_seq_virus.npy: Virus protein sequence similarity matrix (normalized Smith-Waterman alignment scores).
- VHI_net.npy: Virus protein-human protein interaction matrix.
- VDTI_net.npy: Drug-virus protein interaction matrix.
- HDTI_net.npy: Drug-human protein interaction matrix.
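The drug similarity matrix listed above is described as Tanimoto similarity of Morgan fingerprints. As a reminder of the formula, the Tanimoto similarity of two fingerprints A and B is |A ∩ B| / |A ∪ B|, where each fingerprint is treated as the set of its "on" bits. The sketch below computes it for toy bit sets; generating real Morgan fingerprints requires a cheminformatics toolkit, which this example does not cover, and the empty-fingerprint convention is an assumption.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as collections of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / float(union)


# Toy fingerprints sharing two of four distinct on-bits.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Applying this to every pair of drugs yields a symmetric matrix with ones on the diagonal (for non-empty fingerprints).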
All entities (i.e., drugs, human and virus proteins) are ordered by their indices.
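Because entities are ordered by their indices, an ID dictionary can be used to find the row or column of an entity in a matrix. The sketch below illustrates this with toy data. The helper names load_index and lookup are hypothetical, and it assumes that rows of a drug-protein matrix correspond to drugs and columns to proteins, which should be checked against the actual files.

```python
import pickle


def load_index(path):
    """Load a pickled ID-to-index dictionary (for example, an *_iddict.pkl file)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def lookup(matrix, row_index, col_index, row_id, col_id):
    """Return the matrix entry for a (row entity, column entity) pair."""
    return matrix[row_index[row_id]][col_index[col_id]]


# Toy stand-ins for the real files: 2 drugs x 3 proteins.
drug_index = {"InChI=A": 0, "InChI=B": 1}
protein_index = {"P1": 0, "P2": 1, "P3": 2}
interactions = [
    [0, 1, 0],
    [1, 0, 1],
]

print(lookup(interactions, drug_index, protein_index, "InChI=B", "P3"))  # 1
```

With the real data, the matrix would come from numpy.load on the corresponding .npy file, and the same indexing applies to a NumPy array. Pickles written under Python 2 may need extra care when read from Python 3.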
The predictive results and files associated with the paper are provided.
- paper/BERE/*: Results after text mining using BERE. For the trained BERE model, please refer to the original repository.
- paper/CMap/*: Data used for CMap analysis.
- paper/results/retrospective_study/*: Prediction results of CoV-DTI in a retrospective study.
- paper/results/retrospective_study/All_drugs_rank_for_SARS.txt: Prediction results of CoV-DTI for SARS.
- paper/results/retrospective_study/All_drugs_rank_for_MERS.txt: Prediction results of CoV-DTI for MERS.
- paper/results/predictions/*: Prediction results of drug repositioning against COVID-19.
- paper/results/predictions/Drug_COVID-19_CoV-DTI.xlsx: Prediction results of CoV-DTI.
- paper/results/predictions/Drug_COVID-19_BERE.xlsx: Prediction results of BERE.
- paper/results/predictions/Drug_COVID-19_CMap_PBMC.txt: Prediction results of CMap from PBMC samples of COVID-19 patients.
- paper/results/predictions/Drug_COVID-19_CMap_BALF.txt: Prediction results of CMap from BALF samples of COVID-19 patients.
- paper/results/predictions/Drug_SARS_CMap_PBMC.txt: Prediction results of CMap from PBMC samples of SARS patients.
- paper/experiments/Source Data.xlsx: Raw data points associated with the figures in the paper.
If you have any questions or comments, please feel free to email Fangping Wan (wanfangping92[at]gmail[dot]com) and/or Jianyang Zeng (zengjy321[at]tsinghua[dot]edu[dot]cn).