DTI-CNN:
A learning-based method for drug-target interaction prediction based on feature representation learning and deep neural network
Quick start
We provide an example script to run experiments on our dataset:
- Run ./python/run_model.py: predict drug-target interactions and evaluate the results with ten-fold cross-validation.
Full process
- Run compute_similarity.m
- Run run_joint.m
- Run run_DAE.py
- Run run_model.py
Code and data
matlab/ directory
- compute_similarity.m: compute Jaccard similarity from the interaction/association networks
- joint.m: splice the drug and protein networks together
- diffusionRWR.m: network diffusion algorithm (random walk with restart, RWR)
- run_joint.m: run the joint and RWR steps above
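To illustrate the two operations named above, here is a minimal pure-Python sketch of row-wise Jaccard similarity and of a random walk with restart. It is not a port of the MATLAB scripts: the function names, the restart probability of 0.5, the convergence settings, and the treatment of empty rows and isolated nodes are all assumptions made for this example.

```python
def jaccard_similarity(matrix):
    """Pairwise Jaccard similarity between the rows of a binary matrix."""
    row_sets = [{j for j, v in enumerate(row) if v} for row in matrix]
    n = len(row_sets)
    sim = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            union = row_sets[a] | row_sets[b]
            # Assumption: if both rows are empty, the similarity is set to 0.
            score = len(row_sets[a] & row_sets[b]) / len(union) if union else 0.0
            sim[a][b] = sim[b][a] = score
    return sim


def random_walk_with_restart(adj, restart_prob=0.5, max_iter=1000, tol=1e-10):
    """Return one diffusion state per node: row s is the converged
    distribution of a walk that restarts at node s."""
    n = len(adj)
    trans = []
    for i, row in enumerate(adj):
        total = sum(row)
        if total:
            trans.append([v / total for v in row])  # row-normalize
        else:
            # Assumption: an isolated node only transitions to itself.
            trans.append([1.0 if j == i else 0.0 for j in range(n)])
    states = []
    for seed in range(n):
        restart = [1.0 if k == seed else 0.0 for k in range(n)]
        p = restart[:]
        for _ in range(max_iter):
            nxt = [
                (1 - restart_prob) * sum(p[i] * trans[i][j] for i in range(n))
                + restart_prob * restart[j]
                for j in range(n)
            ]
            delta = max(abs(x - y) for x, y in zip(nxt, p))
            p = nxt
            if delta < tol:
                break
        states.append(p)
    return states


# Toy drug-side-effect style matrix (hypothetical values).
toy = [[1, 1, 0], [0, 1, 1], [0, 0, 0]]
similarity = jaccard_similarity(toy)
diffusion = random_walk_with_restart(similarity)
```

Each row of diffusion sums to 1, so it can be read as a probability distribution over nodes for the walk started at that row's node.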
python/ directory
- au_class.py: implement the autoencoder
- DAE.py: implement the denoising autoencoder
- run_DAE.py: run the denoising autoencoder on the dataset
- run_model.py: predict drug-target interactions and evaluate the results with ten-fold cross-validation
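The ten-fold cross-validation mentioned above can be sketched as follows. This is a generic index-splitting example, not the logic in run_model.py: the function name k_fold_indices, the fixed shuffle seed, and the sample count of 25 are assumptions chosen for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        test_idx = folds[f]
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield train_idx, test_idx


# Hypothetical example: 25 drug-target pairs split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Every sample appears in exactly one test fold, so each drug-target pair is scored once by a model that did not see it during training.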
data/ directory
- drug.txt: list of drug names
- protein.txt: list of protein names
- disease.txt: list of disease names
- se.txt: list of side effect names
- drug_dict_map: a complete ID mapping between drug names and DrugBank IDs
- protein_dict_map: a complete ID mapping between protein names and UniProt IDs
- mat_drug_se.txt: Drug-SideEffect association matrix
- mat_protein_protein.txt: Protein-Protein interaction matrix
- mat_protein_drug.txt: Protein-Drug interaction matrix
- mat_drug_protein.txt: Drug-Protein interaction matrix (transpose of the above matrix)
- mat_drug_protein_remove_homo.txt: Drug-Protein interaction matrix in which homologous proteins with identity scores above 40% were excluded (see the paper)
- mat_drug_drug.txt: Drug-Drug interaction matrix
- mat_protein_disease.txt: Protein-Disease association matrix
- mat_drug_disease.txt: Drug-Disease association matrix
- Similarity_Matrix_Drugs.txt: Drug similarity scores based on the chemical structures of drugs
- Similarity_Matrix_Proteins.txt: Protein similarity scores based on the primary sequences of proteins
Note: drugs, proteins, diseases and side-effects are organized in the same order across all files, including name lists, ID mappings and interaction/association matrices.
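Because the name lists and matrices share the same ordering, row i of a drug matrix corresponds to line i of drug.txt, and column j to line j of protein.txt. The sketch below shows how that alignment could be used to look up interactions by name. It assumes the matrix files are whitespace-separated numeric text with one row per line, which the README does not state. The helper names and the toy contents are hypothetical stand-ins for the real files.

```python
import io


def read_names(handle):
    """Read one name per line, skipping blank lines."""
    return [line.strip() for line in handle if line.strip()]


def read_matrix(handle):
    """Read a whitespace-separated numeric matrix, one row per line (assumed format)."""
    return [[int(float(tok)) for tok in line.split()] for line in handle if line.strip()]


def interacting_proteins(drug, drug_names, protein_names, matrix):
    """Return the proteins whose column is non-zero in the given drug's row."""
    if len(matrix) != len(drug_names) or any(len(r) != len(protein_names) for r in matrix):
        raise ValueError("matrix shape does not match the name lists")
    row = matrix[drug_names.index(drug)]
    return [protein_names[j] for j, value in enumerate(row) if value]


# Toy stand-ins for drug.txt, protein.txt and mat_drug_protein.txt.
drug_names = read_names(io.StringIO("drugA\ndrugB\n"))
protein_names = read_names(io.StringIO("protX\nprotY\nprotZ\n"))
drug_protein = read_matrix(io.StringIO("1 0 1\n0 1 0\n"))

targets_of_a = interacting_proteins("drugA", drug_names, protein_names, drug_protein)
```

With the real data, the io.StringIO objects would be replaced by open file handles for the corresponding files.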