/MINN-DTI

Effective drug-target interaction prediction with mutual interaction neural network

Primary LanguagePython

Effective drug-target interaction prediction with mutual interaction neural network

MINN-DTI

  • Source code for the paper "Effective drug-target interaction prediction with mutual interaction neural network".

  • MINN-DTI is a model for drug-target interaction (DTI) prediction. MINN-DTI combines an interacting-transformer module (called Interformer) with an improved Communicative Message Passing Neural Network (CMPNN) (called Inter-CMPNN) to better capture the two-way impact between drugs and targets, which are represented by molecular graph and distance map respectively.

MINN-DTI

Dataset

All data used in this paper are publicly available and consistent with that used by DrugVQA , which can be accessed here : DrugVQA.

Environment

  • base dependencies:
  - dgl
  - dgllife
  - numpy
  - pandas
  - python>=3.7
  - pytorch>=1.7.1
  - rdkit
  • We also provide an environment file for Anaconda users. You can init your environment by conda env create -f environment.yaml.
  • Need download the chemprop package from CMPNN and put it in model/ directory.

Usage

Training on datasets uesd in this paper

Data and file directory preparation

  • Before training a model on the datasets uesd in this paper, you must prepare data and file directory as follows (take DUD-E as an example):
      1. Select or create your local data directory for the DUD-E dataset, such as data/DUD-E.
      1. Download data/DUDE/contactMap and data/DUDE/dataPre directories including proetin contact maps, SMIELS and labels from DrugVQA repository.
      1. Put the downloaded contactMap and dataPre folders in your data directory(data/DUD-E)

Set arguments

  • All default arguments are provided in the model/data.py for training.
  • You can modify the model/data.py directly to set up your model
  • The following arguments must be set according to your data directory
# Path of training data file
trainFoldPath = '../data/DUDE/dataPre/DUDE-foldTrain1'
# Directory of protein contact maps
contactPath = '../data/DUDE/contactMap'
# Path of the protein contact map dict file
contactDictPath = '../data/DUDE/dataPre/DUDE-contactDict'

Run your training

  • Run any one command below using model/main.py to train a model, model files will be saved in model_pkl/my/
$ python model/main.py
# Specify GPU
$ CUDA_VISIBLE_DEVICES=0 python model/main.py
# Running in the background
$ nohup python model/main.py > train.log 2>&1 &
$ CUDA_VISIBLE_DEVICES=0 nohup python model/main.py > train.log 2>&1 &

Training on independent datasets

  • You can train your model on your own datasets Follow the steps above.
  • The only thing you need to do is organize your data in the format used here, you have to:
    • Prepare a training data file like this file
    • Prepare a protein contact map dict file like this file
    • Prepare protein contact maps like this file
    • Specify the path of the above files in model/data.py as Set arguments in Training on datasets uesd in this paper section bove

Testing on datasets uesd in this paper

Data and file directory preparation

  • Before testing a model, you must prepare data and file directory
    • Besides contactMap and dataPre folders, you need to download decoy_smile and active_smile folders from DrugVQA repository and put them in your data directory.

Set arguments

  • All default arguments are provided in the model/dataTest.py for testing.
  • You can modify the model/dataTest.py directly to set up your testing
  • The following arguments must be set according to your data directory
# Path of test list file
testFoldPath = '../data/DUDE/dataPre/DUDE-foldTest1'
# Directory of protein contact maps
contactPath = '../data/DUDE/contactMap'
# Path of the protein contact map dict file
contactDictPath = '../data/DUDE/dataPre/DUDE-contactDict'
# Directory of SMILES file of active or decoy molecules
DECOY_PATH = '../data/DUDE/decoy_smile'
ACTIVE_PATH = '../data/DUDE/active_smile'

Run your testing

  • Run any one command below using model/mainTest.py to test your models, results including AUC and other indicators will be written in test.log
# Running in the background
# Setting model file: ../model_pkl/my/DUDE-fold-h0501-235.pkl 
$ nohup python model/mainTest.py .py --checkpoint_path ../model_pkl/my/DUDE-fold-h0501-235.pkl 2>&1 > test.log
# Specify GPU
$ CUDA_VISIBLE_DEVICES=0 nohup python model/mainTest.py .py --checkpoint_path ../model_pkl/my/DUDE-fold-h0501-235.pkl 2>&1 > test.log

Testing or predicting on independent datasets

  • To test or predict on independent datasets you need to organize your data in the format used here:
    • Prepare protein contact maps and contact map dict file as above
    • Prepare a test list of target names separated by spaces(named mytest here)
    • Put lists of active SMILES and decoy SMILES named XXX_actives_final.ism and XXX_decoys_final.ism (XXX is target name in test list mytest) of targets in active_smile and decoy_smile folds Separately with one SMILES per line (named active_smile and decoy_smile), put them all in active_smile for prediction task.
  • Modify the following arguments in model/dataTest.py
# Path of test list file
testFoldPath = '../data/DUDE/dataPre/mytest'
# Directory of protein contact maps
contactPath = '../data/DUDE/contactMap'
# Path of the protein contact map dict file
contactDictPath = '../data/DUDE/dataPre/DUDE-contactDict'
# Directory of SMILES file of active or decoy molecules
DECOY_PATH = '../data/DUDE/decoy_smile'
ACTIVE_PATH = '../data/DUDE/active_smile'
  • Run any one command below using model/mainTest.py to test your models, testing and predicting results including predicting results of each sample will be written in mytest.log
# Running in the background
# Setting model file: ../model_pkl/my/DUDE-fold-h0501-235.pkl 
$ nohup python model/mainTest.py .py --checkpoint_path ../model_pkl/my/DUDE-fold-h0501-235.pkl 2>&1 > mytest.log
# Specify GPU
$ CUDA_VISIBLE_DEVICES=0 nohup python model/mainTest.py .py --checkpoint_path ../model_pkl/my/DUDE-fold-h0501-235.pkl 2>&1 > mytest.log