This repository contains the source code of KnowDDI.
- data: the pre-processed DrugBank and BioSNAP (also known as TWOSIDES) datasets.
- pytorch: the PyTorch implementation of KnowDDI.
- raw_data: the original DrugBank and BioSNAP datasets.
We provide the datasets in the data folder.
| Data | Source | Description |
|---|---|---|
| DrugBank | This link | A drug-drug interaction network between 1,709 drugs with 136,351 interactions. |
| BioSNAP | This link | A drug-drug interaction network between 645 drugs with 46,221 interactions. |
| Hetionet | This link | A knowledge graph containing 33,765 nodes of 11 types (e.g., gene, disease, pathway, molecular function) with 1,690,693 edges from 23 relation types after preprocessing. To ensure no information leakage, we remove all edges that overlap between Hetionet and the datasets above. |
We provide the mapping file between the IDs in our pre-processed data and their original names/DrugBank IDs, as well as a copy of the Hetionet data and its mapping file, at this link.
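The leakage filter mentioned in the Hetionet row can be pictured roughly as follows. This is only a minimal sketch, not the preprocessing code shipped in this repository: the function name `remove_overlapping_edges`, the `(head, relation, tail)` triple layout, the toy identifiers, and the choice to treat drug pairs as unordered are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail); assumed layout


def remove_overlapping_edges(
    kg_triples: Iterable[Triple],
    ddi_pairs: Iterable[Tuple[str, str]],
) -> List[Triple]:
    """Drop every knowledge-graph edge whose endpoints form a pair in the DDI data."""
    # Pairs are treated as unordered, so (a, b) and (b, a) both count (assumption).
    blocked: Set[FrozenSet[str]] = {frozenset(pair) for pair in ddi_pairs}
    return [
        (head, rel, tail)
        for head, rel, tail in kg_triples
        if frozenset((head, tail)) not in blocked
    ]


# Toy data with hypothetical identifiers.
kg = [
    ("DB001", "resembles", "DB002"),  # same endpoints as a DDI pair: removed
    ("DB001", "binds", "GENE_7"),     # drug-gene edge: kept
    ("DB003", "resembles", "DB001"),  # not a DDI pair: kept
]
ddi = [("DB002", "DB001")]

filtered = remove_overlapping_edges(kg, ddi)
print(filtered)
```

Filtering on the endpoint pair rather than on the full triple is the more conservative choice here, since any direct drug-drug edge in the knowledge graph could reveal that an interaction exists regardless of its relation type.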
This repository requires only a standard computer with enough RAM to support the in-memory operations. We recommend a machine with a GPU.
The development version of the package has been tested on Linux (Ubuntu 18.04) with CUDA 10.2.
The code requires the following environment.
python==3.7.15
pytorch==1.6.0
torchvision==0.7.0
cudatoolkit==10.2
lmdb==0.98
networkx==2.4
scikit-learn==0.22.1
tqdm==4.43.0
dgl-cu102==0.6.1
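If it helps, the pinned versions listed above can be compared against the active environment with a few lines of Python. This is a hedged sketch rather than part of the repository: the helper names `parse_pins`, `find_mismatches`, and `installed_versions` are hypothetical, and only the pip-visible packages are checked because conda names such as `pytorch` and `cudatoolkit` do not necessarily match pip distribution names.

```python
from typing import Dict, Iterable, Optional, Tuple

# Subset of the pins above whose names are also pip distribution names.
PINNED = """
lmdb==0.98
networkx==2.4
scikit-learn==0.22.1
tqdm==4.43.0
"""


def parse_pins(text: str) -> Dict[str, str]:
    """Turn 'name==version' lines into a {name: version} mapping."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank or unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(
    pins: Dict[str, str], installed: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {name: (expected, found)}; found is None when the package is missing."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pins.items()
        if installed.get(name) != expected
    }


def installed_versions(names: Iterable[str]) -> Dict[str, str]:
    """Look up installed versions; packages that are not installed are omitted."""
    found = {}
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for the Python 3.7 pinned above

        for name in names:
            try:
                found[name] = pkg_resources.get_distribution(name).version
            except pkg_resources.DistributionNotFound:
                pass
        return found
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            pass
    return found


pins = parse_pins(PINNED)
# Hypothetical environment, used here so the example does not depend on your machine.
example_installed = {"lmdb": "0.98", "networkx": "2.5", "tqdm": "4.43.0"}
print(find_mismatches(pins, example_installed))
```

To check a real environment, pass `installed_versions(pins)` instead of `example_installed`.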
Please follow the commands below:
cd KnowDDI-codes
conda create -n KnowDDI_pytorch python=3.7
conda activate KnowDDI_pytorch
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch
pip install dgl-cu102==0.6.1
pip install -r requirements.txt
cd pytorch
The default parameters are the best ones found for the DrugBank dataset. To train and evaluate the model, run the following command.
python train.py -e Drugbank
To train and evaluate the model on the BioSNAP dataset, run the following command.
python train.py -e BioSNAP --dataset=BioSNAP --eval_every_iter=452 --weight_decay_rate=0.00001 --threshold=0.1 --lamda=0.5 --num_infer_layers=1 --num_dig_layers=3 --gsl_rel_emb_dim=24 --MLP_hidden_dim=24 --MLP_num_layers=3 --MLP_dropout=0.2
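Because the BioSNAP command above carries many overrides, it can be convenient to keep them in one place and assemble the command programmatically. The sketch below is an illustration only: `build_train_command` and `BIOSNAP_OVERRIDES` are hypothetical names, and the flag names and values are copied verbatim from the command above rather than from the `train.py` argument parser.

```python
import shlex
from typing import Dict, List

# Values are kept as strings so they are passed through exactly as written above.
BIOSNAP_OVERRIDES: Dict[str, str] = {
    "dataset": "BioSNAP",
    "eval_every_iter": "452",
    "weight_decay_rate": "0.00001",
    "threshold": "0.1",
    "lamda": "0.5",
    "num_infer_layers": "1",
    "num_dig_layers": "3",
    "gsl_rel_emb_dim": "24",
    "MLP_hidden_dim": "24",
    "MLP_num_layers": "3",
    "MLP_dropout": "0.2",
}


def build_train_command(experiment_name: str, overrides: Dict[str, str]) -> List[str]:
    """Assemble the argument list for train.py from an experiment name and overrides."""
    command = ["python", "train.py", "-e", experiment_name]
    command += [f"--{key}={value}" for key, value in overrides.items()]
    return command


command = build_train_command("BioSNAP", BIOSNAP_OVERRIDES)
# Quote each argument so the printed line can be pasted into a shell.
command_line = " ".join(shlex.quote(part) for part in command)
print(command_line)
```

The resulting list can be handed to `subprocess.run(command, check=True)` from inside the pytorch folder; passing an empty dictionary of overrides yields a run with the default parameters.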
We provide examples on two datasets with expected experimental results and running times.