CMPNN

Source code for our IJCAI 2020 paper "Communicative Representation Learning on Attributed Molecular Graphs".

The code builds on DMPNN. Many thanks to its authors for sharing their code!

Overview

Dataset	BBBP	Tox21	Sider	ClinTox	ESOL	FreeSolv
Task	Classification	Classification	Classification	Classification	Regression	Regression
RF	0.788	0.619	0.572	0.544	1.176	2.048
FNN	0.899	0.788	0.652	0.688	2.152	3.043
GCN	0.690	0.829	0.638	0.807	0.970	1.400
Weave	0.671	0.820	0.581	0.832	0.610	1.220
RGAT	0.875	0.821	0.621	0.841	0.731	1.338
N-Gram	0.890	0.842	-	0.870	0.718	1.371
MPNN	0.910±0.032	0.844±0.014	0.641±0.014	0.881±0.037	0.702±0.042	1.242±0.249
DMPNN w/o FP	0.913±0.026	0.845±0.015	0.646±0.016	0.894±0.027	0.665±0.052	1.157±0.105
CMPNN w/o FP (ours)	0.963±0.003	0.856±0.007	0.666±0.007	0.933±0.012	~~0.233±0.015~~ *0.547±0.011	0.819±0.147

Prediction results of CMPNN, its variants, and baselines on six chemical graph datasets. We used 5-fold cross-validation with random splits, repeated the experiments on each task five times, and report the mean and standard deviation of AUC (classification tasks) or RMSE (regression tasks). For the methodology and scaffold-split results, please refer to the paper.

* The authors note that there was a mistake in the previous version of the ESOL dataset. The result has been corrected. Thanks to Shengchao and Sorkun for their friendly reminders.
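The reporting scheme described above (random 5-fold splits, then mean and standard deviation over the scores) can be sketched as follows. This is a minimal illustration and not the repository's own evaluation code: the helper names are hypothetical, and the use of the sample standard deviation is an assumption, since the README does not say which estimator was used.

```python
import random
import statistics


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Randomly partition range(n_samples) into n_folds disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding spreads any remainder so fold sizes differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def format_result(scores):
    """Format scores as 'mean±std' with three decimals, as in the table."""
    mean = statistics.mean(scores)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(scores)
    return f"{mean:.3f}±{std:.3f}"
```

Each fold serves once as the held-out set while the remaining folds are used for training; `format_result` is then applied to the list of per-run AUC or RMSE values.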

Dependencies

cuda >= 8.0
cuDNN
RDKit
torch >= 1.2.0

Tip: RDKit can be installed quickly with conda install -c rdkit rdkit.
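Before training, it can help to confirm that the environment satisfies the requirements listed above. The following is a small sketch, not part of the repository; `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helper names. Versions are compared as integer tuples so that, for example, 1.10.0 is correctly treated as newer than 1.2.0.

```python
import re


def version_tuple(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    parts = [int(n) for n in numbers[:3]]
    # Pad short versions such as '1.2' to three components.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Report whether the listed dependencies are importable here."""
    report = {}
    try:
        import torch
        report["torch >= 1.2.0"] = meets_minimum(torch.__version__, "1.2.0")
        report["cuda available"] = torch.cuda.is_available()
    except ImportError:
        report["torch >= 1.2.0"] = False
    try:
        import rdkit  # noqa: F401
        report["rdkit"] = True
    except ImportError:
        report["rdkit"] = False
    return report
```

Calling `check_environment()` inside the target conda environment returns a dictionary of pass/fail flags that can be printed before launching a run.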

Training

To run the demo on the BBBP dataset:

python train_demo.py

To train a model, run:

python train.py --data_path <path> --dataset_type <type> --num_folds 5 --gpu 0 --epochs 30

where <path> is the path to a CSV file containing a dataset and <type> is either "classification" or "regression", depending on the task.
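When launching several runs from a script, the command above can be assembled programmatically. The sketch below uses only the flags shown in this README; `build_train_command` and `run_training` are hypothetical wrappers, and the defaults simply mirror the example command rather than any defaults defined in train.py.

```python
import subprocess
import sys

VALID_TYPES = ("classification", "regression")


def build_train_command(data_path, dataset_type, num_folds=5, gpu=0, epochs=30):
    """Build the argument list for train.py, rejecting unknown dataset types."""
    if dataset_type not in VALID_TYPES:
        raise ValueError(
            f"dataset_type must be one of {VALID_TYPES}, got {dataset_type!r}"
        )
    return [
        sys.executable, "train.py",
        "--data_path", str(data_path),
        "--dataset_type", dataset_type,
        "--num_folds", str(num_folds),
        "--gpu", str(gpu),
        "--epochs", str(epochs),
    ]


def run_training(data_path, dataset_type, **kwargs):
    """Run train.py from the repository root; raises if it exits non-zero."""
    command = build_train_command(data_path, dataset_type, **kwargs)
    return subprocess.run(command, check=True)
```

For example, `run_training("data/bbbp.csv", "classification")` would launch a 5-fold run, where `data/bbbp.csv` is a placeholder path. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.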

Predicting

python predict.py --data_path <path> --checkpoint_dir <dir>

where <dir> is the directory in which the model checkpoint(s) are saved and <path> is the path to a SMILES dataset.
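A prediction run fails late if either path is wrong, so a script may want to check both before calling predict.py. This is a hedged sketch: `build_predict_command` is a hypothetical helper, and it only verifies that the checkpoint directory is non-empty because the README does not specify the checkpoint file names.

```python
import sys
from pathlib import Path


def build_predict_command(data_path, checkpoint_dir):
    """Validate both paths, then build the argument list for predict.py."""
    data_path = Path(data_path)
    checkpoint_dir = Path(checkpoint_dir)
    if not data_path.is_file():
        raise FileNotFoundError(f"SMILES dataset not found: {data_path}")
    # Assumption: any entry in the directory counts as a saved checkpoint.
    if not checkpoint_dir.is_dir() or not any(checkpoint_dir.iterdir()):
        raise FileNotFoundError(f"no checkpoints found in: {checkpoint_dir}")
    return [
        sys.executable, "predict.py",
        "--data_path", str(data_path),
        "--checkpoint_dir", str(checkpoint_dir),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.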

Citation

Please cite the following paper if you use this code in your work.

@inproceedings{ijcai2020-392,
  title     = {Communicative Representation Learning on Attributed Molecular Graphs},
  author    = {Song, Ying and Zheng, Shuangjia and Niu, Zhangming and Fu, Zhang-hua and Lu, Yutong and Yang, Yuedong},
  booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on
               Artificial Intelligence, {IJCAI-20}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Christian Bessiere},
  pages     = {2831--2838},
  year      = {2020},
  month     = {7},
  note      = {Main track},
  doi       = {10.24963/ijcai.2020/392},
  url       = {https://doi.org/10.24963/ijcai.2020/392},
}