Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)

Official implementation of our Junction Tree Variational Autoencoder: https://arxiv.org/abs/1802.04364

Update

We have made architecture improvements to JT-VAE. We recommend checking out our new repository at https://github.com/wengong-jin/hgraph2graph/. It contains a molecular language model pre-trained on ChEMBL (1.8 million compounds) and scripts for property-guided molecule generation. All scripts are written in Python 3.7 and PyTorch.

Accelerated Version

We have accelerated our code! The new code is in fast_jtnn/, and the VAE training script is in fast_molvae/. Please refer to fast_molvae/README.md for details.

Requirements

  • Linux (we have only tested on Ubuntu)
  • RDKit (version >= 2017.09)
  • Python (version == 2.7)
  • PyTorch (version >= 0.2)

To install RDKit, please follow the instructions at http://www.rdkit.org/docs/Install.html

We highly recommend using conda for package management.
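
Once everything is installed, a quick way to verify the RDKit setup (a minimal check, not part of this repo) is to round-trip a SMILES string:

    from rdkit import Chem

    # Parse a SMILES string and write it back in canonical form.
    mol = Chem.MolFromSmiles('CC(=O)Oc1ccccc1C(=O)O')  # aspirin
    print(Chem.MolToSmiles(mol))

If this prints a canonical SMILES string without errors, RDKit is working.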

Quick Start

The following directories contain the most up-to-date implementation of our model:

  • fast_jtnn/ contains the model implementation.
  • fast_molvae/ contains the VAE training code; a short sampling sketch follows this list. Please refer to fast_molvae/README.md for details.
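
For a quick orientation, here is a minimal sampling sketch modeled on fast_molvae/sample.py. The hyperparameter values (hidden size 450, latent size 56, tree depth 20, graph depth 3) are assumed to match the fast_molvae defaults, and the vocab/checkpoint paths are placeholders; consult fast_molvae/README.md for the exact commands.

    import torch
    from fast_jtnn import Vocab, JTNNVAE

    # Placeholder paths; substitute your own vocabulary and trained checkpoint.
    vocab = Vocab([line.strip() for line in open('vocab.txt')])

    # Arguments: vocab, hidden size, latent size, tree depth, graph depth
    # (values assume the fast_molvae defaults; verify against your run).
    model = JTNNVAE(vocab, 450, 56, 20, 3)
    model.load_state_dict(torch.load('model.ckpt'))
    model = model.cuda()  # the implementation assumes a CUDA device

    # Decode one molecule from a latent vector sampled from the prior.
    print(model.sample_prior())

Decoding first generates a junction tree of chemical substructures and then assembles them into a molecular graph, which is why sampled molecules are chemically valid by construction.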

The following directories provide scripts for the experiments in our original ICML paper:

  • bo/ includes scripts for Bayesian optimization experiments. Please read bo/README.md for details.
  • molvae/ includes scripts for training our VAE model only. Please read molvae/README.md for details.
  • molopt/ includes scripts for jointly training our VAE and property predictors. Please read molopt/README.md for details.
  • jtnn/ contains the original model implementation.

Contact

Wengong Jin (wengong@csail.mit.edu)