This repository contains our (re)implementation of "Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data" (DenseFS).
If you find DenseFS useful, please cite our paper (BibTeX entry at the end of this document).
This implementation utilises libmolgrid for molecular gridding.
This code was tested in Python 3.7 with PyTorch 1.4.
A YAML file listing all requirements is provided. The environment can be set up with conda:
conda env create -f DenseFS-env.yml
conda activate DenseFS-env
We have implemented two CNN architectures, which can be found in models.py. Select one when running the scripts with --model / -m:
Ragoza - This refers to the three-layer CNN architecture described in Ragoza et al., 2017.
Imrie - This refers to the DenseNet-based CNN architecture described in Imrie et al., 2018.
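As a rough illustration of how a --model / -m flag of this kind can be mapped to an architecture, the sketch below uses argparse with a small registry. The class names RagozaNet and ImrieNet, the MODEL_REGISTRY dictionary, and the select_model helper are hypothetical placeholders introduced for this example; they are not the contents of models.py, and the actual scripts may handle the flag differently.

```python
import argparse


# Hypothetical stand-ins for the two architectures. The real classes live in
# models.py; their names and constructor arguments may differ.
class RagozaNet:
    """Placeholder for the three-layer CNN (Ragoza et al., 2017)."""


class ImrieNet:
    """Placeholder for the DenseNet-based CNN (Imrie et al., 2018)."""


# Assumed mapping from the --model / -m value to a model class.
MODEL_REGISTRY = {"Ragoza": RagozaNet, "Imrie": ImrieNet}


def build_parser():
    """Build a parser that only accepts the two documented model names."""
    parser = argparse.ArgumentParser(description="Model selection sketch")
    parser.add_argument(
        "-m", "--model",
        required=True,
        choices=sorted(MODEL_REGISTRY),
        help="CNN architecture to use",
    )
    return parser


def select_model(argv):
    """Parse the argument list and instantiate the requested placeholder model."""
    args = build_parser().parse_args(argv)
    return MODEL_REGISTRY[args.model]()


if __name__ == "__main__":
    model = select_model(["-m", "Imrie"])
    print(type(model).__name__)
```

Restricting the flag with `choices` means an unrecognised name is rejected by argparse before any model is constructed.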
To train a model:
python CNN_train.py -m Imrie --train_file ./data/small.types -d ./data/structs/ -i 500 -b 32 -s 42 --display_iter 50 --save_iter 500 --anneal_iter 100 --rotate --translate 2.0
To continue training (for example, fine-tuning) from saved weights, pass --weights:
python CNN_train.py -m Imrie --train_file ./data/small.types -d ./data/structs/ -i 250 -b 32 -s 42 --display_iter 50 --save_iter 250 --anneal_iter 100 --weights model.iter-500 --base_lr 0.005 --rotate --translate 2.0
To test a trained model:
python CNN_test.py -m Imrie --weights model.iter-250 --test_file ./data/small.types -d ./data/structs/ -b 32 -s 42 --display_iter 50 --rotate --num_rotate 4
For questions or problems, please submit a GitHub issue or contact Fergus Imrie (imrie@stats.ox.ac.uk).
@Article{Imrie2018,
  author={Imrie, Fergus and Bradley, Anthony R. and van der Schaar, Mihaela and Deane, Charlotte M.},
  title={Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data},
  journal={Journal of Chemical Information and Modeling},
  year={2018},
  month={Oct},
  day={1},
  publisher={American Chemical Society},
  issn={1549-9596},
  doi={10.1021/acs.jcim.8b00350},
  url={https://doi.org/10.1021/acs.jcim.8b00350}
}