Interaction-Based Inductive Bias in Graph Neural Networks: Enhancing Protein-Ligand Binding Affinity Predictions from 3D Structures
- Implementations of other baselines can be found in GIGN.
- This repository contains the source code for PLA prediction. For structure-based virtual screening (SBVS), please refer to our dedicated repository at EHIGN_SBVS on GitHub.
All data used in this paper are publicly available at the following locations:
The preprocessed data can be downloaded from Graphs.
dgl==0.9.0
networkx==2.5
numpy==1.19.2
pandas==1.1.5
pymol==0.1.0
rdkit==2022.3.5
scikit_learn==1.1.2
scipy==1.5.2
torch==1.10.2
tqdm==4.63.0
openbabel==3.3.1 (conda install -c conda-forge openbabel)
Alternatively, install the environment using the provided YAML file at ./environment.yaml.
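To confirm that an existing environment matches the pins above, a minimal sketch along the following lines may help. It assumes Python 3.8 or later (for `importlib.metadata`). The helper name `find_mismatches` and the distribution names in `PINS` (for example `scikit-learn` for `scikit_learn`) are illustrative assumptions and not part of this repository. Packages installed through conda, such as openbabel and pymol, may not expose metadata under their import names, so they are left out here.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above. The distribution names are
# assumptions: package metadata may use a different name than the import name.
PINS = {
    "dgl": "0.9.0",
    "networkx": "2.5",
    "numpy": "1.19.2",
    "pandas": "1.1.5",
    "rdkit": "2022.3.5",
    "scikit-learn": "1.1.2",
    "scipy": "1.5.2",
    "torch": "1.10.2",
    "tqdm": "4.63.0",
}


def find_mismatches(pins):
    """Return {name: (expected, found)} for every pin that is not satisfied.

    `found` is None when no installed distribution has that name.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            # Drop local build suffixes such as "+cu113" before comparing.
            found = version(name).split("+")[0]
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed pin is met exactly. A reported mismatch is only a hint, since a nearby version may still work.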
- ./data: Contains information about the various datasets. Download and organize the preprocessed datasets as described.
- ./config: Parameters used in EHIGN.
- ./log: Logger.
- ./model: Contains model checkpoints and training records.
- Scripts and Implementations: Various Python files implementing models, preprocessing, training, and testing.
- Download the preprocessed datasets and organize them in the ./data folder.
- Run python train.py.
- Run python test.py (modify file paths in the source code if necessary).
- Run a demo using provided examples:
python preprocess_complex.py
python graph_constructor.py
python train_example.py
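Because each demo step depends on the output of the previous one, it can be convenient to run the scripts in order and stop at the first failure. The following is a minimal sketch of such a runner. The helper name `run_scripts` is hypothetical and not part of this repository, and the sketch assumes the scripts take no command-line arguments, as in the commands above.

```python
import subprocess
import sys


def run_scripts(scripts, python=sys.executable):
    """Run each script in order with the given interpreter.

    Raises subprocess.CalledProcessError at the first script that exits
    with a non-zero status, so later steps never run on incomplete outputs.
    """
    for script in scripts:
        print(f"Running {script} ...")
        subprocess.run([python, script], check=True)
```

Calling `run_scripts(["preprocess_complex.py", "graph_constructor.py", "train_example.py"])` from the repository root would reproduce the demo sequence listed above.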
- Organize the data like:
  -data
    -external_test
      -pdb_id
        -pdb_id_ligand.mol2
        -pdb_id_protein.pdb
- Execute the following commands:
python preprocess_complex.py
python graph_constructor.py
python test.py
- (Modify file paths in the source code if necessary)
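Before preprocessing, it may be worth checking that every complex folder follows the layout described above. The sketch below assumes that `pdb_id` in the layout is a placeholder for the actual folder name (for example, a folder `1abc` holding `1abc_ligand.mol2` and `1abc_protein.pdb`). The helper name `find_incomplete_complexes` is hypothetical and not part of this repository.

```python
from pathlib import Path


def find_incomplete_complexes(root):
    """Return {pdb_id: [missing file names]} for complex folders under root.

    Assumes each subfolder is named after its PDB ID and should contain
    <pdb_id>_ligand.mol2 and <pdb_id>_protein.pdb.
    """
    incomplete = {}
    for complex_dir in sorted(Path(root).iterdir()):
        if not complex_dir.is_dir():
            continue  # ignore stray files next to the complex folders
        pdb_id = complex_dir.name
        expected = [f"{pdb_id}_ligand.mol2", f"{pdb_id}_protein.pdb"]
        missing = [name for name in expected if not (complex_dir / name).is_file()]
        if missing:
            incomplete[pdb_id] = missing
    return incomplete
```

For instance, `find_incomplete_complexes("data/external_test")` returns an empty dictionary when every folder contains both files.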
- Use datasets found in the ./cold_start_data folder.
- Execute scripts train_random.py, train_scaffold.py, and train_sequence.py if the original training set has been processed.