deepmodeling/Uni-Mol

Longer than expected running time for Binding Pose Prediction

tsa87 opened this issue · 3 comments

tsa87 commented

The Uni-Mol paper reports an average of 0.2 seconds per ligand for binding pose prediction:

> Efficiency benchmark. We compare Uni-Mol binding pose prediction with popular docking tools in efficiency. The baseline results are taken from EquiBind [99] paper. And Uni-Mol binding pose prediction is run on a single V100 GPU. For each molecule, Uni-Mol is run with 10 different initial conformations, and the total time cost is reported. As shown in Table 20, Uni-Mol is significantly faster than traditional docking tools, about 250x faster.


I ran the docking pose prediction on the provided test.lmdb, and it took much longer than expected on a single RTX 3090: about 3 seconds per ligand on average. Was anything done to improve the run time?

This step was relatively fast (~36 seconds):

data_path="./protein_ligand_binding_pose_prediction"  # replace with your data path
results_path="./infer_pose"  # replace with your results path
weight_path="./save_pose/checkpoint.pt"
batch_size=8
dist_threshold=8.0
recycling=3

python ./unimol/infer.py --user-dir ./unimol $data_path --valid-subset test \
       --results-path $results_path \
       --num-workers 8 --ddp-backend=c10d --batch-size $batch_size \
       --task docking_pose --loss docking_pose --arch docking_pose \
       --path $weight_path \
       --fp16 --fp16-init-scale 4 --fp16-scale-window 256 \
       --dist-threshold $dist_threshold --recycling $recycling \
       --log-interval 50 --log-format simple

But this step took around 15 minutes for 285 ligand/pocket pairs:

nthreads=20  # number of threads
predict_file="./infer_pose/save_pose_test.out.pkl"  # path to your inference output file
reference_file="./protein_ligand_binding_pose_prediction/test.lmdb"  # path to your reference file
output_path="./protein_ligand_binding_pose_prediction"  # directory for docking results

python ./unimol/utils/docking.py --nthreads $nthreads --predict-file $predict_file --reference-file $reference_file --output-path $output_path
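
For reference, the per-ligand cost of each stage can be worked out from the figures quoted in this issue. This is only a rough sketch: the inputs are the approximate timings reported above (~36 seconds and ~15 minutes for 285 pairs), measured on an RTX 3090 rather than the paper's V100, and the variable names are illustrative.

```python
# Approximate figures quoted in this issue (not re-measured).
n_pairs = 285                # ligand/pocket pairs in the test set
inference_seconds = 36       # infer.py stage, "~36 seconds"
docking_seconds = 15 * 60    # docking.py stage, "around 15 mins"

inference_per_ligand = inference_seconds / n_pairs
docking_per_ligand = docking_seconds / n_pairs
total_per_ligand = inference_per_ligand + docking_per_ligand

print(f"infer.py:   {inference_per_ligand:.2f} s/ligand")   # 0.13
print(f"docking.py: {docking_per_ligand:.2f} s/ligand")     # 3.16
print(f"total:      {total_per_ligand:.2f} s/ligand")       # 3.28
```

On these numbers, almost all of the roughly 3 seconds per ligand comes from the docking.py stage, while the infer.py stage alone is on the same order as the 0.2 seconds reported in the paper.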

The time in the table refers to the inference time in the binding pose prediction task.

tsa87 commented

Does the time reported for the binding pose prediction task include the time for docking.py? I believe calling docking.py is required to compute the atom coordinates from the predicted intermolecular distances.

The reported time covers only model inference for this task.
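
To illustrate why turning predicted distances into coordinates is a separate, iterative step, here is a toy sketch. It is not the algorithm used in docking.py; it only assumes the general idea of fitting a position to target distances by gradient descent. The function `recover_position`, the four "pocket" coordinates, and the single ligand atom are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def recover_position(anchors, target_dists, start, lr=0.05, steps=2000):
    """Minimize sum_i (|x - a_i| - d_i)^2 over x with plain gradient descent."""
    x = list(start)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for anchor, d in zip(anchors, target_dists):
            diff = [xi - ai for xi, ai in zip(x, anchor)]
            r = math.sqrt(sum(c * c for c in diff))
            if r == 0.0:
                continue  # gradient is undefined exactly on an anchor; skip it
            scale = 2.0 * (r - d) / r
            for k in range(3):
                grad[k] += scale * diff[k]
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


# Hypothetical pocket atoms and one ligand atom (coordinates in angstroms).
pocket = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
true_pos = (1.0, 2.0, 1.5)

# Stand-in for model output: exact distances here; real predictions would be noisy.
predicted_dists = [distance(true_pos, p) for p in pocket]

recovered = recover_position(pocket, predicted_dists, start=(1.0, 1.0, 1.0))
print([round(c, 3) for c in recovered])
```

Even this single-atom toy needs many optimization steps. A real ligand has many atoms and several initial conformations, so it is plausible that a coordinate-fitting stage would take much longer than the model's forward pass.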