kliment-olechnovic/ftdmp

General Questions To Test Runs

Closed this issue · 2 comments

I have some general questions concerning the runtime. I'm just looking for a rough estimate.

When docking two globular proteins of 400 aa each, how long will it take with default settings (based on run.bash)? How does the size of the proteins influence the running time?

Is FTDMP docking optimised for GPUs? How much faster will it run? Will it run faster if I provide more memory?

Similar questions for FTDMP scoring.

Thanks :)

Hi,

I will try to answer, but my running time estimates will be very approximate, sorry.

Docking in FTDMP is not done on GPU, but it can be done on multiple CPUs (either on one machine, or on a cluster via slurm). Docking two 400 aa proteins on 8 CPUs can take up to 1 or 2 hours. Docking time and memory consumption depend non-linearly (i.e. worse than linearly) on the size of the proteins. I do not have exact numbers, sorry.

Scoring is a different matter: it runs several algorithms, all of which can run on CPU, but some (the ones that apply neural networks) will use a GPU if one is available. Scoring is also adapted to run on a cluster via slurm. Overall, the scoring stage can take about 1 hour on 8 CPUs. Scoring time and memory consumption are much less dependent on the size of the protein complex. The most costly step in scoring is the optional relaxation with OpenMM before the final rescoring (a GPU is very useful for OpenMM).

So, FTDMP can be quite slow, but it can be run on a multi-CPU cluster: see the "--parallel-" and "--sbatch-" parameters in the examples (we usually run FTDMP on a cluster using 64 CPUs in the docking stage and 128 CPUs in the scoring stage).
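As a rough back-of-the-envelope illustration of what more CPUs might buy, the Python sketch below rescales the approximate 8-CPU figures quoted in this thread to the 64 and 128 CPU counts mentioned above. It assumes perfectly linear scaling with the number of CPUs, which is an assumption and not a measured property of FTDMP; the function name `ideal_wall_time_hours` is hypothetical. Real runs will likely be slower than these optimistic figures because of scheduling overhead and non-parallel steps.

```python
def ideal_wall_time_hours(reference_hours, reference_cpus, target_cpus):
    """Optimistic wall-time estimate, ASSUMING perfectly linear CPU scaling.

    This is not an FTDMP benchmark: it only rescales a reference timing.
    """
    if reference_cpus <= 0 or target_cpus <= 0:
        raise ValueError("CPU counts must be positive")
    return reference_hours * reference_cpus / target_cpus


# Approximate figures quoted in this thread (upper-end estimates on 8 CPUs).
docking = ideal_wall_time_hours(2.0, reference_cpus=8, target_cpus=64)
scoring = ideal_wall_time_hours(1.0, reference_cpus=8, target_cpus=128)

print(f"docking on 64 CPUs: roughly {docking * 60:.0f} min under ideal scaling")
print(f"scoring on 128 CPUs: roughly {scoring * 60:.0f} min under ideal scaling")
```

These numbers are only as good as the input estimates, and they do not account for the non-linear dependence of docking time on protein size.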

Scoring can be run separately, without docking, if you already have multimeric models generated with other methods (e.g. deep learning-based methods like AlphaFold or RoseTTAFold). The scoring time will depend mostly on the number of input structures.

Cheers,
Kliment

Thanks a lot, this already helps me estimate the runtime and resources needed :)