Question: How can I obtain ~the same speed as the server?

crisdarbellay opened this issue 8 months ago · 3 comments

Hello,

First of all thank you for sharing your work!

I installed ColabFold locally and am using colabfold_search instead of the server (I have ~7000 proteins to predict). It runs, and I allocated the required resources (1 TB of RAM, a high-performance CPU, etc.), but I still can't generate the .a3m files at a reasonable speed: it takes ~1 h per protein. I used this tutorial https://qiita.com/Ag_smith/items/bfcf94e701f1e6a2aa90 as an example for my code.

Could I see more examples of how you run colabfold_search, especially for a folder full of FASTA files?

OS: Ubuntu 22.04

Answer 1 · 2024-05-07T01:31:06.000Z

Same problem!

Answer 2 · 2024-05-07T05:31:33.000Z

Please check the following computational environment:

Is your database placed on a local SSD, not an HDD? If you are using an HDD, the search will be extremely slow. Likewise, if the database is located on a network drive (such as NFS), the search will be slowed down.
vmtouch is useful for keeping the database in RAM. --db-load-mode 2 is also useful, but it only takes effect from the second run onward.
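
For a folder full of FASTA files, one option is to merge them into a single multi-FASTA query so that the database is read once for the whole batch rather than once per protein. The following is only a minimal sketch in Python: the helper names `merge_fasta` and `build_search_command`, the file extensions, and the thread count are assumptions made for illustration, and it assumes the usual positional order `colabfold_search <query> <db dir> <output dir>`. Please check `colabfold_search --help` for your installed version before relying on it.

```python
from pathlib import Path

# Hypothetical set of extensions treated as FASTA input; adjust to your files.
FASTA_SUFFIXES = {".fasta", ".fa", ".faa"}


def merge_fasta(fasta_dir: Path, merged: Path) -> int:
    """Concatenate all FASTA files in fasta_dir into one file.

    Returns the number of records (header lines) written.
    """
    n_records = 0
    with merged.open("w") as out:
        for path in sorted(fasta_dir.iterdir()):
            if path.suffix.lower() not in FASTA_SUFFIXES:
                continue
            # Skip the output file itself if it lives in the same folder.
            if path.resolve() == merged.resolve():
                continue
            for line in path.read_text().splitlines():
                line = line.strip()
                if not line:
                    continue  # drop blank lines between records
                if line.startswith(">"):
                    n_records += 1
                out.write(line + "\n")
    return n_records


def build_search_command(query, db_dir, out_dir, threads=32):
    """Build (but do not run) a colabfold_search command line.

    Assumption: positional arguments are query, database directory and
    output directory; --db-load-mode 2 is the option mentioned above.
    """
    return [
        "colabfold_search",
        str(query),
        str(db_dir),
        str(out_dir),
        "--db-load-mode", "2",
        "--threads", str(threads),
    ]
```

A possible use would be to call `merge_fasta(Path("fastas"), Path("all_queries.fasta"))`, then pass the list returned by `build_search_command(...)` to `subprocess.run(cmd, check=True)`. The paths here are placeholders. Whether this recovers server-like speed still depends on the storage and RAM points above.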

Answer 3 · 2024-05-17T09:07:07.000Z

Hi! I need to fold around 1k dimers and I was wondering whether it is faster to run MMseqs2 locally (colabfold_search) or to use the MSA server (colabfold_batch). If so, how much faster? Also, many of these dimers share the same monomer. Is there a way to speed things up knowing that?