HeliXonProtein/OmegaFold

A100 VS A40 performance issue

truatpasteurdotfr opened this issue · 2 comments

Hi

I am wondering if you could give me some idea of what explains the performance difference between the A100 (40GB) and the A40 (48GB)?
The "toy system" used as a benchmark: https://rest.uniprot.org/uniprotkb/F0NHT7.fasta

Running on the A100 (40GB) and the A40 (48GB):
omegafold --subbatch_size 768 --weights_file ~/APPASCRATCH/OmegaFold/model.pt ~/F0NHT7.fasta ~/APPASCRATCH/OmegaFold-48G
omegafold --subbatch_size 768 --weights_file ~/APPASCRATCH/OmegaFold/model.pt ~/F0NHT7.fasta ~/APPASCRATCH/OmegaFold-40G
On the A100 (80GB):
omegafold --weights_file ~/APPASCRATCH/OmegaFold/model.pt ~/F0NHT7.fasta ~/APPASCRATCH/OmegaFold-80G

The A100s are A100-SXM4-40GB and A100-SXM4-80GB, while the A40 is PCIe-based.

INFO:root:Loading weights from /pool/omegafold/weights_files/20220921/model.pt
INFO:root:Constructing OmegaFold
INFO:root:Reading /pool/omegafold/F0NHT7.fasta
INFO:root:Predicting 1th chain in /pool/omegafold/F0NHT7.fasta
INFO:root:1229 residues in this chain.
INFO:root:Finished prediction in 2200.10 seconds.

That log is from the A40, which takes ~2200 seconds, while the A100-40G completes the task in ~990 seconds and the A100-80G finishes in ~840 seconds.
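
To put numbers on the gap, here is a small sketch that turns the reported timings into ratios. It assumes the 2200.10 s log belongs to the A40 run and treats the "~990" and "~840" figures as exact; the dictionary labels and the `slowdown_vs` helper are just names chosen for this illustration.

```python
# Wall-clock times reported in this thread, in seconds.
# Assumption: the 2200.10 s log is from the A40 run.
timings = {
    "A40 (48GB, PCIe)": 2200.10,
    "A100-SXM4-40GB": 990.0,  # approximate ("~990")
    "A100-SXM4-80GB": 840.0,  # approximate ("~840")
}


def slowdown_vs(baseline, times):
    """Return each run's time divided by the baseline run's time."""
    base = times[baseline]
    return {name: t / base for name, t in times.items()}


ratios = slowdown_vs("A100-SXM4-80GB", timings)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the A100-80GB time")
```

Under those assumptions the A40 takes about 2.6 times as long as the A100-80G and about 2.2 times as long as the A100-40G. Note that the 80GB run was launched without `--subbatch_size 768`, so that comparison is not strictly like-for-like.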

Hi,

Sorry, we do not have access to A40 GPUs, so we don't think we can answer this.

Is OmegaFold using double precision or just single precision on the GPU? That might explain the huge penalty on the A40 vs. the A100.
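
One way to start checking this from the user side might be to look at which dtypes the weights file stores. The sketch below is not part of OmegaFold: `summarize_dtypes` and `summarize_checkpoint` are hypothetical helper names, and it assumes `model.pt` is a plain PyTorch state dict (a mapping of parameter names to tensors), which has not been confirmed in this thread.

```python
from collections import Counter


def summarize_dtypes(state_dict):
    """Count entries per dtype in a mapping of name -> tensor-like object.

    Only the `.dtype` attribute of each value is used; values without
    one (for example nested metadata) are counted under "non-tensor".
    """
    counts = Counter()
    for value in state_dict.values():
        dtype = getattr(value, "dtype", None)
        counts[str(dtype) if dtype is not None else "non-tensor"] += 1
    return dict(counts)


def summarize_checkpoint(path):
    """Load a checkpoint on the CPU and summarize its dtypes.

    Assumption: `path` points to a plain PyTorch state dict.
    """
    import torch  # imported lazily so summarize_dtypes works without it

    state_dict = torch.load(path, map_location="cpu")
    return summarize_dtypes(state_dict)
```

Calling `summarize_checkpoint` on the path passed to `--weights_file` would show whether the parameters are stored as `torch.float64`, `torch.float32`, or something else. This only reports the stored dtype; the precision actually used during inference can differ, so it is a first hint rather than a definitive answer.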