Question about computing resources and batch size
jasperhyp opened this issue · 4 comments
Hi,
Thanks for sharing the code. I noticed in your run_pretrain.sh that the batch size for protein-GO and protein MLM is 8, while the batch size for GO-GO is 64. Meanwhile, the number of negative samples per positive sample is 128, or 256 for GO-GO.
(1) Does this mean that in each GO-GO pass, at most 64*2 + 64*256 samples of length at most 128 are fed into the GO encoder in one batch? (My back-of-envelope count is sketched below, after question (4), in case I am misreading the script.)
(2) How many V100s did you use for this pretraining?
Also, I noticed that you didn't permute proteins for protein-GO relations.
(3) Is this due to computing resource limits (i.e., 8*128 is simply too many protein sequences to encode at once)?
(4) Did you experiment with a smaller number of negative samples while also permuting proteins?
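For reference, here is the back-of-envelope count behind (1), just to show how I arrived at those numbers (this is purely my reading of the script, so it may well be off):

```python
# my rough reading of run_pretrain.sh for one GO-GO pass (may be wrong)
go_go_batch_size = 64          # positive GO-GO pairs per batch
negatives_per_positive = 256   # negative samples per positive pair for GO-GO
max_go_length = 128            # max token length of a GO description

total_sequences = go_go_batch_size * 2 + go_go_batch_size * negatives_per_positive
print(f"{total_sequences} sequences of length <= {max_go_length}")  # 16512 into the GO encoder?
```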
Thanks in advance!
Hi,
(1) In one batch, 64 positive samples and 64 * 128 * 2 negative samples are input into the GO encoder (see the simplified sketch at the end of this comment).
(2) We used 4 V100s to pretrain the model.
(3) Due to limited computing resources, we didn't permute proteins for protein-GO relations.
(4) The number of negative samples is an important hyperparameter, but in this work we didn't search for its optimal value.
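In case it helps, the sampling roughly follows the simplified sketch below (illustrative toy code with made-up identifiers and relation names, not the exact implementation in the repo): GO-GO triples are corrupted on both the head and tail sides, while protein-GO triples corrupt only the GO side, so protein sequences are never permuted.

```python
import random

def make_negatives(triples, candidate_tails, num_neg, candidate_heads=None):
    """Create negatives by replacing the tail (and optionally the head) of each positive triple."""
    negatives = []
    for head, rel, tail in triples:
        for _ in range(num_neg):
            negatives.append((head, rel, random.choice(candidate_tails)))
            if candidate_heads is not None:
                negatives.append((random.choice(candidate_heads), rel, tail))
    return negatives

# toy identifiers, only for illustration
go_ids = [f"GO:{i:07d}" for i in range(1000)]
protein_ids = [f"P{i:05d}" for i in range(1000)]

go_go_batch = [(random.choice(go_ids), "is_a", random.choice(go_ids)) for _ in range(64)]
protein_go_batch = [(random.choice(protein_ids), "enables", random.choice(go_ids)) for _ in range(8)]

# GO-GO: corrupt head and tail -> 64 positives + 64 * 128 * 2 negatives per batch
go_go_negatives = make_negatives(go_go_batch, go_ids, num_neg=128, candidate_heads=go_ids)

# protein-GO: corrupt only the GO side -> proteins are never replaced
protein_go_negatives = make_negatives(protein_go_batch, go_ids, num_neg=128)

print(len(go_go_batch) + len(go_go_negatives))            # 16448 samples for the GO encoder
print(len(protein_go_batch) + len(protein_go_negatives))  # 1032 protein-GO samples
```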
Thank you for providing the information! Just one more question: with that GPU budget and training schedule (I saw in run_pretrain.sh that the max step is 60000), how long did it take you to train the model? Please don't feel pressured to answer; I totally understand, as it was a long while ago. But in case you remember, a rough estimate would be very helpful!
It took about a week to pre-train the model.
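For a rough sense of what that implies per step (assuming about 7 days of wall-clock time for the full 60000 steps):

```python
# back-of-envelope: ~1 week for 60000 steps on 4 V100s
seconds_per_week = 7 * 24 * 3600
max_steps = 60000
print(seconds_per_week / max_steps)  # ~10 seconds per optimization step
```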
Thank you for your kind help!