Question about computing resources and batch size
jasperhyp opened this issue · 4 comments
Hi,
Thanks for sharing the code. I noticed in your run_pretrain.sh that the batch size for protein-GO and protein MLM is 8, while the batch size for GO-GO is 64. Meanwhile, the number of negative samples per positive sample is 128, or 256 for GO-GO.
(1) Does this mean that in each GO-GO pass, at most 64*2 + 64*256 samples of length at most 128 are fed into the GO encoder in one batch? (My back-of-envelope count is sketched below, after question (4), in case I am misreading the script.)
(2) How many V100s did you use for this pretraining?
Also, I noticed that you didn't permute proteins for protein-GO relations.
(3) Is this due to computing resource limits (i.e., 8*128 is simply too many protein sequences to encode at once)?
(4) Did you experiment with a smaller number of negative samples while also permuting proteins?
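For reference, here is the back-of-envelope count behind (1), just to show how I arrived at those numbers (this is purely my reading of the script, so it may well be off):

```python
# my rough reading of run_pretrain.sh for one GO-GO pass (may be wrong)
go_go_batch_size = 64          # positive GO-GO pairs per batch
negatives_per_positive = 256   # negative samples per positive pair for GO-GO
max_go_length = 128            # max token length of a GO description

total_sequences = go_go_batch_size * 2 + go_go_batch_size * negatives_per_positive
print(f"{total_sequences} sequences of length <= {max_go_length}")  # 16512 into the GO encoder?
```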
Thanks in advance!
Hi,
(1) In one batch, 64 positive samples and 64 * 128 * 2 negative samples are input into the GO encoder (see the simplified sketch at the end of this comment).
(2) We used 4 V100s to pretrain the model.
(3) Due to limited computing resources, we didn't permute proteins for protein-GO relations.
(4) The number of negative samples is an important hyperparameter, but in this work we didn't search for its optimal value.
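In case it helps, the sampling roughly follows the simplified sketch below (illustrative toy code with made-up identifiers and relation names, not the exact implementation in the repo): GO-GO triples are corrupted on both the head and tail sides, while protein-GO triples corrupt only the GO side, so protein sequences are never permuted.

```python
import random

def make_negatives(triples, candidate_tails, num_neg, candidate_heads=None):
    """Create negatives by replacing the tail (and optionally the head) of each positive triple."""
    negatives = []
    for head, rel, tail in triples:
        for _ in range(num_neg):
            negatives.append((head, rel, random.choice(candidate_tails)))
            if candidate_heads is not None:
                negatives.append((random.choice(candidate_heads), rel, tail))
    return negatives

# toy identifiers, only for illustration
go_ids = [f"GO:{i:07d}" for i in range(1000)]
protein_ids = [f"P{i:05d}" for i in range(1000)]

go_go_batch = [(random.choice(go_ids), "is_a", random.choice(go_ids)) for _ in range(64)]
protein_go_batch = [(random.choice(protein_ids), "enables", random.choice(go_ids)) for _ in range(8)]

# GO-GO: corrupt head and tail -> 64 positives + 64 * 128 * 2 negatives per batch
go_go_negatives = make_negatives(go_go_batch, go_ids, num_neg=128, candidate_heads=go_ids)

# protein-GO: corrupt only the GO side -> proteins are never replaced
protein_go_negatives = make_negatives(protein_go_batch, go_ids, num_neg=128)

print(len(go_go_batch) + len(go_go_negatives))            # 16448 samples for the GO encoder
print(len(protein_go_batch) + len(protein_go_negatives))  # 1032 protein-GO samples
```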
Thank you for providing the information! Just one more question: with that GPU budget and training schedule (I saw in run_pretrain.sh that the max step is 60000), how long did it take you to train the model? Please don't feel pressured to answer; I totally understand, as it was a long while ago. But in case you remember, a rough estimate would be very helpful!
It took about a week to pre-train the model.
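For a rough sense of what that implies per step (assuming about 7 days of wall-clock time for the full 60000 steps):

```python
# back-of-envelope: ~1 week for 60000 steps on 4 V100s
seconds_per_week = 7 * 24 * 3600
max_steps = 60000
print(seconds_per_week / max_steps)  # ~10 seconds per optimization step
```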
Thank you for your kind help!