batch sizes
dlnp2 opened this issue · 2 comments
Thank you very much for your great repo! This is really helpful.
I have a question: in your paper, specifically in Section "A.3 Training Details", you write
specific batch sizes for each model at each protein length are available in our repository
However, I could not find these. Could you tell me where to find this information in the repo?
Thank you.
The PyTorch version of the code removes this training detail. In TensorFlow, we used the bucket_by_sequence_length
function to select a batch size for each protein length and model, because 1) TensorFlow provides this function out of the box and 2) TensorFlow makes gradient accumulation more difficult to implement.
The PyTorch code instead implements gradient accumulation, which lets you control the effective batch size in a different way. This is simpler to implement in PyTorch and also gives a more easily reproducible setup. If you do want the batch sizes used by the TensorFlow code, you can look at each individual model, each of which has a get_optimal_batch_sizes() function.
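To illustrate the idea behind length-dependent batch sizes: longer sequences use more memory per example, so fewer of them fit in a batch. The actual get_optimal_batch_sizes() implementations in the TensorFlow models may use tuned per-model tables rather than a formula, so treat this as a sketch only; the token_budget parameter is a made-up stand-in for per-model memory limits.

```python
def batch_size_for_length(seq_len, token_budget=4096):
    """Sketch of length-based batch sizing: keep roughly a fixed
    number of tokens (seq_len * batch_size) per batch.

    token_budget is a hypothetical parameter, not from the repo.
    """
    return max(1, token_budget // seq_len)
```

For example, with a 4096-token budget, length-128 proteins get batches of 32 while length-1024 proteins get batches of 4, and very long sequences fall back to a batch of 1.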
function. If you're planning on using the pytorch code, simply pick a desired final batch size, distribute across as many GPUs that you plan to use, and then increase gradient accumulation steps until you're able to fit the batch in memory.