mlpc-ucsd/CoaT

About AMP and batch size

Closed this issue · 2 comments

Hi,
I'm very impressed by your excellent work! Thanks for sharing your code.

I have questions about the training protocol.

In your paper,

"We train all models with a global batch size of 2048 with the NVIDIA Automatic Mixed Precision(AMP) enabled."

but the training script specifies a batch size of 256, not 2048.

I have two questions about this.

  1. Can I reproduce the reported accuracy with the command in this repo (batch size = 256 instead of 2048)?

  2. Does this repo use AMP?

Thanks in advance :)

Hi @youngwanLEE, thank you for your interest in our work!

  1. Regarding the batch size: 256 is the batch size per GPU. The default training command uses 8 GPUs, so the global batch size is 256 × 8 = 2048 (see the sketch below). You should be able to reproduce accuracy close to our reported results using the command provided in this repo.

  2. Yes, AMP is enabled by default.
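
For illustration, here is a minimal sketch of how the per-GPU batch size and mixed precision typically fit together in a PyTorch training step. The values and the use of `torch.cuda.amp` here are assumptions for demonstration, not an excerpt from this repo's training script:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical values mirroring the paper's setting.
batch_size_per_gpu = 256
num_gpus = 8
global_batch_size = batch_size_per_gpu * num_gpus  # 256 * 8 = 2048

# Minimal AMP training step using torch.cuda.amp; whether this repo's script
# uses exactly this API is an assumption.
model = nn.Linear(224, 1000).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(batch_size_per_gpu, 224, device="cuda")
target = torch.randint(0, 1000, (batch_size_per_gpu,), device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():      # forward pass runs in mixed precision
    loss = F.cross_entropy(model(x), target)
scaler.scale(loss).backward()        # scale the loss to avoid fp16 underflow
scaler.step(optimizer)
scaler.update()
```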

@xwjabc Thanks for your quick reply :)