s12 model reproduction experiment
starsky68 commented
I am training the s12 model, but I only have four cards, with a batch size of 240 per card, and the final top-1 accuracy is 76. Without eight cards, how can the accuracy reach 80? All other parameters are left at their defaults. Besides speeding up training, --apex-amp also seems to greatly affect the accuracy.
yuweihao commented
Hi @starsky68,
Thanks for your attention. Please refer to the train.py file in the metaformer repo, where I added --grad-accum-steps so that you can use a larger effective batch size.
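
For readers unfamiliar with gradient accumulation, here is a minimal pure-Python sketch of the idea behind a flag like --grad-accum-steps. It is not the code from train.py: the helper names (effective_batch_size, grad_mse, accumulated_grad) and the toy 1-D model are hypothetical, and it assumes a mean-reduced loss with equal-sized micro-batches. Under those assumptions, averaging the gradients of several small batches before one optimizer step gives the same gradient as one large batch, so four cards can imitate a larger total batch size.

```python
# Illustrative sketch only; the names below are hypothetical and do not
# come from the metaformer train.py.

def effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps):
    """Number of samples that contribute to each optimizer step."""
    return per_gpu_batch * num_gpus * grad_accum_steps


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the toy model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, grad_accum_steps):
    """Average the gradients of equal-sized micro-batches before one update."""
    micro = len(xs) // grad_accum_steps
    total = 0.0
    for i in range(grad_accum_steps):
        lo, hi = i * micro, (i + 1) * micro
        # Dividing by grad_accum_steps mirrors scaling the loss per micro-batch.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / grad_accum_steps
    return total


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

full = grad_mse(0.5, xs, ys)                               # one large batch
accum = accumulated_grad(0.5, xs, ys, grad_accum_steps=2)  # two micro-batches

print(effective_batch_size(240, 4, 1))  # 960: four cards, no accumulation
print(effective_batch_size(240, 4, 2))  # 1920: four cards, two accumulation steps
print(abs(full - accum) < 1e-12)        # True: the gradients match
```

With four cards at 240 samples each, every optimizer step sees 960 samples; each additional accumulation step adds another 960. Whether a larger effective batch size closes the accuracy gap also depends on the learning-rate schedule and the other settings, which this sketch does not model.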