Question on Training Time
4ndrelim opened this issue · 2 comments
Hi, I'm really fascinated by what you've done and was hoping to recreate the reported results, but I seem to be having trouble doing so. (Granted, I've only run the command a few times, some with a few parameter tweaks, and the longest I have run the default command for was an entire day.) The accuracy seems to increase slowly, and even after a day it only reaches around 20% and seems to oscillate up and down.
May I ask how long you trained the model for, and what parameters/specifications you used to achieve the high results mentioned in the paper?
Hi, thank you for this question.
May I know what platform you are using (e.g., which GPUs)? We used 4xA6000 GPUs to run the experiment.
We did not observe oscillation in our experiments. Could you share your log with us?