yafuly opened this issue 3 years ago · 0 comments
Hi,
Thanks for providing the training details. Have you tried smaller batch sizes? A batch size of 128K is very expensive.