Opened this issue 4 years ago · 1 comment
I trained bert-han on ag-news following the instructions, and the loss became NaN after 45600 steps.
I ran into the same problem.