nshepperd/gpt-2

avg stays in 2.6-2.9 range

freedmann2 opened this issue · 0 comments

I'm trying to fine-tune the 355M and 774M models, but I'm having an issue with avg: it doesn't fall at all!
I have tried learning_rate values of 0.00001, 0.00002, and 5e-5, and the result is the same.
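
For context on how slowly the avg figure can move, here is a minimal sketch. It assumes avg is a bias-corrected exponential running mean of the per-step loss with a decay of 0.99, which is how I understand the training script to compute it; I have not confirmed that here. The function name `running_avg` and the loss values are hypothetical and chosen only for illustration; they are not taken from a real run.

```python
def running_avg(losses, decay=0.99):
    """Bias-corrected exponential running mean of a loss sequence.

    Assumption: this mirrors how the logged avg value is computed
    (numerator and denominator both decayed by the same factor).
    """
    num, den = 0.0, 0.0
    out = []
    for loss in losses:
        num = num * decay + loss   # decayed sum of losses
        den = den * decay + 1.0    # decayed count of steps
        out.append(num / den)
    return out


# Hypothetical loss curve: flat at 2.9 for 100 steps, then flat at 2.5.
losses = [2.9] * 100 + [2.5] * 100
avgs = running_avg(losses)

print(f"avg after step 100: {avgs[99]:.2f}")
print(f"avg after step 200: {avgs[199]:.2f}")
```

Under that assumption, avg still sits well above the new loss level 100 steps after the drop, so a range like 2.6-2.9 over a short run may partly reflect the smoothing rather than a complete lack of learning. Comparing the raw per-step loss over a few hundred steps would help tell the two apart.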