AGI-Edgerunners/LLM-Adapters

Training reproduce

ChaoGaoUCR opened this issue · 2 comments

Dear Authors,

Thanks again for the great work.
I have a quick question.
I tried to train for 4 epochs by setting the trainer's epoch count to 1 and repeating the run four times in a for loop.
I can't get the same result this way.
Any hint about what I did wrong?
[image]

Thanks

Hi,

Did you resume from the previous epoch's checkpoint? If so, please make sure every epoch trains from scratch. If not, you can set a random seed to make the results more reproducible. Could you report the results you got? I'd like to see how much they vary. Thanks!
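
For reference, seeding could look roughly like the sketch below. It is not taken from the LLM-Adapters code; `set_seed` here is a hypothetical helper that seeds Python's `random` module and, only if they happen to be installed, NumPy and PyTorch. If you train through Hugging Face Transformers, `transformers.set_seed` does a similar job.

```python
import os
import random


def set_seed(seed: int) -> None:
    """Hypothetical helper: seed every RNG that is available in this environment."""
    random.seed(seed)
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed, nothing to seed


# Call once before building the model and data loaders.
set_seed(42)
```

Even with a fixed seed, some GPU kernels are nondeterministic, so small run-to-run differences can remain.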

Dear,

Thanks, it was my fault: I set the batch size too large (512).
I fixed it by using a batch size of 16, and now it works perfectly.

Best
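
The thread does not say why the batch size mattered. One possible factor is simply the number of optimizer updates: with the same data and epoch count, a larger batch means far fewer weight updates. The sketch below only illustrates that arithmetic; the dataset size of 10,000 is a made-up number, `optimizer_steps` is a hypothetical helper, and the batch size is assumed to be the effective size per optimizer step with the last partial batch kept.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int = 1) -> int:
    """Count optimizer updates, assuming one update per batch and no dropped batch."""
    return math.ceil(num_examples / batch_size) * epochs


num_examples = 10_000  # hypothetical dataset size, not from the thread
for batch_size in (512, 16):
    steps = optimizer_steps(num_examples, batch_size, epochs=4)
    print(f"batch_size={batch_size}: {steps} updates over 4 epochs")
```

With these assumed numbers, a batch size of 512 gives 80 updates over 4 epochs, while a batch size of 16 gives 2500.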