microsoft/LoRA

Can't reproduce the results for GLUE and hyperparameter misalignment

nbasyl opened this issue · 4 comments

nbasyl commented

Hi,
Thanks for the great work.

I am trying to reproduce the RoBERTa-large results on the NLU tasks. However, using the provided finetuning scripts I got a CoLA score of 0 and an MNLI score of 31.3. I then found that the hyperparameters in the provided training scripts do not match those in the paper. For example, in roberta_large_cola.sh the lr is set to 3e-4, but in the paper it is set to 2e-4. Which settings should I follow to reproduce the reported results?

Looking forward to your reply!

Best,
Sean

nbasyl commented

I changed the lr in the CoLA training script to 2e-4, which fixed the problem of the CoLA eval correlation staying at a constant 0, but I still couldn't reproduce the MNLI result :(
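For readers hitting the same symptom: CoLA is scored with Matthews correlation, and that metric is 0 whenever a model predicts the same class for every example. So a constant 0 may indicate that predictions collapsed to one class at the higher lr, though this thread does not confirm that. Below is a minimal sketch, assuming binary labels; the function name and the toy labels are illustrative and do not come from the repository.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation for binary labels (0/1).

    Returns 0.0 when the denominator is zero, the usual convention
    for the case where all predictions (or all labels) are one class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels, made up for illustration only.
labels = [1, 1, 0, 1, 0, 1, 1, 0]
collapsed = [1] * len(labels)  # a model that always predicts class 1

print(matthews_corrcoef(labels, collapsed))  # collapsed predictions
print(matthews_corrcoef(labels, labels))     # perfect predictions
```

If the eval correlation is exactly 0 at every checkpoint, inspecting the distribution of predicted classes is a quick way to check whether this is what happened.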

nbasyl commented

But I am still only getting a CoLA score of 62.82. Has anyone encountered a similar problem when trying to reproduce the results?

But I am still only getting a CoLA score of 62.82. Has anyone encountered a similar problem when trying to reproduce the results?

Hi, did you solve this problem?

I changed the lr in the CoLA training script to 2e-4, which fixed the problem of the CoLA eval correlation staying at a constant 0, but I still couldn't reproduce the MNLI result :(

My CoLA result is 63.48, which matches the paper. The random seeds I used are (1 3 13 37 71). However, I cannot reproduce the other tasks; only CoLA matches the paper.
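When comparing against a reported number, it may help to aggregate over all five seeds rather than judge from a single run. This thread does not say whether the paper's figure is a mean or a median, so the sketch below reports both. The `summarize` helper is hypothetical, and the scores in the usage example are placeholders, not real results.

```python
import statistics

# Seeds quoted in the comment above.
SEEDS = (1, 3, 13, 37, 71)


def summarize(scores_by_seed):
    """Aggregate one score per seed into mean, median, and sample stdev."""
    missing = [s for s in SEEDS if s not in scores_by_seed]
    if missing:
        raise ValueError(f"missing scores for seeds: {missing}")
    values = [scores_by_seed[s] for s in SEEDS]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


# Placeholder scores for illustration only; replace with your own runs.
placeholder_scores = {1: 60.0, 3: 61.0, 13: 62.0, 37: 63.0, 71: 64.0}
summary = summarize(placeholder_scores)
print(summary)
```

Posting the per-seed scores alongside the aggregate also makes it easier for others in the thread to tell a seed-variance gap from a hyperparameter mismatch.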