llamazing/numnet_plus

Can't get EM=0.79 with the default parameter setting

ewrfcas opened this issue · 5 comments

I appreciate your great work!
But I can't reproduce EM=0.79 with the default setting sh train.sh 345 5e-4 1.5e-5 5e-5 0.01.
Limited by GPU memory, I set gradient_accumulation_steps=8, and the final result is eval em 0.7414010067114094, eval f1 0.7768875838926179.
Then I used FP16 training with this code, which allows gradient_accumulation_steps to be decreased to 4. The new FP16 result is eval em 0.7577600671140939, eval f1 0.7950440436241618.
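
For context, the FP16 + gradient accumulation loop I mean is roughly like this (just a minimal sketch with NVIDIA apex amp, assuming `model`, `optimizer`, and `train_loader` are already built; the actual trainer in this repo may handle the loss differently):

```python
import torch
from apex import amp

# Assumed to exist: model, optimizer, train_loader (not shown here).
model, optimizer = amp.initialize(model, optimizer, opt_level="O2")

gradient_accumulation_steps = 4
optimizer.zero_grad()
for step, batch in enumerate(train_loader):
    loss = model(**batch)  # assumed: the forward pass returns the training loss
    loss = loss / gradient_accumulation_steps
    # Scale the loss so fp16 gradients do not underflow.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        # Clip on the fp32 master params kept by amp, not the fp16 model params.
        torch.nn.utils.clip_grad_norm_(amp.master_params(optimizer), max_norm=5.0)
        optimizer.step()
        optimizer.zero_grad()
```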

Does the version of 'pytorch_transformer' influence the result of the RoBERTa backbone? I am using pytorch_transformer==1.2.0, and I will try version 1.1.0.

With pytorch_transformer 1.1.0, I can now get EM 78.95 & F1 82.56.

Hi, which version of PyTorch did you use?

I used torch==1.2.0.

@ewrfcas I used FP16 with opt_level O2 and cast manually for the masked_fill op, but it cannot reach F1 82.56. How did you fix masked_fill? And did you adjust clip_by_norm?
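
To be concrete, the kind of fp16-safe masked_fill workaround I have in mind looks roughly like this (only a sketch of one possible fix, not code from this repo; `fp16_safe_masked_fill` is a hypothetical helper):

```python
import torch

def fp16_safe_masked_fill(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Fill masked-out positions with the smallest value representable in the
    # tensor's dtype instead of a hard-coded -1e30, which overflows to -inf
    # in float16 and can produce NaNs after softmax.
    fill_value = torch.finfo(scores.dtype).min
    return scores.masked_fill(~mask, fill_value)
```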