llamazing/numnet_plus

Can't get EM=0.79 with the default parameter setting

ewrfcas opened this issue · 5 comments

I appreciate your great work!
But I can't reproduce EM=0.79 with the default setting sh train.sh 345 5e-4 1.5e-5 5e-5 0.01.
Limited by GPU memory, I set gradient_accumulation_steps=8, and the final result is eval em 0.7414010067114094, eval f1 0.7768875838926179.
Then I used FP16 training with this code, which allows gradient_accumulation_steps to be decreased to 4. The new FP16 result is eval em 0.7577600671140939, eval f1 0.7950440436241618.
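
For context, the FP16 + gradient accumulation loop I mean is roughly like this (just a minimal sketch with NVIDIA apex amp, assuming `model`, `optimizer`, and `train_loader` are already built; the actual trainer in this repo may handle the loss differently):

```python
import torch
from apex import amp

# Assumed to exist: model, optimizer, train_loader (not shown here).
model, optimizer = amp.initialize(model, optimizer, opt_level="O2")

gradient_accumulation_steps = 4
optimizer.zero_grad()
for step, batch in enumerate(train_loader):
    loss = model(**batch)  # assumed: the forward pass returns the training loss
    loss = loss / gradient_accumulation_steps
    # Scale the loss so fp16 gradients do not underflow.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        # Clip on the fp32 master params kept by amp, not the fp16 model params.
        torch.nn.utils.clip_grad_norm_(amp.master_params(optimizer), max_norm=5.0)
        optimizer.step()
        optimizer.zero_grad()
```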

Does the version of 'pytorch_transformer' influence the result of the RoBERTa backbone? I am using pytorch_transformer==1.2.0, and I will try version 1.1.0.

With pytorch_transformer 1.1.0, I can now get EM 78.95 & F1 82.56.

Hi, which version of PyTorch did you use?

I used torch==1.2.0.

@ewrfcas I used FP16 with opt_level O2 and cast manually for the masked_fill op, but it cannot reach F1 82.56. How did you fix masked_fill? And did you adjust clip_by_norm?
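
To be concrete, the kind of fp16-safe masked_fill workaround I have in mind looks roughly like this (only a sketch of one possible fix, not code from this repo; `fp16_safe_masked_fill` is a hypothetical helper):

```python
import torch

def fp16_safe_masked_fill(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Fill masked-out positions with the smallest value representable in the
    # tensor's dtype instead of a hard-coded -1e30, which overflows to -inf
    # in float16 and can produce NaNs after softmax.
    fill_value = torch.finfo(scores.dtype).min
    return scores.masked_fill(~mask, fill_value)
```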