Unable to reproduce Main Result in the paper
Closed · 1 comment
alisea47 commented
Hello, I configured the environment according to your README.md and, to evaluate Orion's performance on OpenRule155, ran `python evaluation.py --task openrule155 --inductor rule --mlm_training True --bart_training True --group_beam True`.
I expected the results for the metrics BLEU-1, BLEU-2, BLEU-4, ROUGE-L, METEOR, and self-BLEU-2 to match the last row of Table 1 in your paper. However, the results all differ slightly from the paper, particularly METEOR and self-BLEU-2.
- Results of the code:
  - METEOR: 0.2903811252268602
  - self-BLEU-2: 0.87502350778747
- Results in the paper:
  - ROUGE-L: 0.4041
  - self-BLEU-2: 0.9094
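Small gaps like this often come from differences in the metric implementation rather than the model itself. As a rough illustration only (this is not the repository's `evaluation.py`, and the example sentences are made up), the sketch below shows how BLEU-2 and METEOR are commonly computed with NLTK; the smoothing function, tokenization, and NLTK/WordNet versions can all shift the scores by a few points.

```python
# Minimal sketch, assuming NLTK is installed; not the repo's evaluation code.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)   # required by meteor_score
nltk.download("omw-1.4", quiet=True)

# Hypothetical reference/hypothesis pair for illustration only.
reference = "if x is a part of y then x is located in y".split()
hypothesis = "if x is part of y then x is located at y".split()

# BLEU-2: uniform weights over unigrams and bigrams. The smoothing method
# chosen here can noticeably change the score for short sentences.
bleu2 = sentence_bleu(
    [reference], hypothesis,
    weights=(0.5, 0.5),
    smoothing_function=SmoothingFunction().method1,
)

# METEOR: recent NLTK versions expect pre-tokenized token lists (older ones
# accepted raw strings), and the installed WordNet data affects matching.
meteor = meteor_score([reference], hypothesis)

print(f"BLEU-2: {bleu2:.4f}, METEOR: {meteor:.4f}")
```

Differences in any of these choices (as well as in self-BLEU, which aggregates pairwise BLEU over the generated candidates) could plausibly account for discrepancies of this size.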
chenxran commented
There are many possible reasons for this result. If possible, we can communicate via WeChat: KERO652.