cylnlp/dialogsum

Rouge Score Scripts

Hannibal046 opened this issue · 5 comments

Hello, could you please tell me which ROUGE script you used? Thanks.

Hi, following Gliwa et al. (2019), we used py-rouge for evaluation. You can find it here: https://pypi.org/project/py-rouge/

OK, thanks for the reply.
Did you insert \n between sentences when calculating the ROUGE scores? The official examples in the link you posted do, which gives a much higher ROUGE-L score.

Hi,
"\n"s are inserted as separators between utterances in the dialogues; they are not inserted between sentences within an utterance.
By "official examples", do you mean \n between utterances?

In this link, hypothesis_1 and reference_1 both have \n appended after each period, which is crucial for ROUGE-L.

Sorry for misunderstanding your question.
The answer is no.
We feed the model-generated outputs and the references directly into py-rouge, without inserting \n between sentences.
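
For readers wondering why the \n question matters: below is a minimal, self-contained sketch of how sentence splitting can change a ROUGE-L style score. It is not py-rouge's code. It is a simplified union-LCS computation in the spirit of summary-level ROUGE-L, and the function names (lcs_ref_indices, rouge_l_f1) and the example texts are made up for illustration. Stemming, length limits, and clipping of repeated tokens are all left out, so the numbers will not match py-rouge's output.

```python
def lcs_ref_indices(ref, hyp):
    """Return the indices of `ref` tokens that lie on one LCS of ref and hyp."""
    m, n = len(ref), len(hyp)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Backtrack through the table to recover which reference tokens matched.
    matched = set()
    i, j = m, n
    while i > 0 and j > 0:
        if ref[i - 1] == hyp[j - 1]:
            matched.add(i - 1)
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return matched


def rouge_l_f1(hypothesis, reference, split_on_newline):
    """Simplified ROUGE-L F1 (illustrative only, not py-rouge's implementation)."""
    def to_sentences(text):
        if split_on_newline:
            return [s.split() for s in text.split("\n") if s.strip()]
        return [text.split()]  # the whole text is treated as one sentence

    hyp_sents = to_sentences(hypothesis)
    ref_sents = to_sentences(reference)
    hits = 0
    for ref in ref_sents:
        union = set()
        for hyp in hyp_sents:
            union |= lcs_ref_indices(ref, hyp)  # union of LCS matches
        hits += len(union)
    if hits == 0:
        return 0.0
    precision = hits / sum(len(s) for s in hyp_sents)
    recall = hits / sum(len(s) for s in ref_sents)
    return 2 * precision * recall / (precision + recall)


# Same two sentences in both texts, but in a different order.
hyp = "the meeting is at noon .\nmary will bring the report ."
ref = "mary will bring the report .\nthe meeting is at noon ."

flat = rouge_l_f1(hyp, ref, split_on_newline=False)
split = rouge_l_f1(hyp, ref, split_on_newline=True)
print(flat, split)
```

In this toy case the unsplit score is limited by a single LCS over the whole text, while the split version matches each reference sentence against every hypothesis sentence, so reordered sentences are no longer penalized. That is one plausible reason scores computed with \n separators come out higher than scores computed on raw outputs.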