maszhongming/UniEval

Why are the tau and rho tables for SummEval different from those in the original SummEval paper?

Tau and rho from the SummEval paper:
image

Tau and rho from the UniEval paper:
image

I believe the issue lies in how you compute the scores. Instead of calculating the ROUGE score against the annotated references, you appear to compute it directly against the source text. Don't you think this is unfair to scoring functions that have a limited token input, or to those that operate at the set level, like ROUGE? Thank you.

I’d like to clarify a few points regarding your questions:

  1. The ROUGE correlations are taken from the BARTScore paper, and I believe all ROUGE scores there are calculated against the annotated references, not the source text.
  2. The Table 2 you provided reports system-level correlations, whereas Table 3 reports summary-level correlations, so the two sets of numbers are not directly comparable. Please refer to this paper for the distinction between the two.
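
To make the second point concrete, here is a minimal sketch of the two aggregation schemes as they are commonly defined in summarization meta-evaluation. It is not the code used by either paper. The toy scores, the function names (`system_level`, `summary_level`), and the use of a plain Kendall tau-a without tie correction are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Simplified for illustration: no tie correction.
    """
    pairs = list(combinations(range(len(x)), 2))
    total = 0
    for i, j in pairs:
        prod = (x[i] - x[j]) * (y[i] - y[j])
        total += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return total / len(pairs)


def system_level(metric, human):
    """Average each system's scores over documents, then correlate across systems.

    metric[d][s] and human[d][s] hold the score of system s on document d.
    """
    n_sys = len(metric[0])
    metric_avg = [mean(row[s] for row in metric) for s in range(n_sys)]
    human_avg = [mean(row[s] for row in human) for s in range(n_sys)]
    return kendall_tau(metric_avg, human_avg)


def summary_level(metric, human):
    """Correlate across systems within each document, then average over documents."""
    return mean(kendall_tau(m, h) for m, h in zip(metric, human))


# Hypothetical toy data: 3 documents (rows) x 3 systems (columns).
human = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
metric = [[0.1, 0.9, 0.5], [0.6, 0.2, 0.9], [0.2, 0.3, 0.1]]

print("system-level tau: ", system_level(metric, human))
print("summary-level tau:", round(summary_level(metric, human), 3))
```

On this toy data the system-level tau is 1.0 while the summary-level tau is about 0.111, so the same metric scores can yield very different correlation tables depending on the aggregation level.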