Computing BLEU more than once
Closed this issue · 4 comments
Hey, why does computing the BLEU score more than once change the keys of the score dict?
e.g. 'bleu_1', 'bleu_1_1', 'bleu_1_1_1'
Overall I find the library quite user-friendly, but I am unsure about this behavior.
Hey @Axe--, thank you for sharing your thoughts about the library; I appreciate the feedback. Could you please share a reproducible example or code snippet so I can look into the problem in detail? Also, could you please specify your versions of jury and datasets?
Sure! Here's the snippet
from jury import Jury
scorer = Jury()
predictions = [
["the cat is on the mat", "There is cat playing on the mat"],
["Look! a wonderful day."]
]
references = [
["the cat is playing on the mat.", "The cat plays on the mat."],
["Today is a wonderful day", "The weather outside is wonderful."]
]
scores = scorer(predictions=predictions, references=references)
scores_ = scorer(predictions=predictions, references=references)
print(scores.keys())
print(scores_.keys())
Output:
dict_keys(['empty_predictions', 'total_items', 'bleu_1', 'bleu_2', 'bleu_3', 'bleu_4', 'meteor', 'rouge'])
dict_keys(['empty_predictions', 'total_items', 'bleu_1_1', 'bleu_2_2', 'bleu_3_3', 'bleu_4_4', 'meteor', 'rouge'])
Version Info:
import jury, datasets
print(jury.__version__, datasets.__version__)
>>> 2.0.0 1.12.1
Hope this helps!
@Axe-- Thank you for the snippet. I reproduced the behavior, and I think it happens because the previous BLEU implementation lacked explicit control over the naming convention. However, this problem does not occur in the more recent version of jury (2.1.0), where the same code produces:
>>> dict_keys(['empty_predictions', 'total_items', 'bleu_1', 'bleu_2', 'bleu_3', 'bleu_4', 'meteor', 'rouge'])
>>> dict_keys(['empty_predictions', 'total_items', 'bleu_1', 'bleu_2', 'bleu_3', 'bleu_4', 'meteor', 'rouge'])
Upgrading to the latest version would solve this issue.
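To confirm after upgrading that repeated calls no longer rename the keys, a small check along these lines may help. This is only a sketch: `key_sets_across_runs` and `keys_are_stable` are hypothetical helper names, not part of jury, and they assume nothing about the scorer beyond it returning a dict on each call.

```python
from typing import Any, Callable, Dict, List


def key_sets_across_runs(
    score_fn: Callable[[], Dict[str, Any]], n_runs: int = 3
) -> List[List[str]]:
    """Call score_fn n_runs times and collect the sorted keys of each result."""
    return [sorted(score_fn().keys()) for _ in range(n_runs)]


def keys_are_stable(score_fn: Callable[[], Dict[str, Any]], n_runs: int = 3) -> bool:
    """Return True if every run produced exactly the same set of keys."""
    runs = key_sets_across_runs(score_fn, n_runs)
    return all(run == runs[0] for run in runs)


# Hypothetical usage with the scorer, predictions and references from the snippet above:
# keys_are_stable(lambda: scorer(predictions=predictions, references=references))
```

With the behavior reported on 2.0.0 this check would be expected to return False, since the second call yields keys such as 'bleu_1_1' instead of 'bleu_1'.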
Awesome! And thank you for all the work!