Evaluation of form feed symbol with BLEU results in error
lowlypalace opened this issue · 0 comments
Hi, I'm generating sequences with some of the HF models, such as pythia-1.4b. Some of my generations consist only of the form feed character (ASCII code 12). Scoring against such a sequence with BLEU fails:
from evaluate import load
bleu = load("bleu")
prediction = "hello"
reference = chr(12)
bleu_score = bleu.compute(
    predictions=[prediction], references=[[reference]]
)["bleu"]
This code raises the following error:
ZeroDivisionError Traceback (most recent call last)
[<ipython-input-1-8625f8bf1df7>](https://localhost:8080/#) in <cell line: 8>()
6 reference = chr(12)
7
----> 8 bleu_score = bleu.compute(
9 predictions=[prediction], references=[[reference]]
10 )["bleu"]
2 frames
[/usr/local/lib/python3.10/dist-packages/evaluate/module.py](https://localhost:8080/#) in compute(self, predictions, references, **kwargs)
465 inputs = {input_name: self.data[input_name] for input_name in self._feature_names()}
466 with temp_seed(self.seed):
--> 467 output = self._compute(**inputs, **compute_kwargs)
468
469 if self.buf_writer is not None:
[~/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--bleu/9e0985c1200e367cce45605ce0ecb5ede079894e0f24f54613fca08eeb8aff76/bleu.py](https://localhost:8080/#) in _compute(self, predictions, references, tokenizer, max_order, smooth)
120 references = [[tokenizer(r) for r in ref] for ref in references]
121 predictions = [tokenizer(p) for p in predictions]
--> 122 score = compute_bleu(
123 reference_corpus=references, translation_corpus=predictions, max_order=max_order, smooth=smooth
124 )
[~/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--bleu/9e0985c1200e367cce45605ce0ecb5ede079894e0f24f54613fca08eeb8aff76/nmt_bleu.py](https://localhost:8080/#) in compute_bleu(reference_corpus, translation_corpus, max_order, smooth)
101 geo_mean = 0
102
--> 103 ratio = float(translation_length) / reference_length
104
105 if ratio > 1.0:
ZeroDivisionError: float division by zero
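The zero in the denominator can be reproduced without the library. The sketch below is a simplified stand-in, not the code from `nmt_bleu.py`: `whitespace_tokenize` and `corpus_lengths` are hypothetical names, and it assumes the default tokenizer ends up splitting on whitespace, which the zero reference length in the traceback suggests.

```python
def whitespace_tokenize(text):
    # Stand-in tokenizer (assumption): split on runs of whitespace.
    return text.split()


def corpus_lengths(predictions, references):
    """Return (translation_length, reference_length) as token counts."""
    translation_length = 0
    reference_length = 0
    for pred, refs in zip(predictions, references):
        translation_length += len(whitespace_tokenize(pred))
        # The shortest reference of each segment contributes to the total.
        reference_length += min(len(whitespace_tokenize(r)) for r in refs)
    return translation_length, reference_length


translation_length, reference_length = corpus_lengths(["hello"], [[chr(12)]])
print(translation_length, reference_length)

try:
    ratio = float(translation_length) / reference_length
except ZeroDivisionError as err:
    # Same failure mode as the traceback: the reference has no tokens.
    print("division failed:", err)
```

Under that assumption, the form feed is treated as whitespace, the reference tokenizes to an empty list, and the length ratio has nothing to divide by.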
I would expect a score to be computed for this character even though it is non-printable. I believe the same error will occur with other non-printable characters. Is this intended behaviour?
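A quick way to probe which characters might be affected, plus a possible stopgap, is sketched below. It assumes the tokenizer drops whitespace the same way Python's `str.split()` does. `tokenizes_to_nothing` and `compute_or_default` are hypothetical helpers, not part of `evaluate`, and returning `0.0` on failure is only one possible choice.

```python
def tokenizes_to_nothing(text):
    # Assumption: the metric's tokenizer drops whitespace like str.split().
    return len(text.split()) == 0


# ASCII codes 0-32 that vanish entirely under whitespace splitting.
empty_codes = [code for code in range(33) if tokenizes_to_nothing(chr(code))]
print(empty_codes)


def compute_or_default(compute_fn, default=0.0, **kwargs):
    """Call compute_fn(**kwargs); fall back to `default` on ZeroDivisionError."""
    try:
        return compute_fn(**kwargs)
    except ZeroDivisionError:
        return default
```

With the metric loaded as above, this could be called as `compute_or_default(lambda **kw: bleu.compute(**kw)["bleu"], predictions=[prediction], references=[[reference]])`. If the assumption holds, only whitespace-like control characters would trigger the error, while other non-printable characters such as `chr(1)` would survive tokenization.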