ZeroDivisionError: division by zero in AccuracyForLanguageGeneration._compute_single_pred_single_ref
NISH1001 opened this issue · 3 comments
Describe the bug
I was running RobertaForQuestionAnswering on HuggingFace's squad-v2 train set (~86k samples). The Accuracy metric raised a ZeroDivisionError in AccuracyForLanguageGeneration._compute_single_pred_single_ref.
To Reproduce
- Use the datasets squad-v2 train set.
- Run the samples through pipeline("question-answering", ...) and compute jury's Accuracy metric on the predicted answers (see the sketch after this list).
Expected behavior
Run without error.
Exception Traceback (if available)
ration.py:107, in AccuracyForLanguageGeneration._compute_single_pred_single_ref(self, predictions, references, reduce_fn, **kwargs)
105 if token in ref_counts:
106 score += min(pred_count, ref_counts[token]) # Intersection count
--> 107 scores.append(score / max(len(pred), len(ref)))
108 avg_score = sum(scores) / len(scores)
109 return {"score": avg_score}
ZeroDivisionError: division by zero
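From the traceback, the denominator is max(len(pred), len(ref)), which is zero exactly when both the tokenized prediction and the tokenized reference are empty. A standalone sketch of the failing arithmetic, using hypothetical empty inputs rather than actual dataset samples:

from collections import Counter

# Hypothetical tokenized inputs: both empty, e.g. an empty answer string
# that produces no tokens.
pred, ref = [], []

ref_counts = Counter(ref)
score = 0
for token, pred_count in Counter(pred).items():
    if token in ref_counts:
        score += min(pred_count, ref_counts[token])  # intersection count

# max(len(pred), len(ref)) == 0 here, so the division below raises
# ZeroDivisionError, matching line 107 in the traceback.
score / max(len(pred), len(ref))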
Environment Information:
- OS: macOS 13.2.1 (22D68)
- jury version: 2.2.3
- evaluate version: 0.2.2
- datasets version: 2.11.0
Thanks, and I appreciate that jury exists. I could patch this by cloning the repo and doing an in-depth trace analysis, but I wanted to know if there's a better way to patch it.
Re: I was able to patch it with a try/except block here:
NISH1001@6bdf680
Should I send a PR? I don't know whether I should just emit a warning or also show the original value for either of the pred/ref.
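For context, one possible shape for such a guard (a sketch only, not a copy of the linked commit or of jury's source) is to check the denominator before dividing and fall back to a warning plus a score of 0.0:

import warnings

# Sketch of a guarded version of the per-sample scoring step. The names
# pred, ref, score and scores mirror the traceback above; this is not a
# verified copy of jury's code or of the linked commit.
def append_token_accuracy(pred, ref, score, scores):
    denominator = max(len(pred), len(ref))
    if denominator == 0:
        warnings.warn(
            f"Empty prediction and reference (pred={pred!r}, ref={ref!r}); "
            "assigning a score of 0.0."
        )
        scores.append(0.0)
    else:
        scores.append(score / denominator)

# Hypothetical usage with empty tokenized inputs:
scores = []
append_token_accuracy(pred=[], ref=[], score=0, scores=scores)
print(scores)  # [0.0], with a warning instead of a ZeroDivisionError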
Hi @NISH1001, thanks for the heads-up and for your comments; it is appreciated. I'll look into the PR ASAP.