fuzzy match gives the wrong answer in eval
cheng-tan commented
For task 361, our agent gave the answer:
> Order number 170 is Canceled, order number 189 is Pending

The evaluator uses fuzzy match and scored our answer as wrong:
"eval_types": [
"string_match"
],
"reference_answers": {
"fuzzy_match": [
"170: cancelled",
"189: pending"
]
},
"reference_url": "",
"program_html": [],
"string_note": "",
"reference_answer_raw_annotation": "170: cancelled, 189: pending"
},
minghchen commented
In my opinion, line 165 of `StringEvaluator` in `evaluation_harness.evaluator_router` should be revised to:
```python
assert isinstance(value, list)
score *= self.fuzzy_match(
    ref=" ".join(value), pred=pred, intent=intent
)
```
The original code compares each individual item in the `fuzzy_match` list against the prediction, but the prediction should be compared against the list as a whole.