Assertion error in LLM-based fuzzy match
Closed this issue · 1 comment
vardaan123 commented
For config_files/test_reddit/69.json, I get the following error in the LLM-based fuzzy match metric.
[Unhandled Error] AssertionError('n/a')
Traceback (most recent call last):
File "/home/pahuja.9/visualwebarena/run.py", line 412, in test
score = evaluator(
File "/home/pahuja.9/visualwebarena/evaluation_harness/evaluators.py", line 626, in __call__
cur_score = evaluator(trajectory, config_file, page, client)
File "<@beartype(evaluation_harness.evaluators.HTMLContentExactEvaluator.__call__) at 0x7f992c464790>", line 115, in __call__
File "/home/pahuja.9/visualwebarena/evaluation_harness/evaluators.py", line 472, in __call__
StringEvaluator.fuzzy_match(
File "<@beartype(evaluation_harness.evaluators.StringEvaluator.fuzzy_match) at 0x7f992c453e20>", line 69, in fuzzy_match
File "/home/pahuja.9/visualwebarena/evaluation_harness/evaluators.py", line 197, in fuzzy_match
return llm_fuzzy_match(pred, ref, intent)
File "<@beartype(evaluation_harness.helper_functions.llm_fuzzy_match) at 0x7f992c452cb0>", line 69, in llm_fuzzy_match
File "/home/pahuja.9/visualwebarena/evaluation_harness/helper_functions.py", line 609, in llm_fuzzy_match
assert "correct" in response, response
AssertionError: n/a
I am using the same LLM for fuzzy matching as in the original code.
kohjingyu commented
This happens occasionally when gpt-4 returns something other than one of ["correct", "partially correct", "incorrect"]. It can be safely ignored, since the task would have failed anyway (the LLM didn't return "correct"). Alternatively, you could replace the assert with another elif that assigns a score of 0, if you prefer.
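Something like this, as a minimal sketch of the scoring tail of llm_fuzzy_match in evaluation_harness/helper_functions.py (the exact branches around the assert are assumed here and may differ slightly in your checkout; only the unchanged assert line is confirmed by the traceback):

```python
# Assumed existing logic: the model reply has already been lowercased.
response = response.lower()
if "partially correct" in response or "incorrect" in response:
    return 0.0
elif "correct" in response:
    return 1.0
else:
    # Previously: assert "correct" in response, response
    # Unrecognized replies such as "n/a" now count as a failed match
    # instead of raising an AssertionError.
    return 0.0
```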