RuntimeError occurs when the number of test samples is >= 10 in the math metric
Hi, thanks for sharing this benchmark.
One problem I ran into: when I evaluate math-understanding performance with

```sh
python3 -m axlearn.open_api.evaluator \
    --input_file ./output/math_understand/$EVAL_SET \
    --output_file ./metrics/math_understand/$EVAL_SET \
    --metric_name math \
    --grader_model gpt-4o-2024-05-13 \
    --client_name openai
```

the run fails with the following error:
```
Traceback (most recent call last):
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/evaluator.py", line 116, in <module>
    app.run(main)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/evaluator.py", line 110, in main
    evaluate_from_file(FLAGS)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/evaluator.py", line 104, in evaluate_from_file
    evaluator.evaluate(input_file=fv.input_file, output_file=fv.output_file, metric_fn=metric_fn)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/common.py", line 602, in evaluate
    metrics = metric_fn(
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/metrics/math.py", line 96, in metric_fn
    judgement_responses = asyncio.run(
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/common.py", line 253, in async_generate_from_requests
    responses.append(await task)
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/tasks.py", line 611, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/common.py", line 180, in _async_generate_from_request
    async with self._semaphore:
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/locks.py", line 14, in __aenter__
    await self.acquire()
  File "/home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/locks.py", line 417, in acquire
    await fut
RuntimeError: Task <Task pending name='Task-10' coro=<Generator._async_generate_from_request() running at /home/dawei/miniconda3/envs/moa/lib/python3.9/site-packages/axlearn/open_api/common.py:180> cb=[as_completed.<locals>._on_completion() at /home/dawei/miniconda3/envs/moa/lib/python3.9/asyncio/tasks.py:598]> got Future attached to a different loop
```
When I limit the number of test samples to fewer than 10, everything runs fine. Any idea what might cause this?
Thanks.
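In case it helps narrow things down: "got Future attached to a different loop" usually means an asyncio primitive (here the semaphore used in common.py) was created while a different event loop was current than the one asyncio.run() starts later. On Python 3.9, asyncio.Semaphore binds to a loop at construction time, and acquire() only creates and awaits a loop-bound Future once it actually has to block, which would explain why runs with fewer than 10 samples never trigger the error. A minimal, self-contained sketch of that failure mode (hypothetical code, not axlearn's):

```python
import asyncio

# On Python 3.9, Semaphore captures the loop returned by get_event_loop()
# at construction time -- NOT the fresh loop that asyncio.run() creates
# below. (Python 3.10+ binds the loop lazily, so this exact snippet may
# not fail there.)
SEM = asyncio.Semaphore(10)

async def worker(i: int) -> int:
    # acquire() returns immediately while the counter is positive; it only
    # creates and awaits a Future on the captured loop once it has to
    # block, i.e. once more than 10 workers contend for the semaphore.
    async with SEM:
        await asyncio.sleep(0.01)
        return i

async def main(n: int) -> list:
    return await asyncio.gather(*(worker(i) for i in range(n)))

asyncio.run(main(5))   # OK: the semaphore never has to block
asyncio.run(main(20))  # RuntimeError: ... got Future attached to a different loop
```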
Thanks @David-Li0406 for reporting this. The root cause is a top-level import in math that is not included in the core dependencies. Working on a fix.
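Until the fix lands, one common remedy for this class of asyncio error is to construct loop-bound primitives inside the coroutine that asyncio.run() executes, so they bind to the running loop. A minimal sketch (the generate_all helper below is made up for illustration, not axlearn's API, and not necessarily the actual patch):

```python
import asyncio

async def generate_all(requests, max_concurrency: int = 8):
    # Constructed inside the running loop, so any Future the semaphore
    # creates while blocking is attached to that same loop.
    sem = asyncio.Semaphore(max_concurrency)

    async def one(req):
        async with sem:
            await asyncio.sleep(0.01)  # stand-in for the real grader call
            return req

    return await asyncio.gather(*(one(r) for r in requests))

print(asyncio.run(generate_all(range(20))))
```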
Thanks for the prompt response.