Add MMLU-Pro
yifanmai opened this issue · 0 comments
yifanmai commented
https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro
Should be similar to the original MMLU: see mmlu_scenario.py for the original MMLU and air_bench_scenario.py for how to use load_dataset() with Hugging Face datasets.
Edit: Also look at simple_scenarios.py and test_simple_scenarios.py for an example of MCQA.
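A rough sketch of the data-loading side, not the actual scenario code. It assumes the dataset rows have question, options, answer_index and category fields and that a "test" split exists (check the dataset card before relying on this). The helper names row_to_mcqa and load_mmlu_pro_rows are hypothetical and are not part of HELM.

```python
from typing import Any, Dict, List, Tuple


def row_to_mcqa(row: Dict[str, Any]) -> Tuple[str, List[Tuple[str, bool]]]:
    """Turn one dataset row into (question, [(option_text, is_correct), ...]).

    Assumes the row has "question", "options" (a list of strings) and
    "answer_index" (the position of the correct option).
    """
    correct_index = row["answer_index"]
    choices = [(text, i == correct_index) for i, text in enumerate(row["options"])]
    return row["question"], choices


def load_mmlu_pro_rows(subject: str, split: str = "test") -> List[Dict[str, Any]]:
    """Download one split and keep only rows whose category matches `subject`.

    Assumes a "category" field holds the subject name. Needs network access
    and the Hugging Face `datasets` package.
    """
    from datasets import load_dataset  # imported here so row_to_mcqa works without it

    dataset = load_dataset("TIGER-Lab/MMLU-Pro", split=split)
    return [row for row in dataset if row["category"] == subject]
```

In the scenario itself, each (option_text, is_correct) pair would presumably become one reference on the instance, with the correct one tagged accordingly, the way mmlu_scenario.py does for the original MMLU.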
Edit 2: Also see this doc.
Edit 3: To create the run spec function, take this function in lite_run_specs.py:
@run_spec_function("mmlu")
def get_mmlu_spec(subject: str, method: str = ADAPT_MULTIPLE_CHOICE_JOINT) -> RunSpec:
and modify it so that mmlu becomes mmlu-pro; you should then be able to run it with helm-run.