bigscience-workshop/t-zero

BIG-bench evaluation

thesofakillers opened this issue · 1 comment

Hi, thank you for your work.

From what I understand, a large portion of the evaluation was done on the BIG-bench benchmark.
How would we run the evaluation to reproduce those results? It is unclear from the evaluation README.

Thank you!

Hi @thesofakillers, thanks for your question!

We used a fork of the BIG-bench repository: https://github.com/lintangsutawika/BIG-bench/tree/t5 (which is by now quite far behind upstream, admittedly).

Now that BIG-bench is also available through the HF datasets library (https://huggingface.co/datasets/bigbench), I suspect you can tweak the https://github.com/bigscience-workshop/t-zero/blob/master/evaluation/run_eval.py script to get the BIG-bench numbers.
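
For what it's worth, here is a minimal sketch of what that tweak could look like: it scores each answer choice with T0 and picks the most likely one (rank classification), which is the same idea `run_eval.py` implements. The task name and split are placeholders, and the field names (`inputs`, `multiple_choice_targets`, `multiple_choice_scores`) are taken from the HF `bigbench` dataset card, so double-check them for the task you pick:

```python
# A rough sketch, not the exact T0 pipeline: rank-classification scoring
# of a BIG-bench multiple-choice task, in the spirit of evaluation/run_eval.py.
import torch
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder task/split; substitute the BIG-bench task you care about.
# Depending on your `datasets` version you may need trust_remote_code=True.
dataset = load_dataset("bigbench", "abstract_narrative_understanding", split="validation")

tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")
model.eval()

correct = 0
for example in dataset:
    input_ids = tokenizer(example["inputs"], return_tensors="pt").input_ids
    scores = []
    for choice in example["multiple_choice_targets"]:
        labels = tokenizer(choice, return_tensors="pt").input_ids
        with torch.no_grad():
            # .loss is the mean token NLL; multiply by length to recover the
            # summed log-likelihood that rank classification compares.
            mean_nll = model(input_ids=input_ids, labels=labels).loss
        scores.append(-mean_nll.item() * labels.size(1))
    prediction = scores.index(max(scores))
    gold = example["multiple_choice_scores"].index(1)  # 1 marks the correct choice
    correct += int(prediction == gold)

print(f"accuracy: {correct / len(dataset):.4f}")
```

In practice you would want to batch the choices and run on a GPU (as `run_eval.py` does), but something like this should be enough to get a first number.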