ZubinGou/math-evaluation-harness
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
Python · MIT
Stargazers
- 2003pro (Hong Kong University of Science and Technology)
- aiseei
- csyanghan (SJTU)
- GanjinZero (DAMO Academy)
- Guochry (Renmin University of China)
- Gxy-2001 (Peking University)
- HillZhang1999 (Bytedance)
- hongtangshui (sjtu & bytedance)
- HuangOwen (VSDL Lab, HKUST)
- jmSNUCML@Seoul National University
- jxzhangjhu (Intuit AI Research)
- koalazf99 (Shanghai Jiao Tong University)
- lewtun (@huggingface)
- lihaoling (Tsinghua University)
- lx865712528 (@Microsoft Research)
- lzh0525
- MasterVito (Tsinghua University)
- Olivia-fsm (Ecole Polytech Federal of Lausanne)
- percent4 (Shanghai)
- peterjc123 (Shanghai, China)
- pprp (Data Science and Analytic Thrust, Information Hub, HKUST(GZ))
- qrdai (University of Illinois Urbana-Champaign)
- REIGN12 (Tsinghua University)
- seshurajup (@dolcera)
- SinclairCoder (China)
- SivilTaram (Researcher @ TikTok)
- TechxGenus (USTC)
- TianheWu (Tsinghua University)
- ToheartZhang (Renmin University of China)
- valeriocardoso (Hvar Consulting)
- wwh0411
- Xinzhe-Ni
- xz259
- yiyihum
- yulonghui (Peking University)
- ZubinGou (Tsinghua University)