night-chen/ToolQA
ToolQA is a dataset for evaluating how well LLMs answer challenging questions with external tools. It offers two difficulty levels (easy and hard) across eight real-life scenarios.
Jupyter Notebook · Apache-2.0
Stargazers
- 0x1za (@bowery-intelligence)
- 4IK1d
- chinoll (China)
- codinglover0111
- ddkang1
- DmitryRyumin (St. Petersburg, Russia)
- Duan-JM (Hangzhou)
- evdcush
- evolu8
- frouhi (New York City)
- functorism (Earth)
- gaahrdner (@zenbusiness)
- jahuu
- jstenmark (Stockholm)
- kinopsis
- kylezhangwei
- LuckyyySTA
- Maxvanhattum (@Open-Maze)
- mintmatica
- noon-abdulqadir (University of Amsterdam)
- polya20
- Qi-Sun (Beijing)
- Remember2015 (Shanghai, China)
- rfjohnso
- RManLuo (Monash University)
- rpand002 (MIT-IBM Watson AI Lab)
- Run0nceEx
- sangkilpark-kidmam
- shyamperi
- stoicasergiu
- taisazero (PhD Student - UNC@Charlotte)
- Vbansal21 (None)
- wentinghome (University of Illinois at Chicago)
- winglian (Annapolis, MD)
- yangsp5
- zhjohnchan (The Chinese University of Hong Kong, Shenzhen)