night-chen/ToolQA
ToolQA is a new dataset for evaluating the ability of LLMs to answer challenging questions using external tools. It offers two difficulty levels (easy and hard) across eight real-life scenarios.
Jupyter Notebook · Apache-2.0