THUNLP-MT/StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
Python · Apache-2.0
Issues
- 0
Is the server still working properly?
#25 opened by Octobrist - 0
ToolBench Key
#24 opened by hoyeongchoi - 1
requests.exceptions.ConnectionError: HTTPConnectionPool(host='8.218.239.54', port=8080
#8 opened by lileishitou - 0
Looking for a ToolBench key
#23 opened by Dlxxx - 0
inference problem
#15 opened by farawayxxx - 2
ToolBench Key
#21 opened by Dandelionym - 0
ToolBench Key
#22 opened by Hanlin1004 - 1
role error
#16 opened by farawayxxx - 1
Did the ToolBench server crash?
#18 opened by Reason-Wang - 1
- 2
- 0
Error message: AttributeError: function_call`
#17 opened by wupaopao123 - 1
gpt3.5 > gpt4 on pass rate?
#14 opened by stanpcf - 2
applied for toolbench_key but no one responded
#13 opened by stanpcf - 1
Implementation of DFSDT and ReACT
#11 opened by JuhaoLiang1997 - 1
How is native LLM on this benchmark?
#12 opened by YenFuLin - 1
Correctness of API Simulator
#4 opened by xuanz20 - 2
- 1
Reproduce experimental results.
#7 opened by Taeyoung-Jang - 2
tool_root_dir in the inference script
#6 opened by zhiyuanc2001 - 2
- 2
Plans to add a license?
#2 opened by timisstrong