codefuse-ai/codefuse-devops-eval

Industrial-first evaluation benchmark for LLMs in the DevOps/AIOps domain.

Python · NOASSERTION

Issues

Why is there no "thought" field in the Tool Learning dataset?
#22 opened a year ago by liuweie
1
Any related arxiv?
#17 opened a year ago by zhimin-z
1
The AIOps samples total 2840, but the files contain only two paths, devopseval-exam\en\dev\OPERATE\AIOps and devopseval-exam\en\test\OPERATE\AIOps, with 1445 samples in total
#24 opened a year ago by DanLiu0623
0
Tool call model
#21 opened a year ago by zcgit001
0
function call dataset
#20 opened a year ago by paleblackless
1
Are there plans to add comparisons with other models?
#7 opened a year ago by leiwen83
1
Hello, what is the difference between fcdata-zh-luban and fcdata-zh-codefuse?
#16 opened a year ago by GeniusYx
1
AttributeError: 'EvaluateArguments' object has no attribute 'eval_language'
#18 opened a year ago by slatter666
0
DevOps Summary Benchmark
#15 opened a year ago by lightislost
0
Can the toollearning dataset be provided?
#13 opened 2 years ago by yangyuxiang1996
1
The HF link is dead, please update it
#8 opened 2 years ago by Mr1994
1
Hello, the HF link is dead; the page returns a 404
#6 opened 2 years ago by yangbiaoqiange
1
Support prompt formats for multiple open-source models
#3 opened 2 years ago by donttal
1
The dataset needs further cleaning
#2 opened 2 years ago by hhk123
1
Integrate with LiteLLM - Evaluate 100+LLMs, 92% faster
#1 opened 2 years ago by ishaan-jaff
0