modelscope/evalscope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Python · Apache-2.0
Issues
- 3
Does evaluation support the Ascend 910 NPU?
#123 opened by yiyayieryo - 6
Error when selecting winogrande for evaluation
#148 opened by suchuxin - 2
Error when evaluating a base model (qwen2-7b-chat)
#147 opened by ljh567 - 2
Output log shows garbled characters; Chinese characters expected
#162 opened by Devliang24 - 2
Metric values in test results are identical
#141 opened by LHB-kk - 1
Stress test fails with ZeroDivisionError: float division by zero
#151 opened by moyerlee - 2
Error when evaluating a base model
#156 opened by yawzhe - 4
Using a parameterized template: is there a problem with doing it this way?
#160 opened by Devliang24 - 2
How do I set specific input and output token lengths, e.g. 1k, 2k, 4k, with the output length also configurable?
#158 opened by Devliang24 - 2
Error when stress testing a vllm endpoint
#159 opened by ZTurboX - 3
Can't the OpenCompass Eval-Backend use custom datasets?
#79 opened by jackqdldd - 1
- 1
Why can't the evaluation code (eval) load datasets locally when run?
#154 opened by yawzhe - 22
- 3
Can few shot be used when stress testing with the OpenCompass Math dataset?
#152 opened by tianshiyisi - 6
Many evaluation metrics on C-MTEB do not match, especially the retrieval metrics
#155 opened by xujunrt - 2
How do I create a custom evaluation set for VLMs?
#97 opened by stay-leave - 2
- 4
Error when evaluating with a custom template
#139 opened by mianbaoji - 1
Requests to the API service fail when concurrency is 500 or higher
#142 opened by LHB-kk - 0
- 1
Broken documentation link for --template-type in the README command-line arguments
#133 opened by hrhrng - 3
swift eval on a custom dataset: dacite.exceptions.MissingValueError: missing value for field "data"
#132 opened by jackqdldd - 5
swift eval fails: cannot import name 'ftp_head' from 'datasets.utils.file_utils'
#129 opened by jackqdldd - 1
evalscope perf produces no output when testing an openai api server deployed with sglang
#128 opened by hetian127 - 4
Custom VLM dataset: build_prompt(self, line) is not executed
#130 opened by jackqdldd - 3
How can I use an OpenAI-compatible large model as the evaluator model to evaluate two models on a custom dataset?
#115 opened by EvilCalf - 4
AttributeError: can't set attribute 'split'
#122 opened by Jeremy-J-J - 2
Incorrect results in Baseline model comparison mode
#118 opened by stay-leave - 4
No results
#120 opened by lucheng07082221 - 6
Error when calling OpenCompassBackendManager.list_datasets()
#105 opened by lyc0930 - 2
Could you describe what each metric means? I don't understand a few of them
#103 opened by shell-nlp - 1
Meaning of the "aAcc", "fAcc", and "qAcc" metrics for the HallusionBench dataset
#104 opened by stay-leave - 0
Model inference stress test: evalscope perf does not return for a long time
#112 opened by undyingfame - 5
Does eval_swift_openai support concurrent testing, and how is it configured?
#95 opened by charliedream1 - 1
- 1
Are there plans to support performance/metric evaluation of embedding/reranker models?
#107 opened by shell-nlp - 1
perf sends only one request
#101 opened by shell-nlp - 3
How do I specify request parameters when evaluating models with OpenCompass or VLMEvalKit?
#98 opened by jackqdldd - 10
perf test produces no output
#69 opened by hetian127 - 1
Error when setting up the evalscope environment: FileNotFoundError: [Errno 2] No such file or directory: './MANIFEST.in'
#72 opened by Juvember - 1
Does OpenCompass support setting a request parameter template, similar to --query-template in the perf module?
#81 opened by jackqdldd - 1
Where is the toolkit name?
#96 opened by zhimin-z - 2
evalscope perf wandb
#94 opened by zll0000 - 7
evalscope perf --url 'our_url/v1/completions' --parallel 128 --model 'Qwen2-72B-Instruct' --log-every-n-query 10 --read-timeout=120 --dataset-path './data/open_qa.jsonl' -n 1 --max-prompt-length 128000 --api openai --stream --stop '<|im_end|>' --dataset openqa --debug
#93 opened by zll0000 - 2
infor_vqa and doc_vqa datasets have entries with no answer when computing metrics
#92 opened by stay-leave - 5
Can model evaluation be run against an online API?
#89 opened by MatheMatrix - 12
llmuses 0.3.2 fails when running the built-in datasets: ImportError: cannot import name '_datasets_server' from 'datasets.utils' (/data/anaconda3/envs/eval-scope/lib/python3.10/site-packages/datasets/utils/__init__.py)
#76 opened by jackqdldd - 3
llmuses 0.3.2/latest code fails when running the built-in datasets locally: floating point exception
#77 opened by jackqdldd - 2
The simple example in the documentation cannot be run: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_' (/data/anaconda3/lib/python3.11/site-packages/urllib3/util/ssl_.py)
#75 opened by jackqdldd