Pinned Repositories
opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
CFinBench-Eval
CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models
yanbinwei's Repositories
yanbinwei/CFinBench-Eval
CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models