open-compass/T-Eval

Evaluate Claude 3

Hi ppl,
Anthropic has released a new Claude model series, and two of the models are already available to everyone through the API.
They claim these models have «more advanced agentic capabilities». Could you please evaluate them and update the leaderboard?

Thanks in advance!