Add Claude 3.5 Haiku
EwoutH commented
Anthropic just released Claude 3.5 Haiku, and I’m very curious how it scores!
They claim 65.0% overall.
Wyyyb commented
Thanks. We have added its evaluation result to our leaderboard and the model output to our git repo.
EwoutH commented
Thanks!
It’s great that you guys do independent verification, because your measured
62.1% is 2.9 percentage points lower than the 65.0% that Anthropic claimed.