vectara/hallucination-leaderboard

Claude 3.5 Sonnet New

sushantnair opened this issue · 2 comments

Hello maintainers,

Can you please update the leaderboard to include the new Claude 3.5 Sonnet, released a few days ago?

Thanks.

Hey @sushantnair - it's already updated. The results reflect the latest Claude 3.5 Sonnet.

Hey @ofermend,

Thanks a lot for the quick reply!

I'm honestly surprised that even the latest version of that model, named "Claude 3.5 Sonnet (New)", with that extra "New" in parentheses, still has a 4.5% hallucination rate!

Yet in my experience, its coding and debugging are outstanding. And the best perk of using Anthropic's models is the inherent safety that comes with using them.

Anyway, thanks again!