hkust-nlp/AgentBoard

Any plans to add new models?


Hi there,

Thank you for the great contributions!

There have been many new models released since the benchmark was published. Do you have any plans to include some of these recent models, such as GPT-4o, Claude-3.5, Llama-3.1 405B, Mistral Large 2, DeepSeek V2, and others? Adding results from these models could provide significant value to the community!

Thanks

Yes, we have recently tested Llama3-8B/70B, Claude-3, and Gemini Pro, and will merge the results soon. We will continue to add the important models you suggested.