How do native LLMs perform on this benchmark?

Question

YenFuLin opened this issue 7 months ago · 1 comment

Hi,
I'm wondering why this benchmark doesn't include results for native LLMs (such as llama2 and llama3).
Do you plan to add these results to this work?

Answer 1 · 2024-06-19T07:13:12.000Z

Hi, thank you for your question.

We have not tested these open-source models yet, but doing so is on the roadmap.