/arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Primary Language: Jupyter Notebook
License: Apache License 2.0 (Apache-2.0)
