/TurtleBenchmark

A novel LLM benchmark focused on evaluating model reasoning and understanding.

Primary Language: Python