/TurtleBench

TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles.

Primary Language: Jupyter Notebook
License: Apache License 2.0 (Apache-2.0)
