/TruthQuest

We introduce TruthQuest, a benchmark designed to evaluate the suppositional reasoning capabilities of large language models through knights and knaves puzzles.
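To make the puzzle type concrete, here is a minimal sketch of a brute-force knights and knaves solver. It is not taken from the TruthQuest code base; the function `solve`, the statement encoding, and the three-character puzzle are all hypothetical and chosen only for illustration. Knights always tell the truth and knaves always lie, so an assignment is consistent exactly when each speaker's role matches the truth value of their claim.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with all statements.

    `statements` maps a speaker's name to a function that takes an
    assignment (dict: name -> True if knight, False if knave) and returns
    whether the speaker's claim holds under that assignment.
    This encoding is an assumption made for this sketch.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        # Suppose this assignment is correct, then check it: a knight's
        # claim must be true and a knave's claim must be false.
        if all(assignment[speaker] == claim(assignment)
               for speaker, claim in statements.items()):
            solutions.append(assignment)
    return solutions


# Hypothetical puzzle (not from the benchmark):
#   A says: "B is a knave."
#   B says: "A and C are both knights."
#   C says: "B is a knight."
names = ["A", "B", "C"]
statements = {
    "A": lambda a: not a["B"],
    "B": lambda a: a["A"] and a["C"],
    "C": lambda a: a["B"],
}

solutions = solve(names, statements)
for solution in solutions:
    print({name: "knight" if is_knight else "knave"
           for name, is_knight in solution.items()})
```

Exhaustive enumeration stands in for suppositional reasoning here: each candidate assignment is a supposition that is kept only if it leads to no contradiction. For this puzzle the only surviving assignment makes A a knight and both B and C knaves.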

Primary language: Python
License: Creative Commons Attribution Share Alike 4.0 International (CC-BY-SA-4.0)
