/MR-GSM8K

Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs

Primary Language: Python
