A Diagnostic Benchmark for Evaluating Logical Robustness of Deductive Reasoners
Primary language: Python
License: MIT