open-compass/CriticBench
[NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs
Python · Apache-2.0
No issues in this repository yet.