Note: this repository is no longer actively maintained.
banksy23/Code-Eval
Evaluate the code generation ability of code LLMs. Supports mainstream benchmarks such as HumanEval and MBPP. Chat templates and pre/post-processing hooks are defined as interfaces, making it easy to customize the logic that extracts valid, testable code snippets from your own model's output.
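A minimal sketch of what such an interface might look like. The names (`ModelAdapter`, `apply_chat_template`, `extract_code`, the chat tokens) are illustrative assumptions, not this repository's actual API; the idea is that each model supplies its own templating and its own rule for pulling runnable code out of a raw completion:

```python
import re
from abc import ABC, abstractmethod


class ModelAdapter(ABC):
    """Hypothetical per-model interface: chat templating plus post-processing."""

    @abstractmethod
    def apply_chat_template(self, prompt: str) -> str:
        """Wrap a benchmark prompt in this model's chat format."""

    @abstractmethod
    def extract_code(self, completion: str) -> str:
        """Extract the runnable code snippet from the raw completion."""


class MarkdownFenceAdapter(ModelAdapter):
    """Example implementation for models that answer inside markdown fences."""

    def apply_chat_template(self, prompt: str) -> str:
        # Assumed chat tokens for illustration only.
        return f"<|user|>\n{prompt}\n<|assistant|>\n"

    def extract_code(self, completion: str) -> str:
        # Take the first fenced code block if present, else the raw text.
        match = re.search(r"```(?:python)?\n(.*?)```", completion, re.DOTALL)
        return match.group(1).strip() if match else completion.strip()


adapter = MarkdownFenceAdapter()
raw = "Sure!\n```python\ndef add(a, b):\n    return a + b\n```\nHope it helps."
print(adapter.extract_code(raw))
```

Keeping extraction behind an interface means a new model only needs a small adapter class, while the benchmark harness (prompting, execution, pass@k scoring) stays unchanged.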