humaneval
There are 8 repositories under the humaneval topic.
bin123apple/AutoCoder
We introduced a new model designed for code generation. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.
the-crypt-keeper/can-ai-code
Self-evaluating interview for AI coders
SkyWorkAIGC/SkyCode-AI-CodeX-GPT3
SkyCode is a multilingual open-source large programming model based on the GPT-3 architecture. It supports Java, JavaScript, C, C++, Python, Go, shell, and other mainstream languages, and can understand Chinese comments. The model can complete code and has strong problem-solving ability, freeing you to focus on more important problems.
abacaj/code-eval
Run evaluations of LLMs on the HumanEval benchmark
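Repositories like this one typically report pass@k, the standard HumanEval metric from "Evaluating Large Language Models Trained on Code": the probability that at least one of k samples passes the unit tests, estimated without bias from n generated samples of which c are correct. A minimal sketch of that estimator (the function name here is illustrative, not from any of the listed repos):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator for n samples with c correct.

    Computes 1 - C(n-c, k) / C(n, k): one minus the probability that
    all k drawn samples are incorrect.
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # must include at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 samples of which 1 is correct, pass@1 is 0.5; evaluation harnesses average this quantity over all 164 HumanEval problems.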
zorse-project/COBOLEval
Evaluate LLM-generated COBOL
abhigupta2909/LLMPerformanceLab
Performance analysis of LLMs on CPU and GPU, measuring execution time and energy usage
talmago/30-seconds-of-code-eval
Code evaluation with *30-seconds-of-code* examples, inspired by "Evaluating Large Language Models Trained on Code"
MousaMohammad/Evaluation-Code-Generator-LLMs
JetBrains Task: Leveraging software evolution data with LLMs