Pinned Repositories
codefuse-evaluation
Industrial-level evaluation benchmarks for coding LLMs across the full life cycle of AI-native software development. Enterprise-grade evaluation framework for code LLMs, continuously being opened up.
SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.29% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.
codellama
Inference code for CodeLlama models
CodeGeeX2
CodeGeeX2: A More Powerful Multilingual Code Generation Model
HotSummer888's Repositories
HotSummer888/codefuse-evaluation
HotSummer888/SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.29% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.