A survey of code agents for improving development productivity. These agents aim to help:
- SWE (Software Engineer)
- MLE (Machine Learning Engineer)
- DS (Data Scientist)
- DA (Data Analyst)

Contents:
- Paper with Code
- Open-source Projects / Company Products
- Foundation Models

Paper with Code

| Paper | Year | Publisher | Type | Institution | Code |
|---|---|---|---|---|---|
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | 2024 | arXiv | GitHub issue fixing | Princeton | |
| SWE-bench: Can Language Models Resolve Real-World GitHub Issues? | 2024 | ICLR | Benchmark for GitHub issue fixing | Princeton | |
| DevBench: A Comprehensive Benchmark for Software Development | 2024 | arXiv | Benchmark for LLMs across the development lifecycle | Shanghai AI Laboratory etc. | |
| Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering | 2024 | arXiv | Code generation for CodeContests | CodiumAI | |
| ChatDev: Communicative Agents for Software Development | 2023 | arXiv | Design, coding, and testing | Tsinghua University etc. | |
| MetaGPT: The Multi-Agent Framework | 2023 | arXiv | Multi-agent framework, using software collaboration as an example | DeepWisdom etc. | |
| Data Interpreter: An LLM Agent For Data Science | 2024 | arXiv | Solving data science problems | DeepWisdom etc. | |
| Agentless: Demystifying LLM-based Software Engineering Agents | 2024 | arXiv | Agentless method for SWE-bench | UIUC | |
| DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation | 2023 | ICML | Code generation benchmark with a thousand data science questions | The University of Hong Kong etc. | |
| AutoCodeRover: Autonomous Program Improvement | 2024 | arXiv | | NUS | |
Open-source Projects / Company Products

| Name | Type | Target | Contributor | Code / Product |
|---|---|---|---|---|
| OpenDevin: Code Less, Make More | Write code, fix bugs, and ship features | SWE | OpenDevin Community | |
| Devon: An open-source pair programmer | Codebase exploration, config writing, test writing, bug fixing, architecture exploration | SWE | entropy-research | |
| gpt-engineer | Write and execute software code | SWE | gpt-engineer-org | |
| Aider: AI pair programming in your terminal | Start a new project or work with an existing git repo | SWE | paul-gauthier | |
| Cover-Agent | Automate and enhance the generation of tests (currently mostly unit tests) | QA Engineer | CodiumAI | |
| PR-Agent | Automated pull request analysis, feedback, and suggestions | SWE | CodiumAI | |
| GPT PILOT | VS Code extension that aims to provide the first real AI developer companion | SWE | Pythagora-io | |
| Claude Engineer | Assist with a wide range of software development tasks | SWE | Doriandarko | |
| Cognition AI | An applied AI lab building end-to-end software agents | SWE | cognition.ai | Product |
| Tabby | Self-hosted AI coding assistant, GitHub Copilot alternative | SWE | tabbyml | |
| Sweep AI | Issue-to-PR, unit tests | SWE | Sweep AI | Product |
| Continue AI | GitHub Copilot alternative, VS Code and JetBrains extension | SWE | Continue | |
| Hex Magic AI | Text to SQL/Python data analysis code; copilot to understand and fix code issues | MLE / DS | Hex | Product |
| datagpt | Chatbot to SQL, automatic data analysis | DA | datagpt | Product |
| pandas-ai | Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.) | DA | pandas AI | |
| bito | Generate, explain, and review code | SWE | bito | Product |
Foundation Model (Code Specific)

| Name | Paper | Year | Blog | Institution | GitHub |
|---|---|---|---|---|---|
| Code Llama: Open Foundation Models for Code | arxiv | 2023 | link | Meta | |
| CodeQwen1.5-7B | arxiv | 2024 | link | Alibaba | |
| StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation | N.A. | 2024 | link | HuggingFace etc. | |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | arxiv | 2024 | link | DeepSeek | |
| Codestral | N.A. | 2024 | link | Mistral | Hugging Face |