Large Language Model-Based Agents for Software Engineering: A Survey

Recent advances in Large Language Models (LLMs) have shaped a new paradigm of AI agents, i.e., LLM-based agents. Compared to standalone LLMs, LLM-based agents substantially extend the versatility and expertise of LLMs by equipping them with the capabilities of perceiving and utilizing external resources and tools. To date, LLM-based agents have been applied to Software Engineering (SE) and have shown remarkable effectiveness. The synergy between multiple agents and human interaction brings further promise in tackling complex real-world SE problems. In this work, we present a comprehensive and systematic survey on LLM-based agents for SE. We collect 106 papers and categorize them from two perspectives, i.e., the SE and agent perspectives. In addition, we discuss open challenges and future directions in this critical domain.

📍 We systematically summarized the progress of Agent4SE from the perspectives of both Software Engineering tasks and Agent Architecture.

📄 Paper Link: Large Language Model-Based Agents for Software Engineering: A Survey

⭐ Star this repository

This research field is evolving rapidly; star this repository to keep up with the updates!

📰 News

[2024/09/04] 🎉 We released the first version of our survey on arXiv.

🏎️ Coming Soon

Append the repository link to each paper.
Complete the list of all papers from Agent Perspectives.
Provide an interactive table.

📰 News
🏎️ Coming Soon
🖥️ SE Perspectives
🤖 Agent Perspectives
📝 Citation
👨🏻‍💻 Maintainers
📬 Contact Us
🌟 Star History

🖥️ SE Perspectives

Requirements Engineering

[2024/05] MARE: Multi-Agents Collaboration Framework for Requirements Engineering. Jin et al. arXiv. [paper]
[2024/04] Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation. Ataei et al. arXiv. [paper]
[2024/01] SpecGen: Automated Generation of Formal Program Specifications via Large Language Models. Ma et al. arXiv. [paper] [repo]
[2023/10] Advancing Requirements Engineering through Generative AI: Assessing the Role of LLMs. Arora et al. arXiv. [paper]

Code Generation

[2024/05] Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository. Deshpande et al. arXiv. [paper]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/05] AutoCoder: Enhancing Code Large Language Model with AIEV-INSTRUCT. Lei et al. arXiv. [paper] [repo]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]
[2024/03] CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing. He et al. arXiv. [paper]
[2024/02] Executable Code Actions Elicit Better LLM Agents. Wang et al. ICML. [paper] [repo]
[2024/02] More Agents Is All You Need. Li et al. arXiv. [paper]
[2024/02] Test-Driven Development for Code Generation. Mathews et al. arXiv. [paper] [repo]
[2024/02] LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step. Zhong et al. arXiv. [paper] [repo]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]
[2024/01] Teaching Code LLMs to Use Autocompletion Tools in Repository-Level Code Generation. Wang et al. arXiv. [paper]
[2024/01] Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering. Ridnik et al. arXiv. [paper] [repo]
[2023/12] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation. Huang et al. arXiv. [paper]
[2023/12] LLM4TDD: Best Practices for Test Driven Development Using Large Language Models. Piya et al. arXiv. [paper] [repo]
[2023/11] INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair. Wang et al. ACL. [paper] [repo]
[2023/10] Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization. Liu et al. arXiv. [paper] [repo]
[2023/10] Lemur: Harmonizing Natural Language and Code for Language Agents. Xu et al. ICLR. [paper] [repo]
[2023/10] ClarifyGPT: Empowering LLM-based Code Generation with Intention Clarification. Mu et al. arXiv. [paper] [repo]
[2023/10] CODECHAIN: TOWARDS MODULAR CODE GENERATION THROUGH CHAIN OF SELF-REVISIONS WITH REPRESENTATIVE SUB-MODULES. Le et al. ICLR. [paper] [repo]
[2023/10] Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models. Zhou et al. ICML. [paper] [repo]
[2023/09] MINT: EVALUATING LLMS IN MULTI-TURN INTERACTION WITH TOOLS AND LANGUAGE FEEDBACK. Wang et al. ICLR. [paper] [repo]
[2023/09] Test-Case-Driven Programming Understanding in Large Language Models for Better Code Generation. Tian et al. arXiv. [paper]
[2023/09] CodePlan: Repository-level Coding using LLMs and Planning. Bairi et al. FSE. [paper] [repo]
[2023/09] From Misuse to Mastery: Enhancing Code Generation with Knowledge-Driven AI Chaining. Ren et al. ASE. [paper]
[2023/09] Parsel🐍: Algorithmic Reasoning with Language Models by Composing Decompositions. Zelikman et al. NeurIPS. [paper] [repo]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]
[2023/08] Gentopia: A Collaborative Platform for Tool-Augmented LLMs. Xu et al. EMNLP. [paper] [repo]
[2023/08] Flows: Building Blocks of Reasoning and Collaborating AI. Josifoski et al. arXiv. [paper] [repo]
[2023/08] CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation. Huang et al. arXiv. [paper]
[2023/06] SELFEVOLVE: A Code Evolution Framework via Large Language Models. Jiang et al. arXiv. [paper]
[2023/06] InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback. Yang et al. NeurIPS. [paper] [repo]
[2023/06] IS SELF-REPAIR A SILVER BULLET FOR CODE GENERATION?. Olausson et al. ICLR. [paper] [repo]
[2023/05] ToolCoder: Teach Code Generation Models to use API search tools. Zhang et al. arXiv. [paper]
[2023/04] Teaching Large Language Models to Self-Debug. Chen et al. ICLR. [paper]
[2023/04] Fully Autonomous Programming with Large Language Models. Liventsev et al. GECCO. [paper]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]
[2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning. Shinn et al. NeurIPS. [paper] [repo]
[2023/03] SELF-REFINE: Iterative Refinement with Self-Feedback. Madaan et al. NeurIPS. [paper] [repo]

Static Code Checking

Static Bug Detection

[2024/05] LLM-Assisted Static Analysis for Detecting Security Vulnerabilities. Li et al. arXiv. [paper]
[2024/03] Multi-role Consensus through LLMs Discussions for Vulnerability Detection. Mao et al. arXiv. [paper]
[2024/01] LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning. Sun et al. arXiv. [paper] [repo]
[2023/12] E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. Hao et al. arXiv. [paper]
[2023/10] Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives. Hu et al. TPS-ISA. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/08] Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach. Li et al. arXiv. [paper] [repo]
[2023/03] ART: Automatic multi-step reasoning and tool-use for large language models. Paranjape et al. arXiv. [paper] [repo]

Code Review

[2024/04] AI-powered Code Review with LLMs: Early Results. Rasheed et al. arXiv. [paper]
[2024/02] CodeAgent: Collaborative Agents for Software Engineering. Tang et al. arXiv. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/09] CORE: Resolving Code Quality Issues using LLMs. Wadhwa et al. FSE. [paper] [repo]

Testing

Unit Testing

[2024/04] Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis. Yang et al. arXiv. [paper]
[2024/03] COVERUP: Coverage-Guided LLM-Based Test Generation. Pizzorno et al. arXiv. [paper] [repo]
[2023/08] Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing. Dakhel et al. Inf. Softw. Technol. [paper] [repo]
[2023/05] No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation. Yuan et al. arXiv. [paper] [repo]
[2023/05] ChatUniTest: A Framework for LLM-Based Test Generation. Chen et al. FSE. [paper] [repo]
[2023/02] An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation. Schäfer et al. IEEE Trans. Software Eng. [paper] [repo]

System Testing

[2024/04] LLM Agents can Autonomously Exploit One-day Vulnerabilities. Fang et al. arXiv. [paper]
[2024/02] You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models. Decrop et al. arXiv. [paper] [repo]
[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2024/01] KernelGPT: Enhanced Kernel Fuzzing via Large Language Models. Yang et al. arXiv. [paper]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions. Liu et al. ICSE. [paper]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/10] White-box Compiler Fuzzing Empowered by Large Language Models. Yang et al. arXiv. [paper] [repo]
[2023/08] PENTESTGPT: An LLM-empowered Automatic Penetration Testing Tool. Deng et al. arXiv. [paper] [repo]
[2023/08] Fuzz4All: Universal Fuzzing with Large Language Models. Xia et al. ICSE. [paper] [repo]
[2023/07] Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models. Tu et al. IEEE Trans. Software Eng. [paper] [repo]
[2023/06] Prompting Is All You Need: Automated Android Bug Replay with Large Language Models. Feng et al. ICSE. [paper] [repo]

Debugging

Fault Localization

[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/08] A Preliminary Evaluation of LLM-Based Fault Localization. Kang et al. arXiv. [paper]

Program Repair

[2024/04] Flakiness Repair in the Era of Large Language Models. Chen et al. ICSE. [paper]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]
[2024/03] ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts. Zhang et al. arXiv. [paper]
[2024/02] CigaR: Cost-efficient Program Repair with LLMs. Hidvégi et al. arXiv. [paper] [repo]
[2023/04] Explainable Automated Debugging via Large Language Model-driven Scientific Debugging. Kang et al. arXiv. [paper]
[2023/04] Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT. Xia et al. arXiv. [paper]
[2023/01] Conversational Automated Program Repair. Xia et al. arXiv. [paper]

Unified Debugging

[2024/04] A Unified Debugging Approach via LLM-Based Multi-Agent Synergy. Lee et al. arXiv. [paper] [repo]
[2024/02] LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step. Zhong et al. arXiv. [paper] [repo]

End-to-end Software Development

[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] Iterative Experience Refinement of Software-Developing Agents. Qian et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2024/01] LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. Fakih et al. ICSE. [paper] [repo]
[2023/12] Experiential Co-Learning of Software-Developing Agents. Qian et al. ACL. [paper] [repo]
[2023/09] AutoAgents: A Framework for Automatic Agent Generation. Chen et al. arXiv. [paper] [repo]
[2023/08] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. Chen et al. ICLR. [paper] [repo]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/06] MULTI-AGENT COLLABORATION: HARNESSING THE POWER OF INTELLIGENT LLM AGENTS. Talebirad et al. arXiv. [paper]
[2023/06] Prompt Sapper: LLM-Empowered Software Engineering Infrastructure for AI-Native Services. Xing et al. arXiv. [paper]
[2023/04] Self-collaboration Code Generation via ChatGPT. Dong et al. arXiv. [paper] [repo]
[2023/04] Low-code LLM: Visual Programming over LLMs. Cai et al. arXiv. [paper] [repo]

End-to-end Software Maintenance

[2024/07] Agentless: Demystifying LLM-based Software Engineering Agents. Xia et al. arXiv. [paper] [repo]
[2024/06] How to Understand Whole Software Repository?. Ma et al. arXiv. [paper] [repo]
[2024/06] CODER: ISSUE RESOLVING WITH MULTI-AGENT AND TASK GRAPHS. Chen et al. arXiv. [paper] [repo]
[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]
[2024/05] SWE-AGENT: AGENT-COMPUTER INTERFACES ENABLE AUTOMATED SOFTWARE ENGINEERING. Yang et al. arXiv. [paper] [repo]
[2024/04] AutoCodeRover: Autonomous Program Improvement. Zhang et al. arXiv. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]

🤖 Agent Perspectives

Agent Framework

Planning

Single-turn Planning

[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2024/03] CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing. He et al. arXiv. [paper]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]
[2024/01] LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. Fakih et al. ICSE. [paper] [repo]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/09] Parsel🐍: Algorithmic Reasoning with Language Models by Composing Decompositions. Zelikman et al. NeurIPS. [paper] [repo]
[2023/08] PENTESTGPT: An LLM-empowered Automatic Penetration Testing Tool. Deng et al. arXiv. [paper] [repo]
[2023/08] Flows: Building Blocks of Reasoning and Collaborating AI. Josifoski et al. arXiv. [paper] [repo]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/04] Self-collaboration Code Generation via ChatGPT. Dong et al. arXiv. [paper] [repo]
[2023/04] Low-code LLM: Visual Programming over LLMs. Cai et al. arXiv. [paper] [repo]

Multi-turn Planning

ReAct-like

[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]
[2024/02] Executable Code Actions Elicit Better LLM Agents. Wang et al. ICML. [paper] [repo]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]
[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/10] Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models. Zhou et al. ICML. [paper] [repo]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/09] CodePlan: Repository-level Coding using LLMs and Planning. Bairi et al. FSE. [paper] [repo]

Layered

[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]

Memory

Long-term Memory

[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/05] Iterative Experience Refinement of Software-Developing Agents. Qian et al. arXiv. [paper]
[2023/12] Experiential Co-Learning of Software-Developing Agents. Qian et al. ACL. [paper] [repo]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/09] AutoAgents: A Framework for Automatic Agent Generation. Chen et al. arXiv. [paper] [repo]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning. Shinn et al. NeurIPS. [paper] [repo]

Short-term Memory

[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/12] E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. Hao et al. arXiv. [paper]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/10] Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions. Liu et al. ICSE. [paper]
[2023/09] CodePlan: Repository-level Coding using LLMs and Planning. Bairi et al. FSE. [paper] [repo]
[2023/09] AutoAgents: A Framework for Automatic Agent Generation. Chen et al. arXiv. [paper] [repo]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning. Shinn et al. NeurIPS. [paper] [repo]

Shared Memory: A special kind of Short-term Memory

[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] MARE: Multi-Agents Collaboration Framework for Requirements Engineering. Jin et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/04] Self-collaboration Code Generation via ChatGPT. Dong et al. arXiv. [paper] [repo]

Perception

Visual Input

[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]

Action

Searching Tools

[2023/03] ART: Automatic multi-step reasoning and tool-use for large language models. Paranjape et al. arXiv. [paper] [repo]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]
[2024/03] CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing. He et al. arXiv. [paper]
[2023/10] Lemur: Harmonizing Natural Language and Code for Language Agents. Xu et al. ICLR. [paper] [repo]
[2023/12] E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. Hao et al. arXiv. [paper]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/08] PENTESTGPT: An LLM-empowered Automatic Penetration Testing Tool. Deng et al. arXiv. [paper] [repo]
[2024/04] LLM Agents can Autonomously Exploit One-day Vulnerabilities. Fang et al. arXiv. [paper]
[2023/08] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. Chen et al. ICLR. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/08] Gentopia: A Collaborative Platform for Tool-Augmented LLMs. Xu et al. EMNLP. [paper] [repo]
[2024/05] Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository. Deshpande et al. arXiv. [paper]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/12] Experiential Co-Learning of Software-Developing Agents. Qian et al. ACL. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]
[2023/05] ToolCoder: Teach Code Generation Models to use API search tools. Zhang et al. arXiv. [paper]
[2024/01] LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning. Sun et al. arXiv. [paper] [repo]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]

File Operation

[2024/04] LLM Agents can Autonomously Exploit One-day Vulnerabilities. Fang et al. arXiv. [paper]
[2024/05] SWE-AGENT: AGENT-COMPUTER INTERFACES ENABLE AUTOMATED SOFTWARE ENGINEERING. Yang et al. arXiv. [paper] [repo]
[2023/04] Explainable Automated Debugging via Large Language Model-driven Scientific Debugging. Kang et al. arXiv. [paper]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]
[2024/05] LLM-Assisted Static Analysis for Detecting Security Vulnerabilities. Li et al. arXiv. [paper]
[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]

GUI Operation

[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/10] Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions. Liu et al. ICSE. [paper]
[2023/06] Prompting Is All You Need: Automated Android Bug Replay with Large Language Models. Feng et al. ICSE. [paper] [repo]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]

Static Program Analysis

[2023/07] Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models. Tu et al. IEEE Trans. Software Eng. [paper] [repo]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]
[2023/09] CodePlan: Repository-level Coding using LLMs and Planning. Bairi et al. FSE. [paper] [repo]
[2023/08] CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation. Huang et al. arXiv. [paper]
[2024/01] LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. Fakih et al. ICSE. [paper] [repo]
[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2023/12] E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. Hao et al. arXiv. [paper]
[2024/03] COVERUP: Coverage-Guided LLM-Based Test Generation. Pizzorno et al. arXiv. [paper] [repo]
[2023/06] Prompting Is All You Need: Automated Android Bug Replay with Large Language Models. Feng et al. ICSE. [paper] [repo]
[2024/04] Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis. Yang et al. arXiv. [paper]
[2024/05] Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository. Deshpande et al. arXiv. [paper]
[2024/02] LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step. Zhong et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] LLM-Assisted Static Analysis for Detecting Security Vulnerabilities. Li et al. arXiv. [paper]
[2024/04] AutoCodeRover: Autonomous Program Improvement. Zhang et al. arXiv. [paper] [repo]
[2024/01] Teaching Code LLMs to Use Autocompletion Tools in Repository-Level Code Generation. Wang et al. arXiv. [paper]
[2024/03] ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts. Zhang et al. arXiv. [paper]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]
[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]
[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]

Dynamic Analysis

[2024/04] Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis. Yang et al. arXiv. [paper]
[2024/02] LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step. Zhong et al. arXiv. [paper] [repo]
[2023/04] Explainable Automated Debugging via Large Language Model-driven Scientific Debugging. Kang et al. arXiv. [paper]
[2023/07] Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models. Tu et al. IEEE Trans. Software Eng. [paper] [repo]
[2024/03] COVERUP: Coverage-Guided LLM-Based Test Generation. Pizzorno et al. arXiv. [paper] [repo]
[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]

Testing Tools

[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2023/03] ART: Automatic multi-step reasoning and tool-use for large language models. Paranjape et al. arXiv. [paper] [repo]
[2023/04] Fully Autonomous Programming with Large Language Models. Liventsev et al. GECCO. [paper]
[2023/08] METAGPT: META PROGRAMMING FOR A MULTI-AGENT COLLABORATIVE FRAMEWORK. Hong et al. ICLR. [paper] [repo]
[2023/12] LLM4TDD: Best Practices for Test Driven Development Using Large Language Models. Piya et al. arXiv. [paper] [repo]
[2023/06] SELFEVOLVE: A Code Evolution Framework via Large Language Models. Jiang et al. arXiv. [paper]
[2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning. Shinn et al. NeurIPS. [paper] [repo]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]
[2023/12] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation. Huang et al. arXiv. [paper]
[2023/11] INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair. Wang et al. ACL. [paper] [repo]
[2023/04] Explainable Automated Debugging via Large Language Model-driven Scientific Debugging. Kang et al. arXiv. [paper]
[2024/05] AutoCoder: Enhancing Code Large Language Model with AIEV-INSTRUCT. Lei et al. arXiv. [paper] [repo]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]
[2024/03] CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing. He et al. arXiv. [paper]
[2023/10] Lemur: Harmonizing Natural Language and Code for Language Agents. Xu et al. ICLR. [paper] [repo]
[2023/08] Flows: Building Blocks of Reasoning and Collaborating AI. Josifoski et al. arXiv. [paper] [repo]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2023/04] Teaching Large Language Models to Self-Debug. Chen et al. ICLR. [paper]
[2023/08] CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation. Huang et al. arXiv. [paper]
[2024/02] Executable Code Actions Elicit Better LLM Agents. Wang et al. ICML. [paper] [repo]
[2024/02] Test-Driven Development for Code Generation. Mathews et al. arXiv. [paper] [repo]
[2023/05] No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation. Yuan et al. arXiv. [paper] [repo]
[2024/04] A Unified Debugging Approach via LLM-Based Multi-Agent Synergy. Lee et al. arXiv. [paper] [repo]
[2024/04] LLM Agents can Autonomously Exploit One-day Vulnerabilities. Fang et al. arXiv. [paper]
[2023/10] White-box Compiler Fuzzing Empowered by Large Language Models. Yang et al. arXiv. [paper] [repo]
[2024/04] Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis. Yang et al. arXiv. [paper]
[2023/08] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. Chen et al. ICLR. [paper] [repo]
[2023/06] InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback. Yang et al. NeurIPS. [paper] [repo]
[2024/01] Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering. Ridnik et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2023/06] Is Self-Repair a Silver Bullet for Code Generation? Olausson et al. ICLR. [paper] [repo]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2024/04] AutoCodeRover: Autonomous Program Improvement. Zhang et al. arXiv. [paper] [repo]
[2023/08] Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing. Dakhel et al. Inf. Softw. Technol. [paper] [repo]
[2023/09] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback. Wang et al. ICLR. [paper] [repo]
[2024/04] Flakiness Repair in the Era of Large Language Models. Chen et al. ICSE. [paper]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]
[2023/02] An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation. Schäfer et al. IEEE Trans. Software Eng. [paper] [repo]
[2023/09] Test-Case-Driven Programming Understanding in Large Language Models for Better Code Generation. Tian et al. arXiv. [paper]
[2023/10] ClarifyGPT: Empowering LLM-based Code Generation with Intention Clarification. Mu et al. arXiv. [paper] [repo]
[2023/01] Conversational Automated Program Repair. Xia et al. arXiv. [paper]
[2024/01] CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. Zhang et al. ACL. [paper]
[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]

Fault Localization Tools

[2024/04] AutoCodeRover: Autonomous Program Improvement. Zhang et al. arXiv. [paper] [repo]
[2024/03] RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. Bouzenia et al. arXiv. [paper]

Multi-agent System

Agent Roles

Manager Roles

[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] Iterative Experience Refinement of Software-Developing Agents. Qian et al. arXiv. [paper]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/02] CodeAgent: Collaborative Agents for Software Engineering. Tang et al. arXiv. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2023/12] Experiential Co-Learning of Software-Developing Agents. Qian et al. ACL. [paper] [repo]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/09] AutoAgents: A Framework for Automatic Agent Generation. Chen et al. arXiv. [paper] [repo]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/04] Low-code LLM: Visual Programming over LLMs. Cai et al. arXiv. [paper] [repo]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]

Requirement Analyzing Roles

[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] MARE: Multi-Agents Collaboration Framework for Requirements Engineering. Jin et al. arXiv. [paper]
[2024/04] Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation. Ataei et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/06] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents. Talebirad et al. arXiv. [paper]
[2023/04] Self-collaboration Code Generation via ChatGPT. Dong et al. arXiv. [paper] [repo]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]

Designer Roles

[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/08] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. Chen et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/06] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents. Talebirad et al. arXiv. [paper]

Developer Roles

[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] AutoCoder: Enhancing Code Large Language Model with AIEV-INSTRUCT. Lei et al. arXiv. [paper] [repo]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]
[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/02] Test-Driven Development for Code Generation. Mathews et al. arXiv. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/12] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation. Huang et al. arXiv. [paper]
[2023/11] INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair. Wang et al. ACL. [paper] [repo]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/08] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. Chen et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/06] Is Self-Repair a Silver Bullet for Code Generation? Olausson et al. ICLR. [paper] [repo]
[2023/06] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents. Talebirad et al. arXiv. [paper]
[2023/04] Self-collaboration Code Generation via ChatGPT. Dong et al. arXiv. [paper] [repo]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]

Software Quality Assurance Roles

[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]
[2024/05] AutoCoder: Enhancing Code Large Language Model with AIEV-INSTRUCT. Lei et al. arXiv. [paper] [repo]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/04] AI-powered Code Review with LLMs: Early Results. Rasheed et al. arXiv. [paper]
[2024/04] A Unified Debugging Approach via LLM-Based Multi-Agent Synergy. Lee et al. arXiv. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/03] ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts. Zhang et al. arXiv. [paper]
[2024/02] CodeAgent: Collaborative Agents for Software Engineering. Tang et al. arXiv. [paper] [repo]
[2024/02] Test-Driven Development for Code Generation. Mathews et al. arXiv. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/12] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation. Huang et al. arXiv. [paper]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives. Hu et al. TPS-ISA. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/10] White-box Compiler Fuzzing Empowered by Large Language Models. Yang et al. arXiv. [paper] [repo]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/06] Is Self-Repair a Silver Bullet for Code Generation? Olausson et al. ICLR. [paper] [repo]
[2023/06] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents. Talebirad et al. arXiv. [paper]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]

Assistant Roles

[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/03] MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution. Tao et al. arXiv. [paper]
[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]

Collaboration Mechanism

Layered Structure

[2024/06] Multi-Agent Software Development through Cross-Team Collaboration. Du et al. arXiv. [paper] [repo]
[2024/06] AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology. Nguyen et al. arXiv. [paper] [repo]
[2024/05] MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. Islam et al. ACL. [paper] [repo]
[2024/05] MARE: Multi-Agents Collaboration Framework for Requirements Engineering. Jin et al. arXiv. [paper]
[2024/04] AutoCodeRover: Autonomous Program Improvement. Zhang et al. arXiv. [paper] [repo]
[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2024/03] When LLM-based Code Generation Meets the Software Development Process. Lin et al. arXiv. [paper] [repo]
[2024/03] AGENTFL: Scaling LLM-based Fault Localization to Project-Level Context. Qin et al. arXiv. [paper]
[2024/02] CodeAgent: Collaborative Agents for Software Engineering. Tang et al. arXiv. [paper] [repo]
[2024/02] More Agents Is All You Need. Li et al. arXiv. [paper]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/10] Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives. Hu et al. TPS-ISA. [paper] [repo]
[2023/10] White-box Compiler Fuzzing Empowered by Large Language Models. Yang et al. arXiv. [paper] [repo]
[2023/10] Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization. Liu et al. arXiv. [paper] [repo]
[2023/08] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. Hong et al. ICLR. [paper] [repo]
[2023/08] Flows: Building Blocks of Reasoning and Collaborating AI. Josifoski et al. arXiv. [paper] [repo]
[2023/07] Communicative Agents for Software Development. Qian et al. ACL. [paper] [repo]
[2023/04] Low-code LLM: Visual Programming over LLMs. Cai et al. arXiv. [paper] [repo]

Circular Structure

[2024/05] AutoCoder: Enhancing Code Large Language Model with AIEV-INSTRUCT. Lei et al. arXiv. [paper] [repo]
[2024/04] A Unified Debugging Approach via LLM-Based Multi-Agent Synergy. Lee et al. arXiv. [paper] [repo]
[2024/03] ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts. Zhang et al. arXiv. [paper]
[2024/03] Multi-role Consensus through LLMs Discussions for Vulnerability Detection. Mao et al. arXiv. [paper]
[2024/02] Test-Driven Development for Code Generation. Mathews et al. arXiv. [paper] [repo]
[2024/02] CodePori: Large Scale Model for Autonomous Software Development by Using Multi-Agents. Rasheed et al. arXiv. [paper]
[2023/12] Experiential Co-Learning of Software-Developing Agents. Qian et al. ACL. [paper] [repo]
[2023/12] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation. Huang et al. arXiv. [paper]
[2023/11] INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair. Wang et al. ACL. [paper] [repo]
[2023/11] Autonomous Large Language Model Agents Enabling Intent-Driven Mobile GUI Testing. Yoon et al. arXiv. [paper] [repo]
[2023/10] AXNav: Replaying Accessibility Tests from Natural Language. Taeb et al. CHI. [paper]
[2023/06] Is Self-Repair a Silver Bullet for Code Generation? Olausson et al. ICLR. [paper] [repo]
[2023/03] CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society. Li et al. NeurIPS. [paper] [repo]
[2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning. Shinn et al. NeurIPS. [paper] [repo]

Tree-like Structure

[2024/06] MASAI: Modular Architecture for Software-engineering AI Agents. Arora et al. arXiv. [paper]
[2024/04] Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization. Ishibashi et al. arXiv. [paper] [repo]

Star-like Structure

[2024/01] XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model. Wang et al. arXiv. [paper]
[2023/10] RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models. Wang et al. arXiv. [paper]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]

Human-Agent Collaboration

Planning Phase

[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2024/01] LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. Fakih et al. ICSE. [paper] [repo]
[2023/10] Static Code Analysis in the AI Era: An In-depth Exploration of the Concept, Function, and Potential of Intelligent Code Analysis. Fan et al. arXiv. [paper]
[2023/04] Low-code LLM: Visual Programming over LLMs. Cai et al. arXiv. [paper] [repo]

Requirements Phase

[2024/05] MARE: Multi-Agents Collaboration Framework for Requirements Engineering. Jin et al. arXiv. [paper]
[2024/02] Executable Code Actions Elicit Better LLM Agents. Wang et al. ICML. [paper] [repo]
[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/10] ClarifyGPT: Empowering LLM-based Code Generation with Intention Clarification. Mu et al. arXiv. [paper] [repo]
[2023/06] Prompt Sapper: LLM-Empowered Software Engineering Infrastructure for AI-Native Services. Xing et al. arXiv. [paper]

Development Phase

[2024/03] CodeS: Natural Language to Code Repository via Multi-Layer Sketch. Zan et al. arXiv. [paper] [repo]
[2024/01] LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. Fakih et al. ICSE. [paper] [repo]
[2023/09] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback. Wang et al. ICLR. [paper] [repo]
[2023/08] Flows: Building Blocks of Reasoning and Collaborating AI. Josifoski et al. arXiv. [paper] [repo]
[2023/08] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. Wu et al. arXiv. [paper] [repo]

Evaluation Phase

[2024/01] Experimenting a New Programming Practice with LLMs. Zhang et al. arXiv. [paper] [repo]
[2023/08] Gentopia: A Collaborative Platform for Tool-Augmented LLMs. Xu et al. EMNLP. [paper] [repo]
[2023/06] Prompt Sapper: LLM-Empowered Software Engineering Infrastructure for AI-Native Services. Xing et al. arXiv. [paper]
[2023/03] ART: Automatic multi-step reasoning and tool-use for large language models. Paranjape et al. arXiv. [paper] [repo]

📝 Citation

@misc{Agent4SE,
      title={Large Language Model-Based Agents for Software Engineering: A Survey}, 
      author={Junwei Liu and Kaixin Wang and Yixuan Chen and Xin Peng and Zhenpeng Chen and Lingming Zhang and Yiling Lou},
      year={2024},
      eprint={2409.02977},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2409.02977}, 
}

👨🏻‍💻 Maintainers

Junwei Liu @To-D
Kaixin Wang @wkx228
Yixuan Chen @FloridaSpidee

📬 Contact Us

Feel free to send us questions or suggestions via:

Junwei Liu: jwliu24@m.fudan.edu.cn
