Papers for Repo-Level Code Generation

This repo maintains a list of papers on repo-level code generation.

Feel free to create a pull request to add more.

Papers

06/2024: Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback
- iteratively retrieve the context based on the compiler feedback
06/2024: GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model
- Slicing the blocks/lines of code as retrieved context
06/2024: R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models
- A repo-level completion benchmark and a code completion framework powered by context retrieval and prompt assembly.
06/2024: Enhancing Repository-Level Code Generation with Integrated Contextual Information
- Designed for statically typed programming languages. Integrates relevant code and type context.
05/2024: Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion
- Accepted at ACL 2024. Extracts entities and their relations to obtain the context graph
03/2024: Repoformer: Selective Retrieval for Repository-Level Code Completion
- A pre-training approach to tackle repo-level code retrieval
02/2024: Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context
- Leverages cross-file information from the IDE so that the LLM can perform repo-level code generation.
01/2024: CODEAGENT: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges (from software engineering)
- Proposes a benchmark for evaluation.
- Method: uses tools to retrieve context, rather than similarity-based retrieval
12/2023: Context-Aware Code Generation Framework for Code Repositories: Local, Global, and Third-Party Library Awareness (from software engineering)
- Focus on enhancing the retrieval process (based on GPT-3.5-Turbo)
11/2023: Guiding Language Models of Code with Global Context using Monitors
- Maintain a monitor while performing repo-level code generation
10/2023: CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
- A benchmark that requires cross-file reasoning
09/2023: CodePlan: Repository-level Coding using LLMs and Planning
- plan first, then execute
06/2023: RepoFusion: Training Code Models to Understand Your Repository
- trained to understand the whole repo
03/2023: RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation (EMNLP 2023)
- Iteratively retrieve code from the repo based on similarity, until the code is correct
03/2023: InferFix: End-to-End Program Repair with LLMs (ICSE)
- Queries a database of historical bugs and fixes to retrieve similar examples
06/2022: Repository-Level Prompt Generation for Large Language Models of Code (ICML 2023)
- Generates the prompt from the complete repo by classifying over a list of prompt proposals

Survey Papers

Relevant Agent-based Papers

08/2023: MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
07/2023: Communicative Agents for Software Development
