Must-read papers on Repository-level Code Generation 🔥

MIT License

🤖✨ Awesome Repository-Level Code Generation ✨🤖

🌟 A curated list of research papers and resources on repository-level code generation. Contributions are welcome: to add a paper or resource, please send a pull request. 🚀

📚 Contents

💥 Repo-Level Issue Resolution

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [📄 paper] [🔗 repo]
  • How to Understand Whole Software Repository? [2024-arXiv] [📄 paper]

🤖 Repo-Level Code Completion

  • Fully Autonomous Programming with Large Language Models [2023-GECCO] [📄 paper] [🔗 repo]

  • Repository-Level Prompt Generation for Large Language Models of Code [2023-ICML] [📄 paper] [🔗 repo]

  • RepoFusion: Training Code Models to Understand Your Repository [2023-arXiv] [📄 paper] [🔗 repo]

  • RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [2023-EMNLP] [📄 paper] [🔗 repo]

  • Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context [2023-NeurIPS] [📄 paper] [🔗 repo]

  • CodePlan: Repository-Level Coding using LLMs and Planning [2024-FSE] [📄 paper] [🔗 repo]

  • Repoformer: Selective Retrieval for Repository-Level Code Completion [2024-ICML] [📄 paper] [🔗 repo]

  • Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback [2024-arXiv] [📄 paper] [🔗 repo]

  • Natural Language to Class-level Code Generation by Iterative Tool-augmented Reasoning over Repository [2024-ICML] [📄 paper] [🔗 repo]

  • R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models [2024-arXiv] [📄 paper]

  • Enhancing Repository-Level Code Generation with Integrated Contextual Information [2024-arXiv] [📄 paper]

  • STALL+: Boosting LLM-based Repository-level Code Completion with Static Analysis [2024-arXiv] [📄 paper]

  • Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs [2024-arXiv] [📄 paper] [🔗 repo]

  • RepoMinCoder: Improving Repository-Level Code Generation Based on Information Loss Screening [2024-Internetware] [📄 paper]

  • RLCoder: Reinforcement Learning for Repository-Level Code Completion [2025-ICSE] [📄 paper] [🔗 repo]

  • RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion [2024-arXiv] [📄 paper] [🔗 repo]

  • GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model [2024-arXiv] [📄 paper]

  • RepoGenReflex: Enhancing Repository-Level Code Completion with Verbal Reinforcement and Retrieval-Augmented Generation [2024-arXiv] [📄 paper]

  • RAMBO: Enhancing RAG-based Repository-Level Method Body Completion [2024-arXiv] [📄 paper] [🔗 repo]

📊 Datasets and Benchmarks

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [📄 paper] [🔗 repo]