
Repo-Level Coding Assistant that Can Understand Your Whole Codebase


Technical Report (to appear), Examples

RepoPilot: Multi-Agent Coding Assistant that Understands Your Codebase


Overview

RepoPilot is a Python library that helps developers explore and understand their codebases. It uses Large Language Models (LLMs) in a multi-agent system to answer questions about a whole repository and to analyze the impact of changes. It is designed for developers who want deeper insight into their projects and aims to simplify complex code analysis tasks.

Key Features

  • Whole Repository Understanding: Unlike other coding assistants, RepoPilot is engineered to grasp the full context of your entire codebase, enabling a more comprehensive analysis and more accurate recommendations.
  • Natural Language Queries: Interact with your codebase using conversational queries. Ask RepoPilot about specific features, code impacts, and more, just like talking to an AI assistant.
  • Codebase Exploration and Analysis: Delve into your codebase with ease. Understand how particular features are implemented and assess the impact of potential changes.
  • Actionable Insights and Recommendations: Get practical suggestions and automated actions based on RepoPilot's deep understanding of your code.
  • Seamless Integration: Integrate RepoPilot into your existing development workflow with its Python API, allowing for flexible and powerful code interactions.

Architecture



RepoPilot is a multi-agent system that consists of three main components: the Planning Agent, the Navigation Agent, and the Analysis Agent.

  • Planning Agent is responsible for understanding the user's query and determining a draft plan of action. The planning agent is based on GPT-4 prompted with a query and general information about the codebase.

  • Navigation Agent is responsible for navigating the codebase, finding relevant code snippets, and storing high-value information related to the query in the working memory. The navigation agent uses a ReAct-like architecture with dynamic backtracking, plus multi-language Language Server Protocol (mLSP) support, to navigate the codebase efficiently (go-to-definition, find references, code search, semantic code search, etc.).

  • Analysis Agent is responsible for giving the user the final insights and recommendations for the query. The analysis agent is based on GPT-4 prompted with the query and the information stored in the working memory.
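The flow between the three agents can be sketched as a small pipeline. This is only an illustration of the data flow described above, not RepoPilot's implementation: every name here (`WorkingMemory`, `plan`, `navigate`, `analyze`, `answer`) is hypothetical, and the LLM calls and code search are replaced by plain-Python stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkingMemory:
    """Hypothetical store for query-relevant findings collected while navigating."""
    notes: List[str] = field(default_factory=list)

    def store(self, note: str) -> None:
        self.notes.append(note)


def plan(query: str, repo_summary: str) -> List[str]:
    """Planning step (stub): draft a list of steps from the query.

    In the real system this is an LLM call; a fixed template stands in here.
    """
    return [
        f"Locate code related to: {query}",
        f"Collect definitions and references in: {repo_summary}",
    ]


def navigate(steps: List[str], search: Callable[[str], List[str]],
             memory: WorkingMemory) -> None:
    """Navigation step (stub): run a search per step and keep every hit."""
    for step in steps:
        for hit in search(step):
            memory.store(hit)


def analyze(query: str, memory: WorkingMemory) -> str:
    """Analysis step (stub): summarize what the working memory contains."""
    findings = "\n".join(f"- {note}" for note in memory.notes)
    return f"Query: {query}\nFindings:\n{findings}"


def answer(query: str, repo_summary: str,
           search: Callable[[str], List[str]]) -> str:
    """Chain the three steps: plan, then navigate, then analyze."""
    memory = WorkingMemory()
    steps = plan(query, repo_summary)
    navigate(steps, search, memory)
    return analyze(query, memory)
```

Passing the search function in as a parameter keeps the sketch runnable without a code index; any callable that maps a step description to a list of strings will do.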

Quick Demo:

Demo: Real GitHub Issue QA on Huggingface/PEFT


Installation

RepoPilot uses Zoekt for code search, so install Zoekt before installing RepoPilot. Zoekt requires a recent Go installation; follow the Go installation instructions first.

go get github.com/sourcegraph/zoekt/

# Install Zoekt Index
go install github.com/sourcegraph/zoekt/cmd/zoekt-index
# Install Zoekt Web Server
go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver

RepoPilot also needs universal-ctags for semantic code search; follow its installation instructions. Remember to set the environment variable CTAGS_COMMAND=universal-ctags. Finally, install RepoPilot:

pip3 install repopilot
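A quick way to confirm the prerequisites before the first run is to check that the tools are on the PATH and that CTAGS_COMMAND is set. The sketch below is not part of RepoPilot; the helper names are hypothetical, and the binary names are assumptions taken from the install commands above (on some systems universal-ctags is installed under a different name, such as `ctags`).

```python
import os
import shutil

# Assumed binary names, taken from the installation steps above.
REQUIRED_TOOLS = ("go", "zoekt-index", "zoekt-webserver", "universal-ctags")


def missing_prerequisites(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def ctags_command_is_set(env=None):
    """Check that CTAGS_COMMAND has the value the installation steps ask for."""
    env = os.environ if env is None else env
    return env.get("CTAGS_COMMAND") == "universal-ctags"


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing tools:", ", ".join(missing))
    if not ctags_command_is_set():
        print("CTAGS_COMMAND is not set to universal-ctags")
```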

Example

# Importing the RepoPilot library
import repopilot

# Initialize RepoPilot with the path to your code repository
repo_path = "/path/to/your/codebase"
rp = repopilot.RepoPilot(repo_path)

# Example 1: Natural Language Query about a Feature
# User asks about the login feature in a conversational manner
query = "Please explain how the login features work in this codebase."
login_feature_explanation = rp.query_codebase(query)
print("Login Feature Explanation:")
print(login_feature_explanation)

# Example 2: Impact of Changes in Natural Language
# User asks about the impact of a specific change
change_query = "What would be the impact if I refactor the authentication module?"
change_impact = rp.query_codebase(change_query)
print("Impact of Refactoring Authentication Module:")
print(change_impact)

# Example 3: Code Improvement Suggestions in Conversational Style
# User asks for general improvement suggestions
improvement_query = "How can I improve the code quality of the project?"
improvement_suggestions = rp.query_codebase(improvement_query)
print("Code Improvement Suggestions:")
print(improvement_suggestions)

# Example 4: Searching for Code Patterns using Natural Language
# User wants to find certain types of functions or methods
search_query = "Find all asynchronous functions in the codebase."
async_functions = rp.query_codebase(search_query)
print("Asynchronous Functions Found:")
print(async_functions)

# Example 5: Bug Reproduction from Bug Reports
# User wants to generate a fail-to-pass test case from a bug report.
# Replace the {bug_report ...} placeholder with the report text before running.
search_query = "Write a JUnit test case in Java that reproduces the failure behavior of the given bug report as follows: {bug_report (in this case is Time 23 Defects4J)}."
bug_reproduction = rp.query_codebase(search_query)
print("Bug Reproduction:")
print(bug_reproduction)
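Example 5 leaves the bug report as a placeholder inside the query string. One way to fill it in is a small template helper such as the sketch below. The helper and template names are hypothetical and not part of the RepoPilot API; the resulting string would then be passed to `rp.query_codebase` as in the examples above.

```python
# Hypothetical template mirroring the wording of Example 5.
BUG_QUERY_TEMPLATE = (
    "Write a JUnit test case in Java that reproduces the failure behavior "
    "of the given bug report as follows: {bug_report}"
)


def build_bug_query(bug_report: str) -> str:
    """Insert the bug report text into the query template."""
    report = bug_report.strip()
    if not report:
        raise ValueError("bug_report must not be empty")
    # Only the template is parsed by str.format, so braces inside the
    # report text (common in Java snippets) are inserted unchanged.
    return BUG_QUERY_TEMPLATE.format(bug_report=report)
```

Keeping the template separate from the report text makes it easy to run the same prompt over many bug reports.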

CLI Usage

usage: repopilot [--help] [command: setup, query]

Setup

Usage: repopilot setup [OPTIONS] REPO_PATH [ARGS]...

╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *    repo_path      TEXT  The path to the repository to set up. [default: None] [required]                       │                                                                                           
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --repository-name        TEXT  The name of the repository. [default: None] [required]                         │                                                                                           
│ *  --language               TEXT  The programming language of the repository. [default: None] [required]         │                                                                                           
│    --commit                 TEXT  The commit to set up.                                                          │                                                                                           
│    --local-agent            TEXT  local agent path [default: model/mistral-7B]                                   │                                                                                           
│    --devices                TEXT  devices to use for inference [default: 0]                                      │
│    --clone-dir              TEXT  The directory to clone the repository to. [default: data/repos]                │
│    --gh-token               TEXT  The GitHub token to use for cloning private repositories. [default: ""]        │
│    --help                         Show this message and exit.                                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                                                                                                                                               

Query

Usage: repopilot query [OPTIONS] REPOSITORY_NAME [ARGS]...

╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *    repository_name      TEXT  The name of the repository to query. [default: None] [required]                 │                                                                                       
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --planner-type                  TEXT  The type of planner to use. [default: adaptive]                           │                                                                                            
│ --save-trajectories-path        TEXT  The path to save the trajectories to. [default: None]                     │                                                                                            
│ --help                                Show this message and exit.                                               │                                                                                            
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
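When scripting the CLI, it can help to assemble the two commands programmatically. The sketch below builds argument lists from the arguments and options shown in the help output above and does not run anything. The function names are hypothetical, and it assumes that `query` takes the repository name as its positional argument, as the Arguments box indicates.

```python
import shlex
from typing import List, Optional


def setup_command(repo_path: str, repository_name: str, language: str,
                  commit: Optional[str] = None) -> List[str]:
    """Build the argument list for `repopilot setup`."""
    cmd = [
        "repopilot", "setup", repo_path,
        "--repository-name", repository_name,
        "--language", language,
    ]
    if commit:
        cmd += ["--commit", commit]
    return cmd


def query_command(repository_name: str, planner_type: str = "adaptive",
                  save_trajectories_path: Optional[str] = None) -> List[str]:
    """Build the argument list for `repopilot query`."""
    cmd = ["repopilot", "query", repository_name,
           "--planner-type", planner_type]
    if save_trajectories_path:
        cmd += ["--save-trajectories-path", save_trajectories_path]
    return cmd


if __name__ == "__main__":
    # Print shell-ready command lines; pass the lists to subprocess.run to execute.
    print(shlex.join(setup_command("/path/to/your/codebase", "my-repo", "python")))
    print(shlex.join(query_command("my-repo")))
```

Building lists rather than shell strings avoids quoting problems when a path contains spaces.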

Acknowledgements

We would like to thank the developers of Multiplspy, Supporting Multiple languages chunking, and the codetext parser, which provide the multi-language support of the navigation agent.