We introduce RepoGraph, a plug-in repo-level module that supplies the desired repository context and substantially boosts the AI software engineering capability of LLMs.
We have released the first version of RepoGraph and its integration with SWE-bench methods!
repograph contains the code for constructing the graph and retrieving related context from it.
agentless and SWE-agent contain the versions of these two methods integrated with RepoGraph.
Currently, this version may take a long time to run on a single repo. We provide a cached version for all repos in SWE-bench; download it here and put it under repo_structures.
To generate the repograph for a given repository, simply run:
python ./repograph/construct_graph.py <dir_to_repo>
This will produce two files: tags_{instance_id}.jsonl, which stores the line-level information, and {instance_id}.pkl, which is the graph constructed using networkx.
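To inspect the two output files, a small loader can be handy. The following is a minimal sketch, not part of the repository: the helper names `load_tags`, `load_graph`, and `load_repograph_outputs` are hypothetical, and it assumes only what is stated above (one JSON object per line in the `.jsonl` file, and a pickled graph object in the `.pkl` file). The fields inside each tag record are not assumed.

```python
import json
import pickle
from pathlib import Path


def load_tags(path):
    """Read a JSONL file: one JSON object per non-empty line."""
    tags = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                tags.append(json.loads(line))
    return tags


def load_graph(path):
    """Unpickle the graph object.

    For a real RepoGraph .pkl file, networkx must be installed so that
    pickle can rebuild the graph. Only unpickle files you trust.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def load_repograph_outputs(output_dir, instance_id):
    """Load both outputs for one instance from a directory (hypothetical helper)."""
    out = Path(output_dir)
    tags = load_tags(out / f"tags_{instance_id}.jsonl")
    graph = load_graph(out / f"{instance_id}.pkl")
    return tags, graph
```

If the loaded object is a networkx graph, `graph.number_of_nodes()` and `graph.number_of_edges()` give a quick sanity check of its size.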
For a procedural framework, RepoGraph can be integrated into every step of the pipeline. Refer to the --repo_graph hyperparameter to control its use in different stages.
To run RepoGraph with Agentless, use the command:
bash run_repograph_agentless.sh
To integrate RepoGraph with an agent framework such as SWE-agent, we simply add an extra action to its initial action space. Specifically, you can look up search_repo() in the corresponding directory. The signature is defined as:
search_repo:
  docstring: searches in the current repository with a specific function or class, and returns the def and ref relations for the search term.
  signature: search_repo <search_term>
  arguments:
    - search_term (string) [required]: function or class to look for in the repository.
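To make the "def and ref relations" idea concrete, here is a minimal sketch of such a lookup. It is not the repository's implementation: `TinyRepoGraph` is a hypothetical, dictionary-based stand-in for the networkx graph, and the result keys (`def`, `references`, `referenced_by`) are assumptions chosen for illustration.

```python
from collections import defaultdict


class TinyRepoGraph:
    """Hypothetical stand-in for the repo graph: names as nodes, reference edges."""

    def __init__(self):
        self.definitions = {}                   # name -> (file, line) of its definition
        self.references = defaultdict(set)      # name -> names it refers to
        self.referenced_by = defaultdict(set)   # name -> names that refer to it

    def add_definition(self, name, file, line):
        self.definitions[name] = (file, line)

    def add_reference(self, source, target):
        # Record the edge in both directions so lookups are cheap either way.
        self.references[source].add(target)
        self.referenced_by[target].add(source)


def search_repo(graph, search_term):
    """Return the definition site and reference relations for a function or class."""
    if search_term not in graph.definitions:
        return None
    return {
        "def": graph.definitions[search_term],
        # .get avoids creating empty entries in the defaultdicts during lookup.
        "references": sorted(graph.references.get(search_term, ())),
        "referenced_by": sorted(graph.referenced_by.get(search_term, ())),
    }
```

Keeping edges in both directions means the agent can see, in one call, both what the searched function depends on and which callers would be affected by changing it.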
To run RepoGraph with SWE-agent, use the command:
bash run_repograph_sweagent.sh
We are working on a preprint with details on RepoGraph and on a more comprehensive, easier integration with existing models. Stay tuned!