GraphRAG is a popular 🔥🔥🔥 and powerful 💪💪💪 RAG system! 💡 Inspired by systems like Microsoft's, graph-based RAG is unlocking endless possibilities in AI.
Our project focuses on modularizing and decoupling these methods 🧩 to unveil the mystery 🕵️‍♂️✨ behind them and share fun and valuable insights! 🤩💫 Our project is included in Awesome Graph-based RAG.
If you find our work helpful, please kindly cite our paper.
Download the datasets: GraphRAG-dataset
# Clone the repository from GitHub
git clone https://github.com/JayLZhou/GraphRAG.git
cd GraphRAG

You can run different GraphRAG methods by specifying the corresponding configuration file (.yaml).

python main.py -opt Option/Method/RAPTOR.yaml -dataset_name your_dataset

The following methods are available, and each can be run using the same command format:

python main.py -opt Option/Method/<METHOD>.yaml -dataset_name your_dataset

Replace <METHOD> with one of the following:

- Dalk
- GR
- LGraphRAG (Local search in GraphRAG)
- GGraphRAG (Global search in GraphRAG)
- HippoRAG
- KGP
- LightRAG
- RAPTOR
- ToG
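
To try several methods on the same dataset, the command format above can be wrapped in a small driver script. The following is a minimal sketch and not part of the repository: `build_command` and `run_all` are hypothetical helper names, the method names are read off the list above, and `dry_run` defaults to printing the commands instead of executing them.

```python
import subprocess
import sys

# Method names as listed above; each maps to Option/Method/<METHOD>.yaml.
METHODS = ["Dalk", "GR", "LGraphRAG", "GGraphRAG", "HippoRAG",
           "KGP", "LightRAG", "RAPTOR", "ToG"]


def build_command(method: str, dataset_name: str) -> list:
    """Build the documented command line for one method."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return [sys.executable, "main.py",
            "-opt", f"Option/Method/{method}.yaml",
            "-dataset_name", dataset_name]


def run_all(dataset_name: str, dry_run: bool = True) -> list:
    """Print every command; execute them too when dry_run is False."""
    commands = [build_command(m, dataset_name) for m in METHODS]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all("your_dataset", dry_run=True)
```

Run it from the repository root so that `main.py` and the `Option/Method/` directory resolve correctly.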
For example, to run GraphRAG:
python main.py -opt Option/Method/GraphRAG.yaml -dataset_name your_dataset

Ensure you have the required dependencies installed (the default experiment name is digimon):

conda env create -f experiment.yml -n your_experiment_name

GraphRAG supports both cloud-based and local deployment of LLMs:

- Cloud-based models: OpenAI (e.g., gpt-4, gpt-3.5-turbo)
- Locally deployed models: Ollama and LlamaFactory
To use a local model, set api_type to open_llm in the configuration file.
llm:
  api_type: "openai/open_llm" # Options: "openai" or "open_llm" (For Ollama and LlamaFactory)
  model: "YOUR_LOCAL_MODEL_NAME"
  base_url: "YOUR_LOCAL_URL" # Change this for local models
  api_key: "YOUR_API_KEY" # Not required for local models

For LlamaFactory or Ollama, ensure the model is correctly installed and running in your local environment.
You can refer to the README of LlamaFactory.
llm:
  api_type: "open_llm" # Options: "openai" or "open_llm" (For Ollama and LlamaFactory)
  model: "YOUR_LOCAL_MODEL_NAME"
  base_url: "YOUR_LOCAL_URL" # Change this for local models
  api_key: "ANY_THING_IS_OKAY" # Not required for local models

We select the following Graph RAG methods:
Based on the entities and relations they contain, we categorize the graphs into the following types:
- Chunk Tree: A tree structure formed by document content and summary.
- Passage Graph: A relational network composed of passages, tables, and other elements within documents.
- KG: A knowledge graph (KG) is constructed by extracting entities and relationships from each chunk. It contains only entities and relations and is commonly represented as triples.
- TKG: A textual knowledge graph (TKG) is a specialized KG (following the same construction steps as a KG) that enriches entities with detailed descriptions and type information.
- RKG: A rich knowledge graph (RKG) further incorporates keywords associated with relations.
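
The differences between these graph types can be pictured as progressively richer record types. The sketch below is only an illustration of the descriptions above. All class and field names are hypothetical and do not mirror the project's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """Chunk Tree node: original content or a summary of its children."""
    text: str
    children: list = field(default_factory=list)


@dataclass(frozen=True)
class Triple:
    """KG edge: only entity names and a relation name."""
    head: str
    relation: str
    tail: str


@dataclass
class TextualEntity:
    """TKG node: an entity enriched with a type and a description."""
    name: str
    entity_type: str = ""
    description: str = ""


@dataclass
class RichRelation:
    """RKG edge: a relation that additionally carries keywords."""
    head: str
    tail: str
    keywords: list = field(default_factory=list)


# Illustrative records (made-up content).
leaf = TreeNode("Paris is the capital of France.")
root = TreeNode("Summary: facts about France.", children=[leaf])
triple = Triple("Paris", "capital_of", "France")
entity = TextualEntity("Paris", entity_type="city",
                       description="Capital and largest city of France.")
relation = RichRelation("Paris", "France", keywords=["capital", "government"])
```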
The criteria for the classification of graph types are as follows:
| Graph Attributes | Chunk Tree | Passage Graph | KG | TKG | RKG |
|---|---|---|---|---|---|
| Original Content | β | β | β | β | β |
| Entity Name | β | β | β | β | β |
| Entity Type | β | β | β | β | β |
| Entity Description | β | β | β | β | β |
| Relation Name | β | β | β | β | β |
| Relation keyword | β | β | β | β | β |
| Relation Description | β | β | β | β | β |
| Edge Weight | β | β | β | β | β |
The retrieval stage plays the key role ‼️ in the entire GraphRAG process. ✨ The goal is to identify query-relevant content that supports the generation phase, enabling the LLM to provide more accurate responses.
💡💡💡 After thoroughly reviewing all implementations, we've distilled them into a set of 19 operators 🧩🧩. Each method then constructs its retrieval module by combining one or more of these operators 🧩.
We classify the operators into five categories, each offering a different way to retrieve and structure relevant information from graph-based data.
Entity operators retrieve entities (e.g., people, places, organizations) that are most relevant to the given query.
| Name | Description | Example Methods |
|---|---|---|
| VDB | Select top-k nodes from the vector database | G-retriever, RAPTOR, KGP |
| RelNode | Extract nodes from given relationships | LightRAG |
| PPR | Run Personalized PageRank (PPR) on the graph and return the top-k nodes with their PPR scores | FastGraphRAG |
| Agent | Use an LLM to find useful entities | ToG |
| Onehop | Selects the one-hop neighbor entities of the given entities | LightRAG |
| Link | Return top-1 similar entity for each given entity | HippoRAG |
| TF-IDF | Rank entities based on the TF-IDF matrix | KGP |
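
As an illustration of the PPR operator, the sketch below computes Personalized PageRank by plain power iteration and returns the top-k nodes. It is a minimal example under stated assumptions, not FastGraphRAG's implementation: the graph is an adjacency dict, the damping factor of 0.85 and the 50 iterations are illustrative defaults, and the function names are hypothetical.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Approximate PPR scores by power iteration.

    graph: dict mapping each node to a list of its out-neighbors
           (every neighbor must also be a key of the dict).
    seeds: query-relevant nodes that the random walk restarts from.
    """
    seeds = [s for s in dict.fromkeys(seeds) if s in graph]
    if not seeds:
        raise ValueError("at least one seed must be a node of the graph")

    # Restart distribution: uniform over the seed nodes.
    restart = {node: 0.0 for node in graph}
    for s in seeds:
        restart[s] = 1.0 / len(seeds)

    scores = dict(restart)
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) * restart[node] for node in graph}
        for node, neighbors in graph.items():
            mass = damping * scores[node]
            if neighbors:
                for m in neighbors:
                    nxt[m] += mass / len(neighbors)
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += mass / len(seeds)
        scores = nxt
    return scores


def ppr_top_k(graph, seeds, k):
    """Return the k highest-scoring (node, score) pairs."""
    scores = personalized_pagerank(graph, seeds)
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))[:k]


# Tiny made-up entity graph; "d" is isolated from the seed.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
top = ppr_top_k(graph, seeds=["a"], k=2)
```

Nodes that cannot be reached from the seeds, such as `d` here, keep a score of zero, which is what makes the ranking query-specific.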
Relationship operators extract useful relationships for the given query.
| Name | Description | Example Methods |
|---|---|---|
| VDB | Retrieve relationships from the vector database | LightRAG, G-retriever |
| Onehop | Select relationships linked to the one-hop neighbors of the given entities | Local Search for MS GraphRAG |
| Aggregator | Compute relationship scores from the entity PPR matrix and return the top-k | FastGraphRAG |
| Agent | Use an LLM to find useful relationships | ToG |
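
The Onehop relationship operator can be sketched as a filter over the edge list. This is a minimal illustration, assuming relationships are stored as `(head, relation, tail)` tuples; the function name and the sample data are hypothetical.

```python
def onehop_relationships(triples, entities):
    """Return every relationship that touches one of the given entities.

    triples:  iterable of (head, relation, tail) tuples.
    entities: entities already selected for the query.
    """
    selected = set(entities)
    return [t for t in triples if t[0] in selected or t[2] in selected]


# Made-up example graph.
triples = [
    ("Alice", "works_at", "Acme"),
    ("Bob", "works_at", "Acme"),
    ("Bob", "lives_in", "Paris"),
    ("Carol", "lives_in", "Berlin"),
]
hits = onehop_relationships(triples, ["Alice", "Paris"])
```

Here `hits` holds the edge leaving `Alice` and the edge entering `Paris`, while the edges that touch neither entity are dropped.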
Chunk operators retrieve the most relevant text segments (chunks) related to the query.
| Name | Description | Example Methods |
|---|---|---|
| Aggregator | Use the relationship scores and the relationship-chunk interactions to select the top-k chunks | HippoRAG |
| FromRel | Return chunks containing given relationships | LightRAG |
| Occurrence | Rank top-k chunks based on occurrence of both entities in relationships | Local Search for MS GraphRAG |
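
One possible reading of the Occurrence operator is to score each chunk by how many relationships have both endpoint entities mentioned in it. The sketch below follows that reading. It assumes a case-insensitive substring match stands in for real entity linking, and the function name and data are hypothetical.

```python
def occurrence_top_k(chunks, relationships, k):
    """Rank chunks by the number of relationships whose two entities
    both occur in the chunk text.

    chunks:        dict mapping chunk id -> chunk text.
    relationships: iterable of (head_entity, tail_entity) pairs.
    """
    pairs = [(head.lower(), tail.lower()) for head, tail in relationships]
    scores = {}
    for chunk_id, text in chunks.items():
        lowered = text.lower()
        scores[chunk_id] = sum(
            1 for head, tail in pairs if head in lowered and tail in lowered
        )
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Chunks that match no relationship are not worth returning.
    return [(cid, score) for cid, score in ranked[:k] if score > 0]


# Made-up chunks and relationships.
chunks = {
    "c1": "Alice works with Bob at Acme.",
    "c2": "Alice moved to Paris last year.",
    "c3": "Unrelated text about the weather.",
}
relationships = [("Alice", "Bob"), ("Alice", "Acme"), ("Alice", "Paris")]
best = occurrence_top_k(chunks, relationships, k=2)
```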
Subgraph operators extract a relevant subgraph for the given query.
| Name | Description | Example Methods |
|---|---|---|
| KhopPath | Find k-hop paths with start and endpoints in the given entity set | DALK |
| Steiner | Compute Steiner tree based on given entities and relationships | G-retriever |
| AgentPath | Identify the most relevant k-hop paths to a given question, using an LLM to filter out irrelevant paths | ToG |
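
The KhopPath operator can be illustrated with a depth-first search that collects simple paths of at most k edges whose two endpoints both lie in the given entity set. This is a sketch under assumptions, not DALK's implementation: the graph is an undirected adjacency dict, and the function name is hypothetical.

```python
def khop_paths(graph, entities, k):
    """Return simple paths with at most k edges that start and end
    at (different) nodes of the given entity set.

    graph: dict mapping node -> list of neighbors.
    """
    targets = set(entities)
    paths = []

    def extend(path):
        node = path[-1]
        if len(path) > 1 and node in targets:
            paths.append(list(path))
        if len(path) - 1 == k:  # hop budget used up
            return
        for neighbor in graph.get(node, ()):
            if neighbor not in path:  # keep the path simple (no cycles)
                path.append(neighbor)
                extend(path)
                path.pop()

    for start in sorted(targets):
        if start in graph:
            extend([start])
    return paths


# Made-up graph: a - x - b
graph = {"a": ["x"], "x": ["a", "b"], "b": ["x"]}
two_hop = khop_paths(graph, ["a", "b"], k=2)
one_hop = khop_paths(graph, ["a", "b"], k=1)
```

Because the search starts from every entity, each undirected path is reported once per direction; deduplicate if only one orientation is needed.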
Community operators identify high-level information; they are only used by MS GraphRAG.
| Name | Description | Example Methods |
|---|---|---|
| Entity | Detects communities containing specified entities | Local Search for MS GraphRAG |
| Layer | Returns all communities below a required layer | Global Search for MS GraphRAG |
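
Both community operators reduce to simple filters once communities are materialized. The sketch below is illustrative only: the `Community` record and both function names are hypothetical, and "below a required layer" is assumed to mean a layer index less than or equal to the given threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    community_id: str
    layer: int            # position in the community hierarchy
    entities: frozenset   # names of the member entities


def communities_for_entities(communities, entities):
    """Entity operator: communities containing any specified entity."""
    wanted = set(entities)
    return [c for c in communities if c.entities & wanted]


def communities_below_layer(communities, max_layer):
    """Layer operator: communities at or below the required layer
    (the inclusive bound is an assumption of this sketch)."""
    return [c for c in communities if c.layer <= max_layer]


# Made-up hierarchy with two layers.
communities = [
    Community("c0", 0, frozenset({"Alice", "Bob", "Carol"})),
    Community("c1", 1, frozenset({"Alice", "Bob"})),
    Community("c2", 1, frozenset({"Carol"})),
]
local_hits = communities_for_entities(communities, ["Carol"])
global_hits = communities_below_layer(communities, max_layer=0)
```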
You can freely 🪽 combine those operators 🧩 to create more and more GraphRAG methods.
Below, we present some examples illustrating how existing algorithms leverage these operators.
| Name | Operators |
|---|---|
| HippoRAG | Chunk (Aggregator) |
| LightRAG | Chunk (FromRel) + Entity (RelNode) + Relationship (VDB) |
| FastGraphRAG | Chunk (Aggregator) + Entity (PPR) + Relationship (Aggregator) |
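
The table above can be read as a recipe: each method is a list of `(category, operator)` pairs. The sketch below encodes that table as data and assembles a retriever from a registry of callables. It is a simplified illustration rather than the framework's actual API: `build_retriever` and the registry are hypothetical, and each operator here runs independently on the query, whereas a real pipeline may feed one operator's output into the next.

```python
# Operator combinations copied from the table above.
METHOD_OPERATORS = {
    "HippoRAG": [("Chunk", "Aggregator")],
    "LightRAG": [("Chunk", "FromRel"), ("Entity", "RelNode"),
                 ("Relationship", "VDB")],
    "FastGraphRAG": [("Chunk", "Aggregator"), ("Entity", "PPR"),
                     ("Relationship", "Aggregator")],
}


def build_retriever(method, registry):
    """Assemble a retrieval function for one method.

    registry: dict mapping (category, operator) -> callable(query).
    """
    steps = [(key, registry[key]) for key in METHOD_OPERATORS[method]]

    def retrieve(query):
        # Collect each operator's result under a "Category.Operator" key.
        return {f"{cat}.{name}": op(query) for (cat, name), op in steps}

    return retrieve


# Stub operators standing in for real implementations.
registry = {
    ("Chunk", "FromRel"): lambda q: [f"chunk for {q}"],
    ("Entity", "RelNode"): lambda q: [f"entity for {q}"],
    ("Relationship", "VDB"): lambda q: [f"relationship for {q}"],
}
lightrag = build_retriever("LightRAG", registry)
result = lightrag("who founded Acme?")
```

Defining a new method then amounts to adding one entry to `METHOD_OPERATORS` and registering any operators it needs.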
- Detailed readme
- Support RoG, PathRAG, etc.
- Provide a docker image for easy deployment.
- Support more LLMs, such as Azure.
If you find this work useful, please consider citing our paper:
@article{zhou2025depth,
title={In-depth Analysis of Graph-based RAG in a Unified Framework},
author={Zhou, Yingli and Su, Yaodong and Sun, Youran and Wang, Shu and Wang, Taotao and He, Runyuan and Zhang, Yongwei and Liang, Sicong and Liu, Xilin and Ma, Yuchi and others},
journal={arXiv preprint arXiv:2503.04338},
year={2025}
}
