Knowledge Graph Integration for LangFlow
Opened this issue · 6 comments
Feature Request
Feature Description:
Integrating Neo4j into LangFlow will add knowledge graph database retrieval, extending the range of data storage solutions LangFlow can work with. Neo4j, Nebula Graph, and FalkorDB are leading knowledge graph databases that can serve as both a vector database and a knowledge graph queried with Cypher. This feature will mirror existing integrations like Redis and Pinecone, letting users leverage graph structures for retrieval tasks.
https://python.langchain.com/v0.2/docs/integrations/graphs/neo4j_cypher/
https://python.langchain.com/docs/integrations/providers/neo4j
https://python.langchain.com/docs/integrations/vectorstores/neo4jvector
Motivation
Why This is Essential:
- Graph Databases as Data Retrievers: Retrieving data from knowledge graph databases opens new avenues for complex, interconnected data queries. A graph structure represents relationships between data points directly, offering more flexible and insightful retrieval than traditional databases.
- Unlocking Knowledge Graph Potential: Knowledge graph databases such as Neo4j, FalkorDB, and Nebula Graph are becoming crucial in industries where understanding relationships between data points is key, such as recommendation systems, fraud detection, and semantic search. Integrating these with LangFlow will allow users to query graph data with ease and precision, using graph-based algorithms.
- Dual Functionality with Vector Databases: Neo4j, Nebula Graph, and FalkorDB can also act as vector databases, adding to LangFlow's capacity to handle vectorized data alongside traditional knowledge graph structures. This flexibility allows users to move between vector searches (e.g., for similarity) and more intricate relationship-based queries via Cypher, a powerful query language.
- Aligning with LangChain: LangChain, a recognized leader in the space, has already integrated Neo4j for both vector store and graph database functionalities. Aligning LangFlow with LangChain ensures that LangFlow remains competitive and compatible with the broader AI and machine learning ecosystem. Users will appreciate the continuity of experience and the shared ecosystem between these tools.
- Extending LangFlow's Reach: The integration will expand LangFlow's use cases, making it more appealing to sectors that rely on graph-based data representation and advanced knowledge modeling. Neo4j's integration would create a path for future inclusion of other graph databases like FalkorDB and Nebula Graph, giving LangFlow users a wider, more flexible infrastructure.
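The dual use described in the list above (vector similarity plus relationship traversal) can be sketched as two parameterized Cypher queries. This is only an illustration under assumptions, not LangFlow or LangChain code: the index name `document_embeddings`, the `Document` and `Entity` labels, the `MENTIONS` relationship, and all helper names are hypothetical. `db.index.vector.queryNodes` is the Neo4j procedure for querying a vector index in recent Neo4j 5 releases.

```python
from typing import Any


def vector_search_query(index_name: str, embedding: list[float], k: int = 4) -> tuple[str, dict[str, Any]]:
    """Build a similarity query against a Neo4j vector index (hypothetical helper)."""
    cypher = (
        "CALL db.index.vector.queryNodes($index_name, $k, $embedding) "
        "YIELD node, score "
        "RETURN node.content AS content, score"
    )
    return cypher, {"index_name": index_name, "k": k, "embedding": embedding}


def related_entities_query(doc_id: str, limit: int = 10) -> tuple[str, dict[str, Any]]:
    """Build a relationship traversal query; labels and relationship type are assumed."""
    cypher = (
        "MATCH (d:Document {id: $doc_id})-[:MENTIONS]->(e:Entity) "
        "RETURN e.name AS name LIMIT $limit"
    )
    return cypher, {"doc_id": doc_id, "limit": limit}


def run_query(uri: str, user: str, password: str, cypher: str, params: dict[str, Any]) -> list[dict[str, Any]]:
    """Run one query and return plain dicts; needs `pip install neo4j` and a running server."""
    # Imported lazily so the query builders above work without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            # Parameters go in a dict, keeping user input out of the query string.
            return [record.data() for record in session.run(cypher, params)]
```

Keeping the query builders separate from `run_query` means the same connection code can serve both retrieval styles, which is the flexibility the request is asking for.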
Business Impact:
By enabling Neo4j, Nebula Graph, FalkorDB, and other knowledge graph databases, LangFlow will become more attractive to industries focused on complex data interrelationships, supporting broader adoption in sectors like fintech, healthcare, cybersecurity, and recommendation engines.
Your Contribution
No response
Hey @qdrddr! I'm here to assist you with any questions, bugs, or contributions. Let's tackle this together!
I found a similar closed issue regarding the integration of Criteria Evaluation with LangFlow. The suggested solution involved creating a custom component for Criteria Evaluation, which can be adapted for integrating Neo4j as well [1].
To integrate Neo4j into LangFlow for enhancing its capabilities for knowledge graph database retrieval and vector database functionalities, you can follow these steps:
- Create a Neo4j Vector Store Component:
Define a new component class for Neo4j, similar to the HCDVectorStoreComponent.
from loguru import logger  # Langflow logs through loguru

from langflow.base.vectorstores.model import LCVectorStoreComponent, check_cached_vector_store
from langflow.inputs import StrInput, SecretStrInput, MultilineInput, IntInput, DropdownInput, FloatInput, DictInput, DataInput
from langflow.schema import Data


class Neo4jVectorStoreComponent(LCVectorStoreComponent):
    display_name: str = "Neo4j"
    description: str = "Implementation of Vector Store using Neo4j with search capabilities"
    documentation: str = "https://neo4j.com/docs/"
    name = "Neo4j"
    icon: str = "Neo4j"

    inputs = [
        StrInput(
            name="database_name",
            display_name="Database Name",
            info="The name of the Neo4j database.",
            required=True,
        ),
        StrInput(
            name="username",
            display_name="Neo4j Username",
            info="Authentication username for accessing Neo4j.",
            required=True,
        ),
        SecretStrInput(
            name="password",
            display_name="Neo4j Password",
            info="Authentication password for accessing Neo4j.",
            required=True,
        ),
        StrInput(
            name="uri",
            display_name="Neo4j URI",
            info="URI for the Neo4j service.",
            required=True,
        ),
        MultilineInput(
            name="search_input",
            display_name="Search Input",
        ),
        DataInput(
            name="ingest_data",
            display_name="Ingest Data",
            is_list=True,
        ),
        IntInput(
            name="number_of_results",
            display_name="Number of Results",
            info="Number of results to return.",
            advanced=True,
            value=4,
        ),
        DropdownInput(
            name="search_type",
            display_name="Search Type",
            info="Search type to use",
            options=["Similarity", "Similarity with score threshold", "MMR (Max Marginal Relevance)"],
            value="Similarity",
            advanced=True,
        ),
        FloatInput(
            name="search_score_threshold",
            display_name="Search Score Threshold",
            info="Minimum similarity score threshold for search results. (when using 'Similarity with score threshold')",
            value=0,
            advanced=True,
        ),
        DictInput(
            name="search_filter",
            display_name="Search Metadata Filter",
            info="Optional dictionary of filters to apply to the search query.",
            advanced=True,
            is_list=True,
        ),
    ]

    @check_cached_vector_store
    def build_vector_store(self):
        try:
            from neo4j import GraphDatabase
        except ImportError as e:
            raise ImportError(
                "Could not import neo4j package. Please install it with `pip install neo4j`."
            ) from e
        # Note: this returns a raw Neo4j driver, not a LangChain vector store.
        driver = GraphDatabase.driver(self.uri, auth=(self.username, self.password))
        # Ingest any connected Data inputs once the driver is available.
        self._add_documents_to_vector_store(driver)
        return driver

    def _add_documents_to_vector_store(self, driver):
        documents = []
        for _input in self.ingest_data or []:
            if isinstance(_input, Data):
                documents.append(_input.to_lc_document())
            else:
                raise ValueError("Vector Store Inputs must be Data objects.")
        if documents:
            with driver.session(database=self.database_name) as session:
                for doc in documents:
                    # LangChain Documents expose their text as `page_content`.
                    session.run("CREATE (n:Document {content: $content})", content=doc.page_content)
        else:
            logger.debug("No documents to add to the Vector Store.")

    def _map_search_type(self):
        if self.search_type == "Similarity with score threshold":
            return "similarity_score_threshold"
        elif self.search_type == "MMR (Max Marginal Relevance)":
            return "mmr"
        else:
            return "similarity"

    def _build_search_args(self):
        args = {
            "k": self.number_of_results,
            "score_threshold": self.search_score_threshold,
        }
        if self.search_filter:
            clean_filter = {k: v for k, v in self.search_filter.items() if k and v}
            if len(clean_filter) > 0:
                args["filter"] = clean_filter
        return args

    def search_documents(self) -> list[Data]:
        driver = self.build_vector_store()
        logger.debug(f"Search input: {self.search_input}")
        logger.debug(f"Search type: {self.search_type}")
        logger.debug(f"Number of results: {self.number_of_results}")
        if self.search_input and isinstance(self.search_input, str) and self.search_input.strip():
            try:
                # Plain substring match: the search type, score threshold, and
                # metadata filter are not applied by this query.
                with driver.session(database=self.database_name) as session:
                    result = session.run(
                        "MATCH (n:Document) WHERE n.content CONTAINS $query RETURN n LIMIT $limit",
                        # Parameters are passed as a dict because a `query=` keyword
                        # would clash with the first argument of `session.run`.
                        {"query": self.search_input, "limit": self.number_of_results},
                    )
                    # Read the records while the session is open and wrap each node as Data.
                    data = [Data(data={"text": record["n"]["content"]}) for record in result]
            except Exception as e:
                raise ValueError(f"Error performing search in Neo4j: {str(e)}") from e
            logger.debug(f"Retrieved documents: {len(data)}")
            self.status = data
            return data
        else:
            logger.debug("No search input provided. Skipping search.")
            return []

    def get_retriever_kwargs(self):
        search_args = self._build_search_args()
        return {
            "search_type": self._map_search_type(),
            "search_kwargs": search_args,
        }
- Install Required Packages:
Ensure you have the necessary packages installed: pip install neo4j
- Update LangFlow Configuration:
Add the new Neo4jVectorStoreComponent to the LangFlow configuration to make it available in the UI.
By following these steps, you can integrate Neo4j into LangFlow, allowing it to leverage Neo4j's capabilities for knowledge graph database retrieval and vector database functionalities [2].
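Note that the component above opens a raw Neo4j driver and matches text with CONTAINS, so it does not query a vector index. One possible direction, sketched here under assumptions, is to build the store with LangChain's `Neo4jVector` from the `langchain_community` package (the vector store integration linked in the request). The helper names, the default database name `neo4j`, the default index name `vector`, and the embedding object passed in are assumptions; the component above has no embedding input.

```python
from typing import Any


def neo4j_vector_kwargs(
    uri: str,
    username: str,
    password: str,
    database: str = "neo4j",
    index_name: str = "vector",
) -> dict[str, Any]:
    """Translate the component's connection fields into Neo4jVector keyword arguments."""
    return {
        "url": uri,
        "username": username,
        "password": password,
        "database": database,
        "index_name": index_name,
    }


def build_neo4j_vector(embedding: Any, **connection: Any):
    """Open an existing Neo4j vector index as a LangChain vector store (hypothetical helper)."""
    # Imported lazily so the helper above works without the package installed.
    from langchain_community.vectorstores import Neo4jVector

    return Neo4jVector.from_existing_index(embedding, **neo4j_vector_kwargs(**connection))
```

With a store built this way, `store.similarity_search(query, k=4)` would replace the CONTAINS query, and the values returned by `get_retriever_kwargs` could be passed to `store.as_retriever(...)`.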
Your competitor RAGflow has already added GraphRAG support
No, not across documents. Just within one doc.
Flowise also added Neo4j/GraphRAG already in their 2.2.3 release.
Would be great to have it in LangFlow too.
Someone has already created plugins for LangFlow at https://www.langflow.store/store. I think it could be a good start.