FalkorDB uses Euclidean distance instead of Cosine despite default and explicit settings
Opened this issue · 9 comments
Checked other resources
- This is a bug, not a usage question.
- I added a clear and descriptive title that summarizes this issue.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
- This is not related to the langchain-community package.
- I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
- I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
Example Code
from langchain_community.vectorstores.falkordb_vector import FalkorDBVector
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain.schema import Document

# Initialize free embeddings
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Example documents (must be Document objects)
docs = [
    Document(page_content="Roti Batch", metadata={"id": 1}),
    Document(page_content="Roti", metadata={"id": 2}),
    Document(page_content="Recipe", metadata={"id": 3}),
]

# Initialize FalkorDB vector store
vector_store = FalkorDBVector.from_documents(
    documents=docs,
    embedding=embeddings,
    host="localhost",
    port=6379,
)

# Perform similarity search with scores
query = "Roti"
results = vector_store.similarity_search_with_score(query=query, k=3)
for doc, score in results:
    print(f"Doc: {doc.page_content}, Score: {score}")
Results
Doc: Roti, Score: 1.0
Doc: Roti Batch, Score: 0.650620520114899
Doc: Recipe, Score: 0.390799462795258
Error Message and Stack Trace (if applicable)
No response
Description
I am experiencing an issue with Falkor where it appears to use Euclidean distance instead of Cosine distance, even though:
- The default distance strategy is explicitly set to Cosine:
DEFAULT_DISTANCE_STRATEGY = DistanceStrategy.COSINE
- The distance strategy is explicitly passed as Cosine in parameters.
However, when running vector queries, the distance computation still seems to behave like Euclidean. For example, in the _get_search_index_query function:
DISTANCE_MAPPING = {
    DistanceStrategy.EUCLIDEAN_DISTANCE: "euclidean",
    DistanceStrategy.COSINE: "cosine",
}

def _get_search_index_query(search_type: SearchType, index_type: IndexType = DEFAULT_INDEX_TYPE) -> str:
    if index_type == IndexType.NODE:
        if search_type == SearchType.VECTOR:
            return (
                "CALL db.idx.vector.queryNodes($entity_label, "
                "$entity_property, $k, vecf32($embedding)) "
                "YIELD node, score "
                "WITH node, (2 - score)/2 as score"
            )

It seems like the distance type mapping (DISTANCE_MAPPING) is not being respected, and the query is returning results consistent with Euclidean distance rather than Cosine similarity.
Expected Behavior:
- When DistanceStrategy.COSINE is set as default or explicitly passed, the query should compute Cosine similarity.
Actual Behavior:
- The query appears to use Euclidean distance regardless of the strategy setting.
Environment:
- Falkor version: latest
- Python version: 3.13
Additional Context:
- This behavior affects similarity searches and scoring in ways that are inconsistent with the intended Cosine distance strategy.
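One way to read the `(2 - score)/2` rescaling in the query above: for unit-length vectors, squared Euclidean distance and cosine similarity are related by d² = 2 - 2·cos, so `(2 - d²)/2` recovers cosine similarity. The sketch below checks that identity in plain Python. It assumes the raw `score` is a squared Euclidean distance between normalized embeddings, which this report does not confirm; the helper names and example vectors are illustrative only.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def squared_euclidean(a, b):
    """Squared Euclidean distance between a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Arbitrary example vectors, normalized to unit length (assumption:
# the stored embeddings are unit-length as well).
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 0.5, 1.0])

d2 = squared_euclidean(a, b)
rescaled = (2 - d2) / 2          # same form as the rescaling in the query
cos = cosine_similarity(a, b)

# For unit vectors the two values agree up to floating-point error.
print(rescaled, cos)
```

If that assumption holds, a Euclidean-looking raw score and a cosine-looking final score would not contradict each other. If the raw score is some other quantity, the rescaling would not yield cosine similarity.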
System Info
(.venv) akankshapalve@Akankshas-MacBook-Air insights-falkordb-service % python -m langchain_core.sys_info
System Information
OS: Darwin
OS Version: Darwin Kernel Version 24.6.0: Mon Jul 14 11:30:40 PDT 2025; root:xnu-11417.140.69~1/RELEASE_ARM64_T8132
Python Version: 3.13.5 (main, Jun 12 2025, 21:50:42) [Clang 16.0.0 (clang-1600.0.26.6)]
Package Information
langchain_core: 0.3.76
langchain: 0.3.27
langchain_community: 0.3.29
langsmith: 0.4.29
langchain_falkordb: 0.1.2
langchain_openai: 0.3.33
langchain_text_splitters: 0.3.11
Optional packages not installed
langserve
Other Dependencies
aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
dataclasses-json<0.7,>=0.6.7: Installed. No version info available.
falkordb: 1.2.0
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx<1,>=0.23.0: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-core<1.0.0,>=0.3.76: Installed. No version info available.
langchain-core<2.0.0,>=0.3.75: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<2.0.0,>=0.3.27: Installed. No version info available.
langsmith-pyo3>=0.1.0rc2;: Installed. No version info available.
langsmith>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai-agents>=0.0.3;: Installed. No version info available.
openai<2.0.0,>=1.104.2: Installed. No version info available.
opentelemetry-api>=1.30.0;: Installed. No version info available.
opentelemetry-exporter-otlp-proto-http>=1.30.0;: Installed. No version info available.
opentelemetry-sdk>=1.30.0;: Installed. No version info available.
orjson>=3.9.14;: Installed. No version info available.
packaging>=23.2: Installed. No version info available.
pydantic-settings<3.0.0,>=2.10.1: Installed. No version info available.
pydantic<3,>=1: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest>=7.0.0;: Installed. No version info available.
PyYAML>=5.3: Installed. No version info available.
requests-toolbelt>=1.0.0: Installed. No version info available.
requests<3,>=2: Installed. No version info available.
requests<3,>=2.32.5: Installed. No version info available.
requests>=2.0.0: Installed. No version info available.
rich>=13.9.4;: Installed. No version info available.
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken<1,>=0.7: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
vcrpy>=7.0.0;: Installed. No version info available.
zstandard>=0.23.0: Installed. No version info available.
If you comment out the line 'WITH node, (2 - score) / 2 AS score' in the falkordb_vector.py module, you'll see these results:
Doc: Roti, Score: 0.0
Doc: Roti Batch, Score: 0.698758959770203
Doc: Recipe, Score: 1.21840107440948
This indicates the use of Euclidean distance.
And if you set k=1 in 'results = vector_store.similarity_search_with_score(query=query, k=1)', you'll see this result:
Doc: Roti, Score: 0.0
This indicates that it knows Roti is the top result, but the score is still zero.
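As a quick arithmetic check, the raw scores above can be pushed through the same rescaling to confirm that they reproduce the scores in the original report. The sketch uses only numbers quoted in this thread; `rescale` is a hypothetical helper name, not a function from langchain-community.

```python
import math

# Raw scores observed with the rescaling line commented out.
raw_scores = {
    "Roti": 0.0,
    "Roti Batch": 0.698758959770203,
    "Recipe": 1.21840107440948,
}

# Scores from the original report, with the rescaling line active.
reported_scores = {
    "Roti": 1.0,
    "Roti Batch": 0.650620520114899,
    "Recipe": 0.390799462795258,
}

def rescale(score):
    """Apply the same expression the query uses: (2 - score) / 2."""
    return (2 - score) / 2

rescaled_scores = {doc: rescale(s) for doc, s in raw_scores.items()}

for doc, value in rescaled_scores.items():
    matches = math.isclose(value, reported_scores[doc], abs_tol=1e-9)
    print(f"Doc: {doc}, rescaled: {value}, matches report: {matches}")
```

So the two sets of numbers are the same measurements seen before and after the rescaling: a raw score of 0.0 for the exact match becomes 1.0 once rescaled.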
Hi,
Thanks for clearing this up - this does give some lead to work on
Hi @Akanksha-turinton, Thank you for bringing it up!
We will look into it.
Hi @galshubeli, I was able to work on a few fixes. Should I open a PR?
@Sathyanarayanan-ops thanks that would be great!
Hi,
Further research led me to find a few bugs, which I fixed.
The PR is here: langchain-ai/langchain-community#351
However, what I found is that the core functions that compute the two distances work fine:
from falkordb import FalkorDB
client = FalkorDB(host="localhost", port=6379)
graph = client.select_graph("RcSF")
result = graph.query(
    "RETURN vec.euclideanDistance(vecf32([1.0, 0.0]), vecf32([0.0, 1.0])) AS euclid, "
    "vec.cosineDistance(vecf32([1.0, 0.0]), vecf32([0.0, 1.0])) AS cos"
)
print(result.result_set)
Run on its own, this query gives the right answers.
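For comparison, here are reference values for the same two vectors computed in plain Python. This is a sketch, assuming cosine distance is defined as 1 minus cosine similarity; the thread does not say whether FalkorDB reports plain or squared Euclidean distance, so both are printed.

```python
import math

a = [1.0, 0.0]
b = [0.0, 1.0]

# Plain and squared Euclidean distance.
euclid = math.dist(a, b)
euclid_squared = sum((x - y) ** 2 for x, y in zip(a, b))

# Cosine similarity and the derived cosine distance.
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(y * y for y in b))
cos_sim = dot / (norm_a * norm_b)
cos_dist = 1 - cos_sim  # assumed definition of cosine distance

print(f"euclid={euclid}, euclid_squared={euclid_squared}, cos_dist={cos_dist}")
```

The two vectors are orthogonal, so the Euclidean distance is the square root of 2 and the cosine distance is 1 under this definition.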
The problem begins with how it is being retrieved, which I believe is not a problem in the LangChain library.
I am opening an issue in the FalkorDB repo, but I think we can close this issue for now. Please let me know if otherwise.
Also, please feel free to suggest changes to the PR linked above.
Hi @Sathyanarayanan-ops,
Can you elaborate on the issue on the FalkorDB retrieval side?
Hi Akanksha,
Sorry, I should have linked the issue that I opened in the FalkorDB repo.
Please find it below. Also, any of the mods can review the PR I have made and then close this issue after it is merged.
Please let me know if you need further elaboration.
Thanks
Hi all,
A reviewer ran some tests today and tried to merge the changes. I'm not sure how I missed it, but a few tests failed :(
I have pushed changes and all CI checks now pass, so hopefully it will be merged soon.
Will keep you all posted.
Thanks