hwchase17/chroma-langchain

Segmentation Fault when Initializing Chroma Vector Store in LangChain


I am encountering a segmentation fault when trying to initialize a Chroma vector store using langchain_community.vectorstores.Chroma. The issue occurs specifically at the point where I call Chroma.from_texts to create the vector store. Here is a minimal code snippet to demonstrate the issue:

import numpy as np
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores.chroma import Chroma

# Simplified data for testing
texts = ["This is a test document.", "This is another test document."]
metadatas = [{"title": "Test Document 1", "summary": "Summary of test document 1"},
             {"title": "Test Document 2", "summary": "Summary of test document 2"}]

# Initialize embeddings model
embeddings_model = OpenAIEmbeddings()

# Debugging information to check data integrity
print(f"Number of documents: {len(texts)}")
print(f"First document text: {texts[0]}")
print(f"First document metadata: {metadatas[0]}")

# Attempt to initialize Chroma Vector Store
try:
    print("Initializing Chroma Vector Store...")
    docsearch = Chroma.from_texts(texts=texts, embedding=embeddings_model, metadatas=metadatas)
    print("Chroma Vector Store initialized successfully.")
except Exception as e:
    print(f"Error initializing Chroma Vector Store: {e}")

Expected Behavior:

The Chroma vector store should initialize successfully, and the subsequent print statements should execute without errors.

Actual Behavior:

The script encounters a segmentation fault immediately after attempting to initialize the Chroma vector store with Chroma.from_texts.

Interestingly, the output of the print statements before the call to Chroma.from_texts seems to get swallowed by the crash, possibly because standard output is being diverted.
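
A segmentation fault is a fatal signal rather than a Python exception, so the try/except block in the snippet cannot catch it. One way to gather more information, sketched here with only the standard library, is to enable faulthandler before the failing call and to flush every print explicitly. The log helper is a hypothetical name introduced for illustration, and this is a diagnostic aid, not a fix for the crash itself:

```python
import faulthandler

# On a fatal signal such as SIGSEGV, dump the Python traceback of all
# threads to stderr before the process dies.
faulthandler.enable()


def log(message: str) -> None:
    """Hypothetical helper: print and flush immediately so the output
    is not lost in a buffer if the process crashes right afterwards."""
    print(message, flush=True)


log("Initializing Chroma Vector Store...")
# The Chroma.from_texts(...) call from the snippet above would go here.
```

The same traceback dump can be enabled without editing the script by running it with python -X faulthandler or by setting the PYTHONFAULTHANDLER environment variable.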