truera/trulens

Issue with Multiple Retrievers

Hello,

I am building a RAG application using LangChain and a custom LLM. I use a ContextualCompressionRetriever to rerank the retrieved documents with Cohere and pass it into a RetrievalQA chain, but when I pass the RetrievalQA chain instance (which references the ContextualCompressionRetriever) to TruLens, it throws the error below:

ValueError: Found more than one 'BaseRetriever' in app: <class 'langchain.retrievers.contextual_compression.ContextualCompressionRetriever'> at <class 'langchain_core.vectorstores.VectorStoreRetriever'> at base_retriever

Could you please help with how to resolve this issue? Thanks.

Here is the code snippet:

from langchain.chains import RetrievalQA
from langchain.retrievers import ContextualCompressionRetriever
from trulens_eval.app import App

# rerank the vector store results with Cohere before they reach the chain
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 5})
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=compression_retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": get_prompt_template()}
)

context = App.select_context(compression_retriever) # ERROR

Hey @nikhilkomakula - can you also share your feedback definitions?

Here you go:

from trulens_eval import Feedback, TruChain
from trulens_eval.app import App
from trulens_eval.feedback import Groundedness
from trulens_eval.feedback.provider.langchain import Langchain
import numpy as np

# select context to be used in feedback. the location of context is app specific.
context = App.select_context(compression_retriever)

# initiate the provider
langchain_provider = Langchain(chain=llm, prompt=get_prompt_template())

# Define a groundedness feedback function
groundedness = Groundedness(groundedness_provider=langchain_provider)
f_groundedness = (
    Feedback(groundedness.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(context.collect())  # collect context chunks into a list
    .on_output()
    .aggregate(groundedness.grounded_statements_aggregator)
)

# Question/answer relevance between overall question and answer.
f_qa_relevance = Feedback(langchain_provider.relevance, name="Answer Relevance").on_input_output()

# Question/statement relevance between question and each context chunk.
f_context_relevance = (
    Feedback(langchain_provider.qs_relevance, name="Context Relevance")
    .on_input()
    .on(context)
    .aggregate(np.mean)
)

tru_recorder = TruChain(
    qa,  # the RetrievalQA chain defined above
    app_id='Chain1_ChatApplication',
    feedbacks=[f_qa_relevance, f_context_relevance, f_groundedness]
)

with tru_recorder as recording:
    llm_response = qa.invoke(query)

display(llm_response)

Thanks.

Thanks @nikhilkomakula !

The issue here is with the context selection:

context = App.select_context(compression_retriever)

Because your app contains two retrievers, you need to explicitly select the one whose results you wish to evaluate.
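
To make the ambiguity concrete: select_context scans the app for BaseRetriever components, and ContextualCompressionRetriever is itself a BaseRetriever that wraps a second one at its base_retriever attribute. A quick check, using the names from your snippet:

from langchain_core.retrievers import BaseRetriever

# Both the outer reranking retriever and the inner vector store retriever
# are BaseRetriever subclasses, so the automatic scan finds two matches.
print(isinstance(compression_retriever, BaseRetriever))                 # True
print(isinstance(compression_retriever.base_retriever, BaseRetriever))  # True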

Please check out our docs here for more details on how to select a particular component of your app for eval.
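
As a starting point, you can ask the recorder to list everything it instrumented, along with the lens path for each component; a sketch using the tru_recorder you already build around the chain (print_instrumented is the trulens_eval helper for this):

# Print each instrumented component together with the selector path
# you can use when defining feedbacks.
tru_recorder.print_instrumented()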

Thanks for your feedback @joshreini1. In my case, the compression_retriever reranks the context retrieved by the base_retriever, and I would like to evaluate the results based on the reranked context from the compression_retriever. I'm sorry to say that I looked at the documentation mentioned above but could not work out what changes the code above needs to make it work. Any suggestions are greatly appreciated. Thanks in advance.

Hi @nikhilkomakula, I think setting the context the following way should solve your issue:

from trulens_eval import Select

context = Select.RecordCalls.first.steps__.context.get_relevant_documents.rets

Let me know if this works.
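
For completeness, a sketch of how that selector slots into the earlier feedback definitions (same langchain_provider as above; the lens path assumes the RetrievalQA structure from this thread):

from trulens_eval import Feedback, Select
import numpy as np

# Point the feedback at the reranked documents returned by the
# ContextualCompressionRetriever inside the RetrievalQA chain.
context = Select.RecordCalls.first.steps__.context.get_relevant_documents.rets

f_context_relevance = (
    Feedback(langchain_provider.qs_relevance, name="Context Relevance")
    .on_input()
    .on(context)
    .aggregate(np.mean)
)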