[BUG] TruLens throws ValidationError: Input should be an instance of llama_index.core.base_query_engine.BaseQueryEngine
lucifermorningstar1305 opened this issue · 5 comments
Bug Description
I am trying to use TruLens with some custom LlamaIndex code I am working on, and I am getting the following error:
ValidationError: 2 validation errors for TruLlama
app.`is-instance[llama_index.core.base_query_engine.BaseQueryEngine]`
Input should be an instance of llama_index.core.base_query_engine.BaseQueryEngine [type=is_instance_of, input_value=<llama_index.core.query_e...bject at 0x7b3036edfa70>, input_type=RetrieverQueryEngine]
For further information visit https://errors.pydantic.dev/2.6/v/is_instance_of
app.`is-instance[llama_index.chat_engine.types.BaseChatEngine]`
Input should be an instance of llama_index.chat_engine.types.BaseChatEngine [type=is_instance_of, input_value=<llama_index.core.query_e...bject at 0x7b3036edfa70>, input_type=RetrieverQueryEngine]
For further information visit https://errors.pydantic.dev/2.6/v/is_instance_of
To Reproduce
from typing import Any, Callable
import numpy as np
import pandas as pd
import nest_asyncio
import os
import warnings
# Qdrant
from qdrant_client import QdrantClient
# LLamaIndex Imports
from llama_index.core.settings import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import VectorStoreIndex, Document, StorageContext, SimpleDirectoryReader, ServiceContext, load_index_from_storage
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.llms.gemini import Gemini
from llama_index.llms.ollama import Ollama
# Langchain Imports
from langchain_community.llms import Ollama as LangOllama
from langchain_google_genai import ChatGoogleGenerativeAI
# TruLens Imports
from trulens_eval.feedback.provider.langchain import Langchain
from trulens_eval import Feedback, TruLlama, Tru
from trulens_eval.feedback import Groundedness
# General
from pathlib import Path
from collections import defaultdict
documents = SimpleDirectoryReader(input_files=["../data/attn.txt"]).load_data()
document = Document(text="\n\n".join([doc.text for doc in documents]))
client = QdrantClient(host="localhost")
collection_name = "test"
def build_naive_index(document: Document, persist_dir: str = "./indices/myIndices", llm: str = "ollama") -> VectorStoreIndex:
    """Builds a naive RAG index for a given document based on the LLM provided."""
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
    if llm == "ollama":
        Settings.llm = Ollama(model="llama2", request_timeout=60.0)
    vector_store = QdrantVectorStore(collection_name=collection_name, client=client)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    if not os.path.exists(persist_dir):
        index = VectorStoreIndex.from_documents(documents=[document], storage_context=storage_context, show_progress=True)
        index.storage_context.persist(persist_dir=persist_dir)
    else:
        index = load_index_from_storage(StorageContext.from_defaults(vector_store=vector_store, persist_dir=persist_dir))
    return index
naive_index = build_naive_index(document=document, persist_dir="../indices/1706.03762", llm="ollama")
query_engine = naive_index.as_query_engine()
def get_prebuilt_trulens_recorder(query_engine: Callable, llm: str = "ollama", app_id: str = "App1") -> Callable:
    """Creates a TruLens recorder based on different metrics for evaluating an LLM."""
    provider = None
    if llm == "ollama":
        ollama = LangOllama(model="llama2")
        provider = Langchain(chain=ollama)
    # Answer Relevance
    ans_relevance = Feedback(provider.relevance_with_cot_reasons, name="Answer Relevance").on_input_output()
    # Context Relevance
    ctx_selection = TruLlama.select_source_nodes().node.text
    ctx_relevance = Feedback(provider.qs_relevance_with_cot_reasons, name="Context Relevance").on_input().on(ctx_selection).aggregate(np.mean)
    # Groundedness
    grounded = Groundedness(groundedness_provider=provider)
    gr_relevance = Feedback(grounded.groundedness_measure_with_cot_reasons, name="Groundedness").on(ctx_selection).on_output().aggregate(grounded.grounded_statements_aggregator)
    tru_recorder = TruLlama(query_engine,
                            app_id=app_id,
                            feedbacks=[ans_relevance, ctx_relevance, gr_relevance])
    return tru_recorder
tru = Tru()
tru.reset_database()
tru_recorder = get_prebuilt_trulens_recorder(query_engine, app_id="Naive Rag Engine")
Expected behavior
There should not be any validation error.
Environment:
- OS: Linux (Ubuntu)
- Python Version: 3.12
- TruLens version: 0.27.0
- Versions of other relevant installed libraries: llama-index-core==0.10.21.post1 ollama==0.1.7
Hey @lucifermorningstar1305! Great to see you back with us. Hope you're doing well!
Based on the error message you provided, it seems that the query_engine object you're passing to the TruLlama constructor is not an instance of llama_index.core.base_query_engine.BaseQueryEngine or llama_index.chat_engine.types.BaseChatEngine, which is what TruLlama expects.
In your code, you're creating the query_engine object with the following line:
query_engine = naive_index.as_query_engine()
The as_query_engine() method should return an object that is an instance of BaseQueryEngine or a subclass thereof. If it doesn't, you'll encounter the ValidationError you're seeing.
To resolve this issue, you should ensure that the as_query_engine() method of your VectorStoreIndex class (or whatever class naive_index is an instance of) returns an object that is an instance of BaseQueryEngine or correctly inherits from it.
Here's the relevant part of the TruLlama class definition for reference:
class TruLlama(App):
    ...
    app: Union[BaseQueryEngine, BaseChatEngine]
    ...
    def __init__(
        self, app: Union[BaseQueryEngine, BaseChatEngine], **kwargs: dict
    ):
        ...
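The ValidationError in the report is produced by pydantic v2's is-instance validation on that app field. Here is a minimal, self-contained sketch of the mechanism; Engine, OtherEngine, and App are illustrative stand-ins defined locally, not the real TruLens or LlamaIndex classes:

```python
# Sketch of pydantic v2 isinstance validation, the mechanism behind the
# "Input should be an instance of ..." error. Class names are hypothetical.
from pydantic import BaseModel, ConfigDict, ValidationError

class Engine:            # stands in for BaseQueryEngine
    pass

class OtherEngine:       # plays the same role, but is a different class object
    pass

class App(BaseModel):
    # arbitrary_types_allowed makes pydantic validate the field with
    # isinstance(value, Engine) instead of a schema
    model_config = ConfigDict(arbitrary_types_allowed=True)
    app: Engine

App(app=Engine())        # passes validation

try:
    App(app=OtherEngine())
except ValidationError as e:
    # pydantic reports the same error type seen in this issue
    print(e.errors()[0]["type"])  # is_instance_of
```

So the check is purely by class identity: an object whose class merely looks like BaseQueryEngine, but comes from a different module, still fails.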
As you can see, the app parameter (which corresponds to your query_engine object) is expected to be an instance of BaseQueryEngine or BaseChatEngine.
Unfortunately, without the actual implementation details of your VectorStoreIndex class and its as_query_engine method, it's hard to provide a more specific solution. However, the error message suggests a type mismatch between what TruLlama expects and what is being passed to it.
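A mismatch like this typically arises when two package versions each define their own copy of the base class: the engine inherits from one module's class while the validator checks against another's. A self-contained sketch of the effect, with locally defined stand-in classes that mirror (but are not) the llama_index legacy/core split:

```python
# Two identically-named base classes from different "modules": a subclass of
# one is not an instance of the other, even though the names match. This
# mirrors how a RetrieverQueryEngine can fail an isinstance check against a
# BaseQueryEngine imported from a different path.

class LegacyBaseQueryEngine:     # stands in for the old import path's class
    pass

class CoreBaseQueryEngine:       # stands in for llama_index.core's class
    pass

class RetrieverQueryEngine(CoreBaseQueryEngine):
    pass

engine = RetrieverQueryEngine()
print(isinstance(engine, CoreBaseQueryEngine))    # True
print(isinstance(engine, LegacyBaseQueryEngine))  # False -> ValidationError
```

isinstance compares class objects, not class names, which is why aligning the library versions (so both sides import the same class object) resolves the error.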
You can find more information about the TruLlama class in the TruLens codebase.
I hope this helps! If you have any more questions or if something is unclear, please let me know.
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
Try updating the llama-index version.
@lucifermorningstar1305 Did you try the update?
I have the same error, and the query_engine type is:
<class 'llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine'>
And version info:
llama-index-core==0.10.30
trulens-eval==0.28.1
When I updated to llama-index-core==0.10.31, it worked.
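For anyone hitting this later, the upgrade can be sketched as follows (version pins taken from the comments in this thread; adjust to your environment):

```shell
# llama-index-core 0.10.31 reportedly fixed the class-hierarchy mismatch;
# the trulens-eval pin matches the commenter's working setup.
pip install --upgrade "llama-index-core>=0.10.31" "trulens-eval>=0.28.1"
```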
Thanks @metaimagine - it seems this was an issue with the llama-index version that was patched on their end. Closing this issue.