truera/trulens

Evaluation metrics inference

FatemaD1577 opened this issue · 3 comments

Hi,

I am using TruLens to evaluate the performance of one of our chat applications, and I need help understanding how to interpret its metrics. For example, the answer to my query is actually incorrect. The context relevance and groundedness scores reflect this to some extent, but answer relevance is always 1.

Can someone help me understand how to interpret these metrics, or point me to content that explains them at a deeper level?

Thanks in advance.

Hi @FatemaD1577 !

We have material in our docs:
Core Concept: The RAG Triad

You may also enjoy a short course on this topic that we put together with Andrew Ng and LlamaIndex:
Building and Evaluating Advanced RAG
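
In the meantime, here is a rough illustration of why answer relevance can stay at 1 when the answer is wrong. Each metric in the triad compares a different pair: context relevance looks at query vs. retrieved context, groundedness at answer vs. context, and answer relevance at query vs. answer. None of them checks the answer against ground truth. The sketch below is not the TruLens API: `overlap` and `rag_triad` are hypothetical helpers, and simple token overlap stands in for the LLM-based scoring a real feedback function would use.

```python
import re


def _tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a, b):
    """Fraction of tokens in `a` that also appear in `b`.

    Toy stand-in for an LLM judge; an assumption for illustration only.
    """
    ta, tb = _tokens(a), _tokens(b)
    return len(ta & tb) / len(ta) if ta else 0.0


def rag_triad(query, context, answer):
    """Score the three query/context/answer pairs (hypothetical helper)."""
    return {
        # Is the retrieved context about the query?
        "context_relevance": overlap(query, context),
        # Is the answer supported by the retrieved context?
        "groundedness": overlap(answer, context),
        # Does the answer address the query? (Says nothing about correctness.)
        "answer_relevance": overlap(query, answer),
    }


scores = rag_triad(
    query="capital of Australia",
    context="Canberra is the capital of Australia.",
    answer="Sydney is the capital of Australia.",
)
print(scores)
```

The answer here is on topic but wrong, so answer relevance comes out at its maximum while groundedness drops, because "Sydney" is not supported by the context. A low groundedness or context relevance score is the signal to look at in that case. Checking correctness directly would need a reference answer to compare against.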

Thank you, @joshreini1, for the prompt reply! I will check out the course.

You're welcome!