Evaluation metrics inference
FatemaD1577 opened this issue · 3 comments
Hi,
I am using TruLens to evaluate the performance of one of our chat applications, and I need help understanding the metrics being used; I am not sure how to interpret them. For example: the answer to my query is actually incorrect, and while the context relevance and groundedness scores reflect this to some extent, the answer relevance score is always 1.
Can someone help me understand how to interpret these metrics, or point me to content that explains them at a deeper level?
Thanks in advance.
Hi @FatemaD1577 !
We have material in our docs:
Core Concept: The RAG Triad
And you may also enjoy a short course on this topic we put together with Andrew Ng and Llama-Index:
Building and Evaluating Advanced RAG
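One intuition that may help with the behavior you describe: answer relevance measures whether the response addresses the question, not whether it is factually correct, so a wrong-but-on-topic answer can still score 1. Here is a toy sketch of that distinction (a simple keyword-overlap scorer for illustration only; it is not TruLens's actual LLM-based feedback function):

```python
import re

STOPWORDS = {"what", "is", "the", "of", "a", "an", "in", "to", "how"}

def _content_words(text: str) -> set[str]:
    """Lowercased alphabetic words, minus common stopwords."""
    return set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS

def toy_answer_relevance(question: str, answer: str) -> float:
    """Fraction of the question's content words that appear in the answer.

    A crude stand-in for answer relevance: it checks topical overlap
    between question and answer, and knows nothing about correctness.
    """
    q_words = _content_words(question)
    if not q_words:
        return 0.0
    return len(q_words & _content_words(answer)) / len(q_words)

question = "What is the capital of Australia?"
wrong_but_on_topic = "The capital of Australia is Sydney."    # factually wrong
correct_answer = "The capital of Australia is Canberra."

# Both answers fully address the question, so both get a perfect
# relevance score -- relevance alone cannot flag the factual error.
print(toy_answer_relevance(question, wrong_but_on_topic))  # 1.0
print(toy_answer_relevance(question, correct_answer))      # 1.0
```

This is why the RAG triad combines three checks: context relevance and groundedness are the ones that catch an incorrect or unsupported answer, while answer relevance only catches answers that dodge the question.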
Thank you, @joshreini1, for the prompt reply! I will check out the course.
You're welcome!