Evaluation metrics inference
FatemaD1577 opened this issue · 3 comments
Hi,
I am using TruLens to evaluate the performance of one of our chat applications, and I need help understanding the metrics being used; I am not sure how to interpret them. For example: the answer to my query is actually incorrect, and while the context relevance and groundedness scores reflect this to some extent, the answer relevance score is always 1.
Can someone help me understand how to interpret these metrics, or point me to content that explains them at a deeper level?
Thanks in advance.
Hi @FatemaD1577 !
We have material in our docs:
Core Concept: The RAG Triad
And you may also enjoy a short course on this topic we put together with Andrew Ng and Llama-Index:
Building and Evaluating Advanced RAG
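One intuition that may help with the behavior you describe: answer relevance measures whether the response addresses the question, not whether it is factually correct, so a wrong-but-on-topic answer can still score 1. Here is a toy sketch of that distinction (a simple keyword-overlap scorer for illustration only; it is not TruLens's actual LLM-based feedback function):

```python
import re

STOPWORDS = {"what", "is", "the", "of", "a", "an", "in", "to", "how"}

def _content_words(text: str) -> set[str]:
    """Lowercased alphabetic words, minus common stopwords."""
    return set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS

def toy_answer_relevance(question: str, answer: str) -> float:
    """Fraction of the question's content words that appear in the answer.

    A crude stand-in for answer relevance: it checks topical overlap
    between question and answer, and knows nothing about correctness.
    """
    q_words = _content_words(question)
    if not q_words:
        return 0.0
    return len(q_words & _content_words(answer)) / len(q_words)

question = "What is the capital of Australia?"
wrong_but_on_topic = "The capital of Australia is Sydney."    # factually wrong
correct_answer = "The capital of Australia is Canberra."

# Both answers fully address the question, so both get a perfect
# relevance score -- relevance alone cannot flag the factual error.
print(toy_answer_relevance(question, wrong_but_on_topic))  # 1.0
print(toy_answer_relevance(question, correct_answer))      # 1.0
```

This is why the RAG triad combines three checks: context relevance and groundedness are the ones that catch an incorrect or unsupported answer, while answer relevance only catches answers that dodge the question.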
Thank you, @joshreini1, for the prompt reply! I will check out the course.
You're welcome!