"Rich" answers don't hallucinate. Almost.
In Retrieval-Augmented Generation (RAG), documents are first divided into multiple chunks. The chunk size and the overlap between adjacent chunks, among other parameters, play an important role in retrieving the correct context and generating correct answers.
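To make the two parameters concrete, here is a minimal sketch of overlapping chunking. It is not the splitter used by the application: the function name `chunk_tokens` is hypothetical, and it assumes the document has already been turned into a list of tokens (words would do for experimentation).

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of at most `chunk_size` tokens.

    Consecutive windows share `chunk_overlap` tokens, so each new window
    starts `chunk_size - chunk_overlap` tokens after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window already reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

A larger overlap lowers the chance that a relevant passage is cut in half at a chunk boundary, at the cost of more chunks to embed and search.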
RAG2Rich computes and uses the optimal configurations based on three metrics:
- Context relevance (C)
- Answer relevance (A)
- Groundedness (G)
We compute these metrics using TruLens Eval. Subsequently, we combine these metrics to define "Rich score" as follows:
where
We use the publicly available Technical Report titled Description of IEC 61850 Communication as the data source to demonstrate RAG2Rich. The report is used solely for the purpose of demonstration, and it is not distributed with the code.
RAG2Rich is tuned by considering the following questions:
- What is IEC 61850?
- Tell me about digital substations.
- What is the expansion of GOOSE?
- What are the different fields in a GOOSE packet?
- What are physical and logical devices?
- How do the stNum and sqNum values change?
- Show me a list of different data types supported by the standard.
- How does MMS communication work?
- Are GOOSE messages encrypted?
- What is a data set?
Based on the above-mentioned document, the application generates answers for each of these questions. The output score vector is computed for different sets of the following parameters:
- chunk size
- chunk overlap
- top-k
For each such set, the average richness is computed.
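The parameter sets can be enumerated as a Cartesian product of candidate values. The sketch below is only an illustration: the candidate values, the helper names `parameter_grid` and `average_richness`, and the dictionary keys are assumptions, not taken from experiments.py. The per-question scores are treated as given inputs.

```python
from itertools import product
from statistics import mean


def parameter_grid(chunk_sizes, chunk_overlaps, top_ks):
    """Yield one dict per (chunk size, chunk overlap, top-k) combination.

    Combinations whose overlap is not smaller than the chunk size are skipped.
    """
    for size, overlap, k in product(chunk_sizes, chunk_overlaps, top_ks):
        if overlap < size:
            yield {"chunk_size": size, "chunk_overlap": overlap, "top_k": k}


def average_richness(per_question_scores):
    """Average the per-question scores obtained with one parameter set."""
    return mean(per_question_scores)


# Hypothetical candidate values, chosen only to illustrate the enumeration.
grid = list(parameter_grid(
    chunk_sizes=(256, 512, 1024),
    chunk_overlaps=(50, 75),
    top_ks=(2, 4),
))
```

Each entry of `grid` corresponds to one experiment run over all the questions listed above.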
The currently used RAG parameters are:
- chunk size = 512
- chunk overlap = 75
- top-k = 4
- Cohere re-ranker top-n = 3
Install the dependencies:
pip install -r requirements.txt
To run RAG2Rich, the document Q&A application, execute the following command:
chainlit run app_llamaindex.py
To run RAG fine-tuning experiments, execute the following command in the evaluation_trulens directory:
python experiments.py
Currently, the measurements are manually copied from the dashboard to CSV files (see examples in the evaluation_trulens directory). When these data are available, find the optimal settings by running:
python optimal_settings.py
The output shows the optimal value of
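For readers who want to reproduce the selection step on their own measurements, the following is a rough sketch of the idea and not the contents of optimal_settings.py. It assumes a CSV layout with the hypothetical columns `chunk_size`, `chunk_overlap`, `top_k`, and `rich_score` (one row per question and parameter set); the example files in the repository may be organized differently.

```python
import csv
from collections import defaultdict
from statistics import mean


def best_settings(csv_lines, score_column="rich_score"):
    """Return ((chunk_size, chunk_overlap, top_k), average score) for the
    parameter set with the highest average score.

    `csv_lines` is any iterable of CSV lines, for example an open file.
    The column names are assumptions made for this sketch.
    """
    scores = defaultdict(list)
    for row in csv.DictReader(csv_lines):
        key = (int(row["chunk_size"]), int(row["chunk_overlap"]), int(row["top_k"]))
        scores[key].append(float(row[score_column]))
    if not scores:
        raise ValueError("no measurements found")

    # Average over all questions for each parameter set, then pick the maximum.
    averages = {key: mean(values) for key, values in scores.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]
```

To use it, open a measurements file with `open(path, newline="")` and pass the file object to `best_settings`.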