Optimize docs search hyperparameters
Closed this issue · 0 comments
creatorrr commented
Let's run an automated evaluation on the RAG dataset (using a local model or something) and then tune the doc search hyperparameters based on this. Parameters are:
- num docs: `k_docs`
- confidence: `docs_confidence`
rag dataset: rag-12000
contains three columns: `context`, `question`, `answer`
evaluation recipe:
- create an agent
- add all the documents from the `context` column as agent docs
- for every row in the dataset (use the train split only)
  - create a session with the agent
  - ask the question from the `question` column (you can set max_tokens to 1 since we don't care about the returned answer)
  - note the document ids returned from session.chat
  - get all documents using the document ids
  - check if `context` (value of that row) is in the fetched documents
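A rough sketch of the scoring part of the recipe above, in case it helps. It does not call the real agent/session API: `retrieve` is a hypothetical stand-in for "create a session, ask the question, fetch the documents for the returned ids", and the rows and `fake_index` are made-up toy data. The metric is the fraction of rows whose `context` shows up in the fetched documents.

```python
from typing import Callable, Iterable, Mapping, Sequence

# Hypothetical stand-in for: create a session, ask the question, then fetch
# the documents for the returned ids. Takes a question, returns document texts.
Retrieve = Callable[[str], Sequence[str]]


def context_hit_rate(rows: Iterable[Mapping[str, str]], retrieve: Retrieve) -> float:
    """Fraction of rows whose `context` appears among the retrieved documents."""
    hits = 0
    total = 0
    for row in rows:
        fetched = retrieve(row["question"])
        # A row is a hit if its context is contained in any fetched document.
        if any(row["context"].strip() in doc for doc in fetched):
            hits += 1
        total += 1
    return hits / total if total else 0.0


# Toy data and an in-memory "retriever" (assumptions, not the real dataset or API).
rows = [
    {"context": "Paris is the capital of France.",
     "question": "capital of France?", "answer": "Paris"},
    {"context": "Water boils at 100 C at sea level.",
     "question": "boiling point of water?", "answer": "100 C"},
]
fake_index = {
    "capital of France?": ["Paris is the capital of France."],
    "boiling point of water?": ["Ice melts at 0 C."],
}

score = context_hit_rate(rows, lambda q: fake_index.get(q, []))
print(score)
```

In the real run, `retrieve` would wrap the session calls and `rows` would come from the train split.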
cool thing: optuna: https://optuna.org/
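Something like this could be a starting point for the tuning loop with optuna. Everything except the optuna calls is an assumption: `CANDIDATES` is made-up toy data, `evaluate` is a hypothetical stand-in for running the evaluation recipe with a given setting, the search ranges are guesses, and `docs_confidence` is assumed to act as a minimum similarity threshold.

```python
# Toy retrieval results, one list per question: (similarity score,
# is_the_right_context), best first. Made-up numbers for illustration only.
CANDIDATES = [
    [(0.92, True), (0.80, False), (0.41, False)],
    [(0.75, False), (0.70, True), (0.30, False)],
    [(0.66, False), (0.52, False), (0.48, True)],
]


def evaluate(k_docs: int, docs_confidence: float) -> float:
    """Hypothetical stand-in for the evaluation recipe: hit rate for one setting."""
    hits = 0
    for ranked in CANDIDATES:
        # Keep the top k_docs candidates that clear the confidence threshold.
        kept = [ok for score, ok in ranked[:k_docs] if score >= docs_confidence]
        hits += any(kept)
    return hits / len(CANDIDATES)


def objective(trial) -> float:
    # Search ranges are assumptions; adjust them to what the API accepts.
    k_docs = trial.suggest_int("k_docs", 1, 3)
    docs_confidence = trial.suggest_float("docs_confidence", 0.0, 1.0)
    return evaluate(k_docs, docs_confidence)


def tune(n_trials: int = 30) -> dict:
    # Imported here so the scoring code above works without optuna installed.
    import optuna  # third-party: pip install optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params
```

Calling `tune()` returns the best `k_docs` / `docs_confidence` found. One caveat: plain hit rate will always favor a large `k_docs` and a low threshold, so the objective probably needs some penalty on the number of returned docs to be useful.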