julep-ai/julep

Optimize docs search hyperparameters


Let's run an automated evaluation on the RAG dataset (using a local model or similar) and then tune the doc search hyperparameters based on the results. The parameters are:

  • number of docs: k_docs
  • confidence: docs_confidence

https://github.com/julep-ai/julep/blob/dev/agents-api/agents_api/models/entry/proc_mem_context.py#L13

RAG dataset: rag-12000
It contains three columns: context, question, answer

evaluation recipe:

  • create an agent

  • add all the documents from the context column as agent docs

  • for every row in the dataset (use the train split only):

    • create a session with the agent
    • ask the question from the question column (you can set max_tokens to 1, since we don't care about the returned answer)
    • note the document IDs returned from session.chat
    • fetch all documents using those document IDs
    • check whether the context (the value for that row) is among the fetched documents
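
The per-row check above boils down to a retrieval hit rate. Here is a minimal sketch of that metric, not the actual Julep client code: `retrieve` is a hypothetical callable standing in for the "create session, call session.chat, fetch documents by ID" steps, and the toy word-overlap retriever exists only so the example runs on its own. Treating docs_confidence as a minimum score threshold is an assumption as well.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
# Hypothetical signature: (question, k_docs, docs_confidence) -> retrieved document texts.
Retriever = Callable[[str, int, float], List[str]]


def hit_rate(rows: List[Row], retrieve: Retriever, k_docs: int, docs_confidence: float) -> float:
    """Fraction of rows whose own context appears among the retrieved documents."""
    if not rows:
        return 0.0
    hits = 0
    for row in rows:
        fetched = retrieve(row["question"], k_docs, docs_confidence)
        if row["context"] in fetched:
            hits += 1
    return hits / len(rows)


def make_toy_retriever(docs: List[str]) -> Retriever:
    """Stand-in for the agent and session.chat steps: ranks docs by word overlap."""

    def retrieve(question: str, k_docs: int, docs_confidence: float) -> List[str]:
        q_words = set(question.lower().split())
        scored = []
        for doc in docs:
            d_words = set(doc.lower().split())
            score = len(q_words & d_words) / max(len(q_words), 1)
            # Assumption: docs_confidence acts as a minimum score threshold.
            if score >= docs_confidence:
                scored.append((score, doc))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k_docs]]

    return retrieve


# Made-up rows with the same three columns as the dataset.
rows = [
    {"context": "paris is the capital of france",
     "question": "what is the capital of france", "answer": "Paris"},
    {"context": "the nile is a river in africa",
     "question": "where is the nile river", "answer": "Africa"},
]
retrieve = make_toy_retriever([row["context"] for row in rows])
print(hit_rate(rows, retrieve, k_docs=1, docs_confidence=0.5))
```

In a real run, `retrieve` would wrap the agent session calls, and `hit_rate` would be the score to maximize over k_docs and docs_confidence.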

Optuna could be handy for the tuning: https://optuna.org/
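
A rough sketch of how the tuning loop might look with Optuna, under stated assumptions: `evaluate` is a hypothetical placeholder for the evaluation recipe (here it returns a made-up score so the sketch runs), and the search ranges for k_docs and docs_confidence are guesses, not values taken from the codebase.

```python
def evaluate(k_docs: int, docs_confidence: float) -> float:
    """Placeholder for the evaluation recipe; should return the retrieval hit rate."""
    # Made-up score surface peaking at k_docs=3, docs_confidence=0.7 (illustrative only).
    return 1.0 - abs(k_docs - 3) / 10 - abs(docs_confidence - 0.7)


def objective(trial) -> float:
    """Optuna objective: sample one hyperparameter pair and score it."""
    # Assumed search ranges; adjust to whatever the doc search actually accepts.
    k_docs = trial.suggest_int("k_docs", 1, 10)
    docs_confidence = trial.suggest_float("docs_confidence", 0.0, 1.0)
    return evaluate(k_docs, docs_confidence)


def tune(n_trials: int = 50):
    """Run the study and return the best parameters and their score."""
    import optuna  # third-party dependency: pip install optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `tune()` returns the best `{"k_docs": ..., "docs_confidence": ...}` pair found and its score. Each trial would replay the whole evaluation, so running on a subsample of the train split first may keep it cheap.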