yixuantt/MultiHop-RAG

Make public evaluation code

Closed this issue · 5 comments

Hi, would you please consider making public the code to reproduce the results in your paper? Thanks!

Hi, the evaluation code has already been made public. Please check evaluate.py.

@yixuantt I believe evaluate.py evaluates only the retrieval results, not the final task itself (question answering accuracy from Table 6).

@hugoabonizio Hi Hugo, you can check qa_llama.py, which I just updated. It is a demo script of my question-answering process.

@hugoabonizio Hi, how did you calculate the accuracy for Table 6? Could you provide a formula? Do you just check whether the gold answer appears in the model answer?
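If the metric is indeed gold-answer containment, a minimal sketch might look like the following. Note this is an assumption about the evaluation, not the paper's confirmed method; the function names and the case-insensitive matching are illustrative choices.

```python
def contains_gold(gold_answer: str, model_answer: str) -> bool:
    """Case-insensitive check that the gold answer appears in the model output."""
    return gold_answer.strip().lower() in model_answer.strip().lower()


def accuracy(gold_answers: list[str], model_answers: list[str]) -> float:
    """Fraction of examples whose model answer contains the gold answer."""
    hits = sum(contains_gold(g, m) for g, m in zip(gold_answers, model_answers))
    return hits / len(gold_answers)
```

For example, `accuracy(["Paris"], ["The answer is Paris."])` would give 1.0 under this definition, while a paraphrased answer that never states the gold string would count as wrong, which is one known limitation of substring matching.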

Closed, as the evaluation code has been made public.