AkariAsai/self-rag

The meaning of "_w_gs.jsonl" in evaluation data

qiweijian opened this issue · 2 comments

Thanks for your incredible work!

I noticed that there are four files for short-form QA in the eval data folder.

  • popqa_longtail.jsonl
  • popqa_longtail_w_gs.jsonl
  • triviaqa_test.jsonl
  • triviaqa_test_w_gs.jsonl

I am wondering what 'w_gs' means and more specifically,

  1. How is the ctxs field built?
  2. Why do the ctxs in popqa_longtail.jsonl and popqa_longtail_w_gs.jsonl differ in number and content? (Some contexts in popqa_longtail_w_gs.jsonl don't have a document id or score.)
  3. Why does triviaqa_test_w_gs_df have only 7313 samples while triviaqa_test.jsonl has 11313? How was it filtered?
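
One way to reproduce this comparison is sketched below. It is only a minimal illustration: the helper names `load_jsonl` and `summarize_ctxs` are hypothetical, and the per-context key names `id` and `score` are assumptions about the file layout (only the `ctxs` field name is taken from the files themselves), so adjust them if the actual keys differ.

```python
import json
from collections import Counter


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def summarize_ctxs(records, id_key="id", score_key="score"):
    """Summarize how many records exist and what their contexts look like.

    id_key and score_key are assumed key names, not confirmed by the repo.
    """
    ctxs_per_record = Counter()  # maps "number of ctxs" -> "number of records"
    missing_id = 0
    missing_score = 0
    for rec in records:
        ctxs = rec.get("ctxs") or []
        ctxs_per_record[len(ctxs)] += 1
        for ctx in ctxs:
            if ctx.get(id_key) is None:
                missing_id += 1
            if ctx.get(score_key) is None:
                missing_score += 1
    return {
        "num_records": len(records),
        "ctxs_per_record": dict(ctxs_per_record),
        "ctxs_missing_id": missing_id,
        "ctxs_missing_score": missing_score,
    }
```

Calling `summarize_ctxs(load_jsonl("popqa_longtail.jsonl"))` and then the same on `popqa_longtail_w_gs.jsonl` gives the record counts, the distribution of contexts per record, and how many contexts lack an id or score, which can be compared side by side.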

Hi! _gs indicates that the retrieved results are further enhanced by Google Programmable Search, in addition to the original Contriever top 10 documents. We added these new results in Section B.2 of our updated manuscript.
That's the reason the number of contexts differs. The difference in the number of instances does seem odd, and I may have uploaded an incorrect file... Let me double check!

Hi there!
I'm sorry to ask, but where can I find the eval data folder, or how can I generate it?