AkariAsai/self-rag

About PopQA

AllenShow opened this issue · 1 comments

Hi! Thanks for the great work.
When reproducing the PopQA inference with Self-RAG, I got the same score for adaptive_retrieval and always_retrieve.
In theory, shouldn't the adaptive_retrieval result be better than always_retrieve? I don't know why they are identical...

I ran the same set of experiments and got the same results. I think this is because, with threshold=0.2, the retrieval frequency is 100%: the model retrieves for every query, so adaptive_retrieval ends up behaving exactly like always_retrieve. The same can be observed for ARC-C and PubHealth.
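
To check this on your own run, here is a minimal sketch of how the retrieval frequency could be measured. It assumes the adaptive decision compares the normalized probability of the retrieval token against the threshold, which is how I understand the paper's description; it is not copied from the repo. The function names and the example probabilities are hypothetical and only illustrate the effect.

```python
from typing import Iterable, List, Tuple


def should_retrieve(p_retrieve: float, p_no_retrieve: float, threshold: float) -> bool:
    """Assumed rule: retrieve when the normalized probability of the
    retrieval token exceeds the threshold."""
    total = p_retrieve + p_no_retrieve
    if total <= 0.0:
        # No probability mass on either token: do not retrieve.
        return False
    return p_retrieve / total > threshold


def retrieval_frequency(probs: Iterable[Tuple[float, float]], threshold: float) -> float:
    """Fraction of queries for which retrieval would be triggered."""
    pairs: List[Tuple[float, float]] = list(probs)
    if not pairs:
        return 0.0
    hits = sum(should_retrieve(p, q, threshold) for p, q in pairs)
    return hits / len(pairs)


# Made-up (p_retrieve, p_no_retrieve) pairs, one per query.
example = [(0.60, 0.40), (0.35, 0.65), (0.25, 0.75)]

low = retrieval_frequency(example, threshold=0.2)   # every query retrieves
high = retrieval_frequency(example, threshold=0.5)  # only some queries retrieve
print(f"threshold=0.2: {low:.2f}, threshold=0.5: {high:.2f}")
```

If the frequency on PopQA comes out as 1.0 at threshold=0.2, identical scores for the two modes are expected, and only a higher threshold would make them differ.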