google-research/xtreme

How to get LAReQA results across different question languages

shunyuzh opened this issue · 5 comments

Hi @sebastianruder and others,

I see that the paper reports LAReQA results (mean average precision@20) broken down by question language.

However, following the released scripts, I can only get the average over all languages. Could you point out how to get the results for each question language?

Feel free to point out if I am missing something obvious. Thanks all.

@nconstant-google Hi Noah, could you give some suggestions?

cc @bothameister, who may remember how we calculated the per-language scores for LAReQA and Mewsli-X.

Hi all, the per-language results for Mewsli-X are reported in the evaluation logs, but those for LAReQA are not. Could you update the LAReQA evaluation log to include them? Judging by the description in your paper, the computation seems complicated, and I could not work it out myself.

In multilingual work, people care about per-language scores. Looking forward to your replies.

Thanks all.

Hi Shunyu, to confirm, are you trying to produce something like Table 24 from https://arxiv.org/abs/2104.07412v2 ? It's been a while since I looked at this, but I'd guess you can get per-language results by updating this line something like this?

all_questions = [q for q in question_set.as_list() if q.language == 'ar']
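To get a score for every question language in one pass, rather than editing the filter by hand for each language, the same idea can be wrapped in a loop. The following is a minimal sketch, not the repository's code: `Question`, `group_by_language`, `per_language_scores`, and `score_fn` are hypothetical names, and the only assumption carried over from the line above is that each question exposes a `language` attribute.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Question:
    """Hypothetical stand-in for a question object with a `language` field."""
    uid: str
    language: str


def group_by_language(questions: Iterable[Question]) -> Dict[str, List[Question]]:
    """Bucket questions by their language code."""
    groups: Dict[str, List[Question]] = defaultdict(list)
    for q in questions:
        groups[q.language].append(q)
    return dict(groups)


def per_language_scores(
    questions: Iterable[Question],
    score_fn: Callable[[List[Question]], float],
) -> Dict[str, float]:
    """Apply `score_fn` (e.g. a mAP@20 routine) to each language's questions."""
    groups = group_by_language(questions)
    return {lang: score_fn(groups[lang]) for lang in sorted(groups)}
```

Here `score_fn` stands for whatever routine currently computes the averaged metric from a list of questions; passing it each language's subset in turn yields one score per question language.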

Thanks, I am going to try.