husseinmozannar/SOQAL

Arabic-SQuAD train/dev/test splits

spookyQubit opened this issue · 0 comments

Hi @husseinmozannar, thanks a lot for sharing the code and the data.

I had a small question regarding Arabic-SQuAD. The paper mentions that Arabic-SQuAD is split 80-10-10% into three parts for training, development, and testing, that Arabic-SQuAD-Test is composed of 2,966 questions on 24 articles, and that articles are distinct between the parts.

Is there an official split that one should use for Table 5 of the paper? I ask because Arabic-SQuAD.json comes without any train/dev/test markings (unless I am looking at the wrong file). Would it be possible to share the splits?
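
Absent an official split, an article-level 80-10-10 split could be approximated along the lines below. This is only a sketch under assumptions: it assumes Arabic-SQuAD.json follows the standard SQuAD v1.1 layout (a top-level `data` list of articles, each with `paragraphs` containing `qas`), it splits by article count rather than by question count, and the function names `split_by_article` and `count_questions` are hypothetical. It would not reproduce the exact test set from the paper (2,966 questions on 24 articles), which is why the official split is still needed.

```python
import random


def split_by_article(articles, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle articles and cut them into train/dev/test parts.

    Splitting at the article level keeps articles distinct between parts.
    The fractions apply to the number of articles, not questions (assumption).
    """
    shuffled = list(articles)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # whatever remains goes to test
    return train, dev, test


def count_questions(articles):
    """Count questions across articles, assuming SQuAD v1.1 nesting."""
    return sum(len(p["qas"]) for a in articles for p in a["paragraphs"])


# Usage (assumes the file sits in the working directory):
# import json
# with open("Arabic-SQuAD.json", encoding="utf-8") as f:
#     data = json.load(f)["data"]
# train, dev, test = split_by_article(data)
# print(len(test), count_questions(test))
```

Comparing `len(test)` and `count_questions(test)` against the 24 articles and 2,966 questions reported in the paper would show how far such an unofficial split is from the original.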

Thanks a lot in advance.