Spanish version of SQuAD 1.1 and 2.0 obtained via machine translation provided by Tilde.
This is one of the two es-SQuAD datasets used in the paper "El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks", published at the HI4NLP Workshop @ ECAI 2020.
| | v1.1 | v2.0 |
|---|---|---|
| train | 57280 | 56764 |
| dev | 7962 | 4530 |
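
To check split sizes like those in the table on a local copy, a small counting script can help. The sketch below assumes the files keep the original SQuAD JSON layout (`data` → `paragraphs` → `qas`, with `is_impossible` flags in v2.0), which this page does not confirm. It also does not say which unit the table counts, so the function reports several. The function names and the inline sample are hypothetical.

```python
import json
from typing import Dict


def count_squad_entries(dataset: dict) -> Dict[str, int]:
    """Count articles, paragraphs, questions and unanswerable questions.

    Assumes the SQuAD JSON layout; `is_impossible` exists only in
    v2.0-style files and is treated as False when absent.
    """
    counts = {"articles": 0, "paragraphs": 0, "questions": 0, "unanswerable": 0}
    for article in dataset.get("data", []):
        counts["articles"] += 1
        for paragraph in article.get("paragraphs", []):
            counts["paragraphs"] += 1
            for qa in paragraph.get("qas", []):
                counts["questions"] += 1
                if qa.get("is_impossible", False):
                    counts["unanswerable"] += 1
    return counts


def count_squad_file(path: str) -> Dict[str, int]:
    """Load a SQuAD-format JSON file (path supplied by the caller) and count it."""
    with open(path, encoding="utf-8") as handle:
        return count_squad_entries(json.load(handle))


# Tiny hand-written sample in SQuAD v2.0 style, for illustration only.
sample = {
    "version": "v2.0",
    "data": [
        {
            "title": "Ejemplo",
            "paragraphs": [
                {
                    "context": "Madrid es la capital de España.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "¿Cuál es la capital de España?",
                            "answers": [{"text": "Madrid", "answer_start": 0}],
                            "is_impossible": False,
                        },
                        {
                            "id": "q2",
                            "question": "¿Cuántos habitantes tiene Madrid?",
                            "answers": [],
                            "is_impossible": True,
                        },
                    ],
                }
            ],
        }
    ],
}

sample_counts = count_squad_entries(sample)
print(sample_counts)
```

For a downloaded split, `count_squad_file` can be called with the local file path. Comparing its output with the table shows which unit the published numbers refer to.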
If you find the dataset useful, please cite:
@inproceedings{Khvalchik2020ElDD,
title={El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks},
booktitle={Proceedings of HI4NLP Workshop at ECAI 2020},
author={Maria Khvalchik and Mikhail Galkin},
year={2020}
}