License: GNU General Public License v3.0 (GPL-3.0)

SQuAD-es-mt

Spanish version of SQuAD 1.1 and 2.0, obtained with machine translation provided by Tilde.

This is one of the two es-SQuAD datasets used in the paper "El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks" (HI4NLP Workshop at ECAI 2020).

Statistics

        v1.1     v2.0
train   57280    56764
dev     7962     4530
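
To check figures like these against a downloaded split, the following is a minimal sketch that counts paragraphs and questions in a SQuAD-style JSON file. It assumes the translated files keep the original SQuAD layout (`data` > `paragraphs` > `qas`), which is not confirmed here. The helper names `load_squad` and `count_examples` are hypothetical, and the sample content is made up for illustration. Since the table does not say which unit it counts, the sketch reports both.

```python
import json
from typing import Any, Dict, Tuple


def load_squad(path: str) -> Dict[str, Any]:
    """Read a SQuAD-style JSON file; the path is supplied by the caller."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_examples(squad: Dict[str, Any]) -> Tuple[int, int]:
    """Return (number of paragraphs, number of questions).

    Assumes the original SQuAD layout:
    {"data": [{"paragraphs": [{"context": ..., "qas": [...]}]}]}
    """
    n_paragraphs = 0
    n_questions = 0
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            n_paragraphs += 1
            n_questions += len(paragraph["qas"])
    return n_paragraphs, n_questions


# Tiny in-memory sample in the assumed layout (made-up content, not from the dataset).
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Ejemplo",
            "paragraphs": [
                {
                    "context": "Madrid es la capital de España.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "¿Cuál es la capital de España?",
                            "answers": [{"text": "Madrid", "answer_start": 0}],
                        },
                        {
                            "id": "q2",
                            "question": "¿De qué país es capital Madrid?",
                            "answers": [{"text": "España", "answer_start": 24}],
                        },
                    ],
                }
            ],
        }
    ],
}

n_paragraphs, n_questions = count_examples(sample)
print(f"paragraphs={n_paragraphs} questions={n_questions}")
```

For a real split, pass the result of `load_squad(path)` to `count_examples`, where `path` points to whichever JSON file you downloaded.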

Citation

If you find the dataset useful, please cite:

@inproceedings{Khvalchik2020ElDD,
  title={El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks},
  booktitle={Proceedings of HI4NLP Workshop at ECAI 2020},
  author={Maria Khvalchik and Mikhail Galkin},
  year={2020}
}