Resources and data for Natural Language Processing in Question Answering, from the work "Characterization and automatic prediction of difficulty in Question Answering collections based on Neural Language Models".
The evolution of question answering has been driven by the emergence of challenging datasets that require world knowledge to answer. Recently, pre-trained neural language models such as BERT, RoBERTa and T5 have greatly improved on the results of previous approaches. However, error analysis of these models is scarce and does not reveal in which aspects they can be improved or what types of questions pose the greatest difficulty.
To address this problem, this work proposes the automatic linguistic characterization of several datasets used for fine-tuning these models, such as SQuAD, NewsQA and RACE, together with a study associating the mistakes and successes made by various models on these collections. In addition, a methodology for the automatic annotation of the complexity of question collections is proposed, based on the difficulty they pose to various systems. Finally, several predictive models based on Machine Learning are evaluated to study how well the annotation proposed in this work can be predicted. In this way, the aim is to advance the study of how to improve the results of current Question Answering systems.
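As an illustration of the kind of annotation described above, a minimal sketch could label each question by the fraction of systems that answer it incorrectly. Note that the bucket names and thresholds below are illustrative assumptions, not the exact procedure of the work:

```python
from collections import Counter

def difficulty_label(system_correct, thresholds=(0.25, 0.75)):
    """Annotate a question's difficulty from per-system outcomes.

    system_correct: list of booleans, one per QA system, True if that
    system answered the question correctly.
    Returns 'easy', 'medium' or 'hard' according to the error rate.
    NOTE: the thresholds are illustrative assumptions.
    """
    error_rate = 1 - sum(system_correct) / len(system_correct)
    if error_rate <= thresholds[0]:
        return "easy"
    if error_rate <= thresholds[1]:
        return "medium"
    return "hard"

# Hypothetical outcomes of five systems (e.g., BERT, RoBERTa and T5
# variants) on three questions from a collection such as SQuAD.
outcomes = [
    [True, True, True, True, False],    # 1/5 systems fail
    [True, False, True, False, False],  # 3/5 systems fail
    [False, False, False, False, True], # 4/5 systems fail
]
labels = [difficulty_label(o) for o in outcomes]
print(Counter(labels))
```

Labels produced this way can then serve as the target variable for the Machine Learning predictors trained on linguistic features of the questions.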
References:
Pranav Rajpurkar, Robin Jia, and Percy Liang. Know What You Don't Know: Unanswerable Questions for SQuAD. In Proceedings of ACL, 2018. URL https://rajpurkar.github.io/SQuAD-explorer/.
Danqi Chen, Jason Bolton, and Christopher D. Manning. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task. In Proceedings of ACL, 2016. URL https://cs.nyu.edu/~kcho/DMQA/.
Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. RACE: Large-scale ReAding Comprehension Dataset From Examinations. In Proceedings of EMNLP, 2017. URL http://www.cs.cmu.edu/~glai1/data/race/.