Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework
This repo contains the code for the paper "Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework", accepted at EMNLP 2021 Findings. The blog on this paper can be found here, the poster here, and the corresponding presentation here.
Please run `pip install -r requirements.txt` (Python 3 required).
Go to this link. A RoBERTa BASE model pre-trained on the corpus can be found here, and a BERT BASE UNCASED model pre-trained on the same corpus here.
- Annotated Data and Amazon User Forum Data Samples are present in `data` (See README)
- Data Analysis is done in `data_analysis` (See README)
- Corpus extraction code is present in `pre_training_corpus_extraction` (See README)
- E-Manual Data Extraction code is present in `EManual_data_extraction` (See README)
- Code for pre-training is given in `pre-training` (See README)
- Code for the unsupervised IR method and fine-tuning variants is given in `fine_tuning_variants_scripts` (See README)
- Code for multi-task learning is given in `MTL_scripts` (See README)
- Code for functions used to evaluate MTL and fine-tuning variants is given in `evaluation` (See README)
- For ROUGE-L Precision, Recall, and F1-Score: https://pypi.org/project/py-rouge/
- For S+WMS: https://github.com/eaclark07/sms
- Dense Passage Retrieval (DPR) - used the HuggingFace implementation (https://huggingface.co/transformers/model_doc/dpr.html)
- Technical Answer Prediction (TAP) - adapted code from https://github.com/IBM/techqa
- MultiSpan - adapted code from https://github.com/eladsegal/tag-based-multi-span-extraction
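As an illustration of the ROUGE-L metric computed by the py-rouge package above: ROUGE-L scores a candidate answer against a reference by the length of their longest common subsequence (LCS). Below is a minimal pure-Python sketch with simple whitespace tokenization (py-rouge additionally supports stemming, length limits, and multi-reference aggregation; the example sentences are made up for illustration):

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference):
    """ROUGE-L precision, recall, and F1 between two strings."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    precision = lcs / len(cand) if cand else 0.0
    recall = lcs / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical E-Manual-style candidate vs. reference answer
p, r, f = rouge_l("press the power button for five seconds",
                  "hold the power button for five seconds to reset")
# LCS = "the power button for five seconds" (6 tokens),
# so precision = 6/7, recall = 6/9, F1 = 0.75
```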
Please cite the work if you would like to use it.
```
@article{nandy2021question,
  title={Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework},
  author={Nandy, Abhilash and Sharma, Soumya and Maddhashiya, Shubham and Sachdeva, Kapil and Goyal, Pawan and Ganguly, Niloy},
  journal={arXiv preprint arXiv:2109.05897},
  year={2021}
}
```