Comparison on Machine Learning Models and BERT on SQuAD

This repository contains our implementations of several machine learning models on the SQuAD dataset, along with the state-of-the-art model BERT. A pre-print describes the comparative analysis on SQuAD in detail.

Link to pre-print: https://arxiv.org/pdf/2005.11313.pdf

Note: The ML_final file contains all the models we used. The PCA and Regression files contain the individual implementations of PCA and regression.

To run the final file (ML_final.ipynb):

Run the first cell to import all the libraries.
Run the cells after the "start here" section to avoid repeating the preprocessing. (Preprocessing takes 3-4 hours because the embedding files are quite large.)
(Optional) To run the sentiment analysis code: obtain the VADER lexicon by some means other than the nltk.download('vader_lexicon') command. Try wget if possible.
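
If the lexicon is fetched manually, one way to check that the download is usable is to parse it directly. The sketch below is not part of the notebook; it assumes the lexicon has been saved locally as a plain text file in the standard VADER layout (tab-separated: token, mean valence, standard deviation, raw ratings). The function name `load_vader_lexicon` and the file name `vader_lexicon.txt` are hypothetical.

```python
from pathlib import Path


def load_vader_lexicon(path):
    """Parse a VADER lexicon text file into a {token: mean_valence} dict.

    Assumes each non-empty line looks like:
    token <TAB> mean valence <TAB> std dev <TAB> raw ratings
    """
    lexicon = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.split("\t")
        # Skip blank or malformed lines that lack a valence column.
        if len(parts) < 2 or not parts[0].strip():
            continue
        lexicon[parts[0]] = float(parts[1])
    return lexicon
```

For example, `load_vader_lexicon("vader_lexicon.txt")` would return a dictionary mapping each token to its mean valence score. Note that this only reads the file; NLTK's own analyzer looks the lexicon up in its data directories, so the file would still need to be placed where NLTK expects it for the notebook's sentiment code to find it.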
