Custom-BERT

This is a BERT-based Transformer machine learning model, fine-tuned on a custom dataset to give accurate predictions with low latency.


Custom BERT-based Transformer Model

This is a BERT-based Transformer model: a state-of-the-art architecture for learning contextual information from textual data and generating responses to the prompts posed to it. We took a pre-trained BERT-base cased model and fine-tuned it on our chosen dataset, SQuAD (the Stanford Question Answering Dataset). We achieved an accuracy above 96% with a CPU wall time of around 70 milliseconds.
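The actual training code lives in the notebooks; the snippet below is only a minimal sketch of the fine-tuning setup described above, assuming the Hugging Face transformers and datasets libraries (the model name bert-base-cased and the hyperparameters are illustrative, not taken from the repository):

```python
# Minimal sketch (not this repository's code): loading a pre-trained cased
# BERT-base model and SQuAD for question-answering fine-tuning, assuming the
# Hugging Face `transformers` and `datasets` libraries.
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

squad = load_dataset("squad")  # Stanford Question Answering Dataset
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForQuestionAnswering.from_pretrained("bert-base-cased")

# Tokenize question/context pairs; a full training pipeline must also map the
# answer character offsets to token-level start/end positions for the span loss.
def preprocess(batch):
    return tokenizer(
        batch["question"], batch["context"],
        truncation="only_second", max_length=384, padding="max_length",
    )

encoded = squad.map(preprocess, batched=True)
```

The reported CPU wall time of roughly 70 milliseconds would then be measured around a single forward pass of the fine-tuned model.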


Deployment

To deploy this project, run the training and main notebooks after mounting the drive that contains the dataset.
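The notebooks appear intended for a Google Colab-style environment; a minimal sketch of the drive-mounting step, with a hypothetical dataset path, could look like this:

```python
# Sketch of the drive-mounting step for Google Colab; the dataset path below is
# hypothetical and should point at wherever the SQuAD files live in your Drive.
from google.colab import drive

drive.mount('/content/drive')
DATA_DIR = '/content/drive/MyDrive/Custom-BERT/data'  # hypothetical path
```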

Statistics

  • Initial model (figure: InitialModel)

  • Trained Model 1 with number of attention heads (n_heads) = 12 (figure: Model1); a configuration sketch follows this list

  • Trained Model 2 with number of attention heads (n_heads) = 8 (figure: Model2)

  • Training loss (figure: loss)
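The two trained variants above differ only in the number of attention heads. A hedged sketch of how such variants could be configured with Hugging Face transformers (not necessarily the notebooks' exact approach) is:

```python
# Sketch (assumes Hugging Face `transformers`): BERT-base uses a hidden size of
# 768, which must stay divisible by num_attention_heads (768/12 = 64 and
# 768/8 = 96 dimensions per head).
from transformers import BertConfig, BertForQuestionAnswering

config_12 = BertConfig(num_attention_heads=12)  # Model 1 variant (BERT-base default)
config_8 = BertConfig(num_attention_heads=8)    # Model 2 variant
model_8 = BertForQuestionAnswering(config_8)    # freshly initialised with 8 heads
```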

NOTE:

  • A more detailed analysis of the implemented model and training strategy can be found in report.pdf and presentation.pptx