This is the final project to the Learner's Space 23: Neural Networks and LLM. My project involves using a pretrained LLM and finetune it to give us a model that could answer questions based on a contextual block of text.
The aim of the project is to build a question answering model that would be able to search from a block of text and find the answer for a particular question
We have chosen the lightweight "distilbert", a smaller, distilled version of the famous BERT language model.
QUAD dataset (Stanford Question Answering Dataset)
- Storing the model: There were a couple of issues where I could store the model but the same model was not able to be loaded into a new notebook. This was as I found later, because the model was defined using custom loss functions. Instead I had to just pass a dummy loss.