- Sentence Representation as a Vector (1024 dimensions)
  - The goal is to represent each sentence as a vector of the same fixed length.
  - Sentences are not cropped or padded to a fixed length, to avoid losing information from the sentence.
  - Tasks like NLI are sensitive to changes in the original sentence.
  - A custom BERT model is used to extract word embeddings, which are pooled into a sentence vector (see the sketch after this list).
  - Each sentence is represented as a fixed-length vector of dimension 1024.
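As an illustration of how a variable-length sentence can be collapsed into a fixed 1024-dimensional vector without cropping or padding, here is a minimal sketch assuming the Hugging Face transformers library and `bert-large-uncased` (hidden size 1024); the repo's custom BERT model and exact pooling strategy may differ:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")  # 1024-dim hidden states
model = BertModel.from_pretrained("bert-large-uncased")
model.eval()

def sentence_vector(sentence: str) -> torch.Tensor:
    # Encode the full sentence; no cropping or padding to a fixed token length.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool the variable-length token embeddings into one 1024-dim vector.
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

vec = sentence_vector("A man is playing a guitar.")
print(vec.shape)  # torch.Size([1024])
```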
- Model Selection: Siamese Network
  - Models Tried:
    - Simple Dense Model: fits (overfits) the training data too quickly.
    - Siamese Dense Model
      - Better at tasks that involve finding the relationship between two comparable inputs.
      - The use of shared sub-networks (weight sharing) means fewer parameters to train.
      - Helpful when data is limited, with less tendency to overfit (a sketch of such a model follows this list).
    - Siamese hybrid LSTM+CNN model
      - Training cost is much higher than for the Dense models.
      - LSTMs are well suited to sequence learning (as in text).
      - Since the embeddings are not being learned here, LSTM performance is comparable to that of the Dense model.
  - So the Siamese Dense Model is used as the main model.
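For illustration, a minimal Siamese dense model in Keras might look like the sketch below; the layer sizes, dropout rate, and absolute-difference merge are assumptions, not the repo's exact architecture. The single sigmoid output assumes a binary entailment label, consistent with the precision/recall/AUC metrics reported below.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

EMBED_DIM = 1024  # matches the fixed sentence-vector length above

# Shared encoder: both sentence vectors pass through the same dense weights,
# which is what keeps the parameter count low.
encoder = tf.keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(256, activation="relu"),
])

sent_a = layers.Input(shape=(EMBED_DIM,))
sent_b = layers.Input(shape=(EMBED_DIM,))
enc_a, enc_b = encoder(sent_a), encoder(sent_b)

# Compare the two encodings; the absolute difference |a - b| is one common merge.
diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([enc_a, enc_b])
output = layers.Dense(1, activation="sigmoid")(diff)

model = Model(inputs=[sent_a, sent_b], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```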
- Results:
  - Accuracy: 0.79
  - Precision: 0.8473
  - Recall: 0.7246
  - F1 Score: 0.7812
  - AUC-ROC: 0.89
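These are standard classification metrics; a hedged sketch of how they could be computed with scikit-learn (the labels and scores below are illustrative placeholders, not the repo's outputs):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Illustrative placeholders, not the repo's actual predictions.
y_true = [1, 0, 1, 1, 0, 1]
y_prob = [0.92, 0.18, 0.61, 0.43, 0.35, 0.77]
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold the sigmoid scores

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 Score :", f1_score(y_true, y_pred))
print("AUC-ROC  :", roc_auc_score(y_true, y_prob))
```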
To install dependencies and run training:

```bash
pip install -r requirements.txt
python train_bert_model.py
```
Next Steps:
- Improve the sentence vector representation by fine-tuning the BERT model specifically for NLI tasks (a possible direction is sketched below).
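One hedged sketch of such fine-tuning, using the sentence-transformers library's SBERT-style softmax loss over NLI sentence pairs; the model name, example pairs, labels, and hyperparameters are illustrative assumptions:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Wrap a BERT model with mean pooling; bert-large-uncased yields 1024-dim vectors.
model = SentenceTransformer("bert-large-uncased")

# Illustrative NLI-style pairs: 0 = contradiction, 1 = entailment, 2 = neutral.
train_examples = [
    InputExample(texts=["A man plays a guitar.", "A person is making music."], label=1),
    InputExample(texts=["A man plays a guitar.", "The room is silent."], label=0),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Softmax classification loss over sentence-pair embeddings, as in the SBERT
# recipe for NLI fine-tuning.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)
model.fit(train_objectives=[(train_loader, train_loss)], epochs=1)
```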