/BERT_Google_Play

Using BERT to do sentiment analysis on Google Play reviews.

Primary LanguageJupyter Notebook

BERT_Google_Play

Summary

Using the pretrained BERT base model to do sentiment analysis on Google Play reviews.

Details

About 14'000 Google Play reviews are collected using the google-play-scraper module. The reviews are then sampled to create 3 balanced categories (negative, neutral and positive sentiment) using their "star" ratings.

The model is trained for 10 epochs on a GPU (~40 mins), val and train accuracy get better but val loss increases as well from epoch 1. The model is probably making more confident "bad" predictions so the Cross Entropy loss increases, but at the same time is getting better at classifying reviews. This might be due to the small validation set size and should be increased. Final test accuracy of ~84%.

Many thanks to Venelin Valkov's great tutorials (https://github.com/curiousily/Getting-Things-Done-with-Pytorch/blob/master/08.sentiment-analysis-with-bert.ipynb)!