
Sentiment analysis of the Amazon reviews dataset using BERT: model development and deployment

Primary language: Jupyter Notebook

Automatic Review Labelling using BERT

image

   Reviews are an essential way of gauging how well a product performs. In this project, I built a model that predicts the score of a review from its text. This sentiment analysis model assigns each review a score from 1 to 5, based on the sentiment it expresses. For example, "Nice product" usually means a score of 5 and "Poor quality" usually means a score of 1.

   The model was trained on the Amazon food reviews dataset, which contains around 500,000 (5 lakh) reviews. Because the classes were imbalanced, I undersampled the larger classes to balance them. The model is BERT followed by a single linear layer, so the review text was tokenized with the BERT tokenizer before being passed to the model. The parameters of the BERT model were frozen during training to reduce the computational cost, which means only the final linear layer was trained. The test accuracy turned out to be 47.4%, well above the 20% expected from random guessing across five balanced classes.
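
The undersampling step could look roughly like the sketch below. This README does not say exactly how the classes were balanced, so the approach shown here (randomly cutting every class down to the size of the rarest one) is an assumption, and the function name `undersample` and the toy reviews are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(examples, seed=0):
    """Balance classes by randomly dropping examples from the larger ones.

    `examples` is a list of (text, score) pairs. Every class is reduced
    to the size of the rarest class (an assumed balancing strategy).
    """
    rng = random.Random(seed)

    # Group the examples by their review score.
    by_score = defaultdict(list)
    for text, score in examples:
        by_score[score].append((text, score))
    if not by_score:
        return []

    # Keep as many examples per class as the rarest class has.
    smallest = min(len(group) for group in by_score.values())
    balanced = []
    for score in sorted(by_score):
        balanced.extend(rng.sample(by_score[score], smallest))

    # Shuffle so the classes are not grouped together during training.
    rng.shuffle(balanced)
    return balanced


# Toy data: six 5-star, three 3-star and two 1-star reviews.
reviews = (
    [("Nice product", 5)] * 6
    + [("Okay, nothing special", 3)] * 3
    + [("Poor quality", 1)] * 2
)
balanced = undersample(reviews)
counts = Counter(score for _, score in balanced)
```

With balanced classes, a model that guesses uniformly at random is right about one time in five, which is where the 20% baseline comes from.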

Website preview

image