NLP Amazon review sentiment prediction

Processed more than 100k public Amazon reviews to predict binary sentiment ('positive' or 'negative') from the review text.

Description

  1. preprocessed text (tokenize, lowercase, lemmatize, remove stopwords)
  2. trained a model with NLTK, sklearn TF-IDF features, and MultinomialNB (accuracy 0.88)
  3. saved the optimized model to a .pkl file
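
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training script: the six-review corpus is made up, lemmatization with NLTK is left out so the example runs without corpus downloads, and lowercasing and stopword removal are delegated to `TfidfVectorizer` instead. The file name `model.pkl` is also an assumption.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus for illustration only; the real project trains on
# Kindle Store reviews.
texts = [
    "I loved this book, wonderful story",
    "great read and a wonderful ending",
    "excellent characters, loved every chapter",
    "terrible plot, boring and a waste of money",
    "awful writing, I hated this boring book",
    "waste of time, terrible ending",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

# TF-IDF features feeding a multinomial Naive Bayes classifier.
# lowercase/stop_words stand in for the NLTK preprocessing step (assumption).
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
model.fit(texts, labels)

# Save the fitted pipeline to a .pkl file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")  # hypothetical file name
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

prediction = restored.predict(["a wonderful, excellent story"])[0]
print(prediction)
```

Pickling the whole pipeline (vectorizer plus classifier) keeps the vocabulary and the model together, so new text can be passed to `predict` as raw strings.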

Data

Amazon product reviews, Kindle Store 5-core data
This project uses the Kindle Store subset of the Amazon product data. The full dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014. It includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
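
As a rough sketch of how the review file could be turned into labeled examples: the 5-core files are distributed as one JSON object per line, with the review text in `reviewText` and the star rating in `overall`. How this project maps ratings to labels is not stated here, so the thresholds below (4-5 stars positive, 1-2 stars negative, 3 stars dropped) are an assumption, and `iter_labeled_reviews` is a hypothetical helper name.

```python
import json


def iter_labeled_reviews(lines, positive_min=4.0, negative_max=2.0):
    """Yield (text, label) pairs from JSON-lines review records.

    The rating thresholds are assumptions for illustration, not the
    project's documented labeling rule.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        text = record.get("reviewText", "")
        rating = record.get("overall")
        if not text or rating is None:
            continue  # skip records without text or rating
        if rating >= positive_min:
            yield text, "positive"
        elif rating <= negative_max:
            yield text, "negative"
        # ratings in between (3 stars by default) are treated as neutral and dropped


sample_lines = [
    '{"reviewText": "Great read", "overall": 5.0}',
    '{"reviewText": "Meh", "overall": 3.0}',
    '{"reviewText": "Awful", "overall": 1.0}',
    "",
]
labeled = list(iter_labeled_reviews(sample_lines))
print(labeled)
```

For a gzipped download, the same function can be fed a file opened with `gzip.open(path, "rt", encoding="utf-8")`, where `path` points to the local copy of the data.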

How to run the dashboard

cd src
python runserver.py

Open a browser and visit http://127.0.0.1:5000/
Type input text to get a sentiment prediction

TODO

use Doc2Vec + keras.LSTM
interactive Flask app to predict sentiment for new reviews with the saved model
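
For the Flask item, one possible shape is sketched below. It is not the code in `runserver.py`: the `/predict` route, the `create_app` and `load_model` names, the JSON request format, and the `model.pkl` path are all assumptions made for illustration. The app accepts any object with a scikit-learn style `predict` method.

```python
import pickle

from flask import Flask, jsonify, request


def load_model(path):
    """Load a pickled model from disk (path is supplied by the caller)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def create_app(model):
    """Build a Flask app around any object exposing predict(list_of_texts)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical route
    def predict():
        payload = request.get_json(silent=True)
        text = payload.get("text") if isinstance(payload, dict) else None
        if not isinstance(text, str) or not text.strip():
            return jsonify(error="field 'text' is required"), 400
        label = model.predict([text])[0]
        return jsonify(text=text, sentiment=str(label))

    return app


# To serve locally (hypothetical model path):
#   app = create_app(load_model("model.pkl"))
#   app.run(host="127.0.0.1", port=5000)
```

Passing the model into `create_app` keeps the route easy to test with a stub and avoids loading the pickle at import time.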