AmazonFoodReviewsCaseStudy

Case Study for Classification of Amazon Food Reviews

Primary Language: Jupyter Notebook

This repository contains a complete case study on classifying reviews from the Amazon Fine Food Reviews dataset. The dataset can be downloaded from Kaggle (https://www.kaggle.com/snap/amazon-fine-food-reviews). It consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain-text review. The dataset also includes reviews from all other Amazon categories. Data includes:

  • Reviews from Oct 1999 - Oct 2012
  • 568,454 reviews
  • 256,059 users
  • 74,258 products
  • 260 users with > 50 reviews.

This case study compares the results of the algorithms listed below on this dataset.

  • GBDT
  • Random Forest
  • Logistic Regression
  • Naive Bayes
  • Hierarchical Clustering
  • DBSCAN
  • KMeans
  • KNN
  • Decision Tree
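As a rough illustration of what such a comparison can look like, the sketch below fits the supervised classifiers from the list on a handful of invented toy reviews and reports test accuracy for each. It is not taken from the notebooks in this repository: the toy data, the bag-of-words features (`CountVectorizer`), the hyperparameters, and the helper name `compare_classifiers` are all assumptions made for the example. The clustering algorithms (Hierarchical Clustering, DBSCAN, KMeans) are unsupervised and are left out here.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in reviews, invented for illustration (NOT from the real dataset).
reviews = [
    "great taste and fast delivery", "delicious coffee, will buy again",
    "my dog loves these treats", "excellent quality tea",
    "tasty snack, great value", "best chocolate I have tried",
    "stale and tasteless", "arrived broken and expired",
    "terrible flavor, waste of money", "too salty, would not buy again",
    "bad smell and awful taste", "the worst coffee ever",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = positive, 0 = negative


def compare_classifiers(texts, labels, test_size=0.25, seed=0):
    """Fit several classifiers on bag-of-words features; return test accuracies."""
    train_txt, test_txt, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # Fit the vocabulary on the training split only to avoid leaking test data.
    vectorizer = CountVectorizer()
    X_train = vectorizer.fit_transform(train_txt)
    X_test = vectorizer.transform(test_txt)

    # Hyperparameters here are placeholders, not tuned values.
    models = {
        "GBDT": GradientBoostingClassifier(n_estimators=20, random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=20, random_state=seed),
        "Logistic Regression": LogisticRegression(),
        "Naive Bayes": MultinomialNB(),
        "KNN": KNeighborsClassifier(n_neighbors=3),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


scores = compare_classifiers(reviews, labels)
for name, acc in scores.items():
    print("{:<20} accuracy = {:.2f}".format(name, acc))
```

With only twelve toy reviews the accuracies are meaningless; the point is the shape of the comparison loop. On the real dataset, the texts and labels would come from the downloaded reviews instead.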

To run the project, follow the steps below:

  • Install the following:
    • Anaconda 5.1.*
    • Python 3.6.*
    • scikit-learn
    • numpy
    • matplotlib
  • Download the dataset from the URL given above and save it in the same directory as the project files.
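Before opening the notebooks, the setup steps above can be sanity-checked with a small script like the following. This is only a sketch under assumptions: the data file name `Reviews.csv` is a guess at what the Kaggle download contains, the version test accepts any Python from 3.6 upward rather than exactly 3.6.*, and `check_environment` is a hypothetical helper, not part of this repository.

```python
import importlib.util
import sys
from pathlib import Path

# Package name as listed in the README -> module name used for import.
REQUIRED_PACKAGES = {
    "scikit-learn": "sklearn",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
}
DATA_FILE = "Reviews.csv"  # assumed file name; adjust to match your download


def check_environment(data_dir="."):
    """Return a dict mapping each requirement to True (met) or False (missing)."""
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    for package, module in REQUIRED_PACKAGES.items():
        # find_spec returns None when a top-level module cannot be located.
        report[package] = importlib.util.find_spec(module) is not None
    report[DATA_FILE] = (Path(data_dir) / DATA_FILE).is_file()
    return report


report = check_environment()
for requirement, ok in report.items():
    print("{:<15} {}".format(requirement, "OK" if ok else "MISSING"))
```

Run it from the project directory; any line marked `MISSING` points to an install or download step that still needs to be completed.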