Case Study for Classification of Amazon Food Reviews
This repository contains a complete case study on classifying the Amazon Fine Food Reviews dataset. The dataset can be downloaded from Kaggle (https://www.kaggle.com/snap/amazon-fine-food-reviews). It consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain-text review. It also includes reviews from all other Amazon categories. The data includes:
- Reviews from Oct 1999 - Oct 2012
- 568,454 reviews
- 256,059 users
- 74,258 products
- 260 users with > 50 reviews.
This case study compares the results of the algorithms listed below on this dataset:
- GBDT
- Random Forest
- Logistic Regression
- Naive Bayes
- Hierarchical Clustering
- DBSCAN
- KMeans
- KNN
- Decision Tree
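As a rough illustration of how such a comparison could be set up, the sketch below fits several of the supervised models from the list on TF-IDF features and reports accuracy for each. It is not the repository's actual code: the `compare_classifiers` helper, the TF-IDF featurization, the hyperparameters, and the toy reviews are all assumptions made for the example. The clustering algorithms (Hierarchical Clustering, DBSCAN, KMeans) are left out because they need a different evaluation setup.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(train_texts, train_labels, test_texts, test_labels):
    """Fit each model on TF-IDF features and return {model name: accuracy}.

    Hypothetical helper for illustration; hyperparameters are arbitrary.
    """
    models = {
        "GBDT": GradientBoostingClassifier(n_estimators=20, random_state=0),
        "Random Forest": RandomForestClassifier(n_estimators=20, random_state=0),
        "Logistic Regression": LogisticRegression(),
        "Naive Bayes": MultinomialNB(),
        "KNN": KNeighborsClassifier(n_neighbors=3),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }
    results = {}
    for name, model in models.items():
        # Each model gets its own vectorizer so the pipelines stay independent.
        pipeline = make_pipeline(TfidfVectorizer(), model)
        pipeline.fit(train_texts, train_labels)
        predictions = pipeline.predict(test_texts)
        results[name] = accuracy_score(test_labels, predictions)
    return results


# Toy data (made up for this example): 1 = positive review, 0 = negative.
toy_train_texts = [
    "great taste and fast delivery", "delicious, will buy again",
    "my dog loves these treats", "excellent coffee, very fresh",
    "stale and tasteless", "arrived broken and smelled bad",
    "terrible flavor, waste of money", "too salty and very disappointing",
]
toy_train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_test_texts = ["fresh and delicious", "bad smell, terrible taste"]
toy_test_labels = [1, 0]

for name, accuracy in compare_classifiers(
    toy_train_texts, toy_train_labels, toy_test_texts, toy_test_labels
).items():
    print(f"{name}: accuracy = {accuracy:.2f}")
```

On the real dataset, the toy lists would be replaced by a train/test split of the review text and labels, and accuracy could be swapped for a metric better suited to imbalanced classes.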
To run the project, follow the steps below:
- Install the following:
  - Anaconda 5.1.*
  - Python 3.6.*
  - scikit-learn
  - numpy
  - matplotlib
- Download the dataset from the Kaggle URL given above and save it in the same directory as the project files.
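Once the dataset is in place, loading it might look like the sketch below. This is an assumption-laden example rather than the project's own loader: it assumes the download contains a file named `Reviews.csv` with `Score` and `Text` columns, that pandas is available (it ships with Anaconda), and that reviews are labeled positive for ratings above 3 and negative for ratings below 3, with 3-star reviews dropped. The `load_reviews` function name and the labeling rule are hypothetical.

```python
import pandas as pd


def load_reviews(csv_path="Reviews.csv"):
    """Load reviews and derive a binary label from the star rating.

    Assumed labeling rule (not stated in this README):
    Score > 3 -> 1 (positive), Score < 3 -> 0 (negative), Score == 3 dropped.
    """
    reviews = pd.read_csv(csv_path)
    # Drop neutral 3-star reviews so the task is a clean two-class problem.
    reviews = reviews[reviews["Score"] != 3].copy()
    reviews["label"] = (reviews["Score"] > 3).astype(int)
    return reviews[["Text", "label"]].reset_index(drop=True)
```

Calling `load_reviews()` with no arguments looks for `Reviews.csv` in the current working directory, which matches saving the download next to the project files.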