A topic modeling (unsupervised learning) mini-project
This project models and visualizes topics in a corpus of over 300,000 Amazon product reviews. The reviews come from the food department, were written between 2002 and 2012, and have had duplicates removed. You can view the original dataset on Kaggle.
The key packages used in this project are scikit-learn and Yellowbrick, along with the usual pandas, NumPy, and Matplotlib.