This project uses natural language processing (NLP) to predict whether Yelp customer reviews are positive or negative based on keywords. I built a support vector machine (SVM) classifier to make these predictions. This type of analysis is used to determine the general tone of customer reviews and to better understand consumer preferences for a brand, product, or service.
- Python
- Jupyter Notebook
- yelp_ratings.csv file
All the data for this project came from Kaggle. The dataset contains 44,530 Yelp reviews, each accompanied by a star rating (1 to 5). However, I used only a subset of 5,000 reviews to reduce the time required to run the model.
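The README does not say how the 5,000 reviews were selected, so the following is only a minimal sketch of one way to draw a reproducible subset, assuming random sampling without replacement. The column names `text` and `stars`, the helper `sample_reviews`, and the inline sample rows are all hypothetical stand-ins for the real yelp_ratings.csv file.

```python
import csv
import io
import random

# Stand-in for yelp_ratings.csv; the column names "text" and "stars" are assumptions.
SAMPLE_CSV = """text,stars
"Great tacos, friendly staff",5
"Cold food and slow service",1
"Decent coffee",4
"Would not come back",2
"Best brunch in town",5
"Overpriced and bland",2
"""


def sample_reviews(csv_file, k, seed=0):
    """Return k rows drawn at random, without replacement, from a CSV file object."""
    rows = list(csv.DictReader(csv_file))
    rng = random.Random(seed)  # fixed seed so the same subset is drawn on every run
    return rng.sample(rows, k)


# With the real file this might be: sample_reviews(open("yelp_ratings.csv"), k=5000)
subset = sample_reviews(io.StringIO(SAMPLE_CSV), k=3)
print(len(subset))
```

Fixing the seed keeps the subset stable between runs, which makes a reported accuracy easier to reproduce.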
My model correctly classified 95.2% of the reviews.
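To illustrate what "correctly classified" measures, here is a small pure-Python sketch of a keyword-based linear SVM and an accuracy calculation. It is not the code behind the 95.2% figure: the toy reviews are invented, the helper names (`featurize`, `train_linear_svm`, `predict`, `accuracy`) are hypothetical, and the training loop (stochastic subgradient descent on the hinge loss, with no intercept term) is an assumption chosen for brevity rather than the method a library SVM would use.

```python
import random

# Toy labelled reviews (1 = positive, 0 = negative), invented for illustration.
TRAIN = [
    ("great food and friendly staff", 1),
    ("amazing service great prices", 1),
    ("friendly place amazing pizza", 1),
    ("terrible food and rude staff", 0),
    ("awful service terrible prices", 0),
    ("rude place awful pizza", 0),
]
TEST = [
    ("great tacos", 1),
    ("friendly bartender", 1),
    ("awful parking", 0),
    ("rude waiter", 0),
]


def featurize(text):
    """Bag of words: map each lowercase token to its count."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def train_linear_svm(data, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Fit keyword weights by subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    weights = {}
    # SVMs use labels in {-1, +1}, so map the 0/1 labels first.
    examples = [(featurize(text), 1 if label == 1 else -1) for text, label in data]
    for _ in range(epochs):
        rng.shuffle(examples)
        for feats, y in examples:
            margin = y * sum(weights.get(f, 0.0) * v for f, v in feats.items())
            for f in weights:  # L2 shrinkage on every step
                weights[f] *= 1 - lr * reg
            if margin < 1:  # inside the margin: take a hinge-loss step
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * y * v
    return weights


def predict(weights, text):
    """Return 1 (positive) if the weighted keyword score is above zero, else 0."""
    score = sum(weights.get(f, 0.0) * v for f, v in featurize(text).items())
    return 1 if score > 0 else 0


def accuracy(weights, data):
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(predict(weights, text) == label for text, label in data)
    return correct / len(data)


weights = train_linear_svm(TRAIN)
print(f"accuracy: {accuracy(weights, TEST):.1%}")
```

Accuracy here is simply the number of correct predictions divided by the number of reviews scored, which is the same ratio the 95.2% figure expresses.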
In conclusion, my model successfully predicted whether most reviews were positive or negative. However, the model's accuracy may vary slightly if a larger subset of reviews is drawn from the full dataset.
- Xavier Lim - LinkedIn | Portfolio Website | Tableau Public