Authors: Michael Zhou (mgz27), Minqian Hou (mh2256), Jiayue Pan (jp987), Tong Sha (ts646)
We used the Wine Quality Dataset from the UCI Machine Learning Repository to predict the quality of red and white wines from their physicochemical attributes. We tried the following methods: linear and logistic regression, random forests, XGBoost, SVMs, Naive Bayes, and ridge and lasso regression. Our results show that SVMs outperform nearly every other model in test accuracy, and that test accuracies for red wines are consistently higher than those for white wines. However, the accuracies themselves are not especially high, which we attribute to the overall lack of correlation between predictors.
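As a rough illustration of the correlation point in the summary, the sketch below computes pairwise Pearson correlations between columns using only the Python standard library. It is not the code used in the project: the helper names `pearson` and `correlation_table` are hypothetical, and the `toy` values are made up for demonstration and are not taken from the dataset. Only the column names (`alcohol`, `pH`, `quality`) come from the UCI Wine Quality Dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_table(columns):
    """Pairwise correlations for a dict mapping column name -> list of values."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up toy values for illustration only; NOT taken from the dataset.
toy = {
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0],
    "pH": [3.51, 3.20, 3.26, 3.16, 3.30],
    "quality": [5, 5, 6, 6, 7],
}

for pair, r in correlation_table(toy).items():
    print(pair, round(r, 3))
```

Run on the real red and white wine tables, a function like this would show how strongly each attribute tracks the others and the `quality` score. Values near zero across the board would be consistent with the modest accuracies reported above.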
Final Project for STSCI 4740 - Data Mining & Machine Learning
Project Report: STSCI 4740 Final Project Report.pdf