Yummly

Machine Learning Project

Exploring Recipe Data

### lr.py - Word lemmatization version 1, better for logistic regression

Cleans up the ingredient words and TF-IDF vectorizes the ingredients from the corpus.
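As a rough illustration of the cleaning and TF-IDF step, here is a minimal standard-library sketch. It is not the code in `lr.py`: the helper names `clean` and `tfidf`, the letters-only cleaning rule, and the sample recipes are all assumptions made for the example, and the weighting is the plain `tf * log(N / df)` form rather than any library's smoothed variant.

```python
import math
import re
from collections import Counter


def clean(ingredient):
    """Lowercase an ingredient string and keep only alphabetic tokens (assumed rule)."""
    return re.sub(r"[^a-z ]", " ", ingredient.lower()).split()


def tfidf(recipes):
    """Return one {token: weight} dict per recipe using tf * log(N / df)."""
    # One bag of words per recipe, built from all of its ingredient strings.
    docs = [Counter(tok for ing in recipe for tok in clean(ing)) for recipe in recipes]
    n_docs = len(docs)
    # Document frequency: iterating a Counter yields each distinct token once.
    df = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append(
            {tok: (count / total) * math.log(n_docs / df[tok]) for tok, count in doc.items()}
        )
    return vectors


# Hypothetical sample data, not taken from the Yummly corpus.
recipes = [["Olive Oil", "2 Garlic cloves"], ["olive oil", "soy sauce"]]
vectors = tfidf(recipes)
```

Tokens that appear in every recipe (here `olive` and `oil`) get a weight of zero, while tokens unique to one recipe carry the signal a classifier can use.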

benchmark 50-50 split:

### nb2.py - Word lemmatization version 2, better for Naive Bayes (simple lowercase lemmatization)

benchmark 50-50 split:

### nb.py - No lemmatization

Includes code for k-fold testing. Do not run k-fold together with grid search; the combination is slow.
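To see why combining the two is slow, the sketch below splits sample indices into k folds and counts how many model fits a cross-validated grid search would need. It is a standard-library illustration, not the code in `nb.py`; the function names and the parameter grid (`alpha`, `fit_prior`) are hypothetical, and it assumes the grid search refits the model once per parameter combination per fold.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first n_samples % k folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def grid_search_fit_count(param_grid, k):
    """Number of fits when every parameter combination is scored on k folds."""
    combinations = 1
    for values in param_grid.values():
        combinations *= len(values)
    return combinations * k


folds = list(kfold_indices(10, 5))

# Hypothetical grid: 3 x 2 = 6 combinations, each fitted on 5 folds.
fits = grid_search_fit_count({"alpha": [0.01, 0.1, 1.0], "fit_prior": [True, False]}, 5)
```

With this small grid the search already needs 30 fits instead of 5, and the count grows multiplicatively with every parameter added.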

benchmark 50-50 split:

KevinRPan/Yummly