supersteeb/CS_ML_Prework

Thanks!

The goal of this assignment was to introduce you to two main concepts in Machine Learning: Data Pre-processing and Classification. You learned how to query and clean data using the pandas library in Python, and built a simple Machine Learning classifier based on the K-Nearest Neighbors algorithm.

Overall, things look pretty good. There are nicer ways to format the data for the grader (the raw confusion_matrix output is quite hard to read), but it looks like you just reached 90%.
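One possible way to make the confusion matrix easier to read is to wrap it in a pandas DataFrame with named rows and columns. This is only a sketch: the helper name `labeled_confusion_matrix` and the "cat"/"dog" labels are made up for illustration and are not from your notebook.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix


def labeled_confusion_matrix(y_true, y_pred, labels):
    """Return the confusion matrix as a DataFrame with named rows and columns.

    Hypothetical helper for illustration; rows are actual classes,
    columns are predicted classes.
    """
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    return pd.DataFrame(
        cm,
        index=[f"actual {label}" for label in labels],
        columns=[f"predicted {label}" for label in labels],
    )


# Made-up labels, only to show the shape of the output.
y_true = ["cat", "dog", "dog", "cat", "dog"]
y_pred = ["cat", "dog", "cat", "cat", "dog"]

table = labeled_confusion_matrix(y_true, y_pred, labels=["cat", "dog"])
print(table)
```

Passing `labels` explicitly fixes the row and column order, so the grader does not have to guess which class is which.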

In the future, you should try some of the optional requirements! For example, does balancing the dataset help improve results?
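If you want to try the balancing question, one simple approach is random downsampling with pandas. The sketch below assumes a DataFrame with a label column; the function name `downsample_to_minority`, the column names, and the toy data are all hypothetical. Downsampling is just one option (upsampling or class weights are others).

```python
import pandas as pd


def downsample_to_minority(df, label_col, random_state=0):
    """Randomly drop rows so every class has as many rows as the smallest class.

    Hypothetical helper for illustration only.
    """
    smallest = df[label_col].value_counts().min()
    parts = [
        group.sample(n=smallest, random_state=random_state)
        for _, group in df.groupby(label_col)
    ]
    return pd.concat(parts)


# Toy, made-up data: six rows of class 0 and two rows of class 1.
df = pd.DataFrame({
    "feature": [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 3.0, 3.2],
    "label": [0, 0, 0, 0, 0, 0, 1, 1],
})

balanced = downsample_to_minority(df, "label")
print(balanced["label"].value_counts())
```

It is usually safer to balance only the training split, so the test set still reflects the real class distribution. You can then compare the confusion matrices with and without balancing.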

Assignment 2

The goal of this assignment was to introduce you to three new classification techniques and to understand how to select the best parameters and features for them. You learned how to use scikit-learn utilities (GridSearchCV, SelectKBest, RFE, SelectFromModel) to try out new models (Support Vector Machines, Random Forests, and Logistic Regression), test different combinations of parameter values and features, and analyze your results to help build better machine learning models.

Good job!

Here's what you did well:

  • Completed all the required User Stories in a comprehensive and structured manner.
  • Great discussion and analysis about your results.
  • You are understanding the concepts behind the theory and are able to apply them clearly. This will be paramount when developing your final project.

Here are some things to keep an eye on:

  • Always use your best classifier (the one with the tuned parameters) when trying SelectKBest, SelectFromModel, etc., i.e. in every optimization method you try. That gives a fairer comparison between the confusion matrices of your classifiers; otherwise your "optimized" classifier is likely to perform worse than both the original and the best one.
  • Logistic Regression: you used the wrong 'best' params (the best multi_class value was output as ovr, but you used multinomial). Also, you stated that your accuracy went down, but it did not.
  • I realize this assignment took you a lot of time, but I would still highly encourage you to try at least one of the bonus stories on the next assignment.
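To illustrate the first two points, here is a rough sketch of carrying the tuned parameters forward instead of retyping them. It assumes the built-in iris dataset, a small grid over `C`, and `k=2` features, none of which come from your submission; substitute your own data, model, and grid.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline

# Assumption: iris stands in for the assignment dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 1: tune the classifier. The grid here is illustrative only.
search = GridSearchCV(
    LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=5
)
search.fit(X_train, y_train)

# Step 2: read the winning parameters from the search object itself,
# so they cannot be mistyped when building later models.
best_params = search.best_params_
tuned = LogisticRegression(max_iter=1000, **best_params)

# Step 3: run feature selection with the same tuned classifier.
with_selection = make_pipeline(SelectKBest(f_classif, k=2), clone(tuned))
with_selection.fit(X_train, y_train)

# Step 4: compare confusion matrices of like-for-like classifiers.
cm_all_features = confusion_matrix(y_test, search.best_estimator_.predict(X_test))
cm_selected = confusion_matrix(y_test, with_selection.predict(X_test))
print(best_params)
print(cm_all_features)
print(cm_selected)
```

Because both models share the same tuned parameters, any difference between the two confusion matrices can be attributed to the feature selection step rather than to a weaker classifier.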

Overall, solid work. Keep it up!