
General Assembly Data Science

Primary Language: Python

GADataScience

Repo for 2013 General Assembly Data Science class.

The class covers:

Hw1 - Linear Regression and Ridge Regression. Added: Stepwise Regression for feature selection.

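Ridge regression adds an L2 penalty on the coefficients. This stabilizes the estimates when predictor variables are multicollinear, where ordinary least squares can produce large, unstable coefficients.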
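As a rough illustration of the multicollinearity point, the sketch below fits ordinary least squares and ridge regression on two nearly identical features. It is not taken from the homework: the function name `ridge_fit`, the toy data, and the penalty `alpha = 1.0` are assumptions chosen for illustration, and the model has no intercept to keep the code short.

```python
def ridge_fit(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y with Gauss-Jordan elimination.

    Hypothetical helper for illustration; alpha = 0 gives ordinary least squares.
    """
    n = len(X[0])
    # Build the augmented matrix [X^T X + alpha * I | X^T y].
    aug = []
    for i in range(n):
        row = [sum(r[i] * r[j] for r in X) + (alpha if i == j else 0.0)
               for j in range(n)]
        row.append(sum(r[i] * yi for r, yi in zip(X, y)))
        aug.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][-1] / aug[i][i] for i in range(n)]


# Toy data (assumed): x2 is almost a copy of x1, so the features are collinear.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.99, 3.02, 3.98, 5.01]
X = [[a, b] for a, b in zip(x1, x2)]
# y is roughly 2 * x1, but the exact least-squares fit is -8 * x1 + 10 * x2.
y = [2.1, 3.9, 6.2, 7.8, 10.1]

w_ols = ridge_fit(X, y, alpha=0.0)    # large coefficients of opposite sign
w_ridge = ridge_fit(X, y, alpha=1.0)  # both coefficients shrink to about 1

print("OLS:  ", w_ols)
print("Ridge:", w_ridge)
```

With the penalty switched on, the two correlated features share the effect almost equally instead of cancelling each other out.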

Hw2 - K-nearest neighbors (KNN) and N-folds Cross Validation (CV)

KNN is a classification algorithm that assigns an unseen example to the group most common among its k nearest training examples. N-folds CV is a method for validating your model: the data is split into N folds, and each fold is held out once as the test set while a fresh model trains on the remaining folds. Averaging the N scores gives a more robust estimate of how well the model generalizes than a single train/test split.
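The two ideas can be sketched together in a few lines of plain Python. This is a minimal illustration, not the homework solution: `knn_predict`, `cross_val_accuracy`, the toy points, and the choices `k = 3` and `n_folds = 5` are all assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k):
    """Return the majority label among the k training points nearest to `point`."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds):
    """Average KNN accuracy over n_folds held-out folds."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th example, offset by `fold`, is held out for testing.
        test_idx = set(range(fold, len(X), n_folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [label for i, label in enumerate(y) if i not in test_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                      for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / n_folds


# Toy data (assumed): two well-separated clusters labeled "a" and "b".
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
y = ["a"] * 5 + ["b"] * 5

accuracy = cross_val_accuracy(X, y, k=3, n_folds=5)
print("Mean CV accuracy:", accuracy)
```

Each example is used for testing exactly once, and the model never sees a held-out point during that fold's training step.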

Hw4 - Logistic Regression

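A classification algorithm that models class probabilities and learns a linear decision boundary, so it works best when the classes are (approximately) linearly separable.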
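A minimal sketch of logistic regression trained with batch gradient descent is shown here to make the linear decision boundary concrete. The names `fit_logistic` and `predict`, the learning rate, the epoch count, and the one-dimensional toy data are assumptions for illustration, not the class's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, xi):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


# Toy data (assumed): one centered feature, class 1 for positive values.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(X, y)
preds = [predict(w, b, xi) for xi in X]
print("weights:", w, "bias:", b)
print("predictions:", preds)
```

The decision boundary is the point where `w * x + b = 0`; for data like this, which a single threshold separates, the fitted model classifies every training example correctly.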