This repo collects the assignments I completed for an applied machine learning course at McGill University. Each assignment is described below:
- Linear regression implementation by hand (using only Numpy)
- L2 Regularization
- Stochastic Gradient Descent Implementation by hand
- Data imputation, missing features
- Linear regression on communities and crime dataset
- Gaussian Discriminant Analysis by hand (using only Numpy)
  - Shared covariance
  - Non-shared covariance
- K-nearest neighbors by hand (using only Numpy)
- Comparison of decision boundary and performance for all classifiers
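
As a rough illustration of two of the by-hand items above, here is a minimal NumPy-only sketch of closed-form linear regression with an L2 penalty and a k-nearest-neighbors classifier. It is not the code from the assignments: the function names (`ridge_fit`, `ridge_predict`, `knn_predict`), the choice to leave the intercept unpenalized, and the Euclidean distance metric are all assumptions made for this example.

```python
import numpy as np


def ridge_fit(X, y, lam=0.0):
    """Closed-form linear regression with an L2 penalty of strength lam."""
    # Prepend a column of ones so the first weight acts as the intercept.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # assumption: the intercept is not regularized
    # Solve (Xb^T Xb + penalty) w = Xb^T y instead of forming an inverse.
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def ridge_predict(X, w):
    """Apply weights returned by ridge_fit (w[0] is the intercept)."""
    return w[0] + X @ w[1:]


def knn_predict(X_train, y_train, X_query, k=3):
    """Majority vote among the k nearest training points.

    Assumes y_train holds non-negative integer class labels.
    """
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest points
    votes = y_train[nearest]                 # labels of those neighbors
    return np.array([np.bincount(row).argmax() for row in votes])
```

With `lam=0.0` the first function reduces to ordinary least squares; larger values of `lam` shrink the non-intercept weights toward zero.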
Preprocessed the data into frequency bag-of-words and binary bag-of-words representations, then compared the performance of each on the classifiers listed below.
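
The difference between the two representations can be sketched as follows. This is only an illustration, not the assignment code: the helper names (`build_vocab`, `bag_of_words`), the whitespace tokenizer, the `max_words` cutoff, and the definition of "frequency" as counts divided by the number of in-vocabulary words in each document are assumptions.

```python
from collections import Counter

import numpy as np


def build_vocab(docs, max_words=10000):
    """Map the most common words to column indices (hypothetical helper)."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(max_words))}


def bag_of_words(docs, vocab, binary=False):
    """Return a (n_docs, n_vocab) matrix of binary or frequency features."""
    X = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(docs):
        for word in doc.lower().split():
            col = vocab.get(word)
            if col is not None:  # words outside the vocabulary are ignored
                X[row, col] += 1
    if binary:
        # Binary: 1 if the word appears in the document at all, else 0.
        return (X > 0).astype(float)
    # Frequency (assumed definition): counts normalized so each row sums to 1.
    totals = X.sum(axis=1, keepdims=True)
    return X / np.maximum(totals, 1)


docs = ["good good bad", "bad"]
vocab = build_vocab(docs)
X_freq = bag_of_words(docs, vocab)
X_bin = bag_of_words(docs, vocab, binary=True)
```

For the first toy document, the frequency features are 2/3 for "good" and 1/3 for "bad", while the binary features are 1 for both.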
Classifiers used, each cross-validated to find its best hyperparameters:
- SVM (linear)
- Decision Trees
- Gaussian Naive Bayes
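
One way to cross-validate these three classifiers is sketched below. The README does not state which library or hyperparameter grids were used, so scikit-learn, the grids, and the 5-fold setting are assumptions, and a synthetic dataset stands in for the bag-of-words features.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a dense bag-of-words feature matrix and labels.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Each entry pairs a classifier with an illustrative (assumed) grid.
candidates = {
    "linear_svm": (LinearSVC(max_iter=10000), {"C": [0.01, 0.1, 1.0]}),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, 10]},
    ),
    "gaussian_nb": (GaussianNB(), {"var_smoothing": [1e-9, 1e-6, 1e-3]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # 5-fold cross-validation over every value in the grid.
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)

for name, (params, score) in results.items():
    print(f"{name}: best params {params}, mean CV accuracy {score:.3f}")
```

Note that `GaussianNB` expects a dense array, so a sparse bag-of-words matrix would need to be converted (for example with `.toarray()`) before fitting.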
Natural Language Processing on:
- YELP Dataset:
  - 10000 reviews
  - 5-class problem (ratings 1 to 5)
- IMDB Dataset:
  - 50000 reviews
  - 2-class problem (1 positive, 0 negative)