The labs are implemented in Python Jupyter notebooks and follow:
*An Introduction to Statistical Learning with Applications in R* (ISLR)
This is a great introductory book for readers interested in machine learning. It covers the most commonly used supervised and unsupervised learning methods. Prior knowledge of basic statistics and linear algebra is helpful but not required. However, as the title suggests, all applications of the concepts and models in the book are implemented in the R statistical language.
With the increasing popularity of Python, thanks to its versatility and user-friendly syntax, many newcomers learn machine learning in Python. The goal of this project is to replicate all the labs in the Python programming language. Notebooks are provided covering every step performed in the book, and I hope these guides prove helpful. The book consists of 10 chapters, and the lab exercises start in Chapter 3; this project follows ISLR in the same order.
* Notes: Some of the R functions are not directly transferable to Python. If something is omitted, it is either because it was already covered elsewhere or because it has no meaningful Python equivalent. Where steps repeat, I periodically factor them into reusable functions. If you have any suggestions or improvements, please share!
Chapter 3 - Linear Regression
- Simple linear regression
- Multiple linear regression
- Interaction Terms
- Non-linear Transformations of the Predictors
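As a taste of the tooling these notebooks rely on, here is a minimal sketch of a simple linear regression fit with scikit-learn. The data below is synthetic (the actual lab uses the Boston dataset), so the coefficients are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data: y = 3x + 2 plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=100)

# Fit and inspect the estimated slope and intercept
model = LinearRegression().fit(X, y)
slope, intercept = model.coef_[0], model.intercept_
```

The notebooks also use `statsmodels` where the lab needs R-style summary tables (standard errors, t-statistics, p-values), which scikit-learn does not report.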
Chapter 4 - Classification
- Logistic Regression
- Linear Discriminant Analysis
- Quadratic Discriminant Analysis
- KNN
- An Application to Caravan Insurance Data
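The classifiers in this chapter all share the same scikit-learn fit/score interface. A minimal sketch on synthetic two-class data (the actual labs use the Smarket and Caravan datasets):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: class determined by the sign of x0 + x1
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Three of the chapter's classifiers, fit on the same data
logit = LogisticRegression(max_iter=1000).fit(X, y)
lda = LinearDiscriminantAnalysis().fit(X, y)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

acc_logit = logit.score(X, y)
acc_lda = lda.score(X, y)
acc_knn = knn.score(X, y)
```

`QuadraticDiscriminantAnalysis` follows the same pattern; the notebooks walk through each model's probabilities and confusion matrices in detail.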
Chapter 5 - Resampling Methods
- Leave-One-Out Cross-Validation
- k-Fold Cross-Validation
- The Bootstrap
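Both LOOCV and k-fold CV map onto scikit-learn's `cross_val_score` with different `cv` splitters. A minimal sketch on synthetic data (the lab itself uses the Auto dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

# Synthetic stand-in data with small noise
rng = np.random.default_rng(2)
X = rng.uniform(size=(50, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, size=50)

model = LinearRegression()

# 5-fold CV estimate of test MSE (sklearn negates MSE by convention)
cv_mse = -cross_val_score(model, X, y,
                          cv=KFold(5, shuffle=True, random_state=0),
                          scoring="neg_mean_squared_error").mean()

# Leave-one-out CV: one fold per observation
loo_mse = -cross_val_score(model, X, y, cv=LeaveOneOut(),
                           scoring="neg_mean_squared_error").mean()
```

The bootstrap has no single built-in equivalent; the notebooks resample with `numpy` index draws (`rng.integers(0, n, n)`) and refit in a loop.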
Chapter 6 - Linear Model Selection and Regularization
- Lab 1: Subset Selection Methods
- Best Subset Selection
- Forward and Backward Stepwise Selection
- Choosing Among Models
- Lab 2: Ridge Regression and the Lasso
- Ridge Regression
- The Lasso
- Lab 3: PCR and PLS Regression
- Principal Components Regression
- Partial Least Squares
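Ridge and lasso replace R's `glmnet` with scikit-learn's `Ridge` and `Lasso`. A minimal sketch on synthetic data with sparse true coefficients (the lab uses the Hitters dataset), showing the key qualitative difference, that lasso sets irrelevant coefficients exactly to zero while ridge only shrinks them:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic stand-in data: only 3 of 10 predictors matter
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
beta = np.zeros(10)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(0, 0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Count coefficients the lasso zeroed out
n_zero = int(np.sum(np.abs(lasso.coef_) < 1e-8))
```

The penalty strengths `alpha=1.0` and `alpha=0.1` are arbitrary here; the notebooks choose them by cross-validation, as the lab does with `cv.glmnet`. PCR and PLS similarly map to `sklearn.decomposition.PCA` plus regression and `sklearn.cross_decomposition.PLSRegression`.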
Chapter 7 - Moving Beyond Linearity
- Non-linear Modeling
- Splines
- GAMs
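The simplest move beyond linearity is polynomial regression, which in Python is just a feature transform in a pipeline. A minimal sketch on a synthetic cubic signal (the lab uses the Wage dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data: a cubic signal with noise
rng = np.random.default_rng(4)
x = rng.uniform(-2, 2, size=120)
y = x**3 - x + rng.normal(0, 0.2, size=120)

# Degree-3 polynomial regression as a transform + linear fit
poly = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
poly.fit(x.reshape(-1, 1), y)
r2 = poly.score(x.reshape(-1, 1), y)
```

For splines the notebooks lean on basis-expansion helpers (e.g. `patsy`'s `bs()`/`cr()` terms) rather than a one-line R `smooth.spline` equivalent; full GAMs require a dedicated package, since scikit-learn has no direct analogue of R's `gam`.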
Chapter 8 - Tree-Based Methods
- Fitting Classification Trees
- Fitting Regression Trees
- Bagging and Random Forests
- Boosting
more to come...
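For the tree-based chapter, R's `tree` and `randomForest` map onto `sklearn.tree` and `sklearn.ensemble`. A minimal sketch on synthetic data with an interaction effect that a single linear boundary cannot capture (the labs use the Boston and Carseats datasets):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: class depends on the sign of x0 * x1
rng = np.random.default_rng(5)
X = rng.normal(size=(300, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# A single pruned tree versus a bagged ensemble of trees
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

acc_tree = tree.score(X, y)
acc_forest = forest.score(X, y)
```

Bagging is the special case `RandomForestClassifier(max_features=None)`, and boosting maps to `GradientBoostingClassifier` (or regressor) in the same module.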
| Chapter   | Datasets                 |
|-----------|--------------------------|
| Chapter 3 | Boston.csv, Carseats.csv |
| Chapter 4 | Smarket.csv, Caravan.csv |
| Chapter 5 | Auto.csv, Portfolio.csv  |
| Chapter 6 | Hitters.csv              |
| Chapter 7 | Wage.csv                 |
| Chapter 8 |                          |