For this lab, we will build a model for a customer churn binary classification problem. You will be using the `files_for_lab/Customer-Churn.csv` file.
- Apply SMOTE for upsampling the data
  - Use logistic regression to fit the model and compute the accuracy of the model.
  - Use decision tree classifier to fit the model and compute the accuracy of the model.
  - Compare the accuracies of the two models.
- Apply TomekLinks for downsampling
  - It is important to remember that TomekLinks does not make the two classes equal; it only removes the majority-class points that lie close to points in the minority class.
  - Use logistic regression to fit the model and compute the accuracy of the model.
  - Use decision tree classifier to fit the model and compute the accuracy of the model.
  - Compare the accuracies of the two models.
  - You can also apply this algorithm one more time and check how the imbalance between the two classes changed compared with the first pass.