KaggleSantander

Only for my interest

Apache License 2.0

This Kaggle competition is Santander Customer Satisfaction. Santander Bank asks Kagglers to help identify dissatisfied customers early in their relationship.

## Problems

  1. The main issue in this competition is that the distribution of TARGET is very imbalanced. Dissatisfied customers make up only 5 percent of the data. If we evaluate a model by accuracy alone, even a very bad model that predicts "satisfied" for everyone reaches 95% accuracy.
  2. The data contain a very small number of missing values. Their presence may be related to customer dissatisfaction.
  3. The training dataset has 374 variables, so some feature engineering is needed.
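
To illustrate the first problem, here is a minimal sketch on synthetic data (not the competition data). It assumes a 5 percent positive rate, as described above, and compares accuracy with a ranking metric, ROC AUC, which is used here as one reasonable alternative. The helper names `accuracy` and `roc_auc` and the toy scores are made up for this example.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random
    negative; ties count as one half (pairwise Mann-Whitney form)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


random.seed(0)

# Synthetic labels: 100 dissatisfied (1) and 1900 satisfied (0), i.e. 5% positives.
y = [1] * 100 + [0] * 1900

# A "model" that always predicts the majority class.
baseline_pred = [0] * len(y)
baseline_scores = [0.0] * len(y)
baseline_acc = accuracy(y, baseline_pred)
baseline_auc = roc_auc(y, baseline_scores)

# A weakly informative toy model: positives get slightly higher scores on average.
toy_scores = [random.gauss(1.0, 1.0) if t == 1 else random.gauss(0.0, 1.0) for t in y]
toy_auc = roc_auc(y, toy_scores)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"baseline ROC AUC:  {baseline_auc:.2f}")
print(f"toy model ROC AUC: {toy_auc:.2f}")
```

The always-"satisfied" baseline looks strong on accuracy but sits at chance level on ROC AUC, while a model with even a weak signal scores above chance. This is why accuracy alone is a poor guide for this kind of imbalanced TARGET.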