
A data science churn prediction project with a highly imbalanced target variable.

Primary language: Jupyter Notebook



You will be given a modelling data set on which to train your model, and a testing data set on which to report your performance. The expected output is a precision-recall curve.
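Since the expected output is a precision-recall curve, here is a minimal, dependency-free sketch of how such a curve can be computed from predicted churn scores. The function names and the toy labels and scores are hypothetical and are not taken from the assignment data; in a notebook, scikit-learn's `precision_recall_curve` would likely be the more usual choice.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) tuples, one per distinct score.

    A row is predicted positive (churn) when its score is >= threshold.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank rows from most to least likely to churn.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last row of a tied score group.
        is_last_of_group = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_group:
            points.append((score, tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points):
    """Step-wise area under the precision-recall curve."""
    area, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Hypothetical toy data: 2 churners among 10 customers (illustrative only).
y_true = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.10, 0.30, 0.90, 0.20, 0.05, 0.60, 0.15, 0.40, 0.25, 0.35]

points = precision_recall_points(y_true, scores)
for threshold, precision, recall in points:
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
print(f"average precision = {average_precision(points):.3f}")
```

With a rare positive class, accuracy can look high even for a model that never predicts churn, which is one reason a precision-recall curve is a reasonable way to report performance here.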

Unlike other modelling challenges, we are looking for a skill set beyond pure outputs. Share your work in whatever way you deem fit: Jupyter Notebook, GitHub repository, etc.

You will be evaluated on your ability to explain and model, on your creativity, and on your ability to write clear and clean code.

Assignment Link - https://docs.google.com/document/d/1_uBj1svuciHwQ_pe9lleR6nKhXCph5osv0pqEzClc1Y/edit


  1. https://datascience.stackexchange.com/questions/32818/train-test-split-of-unbalanced-dataset-classification
  2. https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
  3. https://medium.com/swlh/building-an-artificial-neural-network-in-less-than-10-minutes-cbe59dbb903c
  4. https://www.kaggle.com/janiobachmann/credit-fraud-dealing-with-imbalanced-datasets
  5. https://www.kaggle.com/joshwilkins2013/churn-baby-churn-user-logs