This is a data science churn prediction project with a highly imbalanced target variable.
A modelling data set will be shared, on which you are to train your model, along with a testing data set, on which you are to show your performance output as shown below: essentially a precision-recall curve.
Unlike other modelling challenges, we are looking for a skill set beyond pure outputs. Share your work in whatever way you deem best: Jupyter Notebook, GitHub repository, etc.
You will be evaluated on your ability to explain your work, your modelling, your creativity, and your ability to write clear and clean code.
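As a rough illustration of the workflow described above (train on one set, report a precision-recall curve on a held-out set, with a heavily imbalanced target), here is a minimal sketch using scikit-learn. It is not the assignment's reference solution. The data is synthetic (`make_classification` with an assumed churn rate of about 5%), and the model choice, the stratified split, and `class_weight="balanced"` are illustrative assumptions. The helper `plot_pr_curve` is a hypothetical name introduced here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption: roughly 5% churners).
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=0,
)

# Stratify so the rare class keeps the same share in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency,
# one simple alternative to resampling the training data.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

# The precision-recall curve needs scores, not hard 0/1 predictions.
scores = model.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, scores)
avg_precision = average_precision_score(y_test, scores)

# A classifier that scores at random has precision close to the churn rate.
baseline = y_test.mean()
print(f"Average precision: {avg_precision:.3f} (positive rate: {baseline:.3f})")


def plot_pr_curve(recall, precision, baseline):
    """Plot the curve; matplotlib is imported lazily and only needed here."""
    import matplotlib.pyplot as plt

    plt.step(recall, precision, where="post", label="model")
    plt.axhline(baseline, linestyle="--", label="positive rate")
    plt.xlabel("Recall")
    plt.ylabel("Precision")
    plt.legend()
    plt.show()
```

Calling `plot_pr_curve(recall, precision, baseline)` draws the curve. With an imbalanced target, the positive rate is a more useful reference line than accuracy, because a model that always predicts "no churn" would already score about 95% accuracy on data like this.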
Assignment Link - https://docs.google.com/document/d/1_uBj1svuciHwQ_pe9lleR6nKhXCph5osv0pqEzClc1Y/edit
References:
- https://datascience.stackexchange.com/questions/32818/train-test-split-of-unbalanced-dataset-classification
- https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
- https://medium.com/swlh/building-an-artificial-neural-network-in-less-than-10-minutes-cbe59dbb903c
- https://www.kaggle.com/janiobachmann/credit-fraud-dealing-with-imbalanced-datasets
- https://www.kaggle.com/joshwilkins2013/churn-baby-churn-user-logs