Analytics-Vidhya-India-ML-Hiring-Challange: A Jupyter Notebook repository from amresh1495

Approach Document -

After undesrtanding the problem statement and going through the dataset, I understood that this is a binary classification problem.
Did some basic data preprocessing like calculating for na values, missing values, checking types of variables, varibale stats, etc.
After doing some basic analysis, I found that the target variable was hugely imbalanced.
Converted the categorical values to numeric by encoding them.
Selected the best features by using Recursive Feature Engineering.
Tried different techniques like under-fitting, over-fitting, EasyEnsemble and SMOTE for balancing the target variable.
Tried different classification algorithms with best balancing technique and best selected features.
Combination of Gradient Boosting Classifier with its best params (through GridSearchCV) with SMOTE gave me the best F1-score.

I had a rank of 485 out of 3740 registered participants (Top 12 percent).