Customer-Churn-Prediction

To address the challenge of imbalanced data, a multi-step approach was taken to improve the model's predictive performance. First, class imbalance was mitigated with the Synthetic Minority Over-sampling Technique (SMOTE), which generates synthetic instances of the minority class to balance the dataset. Next, several feature selection techniques were applied to identify the attributes most relevant to model training: Random Forest feature importance, Lasso feature importance, permutation importance, and Gradient Boosting feature importance. The final feature subset was chosen by averaging the importance scores across these methods, yielding a robust and representative set of features for model training. 🎯

The second phase of the project focused on model selection. A diverse set of machine learning models was trained on the pre-processed data, spanning algorithm families such as decision trees, support vector machines, and ensemble methods. Each model was evaluated by its F1 score, a metric that balances precision and recall and therefore gives a fair assessment of classification performance on imbalanced data. The model with the highest F1 score was selected for further refinement. 📊
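This selection step can be sketched as a simple loop over candidate models, scored by F1 on a held-out split. The specific candidates and the synthetic data are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Stand-in data for the pre-processed churn dataset (hypothetical)
X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Candidates drawn from different algorithm families
candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": SVC(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Train each model and record its F1 score on the test split
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

# Pick the model with the highest F1 score for further refinement
best_name = max(scores, key=scores.get)
```

Stratifying the split preserves the class ratio in both partitions, so the F1 comparison is not skewed by an unlucky sample of the minority class.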

In the final phase, the chosen model was fine-tuned to optimize its hyperparameters and improve overall performance. This iterative process adjusted parameters to find the best balance between precision and recall. The resulting model not only addressed the initial challenge of imbalanced data but also showed markedly stronger predictive performance. The findings and methodology were consolidated into a detailed project report covering every step from data pre-processing through model selection to fine-tuning, documenting the strategies employed and their impact on building a robust, effective predictive model. 📈🔧
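One common way to run such a tuning loop is a cross-validated grid search optimizing F1; a minimal sketch, assuming a Random Forest was the top performer and using an illustrative parameter grid:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in data for the pre-processed churn dataset (hypothetical)
X, y = make_classification(n_samples=500, n_features=8,
                           weights=[0.8, 0.2], random_state=0)

# Small illustrative grid; a real search would cover more values
param_grid = {"n_estimators": [100, 200], "max_depth": [None, 10]}

# Cross-validated search scored by F1, matching the project's metric
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="f1", cv=3)
search.fit(X, y)

best_model = search.best_estimator_  # refit on all data with best params
```

Scoring the search with `"f1"` rather than accuracy keeps the tuning objective aligned with the precision/recall balance the project targets.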