Project Title: Bank Marketing Effectiveness Prediction

Description:

This project builds machine learning classification models to predict whether a client will subscribe to a term deposit, based on the direct marketing campaigns (phone calls) of a Portuguese banking institution. The dataset was imported and cleaned to handle missing values and outliers, then explored through extensive exploratory data analysis (EDA) to gain insights and formulate hypotheses. Feature engineering was used to transform the data, and the most important features were selected for model building. Five classification models were trained and compared using accuracy, precision, recall, F1 score, and ROC-AUC. The best-performing model was then used to predict the target variable on a completely new dataset.

Skills Utilized:

  • Data Cleaning
  • Exploratory Data Analysis (EDA)
  • Hypothesis Testing
  • Feature Engineering
  • Data Preprocessing
  • Machine Learning (Logistic Regression, Decision Tree, Random Forest, KNN, Naive Bayes)
  • Evaluation Metrics (Confusion Matrix, Accuracy, Precision, Recall, F1 Score, ROC-AUC Curve)
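As a minimal sketch of how the confusion-matrix metrics listed above relate to one another, the snippet below derives accuracy, precision, recall, and F1 score from the four confusion-matrix counts. The `ConfusionCounts` class, the `classification_metrics` helper, and the example counts are hypothetical and chosen only for illustration; they are not taken from the project's notebook or data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # subscribers correctly predicted as subscribers
    fp: int  # non-subscribers wrongly predicted as subscribers
    fn: int  # subscribers wrongly predicted as non-subscribers
    tn: int  # non-subscribers correctly predicted as non-subscribers


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall, and F1 from raw counts."""
    total = c.tp + c.fp + c.fn + c.tn
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn
    # Guard each ratio against division by zero.
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / predicted_pos if predicted_pos else 0.0
    recall = c.tp / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for an imbalanced test set (10% subscribers).
example = ConfusionCounts(tp=30, fp=20, fn=70, tn=880)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, accuracy is high even though most actual subscribers are missed, which is why accuracy alone can be misleading on an imbalanced target.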

Results:

By accuracy alone, Random Forest performed best, with 87% accuracy but only 39% precision and 33% recall. Because our goal is to identify as many true positives (actual subscribers) as possible, Logistic Regression was the better model for this purpose, with 77% accuracy, 80% precision, and 73% recall.
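The choice between the two models depends on which metric is optimized. The sketch below uses only the rounded scores reported above; the `best_model` and `f1_from` helpers are hypothetical names introduced for illustration, and the F1 values are derived from the rounded precision and recall, so they are approximate rather than figures reported by the project.

```python
# Scores as reported in the Results section (rounded percentages).
reported = {
    "Random Forest": {"accuracy": 0.87, "precision": 0.39, "recall": 0.33},
    "Logistic Regression": {"accuracy": 0.77, "precision": 0.80, "recall": 0.73},
}


def best_model(scores: dict, metric: str) -> str:
    """Return the model name with the highest value for the given metric."""
    return max(scores, key=lambda name: scores[name][metric])


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


best_by_accuracy = best_model(reported, "accuracy")
best_by_recall = best_model(reported, "recall")

# Approximate F1 scores derived from the rounded figures above.
derived_f1 = {
    name: round(f1_from(s["precision"], s["recall"]), 2)
    for name, s in reported.items()
}

print("Best by accuracy:", best_by_accuracy)
print("Best by recall:", best_by_recall)
print("Derived F1:", derived_f1)
```

Ranking by accuracy selects Random Forest, while ranking by recall (or by the derived F1) selects Logistic Regression, which matches the reasoning given for the final model choice.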

Business Insights:

The project yields several insights for the banking institution: job and education are important predictors of term deposit subscription, contacting clients on weekdays is effective, and the campaign success rate could be raised by improving call duration and frequency.

Future Scope:

The project can be extended by incorporating more data sources, applying more advanced feature engineering techniques, and exploring different machine learning algorithms to improve model performance.