bank_deposit_prediction
The data comes from the direct marketing campaigns of a Portuguese banking institution. The task is to predict whether a client will subscribe ('yes') or not ('no') to the product, a bank term deposit. This makes it a binary classification task.
The dataset comes from an in-class Kaggle competition and is provided as separate train and test sets.
Data mining algorithms applied:
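Linear Discriminant Analysis, Logistic Regression, Gaussian Naive Bayes, Decision Tree, Random Forest Classifier, KNN, Gradient Boosting Classifier
Models are evaluated by the Matthews correlation coefficient (MCC) because the dataset is imbalanced, and accuracy alone would reward always predicting the majority class.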
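To illustrate why MCC suits an imbalanced dataset, here is a minimal sketch on made-up labels (not the bank data). The 90/10 class split and both sets of predictions are assumptions chosen purely for illustration.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Toy labels (hypothetical): 90 negatives ('no') and 10 positives ('yes').
y_true = [0] * 90 + [1] * 10

# A trivial model that always predicts the majority class.
y_majority = [0] * 100

# A model with 85 true negatives, 5 false positives,
# 2 false negatives and 8 true positives.
y_model = [0] * 85 + [1] * 5 + [0] * 2 + [1] * 8

acc_majority = accuracy_score(y_true, y_majority)
mcc_majority = matthews_corrcoef(y_true, y_majority)
acc_model = accuracy_score(y_true, y_model)
mcc_model = matthews_corrcoef(y_true, y_model)

print("majority-class model: accuracy=%.2f, MCC=%.2f" % (acc_majority, mcc_majority))
print("useful model:         accuracy=%.2f, MCC=%.2f" % (acc_model, mcc_model))
```

The two models differ by only three points of accuracy (0.90 versus 0.93), while MCC separates them clearly (0.00 versus about 0.66).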
Analysis
The Gradient Boosting Classifier achieved the best MCC score, so it was used for the final submission.
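A comparison of this kind could be sketched as follows. This is not the code in data_analysis.py: the synthetic data from make_classification, the 3-fold cross-validation, and all hyperparameters are assumptions for illustration, so the ranking it prints says nothing about the results on the bank data.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the bank data: roughly 10% positive class.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)

# The seven classifiers listed above, with illustrative settings.
models = {
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Score every model by mean cross-validated MCC.
mcc_scorer = make_scorer(matthews_corrcoef)
scores = {
    name: cross_val_score(model, X, y, cv=3, scoring=mcc_scorer).mean()
    for name, model in models.items()
}

best_model = max(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print("%-30s MCC = %.3f" % (name, score))
print("Best by MCC:", best_model)
```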
Required libraries:
Python Version = 3.6
Pandas = 0.21.0
scikit-learn (sklearn)
Files:
data_analysis.py --> Analysis of Data and different Classification Algorithms
submission.py --> Classification file to create 'submission.csv'
To run the Gradient Boosting Classifier on the dataset:
python code/submission.py
Future Work
Tune the hyperparameters of the Decision Tree, Random Forest Classifier and Gradient Boosting Classifier, then combine the tuned models with a Majority Voting Classifier.
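One possible way to approach this, sketched with scikit-learn's GridSearchCV and VotingClassifier. The synthetic data, the parameter grids, and the hold-out split are all assumptions for illustration. Suitable grids for the bank data would have to be found experimentally.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the bank data: roughly 10% positive class.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

mcc_scorer = make_scorer(matthews_corrcoef)

# Hypothetical, deliberately small parameter grids.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [5, None]},
    ),
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    ),
}

# Tune each model separately, selecting parameters by cross-validated MCC.
tuned = []
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, scoring=mcc_scorer, cv=3)
    search.fit(X_train, y_train)
    tuned.append((name, search.best_estimator_))

# Hard voting: each tuned model casts one vote and the majority class wins.
ensemble = VotingClassifier(estimators=tuned, voting="hard")
ensemble.fit(X_train, y_train)

val_predictions = ensemble.predict(X_val)
val_mcc = matthews_corrcoef(y_val, val_predictions)
print("Validation MCC of the voting ensemble: %.3f" % val_mcc)
```

With three voters, hard voting never ties on a binary task, which is one reason to combine an odd number of models.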