Project Overview
For this project, you will work through the full data science process from start to finish, solving a classification problem using the provided bank marketing dataset. You will work in teams of two. Your goal is to explore the data, implement a classification algorithm, and interpret the results. The interpretation should ideally be non-technical and will be delivered as a presentation on Friday, 20th Aug at 11:00 am ET.
The Data
The provided dataset relates to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (a bank term deposit) would be subscribed to ('yes') or not ('no').
The goal of this project is to run classification algorithms to identify whether a customer will subscribe to a term deposit. The bank-names.txt file describes all the independent variables as well as the dependent variable.
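As a starting point for exploration, the 'yes'/'no' outcome can be mapped to 1/0 and its class balance checked before any modelling. The sketch below uses a short made-up list of labels in place of the real target column; the helper names `encode_target` and `positive_rate` are illustrative and not part of the dataset or any library.

```python
from collections import Counter


def encode_target(labels):
    """Map 'yes'/'no' strings to 1/0; raise on anything unexpected."""
    mapping = {"yes": 1, "no": 0}
    encoded = []
    for label in labels:
        key = label.strip().lower()
        if key not in mapping:
            raise ValueError(f"unexpected target value: {label!r}")
        encoded.append(mapping[key])
    return encoded


def positive_rate(encoded):
    """Fraction of rows where the client subscribed (label 1)."""
    return sum(encoded) / len(encoded)


# Made-up labels standing in for the real target column (assumption).
sample_labels = ["no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no"]

y = encode_target(sample_labels)
counts = Counter(sample_labels)
rate = positive_rate(y)

print("class counts:", dict(counts))
print("positive rate:", rate)
```

If the real target turns out to be similarly skewed towards 'no', that is worth knowing early, because it affects which evaluation metrics are meaningful.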
Deliverables (per team)
- A Jupyter notebook
- A non-technical business presentation
Key Points
- Feel free to try a bunch of different models: logistic regression, decision trees, or anything else you think would be appropriate.
- You must choose appropriate classification metrics and use them to evaluate your models. Choosing the right classification metrics is a key data science skill, and the choice should be informed by data exploration and by the business problem itself. You must then use these metrics to evaluate model performance on both the training and the testing data.
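
To illustrate why the choice of metric matters, here is a minimal sketch that computes accuracy, precision, recall and F1 from scratch and applies them to both a training and a testing split. All labels and predictions are made up for illustration, `evaluate` is a hypothetical helper, and the "model" predictions stand in for the output of whatever classifier you fit. A baseline that always predicts 'no' (0) is included for comparison.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall and F1; undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions (assumptions, not real results).
y_train = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
model_train_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
model_test_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
baseline_test_pred = [0] * len(y_test)  # always predict 'no'

train_scores = evaluate(y_train, model_train_pred)
test_scores = evaluate(y_test, model_test_pred)
baseline_scores = evaluate(y_test, baseline_test_pred)

for name, scores in [("train", train_scores), ("test", test_scores),
                     ("baseline (test)", baseline_scores)]:
    print(name, {k: round(v, 2) for k, v in scores.items()})
```

In this toy example the always-'no' baseline has higher test accuracy than the model, yet it finds none of the subscribers (recall of zero). Comparing the train and test rows also shows the model scoring better on data it has already seen, which is the kind of gap the train/test comparison is meant to expose.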