Binary-Logistic-Regression-On-Insurance-Company-Data

The Insurance Company data used for this exercise can be found on Kaggle. The data set was first uploaded to our GitHub repository. The original data set contains over 14,106 observations and 15 attributes. The response variable is TARGET, which indicates whether a new product was purchased (1) or not (0).

Exploratory data analysis will be performed on the variables to familiarize ourselves with the insurance company attributes, identify trends and missing data, and form preliminary hypotheses about which attributes are predictive. We will then apply feature selection and/or dimensionality reduction techniques to choose the explanatory variables for three different binomial logistic regression models.
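
As a rough illustration of the first exploratory checks described above (missing data and the balance of the response variable), here is a minimal sketch using only the Python standard library. The sample rows are made up, and apart from TARGET the column names (AGE, INCOME) are hypothetical placeholders, not attributes confirmed to be in the Kaggle data set.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; apart from TARGET, the column
# names (AGE, INCOME) are hypothetical and not taken from the Kaggle data.
SAMPLE_CSV = """TARGET,AGE,INCOME
1,34,52000
0,,41000
0,51,
1,29,38000
0,45,61000
"""

def summarize(rows, target="TARGET"):
    """Return (missing-value count per column, class counts for the target)."""
    missing = Counter()
    classes = Counter()
    for row in rows:
        for column, value in row.items():
            # Treat empty or absent cells as missing.
            if value is None or value.strip() == "":
                missing[column] += 1
        classes[row[target]] += 1
    return dict(missing), dict(classes)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing, classes = summarize(rows)
print("missing values per column:", missing)
print("TARGET class counts:", classes)
```

On the real file, the same function could be applied to rows read with `csv.DictReader` from the downloaded CSV. A strongly imbalanced TARGET would be worth noting before any model is fitted.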

In this analysis, we will determine which of the three models is best at predicting whether a person will purchase a new insurance product.
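
To make the comparison step concrete, the sketch below fits three logistic regression models on different feature subsets and ranks them by held-out accuracy. Everything in it is an assumption made for illustration: the data is synthetic, the model names (`model_a`, `model_b`, `model_c`) and feature subsets are hypothetical, the fitting routine is a plain gradient descent written so the example has no dependencies, and accuracy stands in for whichever evaluation metric the analysis actually uses.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, xi):
    # w[0] is the intercept; the remaining weights pair with the features.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def accuracy(w, X, y):
    """Share of observations whose 0.5-thresholded prediction matches the label."""
    hits = sum(int((predict_proba(w, xi) >= 0.5) == (yi == 1)) for xi, yi in zip(X, y))
    return hits / len(y)

# Synthetic data (not the insurance data): x0 and x1 drive the outcome, x2 is noise.
random.seed(0)
X, y = [], []
for _ in range(400):
    x = [random.gauss(0, 1) for _ in range(3)]
    X.append(x)
    y.append(1 if random.random() < sigmoid(1.5 * x[0] - 1.0 * x[1]) else 0)

# Simple holdout split: first 300 rows for training, last 100 for testing.
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Three hypothetical candidate models, each using a different feature subset.
candidates = {"model_a": [0], "model_b": [0, 1], "model_c": [0, 1, 2]}
results = {}
for name, cols in candidates.items():
    train = [[row[c] for c in cols] for row in X_train]
    test = [[row[c] for c in cols] for row in X_test]
    results[name] = accuracy(fit_logistic(train, y_train), test, y_test)

best = max(results, key=results.get)
print(results, "best:", best)
```

Scoring on rows held out from fitting keeps the comparison from simply rewarding the model with the most features. With an imbalanced TARGET, a metric such as AUC or recall may be more informative than accuracy.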