Customer_Segmentation_Arvato

Identifying potential customers using Machine Learning

Table of Contents

  1. Project Motivation
  2. Methods Used
  3. File Descriptions
  4. Results
  5. Installation
  6. Licensing, Authors, and Acknowledgements

Project Motivation

In this project, I use data provided by Bertelsmann Arvato Analytics to analyze demographics data for customers of a mail-order sales company in Germany, comparing it against demographics information for the general population. I use unsupervised learning techniques to perform customer segmentation, identifying the parts of the population that best describe the company's core customer base. I then apply what I've learned to a third dataset, which holds demographics information for targets of one of the company's marketing campaigns, and use a model to predict which individuals are most likely to become customers.

Methods Used

  • Exploratory data analysis to understand the dataset
  • Feature selection and elimination using correlation
  • PCA for dimensionality reduction and K-means, an unsupervised ML technique, for clustering
  • Various machine learning models for prediction
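
The unsupervised steps above (correlation-based feature elimination, PCA, then K-means) can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn and NumPy, runs on synthetic data rather than the Arvato datasets, and the helper `drop_correlated`, the 0.95 threshold, the two principal components, and the two clusters are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def drop_correlated(X, threshold=0.95):
    """Return the indices of columns to keep (hypothetical helper).

    A column is dropped when its absolute Pearson correlation with any
    already-kept column exceeds the threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


# Synthetic stand-in for the demographics data: two groups of 200 rows.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
group_b = rng.normal(loc=4.0, scale=1.0, size=(200, 5))
X = np.vstack([group_a, group_b])

# Append a near-duplicate of column 0 so the correlation filter has work to do.
X = np.hstack([X, X[:, [0]] + rng.normal(scale=0.01, size=(400, 1))])

# 1. Feature elimination using correlation.
keep = drop_correlated(X, threshold=0.95)
X_reduced = X[:, keep]

# 2. Standardize, then reduce dimensionality with PCA.
X_scaled = StandardScaler().fit_transform(X_reduced)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# 3. Cluster the reduced data with K-means.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_pca)

print("kept columns:", keep)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("cluster sizes:", np.bincount(labels))
```

On real data, the number of components and clusters would normally be chosen by inspecting the explained variance and a clustering criterion such as the within-cluster sum of squares, rather than fixed up front as they are here.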

File Descriptions

  1. DSND-Arvato Project Workbook Final.ipynb : notebook containing the whole project, including EDA and the machine learning models
  2. terms_and_conditions : contains information about data privacy
  3. requirements.txt : text file listing the libraries and packages required to execute the code

Results

  • I identified the population clusters most relevant to future customers and successfully identified positive responders to the mail-order campaign
  • More information about the project and its main findings can be found in the post available here
  • Prediction results can also be found on Kaggle

Installation

The libraries and packages required to run the notebook are listed in requirements.txt.

Licensing, Authors, and Acknowledgements

Credit goes to Arvato/Bertelsmann and Udacity for the data. You can find the licensing for the data and other descriptive information at the Kaggle link available here. Otherwise, feel free to use the code here as you would like!