This Jupyter notebook has been tested with Python 3.8 and requires the libraries listed in the requirements file. The main notebook is ProjectWorkbook.ipynb.
Arvato Financial Services, a Bertelsmann subsidiary, partnered with Udacity to propose a capstone project as one of the options for the Data Science and Machine Learning Nanodegrees. They provided several datasets, discussed in the following sections, concerning a mail-order sales company in Germany that wants to identify segments of the general population to target with its marketing in order to grow. Many companies pursue this kind of practice to improve their ROI by targeting customers who are likely to buy a certain product, churn, or default on a loan (Sebastiaan Höppner, 2017). In this project I use the mail-order sales data, among other datasets, and apply machine learning techniques to build an estimator that can infer which individuals are most likely to respond to the marketing campaign and become customers of the mail-order company.
Arvato Financial Services wants to address two main goals using the datasets provided. First, to analyze the attributes of established customers and of the general population in order to create customer segments. Second, to use that analysis to build a machine learning model that infers whether or not an individual will respond to one of their marketing campaigns.
To make my solution and results reproducible, all development Python code is available in this GitHub repository. Note that, in addition to Udacity’s Terms of Use and other policies, the datasets for this project are governed by the terms and conditions of AZ Direct GmbH: publishing the data is prohibited, as is keeping it for more than 2 weeks after downloading it from an official source and accepting those terms and conditions.