Primary language: Jupyter Notebook

KPMG_internship

3 datasets: Customer Demographic, Customer Addresses, and Transactions data from the past 3 months

Module 1

Aim: Look through the client datasets, identify data quality issues, and propose strategies to mitigate them. The data quality dimensions used to evaluate the datasets are:

  1. Correct Values
  2. Values free from contradiction
  3. Values up to date
  4. Data items with value meta-data
  5. Duplicated records
  6. Null data

The analysis is summarized in the attached Word document.
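
As an illustration of how a few of these dimensions (null data, duplicated records, correct values) could be checked programmatically, here is a minimal sketch in plain Python. It is not the notebook's actual code: the function name `profile_quality`, the column names, the allowed-value sets, and the sample rows are all hypothetical.

```python
from collections import Counter


def profile_quality(records, key_field, allowed_values=None):
    """Count nulls per column, list duplicated keys, and tally unexpected values."""
    allowed_values = allowed_values or {}
    null_counts = Counter()
    invalid_values = {}
    key_counts = Counter()

    for row in records:
        key_counts[row.get(key_field)] += 1
        for column, value in row.items():
            if value is None or value == "":
                # Treat None and empty strings as missing data.
                null_counts[column] += 1
            elif column in allowed_values and value not in allowed_values[column]:
                # Value is present but outside the expected vocabulary.
                invalid_values.setdefault(column, Counter())[value] += 1

    # Keys that occur more than once, in first-seen order.
    duplicate_keys = [key for key, n in key_counts.items() if n > 1]
    return {
        "null_counts": dict(null_counts),
        "duplicate_keys": duplicate_keys,
        "invalid_values": {col: dict(c) for col, c in invalid_values.items()},
    }


# Hypothetical sample rows, not taken from the client data.
sample = [
    {"customer_id": 1, "gender": "Female", "state": "NSW"},
    {"customer_id": 2, "gender": "F", "state": "New South Wales"},
    {"customer_id": 2, "gender": "Male", "state": ""},
    {"customer_id": 3, "gender": None, "state": "VIC"},
]

report = profile_quality(
    sample,
    key_field="customer_id",
    allowed_values={"gender": {"Female", "Male"}, "state": {"NSW", "VIC", "QLD"}},
)
print(report)
```

A report like this only flags candidates; deciding whether, say, `F` should be mapped to `Female` remains a judgment call for the written analysis.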

Module 2

Aim: Using the existing 3 datasets as a labelled dataset, recommend which of the 1000 new customers should be targeted to drive the most value for the organisation. Detailed approach:

  1. Understanding the data distributions
  2. Feature engineering
  3. Data transformations
  4. Modelling
  5. Results
  6. Interpretation and reporting.

The PowerPoint presentation details the strategy behind each of the 3 phases: Data Exploration, Model Development, and Interpretation.
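
The README does not say which features or models were used. As one possible reading of the feature engineering step, the sketch below derives recency, frequency, and monetary (RFM) features per customer from transaction rows and ranks customers by them. RFM is an assumption swapped in for illustration, and the function names, field names, dates, and amounts are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, as_of):
    """Aggregate per-customer recency (days), frequency (count), monetary (total)."""
    grouped = defaultdict(list)
    for t in transactions:
        grouped[t["customer_id"]].append(t)

    features = {}
    for customer_id, rows in grouped.items():
        last_purchase = max(r["date"] for r in rows)
        features[customer_id] = {
            "recency_days": (as_of - last_purchase).days,
            "frequency": len(rows),
            "monetary": round(sum(r["amount"] for r in rows), 2),
        }
    return features


def rank_by_value(features):
    """Order customers by spend (desc), then purchase count (desc), then recency (asc)."""
    return sorted(
        features,
        key=lambda c: (
            -features[c]["monetary"],
            -features[c]["frequency"],
            features[c]["recency_days"],
        ),
    )


# Hypothetical transactions, not taken from the client data.
transactions = [
    {"customer_id": 1, "date": date(2017, 12, 1), "amount": 120.0},
    {"customer_id": 1, "date": date(2017, 12, 20), "amount": 80.5},
    {"customer_id": 2, "date": date(2017, 10, 5), "amount": 300.0},
    {"customer_id": 3, "date": date(2017, 12, 28), "amount": 50.0},
]

features = rfm_features(transactions, as_of=date(2017, 12, 31))
ranking = rank_by_value(features)
print(ranking)
```

Features of this kind can only be computed for existing customers with transaction history; scoring the 1000 new customers would require a model trained on attributes that both groups share, such as demographics and address data.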

Module 3

Aim: Support the results of the analysis with visualisations. Tableau was used to build visualisations that summarize:

  1. What are the trends in the underlying data?
  2. Which customer segment has the highest customer value?
  3. What do you propose should be Sprocket Central Pty Ltd's marketing and growth strategy?
  4. What additional external datasets may be useful to obtain greater insights into customer preferences and propensity to purchase the products?
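
The question about which customer segment has the highest value comes down to a grouped aggregate, which Tableau computes behind a bar chart. A minimal Python sketch of the same calculation follows, assuming average profit per customer as the value measure. The function name, field names, segment labels, and figures are hypothetical.

```python
from collections import defaultdict


def value_by_segment(customers, segment_field, value_field):
    """Return the mean of value_field for each distinct segment."""
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [running sum, count]
    for customer in customers:
        bucket = totals[customer[segment_field]]
        bucket[0] += customer[value_field]
        bucket[1] += 1
    return {segment: total / count for segment, (total, count) in totals.items()}


# Hypothetical customers, not taken from the client data.
customers = [
    {"segment": "Segment A", "profit": 400.0},
    {"segment": "Segment A", "profit": 600.0},
    {"segment": "Segment B", "profit": 900.0},
    {"segment": "Segment C", "profit": 700.0},
]

averages = value_by_segment(customers, "segment", "profit")
top_segment = max(averages, key=averages.get)
print(top_segment, averages[top_segment])
```

Whether the mean, the total, or another measure best represents "customer value" is a modelling choice; the mean is used here only to keep the example short.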