/forage-anz

Solution to Data at ANZ virtual internship on Forage

Primary language: Jupyter Notebook

Data@ANZ Virtual Experience Program

This repository contains my solution to the Data@ANZ Virtual Experience Program on Forage.

Introduction

ANZ is one of the big four banks in Australia. This virtual experience program aims to explore the transactional behaviour of ANZ customers by examining the transactions made by 100 ANZ customers over a 3-month period. These transactions include purchases, recurring transactions and salary transactions.

Task 1: Exploratory Data Analysis

Segment the dataset and draw insights from it, including visualising transaction volume and assessing the effect of any outliers.

  • Start by doing some basic checks - are there any data issues? Does the data need to be cleaned?
  • Gather some interesting overall insights about the data. For example, what is the average transaction amount? How many transactions do customers make each month on average?
  • Segment the dataset by transaction date and time. Visualise transaction volume and spending over the course of an average day or week. Consider the effect of any outliers that may distort your analysis.
  • For a challenge: what insights can you draw from the location information provided in the dataset?
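
The first three steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names (`customer_id`, `timestamp`, `amount`), the 1.5 × IQR outlier rule and the start date are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the transaction data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 1_000
txns = pd.DataFrame({
    "customer_id": rng.integers(0, 100, size=n),
    "timestamp": pd.Timestamp("2018-08-01")
    + pd.to_timedelta(rng.integers(0, 92 * 24 * 60, size=n), unit="min"),
    "amount": rng.gamma(shape=2.0, scale=20.0, size=n).round(2),
})
txns.loc[:4, "amount"] = 5_000.0  # inject five extreme values to act as outliers

# Basic checks: missing values per column and fully duplicated rows.
missing = txns.isna().sum()
duplicates = int(txns.duplicated().sum())

# Overall insights: mean transaction amount and mean transactions per
# customer per month (months in which a customer has no transactions are ignored).
mean_amount = txns["amount"].mean()
monthly_counts = txns.groupby(
    ["customer_id", txns["timestamp"].dt.to_period("M")]
).size()
avg_txns_per_customer_month = monthly_counts.mean()

# Flag unusually large amounts with the 1.5 * IQR rule (one possible choice).
q1, q3 = txns["amount"].quantile([0.25, 0.75])
upper = q3 + 1.5 * (q3 - q1)
txns["is_outlier"] = txns["amount"] > upper

# Segment by hour of day: volume plus mean and median spend.
txns["hour"] = txns["timestamp"].dt.hour
hourly = txns.groupby("hour")["amount"].agg(
    count="size", mean="mean", median="median"
)
n_days = txns["timestamp"].dt.normalize().nunique()
hourly["avg_count_per_day"] = hourly["count"] / n_days

print(hourly.head())
```

Reporting the median alongside the mean is one way to see how much a few large transactions distort the hourly picture; calling `hourly["avg_count_per_day"].plot(kind="bar")` in a notebook would give a simple volume chart for an average day.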

Task 2: Predictive Analytics

Explore correlations between customer attributes, then build a regression model and a decision-tree model to predict annual salary based on your findings.

  • Using the same transaction dataset, identify the annual salary for each customer.
  • Explore correlations between annual salary and various customer attributes. These attributes could be those that are readily available in the data or those that you construct or derive yourself. Visualise any interesting correlations using a scatter plot.
  • Build a simple regression model to predict the annual salary for each customer using the attributes you identified above.
  • How accurate is your model? Should ANZ use it to segment customers into income brackets for reporting purposes?
  • For a challenge: build a decision-tree based model to predict salary. Does it perform better? How would you accurately test the performance of this model?
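
A rough sketch of the salary and regression steps is shown below, again on synthetic data rather than the real dataset. The column names, the `"SALARY"` label, the two features and the rule of annualising by multiplying the 3-month salary total by 4 are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_customers = 100

# Synthetic per-customer attributes; column names are hypothetical.
customers = pd.DataFrame({
    "customer_id": np.arange(n_customers),
    "age": rng.integers(18, 65, size=n_customers),
    "mean_purchase": rng.gamma(2.0, 20.0, size=n_customers),
})

# Synthetic salary payments: six equal payments per customer over 3 months.
pay = (
    1_000
    + 25 * customers["age"]
    + 10 * customers["mean_purchase"]
    + rng.normal(0, 100, n_customers)
)
salary_txns = pd.DataFrame({
    "customer_id": np.repeat(customers["customer_id"].to_numpy(), 6),
    "txn_type": "SALARY",  # hypothetical label for salary transactions
    "amount": np.repeat(pay.to_numpy(), 6),
})

# Annual salary: total salary received in the 3-month window, times 4.
quarter_total = (
    salary_txns[salary_txns["txn_type"] == "SALARY"]
    .groupby("customer_id")["amount"]
    .sum()
)
customers["annual_salary"] = customers["customer_id"].map(quarter_total * 4)

# Correlation of each attribute with annual salary.
features = ["age", "mean_purchase"]
corr = customers[features + ["annual_salary"]].corr()["annual_salary"]

# Simple linear regression by least squares, scored on a held-out 20% split.
idx = rng.permutation(n_customers)
train, test = idx[:80], idx[80:]
X = np.column_stack(
    [np.ones(n_customers), customers[features].to_numpy(dtype=float)]
)
y = customers["annual_salary"].to_numpy()
coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
pred = X[test] @ coef

rmse = float(np.sqrt(np.mean((y[test] - pred) ** 2)))
# Baseline: always predict the mean salary seen in the training split.
baseline_rmse = float(np.sqrt(np.mean((y[test] - y[train].mean()) ** 2)))
print(f"RMSE: {rmse:.0f}  (mean-only baseline: {baseline_rmse:.0f})")
```

Comparing the held-out RMSE with a mean-only baseline gives a first answer to "how accurate is your model?". A decision-tree model could be scored on the same split, or with cross-validation given only 100 customers, so that the two models are compared on data neither has seen.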
