I have used a dataset from Kaggle (public domain) related to the Retail Sales Data collected from 10 different shopping malls in Istanbul, Turkey.
Kaggle - Customer Shopping Dataset
invoice_no
: Invoice numbercustomer_id
gender
: Either male or femaleage
category
: Such as clothing, shoes, cosmetics, et ceteraquantity
price
: This is the price per unit itempayment method
: Either cash, debit card, or credit cardshopping_mall
: Name of the mall
- Data Loading and Cleaning
- Feature Engineering
- Descriptive Statistics
- Data Visualization
- Data Analysis
- Recommendations
This step involved:
- Loading the dataset into a Pandas dataframe
- Handling missing values
- Removing duplicates
- Correcting data types
- Formatting the dates
This step involved creating columns based on:
- Categorization of malls (Luxury mall, mega mall, mixed-use mall, and outlet mall)
- Location of mall (whether located on the European side or the Asian side of Istanbul)
- Total price of items
- Categorization of ages into different groups (baby, young adult, middle-aged adult, and old adult)
- Total price of items purchased by each gender
- Total price of items purchased by each age group
- Total price of items purchased in each category
- Total price of items purchased through each payment method
- Total price of items purchased in each shopping mall
- Total price of items purchased in each mall location
- Total price of items purchased in each type of mall
- Unit price of items purchased by each gender
- Total quantity of items purchased by each gender
- Total quantity of items purchased from each category
- Total quantity of items purchased through each payment method
- Total quantity of items purchased in each shopping mall
- Total quantity of items purchased in each type of mall
- Total quantity of items purchased in each mall location
- Total quantity of items purchased by each age group
This step involved getting KPIs from the crucial columns using measures of central tendency. Here are a few important insights derived from this step:
- The average age of a person shopping in the malls is 43 years.
- On average, a customer buys about 3 items of any type from the shopping mall.
- On average, a person spends 689.26 Turkish Liras per item in a shopping mall.
- The number of purchases made by females is higher than those of males.
- The payment method for most invoices is cash.
- The majority of the invoices come from malls located on the European side of Istanbul.
In this step, I utilized various types of graphs such as bar charts and pie charts using matplotlib and seaborn libraries to explore and understand the data.
A few pivotal graphs created at this stage were:
- Comparison of Payment Methods
- Total Amount Spent
- Purchase Trends in Each Month
From the graphs and descriptive statistics, I made the following inferences:
- Customers aged 37 years have the most invoices in shopping malls.
- On average, a customer buys about 3 items of any type from the shopping mall.
- On average, a person spends 689.26 Turkish Liras per item in a shopping mall.
- The maximum number of invoices are from the Mall of Istanbul.
- More invoices are issued to females, who spend more and purchase a higher quantity of items than males.
- Both male and female customers frequently visit luxury and mega malls, as well as malls located in the European part of Istanbul.
- Young adults (aged 3 to 39 years) have more invoices compared to middle-aged and old adults, hence the total price and quantity of items purchased is higher.
- Clothing emerges as the most popular category among all the invoices, followed by cosmetics and food/beverages.
- Clothing and shoes, followed by technology items, have the highest total price.
- Price per item is the highest for technology, followed by shoes and clothing.
- Customers spend more in the month of October, with a sharp decrease in November and a slight increase in December. Males shop the most in September, while female customers shop more in July and October.
- Most payments are made in cash, followed by credit card.
- More invoices are issued by Kanyon and Mall of Istanbul than other malls.
- Invoices are dominated by purchases from mega malls.
- Most invoices are from malls located in the European part of Istanbul.
Here are my top recommendations based on the insights gathered:
- Malls should locate themselves in the European part of Istanbul and upgrade their infrastructure to become luxury or mega malls.
- Malls should focus more on their young adult (aged 3 to 39 years) customers.
- Malls should also focus more on their female customers, as the number of invoices, total price, and quantity of items purchased by females is higher than that of males.
- Malls should prepare adequately for the month of October, as it has the highest sales volume.
- Malls should ensure that customers can easily pay using cash.
- Malls should focus more on selling clothing items to customers.