Bluemazon is an e-commerce company that focuses on selling electronic items. The company has been in business for several years and wants to improve its sales strategy. It wants to analyze its 2019 sales and generate insights from sales data, trends, and metrics in order to set targets. The company needs a data analyst to provide insights about the top-performing and underperforming products, problems in selling, market opportunities, and the sales activities that generate revenue.
- Goal
  - Generate insights and recommendations based on 2019 sales data.
- Objective
  - Process the datasets into a usable form.
  - Analyze the data and create bundle recommendations.
To run this project you will need Jupyter Notebook for the data analysis.
The project also needs the libraries below. The pip install command is listed under each one for convenience.
- Pandas
pip install pandas
- Matplotlib
pip install matplotlib
- Seaborn
pip install seaborn
For more detailed dataset information, visit the Kaggle page.
The available data are 12 CSV files, one for each month of sales. The files were concatenated, resulting in 186,850 order rows. The dataset contains 545 null values and some mismatched feature data types. Features such as month, day, hour, city, and sales were then extracted.
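The preparation steps above could be sketched with pandas roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the function name `clean_sales` is hypothetical, and the column names `Quantity Ordered`, `Order Date`, and `Purchase Address`, the `MM/DD/YY HH:MM` date format, and the `street, city, state zip` address layout are assumptions about the Kaggle files. Only `Price Each` is named in this README.

```python
import pandas as pd


def clean_sales(frames):
    """Concatenate monthly frames, drop unusable rows, and derive features.

    Column names and formats are assumptions about the source files.
    """
    df = pd.concat(frames, ignore_index=True)

    # Drop rows in which every field is missing.
    df = df.dropna(how="all")

    # Coerce columns to proper types; values that cannot be parsed become NaN/NaT.
    df["Quantity Ordered"] = pd.to_numeric(df["Quantity Ordered"], errors="coerce")
    df["Price Each"] = pd.to_numeric(df["Price Each"], errors="coerce")
    df["Order Date"] = pd.to_datetime(
        df["Order Date"], format="%m/%d/%y %H:%M", errors="coerce"
    )

    # Remove rows that failed conversion (for example, stray header rows).
    df = df.dropna(subset=["Quantity Ordered", "Price Each", "Order Date"]).copy()
    df["Quantity Ordered"] = df["Quantity Ordered"].astype(int)

    # Derived features: month, day, hour, city, sales.
    df["Month"] = df["Order Date"].dt.month
    df["Day"] = df["Order Date"].dt.day
    df["Hour"] = df["Order Date"].dt.hour

    # Assumed address layout: "street, city, state zip".
    parts = df["Purchase Address"].str.split(",", expand=True)
    state = parts[2].str.split().str[0]
    df["City"] = parts[1].str.strip() + " (" + state + ")"

    df["Sales"] = df["Quantity Ordered"] * df["Price Each"]
    return df.reset_index(drop=True)
```

It could be called as `clean_sales([pd.read_csv(path) for path in paths])`, where `paths` is a list of the 12 monthly CSV files. Keeping the state next to the city name is a design choice that avoids merging cities that share a name across states.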
Customers mostly order 1 item at a time. A small group orders 2 items at a time, and the largest single order contains 9 items. Sales per order range from 2.99 to 3,400 USD. The distributions of Sales and Price Each are nearly the same because most orders have a quantity of 1.
2019 sales summary: total revenue of 34,483,365.68 USD, 185,916 orders, and 209,038 items sold.
Most orders come from California (CA): San Francisco and Los Angeles have more than roughly 40,000 and 30,000 orders, respectively. The average per city is around 18,000 orders.
Orders are highest in December and October, with 25,000 and 20,000 orders, respectively. Orders increase from January to April, then decrease through September.
Sales peak between around 9:00 and 21:00. This window is a good time to promote products and increase sales.
The top-selling products are batteries, followed by charging cables and headphones.
- Product Combination
  Some products are frequently ordered together. The most common combinations are:
  - Phone product + Charging cable
  - Phone product + Headphone
  - Charging cable + Headphone

  This data can support product bundling to increase sales of specific products.
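One way to find such combinations is to group products by order and count every pair that appears in the same order. The sketch below uses only the standard library. The function name `count_product_pairs` and the `(order_id, product)` input shape are hypothetical choices for illustration, and this is not necessarily how the notebook computes the combinations.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_product_pairs(rows):
    """Count how often each pair of products appears in the same order.

    rows: iterable of (order_id, product) tuples (an assumed input shape).
    """
    # Collect the distinct products of each order.
    baskets = defaultdict(set)
    for order_id, product in rows:
        baskets[order_id].add(product)

    # Count each unordered pair once per order; sorting makes the pair key stable.
    pairs = Counter()
    for products in baskets.values():
        for pair in combinations(sorted(products), 2):
            pairs[pair] += 1
    return pairs
```

`count_product_pairs(rows).most_common(10)` would then list the ten most frequent pairs, which are the natural candidates for bundles.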
- Rush Hour
  Sales peak between around 9:00 and 21:00, which means customers mostly place orders in this time range. This peak can be a sweet spot for advertising, so the data supports posting more ads during the rush-hour window.
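The hourly pattern can be checked by counting orders for each hour of the day. The following is a minimal pandas sketch. The function name `orders_per_hour` is hypothetical, and it assumes the order timestamps are available as strings or datetimes that `pd.to_datetime` can parse.

```python
import pandas as pd


def orders_per_hour(order_dates):
    """Return the number of orders for each hour of the day (0-23)."""
    hours = pd.to_datetime(pd.Series(order_dates)).dt.hour
    # Reindex so that hours with no orders still appear with a count of 0.
    return hours.value_counts().reindex(range(24), fill_value=0)
```

Plotting the result, for example with `orders_per_hour(dates).plot(kind="bar")`, makes the busy window easy to see.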
- Order Probability
  Charging cables have relatively similar order probabilities. iPhone has a higher order probability than Google Phone. Wired Headphones have the highest order probability among headphone products. This data supports holding more stock of, and focusing marketing on, products with a higher order probability.