For data analysis, firstly, I calculated the AOV(Average Order Value) of the data and checked that the AOV is $3145.13.
On further analysis, I checked the normal distribution of the data, and according to the figure, we can see that the data is left-skewed, i.e., the information is highly inclined to one side. A significant cause for the increase in the mean value.
To further investigate, I decided to plot the order amount of the data per the date of the order and found that many outliers are causing the skewness of the data magenta and the orange points.
The source of these outliers came out to be shop IDs 42 and 78.
- For Shop ID 42, it is found that the shop is taking multiple orders of $704000 and 2000 items.
- For Shop ID 78, it is found that the shop is are selling costly sneakers.
The complete analysis of the data can be found at the Notebook
a) Think about what could be going wrong with our calculation. Think about a better way to evaluate this data.
As per the data analysis, I can say that the data is skewed, causing the AOV to increase than the actual value. Ee can better evaluate this data by calculating how much the information is divided equally.
We can use MOV(Median Order Value)
284
SELECT
COUNT(DISTINCT(OrderID))
FROM
[ Orders ] O
LEFT JOIN [ Shippers ] S ON O.ShipperID = S.ShipperID
WHERE
S.ShipperName = 'Speedy Express';
Result 54
SELECT
Lastname
FROM
[ Employees ]
WHERE
EmployeeID IN (
SELECT
EmployeeID
from
(
SELECT
EmployeeID,
COUNT(OrderID) as Orders
FROM
[ Orders ]
GROUP BY
EmployeeID
)
where
Orders = (
SELECT
MAX(Orders)
from
(
SELECT
EmployeeID,
COUNT(OrderID) as Orders
FROM
[ Orders ]
GROUP BY
EmployeeID
ORDER BY
Orders DESC
)
)
)
Result Peacock
SELECT
ProductName,
MAX(Product_Count) as Product_Count
FROM
(
SELECT
P.ProductName,
COUNT(1) as Product_Count
FROM
[ OrderDetails ] OD
INNER JOIN [ Products ] P on P.ProductID = OD.ProductID
INNER JOIN [ Orders ] O on O.OrderID = OD.OrderID
INNER JOIN [ Customers ] C on O.CustomerID = C.CustomerID
WHERE
C.Country = 'Germany'
GROUP BY
OD.ProductId
ORDER BY
Product_Count DESC
)
Result Gorgonzola Telino with ordered 5 times