- Northwind Database Schema
- KPIs
- Data Warehouse Design
- ETL Process
- Deploying and Scheduling
- Analytical Queries
- Interactive Dashboard Uisng Power BI
The Northwind database is a classic and fictional relational database that simulates the operations of a small trading company called Northwind Traders. The database is structured into several tables representing entities such as products, customers, orders, employees, suppliers, and categories. Each table contains attributes relevant to its respective entity, facilitating the storage and management of data related to the company's transactions, inventory, personnel, and interactions with customers and suppliers.
- Products: Stores information about the products offered by Northwind Traders.
- Customers: Maintains the details of the customers who place orders.
- Orders: Captures the order details, including the customer, order date, and shipping information.
- Order Details: Stores the individual line items for each order, including the product, quantity, and unit price.
- Employees: Holds the data about the company's employees, such as their names, job titles, and reporting structure.
- Suppliers: Keeps track of the suppliers who provide products to Northwind Traders.
- Categories: Organizes the products into different categories.
The following table outlines a set of essential KPIs relevant to our project
KPI | Definition |
---|---|
Total Orders Handled | The aggregate count of orders managed and fulfilled by shippers, indicating operational throughput and workload distribution efficiency. |
Average Shipment Time | The mean duration taken by each shipper to complete order deliveries, reflecting the speed and reliability of shipping services provided. |
Average Shipment Cost | The average expense incurred per shipment by individual shippers, serving as a key metric for evaluating shipping cost effectiveness and operational expenditure management. |
Total Units Sold | The cumulative quantity of units sold across all product offerings, indicating sales volume and market demand for each product. |
Total Revenue | The aggregated revenue generated from the sale of all products, serving as a primary indicator of sales performance and revenue generation capability. |
Revenue per Unit | The average revenue generated per unit sold for each product, providing insights into pricing strategies, product profitability, and customer purchasing behavior. |
Total Number of Orders | The overall count of orders placed by customers, representing sales activity levels and customer engagement with the business. |
Total Sales Revenue | The total monetary value generated from all sales transactions, serving as a key financial metric for assessing business performance and revenue growth. |
Average Order Value | The mean monetary value of individual orders placed by customers, providing insights into customer spending patterns and purchase behavior, and informing marketing and sales strategies. |
The data staging area plays a pivotal role in the construction of the data warehouse. It serves as an intermediary step where data is collected, transformed, and prepared for integration into the warehouse.
- Data Extraction: Relevant data is extracted from source systems, such as the Northwind database.
- Data Transformation: Data undergoes transformations to standardize formats, handle null values, and perform other necessary modifications.
- Data Loading: Transformed data is loaded into staging tables within the data warehouse, ensuring cleanliness and consistency for downstream processing.
Fact Table | Description |
---|---|
F_Shipper | Captures shipping-related metrics such as the number of orders handled, average shipment time, and average shipment cost. |
F_Product | Stores data on product sales, including the number of sold items, total revenue, and supplier information. |
F_Sales | Tracks sales transactions made by employees, including the number of orders served and total revenue generated. |
Dimension Table | Description |
---|---|
D_Date | Contains date-related attributes for time-based analysis. |
D_Location | Stores location information for customers, suppliers, and shipping destinations. |
D_Customer | Holds details about customers, such as name, location, and segment. |
D_Employee | Stores information about employees, including their names and job titles. |
D_Supplier | Contains details about suppliers, such as name and location. |
D_Product | Stores information about products, including category, supplier, and unit price. |
D_Shipper | Contains details about shipping companies, such as name and location. |
In this project, I utilized SQL Server Integration Services (SSIS) to design and implement a robust ETL process for populating a data warehouse, seamlessly integrating diverse data sources. To optimize the ETL, I implemented an incremental load strategy, enabling real-time data integration and analysis. The data warehouse schema supported comprehensive analysis, with fact tables capturing key business metrics complemented by dimension tables providing essential context. By leveraging SSIS and incremental loading, I delivered a solution that empowers stakeholders with timely and accurate insights to drive informed decisions.
This is a screenshot of the ETL Process for Product Fact Table loading
To see other packages download folder SSIS and enjoy :)
In the deployment and scheduling phase, the SSIS catalog facilitated the organization and deployment of SSIS projects, packages, and environments. SQL Server Agent jobs were utilized to automate the execution of SSIS packages, enabling efficient ETL workflows. Each job executed specific SSIS packages responsible for loading dimension and fact tables. By defining job schedules and dependencies, ETL processes ran reliably according to business requirements. The centralized approach streamlined management and integration with other SQL Server components. This framework provided a robust foundation for automating and managing ETL processes, facilitating informed decision-making and driving business outcomes.
- Query 1: Identifies the fastest shipper for each country based on average shipment time during the specified period.
- Query 2: Finds the top supplier in terms of revenue for each country during the specified period.
- Query 3: Calculates the total number of units sold by each supplier during the specified period.
- Query 4: Calculates the average shipment cost for each country.
- Query 5: Calculates the total number of orders handled by each employee during the specified period.
- Query 6: Calculates the total number of orders processed by each shipper during the specified period.
- Query 7: Calculates the total revenue generated by each country during the specified period.
- Category Sales Analysis: This section displays the total sales revenue for different product categories, providing insights into the best-selling and underperforming product lines.
- Customer Order Distribution: Here, the distribution of orders among major customers is visualized through a pie chart, highlighting the top contributors to the company's revenue.
- Global Revenue Overview: This section presents the total revenue generated from different countries, offering insights into the company's global reach and performance.
- Employee Performance: Employee productivity and performance are showcased here, listing the top employees by the number of orders processed.