This project showcases the development of a star-schema-based sales analysis data warehouse for a fictitious national department store using IBM InfoSphere DataStage. The aim is to demonstrate a typical data warehousing flow, including modeling, ETL processes, and the creation of data marts to meet specific business needs.
- Industry Scenario: Retail
- Dimensions: Customer, Store, Product
- Fact Table: Transactions
- Designing the Star-Schema: Defined a star-schema consisting of three dimensions (product, retail, customer) and a fact table (transactions).
- Populating Dimension and Fact Tables: Loaded sample data into the dimension and fact tables.
- Setting up for Recurring Tasks: Prepared the infrastructure for recurring tasks.
- Translated the logical data model to a physical data model.
- Loaded sample data into files for reference.
- Developed ETL solutions to load data into files or databases as per business requirements.
- Implemented transformations on source data to fulfill business needs.
- Transformation: Upper case transformation applied to all customer name columns. Date representation changed to display columns instead of using the dim date table.
- Business Need: Display each transaction for each customer based on customer type (citizen or foreign) along with product and stock information.
- Business Needs Addressed:
- Display count of all transactions for each customer for each store.
- Display max profit made by which customer type.
- Give bonus to customers based on the profit they make.
This project illustrates the end-to-end process of building a star-schema-based sales analysis data warehouse using IBM InfoSphere DataStage. It covers the modeling process, ETL development, and the creation of data marts to address specific business requirements within the retail industry scenario.