image

Data Engineer at ID/X Partners

Data Warehouse and ETL Implementation

Video Presentation

https://youtu.be/YCUO0fDqs6U

Tools

  • Talend
  • SQL Server

Skills

  • Data Warehouse
  • ETL and ELT
  • Store Procedure
  • Restore Database

Case Study

  • Restore Database
  • Create Table Fact and Dimension
  • Create ETL Process / Data Pipeline
  • Create Store Procedure

1. Restore Database Staging

The first step is to restore the staging.bak database. To do so, see the image below.

  • Restore Database
    • image
  • Result Restore Database
    • image

2. Create Table Fact and Dimension

The next step is to create the DWH_Project database. Then in the database create a table DimCustomer, DimProduct, DimStatusOrder and FactSalesOrder.

  • Create Database DWH_Project
    CREATE DATABASE DWH_Project;
  • Create Table DimCustomer
    CREATE TABLE DimCustomer (
      CustomerID int NOT NULL PRIMARY KEY,
      FirstName varchar(50) NOT NULL,
      LastName varchar(50) NOT NULL,
       Age int NOT NULL,
      Gender varchar(50) NOT NULL,
      City varchar(50) NOT NULL,
      NoHP varchar(50) NOT NULL,
    );
  • Create Table DimProduct
    CREATE TABLE DimProduct (
      ProductID int NOT NULL PRIMARY KEY,
      ProductName varchar(255) NOT NULL,
      ProductCategory varchar(255) NOT NULL,
      ProductUnitPrice int,
    );
  • Create Table DimStatusOrder
    CREATE TABLE DimStatusOrder (
      StatusID int NOT NULL PRIMARY KEY,
      StatusOrder varchar(50) NOT NULL,
      StatusOrderDesc varchar(50) NOT NULL,
    );
  • Create Table FactSalesOrder
    CREATE TABLE FactSalesOrder (
      OrderID int NOT NULL PRIMARY KEY,
       CustomerID int NOT NULL,
       ProductID int NOT NULL,
       Quantity int NOT NULL,
       Amount int NOT NULL,
       StatusID int NOT NULL,
       OrderDate date NOT NULL,
       CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerID)
      REFERENCES DimCustomer(CustomerID),
       CONSTRAINT FK_ProductOrder FOREIGN KEY (ProductID)
      REFERENCES DimProduct(ProductID),
       CONSTRAINT FK_StatusOrder FOREIGN KEY (StatusID)
       REFERENCES DimStatusOrder(StatusID)
    );
  • Result
    • image

3. Create ETL Process / Data Pipeline

The next step is to create a job in Talend using the tMSSqllinput and tMSSqlOutput components to move data from source (staging) to target (DWH_Project), and using tMap component for UPCASE in the DimCustomer table.

  • Create DB Connections

    • image
  • Step 1

    • image
  • Step 2

    • image
  • Result Connections

    • image
  • Retrieve Schema

    • image
  • Checklist dbo

    • image
  • Create Job

    • image
  • Result Job

    • image
  • Enter the required components and connect them to each other, as well as configure the database on each component.

    • image
  • The source and target database sections are configured as shown in the image below.

    • Database Source

      • image
    • Database Target

      • image
  • In the target database section, make sure the columns between source and target match the column names and data types, if you have already run the job by clicking Run.

    • image
  • Result table data after moving data from the staging Database to the DWH_Project Database.

    • image

4. Create Store Procedure

Store Procedure named summary_order_status contains SELECT and JOIN commands between fact and dimension tables to display several columns, namely OrderID, CustomerName, ProductName, Quantity, StatusOrder. The following is an example of creating a Store Procedure with StatusID as a parameter.

  • Query Store Procedure
    CREATE PROCEDURE summary_order_status
      @StatusID INT
    AS
    BEGIN
      SELECT
          fso.OrderID,
          dc.CustomerName,
          dp.ProductName,
          fso.Quantity,
          dso.StatusOrder
      FROM FactSalesOrder fso
      JOIN DimCustomer dc ON fso.CustomerID = dc.CustomerID
      JOIN DimProduct dp ON fso.ProductID = dp.ProductID
      JOIN DimStatusOrder dso ON fso.StatusID = dso.StatusID
      WHERE dso.StatusID = @StatusID;
    END;
    
    EXEC summary_order_status @StatusID = 1;
  • Result SP
    • image

Link Video Presentation