Building Dimensional Model and using in it SnowflakeDB
The primary objective of this project is to construct a dimensional model utilizing Fact and Dimension tables, employing the Star Schema approach. The raw or flat data, residing in Amazon S3, serves as the foundational source for this model. The data model is being developed within the SnowflakeDB data warehouse environment, facilitating the organization and optimization of data for analytical purposes.
Technologies/Tools used :
- SQL in SnowflakeDB
- Amazon S3 for storing data in a bucket
- AWS IAM access to the SnowflakeDB
Data Source and Data Dictonary : https://www.kaggle.com/c/instacart-market-basket-analysis/data.
Project WorkFlow:
-
Created a bucket in AWS S3 to store all the e-commerce files in the bucket.
-
Providing the IAM user access with the "Administrator Access" and stored the access keys.
-
In the SnowflakeDB, a new schema and database is created for the instacart-market-basket data.
-
Created staging area (preparing Snowflake's environment to access the data from the S3) to maintain flexibility based on user preferences and requirements. In SnowflakeDB, the S3 .csv files have been imported using COPY command. Refer Instacart Fact Dim.sql
-
Identified the data model and constructed the dimension and fact tables.