Pinned Repositories
airflow-ETL-pipeline
An ETL pipelines project that uses Airflow DAGs to extract accessories and jewelry data from PostgreSQL schemas and shoes data from a CSV file, load them into an AWS data lake, transform them with a Python script, and finally load them into a Snowflake data warehouse using SCD type 2.
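The core of an SCD type 2 load is expiring a changed dimension row and appending a new current version. A minimal sketch in plain Python (the real project does this against Snowflake; the `price` attribute and column names here are hypothetical):

```python
from datetime import date

def scd2_merge(dimension, incoming, today):
    """Apply an SCD type 2 merge: expire changed rows, append new versions.

    dimension: list of dicts with keys id, price, start_date, end_date, is_current
    incoming:  latest snapshot, list of dicts with keys id, price
    """
    current = {r["id"]: r for r in dimension if r["is_current"]}
    for row in incoming:
        old = current.get(row["id"])
        if old is None:
            # Brand-new key: insert an open-ended current row.
            dimension.append({**row, "start_date": today,
                              "end_date": None, "is_current": True})
        elif old["price"] != row["price"]:
            # Attribute changed: close the old version, open a new one.
            old["end_date"] = today
            old["is_current"] = False
            dimension.append({**row, "start_date": today,
                              "end_date": None, "is_current": True})
    return dimension
```

In the warehouse itself the same logic is typically a single `MERGE` statement; this sketch only shows the row-versioning rule.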
alaahgag
Bash_Scripting_Project
The project’s purpose is to build a database management system, with its tasks implemented in Bash scripting.
BigData_Processing_Using_Spark
This project performs a comprehensive data analysis on a large dataset of flight information. The primary focus is gaining insight into flight delay patterns, ultimately shedding light on the dynamics of air travel.
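The central aggregation in a delay-pattern analysis is a group-by over carriers (or routes). The project runs this at scale in Spark; the equivalent logic in plain Python, with illustrative field names, looks like:

```python
from collections import defaultdict

def average_delay_by_carrier(flights):
    """Mean departure delay per carrier. Field names (carrier, dep_delay)
    are illustrative stand-ins for the flight dataset's columns."""
    totals = defaultdict(lambda: [0.0, 0])  # carrier -> [sum, count]
    for f in flights:
        acc = totals[f["carrier"]]
        acc[0] += f["dep_delay"]
        acc[1] += 1
    return {c: s / n for c, (s, n) in totals.items()}
```

In Spark the same computation is `df.groupBy("carrier").agg(avg("dep_delay"))`; the sketch shows what that aggregation does per record.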
Building_DWH_Using_SSIS
Business_Reviews_Pipeline
The Yelp Data Pipeline processes business reviews using Python, Kafka, AWS (DynamoDB, S3, Redshift), PySpark, AWS Lambda, and Power BI. It supports real-time streaming, CDC, daily batch processing, and data visualization for insights into customer sentiment, business performance, and industry trends.
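Change data capture boils down to diffing the current state against the previous one and emitting insert/update/delete events. A simplified stand-in for the stream-based CDC the pipeline uses (keys and fields are hypothetical):

```python
def diff_snapshots(previous, current):
    """Derive CDC events by diffing two snapshots keyed by business id.
    A simplified stand-in for log/stream-based CDC (e.g. DynamoDB Streams)."""
    events = []
    for key, row in current.items():
        if key not in previous:
            events.append(("insert", key, row))
        elif previous[key] != row:
            events.append(("update", key, row))
    for key in previous:
        if key not in current:
            events.append(("delete", key))
    return events
```

Real CDC reads changes from the source's log rather than diffing snapshots, but the downstream consumer sees the same three event kinds.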
data_mart_building
In this project, I build a sales data mart for a company by extracting new and updated data from the AdventureWorks2019 (OLTP) database, creating a star schema, loading the transformed data into dimension and fact tables, and finally applying full, incremental, and slowly changing dimension (SCD) loading.
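The "new and updated data" part of an incremental load is usually a watermark filter on a last-modified column. A minimal sketch (the `modified` field mirrors AdventureWorks-style `ModifiedDate` columns, but the names here are assumptions):

```python
def incremental_extract(rows, last_load_watermark):
    """Select only rows created or changed since the previous load.
    'modified' stands in for a source-side last-modified timestamp."""
    return [r for r in rows if r["modified"] > last_load_watermark]
```

After a successful load, the watermark is advanced to the maximum `modified` value seen, so the next run picks up only the delta; a full load simply skips the filter.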
ETL_telecom_pro
Real-Time-Sales-Data-Analysis-Application
A real-time sales data analysis application using Spark Structured Streaming, Kafka as the messaging system, PostgreSQL as storage for processed data, and Superset for building a dashboard.
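A typical Structured Streaming job here aggregates sales over fixed time windows before writing to PostgreSQL. The windowing rule itself is simple and can be sketched in plain Python (event fields are illustrative; the project expresses this with Spark's `window()` on a Kafka source):

```python
def tumbling_window_totals(events, window_seconds):
    """Sum revenue per fixed (tumbling) window, keyed by window start.
    'ts' is an epoch-seconds event time, 'amount' a sale value (assumed names)."""
    totals = {}
    for e in events:
        window_start = e["ts"] - e["ts"] % window_seconds
        totals[window_start] = totals.get(window_start, 0) + e["amount"]
    return totals
```

Spark adds what this sketch omits: incremental state, late-data handling via watermarks, and fault tolerance.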
twitter_pipeline
An end-to-end data engineering project using Airflow and Python. We extract data via the Twitter API, transform it with Python, deploy the code on Airflow/EC2, and save the final result to Amazon S3.
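The transform step in such a pipeline typically flattens the nested Twitter API payload into tabular rows before they are written to S3. A sketch of that step (the exact fields the project keeps are an assumption based on the Twitter v1.1 tweet object):

```python
def transform_tweets(raw_tweets):
    """Flatten nested tweet JSON into flat rows suitable for CSV/S3 output.
    Field selection here is illustrative, not the project's exact schema."""
    return [{"user": t["user"]["screen_name"],
             "text": t["text"],
             "favorite_count": t["favorite_count"]}
            for t in raw_tweets]
```

In Airflow this function would be wrapped in a task (e.g. a `PythonOperator`), with extraction and the S3 upload as the neighboring tasks in the DAG.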
alaahgag's Repositories
alaahgag/Real-Time-Sales-Data-Analysis-Application
alaahgag/Business_Reviews_Pipeline
alaahgag/airflow-ETL-pipeline
alaahgag/alaahgag
alaahgag/Bash_Scripting_Project
alaahgag/BigData_Processing_Using_Spark
alaahgag/Building_DWH_Using_SSIS
alaahgag/data_mart_building
alaahgag/ETL_telecom_pro
alaahgag/twitter_pipeline
alaahgag/BigDataEngineeringInDepth_DataModeling_InterviewQuestions
alaahgag/data-zoom-campe
Homework assignments for the Data Zoomcamp course.
alaahgag/DE_Project_Templete
alaahgag/PostgreSQL_Data_Modeling
In this project, I applied what I had learned about data modeling with Postgres and built an ETL pipeline using Python. I defined fact and dimension tables for a star schema around a particular analytic focus, and wrote an ETL pipeline that transfers data from files in two local directories into those Postgres tables using Python and SQL.
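The ingestion half of that pipeline is a walk over the local directories, parsing one JSON record per file line before the rows are inserted into Postgres. A sketch of that step (the directory layout and one-record-per-line format are assumptions):

```python
import json
import pathlib

def collect_records(directory):
    """Walk a local directory tree and parse newline-delimited JSON records
    from every .json file, in deterministic (sorted) order."""
    records = []
    for path in sorted(pathlib.Path(directory).glob("**/*.json")):
        with open(path) as f:
            for line in f:
                if line.strip():
                    records.append(json.loads(line))
    return records
```

Each parsed record would then be mapped to an `INSERT` into the appropriate dimension or fact table via a parameterized SQL statement.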
alaahgag/PWC_intern
alaahgag/readme-typing-svg
⚡ Dynamically generated, customizable SVG that gives the appearance of typing and deleting text for use on your profile page, repositories, or website.