longNguyen010203
I am teaching myself to become a data engineer.
Thang Long University
Ha Noi City
Pinned Repositories
100Day-Self-Learning-DE
A self-study process of more than 3 months at 3-4 hours/day to prepare for applying for an intern or fresher Data Engineer position in 2024.
Bank-DataWarehouse
This project develops a data warehouse for a bank using Amazon Redshift, VPC, Glue, S3, and dbt, following a Star Schema architecture. The goal is to store, manage, and optimize data to support decision-making and reporting.
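The star-schema layout behind this warehouse can be sketched with a toy example. This is a minimal illustration using SQLite in place of Amazon Redshift; the table and column names (`dim_customer`, `dim_date`, `fact_transaction`) are hypothetical, not taken from the actual project.

```python
import sqlite3

# Illustrative star schema for a bank warehouse (hypothetical table and
# column names; the real project targets Amazon Redshift, not SQLite).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables describe the "who/what/when" of each transaction.
cur.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        name TEXT,
        segment TEXT
    )
""")
cur.execute("""
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        full_date TEXT,
        year INTEGER
    )
""")

# The fact table holds the measures plus foreign keys to every dimension.
cur.execute("""
    CREATE TABLE fact_transaction (
        transaction_id INTEGER PRIMARY KEY,
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount REAL
    )
""")

cur.execute("INSERT INTO dim_customer VALUES (1, 'Alice', 'retail')")
cur.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', 2024)")
cur.execute("INSERT INTO fact_transaction VALUES (100, 1, 20240101, 250.0)")

# A typical star-schema query: join the fact table to its dimensions.
row = cur.execute("""
    SELECT c.name, d.year, SUM(f.amount)
    FROM fact_transaction f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY c.name, d.year
""").fetchone()
print(row)  # ('Alice', 2024, 250.0)
```

The single fact table in the middle joined to small, denormalized dimension tables is what makes the schema a "star" and keeps reporting queries simple.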
Batch-Processing-with-Amazon-EMR
ECommerce-ELT-Pipeline
A Data Engineering project that implements an ELT data pipeline using Dagster, Docker, dbt, Polars, Snowflake, and PostgreSQL. Data sourced from the Kaggle website.
FDE-Course-2024-W3
FDE-Course-2024-W4-DBT
Fundamental Data Engineering Course 2024, Week 4: learn dbt, transform data with models and macros, and build an ELT pipeline with Dagster.
InspireAI-Web-2024
This project involves creating an AI chatbot with OpenAI using ChatGPT, DALL-E, Codex, and Django to develop the web application.
Stream-Processing-with-Amazon-Kinesis
Youtube-ETL-Pipeline
A Data Engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data sourced from Kaggle and the YouTube API.
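The extract-transform-load pattern this pipeline orchestrates can be sketched in plain Python. This is a dependency-free illustration of the ETL steps only; the actual project runs them under Dagster with Spark and MinIO, and the field names below (`video_id`, `views`) are hypothetical.

```python
# Minimal extract-transform-load sketch in plain Python; the actual
# project orchestrates these steps with Dagster. Field names are
# hypothetical, not taken from the real pipeline.

def extract() -> list[dict]:
    # Stand-in for pulling rows from the YouTube API or a Kaggle dump.
    return [
        {"video_id": "a1", "views": "1000"},
        {"video_id": "b2", "views": "250"},
    ]

def transform(rows: list[dict]) -> list[dict]:
    # Cast string counts to integers, then keep popular videos only.
    typed = [{**r, "views": int(r["views"])} for r in rows]
    return [r for r in typed if r["views"] >= 500]

def load(rows: list[dict], sink: list) -> None:
    # Stand-in for writing to object storage or a warehouse table.
    sink.extend(rows)

warehouse: list[dict] = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'video_id': 'a1', 'views': 1000}]
```

An orchestrator like Dagster wraps each step as an asset or op, adding scheduling, retries, and lineage on top of this same extract/transform/load shape.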
Zillow-Home-Value-Prediction
The Zillow Home Value Prediction project employs linear regression models on Kaggle datasets to forecast house prices. Using Apache Spark (PySpark) within a Docker setup enables efficient data preprocessing, exploration, analysis, visualization, and model building, with distributed computing for parallel computation.
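The core idea of fitting a linear regression for price forecasting can be shown in a few lines. This is a plain-Python ordinary-least-squares sketch with made-up toy numbers; the actual project fits its models with PySpark on Kaggle housing data.

```python
# One-feature linear regression via ordinary least squares, as a
# plain-Python sketch; the real project uses PySpark on Kaggle data.
# The house sizes and prices below are made up for illustration.

def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Toy data: house size (m^2) vs. price (thousands of USD).
sizes = [50.0, 80.0, 120.0, 200.0]
prices = [150.0, 240.0, 360.0, 600.0]

slope, intercept = fit_line(sizes, prices)
prediction = slope * 100 + intercept  # predicted price for 100 m^2
print(prediction)
```

Spark's `LinearRegression` in MLlib solves the same minimization, but distributes the computation over partitions of a DataFrame, which is what makes it practical for larger Kaggle datasets.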
longNguyen010203's Repositories
longNguyen010203/Youtube-ETL-Pipeline
longNguyen010203/FDE-Course-2024-W3
longNguyen010203/FDE-Course-2024-W4-DBT
longNguyen010203/ECommerce-ELT-Pipeline
longNguyen010203/InspireAI-Web-2024
longNguyen010203/Zillow-Home-Value-Prediction
longNguyen010203/100Day-Self-Learning-DE
longNguyen010203/Bank-DataWarehouse
longNguyen010203/Data-Structure---Algorithm-TLU
longNguyen010203/Databricks-ETL-Pipeline
longNguyen010203/InsightEats-Hanoi-Capital
longNguyen010203/LONGNGUYEN--AWS-Trainning--2024
Welcome to my AWS Cloud Training repository! This repo contains notes, exercises, and projects from my AWS Cloud training journey, showcasing my progress and understanding of AWS services.
longNguyen010203/longNguyen010203
longNguyen010203/Spark-Kafka-Self-Learning
A third-year student self-studying Spark and Kafka as part of a data engineering journey, with the goal of securing an internship or a fresher job in 2024.
longNguyen010203/Spark-Processing-AWS
Set up and build a big data processing pipeline with Apache Spark and AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to provision the infrastructure and integrating Airflow to automate workflows.
longNguyen010203/workshop-001
longNguyen010203/Batch-Processing-with-Amazon-EMR
longNguyen010203/Finance-Data-Ingestion-Pipeline-with-Kafka
longNguyen010203/Image-Classification-using-resnes18
longNguyen010203/longNguyen010203.github.io
AWS Data Engineer - Workshop
longNguyen010203/Stream-Processing-with-Amazon-Kinesis