longNguyen010203
I am teaching myself to become a data engineer.
Thang Long University
Ha Noi City
Pinned Repositories
100Day-Self-Learning-DE
A self-study process of more than 3 months at 3-4 hours/day to prepare for applying for an intern or fresher Data Engineer position in 2024.
Bank-DataWarehouse
This project develops a data warehouse for a bank using Amazon Redshift, VPC, Glue, S3, and dbt, following a Star Schema architecture. The goal is to store, manage, and optimize data to support decision-making and reporting.
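The star-schema layout behind this warehouse can be sketched with a toy example. This is a minimal illustration using SQLite in place of Amazon Redshift; the table and column names (`dim_customer`, `dim_date`, `fact_transaction`) are hypothetical, not taken from the actual project.

```python
import sqlite3

# Illustrative star schema for a bank warehouse (hypothetical table and
# column names; the real project targets Amazon Redshift, not SQLite).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables describe the "who/what/when" of each transaction.
cur.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,
        name TEXT,
        segment TEXT
    )
""")
cur.execute("""
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        full_date TEXT,
        year INTEGER
    )
""")

# The fact table holds the measures plus foreign keys to every dimension.
cur.execute("""
    CREATE TABLE fact_transaction (
        transaction_id INTEGER PRIMARY KEY,
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount REAL
    )
""")

cur.execute("INSERT INTO dim_customer VALUES (1, 'Alice', 'retail')")
cur.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', 2024)")
cur.execute("INSERT INTO fact_transaction VALUES (100, 1, 20240101, 250.0)")

# A typical star-schema query: join the fact table to its dimensions.
row = cur.execute("""
    SELECT c.name, d.year, SUM(f.amount)
    FROM fact_transaction f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY c.name, d.year
""").fetchone()
print(row)  # ('Alice', 2024, 250.0)
```

The single fact table in the middle joined to small, denormalized dimension tables is what makes the schema a "star" and keeps reporting queries simple.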
Batch-Processing-with-Amazon-EMR
ECommerce-ELT-Pipeline
A Data Engineering project that implements an ELT data pipeline using Dagster, Docker, dbt, Polars, Snowflake, and PostgreSQL. Data sourced from the Kaggle website.
FDE-Course-2024-W3
FDE-Course-2024-W4-DBT
Fundamental Data Engineering Course 2024, Week 4: learn dbt, transform data with models and macros, and build an ELT pipeline with Dagster.
InspireAI-Web-2024
This project involves creating an AI chatbot with OpenAI using ChatGPT, DALL-E, Codex, and Django to develop the web application.
Stream-Processing-with-Amazon-Kinesis
Youtube-ETL-Pipeline
A Data Engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data sourced from Kaggle and the YouTube API.
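The extract-transform-load pattern this pipeline orchestrates can be sketched in plain Python. This is a dependency-free illustration of the ETL steps only; the actual project runs them under Dagster with Spark and MinIO, and the field names below (`video_id`, `views`) are hypothetical.

```python
# Minimal extract-transform-load sketch in plain Python; the actual
# project orchestrates these steps with Dagster. Field names are
# hypothetical, not taken from the real pipeline.

def extract() -> list[dict]:
    # Stand-in for pulling rows from the YouTube API or a Kaggle dump.
    return [
        {"video_id": "a1", "views": "1000"},
        {"video_id": "b2", "views": "250"},
    ]

def transform(rows: list[dict]) -> list[dict]:
    # Cast string counts to integers, then keep popular videos only.
    typed = [{**r, "views": int(r["views"])} for r in rows]
    return [r for r in typed if r["views"] >= 500]

def load(rows: list[dict], sink: list) -> None:
    # Stand-in for writing to object storage or a warehouse table.
    sink.extend(rows)

warehouse: list[dict] = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'video_id': 'a1', 'views': 1000}]
```

An orchestrator like Dagster wraps each step as an asset or op, adding scheduling, retries, and lineage on top of this same extract/transform/load shape.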
Zillow-Home-Value-Prediction
The Zillow Home Value Prediction project employs linear regression models on Kaggle datasets to forecast house prices. Using Apache Spark (PySpark) within a Docker setup enables efficient data preprocessing, exploration, analysis, visualization, and model building, with distributed computing for parallel computation.
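The core idea of fitting a linear regression for price forecasting can be shown in a few lines. This is a plain-Python ordinary-least-squares sketch with made-up toy numbers; the actual project fits its models with PySpark on Kaggle housing data.

```python
# One-feature linear regression via ordinary least squares, as a
# plain-Python sketch; the real project uses PySpark on Kaggle data.
# The house sizes and prices below are made up for illustration.

def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Toy data: house size (m^2) vs. price (thousands of USD).
sizes = [50.0, 80.0, 120.0, 200.0]
prices = [150.0, 240.0, 360.0, 600.0]

slope, intercept = fit_line(sizes, prices)
prediction = slope * 100 + intercept  # predicted price for 100 m^2
print(prediction)
```

Spark's `LinearRegression` in MLlib solves the same minimization, but distributes the computation over partitions of a DataFrame, which is what makes it practical for larger Kaggle datasets.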
longNguyen010203's Repositories
longNguyen010203/Youtube-ETL-Pipeline
longNguyen010203/FDE-Course-2024-W3
longNguyen010203/FDE-Course-2024-W4-DBT
longNguyen010203/ECommerce-ELT-Pipeline
longNguyen010203/InspireAI-Web-2024
longNguyen010203/Zillow-Home-Value-Prediction
longNguyen010203/100Day-Self-Learning-DE
longNguyen010203/Bank-DataWarehouse
longNguyen010203/Data-Structure---Algorithm-TLU
longNguyen010203/Databricks-ETL-Pipeline
longNguyen010203/InsightEats-Hanoi-Capital
longNguyen010203/LONGNGUYEN--AWS-Trainning--2024
Welcome to my AWS Cloud Training repository! This repo contains notes, exercises, and projects from my AWS Cloud training journey, showcasing my progress and understanding of AWS services.
longNguyen010203/longNguyen010203
longNguyen010203/Spark-Kafka-Self-Learning
A third-year student self-studying Spark and Kafka as part of a data engineering journey, with the goal of securing an internship or a fresher job in 2024.
longNguyen010203/Spark-Processing-AWS
Set up and build a big data processing pipeline with Apache Spark and AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to provision the infrastructure and integrating Airflow to automate workflows.
longNguyen010203/workshop-001
longNguyen010203/Batch-Processing-with-Amazon-EMR
longNguyen010203/Finance-Data-Ingestion-Pipeline-with-Kafka
longNguyen010203/Image-Classification-using-resnes18
longNguyen010203/longNguyen010203.github.io
AWS Data Engineer - Workshop
longNguyen010203/Stream-Processing-with-Amazon-Kinesis