Pinned Repositories
Ohitv_End_To_End_Project
End-to-End ETL Pipeline for Film Data Crawling from Ohitv
Crawl_Traveloka
Crawl data from Traveloka.com for hotels, coaches, and plane trips using Selenium
azure_olympic_data_analytics
Olympic Data Analytics | Azure End-To-End Data Engineering Project
azure_databrick_formula1_project
A project I did when I was learning Azure Databricks & Spark For Data Engineers (PySpark / SQL)
leetcode_sql
SQL on LeetCode
aws_ipl_spark_analytics
Ohitv_Film_ETL
Crawl data using two methods: Requests API and Selenium. Data from Selenium will be imported into a SQL Server database, while data from Requests API will be imported into a Postgres database.
ML_Result_Portal
A machine learning portal that shows the results of three use cases: segmentation, cross-selling, and up-selling
assignment_04072024
Study in depth in order to compare CTEs, views, temp tables, table variables, and inline TVFs
assignment_27052024
Ex1: Online vs Offline extraction. Example? Ex2: ETL vs ELT? Ex3: Virtual Environment vs Virtual Machine vs Container.
hhtrieu0108's Repositories
hhtrieu0108/hhtrieu0108
My Github Profile
hhtrieu0108/Ohitv_End_To_End_Project
End-to-End ETL Pipeline for Film Data Crawling from Ohitv
hhtrieu0108/aws_ipl_spark_analytics
hhtrieu0108/azure_olympic_data_analytics
Olympic Data Analytics | Azure End-To-End Data Engineering Project
hhtrieu0108/ApacheSpark-Python
This repository archives all the code files and dependency requirements from my learning of Apache Spark with Python.
hhtrieu0108/project_india_premier_league_cricket
hhtrieu0108/leetcode_sql
SQL on LeetCode
hhtrieu0108/assignment_25072024
Exercise 1: Given a dataset of 1000 arbitrary numbers in a text file, find all prime numbers in the dataset and save the result to a new text file. Exercise 2: Generate a text file of 10,000 lines, each containing a (key, value) pair; calculate the average value for each key; apply the GroupByKey() and ReduceByKey() functions.
hhtrieu0108/Fresher-DE-Exercises-By-AubrynnReina
hhtrieu0108/de_assignment
hhtrieu0108/Chanh_anime_web_scrapping
hhtrieu0108/data-engineering-zoomcamp
Free Data Engineering course!
hhtrieu0108/azure_databrick_formula1_project
A project I did when I was learning Azure Databricks & Spark For Data Engineers (PySpark / SQL)
hhtrieu0108/assignment_07082024
Convex Hull
hhtrieu0108/assignment_01082024
hhtrieu0108/Ohitv_Film_ETL
Crawl data using two methods: Requests API and Selenium. Data from Selenium will be imported into a SQL Server database, while data from Requests API will be imported into a Postgres database.
hhtrieu0108/ApacheKafka-Beginners
This repository archives all the command files and dependency requirements from my learning of Apache Kafka.
hhtrieu0108/assignment_04072024
Study in depth in order to compare CTEs, views, temp tables, table variables, and inline TVFs
hhtrieu0108/assignment_27062024
Find the functions that can be used with subqueries, learn how these functions are executed logically, and explain the performance differences. E.g., compare EXISTS with IN: which is faster, and why?
hhtrieu0108/Crawl_Traveloka
Crawl data from Traveloka.com for hotels, coaches, and plane trips using Selenium
hhtrieu0108/assignment_20062024
Ex1: Find a structured dataset with 1,000,000 rows and insert it into one or more tables. Write queries that produce the same output while using WHERE, GROUP BY, and HAVING differently. Measure and compare the run times!
hhtrieu0108/Edge_Computing
Research and testing on edge computing
hhtrieu0108/Image_Classification
A project from my research into MLOps tools, using Minio, Kubeflow, and Jupyter Notebooks to build a pipeline from data preprocessing through model building to image prediction
hhtrieu0108/assignment_31052024
Ex1: Deploy one or more Docker containers that can run Airflow and ETL jobs (using Python, pandas...) and can connect to data sources and the cloud. Ex2: Why docker-compose? Ex3: How to reduce the size of Docker images and containers? Keyword: multi-stage build.
hhtrieu0108/assignment_27052024
Ex1: Online vs Offline extraction. Example? Ex2: ETL vs ELT? Ex3: Virtual Environment vs Virtual Machine vs Container.
hhtrieu0108/SQL_Exercises_Sample
hhtrieu0108/ML_Result_Portal
A machine learning portal that shows the results of three use cases: segmentation, cross-selling, and up-selling
hhtrieu0108/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021