Pinned Repositories
airflow_pipeline
Automated data pipeline using Airflow, Spark, Hadoop HDFS, and Hive to manage currency exchange rates. It checks currency API and file availability, downloads the rates, stores them in HDFS, creates a Hive table, processes the data with Spark, and sends notifications via email and Slack.
DataStructureAndAlgorithms
Write code that runs faster and uses less memory, and prepare for your job interview
PythonTutorial
Step-by-step guide to building apps with Python. Code files for the YouTube tutorial
med9110's Repositories
med9110/Web-Data-Engineering-Project
med9110/airflow_pipeline
Automated data pipeline using Airflow, Spark, Hadoop HDFS, and Hive to manage currency exchange rates. It checks currency API and file availability, downloads the rates, stores them in HDFS, creates a Hive table, processes the data with Spark, and sends notifications via email and Slack.
med9110/DataStructureAndAlgorithms
Write code that runs faster and uses less memory, and prepare for your job interview
med9110/PythonTutorial
Step-by-step guide to building apps with Python. Code files for the YouTube tutorial
med9110/AWS-Data-Pipeline-for-Retail-Analytics-Classic-Car-Models-Case-Study
This project showcases an end-to-end data pipeline: data is extracted from MySQL on Amazon RDS, transformed with AWS Glue, and stored in Amazon S3. The data is queried with Amazon Athena and visualized in JupyterLab. The infrastructure is managed as code (IaC) with Terraform.
med9110/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
med9110/End-to-End-Batch-and-Streaming-Data-Pipelines-for-Real-Time-Recommendations
This project demonstrates the implementation of an end-to-end data engineering solution that involves both batch and streaming data pipelines. The objective is to build a system that can process data in batch mode to train a recommendation model and use streaming data to provide real-time product recommendations based on stakeholder requirements.