Pinned Repositories
SCD-Implementation-PySpark-Retail-Project
spark-streaming-retail-project
Algorithms
alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
alluxio-py
Alluxio Python client - Access Any Data Source with Python
api_data_etl_pipeline_airflow
Azure-Guide
Microsoft Azure Guide. Learn all about Microsoft Azure Tools, Services, and Certifications.
cookiecutter-pathway
Cookiecutter Pathway is a framework for jumpstarting production-ready Pathway projects quickly.
data-engineering-essentials
data-lakehouse-project
Owengerald's Repositories
Owengerald/alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
Owengerald/Streaming--data-engineering-project
This project builds a data pipeline for ingestion, processing, and storage. Apache Airflow handles orchestration; Python, Kafka, Zookeeper, and Spark handle real-time data processing; Cassandra provides storage. All components run in Docker containers for deployment and scaling.
Owengerald/data-lakehouse-project
Owengerald/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Owengerald/alluxio-py
Alluxio Python client - Access Any Data Source with Python
Owengerald/spark-streaming-join-operation-with-static-data
Owengerald/spark-streaming-window-aggregations-retail-project-test
Owengerald/SCD-Implementation-PySpark-Retail-Project
Owengerald/spark-streaming-retail-project
Owengerald/spark-streaming-sales-project
Owengerald/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Owengerald/llm-app
LLM App templates for RAG, knowledge mining, and stream analytics. Ready to run with Docker, ⚡ in sync with your data sources.
Owengerald/Loan-Risk-Score-Evalutation
Owengerald/RetailAnalysisProject
Owengerald/RetailProject
Owengerald/NotesOfYouTubeSQLSeries
Owengerald/spark
Apache Spark - A unified analytics engine for large-scale data processing
Owengerald/finance-project
Owengerald/pathway-benchmarks
Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
Owengerald/cookiecutter-pathway
Cookiecutter Pathway is a framework for jumpstarting production-ready Pathway projects quickly.
Owengerald/Azure-Guide
Microsoft Azure Guide. Learn all about Microsoft Azure Tools, Services, and Certifications.
Owengerald/norby_inc_analytics_elt_pipeline
Owengerald/ecomarket_etl_analytics
Owengerald/Owengerald
Owengerald/api_data_etl_pipeline_airflow
Owengerald/nyc_yellow_taxi_data_pipelines
Owengerald/test-cicd
Owengerald/data-engineering-essentials
Owengerald/Algorithms
Owengerald/video-game-training-sql
Hey, this is the repo that has all the queries and data for my video game training series!