cloudcruncher

NatWest Group, Edinburgh

cloudcruncher's Stars

raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Language:Go746153
manuzhang/awesome-streaming
a curated list of awesome streaming frameworks, applications, etc
2.7k stars · 297 forks
abhishek-ch/around-dataengineering
A Data Engineering & Machine Learning Knowledge Hub
Language: Python · 1.1k stars · 224 forks
manoj9788/spark-etl-tests
A sample repository showcasing the implementation of tests for an ETL pipeline developed with Apache Spark
1
vmware/versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
Language:Python43057
damklis/DataEngineeringProject
Example end to end data engineering project.
Language: Python · 1.1k stars · 224 forks
Rishav273/kafkaPysparkAnalytics
Real-time ETL pipeline for financial data (Kafka, PySpark).
Language:Python81
benchsci/tinsel
PySpark schema generator
Language:Python385
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
Language: Jupyter Notebook · 25.2k stars · 5.4k forks
piskvorky/smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Language: Python · 3.2k stars · 383 forks
GoogleCloudDataproc/initialization-actions
Run in all nodes of your cluster before the cluster starts - lets you customize your cluster
Language:Shell588512
san089/Udacity-Data-Engineering-Projects
Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Language: Python · 1.5k stars · 491 forks
GoogleCloudPlatform/professional-services
Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.
Language: Python · 2.8k stars · 1.3k forks
GoogleCloudPlatform/training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
Language: Jupyter Notebook · 7.9k stars · 5.9k forks
