Extract, Transform, and Load (ETL) using Spark, or Extract, Load, and Transform (ELT) using Spark
Here we will create Spark notebooks covering each of the ETL processes below. Once we have learned all of the ETL processes, we will start working on projects using Spark.
Please find the list of ETL pipelines below:
- Chapter0 -> Spark ETL with Files (CSV | JSON | Parquet)
- Chapter1 -> Spark ETL with SQL Database (MySQL | PostgreSQL)
- Chapter2 -> Spark ETL with NoSQL Database (MongoDB)
- Chapter3 -> Spark ETL with Azure (Blob | ADLS)
- Chapter4 -> Spark ETL with AWS (S3 bucket)
- Chapter5 -> Spark ETL with Hive tables
- Chapter6 -> Spark ETL with APIs
- Chapter7 -> Spark ETL with Lakehouse (Delta Lake)
- Chapter8 -> Spark ETL with Lakehouse (Apache HUDI)
- Chapter9 -> Spark ETL with Lakehouse (Apache Iceberg)
- Chapter10 -> Spark ETL with Lakehouse (Delta Lake vs Apache Iceberg vs Apache HUDI)
- Chapter11 -> Spark ETL with Lakehouse (Delta table Optimization)
- Chapter12 -> Spark ETL with Apache Kafka
- Chapter13 -> Spark ETL with GCP (BigQuery)
Also see the blog below for an explanation of all the data engineering ETL chapters:
https://developershome.blog/category/data-engineering/spark-etl
Also see the YouTube channel below for walkthroughs of all the data engineering chapters and for learning new data engineering concepts.