Extract, Transform, and Load (ETL) using Spark, or Extract, Load, and Transform (ELT) using Spark
Here we will create Spark notebooks covering each of the ETL processes below. Once we have learned all of the ETL processes, we will start working on projects using Spark.
Please find the list of ETL pipelines below:
- Chapter0 -> Spark ETL with Files (CSV | JSON | Parquet)
- Chapter1 -> Spark ETL with SQL Database (MySQL | PostgreSQL)
- Chapter2 -> Spark ETL with NoSQL Database (MongoDB)
- Chapter3 -> Spark ETL with Azure (Blob | ADLS)
- Chapter4 -> Spark ETL with AWS (S3 bucket)
- Chapter5 -> Spark ETL with Hive tables
- Chapter6 -> Spark ETL with APIs
- Chapter7 -> Spark ETL with Lakehouse (Delta Lake)
- Chapter8 -> Spark ETL with Lakehouse (Apache HUDI)
- Chapter9 -> Spark ETL with Lakehouse (Apache Iceberg)
- Chapter10 -> Spark ETL with Lakehouse (Delta Lake vs Apache Iceberg vs Apache HUDI)
- Chapter11 -> Spark ETL with Lakehouse (Delta table Optimization)
- Chapter12 -> Spark ETL with Apache Kafka
- Chapter13 -> Spark ETL with GCP (BigQuery)
Also see the blog below for an explanation of all the data engineering ETL chapters:
https://developershome.blog/category/data-engineering/spark-etl
Also see the YouTube channel below for walkthroughs of all the data engineering chapters and for learning new data engineering concepts.