Data Engineering Zoomcamp 2022

My Course Notes

  • Data Lakes
  • Data Pipeline Orchestration with Airflow
  • Basics of Data Warehousing and BigQuery
  • Ingesting data into BQ Data Warehouse with Airflow
  • Optimizing performance and cost with partitioning and clustering in BQ
  • Machine Learning in BQ
  • ETL vs ELT
  • dbt basics
  • Transformations in the data warehouse and dbt Cloud
  • dbt Project Repo
  • Dashboards in Google Data Studio
  • Batch vs Streaming
  • Installing Spark
  • Spark SQL and DataFrames
  • Spark Internals