/data-engineering-zoomcamp

A course by DataTalks.Club that covers Spark, Kafka, Docker, Airflow, Terraform, dbt, BigQuery, and more.

Primary language: Jupyter Notebook

DataTalks Data Engineering Zoomcamp

Covered Topics

Docker

  • How to write a Dockerfile
  • Building an image from a Dockerfile
  • Running the image
  • Running two Docker containers on the same network
  • Port mapping
  • Docker Compose

Terraform

  • Using infrastructure as code
  • Creating a Terraform configuration
  • Planning, applying, and destroying resources on GCP

GCP

  • Set up BigQuery and Cloud Storage using Terraform
  • Set up a VM instance
  • Create an SSH config file
  • SSH into the VM instance from the local machine
  • Access the code directory in VS Code
  • Forward ports from the GCP VM to the local machine

Postgres, PgAdmin, SQL

  • Set up Postgres and pgAdmin in Docker
  • Insert data into Postgres using pandas.io.sql
  • Connect using pgcli
  • SQL refresher
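
The pandas.io.sql step above can be sketched roughly as follows. This is a minimal illustration, not the course's actual notebook: the `trips` table name, the sample columns, and the chunk size are hypothetical, and an in-memory SQLite database stands in for Postgres so the snippet runs without a server. With a real Postgres container, one would typically pass a SQLAlchemy engine as `con` instead.

```python
import sqlite3

import pandas as pd
from pandas.io import sql as pd_sql

# Hypothetical sample data standing in for the course dataset.
df = pd.DataFrame(
    {
        "trip_id": [1, 2, 3, 4, 5],
        "passenger_count": [1, 2, 1, 3, 1],
        "fare_amount": [7.5, 12.0, 5.25, 30.0, 9.75],
    }
)

# Assumption: in-memory SQLite replaces Postgres so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")

# Preview the CREATE TABLE statement pandas would generate for this frame.
ddl = pd_sql.get_schema(df, "trips", con=conn)
print(ddl)

# Append rows in small chunks, as one would for a file too large for memory.
# The first append creates the table; later ones add rows to it.
chunk_size = 2
for start in range(0, len(df), chunk_size):
    chunk = df.iloc[start : start + chunk_size]
    chunk.to_sql("trips", con=conn, if_exists="append", index=False)

# Confirm that every row arrived.
row_count = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
print(f"rows inserted: {row_count}")
```

Inspecting the generated schema before inserting is a cheap way to catch columns that pandas inferred with an unexpected type, and chunked appends keep memory use bounded for large source files.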

BigQuery