Data Engineering Learning Guide

Created by: Kelvin Oyanna
Email: dotkelplus@gmail.com
LinkedIn: https://www.linkedin.com/in/oyannakelvin/
Twitter: @kelvinoyanna

About Data Engineering

Data engineering is a specialization within the data field concerned with building scalable data infrastructure and pipelines. These pipelines aggregate data from multiple sources and consolidate it into an analytics data warehouse, which supports the organization-wide analytics and reports used by data analysts, data scientists, and the BI team.
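
To make the definition above concrete, here is a minimal sketch of such a pipeline using only the Python standard library. Everything in it is an assumption made for illustration: the two sample sources (`ORDERS_CSV`, `ORDERS_JSON`), the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real analytics warehouse.

```python
import csv
import io
import json
import sqlite3

# Hypothetical source 1: a CSV export from an orders system.
ORDERS_CSV = """order_id,customer,amount
1,alice,30.50
2,bob,12.00
"""

# Hypothetical source 2: a JSON payload from a web shop API.
ORDERS_JSON = '[{"id": 3, "customer": "Alice", "total": "7.25"}]'


def extract():
    """Read raw records from both sources."""
    csv_rows = list(csv.DictReader(io.StringIO(ORDERS_CSV)))
    json_rows = json.loads(ORDERS_JSON)
    return csv_rows, json_rows


def transform(csv_rows, json_rows):
    """Map both sources onto one schema: (order_id, customer, amount, source)."""
    unified = []
    for row in csv_rows:
        unified.append(
            (int(row["order_id"]), row["customer"].lower(), float(row["amount"]), "csv")
        )
    for row in json_rows:
        unified.append(
            (int(row["id"]), row["customer"].lower(), float(row["total"]), "api")
        )
    return unified


def load(conn, rows):
    """Write the unified rows into a single analytics table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, source TEXT)"
    )
    # INSERT OR REPLACE keeps the load idempotent when the pipeline is re-run.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load in order."""
    csv_rows, json_rows = extract()
    load(conn, transform(csv_rows, json_rows))


def revenue_by_customer(conn):
    """Example analytics query over the consolidated table."""
    cur = conn.execute(
        "SELECT customer, ROUND(SUM(amount), 2) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` and then `revenue_by_customer` on the same connection returns one row per customer with the totals from both sources combined. A production pipeline would replace the inline strings with real connectors and SQLite with a proper warehouse, but the extract, transform, load shape stays the same.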

Must-have skills as a Data Engineer

Python:

SQL:

Database/Data modeling:

  • Get the book - The Data Warehouse Toolkit.
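
The Data Warehouse Toolkit centers on dimensional modeling. As a rough taste of what that looks like in practice, here is a small star schema sketch in SQLite. The table names (`dim_customer`, `dim_date`, `fact_sales`), columns, and sample rows are all hypothetical and chosen only for illustration; they are not taken from the book.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table that references them."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            name TEXT,
            country TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            iso_date TEXT,
            year INTEGER,
            month INTEGER
        );
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER,
            amount REAL
        );
        """
    )


def load_sample_rows(conn):
    """Insert a few made-up rows so the schema can be queried."""
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(1, "Ada", "NG"), (2, "Ben", "KE")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(20240115, "2024-01-15", 2024, 1), (20240203, "2024-02-03", 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240115, 2, 20.0), (2, 20240115, 1, 5.0), (1, 20240203, 3, 30.0)],
    )
    conn.commit()


def sales_by_country_and_month(conn):
    """Join the fact table to its dimensions and aggregate the measure."""
    cur = conn.execute(
        """
        SELECT c.country, d.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY c.country, d.month
        ORDER BY c.country, d.month
        """
    )
    return cur.fetchall()
```

The point of the layout is that measures (`quantity`, `amount`) live in the narrow fact table, while descriptive attributes live in the dimensions, so analysts can slice the same facts by any dimension attribute with a simple join.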

Cloud Infrastructure:

  • Learn cloud fundamentals (Google Cloud or AWS).
  • Practice your knowledge of cloud engineering with a cloud sandbox: https://kodekloud.com/

Advanced Skill

Building ETL pipelines using:

Automating & monitoring data pipelines using:

  • Cron jobs or Apache Airflow (most recommended).

Building ELT Data Pipelines:

Learn big data processing frameworks (optional for beginners):

  • Apache Spark (for big data transformation and building streaming data pipelines). Get the book: Spark: The Definitive Guide.
  • Apache Kafka (for large-scale data streaming pipelines).
  • Docker (for containerizing your data pipelines).
  • Git (for version control and remote collaboration).
  • Kubernetes (for data pipeline deployment).

Structured Course:

https://www.dataquest.io/path/data-engineer/

Get your hands on real-world projects:

Follow this link to access DE projects to work on: https://www.ssp.sh/brain/open-source-data-engineering-projects/

Community:

Join https://www.reddit.com/r/dataengineering/

Tutorial:

Follow & watch the videos on this Data Engineering Bootcamp:
https://github.com/DataTalksClub/data-engineering-zoomcamp

Jobs:

https://outerjoin.us/