/data-engineer-handbook

This is a repo with links to everything you'd ever want to learn about data engineering

The Data Engineering Handbook

This repo has all the resources you need to become an amazing data engineer!

Getting started

If you are new to data engineering, start by following this 2024 roadmap for breaking into data engineering.

If you are here for the 6-week free YouTube boot camp you can check out

For more applied learning:

  • Check out the projects section for more hands-on examples!
  • Check out the interviews section for more advice on how to pass data engineering interviews!
  • Check out the books section for a list of high-quality data engineering books
  • Check out the communities section for a list of high-quality data engineering communities to join
  • Check out the newsletter section to learn via email

Resources

Top 3 must-read books:

Top must-join communities for DE:

Top must-join communities for ML:

Companies:

Data Engineering blogs of companies:

Data Engineering Whitepapers:

Social Media Accounts

Here's a mostly comprehensive list of data engineering creators. (You need at least 5k followers on some platform to be added!)

Name | YouTube | LinkedIn | X/Twitter | Instagram | TikTok
Zach Wilson Data with Zach (70k+) Zach Wilson (400k+) EcZachly (30k+) eczachly (150k+) @eczachly (70k+)
Shashank Mishra E-learning Bridge (100k+) Shashank Mishra (100k+)
Seattle Data Guy Seattle Data Guy (100k+) Ben Rogojan (100k+) SeattleDataGuy (10k+)
TrendyTech TrendyTech (100k+) Sumit Mittal (100k+)
Darshil Parmar Darshil Parmar (100k+) Darshil Parmar (100k+)
Andreas Kretz Andreas Kretz (100k+) Andreas Kretz (100k+) learndataengineering (5k+)
ByteByteGo ByteByteGo (1m+) Alex Xu (100k+) alexxubyte (100k+)
The Ravit Show The Ravit Show (100k+)
Guy in a Cube Guy in a Cube (100k+)
Adam Marczak Adam Marczak (100k+)
nullQueries nullQueries (100k+)
TECHTFQ by Thoufiq TECHTFQ by Thoufiq (100k+)
SQLBI SQLBI (100k+) Marco Russo (50k+) marcorus (10k+)
Azure Lib Azure Lib (10k+) Deepak Goyal (100k+)
Prashanth Kumar Pandey ScholarNest (77k+) Prashanth Kumar Pandey (37k+)
Advancing Analytics Advancing Analytics (10k+) Simon Whiteley (10k+)
Kahan Data Solutions Kahan Data Solutions (10k+)
Ankit Bansal Ankit Bansal (10k+) Ankit Bansal (50k+)
Mr. K Talks Tech Mr. K Talks Tech (10k+)
Li Yin Li Yin (10k+)
Jaco van Gelder Jaco van Gelder (10k+)
Joseph Machado Joseph Machado (10k+) startdataeng (5k+)
Eric Roby Eric Roby (10k+)
Simon Späti Simon Späti (10k+)
Dipankar Mazumdar Dipankar Mazumdar (5k+)
Daniel Ciocirlan Daniel Ciocirlan (5k+)
Hugo Lu Hugo Lu (5k+)
Tobias Macey Tobias Macey (5k+)
Marcos Ortiz Marcos Ortiz (5k+)
Julien Hurault Julien Hurault (5k+)
Alex Freberg Alex The Analyst (100k+) Alex Freberg (100k+) @alex_the_analyst (10k+)
Marc Lamberti Marc Lamberti (50k+)
Chip Huyen Chip Huyen (250k+)
Alex Merced Alex Merced Data Alex Merced (30k+) @amdatalakehouse @alexmercedcoder
John Kutay John Kutay John Kutay (5k+) @JohnKutay
Lakshmi Sontenam Lakshmi Sontenam (9.5k+)
Hassaan Akbar Hassaan Akbar (5k+)
Samuel Focht Python Basics (10k+)
Constantin Lungu Constantin Lungu (10k+)
Ijaz Ali Ijaz Ali (24k+)
Subhankar Subhankar (5k+)
Ankur Ranjan Big Data Show (100k+) Ankur Ranjan (48k+)
Lenny Lenny A (6k+)
Mehdi Ouazza Mehdio DataTV (3k+) Mehdi Ouazza (20k+) mehd_io @mehdio_datatv
ITVersity ITVersity (67k+) Durga Gadiraju (48k+)
Arnaud Milleker Arnaud Milleker (7k+)
Soumil Shah [Soumil Shah](https://www.youtube.com/@SoumilShah) (50k) Soumil Shah (8k+)
Ananth Packkildurai Ananth Packkildurai (18k+)

Great Podcasts

Top must-follow newsletters for data engineering:

Glossaries:

Design Patterns

Courses / Academies

Certification Courses