Why does this repo exist?
For the following reasons:
- people keep asking me for DE training resources.
- people ask me what online courses I do.
- people ask me what books I read.
In my opinion, paid resources are a better use of your time than free content. On average, I find the return on time to be higher with paid resources.
This is a curated list, not a comprehensive one. Treat it as a good starting point.
- freeCodeCamp
- Advancing Analytics
- Amazon Web Services
- Apache Airflow
- ByteByteGo
- Computerphile
- Google Cloud Tech
- IBM Technology
- Interviewing.IO
- Microsoft Azure
- Pluralsight
- Snowflake
- Thoughtworks
- HashiCorp
- CircleCI
- Databricks YouTube Channel
Here is a spreadsheet of all the online courses I have done.
I would suggest that you learn the following programming languages and tools. These are not listed in any particular order.
Here are some books that might be useful.
- [Fundamentals of Data Engineering]
- [Fundamentals of Software Architecture]
- [Data Pipelines Pocket Reference]
- [Machine Learning Pocket Reference]
- [Data Mesh]
- [Refactoring Databases]
- Linux Command Line Bible
- Designing Data-Intensive Applications
- Advanced Analytics with Spark
- PostgreSQL: Up and Running
- Fluent Python: Clear, Concise, and Effective Programming
- Effective Python
- Building Microservices by Sam Newman
- Monolith to Microservices by Sam Newman
- Enterprise Integration Patterns
- Domain-Driven Design
- Domain-Driven Design Distilled
- The Enterprise Big Data Lake by Alex Gorelik
- Building Evolutionary Architectures by Neal Ford
- Software Architecture: The Hard Parts by Neal Ford
- Kafka: The Definitive Guide
- Designing Distributed Systems
- Building Event-Driven Microservices
- TBD
- Axon Framework
- Spring Cloud
- FastAPI