Repository for Data Engineering Course
- What is (Big) Data?
- The Role of Data Engineer
- From Data Warehouse to Data Lakes
- Setup Docker
- Introduction to Jupyter Notebooks
- Relational Data
- NoSQL
- Document
- Graph
- Data Warehousing
- Star and Snowflake schemas
- Data Vault
- Modelling and Querying Relational data: MySQL
- Modelling and Querying Document data: MongoDB
- Modelling and Querying Graph data: Cypher
- Modelling and Querying RDF data: SPARQL
- Domain-Driven Design: a Summary
- Event Sourcing: a Summary
- Big Data Systems Architectures
- ETL and Data Pipelines
- Best Practices and Anti-Patterns
- Batch vs Streaming Processing
- Data Replication
- Data Partitioning
- Transactions
- Data Ingestion with Apache Kafka
- Data Pipelines with Apache Airflow
- Data Processing with Kafka Streams/KSQL
- Data Pipelines with Luigi
- Data Pipelines with Apache NiFi
- Data Processing with Apache Flink
- Cleansing
- Augmentation
- Cleansing examples using OpenRefine
- Augmentation examples using Pandas and TensorFlow