DSSG - Data Engineering Workshop 🪠❤️

Welcome to the Data Engineering Workshop organized by the Data Science for Social Good (DSSG) community. This workshop is designed to provide a hands-on introduction to the data engineering workflow and tools. The intended audience is data scientists and analysts who are interested in learning how to build simple data pipelines and data warehouses.

Workshop Setup

For this workshop we prepared a documentation page that will guide us through the different modules and exercises. You can find it here.

Timetable

Session	Time	Description	Tool
Giving Context and Getting to know each other	9:00 - 9:30	Introducing the workshop objectives; participants share their backgrounds and expectations
Storing Data	9:30 - 10:15	Persisting data in a secure and queryable location for analytics purposes	DuckDB
Extracting and Loading	10:30 - 11:15	Transferring data from different systems to a centralized repository	Airbyte
Transforming	11:30 - 12:30	Shaping raw data from various sources into a unified view that can be interpreted by stakeholders	dbt
Making data accessible	12:45 - 13:15	Providing interpretation and data access to the rest of the organization	Metabase
Follow Up Questions and next steps	13:15	Participants can ask questions and receive guidance on recommended next steps and resources for further learning in data engineering

We will learn to:

Set up a basic Analytical Database using DuckDB
Read some data from various sources into our database with Airbyte
Transform the data into a unified view with dbt
Attach a visualization tool to the database using Metabase
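
To make the four steps above more concrete, here is a minimal sketch of the same flow (store, load, transform, read) in plain Python and SQL. It is only an illustration, not the workshop material itself: the table name raw_donations, the view name donations_by_donor and the sample rows are made up for this example, and the load and transform steps are imitated with plain SQL rather than with Airbyte or dbt. The sketch uses DuckDB's Python package when it is installed and otherwise falls back to the standard-library sqlite3 module, which is an assumption made only so the example runs anywhere.

```python
# Minimal sketch of the store -> load -> transform -> read flow.
# All table names, view names and rows are hypothetical examples.
try:
    import duckdb as db  # the analytical database used in the workshop
except ImportError:
    import sqlite3 as db  # stand-in so the sketch runs without DuckDB


def run_pipeline():
    # Storing data: an in-memory database (nothing is written to disk).
    con = db.connect(":memory:")
    con.execute("CREATE TABLE raw_donations (donor TEXT, amount DOUBLE)")

    # Extracting and loading: in the workshop Airbyte does this part;
    # here a few hand-written rows are inserted instead.
    rows = [("alice", 10.0), ("bob", 7.5), ("alice", 5.0)]
    con.executemany("INSERT INTO raw_donations VALUES (?, ?)", rows)

    # Transforming: in the workshop dbt manages models like this one;
    # here it is a plain SQL view that aggregates the raw table.
    con.execute(
        "CREATE VIEW donations_by_donor AS "
        "SELECT donor, SUM(amount) AS total "
        "FROM raw_donations GROUP BY donor"
    )

    # Making data accessible: a BI tool such as Metabase would query
    # the view; here the rows are simply fetched.
    result = con.execute(
        "SELECT donor, total FROM donations_by_donor ORDER BY donor"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for donor, total in run_pipeline():
        print(donor, total)
```

Each tool in the workshop takes over one of the commented steps, so the sketch can serve as a mental map of where Airbyte, dbt and Metabase fit around the database.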

We are not touching:

Building a production-ready data pipeline
Setting up cloud infrastructure
Orchestrating complex data pipelines with lots of dependencies
Interacting with all the bells and whistles of the tools we will use
