/data-engineering

Code repo for automating data pipelines and data engineering tasks

Primary language: Jupyter Notebook. License: MIT.

Automated Data Pipelines with sqlite, sqlalchemy and airflow


Overview

This repo contains code for automating data pipelines and data engineering tasks. Its purpose is to build automated data pipelines that:

  • Query data from REST APIs using requests.
  • Clean and transform the raw data using json and pandas.
  • Store data in relational databases using sqlite and sqlalchemy.
  • Update databases on a schedule using cron and airflow.
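
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function names (`fetch_records`, `clean_records`, `store_records`), the `items` table, and the `id`/`name` fields are all hypothetical, and the API is assumed to return a JSON list of objects.

```python
import pandas as pd
from sqlalchemy import create_engine, text


def fetch_records(url):
    """Query a REST API and return the decoded JSON (assumed: a list of dicts)."""
    import requests  # imported here so the other steps run without network access

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def clean_records(records):
    """Flatten raw JSON records into a DataFrame and tidy them up."""
    df = pd.json_normalize(records)
    # Hypothetical cleaning rules: require an id, keep the first row per id.
    df = df.dropna(subset=["id"]).drop_duplicates(subset=["id"])
    df = df.assign(name=lambda d: d["name"].str.strip())
    return df[["id", "name"]]


def store_records(df, engine):
    """Upsert the cleaned rows into a SQLite table through SQLAlchemy."""
    # Cast to plain Python types so the SQLite driver can bind them.
    rows = [
        {"id": int(r.id), "name": str(r.name)}
        for r in df.itertuples(index=False)
    ]
    with engine.begin() as conn:  # commits on success, rolls back on error
        conn.execute(text(
            "CREATE TABLE IF NOT EXISTS items "
            "(id INTEGER PRIMARY KEY, name TEXT)"
        ))
        if rows:
            conn.execute(
                text("INSERT OR REPLACE INTO items (id, name) "
                     "VALUES (:id, :name)"),
                rows,
            )
    return len(rows)


if __name__ == "__main__":
    # Sample data stands in for fetch_records(<some API URL>).
    raw = [{"id": 1, "name": " alpha "}, {"id": 2, "name": "beta"}]
    engine = create_engine("sqlite:///:memory:")
    print(store_records(clean_records(raw), engine))
```

Because the insert uses `INSERT OR REPLACE`, rerunning the script does not duplicate rows, which matters once it is run on a schedule from a cron entry or an Airflow task.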

This repo also contains notes and code from the Real Python SQLite and SQLAlchemy course.


Data Sources