/etl-tsl_events

ETL pipeline for Turkish football events

Primary language: Python. License: Apache License 2.0.

Turkish Football Events ETL

This is an ETL pipeline that pulls Turkish football events (goals, red cards, etc.) from the Turkish Football Federation website.

Architecture

General Diagram

An example event document from MongoDB:

Document

I used Python to pull, transform, and load the data. The warehouse is MongoDB. All components run as Docker containers.
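As a rough illustration of the transform and load steps described above, here is a minimal sketch. It is not the code from this repository: the field names (`match_id`, `minute`, `event_type`, `player`) and the helpers `transform`, `load`, and `InMemoryCollection` are hypothetical, chosen only for the example. The load step relies on nothing more than an `insert_many` method, which a pymongo collection also provides.

```python
from typing import Dict, Iterable, List


def transform(raw: Dict[str, str]) -> dict:
    """Normalize one raw event row into a document.

    The field names are assumptions for this sketch, not the real schema.
    """
    return {
        "match_id": int(raw["match_id"]),
        # Minutes are assumed to be scraped as text such as "45'".
        "minute": int(str(raw["minute"]).strip().rstrip("'")),
        "event_type": raw["event_type"].strip().lower().replace(" ", "_"),
        "player": raw["player"].strip(),
    }


def load(events: Iterable[Dict[str, str]], collection) -> int:
    """Transform the events and insert them; return how many were written."""
    docs: List[dict] = [transform(event) for event in events]
    if docs:  # skip the call entirely when there is nothing to insert
        collection.insert_many(docs)
    return len(docs)


class InMemoryCollection:
    """Stand-in for a MongoDB collection so the sketch runs without a database."""

    def __init__(self) -> None:
        self.docs: List[dict] = []

    def insert_many(self, documents: List[dict]) -> None:
        self.docs.extend(documents)
```

To try it without MongoDB, pass an `InMemoryCollection()` to `load`. With the containers running, a pymongo collection could be passed instead, since `load` only calls `insert_many`.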

Setup

Pre-requisites

  1. Docker and Docker Compose v1.27.0 or later.
  2. AWS account.
  3. AWS CLI installed and configured.
  4. git.

Local

A Makefile provides the common commands. These are executed in the running container.

make up # starts all the containers
make ci # runs formatting, lint check, type check and python test

The remaining configuration options are in the .env.dist file.

Production

In production, the instances run as containers, so port 27017 must be available for MongoDB.
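One way to check the port requirement before starting the containers is to try binding to the port. The sketch below is an illustration only: `port_is_free` is a hypothetical helper, not part of this project, and it assumes the check runs on the host that will publish the port.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True


if __name__ == "__main__":
    if port_is_free(27017):
        print("Port 27017 is available for MongoDB.")
    else:
        print("Port 27017 is already in use.")
```

The check is only a snapshot: another process could take the port between the check and the container start.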

Tear down

You can spin down your local instance with:

make down

This project was inspired by Designing a Data Project to Impress Hiring Managers.