ZAAI Infrastructure

This repository contains everything needed to build the Airflow deployment that the Data Team uses to schedule, run, and monitor the tasks that serve stakeholder needs.

Local Requirements

To run everything locally you need the following:

  • docker (with 4-6 GB of memory allocated under Resources)
  • python virtual environment (requires virtualenv or similar to select a python version different from the system default)
    • python 3.10
    • pip version 20.2.4 (must be lower than 20.3)
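
The version requirements above can be checked from inside the virtual environment. This is a minimal sketch, not part of the repository: the helper names (`parse_version`, `python_ok`, `pip_ok`, `check_environment`) are hypothetical, and it only encodes the two constraints listed here (python 3.10 and pip lower than 20.3).

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a version string such as '20.2.4' into (20, 2, 4)."""
    parts = []
    for piece in text.split("."):
        # Keep only the leading digits so that '3b1' is read as 3.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def python_ok(info=sys.version_info) -> bool:
    """True when the interpreter is python 3.10."""
    return tuple(info[:2]) == (3, 10)


def pip_ok(pip_version: str) -> bool:
    """True when pip is lower than 20.3."""
    return parse_version(pip_version) < (20, 3)


def check_environment() -> list[str]:
    """Return a list of human-readable problems (empty when all is fine)."""
    problems = []
    if not python_ok():
        problems.append(f"python 3.10 expected, found {sys.version.split()[0]}")
    try:
        installed_pip = version("pip")
    except PackageNotFoundError:
        problems.append("pip is not installed in this environment")
    else:
        if not pip_ok(installed_pip):
            problems.append(f"pip lower than 20.3 expected, found {installed_pip}")
    return problems
```

Calling `check_environment()` in the activated environment returns an empty list when both constraints hold.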

Local Setup

Please download both folders (infrastructure and mlflow). Each folder contains a README that explains how to run its container.

Once both containers are up, verify that everything is connected. To do so, manually add your input data to PostgreSQL by creating a table for it, and create a second table for the desired predictions.
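
As an illustration of the table-creation step, here is a minimal sketch. Everything specific in it is an assumption: the table names (`input_data`, `predictions`), the columns, and the use of the third-party `psycopg2` driver are hypothetical and should be adapted to your own data and setup.

```python
# Hypothetical schema: replace the table and column names with your own.
INPUT_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS input_data (
    id SERIAL PRIMARY KEY,
    feature_1 DOUBLE PRECISION,
    feature_2 DOUBLE PRECISION,
    created_at TIMESTAMP DEFAULT NOW()
);
"""

PREDICTIONS_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS predictions (
    id SERIAL PRIMARY KEY,
    input_id INTEGER REFERENCES input_data (id),
    prediction DOUBLE PRECISION,
    predicted_at TIMESTAMP DEFAULT NOW()
);
"""


def create_tables(dsn: str) -> None:
    """Create both tables in the database identified by `dsn`."""
    # psycopg2 is a third-party package; it is assumed to be installed.
    import psycopg2

    conn = psycopg2.connect(dsn)
    try:
        # `with conn` commits on success and rolls back on error.
        with conn:
            with conn.cursor() as cur:
                cur.execute(INPUT_TABLE_DDL)
                cur.execute(PREDICTIONS_TABLE_DDL)
    finally:
        conn.close()
```

It could be called as `create_tables("postgresql://airflow:airflow@localhost:5432/airflow")`, where the user, host, port, and database name are placeholders to replace with the values of your PostgreSQL container. The same statements can also be pasted into the pgAdmin query tool.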

If pgAdmin does not display any server, register one as follows:

[Image: pgsetup]

Default logins:

Airflow Webserver:

  • User: Admin
  • Password: admin

pgAdmin:

Minio UI:

  • MINIO_ROOT_USER=zaai_infrastructure
  • MINIO_ROOT_PASSWORD=zaai_infrastructure

MLflow:

  • AWS_ACCESS_KEY_ID=mlflow
  • AWS_SECRET_ACCESS_KEY=mlflow_pwd

Make sure that a connection to Postgres is configured in your Airflow UI (the password is airflow):

[Image: postgrescon]
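
As an alternative to filling in the form in the UI, Airflow can also read a connection from an environment variable named `AIRFLOW_CONN_<CONN_ID>` that holds a connection URI. The sketch below only builds that name and URI. The connection id (`postgres_default`), login, host, port, and database are assumptions, not values taken from this repository; only the password (airflow) comes from the text above.

```python
from urllib.parse import quote


def airflow_conn_env_name(conn_id: str) -> str:
    """Name of the environment variable Airflow reads for `conn_id`."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


def airflow_conn_uri(login: str, password: str, host: str, port: int,
                     schema: str, conn_type: str = "postgres") -> str:
    """Build a connection URI, percent-encoding special characters."""
    return (
        f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(schema, safe='')}"
    )


# Assumed values: adjust login, host, port and database to your setup.
name = airflow_conn_env_name("postgres_default")
uri = airflow_conn_uri("airflow", "airflow", "postgres", 5432, "airflow")
print(f"{name}={uri}")
```

The printed line can be added to the environment of the Airflow containers, for example through the compose file, if you prefer not to create the connection by hand.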