
GOKU: GenAIOps on Kubernetes

[Work in Progress] A reference architecture for performing Generative AI Operations (aka GenAIOps) using Kubernetes, with open source tools

Table of Contents

- Installation
- Features
  - Model Ingestion
  - DREAM: Distributed RAG Experimentation Framework
  - Model Serving
  - Vector Ingestion
  - End-to-end RAG Evaluation
  - Model Monitoring

Installation

For installation, follow the steps provided in the setup doc

Features

Model Ingestion

GOKU uses a customizable Argo Workflows template to download models from Hugging Face and ingest them into MLflow.
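The ingestion step that the workflow runs can be sketched roughly as below. This is an illustrative assumption about what the template does, not its actual contents: the helper names, the base directory, and the "model-ingestion" experiment name are all placeholders.

```python
# Hypothetical sketch of the ingestion step run inside the Argo workflow.
# Assumes huggingface_hub and mlflow are installed; names are illustrative.
import os


def resolve_local_dir(model_name: str, base_dir: str = "/tmp/models") -> str:
    """Map a Hugging Face model id (e.g. 'org/model') to a local directory."""
    safe_name = model_name.replace("/", "--")
    return os.path.join(base_dir, safe_name)


def ingest_model(model_name: str) -> None:
    """Download a model from Hugging Face and log its files to MLflow."""
    from huggingface_hub import snapshot_download  # assumed dependency
    import mlflow

    local_dir = resolve_local_dir(model_name)
    snapshot_download(repo_id=model_name, local_dir=local_dir)

    mlflow.set_experiment("model-ingestion")  # illustrative experiment name
    with mlflow.start_run(run_name=model_name):
        mlflow.log_artifacts(local_dir, artifact_path="model")


if __name__ == "__main__" and os.environ.get("MODEL_NAME"):
    ingest_model(os.environ["MODEL_NAME"])
```

Logging the downloaded files with `mlflow.log_artifacts` is what would make them visible both in the MLflow UI and in MLflow's MinIO-backed artifact store.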

How to run

To run the model ingestion with the default image, follow these steps:
  1. Navigate to the Argo Workflows UI (see steps in the setup doc if unsure)
  2. Enter the "goku" namespace and click on "SUBMIT NEW WORKFLOW"
  3. Select "model-ingestion" as the template to be used
  4. Enter the name of the model you want to ingest and click on "SUBMIT"
  5. You should see the model ingestion workflow running
  6. Once the workflow completes successfully, you should see the model files saved as artifacts in MLflow
  7. You should also be able to verify that the model artifacts have been ingested successfully using the MinIO console
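The UI steps above can also be scripted, for example by invoking the `argo` CLI. The sketch below assumes the CLI is installed and configured against the cluster; the `model-name` parameter name is an assumption about the template, not something confirmed by this README.

```python
# Sketch: submit the model-ingestion WorkflowTemplate from a script instead
# of the Argo Workflows UI. Assumes the `argo` CLI is installed and configured;
# the "model-name" parameter name is an assumption about the template.
import subprocess


def build_submit_command(model: str,
                         template: str = "model-ingestion",
                         namespace: str = "goku") -> list:
    """Build the `argo submit` invocation for the ingestion template."""
    return [
        "argo", "submit",
        "--from", f"workflowtemplate/{template}",
        "-n", namespace,
        "-p", f"model-name={model}",
        "--wait",  # block until the workflow finishes
    ]


def submit_ingestion(model: str) -> None:
    """Run the submit command and raise if the workflow submission fails."""
    subprocess.run(build_submit_command(model), check=True)
```

`argo submit --from workflowtemplate/<name>` submits a new workflow from an existing WorkflowTemplate, which mirrors what the "SUBMIT NEW WORKFLOW" button does in the UI.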

DREAM: Distributed RAG Experimentation Framework

The Distributed RAG Experimentation Framework (DREAM) provides a Kubernetes-native architecture and sample code that demonstrate how Retrieval-Augmented Generation (RAG) experiments, evaluation, and tracking can be conducted in a distributed manner using Ray, LlamaIndex, Ragas, MLflow, and MinIO. Check out the DREAM README for details
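The fan-out pattern behind such distributed experimentation can be illustrated with a minimal Ray sketch: enumerate a grid of RAG configurations, then evaluate each one in a parallel remote task. The config fields and the `evaluate` body are placeholders, not DREAM's actual code; the real pipeline would build indexes with LlamaIndex, score with Ragas, and log results to MLflow.

```python
# Minimal illustration of fanning out RAG experiment configs with Ray.
# The config grid and evaluate() body are placeholders for DREAM's real
# pipeline (LlamaIndex retrieval, Ragas scoring, MLflow tracking).
from itertools import product


def build_config_grid(chunk_sizes, top_ks):
    """Enumerate one experiment config per (chunk_size, top_k) pair."""
    return [{"chunk_size": c, "top_k": k} for c, k in product(chunk_sizes, top_ks)]


def run_experiments(configs):
    """Evaluate every config in parallel on a Ray cluster."""
    import ray  # assumed available in the cluster image

    @ray.remote
    def evaluate(config):
        # Placeholder for: build index, run queries, score with Ragas.
        return {**config, "score": 0.0}

    ray.init(ignore_reinit_error=True)
    try:
        return ray.get([evaluate.remote(c) for c in configs])
    finally:
        ray.shutdown()
```

Because each `evaluate.remote(...)` call returns immediately with a future, Ray schedules the evaluations across the cluster and `ray.get` gathers the results for tracking.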

Model Serving

(WIP)

Vector Ingestion

(WIP)

End-to-end RAG Evaluation

(WIP)

Model Monitoring

(WIP)