RAG With MinIO

In this Repo, we will demonstrate how to use MinIO to build a Retrieval Augmented Generation(RAG) based chat application using commodity hardware.

Use MinIO to store all the documents, processed chunks and the embeddings using the vector database.
Use MinIO's bucket notification feature to trigger events when adding or removing documents to a bucket
Webhook that consumes the event and process the documents using Langchain and saves the metadata and chunked documents to a metadata bucket
Trigger MinIO bucket notification events for newly added or removed chunked documents
A Webhook that consumes the events and generates embeddings and save it to the Vector Database (LanceDB) that is persisted in MinIO

Architecture

MinIO - Object Store to persist all the Data
LanceDB - Serverless open-source Vector Database that persists data in object store
Ollama - To run LLM and embedding model locally (OpenAI API compatible)
Gradio - Interface through which to interact with RAG application
FastAPI - Server for the Webhooks that receives bucket notification from MinIO and exposes the Gradio App
LangChain & Unstructured - To Extract useful text from our documents and Chunk them for Embedding

LLM - Phi-3-128K (3.8B Parameters)
Embeddings - Nomic Embed Text v1.5 (Matryoshka Embeddings/ 768 Dim, 8K context)

Install the required packages using the following command:

pip install -r requirements.txt

You can follow the step by step described in the Notebook to run the application.