Open-Ollama-RAG-ChatApp

Retrieval-Augmented Generation Chat Bot using Ollama, Langchain and Gradio.

Primary language: Jupyter Notebook. License: MIT.

Idea

The notebook is a proof of concept for building a retrieval-augmented generation chatbot with Ollama, Langchain and Gradio. The chatbot is built from the following components:

  • Ollama is used as the backend to host large language models and provide an API to interact with them.
  • Langchain is used as the library to split the provided markdown files into chunks and embed them using Ollama. The embeddings are stored in a Chroma database.
  • Gradio is used to provide a simple chat interface to interact with the RAG-Chatbot.
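
The retrieval flow behind these components can be sketched in plain Python. This is a toy illustration, not the notebook's code: word-count vectors stand in for the Ollama embeddings, an in-memory list stands in for the Chroma database, and every function name here (split_markdown, embed, retrieve, build_prompt) is hypothetical.

```python
import math
import re
from collections import Counter


def split_markdown(text: str, max_chars: int = 200) -> list[str]:
    """Split markdown at headings, then cut long sections to max_chars."""
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into one prompt for the model."""
    joined = "\n\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


doc = (
    "# Setup\nInstall the packages with poetry.\n\n"
    "# Usage\nRun the notebook and open the chat window."
)
chunks = split_markdown(doc)
top = retrieve("How do I install the packages?", chunks)
prompt = build_prompt("How do I install the packages?", top)
print(prompt)
```

In the actual app, the prompt would be sent to a model hosted by Ollama and the answer shown in the Gradio chat interface; the sketch stops before that step.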

RAG-ChatAPP Architecture

Installation

Prerequisites

  • Ollama to host the language models.
  • Miniconda or another conda distribution (optional but recommended).
  • Poetry to install the required Python packages (optional but recommended).
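
Before running the notebook, it can help to confirm that the Ollama backend is reachable. The following is a minimal sketch using only the standard library. It assumes Ollama listens on its default local address (http://localhost:11434) and queries the /api/tags endpoint, which lists locally pulled models; the helper name list_local_models is hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port.
OLLAMA_URL = "http://localhost:11434"


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 2.0):
    """Return the names of locally pulled models, or None if the backend is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON.
        return None
    return [model["name"] for model in payload.get("models", [])]


models = list_local_models()
if models is None:
    print("Ollama is not reachable; start the backend first.")
else:
    print("Available models:", models)
```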

Steps

  1. Pull the desired model for Ollama and start the Ollama backend using the following commands:
     # change the model to the desired model name -> see https://ollama.com/library for other models
     ollama pull llama2:chat
     ollama start
  2. Create and activate a virtual environment using conda:
     # create env
     conda create -n open_rag_chat python=3.11
     # activate env
     conda activate open_rag_chat
  3. Install the required packages using poetry:
     poetry install
  4. On the first run, set initial_db = True. This creates new embeddings for the provided markdown files and a new Chroma db in the given path (DATA_PATH = "data/").
  5. Drop your own markdown files in the data/ folder.
  6. Run the notebook.
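
The role of initial_db and DATA_PATH in the last steps can be sketched as follows. This is an illustration under assumptions rather than the notebook's implementation: only the names initial_db and DATA_PATH come from the steps above, the helpers load_markdown_files and plan_run are hypothetical, and the embedding and Chroma calls are left out.

```python
from pathlib import Path

DATA_PATH = "data/"  # folder named in the steps above
initial_db = True    # set to False on later runs to reuse existing embeddings


def load_markdown_files(data_path: str) -> dict[str, str]:
    """Read every .md file below data_path into a {relative path: text} mapping."""
    root = Path(data_path)
    if not root.is_dir():
        return {}
    return {
        str(path.relative_to(root)): path.read_text(encoding="utf-8")
        for path in sorted(root.rglob("*.md"))
    }


def plan_run(initial_db: bool, data_path: str) -> str:
    """Describe what a run would do (hypothetical helper, not notebook code)."""
    if not initial_db:
        return "reuse the existing Chroma database"
    documents = load_markdown_files(data_path)
    if not documents:
        return f"nothing to embed: no markdown files found in {data_path}"
    return f"embed {len(documents)} markdown file(s) and create a new Chroma database"


print(plan_run(initial_db, DATA_PATH))
```

The sketch suggests placing your markdown files in data/ before the first run with initial_db = True, since the embeddings are built from whatever is in that folder at that point.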

References