Ollama RAG

NOTE: Developed on a Windows PC with three NVIDIA RTX A6000 GPUs.

Prerequisites

  1. Install Linux on Windows with WSL
    Follow the instructions HERE

  2. Install CUDA Tooling for Ubuntu on WSL2
    Follow the instructions HERE

  3. Verify Drivers Installation

    nvidia-smi
    
  4. Set Up Python Virtual Environment

  • Create and activate a Python 3.10 virtual environment, then install the dependencies

    python -m venv [name_of_venv]
    .\[name_of_venv]\Scripts\activate    (PowerShell)
    source [name_of_venv]/bin/activate   (WSL/Linux)
    pip install -r requirements.txt

Installing and Running Models with Ollama (WINDOWS 🪟)

  1. Download Ollama for Windows
  • Download Ollama for Windows HERE

NOTE: If your virtual environment was created on Windows, make sure you pull the Ollama models from a PowerShell terminal.

  2. Download the Model of Your Choice
  • From a PowerShell terminal, download a model from HERE (use ollama pull instead of ollama run to download without starting an interactive session)

    ollama run "name_of_your_model"

  3. Verify the Downloaded Model
  • Verify the model and note its exact tag (e.g., mistral:latest)

    ollama list

  4. Determine Your WSL IP Address
  • From the WSL terminal, determine your WSL IP address (look under the eth# interface)

  • The IP address listed will be used to host and interact with your chromadb/chroma Docker container

    ip a

Installing and Running Models with Ollama (Unix 🐧)

  1. Download Ollama for WSL
  • Open WSL terminal and run the following command:

    curl -fsSL https://ollama.com/install.sh | sh
  2. Download the Model of Your Choice
  • From the WSL terminal, download a model from HERE (use ollama pull instead of ollama run to download without starting an interactive session)

    ollama run "name_of_your_model"

  3. Verify the Downloaded Model
  • Verify the model and note its exact tag (e.g., mistral:latest)

    ollama list

  4. Determine Your WSL IP Address
  • From the WSL terminal, determine your WSL IP address (look under the eth# interface)

    ip a

Setting Up Ollama RAG

  1. Upload Data
  • Open your preferred IDE.
  • Create a data directory.
  • Upload PDFs into the data directory.
  2. Modify Configuration Files
  • In your IDE, open chroma_client.py:

    • Replace "YOUR_WSL_IP_GOES_HERE" with your WSL IP.
  • Modify rag_query.py:

    • Replace "YOUR_WSL_IP_GOES_HERE" with your WSL IP.
    • Replace "YOUR_OLLAMA_MODEL_GOES_HERE" with your downloaded Ollama model (e.g., mistral:latest).
  3. Save All Changes
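The placeholder substitutions above might look like the following inside chroma_client.py. This is an illustrative sketch only — the actual variable names in the file may differ, and the values shown are placeholders, not real settings:

```python
import chromadb

# Placeholder values -- substitute your own WSL IP (from `ip a`) and model tag.
WSL_IP = "YOUR_WSL_IP_GOES_HERE"
OLLAMA_MODEL = "YOUR_OLLAMA_MODEL_GOES_HERE"  # e.g. "mistral:latest", used in rag_query.py

# Connect to the Chroma server exposed by the Docker container on port 8000.
client = chromadb.HttpClient(host=WSL_IP, port=8000)
```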

Running the Project

  1. Load and Split Documents

    • Load PDF docs from the data directory and split each into chunks.

      python loader.py
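The splitting step that loader.py performs can be illustrated with a minimal fixed-size chunker. This is a simplification for intuition — the actual script may use a library splitter (e.g., from LangChain) rather than this hypothetical helper:

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context spanning a
    chunk boundary is not lost when chunks are embedded independently."""
    chunks = []
    step = chunk_size - overlap  # advance less than a full chunk to create overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 1200-character document with 500-char chunks and 50-char overlap
# yields chunks starting at offsets 0, 450, and 900.
chunks = split_into_chunks("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks))  # 3
```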
  2. Initialize Docker Container

    • From the WSL terminal, pull and start the chromadb/chroma container from Docker Hub (the API listens on port 8000)

      sudo docker run -p 8000:8000 chromadb/chroma
  3. Create Vector Database

    • Embed the documents from the data directory and store them in the vector database

      python chroma_client.py
    • Note: Each time you modify the documents in the data directory, re-run chroma_client.py.

  4. Launch Interactive RAG System

    python rag_query.py
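Under the hood, a RAG query like the one rag_query.py runs comes down to stuffing the retrieved chunks into a prompt before sending it to the local model. A minimal sketch of that prompt-assembly step (build_rag_prompt is a hypothetical helper shown for illustration, not necessarily what rag_query.py defines):

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine the retrieved context chunks and the user's question
    into a single prompt for the language model."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# The assembled prompt would then go to the local model, e.g. via the Ollama
# Python client: ollama.generate(model="mistral:latest", prompt=prompt)
prompt = build_rag_prompt("What is RAG?", ["Chunk one.", "Chunk two."])
print(prompt.splitlines()[0])  # Answer the question using only the context below.
```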

References/Inspiration