Local Chat RAG

A local, privacy-first Retrieval-Augmented Generation (RAG) chat app. Upload documents, ask questions, and get answers with sources—powered by open-source LLMs running on your own machine.


Features

  • Local RAG Pipeline: No cloud, no data leaks—everything runs on your machine.
  • File Upload & Parsing: Supports DOCX, PDF, and more (via unstructured, python-docx, pdfplumber).
  • Modern UI: Gemini-style, minimal, and accessible. Built with Vite, React, TypeScript, Zustand, Chakra UI.
  • Chat with Sources: Ask questions and see which documents/sections the answer comes from.
  • FastAPI Backend: Robust API, clean separation from frontend, /api route organization.
  • Ollama LLM Integration: Use open-source models (Mistral, Llama2, etc.) locally via Ollama.
  • Extensible & Documented: Modular, testable code with strict documentation and change management policies.

Quick Start

1. Requirements

  • Python 3.9+
  • Node.js 18+
  • Ollama (for local LLMs and embeddings)

🚀 Quick Setup Checklist

  1. Install Ollama (for local LLM and embeddings)
  2. Pull required models: mistral (chat) and nomic-embed-text (embeddings)
  3. Start the Ollama server: ollama serve (it must be running for the backend to work)
  4. Set up backend (Python, FastAPI)
  5. Set up frontend (Node.js, Vite)
  6. Open the app in your browser: http://localhost:5173

2. Install Ollama and Required Models

  • Download Ollama: ollama.com/download (macOS, Windows, Linux)
  • Or via Homebrew (macOS):
    brew install ollama
  • Start the Ollama server (it must be running for the backend to work):
    ollama serve
  • Pull required models:
    ollama pull mistral
    ollama pull nomic-embed-text
    • mistral: Used for chat and answering questions
    • nomic-embed-text: Used for document embeddings

    You can substitute any compatible model (e.g. llama3, llama2) for mistral, but the backend defaults to mistral. To verify your setup, see the check below.
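
To confirm the server is running and both models are available, you can query Ollama's local REST API (it listens on http://localhost:11434 by default). A minimal sanity check in Python:

    import requests

    # Ollama lists locally available models at /api/tags.
    tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
    names = {m["name"] for m in tags.get("models", [])}
    for model in ("mistral", "nomic-embed-text"):
        # Pulled models are usually tagged ":latest", so match on the name prefix.
        status = "ok" if any(n.startswith(model) for n in names) else "MISSING"
        print(f"{model}: {status}")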

3. Backend Setup (FastAPI)

  • Create and activate a virtual environment:
    cd backend
    python -m venv .venv
    source .venv/bin/activate
  • Install Python dependencies:
    pip install --upgrade pip
    pip install -r requirements.txt
  • Start the backend server:
    uvicorn app.main:app --reload
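
Once uvicorn is running, FastAPI serves interactive API docs at http://localhost:8000/docs out of the box (8000 is uvicorn's default port), which makes a handy smoke test:

    import requests

    # FastAPI auto-generates interactive docs at /docs; a 200 here means the server is up.
    resp = requests.get("http://localhost:8000/docs", timeout=5)
    print("backend up" if resp.ok else f"backend returned {resp.status_code}")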

4. Frontend Setup (Vite)

  • Install Node.js dependencies:
    cd frontend
    npm install
  • Start the frontend dev server:
    npm run dev


How to Use the App

  • Open http://localhost:5173 in your browser.
  • Upload your files using the sidebar.
  • Ask questions in the chat box; answers are generated using your documents as context and cite their sources (deduplicated by file). See the example request after this list.
  • All processing is local—your data never leaves your device.
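
You can also exercise the backend directly. The exact route and payload live in backend/app/main.py; the /api/chat endpoint and request shape below are assumptions for illustration, not the app's confirmed API:

    import requests

    # NOTE: endpoint name and payload shape are hypothetical; check
    # backend/app/main.py for the real /api routes.
    resp = requests.post(
        "http://localhost:8000/api/chat",
        json={"question": "What are the key findings in my report?"},
        timeout=120,  # local LLM generation can be slow on the first call
    )
    print(resp.json())  # expected to include the answer and its cited sources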

Troubleshooting & Tips

  • Ollama not running or model errors:
    • Make sure ollama serve is running in a terminal window before starting the backend.
    • Ensure you have pulled both the mistral and nomic-embed-text models.
    • You can check running models with ollama list.
  • Python dependency errors:
    • Make sure your virtual environment is activated and pip is up to date.
  • Node/npm errors:
    • Use Node.js 18+ and delete/reinstall node_modules if issues persist.
  • PDF/DOCX parsing errors:
    • Install libmagic and poppler-utils (see backend/gotchas.md and the import check below).
  • For more help: see backend/implementation_details.md, gotchas.md, and quick_reference.md.
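
For the parsing errors above, a quick import check inside the activated backend virtualenv tells you whether the system libraries are in place (this assumes python-magic is among the backend requirements, as it typically is alongside unstructured):

    # Run inside the activated backend virtualenv.
    import docx        # python-docx, for DOCX parsing
    import pdfplumber  # PDF parsing
    import magic       # python-magic; raises ImportError if libmagic is missing

    print("parsing dependencies import cleanly")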

Architecture

  • Frontend: Vite + React + TypeScript + Zustand + Chakra UI
  • Backend: FastAPI + SQLAlchemy + LangChain + ChromaDB + Unstructured
  • LLM: Ollama (Mistral, Llama2, etc.) via langchain-ollama
  • RAG Pipeline: Chunking, embedding, retrieval, and chat with sources
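
The real pipeline lives in backend/app/rag/, but the flow is roughly the following. This is a minimal sketch using langchain-ollama, Chroma, and nomic-embed-text; the chunk sizes, file name, and persist directory are illustrative, not the app's actual settings:

    from langchain_ollama import OllamaEmbeddings, OllamaLLM
    from langchain_chroma import Chroma
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # 1. Chunking: split raw document text into overlapping pieces.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    chunks = splitter.split_text(open("example.txt").read())

    # 2. Embedding: encode chunks with nomic-embed-text into a local Chroma store.
    store = Chroma.from_texts(
        chunks,
        OllamaEmbeddings(model="nomic-embed-text"),
        persist_directory=".chroma",
    )

    # 3. Retrieval: fetch the chunks most similar to the question.
    question = "What does the document say about pricing?"
    docs = store.similarity_search(question, k=4)
    context = "\n\n".join(d.page_content for d in docs)

    # 4. Chat: have the local LLM answer from the retrieved context.
    llm = OllamaLLM(model="mistral")
    print(llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))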

Folder Structure

ChatRAG/
  backend/
    app/
      main.py           # FastAPI app & API endpoints
      db/               # Database models & session
      rag/              # RAG pipeline logic
    requirements.txt
    ...
  frontend/
    src/
      components/       # UI components (Chat, Files, Layout)
      state/            # Zustand stores
      ...
    vite.config.ts
    ...

Customization

  • Change LLM Model: Edit the model name in backend/app/main.py (OllamaLLM(model="mistral")); see the sketch after this list.
  • Add File Types: Extend file parsing in the backend pipeline.
  • UI/UX: Tweak Chakra UI theme or component structure in frontend/src/components.
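
For example, after ollama pull llama3, switching the chat model would look like this (the surrounding code in backend/app/main.py may differ; only the model argument matters here):

    from langchain_ollama import OllamaLLM

    # Swap the model name; any model pulled into Ollama works here.
    llm = OllamaLLM(model="llama3")  # was: OllamaLLM(model="mistral")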

Documentation & Policies

  • All operational quirks, architecture decisions, and gotchas are logged in backend/implementation_details.md, gotchas.md, and quick_reference.md.
  • Strict documentation and code quality policies are followed—see project docs for details.


License

MIT