A local, privacy-first Retrieval-Augmented Generation (RAG) chat app. Upload documents, ask questions, and get answers with sources—powered by open-source LLMs running on your own machine.
- Local RAG Pipeline: No cloud, no data leaks—everything runs on your machine.
- File Upload & Parsing: Supports DOCX, PDF, and more (via `unstructured`, `python-docx`, `pdfplumber`); a minimal parsing sketch follows this list.
- Modern UI: Gemini-style, minimal, and accessible. Built with Vite, React, TypeScript, Zustand, Chakra UI.
- Chat with Sources: Ask questions and see which documents/sections the answer comes from.
- FastAPI Backend: Robust API, clean separation from frontend, `/api` route organization.
- Ollama LLM Integration: Use open-source models (Mistral, Llama2, etc.) locally via Ollama.
- Extensible & Documented: Modular, testable code with strict documentation and change management policies.
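
How file parsing roughly works, as referenced in the feature list: a minimal sketch using `pdfplumber` and `python-docx` (the project's actual pipeline also relies on `unstructured`; `extract_text` here is a hypothetical helper, not code from this repo):

```python
# Hypothetical DOCX/PDF text extraction, shown for orientation only.
from pathlib import Path

import pdfplumber          # PDF text extraction
from docx import Document  # python-docx, for .docx files

def extract_text(path: str) -> str:
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        with pdfplumber.open(path) as pdf:
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if suffix == ".docx":
        return "\n".join(p.text for p in Document(path).paragraphs)
    raise ValueError(f"Unsupported file type: {suffix}")
```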
- Python 3.9+
- Node.js 18+
- Ollama (for local LLMs and embeddings)
- Install Ollama (for local LLM and embeddings)
- Pull required models: `mistral` (chat) and `nomic-embed-text` (embeddings)
- Start the Ollama server: `ollama serve` (must be running for the backend to work)
- Set up backend (Python, FastAPI)
- Set up frontend (Node.js, Vite)
- Open the app in your browser: http://localhost:5173
- Download Ollama: ollama.com/download (macOS, Windows, Linux)
- Or via Homebrew (macOS): `brew install ollama`
- Start the Ollama server (must be running for the backend to work): `ollama serve`
- Pull the required models:

```bash
ollama pull mistral
ollama pull nomic-embed-text
```

- `mistral`: used for chat and answering questions
- `nomic-embed-text`: used for document embeddings

You can substitute `mistral` with any compatible model (e.g. `llama3`, `llama2`), but the backend defaults to `mistral`.
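
To confirm both models respond before starting the backend, a quick check from Python (assumes `langchain-ollama` is installed and `ollama serve` is running; this snippet is not part of the project code):

```python
from langchain_ollama import OllamaEmbeddings, OllamaLLM

# Chat model: should print a short completion.
print(OllamaLLM(model="mistral").invoke("Say hello in five words."))

# Embedding model: should print the embedding dimension.
print(len(OllamaEmbeddings(model="nomic-embed-text").embed_query("hello")))
```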
- Create and activate a virtual environment:

```bash
cd backend
python -m venv .venv
source .venv/bin/activate
```

- Install Python dependencies:

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

- Start the backend server:

```bash
uvicorn app.main:app --reload
```

- The backend API will be available at: http://localhost:8000/api
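
To verify the backend is reachable, you can query FastAPI's auto-generated OpenAPI schema (served at `/openapi.json` by default); the project's own routes under `/api` are defined in `backend/app/main.py` and not assumed here. A quick check using `requests` (install it separately if needed):

```python
import requests

resp = requests.get("http://localhost:8000/openapi.json", timeout=5)
resp.raise_for_status()
print(sorted(resp.json()["paths"]))  # lists the registered API routes
```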
- Install Node.js dependencies:

```bash
cd frontend
npm install
```

- Start the frontend dev server:

```bash
npm run dev
```

- The frontend app will be available at: http://localhost:5173
- Ollama installed
- `mistral` and `nomic-embed-text` models pulled
- `ollama serve` running
- Backend running at http://localhost:8000/api
- Frontend running at http://localhost:5173
- Open http://localhost:5173 in your browser.
- Upload your files using the sidebar.
- Ask questions in the chat box; answers will cite document sources.
- All processing is local—your data never leaves your device.
- Ollama not running or model errors:
  - Make sure `ollama serve` is running in a terminal window before starting the backend.
  - Ensure you have pulled both the `mistral` and `nomic-embed-text` models.
  - You can check installed models with `ollama list`.
- Python dependency errors:
  - Make sure your virtual environment is activated and `pip` is up to date.
- Node/npm errors:
  - Use Node.js 18+ and delete/reinstall `node_modules` if issues persist.
- PDF/DOCX parsing errors:
  - Install `libmagic` and `poppler-utils` (see backend gotchas).
- For more help: see `backend/implementation_details.md`, `gotchas.md`, and `quick_reference.md`.
- Upload files in the sidebar.
- Ask questions in the chat—answers are generated using your documents as context.
- Sources are shown for every answer (deduplicated by file).
- All processing is local—your data never leaves your device.
- Frontend: Vite + React + TypeScript + Zustand + Chakra UI
- Backend: FastAPI + SQLAlchemy + LangChain + ChromaDB + Unstructured
- LLM: Ollama (Mistral, Llama2, etc.) via `langchain-ollama`
- RAG Pipeline: Chunking, embedding, retrieval, and chat with sources
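
For orientation, a minimal sketch of that pipeline with the stack above (import paths assume recent `langchain-*` packages; chunk sizes are illustrative, and the project's real implementation lives in `backend/app/rag/` and will differ):

```python
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings, OllamaLLM
from langchain_text_splitters import RecursiveCharacterTextSplitter

def answer(question: str, raw_text: str) -> str:
    # 1. Chunk the uploaded document text (sizes are illustrative).
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(raw_text)

    # 2. Embed chunks with nomic-embed-text and index them in ChromaDB.
    store = Chroma.from_texts(chunks, OllamaEmbeddings(model="nomic-embed-text"))

    # 3. Retrieve the chunks most relevant to the question.
    docs = store.as_retriever(search_kwargs={"k": 4}).invoke(question)
    context = "\n\n".join(doc.page_content for doc in docs)

    # 4. Have the local LLM answer using only the retrieved context.
    llm = OllamaLLM(model="mistral")
    return llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```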
```
ChatRAG/
  backend/
    app/
      main.py            # FastAPI app & API endpoints
      db/                # Database models & session
      rag/               # RAG pipeline logic
    requirements.txt
    ...
  frontend/
    src/
      components/        # UI components (Chat, Files, Layout)
      state/             # Zustand stores
      ...
    vite.config.ts
    ...
```
- Change LLM Model: Edit the model name in `backend/app/main.py` (`OllamaLLM(model="mistral")`); see the sketch below.
- Add File Types: Extend file parsing in the backend pipeline.
- UI/UX: Tweak the Chakra UI theme or component structure in `frontend/src/components`.
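
For example, switching the chat model to `llama3` is a one-line change in `backend/app/main.py` (pull the model first with `ollama pull llama3`; the variable name and surrounding code here are illustrative, not the repo's exact source):

```python
from langchain_ollama import OllamaLLM

llm = OllamaLLM(model="llama3")  # project default is "mistral"
```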
- All operational quirks, architecture decisions, and gotchas are logged in `backend/implementation_details.md`, `gotchas.md`, and `quick_reference.md`.
- Strict documentation and code quality policies are followed—see project docs for details.
- Built by Tarek Adam Mustafa and contributors.
- Powered by open-source: Ollama, LangChain, ChromaDB, Unstructured, Chakra UI, Vite.