
📚 Local RAG

[Demo animation: local-rag-demo]


Offline, Open-Source RAG

Ingest files for retrieval augmented generation (RAG) with open-source Large Language Models (LLMs), all without third-party services or sensitive data leaving your network.

  • Offline Embedding & LLM Support (No OpenAI!)
  • Streaming Responses
  • Conversation Memory
  • Chat Export

Prerequisites

  • A pre-existing Ollama instance (a quick connectivity check is sketched after this list)
  • Python 3.10+
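
If you want to confirm your Ollama instance is reachable before starting, a minimal check like the one below works; it assumes the default endpoint `http://localhost:11434` and the `requests` library, so adjust for your own setup.

```python
# Quick, optional check that an Ollama instance is reachable.
# Assumes the default endpoint http://localhost:11434; change OLLAMA_URL if yours differs.
import requests

OLLAMA_URL = "http://localhost:11434"

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()

# /api/tags lists the models currently available on the instance.
models = [m["name"] for m in resp.json().get("models", [])]
print(f"Ollama is up; available models: {models}")
```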

Setup

Local:

  • pip install pipenv && pipenv install
  • pipenv shell && streamlit run main.py

Docker:

  • docker compose up -d

Usage

  • Set your Ollama endpoint and model under Settings
  • Upload your documents for processing
  • Once processing completes, ask questions based on your documents! (A sketch of the underlying retrieval flow follows.)
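
For reference, here is a minimal sketch of the retrieval flow behind these steps, assuming llama-index (0.10+ module layout) with Ollama-backed models. The endpoint, model names (`llama3`, `nomic-embed-text`), and the `docs` directory are placeholders and may differ from the app's actual settings.

```python
# Minimal RAG sketch: Ollama for both the LLM and embeddings, llama-index for indexing/retrieval.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Point both the LLM and the embedding model at your Ollama endpoint.
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text", base_url="http://localhost:11434")

# Embed the uploaded documents and build an in-memory vector index.
documents = SimpleDirectoryReader("docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Ask questions grounded in the indexed documents, streaming the answer back.
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What do these documents say about X?")
response.print_response_stream()
```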

To Do

  • Refactor into modules
  • Refactor file processing logic
  • Migrate Chat Stream to Llama-Index
  • Implement Llama-Index Chat Engine with Memory (see the sketch after this list)
  • Swap to Llama-Index Chat Engine
  • Function to Handle File Embeddings
  • Allow Users to Set LLM Settings
    • System Prompt
    • Chat Mode
    • top_k
    • chunk_size
    • chunk_overlap
  • Allow Switching of Embedding Model & Settings
  • Delete Files after Index Created/Failed
  • Support Additional Import Options
    • GitHub Repos
    • Websites
  • Remove File Type Limitations for Uploads
  • Show Loaders in UI (File Uploads, Conversions, ...)
  • Export Data (Uploaded Files, Chat History, ...)
  • View and Manage Imported Files
  • About Tab in Sidebar
  • Docker Support
  • Implement Log Library
  • Improve Logging
  • Re-write Docstrings
  • Additional Error Handling
    • Starting a chat without an Ollama model set
    • Incorrect GitHub repos
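
One possible shape of the planned Llama-Index chat engine with memory is sketched below. This is a hypothetical sketch only; it assumes the Ollama-backed `Settings` and the `index` from the Usage example above are already configured, and the token limit and prompt are placeholders.

```python
# Sketch of a llama-index chat engine with conversation memory over an existing index.
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=4096)

chat_engine = index.as_chat_engine(
    chat_mode="context",  # retrieve document context on every turn
    memory=memory,        # keep prior turns so follow-up questions work
    system_prompt="Answer using only the provided document context.",
)

# Streaming keeps the UI responsive while tokens arrive.
streaming_response = chat_engine.stream_chat("Summarise the uploaded documents.")
for token in streaming_response.response_gen:
    print(token, end="", flush=True)
```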

Known Issues & Bugs

  • Refreshing the page loses all state (expected Streamlit behavior; local storage needs to be implemented)
  • Files can be uploaded before the Ollama config is set, leading to embedding errors (a possible guard is sketched below)
  • When Ollama is hosted on localhost, models are automatically loaded and selected, but the dropdown does not render the selected option
  • Sending a chat message causes the File Processing expander to re-run (likely a state-handling issue)
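
For the upload-before-config issue, one possible guard is to gate the uploader on session state, as in the hypothetical Streamlit snippet below; the widget labels and the `ollama_endpoint` key are illustrative and not the app's actual identifiers.

```python
# Hypothetical guard: only show the uploader once an Ollama endpoint has been provided.
import streamlit as st

if "ollama_endpoint" not in st.session_state:
    st.session_state["ollama_endpoint"] = ""

st.session_state["ollama_endpoint"] = st.text_input(
    "Ollama endpoint", value=st.session_state["ollama_endpoint"]
)

if st.session_state["ollama_endpoint"]:
    uploaded_files = st.file_uploader("Upload documents", accept_multiple_files=True)
else:
    st.warning("Set your Ollama endpoint before uploading documents.")
```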

Resources