
📚 Local RAG

[Demo animation: local-rag-demo]


Offline, Open-Source RAG

Ingest files for retrieval augmented generation (RAG) with open-source Large Language Models (LLMs), all without third-party services or sensitive data leaving your network.

  • Offline Embedding & LLM Support (No OpenAI!)
  • Streaming Responses
  • Conversation Memory
  • Chat Export

Prerequisites

  • A pre-existing Ollama instance (a quick connectivity check is sketched after this list)
  • Python 3.10+
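
If you want to confirm your Ollama instance is reachable before starting, a minimal check like the one below works; it assumes the default endpoint `http://localhost:11434` and the `requests` library, so adjust for your own setup.

```python
# Quick, optional check that an Ollama instance is reachable.
# Assumes the default endpoint http://localhost:11434; change OLLAMA_URL if yours differs.
import requests

OLLAMA_URL = "http://localhost:11434"

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()

# /api/tags lists the models currently available on the instance.
models = [m["name"] for m in resp.json().get("models", [])]
print(f"Ollama is up; available models: {models}")
```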

Setup

Local:

  • pip install pipenv && pipenv install
  • pipenv shell && streamlit run main.py

Docker:

  • docker compose up -d

Usage

  • Set your Ollama endpoint and model under Settings
  • Upload your documents for processing
  • Once processing completes, ask questions based on your documents! (A sketch of the underlying retrieval flow follows.)
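
For reference, here is a minimal sketch of the retrieval flow behind these steps, assuming llama-index (0.10+ module layout) with Ollama-backed models. The endpoint, model names (`llama3`, `nomic-embed-text`), and the `docs` directory are placeholders and may differ from the app's actual settings.

```python
# Minimal RAG sketch: Ollama for both the LLM and embeddings, llama-index for indexing/retrieval.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Point both the LLM and the embedding model at your Ollama endpoint.
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text", base_url="http://localhost:11434")

# Embed the uploaded documents and build an in-memory vector index.
documents = SimpleDirectoryReader("docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Ask questions grounded in the indexed documents, streaming the answer back.
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What do these documents say about X?")
response.print_response_stream()
```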

To Do

  • Refactor into modules
  • Refactor file processing logic
  • Migrate Chat Stream to Llama-Index
  • Implement Llama-Index Chat Engine with Memory (see the sketch after this list)
  • Swap to Llama-Index Chat Engine
  • Function to Handle File Embeddings
  • Allow Users to Set LLM Settings
    • System Prompt
    • Chat Mode
    • top_k
    • chunk_size
    • chunk_overlap
  • Allow Switching of Embedding Model & Settings
  • Delete Files after Index Created/Failed
  • Support Additional Import Options
    • GitHub Repos
    • Websites
  • Remove File Type Limitations for Uploads
  • Show Loaders in UI (File Uploads, Conversions, ...)
  • Export Data (Uploaded Files, Chat History, ...)
  • View and Manage Imported Files
  • About Tab in Sidebar
  • Docker Support
  • Implement Log Library
  • Improve Logging
  • Re-write Docstrings
  • Additional Error Handling
    • Starting a chat without an Ollama model set
    • Incorrect GitHub repos
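
One possible shape of the planned Llama-Index chat engine with memory is sketched below. This is a hypothetical sketch only; it assumes the Ollama-backed `Settings` and the `index` from the Usage example above are already configured, and the token limit and prompt are placeholders.

```python
# Sketch of a llama-index chat engine with conversation memory over an existing index.
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=4096)

chat_engine = index.as_chat_engine(
    chat_mode="context",  # retrieve document context on every turn
    memory=memory,        # keep prior turns so follow-up questions work
    system_prompt="Answer using only the provided document context.",
)

# Streaming keeps the UI responsive while tokens arrive.
streaming_response = chat_engine.stream_chat("Summarise the uploaded documents.")
for token in streaming_response.response_gen:
    print(token, end="", flush=True)
```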

Known Issues & Bugs

  • Refreshing the page loses all state (expected Streamlit behavior; local storage needs to be implemented)
  • Files can be uploaded before the Ollama config is set, leading to embedding errors (a possible guard is sketched below)
  • When Ollama is hosted on localhost, models are automatically loaded and selected, but the dropdown does not render the selected option
  • Sending a chat message causes the File Processing expander to re-run (likely a state-handling issue)
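
For the upload-before-config issue, one possible guard is to gate the uploader on session state, as in the hypothetical Streamlit snippet below; the widget labels and the `ollama_endpoint` key are illustrative and not the app's actual identifiers.

```python
# Hypothetical guard: only show the uploader once an Ollama endpoint has been provided.
import streamlit as st

if "ollama_endpoint" not in st.session_state:
    st.session_state["ollama_endpoint"] = ""

st.session_state["ollama_endpoint"] = st.text_input(
    "Ollama endpoint", value=st.session_state["ollama_endpoint"]
)

if st.session_state["ollama_endpoint"]:
    uploaded_files = st.file_uploader("Upload documents", accept_multiple_files=True)
else:
    st.warning("Set your Ollama endpoint before uploading documents.")
```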

Resources