Markdown Knowledge RAG


A powerful Retrieval Augmented Generation system that turns your Markdown documentation into an interactive knowledge base using vector search and LLMs.

Markdown Knowledge RAG transforms your Markdown files into a searchable knowledge base using vector embeddings. Ask questions in natural language and get accurate answers based on your documentation content. The system combines Milvus for vector storage with your choice of Ollama's local LLMs or OpenAI's models for embeddings and text generation.

✨ Features

  • Markdown Indexing: Recursively processes Markdown files, with directory exclusion options
  • Flexible AI Backend: Use local Ollama models or OpenAI's API
  • Vector Search: Leverages Milvus for fast similarity search
  • Self-contained Setup: Simple Docker-based deployment
  • Automatic Fallback: Gracefully falls back to local models if OpenAI isn't available
  • Command-line Interface: Easy-to-use CLI for both indexing and querying

🛠️ Tech Stack

  • Vector Database: Milvus
  • Embedding Models: Ollama (DeepSeek) or OpenAI
  • Text Generation: Ollama (DeepSeek) or OpenAI (GPT-3.5/4o)
  • Storage: Docker volumes for persistence
  • Language: Python 3.8+

System Overview

This system consists of two main components:

  1. Markdown Vectorizer: Processes Markdown files, generates embeddings, and stores them in Milvus.
  2. RAG Query System: Retrieves relevant documents and generates answers to questions using LLMs.
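
At a high level, the two components share one pipeline: embed text, store or search vectors in Milvus, then generate. The sketch below illustrates that flow rather than the actual implementation; it assumes the ollama and pymilvus (MilvusClient) Python APIs and a hypothetical collection name markdown_docs:

import ollama
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

# Component 1 (indexing): embed a Markdown chunk and store it with its text.
chunk = "## Setup\nRun docker-compose up -d to start Milvus."
vec = ollama.embeddings(model="deepseek-llm", prompt=chunk)["embedding"]
client.create_collection("markdown_docs", dimension=len(vec))
client.insert("markdown_docs", [{"id": 0, "vector": vec, "text": chunk}])

# Component 2 (querying): embed the question, retrieve similar chunks,
# and have the LLM answer from the retrieved context.
question = "How do I start Milvus?"
qvec = ollama.embeddings(model="deepseek-llm", prompt=question)["embedding"]
hits = client.search("markdown_docs", data=[qvec], limit=3, output_fields=["text"])
context = "\n\n".join(hit["entity"]["text"] for hit in hits[0])
reply = ollama.chat(model="deepseek-llm", messages=[{
    "role": "user",
    "content": f"Answer from this context:\n{context}\n\nQuestion: {question}",
}])
print(reply["message"]["content"])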

Requirements

  • Python 3.8+
  • Docker and Docker Compose (for running Milvus)
  • Ollama (for local LLM processing)
  • OpenAI API key (optional, for using OpenAI models)

Setup Instructions

1. Start Milvus with Docker Compose

Download the Milvus docker-compose.yml file (v2.5.6 at the time of writing) using wget:

mkdir -p data
cd data
wget https://github.com/milvus-io/milvus/releases/download/v2.5.6/milvus-standalone-docker-compose.yml -O docker-compose.yml

Start Milvus:

docker-compose up -d

Verify that Milvus is running:

docker-compose ps

You should see containers running for Milvus standalone, etcd, and MinIO.

2. Install Required Python Packages

pip install pymilvus ollama openai markdown-it-py

3. Install and Start Ollama

Download Ollama from https://ollama.com/ or install it using:

curl -fsSL https://ollama.com/install.sh | sh

Start the Ollama service:

ollama serve

Pull the required models:

ollama pull deepseek-llm

4. Configure Environment (Optional for OpenAI)

If you want to use OpenAI models, set your API key as an environment variable:

export OPENAI_API_KEY="your-api-key-here"

To persist the key across shell sessions, add this line to your .bashrc or .zshrc file.
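
The automatic fallback described under Features presumably reduces to a check like the following at startup; this is a minimal sketch, not the exact logic in ragagent.py:

import os

# Fall back to the local Ollama provider when no OpenAI key is configured.
provider = "openai" if os.environ.get("OPENAI_API_KEY") else "ollama"
print(f"Using provider: {provider}")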

5. Index Your Markdown Files

Use the Markdown Vectorizer to process your documents:

python doc2vec.py --dir /path/to/your/markdown/files --skip node_modules .git dist

Options:

  • --dir or -d: Directory containing Markdown files
  • --skip or -s: Directories to skip (space-separated)
  • --host: Milvus host (default: localhost)
  • --port: Milvus port (default: 19530)
  • --model: Ollama model for embeddings (default: deepseek-llm)
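
For example, several options can be combined in a single invocation (the paths here are illustrative):

python doc2vec.py -d ~/docs -s node_modules .git dist --host localhost --port 19530 --model deepseek-llm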

6. Query Your Knowledge Base

Use the RAG system to ask questions about your indexed documents:

python ragagent.py

Options:

  • --provider or -p: Model provider (ollama or openai, default: ollama)
  • --embedding-model or -em: Model for embeddings
  • --llm-model or -lm: Model for text generation
  • --temperature or -t: Temperature for generation (0.0-1.0, default: 0.7)
  • --results or -r: Number of documents to retrieve (default: 3)
  • --milvus-host: Milvus server host (default: localhost)
  • --milvus-port: Milvus server port (default: 19530)
  • --no-fallback: Disable fallback to Ollama if OpenAI fails
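
For example, a lower-temperature run that retrieves more context (the values are illustrative):

python ragagent.py -p ollama -lm deepseek-llm -t 0.2 -r 5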

Examples

Index a Documentation Repository

python doc2vec.py --dir ~/projects/documentation --skip node_modules .git assets

Query with Ollama

python ragagent.py

Query with OpenAI

# Make sure OPENAI_API_KEY is set
python ragagent.py --provider openai --embedding-model text-embedding-3-small --llm-model gpt-4o

Troubleshooting

  1. Milvus Connection Issues

    • Ensure Milvus containers are running: docker ps
    • Check Milvus logs: docker logs milvus-standalone
    • Verify Milvus is responding: curl http://localhost:9091/healthz (this is the HTTP health endpoint; port 19530 is gRPC and won't answer plain HTTP requests)
  2. Ollama Model Issues

    • Verify Ollama is running: ps aux | grep ollama
    • Check available models: ollama list
    • Retry pulling models with: ollama pull model-name
  3. OpenAI API Issues

    • Verify your API key is correctly set: echo $OPENAI_API_KEY
    • Check for API rate limits or quota issues
  4. Vector Dimension Mismatch

    • If you encounter errors about vector dimensions, ensure you're using the same embedding model for indexing and querying
    • If necessary, drop the collection and re-index with the desired model
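
For the dimension-mismatch case, dropping the collection takes only a few lines of pymilvus. This sketch assumes the MilvusClient API and a hypothetical collection name markdown_docs; substitute whatever collection_name your setup indexed under:

from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")
if client.has_collection("markdown_docs"):  # hypothetical collection name
    client.drop_collection("markdown_docs")
# Afterwards, re-index: python doc2vec.py --dir /path/to/your/markdown/files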

Maintenance

  • Updating the Knowledge Base: Re-run the vectorizer script when your Markdown files change
  • Changing Models: Ensure consistent embedding dimensions or recreate the collection
  • Updating Milvus: To update to the latest Milvus version:
    cd data
    docker-compose down
    wget https://github.com/milvus-io/milvus/releases/latest/download/milvus-standalone-docker-compose.yml -O docker-compose.yml
    docker-compose up -d

Advanced Configuration

Custom Milvus Collection Name

To use a different collection name, modify the collection_name parameter in the MilvusCollection or MilvusManager classes.

Adjusting Vector Dimensions

Different embedding models produce vectors of different dimensions:

  • OpenAI text-embedding-3-small: 1536 dimensions
  • OpenAI text-embedding-3-large: 3072 dimensions
  • deepseek-llm and most 7B-class Ollama models: 4096 dimensions (dedicated embedding models such as nomic-embed-text use fewer)

The system will automatically detect the dimension based on the selected model.
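
One way to confirm a model's dimension yourself is to embed a probe string and measure the vector length; a small sketch assuming the ollama and openai Python clients:

import ollama
from openai import OpenAI

# Local model: the dimension is simply the length of the returned embedding.
print(len(ollama.embeddings(model="deepseek-llm", prompt="probe")["embedding"]))  # 4096

# OpenAI model (requires OPENAI_API_KEY to be set).
resp = OpenAI().embeddings.create(model="text-embedding-3-small", input="probe")
print(len(resp.data[0].embedding))  # 1536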