A sophisticated Streamlit application that allows users to upload PDF files and interact with them using natural language queries, powered by binary quantization for efficient vector storage and retrieval.
- 📄 Multi-PDF Upload: Upload one or multiple PDF files simultaneously
- 🔧 Binary Quantization: Efficient embedding storage using binary quantization
- 💬 Interactive Chat: Natural language conversation with your PDFs
- ⏱️ Response Time Tracking: Real-time performance metrics in milliseconds
- 📋 PDF Preview: File details including page count and size
- 🗂️ Vector Database: Milvus-powered semantic search
- 🤖 Advanced LLM: Groq integration for fast response generation
- Frontend: Streamlit
- Embeddings: OpenAI text-embedding-3-small
- Vector Database: Milvus with HAMMING distance
- LLM: Groq (moonshotai/kimi-k2-instruct)
- PDF Processing: PyPDF2
- Binary Quantization: NumPy-based optimization
boost-rag-with-binary-quantization/
├── streamlit_main.py # Main Streamlit application
├── embedding.py # Binary quantization embedding logic
├── retriever_llm_index.py # Retrieval and LLM integration
├── requirements.txt # Python dependencies
├── run_app.sh # Application launcher script
├── .env.example # Environment variables template
├── docker-compose.yml # Docker configuration
└── docs/ # PDF documents directory
└── llm.pdf
# Clone or navigate to the project directory
cd boost-rag-with-binary-quantization
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Create a .env file based on .env.example:

cp .env.example .env

Edit .env and add your API keys:
OPENAI_API_KEY=your_openai_api_key_here
GROQ_API_KEY=your_groq_api_key_here
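The app reads these at startup. A minimal sketch of that loading step, assuming python-dotenv is used (the actual loading code may differ):

import os
from dotenv import load_dotenv

load_dotenv()  # loads .env so OPENAI_API_KEY / GROQ_API_KEY appear in os.environ
groq_key = os.environ.get("GROQ_API_KEY")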
./run_app.sh

Or run the app directly:

streamlit run streamlit_main.py
The application will open in your browser at http://localhost:8501
- Use the sidebar to upload one or multiple PDF files
- View file details in the PDF Preview section
- See the number of text chunks extracted from each file
- Click the "🔧 Create Embeddings" button in the sidebar
- Wait for the binary quantization process to complete
- The system will create a Milvus vector database with your content
- Use the chat interface in the main area
- Ask questions about your uploaded PDFs
- View response times for each interaction
- Clear chat history when needed
- Text Extraction: PDFs are processed and split into chunks
- Float32 Embeddings: Generated using OpenAI's text-embedding-3-small
- Binary Conversion: Float values > 0 become 1, others become 0
- Byte Packing: Binary vectors are packed into bytes for storage
- Milvus Storage: Stored with HAMMING distance indexing
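The quantize-and-pack steps above amount to a couple of NumPy operations. A minimal sketch, with an illustrative function name rather than the exact one in embedding.py:

import numpy as np

def binarize_embedding(embedding: list[float]) -> bytes:
    """Quantize a float32 embedding into a packed binary vector."""
    vec = np.asarray(embedding, dtype=np.float32)
    bits = (vec > 0).astype(np.uint8)   # values > 0 -> 1, others -> 0
    return np.packbits(bits).tobytes()  # 8 bits per byte, ready for Milvus storage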
- Storage Efficiency: 32x reduction in storage space
- Query Speed: Faster similarity search with binary operations
- Memory Usage: Significantly reduced RAM requirements
- Scalability: Better performance with large document collections
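As a concrete instance of the 32x figure: a 1536-dimensional text-embedding-3-small vector stored as float32 occupies 1536 × 4 = 6,144 bytes, while its binarized form occupies 1536 bits = 192 bytes, exactly a 32x reduction per vector.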
import os

# Imports assume the LlamaIndex integrations, which these class names and
# the retriever_llm_index.py filename suggest
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.groq import Groq

embedding_model = OpenAIEmbedding(model="text-embedding-3-small")
llm = Groq(
    model="moonshotai/kimi-k2-instruct",
    api_key=os.environ.get("GROQ_API_KEY"),
    temperature=0.5,
    max_tokens=1000,
)
search_params = {"metric_type": "HAMMING"}
limit = 5 # Number of retrieved documents
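A sketch of how the HAMMING-indexed binary collection and search could be wired up with pymilvus; the collection and field names below are illustrative assumptions, not necessarily those used in retriever_llm_index.py:

from pymilvus import DataType, MilvusClient

client = MilvusClient("milvus_data.db")  # Milvus Lite: local file-backed instance

# Collection schema with a binary vector field (1536 dims pack into 192 bytes)
schema = client.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embedding", DataType.BINARY_VECTOR, dim=1536)
schema.add_field("text", DataType.VARCHAR, max_length=65535)

index_params = client.prepare_index_params()
index_params.add_index(field_name="embedding", index_type="BIN_FLAT", metric_type="HAMMING")
client.create_collection("pdf_chunks", schema=schema, index_params=index_params)

# Retrieval: Hamming distance over packed binary query vectors
results = client.search(
    collection_name="pdf_chunks",
    data=[binarize_embedding(query_embedding)],  # query_embedding: float embedding of the question
    limit=5,
    search_params={"metric_type": "HAMMING"},
    output_fields=["text"],
)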
Run with Docker Compose:
docker-compose up -d
The application tracks and displays:
- Response Time: LLM generation time in milliseconds
- Embedding Creation: Progress and completion status
- File Processing: Upload and parsing status
- Vector Search: Retrieval performance
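The Response Time metric above can be captured with a plain wall-clock measurement around the LLM call; a minimal sketch, where generate_response stands in for the app's actual call:

import time
import streamlit as st

start = time.perf_counter()
answer = generate_response(query)  # the app's LLM call, used illustratively
elapsed_ms = (time.perf_counter() - start) * 1000

st.write(answer)
st.caption(f"Response time: {elapsed_ms:.0f} ms")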
- Missing API Keys
  - Ensure the .env file exists with valid API keys
  - Check OpenAI and Groq API key formats
- PDF Processing Errors
  - Verify PDF files are not corrupted
  - Check file size limitations
  - Ensure PDFs are text-extractable (not image-only)
- Vector Database Issues
  - Delete milvus_data.db and recreate embeddings
  - Check disk space availability
  - Verify Milvus dependencies
- Performance Issues
  - Reduce the number of retrieved documents (the limit parameter)
  - Use smaller PDF files for testing
  - Monitor system memory usage
- streamlit_main.py: Main application with UI components
  - Binary quantization functions: Embedding conversion logic
  - Vector store management: Milvus collection handling
  - Chat interface: Message history and response generation

Key functions:

- extract_text_from_pdf(): PDF text extraction
- create_binary_embeddings(): Embedding quantization
- setup_vector_store(): Milvus database setup
- retrieve_context(): Semantic search
- generate_response(): LLM interaction
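For orientation, a minimal PyPDF2 version of the extraction step could look like the following; the actual implementation in streamlit_main.py may differ in chunking and error handling:

from PyPDF2 import PdfReader

def extract_text_from_pdf(uploaded_file) -> str:
    """Concatenate text from every page; image-only pages contribute nothing."""
    reader = PdfReader(uploaded_file)
    return "\n".join(page.extract_text() or "" for page in reader.pages)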
- OpenAI API Key: For text embeddings
  - Get from: https://platform.openai.com/api-keys
  - Used for: text-embedding-3-small model
- Groq API Key: For LLM inference
  - Get from: https://console.groq.com/keys
  - Used for: moonshotai/kimi-k2-instruct model
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is open source. Please check the license file for details.
For issues and questions:
- Check the troubleshooting section
- Review the code documentation
- Open an issue on the repository
Happy Chatting with your PDFs! 🎉