Cohere & Azure AI Search RAG Demo

Welcome to the Cohere & Azure AI Search Retrieval-Augmented Generation (RAG) demo! This project showcases how to leverage Cohere Embed V3 for generating int8 and binary embeddings, significantly reducing memory costs while maintaining high search quality. These embeddings are integrated with Azure AI Search to perform RAG using CommandR+ in Azure AI Studio.

Introduction
Features
Setup
Usage
Contributing
License

Introduction

Semantic search over large datasets can require a lot of memory, which is expensive to host in a vector database. For example, searching across all of Wikipedia’s data requires storing 250 million embeddings. With 1024 dimensions per embedding and each dimension as float32 with 4 bytes, you will need close to 1 TB of memory on a server.

Demo URL: https://cohererag.streamlit.app/

Features

Memory Optimization: Use int8 and binary embeddings to reduce memory usage by up to 32x.
High Search Quality: Maintain 90-99.99% of the original search quality.
Multi-Language Support: Supports multiple languages for diverse applications.
Integration with Azure AI Search: Seamlessly integrate with Azure AI Search for advanced RAG capabilities.
Compression-Friendly Embedding Model Training: Ensures superior search quality even with vector space compression.

Setup

Prerequisites

Python 3.9 or higher
Streamlit
Cohere API Key
Cohere Command R+
Cohere English Embedding Model
Cohere Multilingual Embedding Model
Azure AI Search API Key
Azure AI Studio

Installation

Clone the repository:

git clone https://github.com/vrajroutu/cohere-azure-ai-search-rag.git  
cd cohere-azure-ai-search-rag

Install the required packages:
```
pip install -r requirements.txt  
```

Set up environment variables:
Create a .env file in the root directory and add your API keys and endpoints:

AZURE_AI_STUDIO_COHERE_EMBED_KEY=your_cohere_embed_key  
AZURE_AI_STUDIO_COHERE_EMBED_ENDPOINT=your_cohere_embed_endpoint  
AZURE_AI_STUDIO_COHERE_MULTIEMBED_KEY=your_cohere_multilingual_embed_key  
AZURE_AI_STUDIO_COHERE_MULTIEMBED_ENDPOINT=your_cohere_multilingual_embed_endpoint  
AZURE_AI_STUDIO_COHERE_COMMAND_KEY=your_cohere_command_key  
AZURE_AI_STUDIO_COHERE_COMMAND_ENDPOINT=your_cohere_command_endpoint  
SEARCH_SERVICE_API_KEY=your_search_service_api_key  
SEARCH_SERVICE_ENDPOINT=your_search_service_endpoint  
BINARY_INDEX_NAME=your_binary_index_name  
INT8_INDEX_NAME=your_int8_index_name  
MULTIBINARY_INDEX_NAME=your_multilingual_binary_index_name  
MULTIINT8_INDEX_NAME=your_multilingual_int8_index_name

Usage

Running the Application

Start the Streamlit application:
```
streamlit run welcome.py  
```
Open your browser and navigate to http://localhost:8501.

Uploading Files and Asking Questions

Upload Files: Use the sidebar to upload PDF files. The .pdf from these files will be extracted and indexed.
Ask Questions: Type your questions in the chat input at the bottom center of the page. The system will perform a vector search using the uploaded documents and provide responses.

Example Questions

"Start a knock-knock joke 🤡"
"Should I buy a road bike 🚴‍♂️ or a mountain bike 🚵‍♀️ if I want to exercise?"
"What's the most popular true crime show streaming right now? 📺🔍"
"Summarize the main points of the latest research on AI 🤖📝"
"Where should I travel if I have pollen allergies? 🌼❌✈️"

Contributing

We welcome contributions to enhance the functionality and features of this project. Please follow these steps to contribute:

Fork the repository.

Create a new branch:

git checkout -b feature/your-feature-name

Make your changes and commit them:
```
git commit -m "Add your feature"  
```

Push to the branch:

git push origin feature/your-feature-name

Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

For more information, read the full announcement from Cohere here.

Feel free to reach out to us on LinkedIn for any queries or support.

vrajroutu/cohere-azure-ai-search-rag