Rubeus

A Blazing Fast AI Gateway by Portkey.ai

Rubeus streamlines API requests to 20+ LLMs. It provides a unified API signature for interacting with all of them, along with powerful LLM gateway features like load balancing, fallbacks, retries, and more.

Features

  • 🌐 Interoperability: Write once, run with any provider. Switch between models and providers seamlessly.
  • 🔀 Fallback Strategies: Don't let failures stop you. If one provider fails, Rubeus can automatically switch to another.
  • 🔄 Retry Strategies: Temporary issues shouldn't mean manual re-runs. Rubeus can automatically retry failed requests.
  • ⚖️ Load Balancing: Distribute load effectively across multiple API keys or providers based on custom weights.
  • 📝 Unified API Signature: If you've used OpenAI, you already know how to use Rubeus with any other provider.

Supported Providers

Provider       Support Status    Supported Endpoints
OpenAI         ✅ Supported      /completion, /embed
Azure OpenAI   ✅ Supported      /completion, /embed
Anthropic      ✅ Supported      /complete
Cohere         ✅ Supported      generate, embed
Google Bard    🚧 Coming Soon    -
LocalAI        🚧 Coming Soon    -

Getting Started

npm install
npm run dev # To run locally
npm run deploy # To deploy to Cloudflare
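
Once the dev server is up, the gateway should be reachable at http://127.0.0.1:8787, the address used in all the examples below. If you prefer calling it from code rather than curl, here is a minimal TypeScript sketch of the same request; the endpoint and request body mirror the curl examples, and the API key is a placeholder.

// Minimal sketch: call the local Rubeus gateway with fetch (Node 18+ or a browser).
// Assumes `npm run dev` is serving on http://127.0.0.1:8787, as in the curl examples below.
async function main() {
  const response = await fetch("http://127.0.0.1:8787/v1/complete", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      config: {
        provider: "openai",
        api_key: "<open-ai-api-key-here>", // placeholder, not a real key
      },
      params: {
        prompt: "What are the top 10 happiest countries in the world?",
        max_tokens: 50,
        model: "text-davinci-003",
        user: "jbu3470",
      },
    }),
  });
  console.log(await response.json());
}

main();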

Usage

🌐 Interoperability

Rubeus allows you to switch between large language models from various providers, making it a highly flexible tool. The following example shows a request to openai, but you could change the provider name to cohere, anthropic, or others, and Rubeus will automatically handle everything else.

curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "provider": "openai",
        "api_key: "<open-ai-api-key-here>"
    },
    "params": {
        "prompt": "What are the top 10 happiest countries in the world?",
        "max_tokens": 50,
        "model": "text-davinci-003",
        "user": "jbu3470"
    }
}'
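
To see the "write once" idea in code, the hypothetical helper below keeps the request body fixed and swaps only the provider and key inside config. The helper name and shape are illustrative, not part of Rubeus.

// Illustrative helper (not part of Rubeus): only `config` changes between providers.
type Provider = "openai" | "anthropic" | "cohere";

async function complete(provider: Provider, apiKey: string, prompt: string) {
  const res = await fetch("http://127.0.0.1:8787/v1/complete", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      config: { provider, api_key: apiKey },
      params: { prompt, max_tokens: 50, user: "jbu3470" },
    }),
  });
  return res.json();
}

// Same call, different provider; Rubeus handles the provider-specific translation.
// await complete("openai", "<open-ai-api-key-here>", "What are the top 10 happiest countries?");
// await complete("anthropic", "<anthropic-api-key-here>", "What are the top 10 happiest countries?");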

🔀 Fallback Strategies

If a provider fails, Rubeus is designed to switch automatically to another, keeping your service uninterrupted.

# Fall back to anthropic if openai fails (this request uses the default text-davinci-003 and claude-v1 models)
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "mode": "fallback",
        "options": [
          {
            "provider": "openai",
            "api_key": "<open-ai-api-key-here>"
          }, 
          {
            "provider": "anthropic",
            "api_key": "<anthropic-api-key-here>"
          }
        ]
    },
    "params": {
        "prompt": "What are the top 10 happiest countries in the world?",
        "max_tokens": 50,
        "user": "jbu3470"
    }
}'

# Fall back to gpt-3.5-turbo when gpt-4 fails
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "mode": "fallback",
        "options": [
          {
            "provider": "openai", 
            "override_params": {"model": "gpt-4"},
            "api_key": "<open-ai-api-key-here>" 
          }, 
          {
            "provider": "openai", 
            "override_params": {"model": "gpt-3.5-turbo"},
            "api_key": "<open-ai-api-key-here>"
          }
        ]
    },
    "params": {
        "messages": [{"role": "user", "content": "What are the top 10 happiest countries in the world?"}],
        "max_tokens": 50,
        "user": "jbu3470"
    }
}'
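
Conceptually, fallback mode tries the options in the order they are listed and returns the first successful response. The sketch below illustrates that behavior in TypeScript; it is a conceptual model, not Rubeus's actual implementation.

// Conceptual model of fallback mode (not Rubeus's actual implementation):
// try each option in order and return the first one that succeeds.
async function withFallback<T>(attempts: Array<() => Promise<T>>): Promise<T> {
  let lastError: unknown;
  for (const attempt of attempts) {
    try {
      return await attempt(); // success: stop here
    } catch (err) {
      lastError = err; // failure: remember it and try the next option
    }
  }
  throw lastError; // every option failed
}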

🔄 Retry Strategies

Rubeus has a built-in mechanism to retry failed requests, eliminating the need for manual re-runs.

# Add the retry configuration to enable exponential back-off retries
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "mode": "single",
        "options": [{
            "provider": "openai",
            "retry": {
                "attempts": 3,
                "on_status_codes": [429,500,504,524]
            },
            "api_key": "<open-ai-api-key-here>"
        }]
    },
    "params": {
        "prompt": "What are the top 10 happiest countries in the world?",
        "max_tokens": 50,
        "model": "text-davinci-003",
        "user": "jbu3470"
    }
}'
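
The idea behind the retry block is the familiar exponential back-off loop: re-issue the request only when the response status is in the listed codes, waiting longer before each attempt. A conceptual TypeScript sketch (not Rubeus's actual implementation):

// Conceptual model of retry with exponential back-off (not Rubeus's actual code):
// re-issue the request up to `attempts` times while the status is in `onStatusCodes`.
async function withRetries(
  request: () => Promise<Response>,
  attempts: number,
  onStatusCodes: number[],
): Promise<Response> {
  let res = await request();
  for (let retry = 1; retry <= attempts && onStatusCodes.includes(res.status); retry++) {
    await new Promise((resolve) => setTimeout(resolve, 2 ** retry * 500)); // wait 1s, 2s, 4s, ...
    res = await request();
  }
  return res;
}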

⚖️ Load Balancing

Manage your workload effectively with Rubeus's custom weight-based distribution across multiple API keys or providers.

# Load balance 50-50 between gpt-3.5-turbo and claude-v1
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data '{
    "config": {
        "mode": "loadbalance",
        "options": [{
            "provider": "openai",
            "weight": 0.5,
            "override_params": { "model": "gpt-3.5-turbo" },
            "api_key": "<open-ai-api-key-here>"
        }, {
            "provider": "anthropic",
            "weight": 0.5,
            "override_params": { "model": "claude-v1" },
            "api_key": "<anthropic-api-key-here>"
        }]
    },
    "params": {
        "messages": [{"role": "user","content":"What are the top 10 happiest countries in the world?"}],
        "max_tokens": 50,
        "user": "jbu3470"
    }
}'
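
The weights act as relative proportions of traffic: with 0.5 and 0.5, each option receives roughly half the requests. A conceptual TypeScript sketch of weight-based selection (not Rubeus's actual implementation):

// Conceptual model of weight-based selection (not Rubeus's actual implementation):
// pick an option with probability proportional to its weight.
function pickWeighted<T extends { weight: number }>(options: T[]): T {
  const total = options.reduce((sum, option) => sum + option.weight, 0);
  let roll = Math.random() * total;
  for (const option of options) {
    roll -= option.weight;
    if (roll <= 0) return option;
  }
  return options[options.length - 1]; // guard against floating-point rounding
}

// With weights 0.5 and 0.5, each option is chosen for roughly half the requests.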

📝 Unified API Signature

If you're familiar with OpenAI's API, you'll find Rubeus's API easy to use due to its unified signature.

# OpenAI query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "provider": "openai",
        "api_key": "<open-ai-api-key-here>"
    },
    "params": {
        "prompt": "What are the top 10 happiest countries in the world?",
        "max_tokens": 50,
        "user": "jbu3470"
    }
}'

# Anthropic query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
    "config": {
        "provider": "anthropic",
        "api_key": "<anthropic-api-key-here>"
    },
    "params": {
        "prompt": "What are the top 10 happiest countries in the world?",
        "max_tokens": 50,
        "user": "jbu3470"
    }
}'
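
Taken together, the config objects in these examples share a small, predictable shape. Below is an illustrative TypeScript model of that shape; the field names are taken from the examples above, but the types themselves are not exported by Rubeus.

// Illustrative model of the request body used throughout these examples.
// Field names mirror the JSON above; these types are not part of Rubeus's public API.
interface ProviderOption {
  provider: string;                                   // e.g. "openai", "anthropic", "cohere"
  api_key: string;
  override_params?: Record<string, unknown>;          // e.g. { model: "gpt-4" }
  weight?: number;                                    // used by "loadbalance" mode
  retry?: { attempts: number; on_status_codes: number[] };
}

type GatewayConfig =
  | { provider: string; api_key: string }             // simple single-provider form
  | { mode: "single" | "fallback" | "loadbalance"; options: ProviderOption[] };

interface GatewayRequest {
  config: GatewayConfig;
  params: Record<string, unknown>;                    // prompt/messages, model, max_tokens, ...
}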

Built with Rubeus

Name         Description
Portkey.ai   Full Stack LLMOps

Roadmap

  1. Support for more providers, including Google Bard and LocalAI.
  2. Enhanced load balancing features to optimize resource use across different models and providers.
  3. More robust fallback and retry strategies to further improve the reliability of requests.
  4. Increased customizability of the unified API signature to cater to more diverse use cases.

💬 Participate in Roadmap discussions here.

Contributing

License

Rubeus is licensed under the MIT License. See the LICENSE file for more details.