Rubeus streamlines API requests to 20+ LLMs. It provides a unified API signature for interacting with all of them, along with powerful LLM gateway features like load balancing, fallbacks, retries, and more.
- 🌐 Interoperability: Write once, run with any provider. Switch between __ models from __ providers seamlessly.
- 🔀 Fallback Strategies: Don't let failures stop you. If one provider fails, Rubeus can automatically switch to another.
- 🔄 Retry Strategies: Temporary issues shouldn't mean manual re-runs. Rubeus can automatically retry failed requests.
- ⚖️ Load Balancing: Distribute load effectively across multiple API keys or providers based on custom weights.
- 📝 Unified API Signature: If you've used OpenAI, you already know how to use Rubeus with any other provider.
npm install
npm run dev # To run locally
npm run deploy # To deploy to Cloudflare
Rubeus allows you to switch between language models from various providers, making it a highly flexible tool. The following example shows a request to `openai`, but you could change the provider name to `cohere`, `anthropic`, or others, and Rubeus will automatically handle everything else.
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "openai",
    "api_key": "<open-ai-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"model": "text-davinci-003",
"user": "jbu3470"
}
}'
In case one provider fails, Rubeus is designed to automatically switch to another, ensuring uninterrupted service.
# Fallback to anthropic if openai fails (this request uses the default text-davinci-003 and claude-v1 models)
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "fallback",
"options": [
{
"provider": "openai",
"api_key": "<open-ai-api-key-here>"
},
{
"provider": "anthropic",
"api_key": "<anthropic-api-key-here>"
}
]
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
# Fallback to gpt-3.5-turbo when gpt-4 fails
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "fallback",
"options": [
{
"provider": "openai",
"override_params": {"model": "gpt-4"},
"api_key": "<open-ai-api-key-here>"
},
{
"provider": "openai",
"override_params": {"model": "gpt-3.5-turbo"},
"api_key": "<open-ai-api-key-here>"
}
]
},
"params": {
"messages": [{"role": "user", "content": "What are the top 10 happiest countries in the world?"}],
"max_tokens": 50,
"user": "jbu3470"
}
}'
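Conceptually, fallback mode tries each option in the order listed and returns the first successful response. The sketch below illustrates that semantics with hypothetical helper names; it is not Rubeus's internal implementation.

```typescript
// Hypothetical sketch of fallback semantics: try each option in order,
// return the first success, and surface the last error if all fail.
type ProviderOption = { provider: string; call: () => string };

function withFallback(options: ProviderOption[]): string {
  let lastError: unknown;
  for (const opt of options) {
    try {
      return opt.call();
    } catch (err) {
      lastError = err; // remember the failure, move on to the next option
    }
  }
  throw lastError;
}
```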
Rubeus has a built-in mechanism to retry failed requests, eliminating the need for manual re-runs.
# Add the retry configuration to enable exponential back-off retries
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "single",
"options": [{
"provider": "openai",
"retry": {
"attempts": 3,
"on_status_codes": [429,500,504,524]
},
"api_key": "<open-ai-api-key-here>"
}]
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"model": "text-davinci-003",
"user": "jbu3470"
}
}'
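With this config, a failed request is retried up to `attempts` times, waiting longer between each try. As a timing illustration (the 200 ms base delay is an assumption for this sketch, not Rubeus's actual constant), the wait doubles on each attempt:

```typescript
// Illustrative exponential back-off: the delay doubles with each attempt.
// The baseMs default is an assumed value, not taken from Rubeus.
function backoffDelayMs(attempt: number, baseMs: number = 200): number {
  return baseMs * 2 ** attempt;
}
```

So three attempts would be spaced roughly 200 ms, 400 ms, and 800 ms apart.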
Manage your workload effectively with Rubeus's custom weight-based distribution across multiple API keys or providers.
# Load balance 50-50 between gpt-3.5-turbo and claude-v1
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data '{
"config": {
"mode": "loadbalance",
"options": [{
"provider": "openai",
"weight": 0.5,
"override_params": { "model": "gpt-3.5-turbo" },
"api_key": "<open-ai-api-key-here>"
}, {
"provider": "anthropic",
"weight": 0.5,
"override_params": { "model": "claude-v1" },
"api_key": "<anthropic-api-key-here>"
}]
},
"params": {
"messages": [{"role": "user","content":"What are the top 10 happiest countries in the world?"}],
"max_tokens": 50,
"user": "jbu3470"
}
}'
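Under the hood, weight-based distribution amounts to picking an option with probability proportional to its weight. A minimal sketch of that idea (a hypothetical helper, not Rubeus's implementation):

```typescript
type WeightedOption = { provider: string; weight: number };

// Pick an option with probability proportional to its weight.
// `rand` is injectable so the selection can be tested deterministically;
// it defaults to Math.random().
function pickOption(options: WeightedOption[], rand: number = Math.random()): WeightedOption {
  const total = options.reduce((sum, o) => sum + o.weight, 0);
  let r = rand * total;
  for (const o of options) {
    if (r < o.weight) return o;
    r -= o.weight;
  }
  return options[options.length - 1]; // guard against floating-point edge cases
}
```

With the 0.5/0.5 weights above, each provider is chosen for roughly half of the requests.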
If you're familiar with OpenAI's API, you'll find Rubeus's API easy to use due to its unified signature.
# OpenAI query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "openai",
"api_key": "<open-ai-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
# Anthropic Query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "anthropic",
"api_key": "<anthropic-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
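Notice that only the `config` block differs between the two requests; the `params` are identical. Switching providers can therefore be as simple as parameterizing the body, as in this hypothetical helper:

```typescript
// Build a Rubeus /v1/complete request body. Only the provider block
// changes between providers; the params stay the same.
function buildRequestBody(provider: string, apiKey: string, prompt: string) {
  return {
    config: { provider, api_key: apiKey },
    params: { prompt, max_tokens: 50, user: "jbu3470" },
  };
}
```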
| Name | Description |
|---|---|
| Portkey.ai | Full Stack LLMOps |
- Support for more providers, including Google Bard and LocalAI.
- Enhanced load balancing features to optimize resource use across different models and providers.
- More robust fallback and retry strategies to further improve the reliability of requests.
- Increased customizability of the unified API signature to cater to more diverse use cases.
💬 Participate in Roadmap discussions here.
- Bug Report? File here.
- Feature Request? File here.
- Reach out to the developers directly: Rohit | Ayush
Rubeus is licensed under the MIT License. See the LICENSE file for more details.