Portkey's AI Gateway is the interface between your app and hosted LLMs. It streamlines API requests to OpenAI, Anthropic, Mistral, Llama2, Anyscale, Google Gemini, and more with a unified API.
✅ Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
✅ Load balance across multiple models, providers, and keys
✅ Fallbacks make sure your app stays resilient
✅ Automatic retries with exponential backoff come by default
✅ Plug-in middleware as needed
✅ Battle-tested over 100B tokens
If you're familiar with Node.js and `npx`, you can run your private AI gateway locally. (Other deployment options)
```sh
npx @portkey-ai/gateway
```
Your AI Gateway is now running on http://localhost:8787 🚀
Let's try making a chat completions call to OpenAI through the AI gateway:
```sh
curl '127.0.0.1:8787/v1/chat/completions' \
  -H 'x-portkey-provider: openai' \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Say this is a test."}], "max_tokens": 20, "model": "gpt-4"}'
```
The AI gateway supports configs to enable versatile routing strategies like fallbacks, load balancing, retries and more.
You can use these configs while making the OpenAI call through the `x-portkey-config` header:
```js
// Using the OpenAI JS SDK
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://127.0.0.1:8787/v1", // The gateway URL (the SDK appends /chat/completions)
  defaultHeaders: {
    // Header values are strings, so pass the config as a JSON string
    "x-portkey-config": JSON.stringify({ /* .. your config here .. */ }),
  },
});
```
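With the client pointed at the gateway, requests are plain OpenAI SDK calls and pick up whatever routing behavior your config specifies. A minimal usage sketch (model and prompt are illustrative):

```js
// Standard OpenAI SDK call; the gateway applies the config transparently.
const chatCompletion = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Say this is a test." }],
});

console.log(chatCompletion.choices[0].message.content);
```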
Here's an example config that retries an OpenAI request 5 times before falling back to Gemini Pro:
```json
{
  "retry": { "count": 5 },
  "strategy": { "mode": "fallback" },
  "targets": [
    {
      "provider": "openai",
      "api_key": "sk-***"
    },
    {
      "provider": "google",
      "api_key": "gt5***",
      "override_params": { "model": "gemini-pro" }
    }
  ]
}
```
This config would load balance requests equally between two OpenAI API keys:
```json
{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    {
      "provider": "openai",
      "api_key": "sk-***",
      "weight": 0.5
    },
    {
      "provider": "openai",
      "api_key": "sk-***",
      "weight": 0.5
    }
  ]
}
```
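You can also attach a config per request rather than per client; the OpenAI JS SDK accepts extra headers on individual calls. A sketch using the fallback config above (passed as a JSON string, since HTTP header values are strings):

```js
// Pass the config on a single request via per-call request options.
const completion = await client.chat.completions.create(
  {
    model: "gpt-4",
    messages: [{ role: "user", content: "Say this is a test." }],
  },
  {
    headers: {
      "x-portkey-config": JSON.stringify({
        retry: { count: 5 },
        strategy: { mode: "fallback" },
        targets: [
          { provider: "openai", api_key: "sk-***" },
          { provider: "google", api_key: "gt5***", override_params: { model: "gemini-pro" } },
        ],
      }),
    },
  }
);
```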
Read more about the config object.
| Language | Supported SDKs |
|---|---|
| Node.js / JS / TS | Portkey SDK, OpenAI SDK, LangchainJS, LlamaIndex.TS |
| Python | Portkey SDK, OpenAI SDK, Langchain, LlamaIndex |
| Go | go-openai |
| Java | openai-java |
| Rust | async-openai |
| Ruby | ruby-openai |
See docs on installing the AI Gateway locally or deploying it on popular platforms.
- Support for more providers. Missing a provider or LLM platform? Raise a feature request.
- Enhanced load balancing features to optimize resource use across different models and providers.
- More robust fallback and retry strategies to further improve the reliability of requests.
- Increased customizability of the unified API signature to cater to more diverse use cases.
💬 Participate in Roadmap discussions here.
The easiest way to contribute is to pick any issue with the `good first issue` tag 💪. Read the Contributing guidelines here.
Bug Report? File here | Feature Request? File here
Join our growing community around the world, for help, ideas, and discussions on AI.