AI Gateway

Route to 100+ LLMs with 1 fast & friendly API.

Portkey's AI Gateway is the interface between your app and hosted LLMs. It streamlines API requests to OpenAI, Anthropic, Mistral, Llama 2, Anyscale, Google Gemini, and more through a unified API.

✅ Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
✅ Load balance across multiple models, providers, and keys
✅ Fallbacks make sure your app stays resilient
✅ Automatic retries with exponential backoff come by default
✅ Plug-in middleware as needed
✅ Battle tested over 100B tokens

Getting Started

Installation

If you're familiar with Node.js and npx, you can run your private AI gateway locally. (Other deployment options)

npx @portkey-ai/gateway

Your AI Gateway is now running on http://localhost:8787 🚀

Usage

Let's try making a chat completions call to OpenAI through the AI gateway:

curl '127.0.0.1:8787/v1/chat/completions' \
  -H 'x-portkey-provider: openai' \
  -H "Authorization: Bearer $OPENAI_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user","content": "Say this is test."}], "max_tokens": 20, "model": "gpt-4"}'
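The same request can be assembled in JavaScript. The sketch below only builds the URL, headers, and body that the curl command sends; `buildChatRequest` is a hypothetical helper name, not part of the gateway, and the commented-out call assumes Node.js 18+ (where `fetch` is global) and a gateway running locally.

```javascript
// Hypothetical helper: builds the same request the curl command above sends.
// It performs no network I/O, so it can be inspected offline.
function buildChatRequest(apiKey, prompt) {
  return {
    url: "http://127.0.0.1:8787/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "x-portkey-provider": "openai", // tells the gateway which provider to route to
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        messages: [{ role: "user", content: prompt }],
        max_tokens: 20,
        model: "gpt-4",
      }),
    },
  };
}

// Usage (assumes the gateway is running on port 8787):
// const { url, init } = buildChatRequest(process.env.OPENAI_KEY, "Say this is test.");
// const response = await fetch(url, init);
// console.log(await response.json());
```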

Full list of supported SDKs

Supported Providers

Provider	Support	Stream	Supported Endpoints
OpenAI	✅	✅	`/completions`, `/chat/completions`, `/embeddings`, `/assistants`, `/threads`, `/runs`
Azure OpenAI	✅	✅	`/completions`, `/chat/completions`, `/embeddings`
Anyscale	✅	✅	`/chat/completions`
Google Gemini & PaLM	✅	✅	`/generateMessage`, `/generateText`, `/embedText`
Anthropic	✅	✅	`/messages`, `/complete`
Cohere	✅	✅	`/generate`, `/embed`, `/rerank`
Together AI	✅	✅	`/chat/completions`, `/completions`, `/inference`
Perplexity	✅	✅	`/chat/completions`
Mistral	✅	✅	`/chat/completions`, `/embeddings`

View the complete list of 100+ supported models here

Features

Unified API Signature

Connect with 100+ LLMs using OpenAI's API signature. The AI gateway handles the request, response, and error transformations, so you don't have to change your code. You can use the OpenAI SDK itself to connect to any of the supported LLMs.

Fallback

Don't let failures stop you. The Fallback feature lets you specify a prioritized list of LLMs. If the primary LLM fails to respond or returns an error, Portkey automatically falls back to the next LLM in the list, keeping your application robust and reliable.
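Conceptually, a fallback amounts to trying each target in priority order until one succeeds. The following is a minimal sketch of that idea, not the gateway's actual implementation; `withFallback` and the shape of `targets` are hypothetical and used only for illustration.

```javascript
// Illustrative sketch only: try each target in order and return the first success.
// `targets` is an array of async functions, highest priority first (assumed shape).
async function withFallback(targets) {
  const errors = [];
  for (const callTarget of targets) {
    try {
      return await callTarget();
    } catch (err) {
      errors.push(err); // remember the failure and move on to the next target
    }
  }
  // Every target failed: surface all collected errors together.
  throw new AggregateError(errors, "All targets failed");
}

// Usage (callOpenAI and callGemini are hypothetical functions):
// const reply = await withFallback([
//   () => callOpenAI(prompt), // primary
//   () => callGemini(prompt), // used only if the primary throws
// ]);
```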

Automatic Retries

Temporary issues shouldn't mean manual re-runs. AI Gateway can automatically retry failed requests up to 5 times. It applies an exponential backoff strategy, which spaces out retry attempts to prevent network overload.
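To illustrate how exponential backoff spaces out attempts, here is a small sketch that computes the wait before each retry. The 100 ms base delay and the doubling factor are assumptions chosen for the example, not the gateway's actual values, and `backoffDelays` is a hypothetical name.

```javascript
// Illustrative sketch: delay before each retry under exponential backoff.
// baseMs and the doubling factor are assumptions, not the gateway's real values.
function backoffDelays(maxRetries, baseMs) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(baseMs * 2 ** attempt); // base, 2x base, 4x base, ...
  }
  return delays;
}

console.log(backoffDelays(5, 100)); // [ 100, 200, 400, 800, 1600 ]
```

Each wait is twice the previous one, so a provider that is briefly overloaded gets progressively more time to recover.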

Load Balancing

Distribute load effectively across multiple API keys or providers based on custom weights. This ensures high availability and optimal performance of your generative AI apps, preventing any single LLM from becoming a performance bottleneck.
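Weighted distribution can be pictured as a random draw in which each target's chance is proportional to its weight. The sketch below shows that idea under stated assumptions: `pickWeighted` is a hypothetical function, the target names are made up, and the gateway's own selection logic may differ.

```javascript
// Illustrative sketch: pick a target with probability proportional to its weight.
// `rand` is injectable so the choice can be made deterministic; defaults to Math.random.
function pickWeighted(targets, rand = Math.random) {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  let threshold = rand() * total;
  for (const target of targets) {
    threshold -= target.weight;
    if (threshold < 0) return target;
  }
  return targets[targets.length - 1]; // guard against floating-point rounding
}

// Made-up targets for the example.
const exampleTargets = [
  { name: "key-a", weight: 0.75 },
  { name: "key-b", weight: 0.25 },
];
console.log(pickWeighted(exampleTargets, () => 0.5).name); // key-a
console.log(pickWeighted(exampleTargets, () => 0.9).name); // key-b
```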

Configuring the AI Gateway

The AI gateway supports configs that enable versatile routing strategies such as fallbacks, load balancing, retries, and more.

You can pass a config with any OpenAI call through the x-portkey-config header:

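// Using the OpenAI JS SDK
import OpenAI from "openai";

// Your routing config (see the examples below)
const config = { /* ... your config here ... */ };

const client = new OpenAI({
  baseURL: "http://127.0.0.1:8787/v1", // The gateway URL
  defaultHeaders: {
    // Header values must be strings, so serialize the config
    "x-portkey-config": JSON.stringify(config),
  },
});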

Here's an example config that retries an OpenAI request 5 times before falling back to Gemini Pro:

{
  "retry": { "count": 5 },
  "strategy": { "mode": "fallback" },
  "targets": [{
      "provider": "openai",
      "api_key": "sk-***"
    },{
      "provider": "google",
      "api_key": "gt5***",
      "override_params": {"model": "gemini-pro"}
  }]
}

This config load balances equally between 2 OpenAI keys:

{
  "strategy": { "mode": "loadbalance" },
  "targets": [{
      "provider": "openai",
      "api_key": "sk-***",
      "weight": 0.5
    },{
      "provider": "openai",
      "api_key": "sk-***",
      "weight": 0.5
    }
  ]
}
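When the number of keys varies, a config like this can be generated rather than written by hand. The helper below is a sketch with a hypothetical name (`equalLoadBalanceConfig`) and made-up keys; it emits weights as numbers, which is an assumption about what the gateway accepts, and serializes the result because HTTP header values must be strings.

```javascript
// Illustrative helper (hypothetical name): build an equal-weight load-balancing
// config for any number of OpenAI keys, in the same shape as the example above.
function equalLoadBalanceConfig(apiKeys) {
  const weight = 1 / apiKeys.length; // assumption: weights are plain numbers
  return {
    strategy: { mode: "loadbalance" },
    targets: apiKeys.map((api_key) => ({ provider: "openai", api_key, weight })),
  };
}

// Serialize before placing the config in the x-portkey-config header.
const headerValue = JSON.stringify(equalLoadBalanceConfig(["sk-aaa", "sk-bbb"]));
console.log(headerValue);
```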

Read more about the config object.

Supported SDKs

Language	Supported SDKs
Node.js / JS / TS	Portkey SDK, OpenAI SDK, LangchainJS, LlamaIndex.TS
Python	Portkey SDK, OpenAI SDK, Langchain, LlamaIndex
Go	go-openai
Java	openai-java
Rust	async-openai
Ruby	ruby-openai

Deploying AI Gateway

See the docs on installing the AI Gateway locally or deploying it on popular platforms.

Roadmap

Support for more providers. Missing a provider or LLM platform? Raise a feature request.
Enhanced load balancing features to optimize resource use across different models and providers.
More robust fallback and retry strategies to further improve the reliability of requests.
Increased customizability of the unified API signature to cater to more diverse use cases.

💬 Participate in Roadmap discussions here.

Contributing

The easiest way to contribute is to pick any issue with the good first issue tag 💪. Read the Contributing guidelines here.

Bug Report? File here | Feature Request? File here

Community

Join our growing worldwide community for help, ideas, and discussions on AI.

View our official Blog
Chat live with us on Discord
Follow us on Twitter
Connect with us on LinkedIn
