GenAI Gateway Resilience Service

This is a cookiecutter repository for a Gen AI Gateway Service. It serves as a gateway service for multiple OpenAI models. This service is designed to implement a resilient operation mechanism that implements a fallback mechanism between a primary and a fallback model. This repository is particularly suited for smaller architectures that use multiple LLMs models or architectures that don't have access to enterprise-grade balancing services or managed solutions. It could also be used as a starting point for more complex gateway services implement other operational patterns that can be found here.

Features

Fast API: Serves as a centralized entry point for accessing multiple OpenAI models.
Fallback mechanism: Implements a circuit breaker pattern to switch between a primary and a fallback model when the primary model is unavailable.

Getting Started

Configuration

The OpenAI Gateway Service requires the following attributes to be configured in the .env file or with environment variables:

FALLBACK_OPENAI_HOST=""
FALLBACK_OPENAI_API_KEY=""
PRIMARY_OPENAI_HOST=""
PRIMARY_OPENAI_API_KEY=""

Setting up your service

To get started with the OpenAI Gateway Service, follow these steps:

Clone the Repository and create a new project based on the template:
```
git clone <repository_url>
```
Create a new project based on the template:
```
cookiecutter <cloned_directory>
```
Install the project dependencies with poetry:
```
cd <project_name>
poetry install
```

Create an .env file with the following environment variables:

FALLBACK_OPENAI_HOST=""
FALLBACK_OPENAI_API_KEY=""
PRIMARY_OPENAI_HOST=""
PRIMARY_OPENAI_API_KEY=""

Run the application locally:
```
poetry run uvicorn src.app:app --reload
```
Build and run the application with docker-compose:
```
docker build -t <image_name> .
```
Run the docker image:
```
docker run -p 8000:8000 <image_name>
```

Usage

Once the service is up and running, you can send requests to the gateway endpoint to interact with the configured OpenAI models. Make sure to refer to the API documentation for details on the supported endpoints and request/response formats.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

kanijiy/GenAI-Gateway-Resilience-Service-Example