Introduction

This package starts an API server that handles chat completion requests efficiently. You can configure several handlers to increase throughput and to work around the rate limits of a single handler.

API

The request and response formats are the same as those in go-openai; see its doc.
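As a minimal sketch of what a client request could look like, the following Go program builds a chat completion request with the standard library only. The local types mirror a small subset of the JSON fields of go-openai's ChatCompletionRequest. The endpoint path /v1/chat/completions, the base URL http://localhost:8080, and the helper name newChatRequest are assumptions made for illustration; this README does not confirm them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Message mirrors the role/content JSON fields of a chat message.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest mirrors a small subset of the chat completion request JSON.
type ChatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
	Stream   bool      `json:"stream,omitempty"`
}

// newChatRequest builds a POST request carrying the JSON-encoded body.
// The endpoint path is an assumption, not something this README states.
func newChatRequest(baseURL string, body ChatRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080", ChatRequest{
		Model:    "gpt-3.5-turbo",
		Messages: []Message{{Role: "user", Content: "Hello"}},
	})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	// Send with http.DefaultClient.Do(req) once the server is running.
	fmt.Println(req.Method, req.URL.String())
}
```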

Stream

For streaming responses, the server sends data-only server-sent events, similar to OpenAI's official API. The data of each event is a JSON object of type ChatCompletionStreamResponse.

Errors

The server returns http.StatusBadRequest if the request is invalid.
The server returns http.StatusInternalServerError if the request is valid but the handler fails to handle it. Errors from the OpenAI side are also mapped to http.StatusInternalServerError; whether the original error message is included in the response is unconfirmed (it was not received in testing).

Configurations

The server is configured through a config file, which is located using command line flags or environment variables. To configure the server, create a config file; an example in JSON format can be found here. You can specify the configuration file or its search path by:

command line flags
- config_file is the path to the server config file.
- config_path is the path to the directory that contains the server config file. By default, the following directories are searched in order:
  - .
  - .config
  - /etc/chatgpt-apiserver
environment variables
- CHATGPT_APISERVER_CONFIG_FILE or CONFIG_FILE is the path to the server config file.
- CHATGPT_APISERVER_CONFIG_PATH or CONFIG_PATH is the path to the directory that contains the server config file.

Controller

OpenAIController can be configured through its own config file or directly in the server config file above. An example in JSON format can be found here.

Simple Usage

To use:

go install github.com/huweiATgithub/chatgpt-apiserver@latest
chatgpt-apiserver

Docker

Build the image yourself:

docker build -t chatgpt-apiserver .
docker run -p 8080:8080 -v {Mount Your configuration file} chatgpt-apiserver

You can also use the prebuilt image weihu0/chatgpt-apiserver.

TODOs:

Implement a load-balancing pool
Allow configuring the apiserver from a file
Disable a Controller when it exceeds its usage limit