There's now official Cloudflare Workers AI support for OpenAI compatible API endpoints; I recommend using that instead: OpenAI compatible API endpoints
This project converts/proxies Cloudflare Workers AI responses into OpenAI API compatible responses, so that Cloudflare Workers AI models can be used with any OpenAI/ChatGPT compatible client.
- Supports streaming and non-streaming responses
- Rewrites default models such as `gpt-3` and `gpt-4` to use `@cf/meta/llama-3-8b-instruct`
- If the OpenAI client can be configured to use other model names, simply replace `gpt-4` with the Cloudflare model ID
- Here's a list of all Cloudflare Workers AI models
- create a Cloudflare account
- clone this repo
- run `npm run deploy`
- generate an API key and add it to your project: `npx wrangler secret put token` (one way to generate a token is sketched below)
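The value you store here is the bearer token that clients will later send as their OpenAI API key. A minimal sketch for generating one, assuming `openssl` is available and that wrangler accepts the secret on stdin:

```sh
# generate a random 32-byte hex token and keep a copy for your clients
TOKEN=$(openssl rand -hex 32)
echo "$TOKEN"
# store it as the Worker secret named "token"
echo "$TOKEN" | npx wrangler secret put token
```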
After the script has been deployed, you'll get a URL which you can use as your OpenAI API endpoint for other applications, something like this: `https://openai-api.foobar.workers.dev`
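To verify the deployment, you can send a plain OpenAI-style chat completion request with curl. This is a sketch: it uses the example URL from above and the token you set via wrangler, and it assumes the chat completions route sits directly under the root (the exact path depends on the Worker's routing):

```sh
curl https://openai-api.foobar.workers.dev/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "@cf/meta/llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "tell a joke"}]
  }'
```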
I mainly created this project to make it work with the awesome LLM project by Simon Willison:
- go ahead and read everything about LLM here
- how to install LLM
- since LLM can work with OpenAI-compatible models, we're adding our OpenAI API proxy like this:
- find the directory of your LLM configuration: `dirname "$(llm logs path)"`
- create this file: `vi ~/Library/Application\ Support/io.datasette.llm/extra-openai-models.yaml`
```yaml
- model_id: cloudflare
  model_name: '@hf/thebloke/llama-2-13b-chat-awq'
  api_base: 'https://openai-api.foobar.workers.dev/'
  api_key_name: cloudflare
```
You can also add multiple models there:
```yaml
- model_id: cfllama2
  model_name: '@cf/meta/llama-2-7b-chat-fp16'
  api_base: 'https://openai-api.foobar.workers.dev'
  api_key_name: cloudflare
- model_id: cfllama3
  model_name: '@cf/meta/llama-3-8b-instruct'
  api_base: 'https://openai-api.foobar.workers.dev'
  api_key_name: cloudflare
```
- set the API key in LLM to the token you configured in the Worker: `llm keys set cloudflare`
Use it with streaming (recommended):

```sh
llm chat -m cfllama3
```

Use it without streaming:

```sh
llm chat --no-stream -m cfllama3
```
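Besides the interactive chat mode, the `llm` CLI also accepts a single prompt directly; for example, with the `cfllama3` alias defined above:

```sh
llm -m cfllama3 "tell a joke"
```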
chatblade is a cool CLI utility for ChatGPT. It was a bit harder to configure with a custom endpoint and model, but this seems to work:
```sh
export OPENAI_API_KEY="your-own-auth-key"                      # your Worker secret API key
export OPENAI_API_AZURE_ENGINE="@cf/meta/llama-3-8b-instruct"  # the model you want to use
export OPENAI_API_VERSION="@cf/meta/llama-3-8b-instruct"       # again, the model you want to use
export AZURE_OPENAI_ENDPOINT="https://openai-api.foobar.workers.dev/"  # your Workers endpoint
export OPENAI_API_TYPE=azure                                   # I don't know why this is required
```
Then use chatblade like this:
```sh
chatblade -i -c "@cf/meta/llama-3-8b-instruct"             # interactive mode as a chat
chatblade -c "@cf/meta/llama-3-8b-instruct" "tell a joke"  # single prompt
```
There's Pal Chat for iOS, which can be used with custom endpoints.
- settings → modify custom host → `openai-api.foobar.workers.dev` → enter API key
- modify custom model → your model name, for example: `@cf/meta/llama-3-8b-instruct`