OpenOpenAI

Intro
Why?
Stack
Development
TODO
License

Intro

This project is a self-hosted version of OpenAI's new stateful Assistants API. 💪

All API route definitions and types are 100% auto-generated from OpenAI's official OpenAPI spec, so all it takes to switch between the official API and your custom API is changing the baseURL. 🤯

This means that all API parameters, responses, and types are wire-compatible with the official OpenAI API, and the fact that they're auto-generated means that it will be relatively easy to keep them in sync over time.

Here's an example using the official Node.js openai package:

import OpenAI from 'openai'

// The only difference is the `baseURL` pointing to your custom API server 🔥
const openai = new OpenAI({
  baseURL: 'http://localhost:3000'
})

// Since the custom API is spec-compliant with OpenAI, you can use the sdk normally 💯
const assistant = await openai.beta.assistants.create({
  model: 'gpt-4-1106-preview',
  instructions: 'You are a helpful assistant.'
})

Python example

Here's the same example using the official Python openai package:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000"
)

# Now you can use the sdk normally!
# (only file and beta assistant resources are currently supported)
# You can even switch back and forth between the official and custom APIs!
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="You are a helpful assistant."
)
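
Switching back and forth can be reduced to a single configuration value. The following is a minimal TypeScript sketch, not code from this repo: `resolveClientOptions` is a hypothetical helper, and it assumes the `OPENAI_API_BASE_URL` environment variable (the one used by the e2e commands later in this readme) selects the custom server, with the official endpoint as the fallback.

```typescript
// Official endpoint, used when no custom server is configured.
const OFFICIAL_BASE_URL = 'https://api.openai.com/v1'

interface ClientOptions {
  baseURL: string
}

// Hypothetical helper: pick the API to talk to from environment variables.
// An unset or blank OPENAI_API_BASE_URL falls back to the official API.
function resolveClientOptions(
  env: Record<string, string | undefined>
): ClientOptions {
  const custom = (env.OPENAI_API_BASE_URL ?? '').trim()
  return { baseURL: custom.length > 0 ? custom : OFFICIAL_BASE_URL }
}
```

The result can be passed straight to the client constructor, e.g. `new OpenAI(resolveClientOptions(process.env))`, so the rest of the application code stays identical for both targets.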

Note that this project is not meant to be a full recreation of the entire OpenAI API. Rather, it is focused only on the stateful portions of the new Assistants API. The following resource types are supported:

Assistants
AssistantFiles
Files
Messages
MessageFiles
Threads
Runs
RunSteps

See the official OpenAI Assistants Guide for more info on how Assistants work.

Why?

Being able to run your own, custom OpenAI Assistants that are 100% compatible with the official OpenAI Assistants unlocks all sorts of useful possibilities:

Using OpenAI Assistants with custom models (OSS ftw!) 💪
Fully customizable RAG via the built-in retrieval tool (LangChain and LlamaIndex integrations coming soon)
Using a custom code interpreter like open-interpreter 🔥
Self-hosting / on-premise deployments of Assistants
Full control over assistant evals
Developing & testing GPTs in fully sandboxed environments
Sandboxed testing of custom Actions before deploying to the OpenAI "GPT Store"

Most importantly, if the OpenAI "GPT Store" ends up gaining traction with ChatGPT's 100M weekly active users, then the ability to reliably run, debug, and customize OpenAI-compatible Assistants will become incredibly important.

I could even imagine a future Assistant store which is fully compatible with OpenAI's GPTs, but instead of relying on OpenAI as the gatekeeper, it could be fully or partially decentralized. 💯

Stack

Postgres - Primary datastore via Prisma (schema file)
Redis - Backing store for the async task queue used to process thread runs via BullMQ
S3 - Stores uploaded files
- Any S3-compatible storage provider is supported, such as Cloudflare R2
Hono - Serves the REST API via @hono/zod-openapi
- We're using the Node.js adaptor by default, but Hono supports many environments including CF workers, Vercel, Netlify, Deno, Bun, Lambda, etc.
Dexter - Production RAG by Dexa
TypeScript 💕

Development

Prerequisites:

node >= 18
pnpm >= 8

Install deps:

pnpm install

Generate the prisma types locally:

pnpm generate

Environment Variables

cp .env.example .env

Postgres
- DATABASE_URL - Postgres connection string
- On macOS: brew install postgresql && brew services start postgresql
- You'll need to run npx prisma db push to set up your database according to our prisma schema
OpenAI
- OPENAI_API_KEY - OpenAI API key for running the underlying chat completion calls
- This is required for now, but depending on how interested people are, it won't be hard to add support for local models and other providers
Redis
- On macOS: brew install redis && brew services start redis
- If you have a local Redis instance running, the default Redis env vars should work without touching them
- REDIS_HOST - Optional; defaults to localhost
- REDIS_PORT - Optional; defaults to 6379
- REDIS_USERNAME - Optional; defaults to default
- REDIS_PASSWORD - Optional
S3 - Required to use file attachments
- Any S3-compatible provider is supported, such as Cloudflare R2
- Alternatively, you can use a local S3 server like MinIO or LocalStack
  - To run LocalStack on macOS: brew install localstack/tap/localstack-cli && localstack start -d
  - To run MinIO on macOS: brew install minio/stable/minio && minio server /data
- I recommend using Cloudflare R2, though – it's amazing and should be free for most use cases!
- S3_BUCKET - Required
- S3_REGION - Optional; defaults to auto
- S3_ENDPOINT - Required; example: https://<id>.r2.cloudflarestorage.com
- ACCESS_KEY_ID - Required (Cloudflare R2 docs)
- SECRET_ACCESS_KEY - Required (Cloudflare R2 docs)
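
To make the required/optional split above concrete, here is a rough sketch of how these variables could be validated at startup. It is not the project's actual config code: `loadConfig`, `required`, and the `AppConfig` shape are hypothetical names, and treating `DATABASE_URL` and the S3 variables as hard requirements is an assumption (S3 is only needed for file attachments). The defaults mirror the list above.

```typescript
type Env = Record<string, string | undefined>

interface AppConfig {
  databaseUrl: string
  openaiApiKey: string
  redis: { host: string; port: number; username: string; password: string | undefined }
  s3: { bucket: string; region: string; endpoint: string; accessKeyId: string; secretAccessKey: string }
}

// Fail fast with a readable message when a required variable is missing.
function required(env: Env, name: string): string {
  const value = env[name]
  if (!value) throw new Error(`Missing required environment variable: ${name}`)
  return value
}

// Hypothetical loader: applies the documented defaults for optional variables.
function loadConfig(env: Env): AppConfig {
  return {
    databaseUrl: required(env, 'DATABASE_URL'),
    openaiApiKey: required(env, 'OPENAI_API_KEY'),
    redis: {
      host: env.REDIS_HOST ?? 'localhost',
      port: Number(env.REDIS_PORT ?? '6379'),
      username: env.REDIS_USERNAME ?? 'default',
      password: env.REDIS_PASSWORD
    },
    s3: {
      bucket: required(env, 'S3_BUCKET'),
      region: env.S3_REGION ?? 'auto',
      endpoint: required(env, 'S3_ENDPOINT'),
      accessKeyId: required(env, 'ACCESS_KEY_ID'),
      secretAccessKey: required(env, 'SECRET_ACCESS_KEY')
    }
  }
}
```

Calling `loadConfig(process.env)` once at boot surfaces a missing variable immediately instead of on the first request that needs it.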

Services

The app is composed of two services: a RESTful API server and an async task runner. Both services are stateless and can be scaled horizontally.

There are two ways to run these services locally. The quickest way is via tsx:

# Start the REST API server in one shell
npx tsx src/server

# Start an async task queue runner in another shell
npx tsx src/runner

Alternatively, you can transpile the source TS to JS first, which is preferred for running in production:

pnpm build

# Start the REST API server in one shell
npx tsx dist/server

# Start an async task queue runner in another shell
npx tsx dist/runner
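
The split between the two services can be pictured with a small in-memory model. This is only a sketch of the general pattern: the real project uses BullMQ on Redis for the queue and Postgres for state, whereas `InMemoryQueue`, `runStore`, `createRun`, and `processNextJob` below are hypothetical stand-ins.

```typescript
type RunStatus = 'queued' | 'in_progress' | 'completed'

interface RunJob {
  runId: string
  threadId: string
}

// Stand-in for the Redis-backed task queue.
class InMemoryQueue<T> {
  private jobs: T[] = []
  add(job: T): void {
    this.jobs.push(job)
  }
  take(): T | undefined {
    return this.jobs.shift()
  }
}

// Stand-in for the shared datastore; both services read and write run status here.
const runStore = new Map<string, RunStatus>()
const runQueue = new InMemoryQueue<RunJob>()

// API server side: record the run, enqueue a job, and return right away.
function createRun(threadId: string, runId: string): RunStatus {
  runStore.set(runId, 'queued')
  runQueue.add({ runId, threadId })
  return 'queued'
}

// Runner side: take one job and drive it to completion.
// Returns false when there was nothing to process.
function processNextJob(): boolean {
  const job = runQueue.take()
  if (!job) return false
  runStore.set(job.runId, 'in_progress')
  // Chat completion calls and tool handling would happen here.
  runStore.set(job.runId, 'completed')
  return true
}
```

Because neither function keeps state outside the shared store and queue, any number of server or runner processes could run side by side, which is what makes horizontal scaling straightforward.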

E2E Examples

Custom Function Example

This example contains an end-to-end assistant script which uses a custom get_weather function.

You can run it using the official openai client for Node.js against the default OpenAI API hosted at https://api.openai.com/v1.

npx tsx e2e

To run the same test suite against your local API, you can run:

OPENAI_API_BASE_URL='http://127.0.0.1:3000' npx tsx e2e

It's pretty cool to see both test suites running the exact same Assistants code using the official OpenAI Node.js client – without any noticeable differences between the two versions. Huzzah! 🥳
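
For orientation, a function tool such as get_weather could be declared and dispatched roughly as follows. This is an illustrative sketch rather than the contents of the e2e script: the parameter schema, the canned weather data, and the `handleToolCalls` helper are assumptions, while the tool call and tool output shapes follow the Assistants API.

```typescript
// Tool definition passed in the assistant's `tools` array.
const getWeatherTool = {
  type: 'function' as const,
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a location',
    parameters: {
      type: 'object',
      properties: {
        location: { type: 'string', description: 'City name, e.g. "Paris"' }
      },
      required: ['location']
    }
  }
}

interface ToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string } // arguments is a JSON string
}

interface ToolOutput {
  tool_call_id: string
  output: string
}

// Hypothetical local implementation; a real one would query a weather service.
function getWeather(args: { location: string }): string {
  return JSON.stringify({ location: args.location, temperature: 20, unit: 'celsius' })
}

// Map each requested tool call to an output the run can be resumed with.
function handleToolCalls(toolCalls: ToolCall[]): ToolOutput[] {
  return toolCalls.map((call) => {
    if (call.function.name !== getWeatherTool.function.name) {
      return { tool_call_id: call.id, output: `Unknown function: ${call.function.name}` }
    }
    const args = JSON.parse(call.function.arguments) as { location: string }
    return { tool_call_id: call.id, output: getWeather(args) }
  })
}
```

When a run reaches the `requires_action` status, the pending calls are found under `run.required_action.submit_tool_outputs.tool_calls`, and the returned array is what gets sent to the `submit_tool_outputs` route listed under Server routes.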

Retrieval Tool Example

This example contains an end-to-end assistant script which uses the built-in retrieval tool with this readme.md file as an attachment.

You can run it using the official openai client for Node.js against the default OpenAI API hosted at https://api.openai.com/v1.

npx tsx e2e/retrieval.ts

To run the same test suite against your local API, you can run:

OPENAI_API_BASE_URL='http://127.0.0.1:3000' npx tsx e2e/retrieval.ts

The output will likely differ slightly due to differences in OpenAI's built-in retrieval implementation and our default, naive retrieval implementation.

Note that the current retrieval implementation only supports text files like text/plain and markdown, as no preprocessing or conversions are done at the moment. We also use a very naive retrieval method which always returns the full file contents, as opposed to pre-processing them and only returning the most semantically relevant chunks. See this issue for more info.
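
The difference between the two approaches can be sketched in a few lines. Neither function below is the project's code: `retrieveNaive` mirrors the behavior described above, and `retrieveTopChunks` is a deliberately crude illustration that ranks paragraphs by word overlap with the query, standing in for a real embedding-based ranking.

```typescript
// Naive strategy: hand back the entire file as a single result.
function retrieveNaive(fileContents: string): string[] {
  return [fileContents]
}

// Lowercased set of alphanumeric words in a piece of text.
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? [])
}

// Illustrative alternative: split on blank lines, score each chunk by how many
// query words it contains, and keep the k best chunks with a non-zero score.
function retrieveTopChunks(fileContents: string, query: string, k = 2): string[] {
  const queryTerms = tokenize(query)
  return fileContents
    .split(/\n\s*\n/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      const terms = tokenize(chunk)
      let score = 0
      queryTerms.forEach((term) => {
        if (terms.has(term)) score++
      })
      return { chunk, score }
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk)
}
```

The naive version costs nothing to implement but sends the whole file into the model's context on every call, while a chunked version trades extra preprocessing for smaller, more focused prompts.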

Server routes

GET       /files
POST      /files
DELETE    /files/:file_id
GET       /files/:file_id
GET       /files/:file_id/content
GET       /assistants
POST      /assistants
GET       /assistants/:assistant_id
POST      /assistants/:assistant_id
DELETE    /assistants/:assistant_id
GET       /assistants/:assistant_id/files
POST      /assistants/:assistant_id/files
DELETE    /assistants/:assistant_id/files/:file_id
GET       /assistants/:assistant_id/files/:file_id
POST      /threads
GET       /threads/:thread_id
POST      /threads/:thread_id
DELETE    /threads/:thread_id
GET       /threads/:thread_id/messages
POST      /threads/:thread_id/messages
GET       /threads/:thread_id/messages/:message_id
POST      /threads/:thread_id/messages/:message_id
GET       /threads/:thread_id/messages/:message_id/files
GET       /threads/:thread_id/messages/:message_id/files/:file_id
GET       /threads/:thread_id/runs
POST      /threads/runs
POST      /threads/:thread_id/runs
GET       /threads/:thread_id/runs/:run_id
POST      /threads/:thread_id/runs/:run_id
POST      /threads/:thread_id/runs/:run_id/submit_tool_outputs
POST      /threads/:thread_id/runs/:run_id/cancel
GET       /threads/:thread_id/runs/:run_id/steps
GET       /threads/:thread_id/runs/:run_id/steps/:step_id
GET       /openapi

You can view the server's auto-generated openapi spec by running the server and then visiting http://127.0.0.1:3000/openapi
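
A route table like the one above can be derived from an OpenAPI document. The sketch below assumes the /openapi endpoint returns a JSON document with the standard `paths` object; `listRoutes` and the minimal `OpenApiDocument` type are hypothetical helpers. Note that OpenAPI writes path parameters as `{file_id}` rather than `:file_id`.

```typescript
// Minimal slice of an OpenAPI document: path -> (method or other key) -> definition.
interface OpenApiDocument {
  paths: Record<string, Record<string, unknown>>
}

const HTTP_METHODS = ['get', 'post', 'put', 'patch', 'delete']

// Flatten the paths object into "METHOD /path" lines, skipping non-method keys
// such as a path-level "parameters" entry.
function listRoutes(spec: OpenApiDocument): string[] {
  const routes: string[] = []
  for (const path of Object.keys(spec.paths)) {
    for (const method of Object.keys(spec.paths[path])) {
      if (HTTP_METHODS.indexOf(method) !== -1) {
        routes.push(`${method.toUpperCase()} ${path}`)
      }
    }
  }
  return routes
}
```

With the server running, the input could be obtained with something like `await (await fetch('http://127.0.0.1:3000/openapi')).json()`, and the same function could be pointed at OpenAI's official spec to compare the two route sets.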

TODO

Status: All API routes have been tested side-by-side with the official OpenAI API and are working as expected. The only missing features at the moment are support for the built-in code_interpreter tool (issue) and support for non-text files with the built-in retrieval tool (issue). All other functionality should be fully supported and wire-compatible with the official API.

TODO:

hosted demo (bring your own OpenAI API key?)
get hosted redis working
handle locking thread and messages
- not sure how this works exactly, but according to the OpenAI Assistants Guide, threads are locked while runs are being processed
built-in code_interpreter tool (issue)
support non-text files w/ built-in retrieval tool (issue)
OpenAI uses prefixed IDs for its resources, which would be great, except it's a pain to get working with Prisma (issue)
figure out why localhost resolution wasn't working for #6
handle context overflows (truncation for now)

License

If you found this project useful, please consider sponsoring me or following me on Twitter.
