openai-realtime-api

TypeScript client for OpenAI's realtime voice API.

Features

Install

npm install openai-realtime-api

This package is ESM-only. It requires Node.js >= 18, a browser environment, or an equivalent JS runtime (Deno, Bun, Cloudflare Workers, etc.).

Usage

Important

All usage and events are fully compatible with OpenAI's JS client (@openai/realtime-api-beta). Aside from bug fixes, the main difference is that all events are fully typed.

import { RealtimeClient } from 'openai-realtime-api'

// Create a new client; all params are optional; apiKey defaults to the
// `OPENAI_API_KEY` environment variable (when using Node.js).
const client = new RealtimeClient({
  sessionConfig: {
    instructions: 'You are a great, upbeat friend.',
    voice: 'alloy'
  }
})

// Can change session config ahead of connecting.
client.updateSession({
  turn_detection: null,
  input_audio_transcription: { model: 'whisper-1' }
})

// Example of custom event handling
client.on('conversation.updated', (event) => {
  // All events are fully-typed based on the event name.
  // In this case, `event` will have the type `RealtimeCustomEvents.ConversationUpdatedEvent`
  const { item, delta } = event

  // Access the full list of conversation items.
  const items = client.conversation.getItems()
})

// Connect to the Realtime API.
await client.connect()

// Send a text message and trigger a response generation.
client.sendUserMessageContent([{ type: 'input_text', text: 'How are you?' }])

// Wait for a completed response from the model.
// (`event` will be of type `RealtimeServerEvents.ResponseDoneEvent`)
const event = await client.realtime.waitForNext('response.done')
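
When waiting on a response, it can help to bound how long the wait lasts so a dropped connection does not hang the caller. The sketch below is a generic promise timeout wrapper. `withTimeout` is a hypothetical helper, not part of this package, and it works independently of any timeout support the client itself may offer.

```typescript
// Hypothetical helper (not part of openai-realtime-api): rejects if `promise`
// does not settle within `ms` milliseconds.
function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label = 'operation'
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Reject once the deadline passes.
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`))
    }, ms)

    // Whichever way the wrapped promise settles, clear the timer first.
    promise.then(
      (value) => {
        clearTimeout(timer)
        resolve(value)
      },
      (err) => {
        clearTimeout(timer)
        reject(err)
      }
    )
  })
}
```

With the client above, this could be used as `await withTimeout(client.realtime.waitForNext('response.done'), 30000, 'response.done')`.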

See examples for more complete demos.

See also the official OpenAI Realtime API Guide and API Reference.

For more info on usage, tools, and custom events, see OpenAI's readme. This package is compatible with OpenAI's beta package for both official and unofficial events; aside from bug fixes, the only difference is that all events are typed.
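
As an illustration of working with custom events, here is a minimal sketch that accumulates streamed transcript text per conversation item from `conversation.updated` events. The `ConversationUpdateLike` type and the `TranscriptBuffer` class are hypothetical, and the `delta.transcript` field is an assumption about the event shape; check the package's exported event types for the real definition.

```typescript
// Minimal structural type for the part of the event this sketch reads.
// ASSUMPTION: `delta.transcript` carries incremental transcript text.
interface ConversationUpdateLike {
  item: { id: string }
  delta?: { transcript?: string } | null
}

// Hypothetical helper: collects transcript deltas keyed by item id.
class TranscriptBuffer {
  private readonly parts = new Map<string, string>()

  // Appends the delta's transcript text (if any) to the item's buffer.
  push(event: ConversationUpdateLike): void {
    const text = event.delta?.transcript
    if (!text) return
    const previous = this.parts.get(event.item.id) ?? ''
    this.parts.set(event.item.id, previous + text)
  }

  // Returns the text accumulated so far for an item, or '' if none.
  get(itemId: string): string {
    return this.parts.get(itemId) ?? ''
  }
}
```

A handler along the lines of `client.on('conversation.updated', (event) => buffer.push(event))` would feed it, provided the real event type is structurally compatible with the assumed one.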

Server Usage

RealtimeClient takes in an optional apiKey which defaults to process.env.OPENAI_API_KEY.
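
If you would rather fail fast when no key is configured, the default described above can be resolved explicitly before constructing the client. This is a sketch that mirrors the documented behavior, not the package's source; `resolveApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the function stays runtime-agnostic.

```typescript
// Hypothetical helper: prefers an explicit key, then falls back to the
// OPENAI_API_KEY entry of the given environment object.
function resolveApiKey(
  explicit: string | undefined,
  env: Record<string, string | undefined>
): string {
  const key = explicit || env['OPENAI_API_KEY']
  if (!key) {
    throw new Error('Missing API key: pass `apiKey` or set OPENAI_API_KEY')
  }
  return key
}
```

In Node.js this could be called as `new RealtimeClient({ apiKey: resolveApiKey(undefined, process.env) })`.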

Browser Usage

RealtimeClient takes in an optional url which can be pointed at a relay server.

import { RealtimeClient } from 'openai-realtime-api'

// Create a browser client which points to a relay server.
const client = new RealtimeClient({ url: RELAY_SERVER_URL })

Alternatively, you can use apiKey with RealtimeClient in the browser, but you also have to pass dangerouslyAllowAPIKeyInBrowser: true.

import { RealtimeClient } from 'openai-realtime-api'

// Create a browser client which connects directly to the OpenAI realtime API
// with an unsafe, client-side API key.
const client = new RealtimeClient({
  apiKey: process.env.OPENAI_API_KEY,
  dangerouslyAllowAPIKeyInBrowser: true
})

Caution

We strongly recommend against including your API key in any client (mobile or browser). It can be useful for local testing, but for production, you should be using a relay server.
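
One way to make that recommendation hard to bypass by accident is to build the browser options in a single place. The sketch below uses only the option names mentioned in this README (`url`, `apiKey`, `dangerouslyAllowAPIKeyInBrowser`); `browserClientOptions`, its input shape, and the `ClientOptionsSketch` type are hypothetical.

```typescript
// Subset of the constructor options named in this README.
interface ClientOptionsSketch {
  url?: string
  apiKey?: string
  dangerouslyAllowAPIKeyInBrowser?: boolean
}

// Hypothetical helper: prefers a relay URL and only falls back to a
// client-side key when the caller explicitly opts in.
function browserClientOptions(input: {
  relayUrl?: string
  apiKey?: string
  allowUnsafeKey?: boolean
}): ClientOptionsSketch {
  if (input.relayUrl) {
    return { url: input.relayUrl }
  }
  if (input.apiKey && input.allowUnsafeKey) {
    return { apiKey: input.apiKey, dangerouslyAllowAPIKeyInBrowser: true }
  }
  throw new Error(
    'Provide a relay URL, or explicitly allow a client-side API key for local testing'
  )
}
```

The result can then be passed to the constructor, for example `new RealtimeClient(browserClientOptions({ relayUrl: RELAY_SERVER_URL }))`.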

Relay Server

import { RealtimeClient } from 'openai-realtime-api'
import { RealtimeRelay } from 'openai-realtime-api/node'

// Setting `relay: true` disables tool calls and directly modifying the session,
// since that will be the responsibility of the upstream client.
const client = new RealtimeClient({ relay: true })
const relay = new RealtimeRelay({ client })

relay.listen(8081)

Note that RealtimeRelay uses a different import path because it contains Node.js-specific code.

A full example is included in examples/node/relay-server.ts.

Examples

To run the included examples (requires Node.js >= 18):

  1. Clone this repo
  2. Run pnpm install
  3. Set up a .env file containing your OPENAI_API_KEY

You can set debug: true in the RealtimeClient constructor of these examples to print out the full event log.

Node.js Basic

Simple Node.js demo using the RealtimeClient which sends a text message and waits for a complete response.

Node.js Audio

Simple Node.js demo using the RealtimeClient which sends a short audio message and waits for a complete response.

Node.js Conversation

Simple Node.js demo using the RealtimeClient with a microphone and speaker to simulate a full, back-and-forth conversation from the terminal.

OpenAI Realtime Console

This example has been imported from https://github.com/openai/openai-realtime-console (at commit 6ea4dba). The only change has been to replace @openai/realtime-api-beta with openai-realtime-api and to fix a few types.

To run the realtime console example:

pnpm install
cd examples/openai-realtime-console
pnpm start

TODO

  • add an example using tools
  • add an example Next.js app
  • improve readme docs

License

MIT © Travis Fischer

If you found this project interesting, consider following me on Twitter.