This backend repo demonstrates how to start a WebSocket server that the Retell server connects to. Retell sends live transcripts and other updates to this server and receives responses from it. See the API Docs for a walkthrough.
The protocol of messages we send and expect to receive is documented here.
This repo also contains code that uses Twilio to get numbers, set up inbound calls, and make phone calls. See the API Docs for a walkthrough.
This repo contains integrations for Azure OpenAI, OpenAI, and OpenRouter. Modify the import inside src/server.ts to choose which one to use.
Check out this YouTube tutorial, which walks through using the Frontend Demo together with this repo.
- Add your Retell API key and your LLM API key (Azure OpenAI / OpenAI / OpenRouter) to ".env.development". Optionally add your Twilio credentials if you want to use the phone call features.
  - Azure OpenAI is pretty fast and stable: guide for setup
  - OpenAI is the most widely used one, although the latency can vary.
  - OpenRouter lets you choose among many open source AI models.
- Install dependencies:
  npm install
- In another terminal, use ngrok to expose port 8080 to the public network:
  ngrok http 8080
- Start the server:
  npm run dev
In the ngrok output you should see a forwarding address like https://dc14-2601-645-c57f-8670-9986-5662-2c9a-adbd.ngrok-free.app. Take the hostname from that address (it is a hostname, not an IP address), replace the https scheme with wss, append the llm-websocket path, and use the resulting URL in the dashboard to create a new agent. The agent you created should now connect to your localhost.
The custom LLM URL would look like
wss://dc14-2601-645-c57f-8670-9986-5662-2c9a-adbd.ngrok-free.app/llm-websocket
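The conversion from the ngrok forwarding address to the custom LLM URL can be sketched as a small helper. This is only an illustration: the function name `toCustomLlmUrl` is hypothetical and not part of this repo, and the only values taken from this README are the `wss` scheme and the `llm-websocket` path.

```typescript
// Hypothetical helper (not part of this repo): turn an ngrok forwarding
// address into the WebSocket URL to paste into the dashboard.
function toCustomLlmUrl(forwardingAddress: string, path: string = "llm-websocket"): string {
  // Parse the address so only the host (and port, if any) is reused.
  const { protocol, host } = new URL(forwardingAddress);
  // https maps to wss; plain http maps to ws.
  const scheme = protocol === "http:" ? "ws" : "wss";
  // Drop leading slashes so the result never contains "//" before the path.
  const cleanPath = path.replace(/^\/+/, "");
  return `${scheme}://${host}/${cleanPath}`;
}

console.log(toCustomLlmUrl("https://dc14-2601-645-c57f-8670-9986-5662-2c9a-adbd.ngrok-free.app"));
```

Parsing with `URL` instead of string replacement means a trailing slash or an extra path on the forwarding address does not leak into the result.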
src/twilio_api.ts contains helper functions you can use to create phone numbers, tie an agent to a number, make a phone call with an agent, and so on. This section assumes you have already created an agent in the previous step and have its agent id ready.
To use these features, follow these steps:
- Uncomment the Twilio client initialization and ListenTwilioVoiceWebhook(this.app) in the src/server.ts file to set up the Twilio voice webhook. Every time one of your Twilio numbers gets called, Twilio calls this webhook, which internally calls the register-call API and sends the correct audio websocket address back to Twilio so that it can connect to Retell and start the call.
- Put your ngrok address into .env.development. It would be something like https://dc14-2601-645-c57f-8670-9986-5662-2c9a-adbd.ngrok-free.app.
- (optional) Call CreatePhoneNumber to get a new number and associate it with an agent id. This phone number can now handle inbound calls as long as this server is running.
- (optional) Call RegisterPhoneAgent to register your Twilio number and associate it with an agent id. This phone number can now handle inbound calls as long as this server is running.
- (optional) Call DeletePhoneNumber to release a number from your Twilio pool.
- (optional) Call TransferCall to transfer an ongoing call to a destination number.
- (optional) Call EndCall to end an ongoing call.
- Call CreatePhoneCall to start a call with a caller number, a callee number, and your agent id. This call uses the agent id supplied and ignores the agent id you associated with the number via CreatePhoneNumber or RegisterPhoneAgent. It automatically hangs up if a machine, voicemail, or IVR is detected. To turn this off, remove the "machineDetection, asyncAmd" params.
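The voice webhook described in the steps above answers Twilio with an audio websocket address. A minimal sketch of what such a response could look like, assuming it uses Twilio's `<Connect><Stream>` TwiML verb: the helper names and the example URL below are hypothetical, and the actual code in src/twilio_api.ts may build the response differently (for example with the Twilio SDK).

```typescript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build TwiML that tells Twilio to stream the call
// audio to the given WebSocket address.
function buildStreamTwiml(audioWebsocketUrl: string): string {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    `<Stream url="${escapeXmlAttr(audioWebsocketUrl)}"/>` +
    "</Connect></Response>"
  );
}

// Placeholder address for illustration only; the real address comes from
// the register-call API response.
console.log(buildStreamTwiml("wss://example.com/audio-websocket/some-call-id"));
```

In a webhook handler, the returned string would be sent back to Twilio with a `text/xml` content type.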
To run in prod, you probably want to customize your LLM solution, host the code in the cloud, and use that server's address to create the agent.