This implementation demonstrates how AssemblyAI's asynchronous transcription API can be used along with LeMUR to provide a near-real-time 'LLM assistant' experience.
- `main.py` holds the logic for chunking the live stream & sending files off for transcription.
- `app.py` is a webhook server which handles the LeMUR completions & LLM logic.
- Start the Docker container (be sure to pull the pluot/nginx-rtmp image from Docker Hub first if you don't have it):
```
docker run \
    --rm -it \
    -p 1935:1935 \
    -p 9090:80 \
    --name nginx-rtmp \
    pluot/nginx-rtmp
```
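If you don't already have the image locally, you can pull it first:

```
docker pull pluot/nginx-rtmp
```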
- Start ngrok to make sure that all the ports you'll use are available. You can use the following for this:

```
ngrok start --all
```
Note that, if you want to use ngrok like this, you'll need to make sure that your ./ngrok.yml file has ports 1935, 5000, 5001, and 9090 configured correctly. That is, you should find your ngrok.yml file on your machine and make it look something like this:
```yaml
tunnels:
  rtmp:
    proto: tcp
    addr: 1935
  hls:
    proto: http
    addr: 9090
  webhook:
    proto: http
    addr: 5000
  segmentation:
    proto: http
    addr: 5001
```
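If you're not sure where this file lives, ngrok v3 can validate and locate it for you (on older v2 installs it's typically at ~/.ngrok2/ngrok.yml):

```
ngrok config check
```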
Ports 5000 and 5001 should be there by default, but you may need to add the `rtmp` tunnel. From here, you need to add the correct ngrok URLs in several areas of your application:
- the ngrok URL associated with localhost 5001 (i.e. `main.py`) needs to be entered in `/api/begin_processing.js` and `/api/get_stream_id.js`
- everything after `tcp://` (i.e. the URL pointing to localhost 1935) should be entered in `Video.js`
- the ngrok URL associated with localhost 5000 (i.e. `app.py`) should be placed at the top of `app.py`, and it should also be the URL that our `EventSource` object listens to in `Assistant.js` on the client side
[OPTIONAL]
- Open OBS or VLC to watch the live stream.
- For example, if using VLC, press Cmd+N and enter the live stream URL there, e.g. `rtmp://ngrok_tcp_address:PORT/live/lemur-assistant-room`
- `cd` into the frontend repo, install deps, and run the Next.js app with `npm run dev` or `yarn dev`.
At this step you'll also need to go into `Assistant.js` and `Video.js` and ensure that the RTMP stream URL and webhook server URL are correct. The webhook server URL should correspond to port 5000, and the RTMP server will correspond to the TCP URL you see in ngrok.
The RTMP URL is a bit more complicated to identify, but it is required for the `Video.js` component.
If your TCP URL exposed via ngrok is `tcp://8.tcp.ngrok.io:18834`, then the RTMP URL you should use is `rtmp://8.tcp.ngrok.io:18834/live/foo`. See the example URL in that component (just note that the one there now won't work out of the gate).
- Join the live stream and click 'Activate LeMUR Agent'.
- `cd` into the backend app, and install the Python deps via `pip install -r requirements.txt`.
- Make sure that you add your AssemblyAI key and webhook server URL where needed in `main.py` and `app.py`.
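For example, the values might look something like this (the names here are hypothetical; check the top of each file for where they actually live):

```python
# Hypothetical placeholders -- find the equivalent spots in main.py / app.py.
ASSEMBLYAI_API_KEY = "<your-assemblyai-key>"
WEBHOOK_URL = "https://<your-ngrok-url-for-port-5000>"  # from `ngrok start --all`
```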
- `cd` into `backend` and run `app.py`, then `main.py` (`app.py` will set the webhook URL which is used by `main.py`). Note that if you want to modify the prompt or core LeMUR logic, `app.py` is the place to do it.
- On your frontend, make sure you've joined the live stream, and click 'Activate LeMUR Agent' at the bottom right of the screen to begin processing.
If everything worked correctly, you should see LeMUR write notes and provide coaching as you talk into your live stream feed!
Here's what's happening under the hood:
- A live stream is created when you click 'Activate LeMUR Agent'. We are using the Daily.co SDK for video streaming, but any RTMP stream should work here.
- Our server running `main.py` uses ffmpeg to slice the stream into 20-second chunks and sends each chunk off to AssemblyAI for processing, along with a webhook URL associated with the current session (see the first sketch below).
- Our webhook server receives the transcription results from AssemblyAI, stores the transcript IDs in Redis, and then uses those IDs, the historical LeMUR results, and a custom prompt to generate a new assistant response.
- Using SSE, we stream that response to the client (see the second sketch below).
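To make the pipeline concrete, here is a minimal sketch of the chunk-and-submit step. It assumes ffmpeg's segment muxer and a local nginx-rtmp; the flags, chunk naming, and `/webhook` route are illustrative, so check `main.py` for the real values:

```python
import subprocess
import requests

ASSEMBLYAI_API_KEY = "<your-assemblyai-key>"
WEBHOOK_URL = "https://<your-ngrok-url-for-port-5000>/webhook"  # hypothetical route

# Slice the incoming RTMP stream into 20-second chunks without re-encoding.
subprocess.Popen([
    "ffmpeg",
    "-i", "rtmp://localhost:1935/live/lemur-assistant-room",
    "-f", "segment",
    "-segment_time", "20",  # 20-second chunks
    "-c", "copy",           # remux only, no re-encode
    "chunk_%03d.ts",
])

def submit_chunk(path):
    """Upload one chunk to AssemblyAI and request an async transcript
    that will be delivered to our webhook when it completes."""
    headers = {"authorization": ASSEMBLYAI_API_KEY}
    with open(path, "rb") as f:
        upload = requests.post(
            "https://api.assemblyai.com/v2/upload", headers=headers, data=f
        )
    requests.post(
        "https://api.assemblyai.com/v2/transcript",
        headers=headers,
        json={
            "audio_url": upload.json()["upload_url"],
            "webhook_url": WEBHOOK_URL,
        },
    )
```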
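And a sketch of the webhook-plus-SSE side, assuming Flask and a local Redis; the route names and Redis keys are made up for illustration:

```python
from flask import Flask, Response, request
import redis
import requests

ASSEMBLYAI_API_KEY = "<your-assemblyai-key>"

app = Flask(__name__)
r = redis.Redis()

@app.route("/webhook", methods=["POST"])  # hypothetical route name
def webhook():
    payload = request.get_json()
    if payload.get("status") == "completed":
        # Accumulate transcript IDs for the session.
        r.rpush("transcript_ids", payload["transcript_id"])
        ids = [i.decode() for i in r.lrange("transcript_ids", 0, -1)]
        # Ask LeMUR for a fresh assistant response over everything heard so far.
        lemur = requests.post(
            "https://api.assemblyai.com/lemur/v3/generate/task",
            headers={"authorization": ASSEMBLYAI_API_KEY},
            json={
                "transcript_ids": ids,
                "prompt": "Take notes and coach the speaker.",  # customize here
            },
        )
        # Hand the response to the SSE endpoint below via Redis pub/sub.
        r.publish("assistant", lemur.json()["response"])
    return "", 200

@app.route("/stream")  # what the EventSource in Assistant.js would listen to
def stream():
    def events():
        pubsub = r.pubsub()
        pubsub.subscribe("assistant")
        for msg in pubsub.listen():
            if msg["type"] == "message":
                yield f"data: {msg['data'].decode()}\n\n"
    return Response(events(), mimetype="text/event-stream")
```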