Spoken conversational experience for mobile devices, made with Expo.
I wanted to fiddle with OpenAI Whisper and GPT, and plug ElevenLabs speech synthesis on top of them. This has probably been done way better elsewhere, but I wanted to put something fun together.
This is a pretty rough POC, but it works well enough to show some initial areas for improvement:
- I want to explore a more conversational experience where the user doesn't have to press a button, but this is not as easy as it sounds. It would let the user cut off the bot while it is answering, which could lead to a pretty fun UX for role playing.
- The latency must go. There are a few avenues for this, e.g. running the models on device.
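
To make the "cut off the bot" idea a bit more concrete, here is a minimal sketch of the turn-taking logic as a small state machine. Everything in it is hypothetical and not taken from the current code: the state names, the event names, the effect names, and the `nextTurn` function are made up for illustration, and it assumes some voice-activity detection exists to report when the user starts or stops speaking.

```typescript
// Hypothetical turn-taking model for a hands-free voice loop.
// None of these names come from the project; they are assumptions for illustration.
type TurnState = "listening" | "thinking" | "speaking";

type TurnEvent =
  | { type: "userStoppedSpeaking" } // assumed to come from a voice-activity detector
  | { type: "userStartedSpeaking" } // same assumption
  | { type: "replyReady" }          // transcription, completion and synthesis are done
  | { type: "playbackFinished" };

// What the app should do as a side effect of a transition.
type TurnEffect =
  | "startPipeline"
  | "cancelPipeline"
  | "playReply"
  | "stopPlayback"
  | "none";

interface TurnStep {
  state: TurnState;
  effect: TurnEffect;
}

// Pure transition function: given the current state and an event,
// return the next state and the side effect to run.
function nextTurn(state: TurnState, event: TurnEvent): TurnStep {
  switch (state) {
    case "listening":
      if (event.type === "userStoppedSpeaking") {
        return { state: "thinking", effect: "startPipeline" };
      }
      break;
    case "thinking":
      if (event.type === "replyReady") {
        return { state: "speaking", effect: "playReply" };
      }
      // The user resumed talking before the reply was ready: drop the stale request.
      if (event.type === "userStartedSpeaking") {
        return { state: "listening", effect: "cancelPipeline" };
      }
      break;
    case "speaking":
      if (event.type === "playbackFinished") {
        return { state: "listening", effect: "none" };
      }
      // Barge-in: the user cuts the bot off mid-answer.
      if (event.type === "userStartedSpeaking") {
        return { state: "listening", effect: "stopPlayback" };
      }
      break;
  }
  // Any other combination is ignored.
  return { state, effect: "none" };
}
```

Keeping the transitions in a pure function means the barge-in behavior can be tested without a microphone or any network calls. The hard part this sketch leaves out is detecting speech reliably while the bot's own audio is playing.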