Real-time speech-to-text conversion

Question

surendransuri opened this issue 8 months ago · 2 comments

Description of the feature request:

Hi, I want to use the newest multimodal Gemini 1.5 Pro model. I'm looking for a feature similar to the one in the Vertex AI playground, where speech is converted to text in real time. Right now, the Gemini API examples only run inference in batch after a file has been uploaded, but I need it to happen in real time. Can you help me figure out how to do this? Thanks a lot!

What problem are you trying to solve with this feature?

No response

Any other information you'd like to share?

No response

Answer 1 · 2024-05-07T18:28:08.000Z

Hi surendransuri, thanks for your question. Google AI Studio and the Gemini API do not support audio streaming right now. You can ask more about the best ways to work with audio on https://discuss.ai.google.dev/.

Answer 2 · 2024-06-11T02:21:42.000Z

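You can just use the Web Speech API that is built into Google Chrome. Note that speechSynthesis is the text-to-speech half of that API; for speech-to-text you want SpeechRecognition (exposed as webkitSpeechRecognition in Chrome).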
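
For anyone landing here, this is a rough sketch of what browser-side live transcription could look like with the Web Speech API's recognition interface (the part of the API that turns speech into text). It is not a Gemini API feature and has nothing to do with how the Vertex AI playground works internally. The helper names collectTranscript and startLiveTranscription are made up for illustration, the local interfaces are minimal stand-ins rather than the official type definitions, and the "en-US" language is an assumption.

```typescript
// Minimal local types so the sketch does not depend on the TypeScript DOM lib
// shipping Web Speech API recognition types (assumption: it may not).
interface RecognitionAlternative { transcript: string; }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative; }
interface RecognitionEvent {
  resultIndex: number;                    // first result that changed in this event
  results: ArrayLike<RecognitionResult>;  // all results so far in this session
}

// Hypothetical helper: split the changed results into finalized and interim text.
function collectTranscript(event: RecognitionEvent): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const text = result[0].transcript;    // top-ranked alternative
    if (result.isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText, interimText };
}

// Hypothetical helper: start recognition and return a stop function,
// or null when the browser does not expose the recognition interface.
function startLiveTranscription(
  onText: (finalText: string, interimText: string) => void
): (() => void) | null {
  const g = globalThis as any;
  const Ctor = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Ctor) {
    return null;
  }
  const recognition = new Ctor();
  recognition.continuous = true;          // keep listening across pauses
  recognition.interimResults = true;      // emit partial text while the user speaks
  recognition.lang = "en-US";             // assumed language
  recognition.onresult = (event: RecognitionEvent) => {
    const { finalText, interimText } = collectTranscript(event);
    onText(finalText, interimText);
  };
  recognition.onerror = (event: { error: string }) => {
    console.error("Speech recognition error:", event.error);
  };
  recognition.start();                    // the browser asks for microphone permission
  return () => recognition.stop();
}
```

Calling startLiveTranscription((finalText, interimText) => console.log(finalText, interimText)) from a page in Chrome should start printing text as you speak; it returns null in environments without the API, so check the return value before relying on it.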