openai/openai-realtime-api-beta

Assistant is very difficult to correct

Opened this issue · 1 comment

mrkww commented

I am "talking" to the realtime API via g711_alaw audio streaming (with server_vad).
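For readers who want to reproduce this setup, here is a minimal sketch of a `session.update` event that selects g711_alaw audio and server-side voice activity detection. The field names follow the Realtime API beta event format as I understand it; the helper name `build_session_update` is hypothetical, and this is not necessarily the exact configuration used here.

```python
import json


def build_session_update(audio_format: str = "g711_alaw") -> dict:
    """Build a session.update event (hypothetical helper, for illustration only)."""
    return {
        "type": "session.update",
        "session": {
            # Use the same codec for audio sent to and received from the model.
            "input_audio_format": audio_format,
            "output_audio_format": audio_format,
            # Let the server detect when the speaker starts and stops talking.
            "turn_detection": {"type": "server_vad"},
        },
    }


event = build_session_update()
# The serialized event would be sent as a text frame over the realtime WebSocket.
print(json.dumps(event, indent=2))
```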

I noticed that the assistant is very difficult to correct. It sticks to its initial assumption and is hardly willing to deviate from it. I speak German and the assistant answers in German; I have translated the following examples for you.

Examples:
I introduce myself with "Hi, my name is Mark, how are you?"
A: "Hi Marco, I am fine."
Me: "No, I am Mark."
A: "Ok, hi Marco."
Me: "My name is Mark."
A: "Of course, Marco."
Me: "Please correct yourself: my name is Mark"
A: "Oh, of course. Marco."

I also noticed that once the assistant tries to end the conversation, it is no longer possible to steer it away from this:

A: "Feel free to call me again. Bye Bye"
Me: "No, I don't want to end the call."
A: "Sure. Bye Bye and have a great day."
Me: "No, please answer more questions."
A: "It was nice talking to you. Goodbye."

Is this known behaviour? Can it be adjusted somehow?

For your second issue, you probably need to make it clear in the prompt that the conversation is taking place in a call.
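As a rough sketch of that idea, the instructions can be set with a `session.update` event. The wording of the instructions and the helper name `build_instructions_update` are my own assumptions, and I have not verified that this actually changes the behaviour described above.

```python
import json

# Hypothetical prompt text; adjust the wording to your use case.
CALL_INSTRUCTIONS = (
    "You are speaking with a caller on a live phone call. "
    "If the caller corrects you, for example about their name, accept the "
    "correction and use exactly what they said. "
    "Do not say goodbye or end the call unless the caller asks to end it."
)


def build_instructions_update(instructions: str = CALL_INSTRUCTIONS) -> dict:
    """Build a session.update event that only sets the session instructions."""
    return {
        "type": "session.update",
        "session": {"instructions": instructions},
    }


# Serialize the event; this string would be sent over the realtime WebSocket.
payload = json.dumps(build_instructions_update())
print(payload)
```

If the assistant still keeps ending the call, stating the same rule in both the user's language and English might be worth trying, but that is a guess on my part.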