Incorrect Realtime Item Type and Missing Audio Output with tool_choice: {type: "function", name: "my_function"}
Expected Behavior:
When calling client.updateSession({tool_choice: {type: "function", name: "list_emails"}}), the function list_emails should be invoked correctly, similar to the behavior when using client.updateSession({tool_choice: "auto"}). This includes:
- A realtime.item of type function_call being generated with the correct arguments.
- A function_call_output type item returned with the processed data.
- Audio output in response to the function's execution.
Here is an example of the working case, using tool_choice: "auto":
Initial User Message:
{
"id": "item_AY81IkN6zZ0We2KwXkd8B",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_text",
"text": "Hello! Check my last email and reply to it."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
AI Audio Response:
{
"id": "item_AY81IaiSORg9pH7Vr0WEr",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "audio",
"transcript": "Sure! I'll check your latest email and get a reply ready. Please give me a moment."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
Function Call:
{
"id": "item_AY81KpMatEHuAxWRNIwLa",
"object": "realtime.item",
"type": "function_call",
"status": "completed",
"name": "list_emails",
"call_id": "call_I14Ukrd5RS2gtfY8",
"arguments": "{\"numEmails\":1}",
"formatted": {
"audio": {},
"text": "",
"transcript": "",
"tool": {
"type": "function",
"name": "list_emails",
"call_id": "call_I14Ukrd5RS2gtfY8",
"arguments": "{\"numEmails\":1}"
}
}
}
Function Call Output:
{
"id": "item_AY81MM9OwvmT4v8KAuQbx",
"object": "realtime.item",
"type": "function_call_output",
"call_id": "call_I14Ukrd5RS2gtfY8",
"output": " [data retrieved] ",
"formatted": {
"audio": {},
"text": "",
"transcript": "",
"output": " [data retrieved] "
},
"status": "completed"
}
Incorrect Behavior:
Using client.updateSession({tool_choice: {type: "function", name: "list_emails"}}), the following items are produced instead:
Initial User Message:
{
"id": "item_AY8B7XZfNVik3T9v6vJ7C",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_text",
"text": "Hello! Check my last email and reply to it."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
Arguments Passed as a message Item Instead of a function_call Item (numEmails = 1):
{
"id": "item_AY8B7KuSqtPRwh1Y7HStO",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"numEmails\":1}"
}
],
"formatted": {
"audio": {},
"text": "{\"numEmails\":1}",
"transcript": ""
}
}
Repeated Argument Item (Still Incorrect Type):
{
"id": "item_AY8B8DN1mW7Rgp0r3dEbY",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"numEmails\":1}"
}
],
"formatted": {
"audio": {},
"text": "{\"numEmails\":1}",
"transcript": ""
}
}
Missing Audio Output:
Unlike the auto case, no audio output is generated.
Summary of the Issue:
When forcing a function call with client.updateSession({tool_choice: {type: "function", name: "list_emails"}}):
- No realtime.item of type function_call is generated.
- The arguments appear as the text of an assistant message item instead of in a function_call item, and that message item is emitted twice.
- No audio output is generated, unlike with tool_choice: "auto".
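The symptoms listed above can be checked programmatically against the conversation items. The following is a minimal sketch, assuming the items have the same shape as the dumps in this report; diagnoseItems and looksLikeJsonObject are hypothetical helper names and are not part of the Realtime client library.

```javascript
// Hypothetical diagnostic helpers (not part of the Realtime client library).
// Assumes items shaped like the realtime.item dumps shown in this report.

// Returns true when the text parses as a JSON object, e.g. "{\"numEmails\":1}".
function looksLikeJsonObject(text) {
  try {
    const value = JSON.parse(text);
    return typeof value === "object" && value !== null && !Array.isArray(value);
  } catch {
    return false;
  }
}

// Summarizes which of the expected items are present for a given tool name.
function diagnoseItems(items, toolName) {
  const report = {
    hasFunctionCall: false,       // a function_call item for toolName exists
    hasFunctionCallOutput: false, // a function_call_output item exists
    hasAudio: false,              // an assistant message carries audio content
    misroutedArguments: [],       // assistant text parts that look like tool arguments
  };

  for (const item of items) {
    if (item.type === "function_call" && item.name === toolName) {
      report.hasFunctionCall = true;
    } else if (item.type === "function_call_output") {
      report.hasFunctionCallOutput = true;
    } else if (item.type === "message" && item.role === "assistant") {
      for (const part of item.content ?? []) {
        if (part.type === "audio") {
          report.hasAudio = true;
        } else if (part.type === "text" && looksLikeJsonObject(part.text)) {
          report.misroutedArguments.push({ id: item.id, text: part.text });
        }
      }
    }
  }
  return report;
}
```

Run against the three items from the failing case, this should report no function_call, no audio, and two entries in misroutedArguments; run against the four items from the "auto" case, it should report all three expected items and no misrouted arguments.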
How can I fix this?
As a workaround, I am currently forcing the tool call through prompting:
clientRef.current.sendUserMessageContent([
{
type: 'input_text',
text: "You must execute the function 'my function' now.",
},
]);
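To avoid hard-coding the function name in the prompt, the workaround above could be wrapped in a small helper. This is only a sketch: buildForceToolPrompt is a hypothetical name, and the prompt wording is taken from the snippet above rather than from any documented API behavior.

```javascript
// Hypothetical helper: builds the content array used by the prompting
// workaround, substituting the name of the tool that should be called.
function buildForceToolPrompt(toolName) {
  if (typeof toolName !== "string" || toolName.trim() === "") {
    throw new TypeError("toolName must be a non-empty string");
  }
  return [
    {
      type: "input_text",
      text: `You must execute the function '${toolName}' now.`,
    },
  ];
}

// Intended usage with the client from this report (clientRef is assumed
// to exist in the surrounding component and is not defined here):
//   clientRef.current.sendUserMessageContent(buildForceToolPrompt("list_emails"));
```

Prompting like this only asks the model to call the tool; unlike tool_choice, it does not guarantee that the call happens.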