openai/openai-agents-python

Cedar and Marin, gpt-realtime new voices ignore the realtime agent instructions.

Closed this issue · 5 comments

  • Have you read the docs? Agents SDK docs
  • Yes
  • Have you searched for related issues? Others may have faced similar issues.
  • Yes

Description

When you use the Cedar or Marin voice with a realtime agent, the agent's instructions are ignored.

Debug information

  • Agents SDK version: (e.g. v0.0.3)
  • Python version (e.g. Python 3.10)

@seratch Hi, could you please look into this and advise?

Thanks for asking the question. I don't think there is any difference among voices, but we can check this later on. To reproduce the situation quickly, could you share more details, such as which version of this SDK you're using and a simple code snippet showing how you set up your realtime agents? Complete repro steps would be the most helpful for us, but sharing anything would be appreciated.

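openai==1.107.2
openai-agents==0.3.0

from typing import Optional, Tuple

from agents.realtime import RealtimeAgent, RealtimeRunConfig, RealtimeRunner

def get_primary_agent() -> Tuple[RealtimeAgent, Optional[RealtimeRunConfig]]: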
    """Return the main primary agent and optional config."""
    agent = RealtimeAgent(
        name="Sarah Chen",
        instructions="""
        You are a friendly and efficient voice assistant designed to help users manage their day. Always respond in a clear, conversational tone, keeping answers under 30 seconds. If the user asks a question outside your scope, politely redirect them with a helpful suggestion. When confirming actions (like setting reminders, sending messages, or providing directions), repeat the key details back to ensure accuracy before completing the task. Stay concise, proactive, and approachable.
        """,
        tools=[]
    )

    config = RealtimeRunConfig(
        model_settings={
            "model_name": "gpt-realtime",
            "voice": "alloy", # @seratch this works fine, but when i change this to "Cedar" or "Marin", the new voices by openai, my instructions are ignored.
            "modalities": ["audio"],
            "turn_detection": {
                "type": "server_vad",
                "threshold": 0.5,
                "prefix_padding_ms": 300,
                "silence_duration_ms": 500
            }
        }
    )
    return agent, config

agent, config = get_primary_agent()
runner = RealtimeRunner(starting_agent=agent, config=config)
context = await runner.run()
session = await context.__aenter__()

"Cedar" and "Marin" (capitalized) are invalid voice names, so the session.update operation can fail, and I suspect that is why your instructions are not applied. I confirmed that the agent does not follow the given instructions with this setting. In this scenario, your app should receive an error event like the following:

{
  "type": "error",
  "error": "RealtimeError(message=\"Invalid value: 'Cedar'. Supported values are: 'alloy', 'ash', 'ballad', 'coral', 'echo', 'sage', 'shimmer', 'verse', 'marin', and 'cedar'.\", type='invalid_request_error', code='invalid_value', event_id=None, param='session.audio.output.voice')"
}

Instead, please use a valid voice name such as "cedar" or "marin" (all lowercase letters).
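
To catch this kind of mistake before the session starts, one option is to normalize and check the voice name on the client side. The following is only a minimal sketch: `SUPPORTED_VOICES` and `normalize_voice` are hypothetical names that are not part of the SDK, and the list of voices is copied from the error message above, so it may not match what the API accepts at any given time.

```python
# Hypothetical helper, not part of openai-agents.
# The voice list is copied from the error message in this thread;
# treat it as an assumption, since the API may accept a different set.
SUPPORTED_VOICES = (
    "alloy", "ash", "ballad", "coral", "echo",
    "sage", "shimmer", "verse", "marin", "cedar",
)


def normalize_voice(name: str) -> str:
    """Lowercase and trim a voice name, raising ValueError if it is unknown."""
    voice = name.strip().lower()
    if voice not in SUPPORTED_VOICES:
        raise ValueError(
            f"Unsupported voice {name!r}; expected one of {', '.join(SUPPORTED_VOICES)}"
        )
    return voice
```

With this in place, the config above could use "voice": normalize_voice("Cedar"), which resolves to "cedar" and fails early with a clear message for a name that is not in the list.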

So dumb of me, thanks man!