openai/openai-agents-python

Clarification on model_config vs run_config and configuration issues in RealtimeSession / RealtimeRunner

Closed issue · 1 comment

Please read this first

  • Have you read the docs? Agents SDK docs: yes
  • Have you searched for related issues? Others may have had similar requests: yes

Question

I have some questions and issues regarding how configuration is passed into RealtimeSession and RealtimeRunner.

Difference between model_config and run_config
When creating a session like:

session = RealtimeSession(
    model=mock_model,
    agent=mock_agent,
    context=None,
    model_config={},
    run_config={},
)

What is the intended difference between model_config and run_config?

Are there recommended examples of what belongs to each?

Configuration at session vs runner level
I also noticed that configs can be passed both in RealtimeSession and in RealtimeRunner:

runner = RealtimeRunner(starting_agent=agent)
session_context = await runner.run(model_config=model_config_settings)

What is the practical implication of injecting configs in RealtimeSession vs RealtimeRunner?

Turn detection + modalities error
While experimenting with semantic_vad turn detection for the voice "marin", I ran into this error:

{
  "type": "error",
  "error": "RealtimeError(message=\"Invalid modalities: ['text', 'audio']. Supported combinations are: ['text'] and ['audio'].\", type='invalid_request_error', code='invalid_value', event_id=None, param='session.output_modalities')"
}

However, I explicitly declare:

"modalities": ["text", "audio"]

There are other errors when I include parameters like:

  • threshold
  • prefix_padding_ms
  • silence_duration_ms

(which are documented in RealtimeTurnDetectionConfig).

Missing parameters?
I also noticed that the parameter input_audio_noise_reduction (mentioned in some docs/examples) seems to be missing from the current config. Should this still be supported?

Steps to Reproduce

Create a RealtimeRunner and run with a model_config that includes turn detection settings (semantic_vad).

Declare "modalities": ["text", "audio"].

Add threshold, prefix_padding_ms, or silence_duration_ms.

Observe the error above.

Expected behavior

Clear distinction between model_config and run_config.

Ability to pass turn detection config (e.g. semantic_vad) with modalities: ["text", "audio"] without errors.

Support or clarification for parameters like input_audio_noise_reduction.

Actual behavior

Error with "Invalid modalities: ['text', 'audio']".

Some parameters appear ignored or unsupported.

Environment

openai-agents-python 0.3.0

Python version: 3.13.7

Additional context
It would be helpful to have a small reference table or doc snippet clarifying which configs belong to model_config vs run_config, and which are supported in RealtimeSession vs RealtimeRunner.

Thanks for asking the questions.

What is the intended difference between model_config and run_config?

When you use RealtimeSession directly, the names can indeed be a bit confusing. The model_config provides the information needed to establish a connection with OpenAI, such as the API key and endpoint URL. The run_config is the configuration that RealtimeRunner passes along when you use a runner; it carries other details such as model parameters and guardrails. Please check the properties of the config classes; if any of them are ones you want to customize, you can pass them as well.
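
As a rough illustration of that split, here is a minimal sketch using plain dictionaries. The key names (`api_key`, `url`, `model_settings`, `output_guardrails`), the placeholder values, and the helper `build_configs` are assumptions made for this example, not a confirmed listing of the SDK's config classes; check the actual class properties before relying on them.

```python
from typing import Any


def build_configs(
    api_key: str, url: str, voice: str
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Hypothetical helper: keep connection details and run details apart."""
    # Connection-level information (what model_config is described as carrying).
    model_config: dict[str, Any] = {
        "api_key": api_key,  # assumed key name
        "url": url,          # assumed key name
    }
    # Run-level details (what run_config is described as carrying).
    run_config: dict[str, Any] = {
        "model_settings": {"voice": voice},  # assumed key names
        "output_guardrails": [],             # assumed key name
    }
    return model_config, run_config


model_config, run_config = build_configs(
    api_key="sk-placeholder",              # placeholder, not a real key
    url="wss://example.invalid/realtime",  # placeholder endpoint
    voice="marin",
)
print(sorted(model_config))
print(sorted(run_config))
```

Keeping the two dictionaries separate makes it easier to see which values concern the connection and which concern the run itself.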

"modalities": ["text", "audio"]

This is a breaking change introduced with the Realtime GA migration. Unlike the beta models, the GA model (gpt-realtime) no longer accepts both text and audio for this parameter, as the error message indicates. We generally recommend passing ["audio"]; you can still receive transcriptions of the input and output audio, so technically there is no feature gap. Only the initialization options changed.
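
A minimal sketch of a client-side check that mirrors the rule stated in the error message (only ['text'] or ['audio'] is accepted). The helper name `check_output_modalities` is hypothetical and not part of the SDK; it only shows how a setting could be validated before it is sent.

```python
def check_output_modalities(modalities: list[str]) -> list[str]:
    """Hypothetical pre-flight check mirroring the server's error message."""
    # The two combinations the error message lists as supported.
    supported = (["text"], ["audio"])
    if modalities not in supported:
        raise ValueError(
            f"Invalid modalities: {modalities}. "
            "Supported combinations are: ['text'] and ['audio']."
        )
    return modalities


# The recommended value passes through unchanged.
print(check_output_modalities(["audio"]))

# The beta-style value is rejected locally instead of by the server.
try:
    check_output_modalities(["text", "audio"])
except ValueError as exc:
    print(f"rejected: {exc}")
```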