[FEATURE REQ] GPT-5 new reasoning effort and verbosity level parameters to set in response client
Describe the feature or improvement you are requesting
Please add support for setting the new GPT-5 reasoning effort and verbosity parameters on the Response client, as described here: https://platform.openai.com/docs/guides/latest-model?lang=python
Additional context
This would help avoid working with raw JSON and HTTP calls, or even worse, switching the entire thing to Python : )
Hi @mjmustafaev. Thanks for reaching out and we regret that you're experiencing difficulties. Because the GPT-5 model is brand new, the REST API specification on which this library is based has not yet been updated to include the new properties. Once the spec catches up, we will be able to regenerate the client and include them here.
In the meantime, you can include the new parameters that you referenced and continue to use the strongly typed models by influencing serialization, which would look something like:
var client = new OpenAIResponseClient("gpt-5", "<< YOUR API KEY >>");
// Force an additional member into the options properties bag.
var options = ((IJsonModel<ResponseCreationOptions>)new ResponseCreationOptions())
    .Create(BinaryData.FromObjectAsJson(new
    {
        reasoning = new { effort = "minimal" },
        text = new { verbosity = "low" },
    }),
    ModelReaderWriterOptions.Json);
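// (Optional sanity check, a sketch beyond the original workaround.) Round-trip the options back
// to JSON to confirm the injected "reasoning" and "text" members were retained; this assumes
// ModelReaderWriter from System.ClientModel.Primitives is in scope.
Console.WriteLine(ModelReaderWriter.Write(options, ModelReaderWriterOptions.Json).ToString());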
// Set the desired options with the strongly-typed interface.
options.Instructions = "Always talk like a literary scholar when you answer but be brief and make puns.";
// The service call can be made as normal.
var response = await client.CreateResponseAsync(
    "Is it true that you don't know how many 'b's are in blueberry?",
    options);
Console.WriteLine(response.Value.GetOutputText());
Got it, thanks very much 🙏 @jsquire
@jsquire this is a neat trick. As a suggestion, you should add this sample code to the README under "Advanced".
It's relatively common for the REST API to introduce new properties that take a while to be implemented in this C# SDK.
Same request here. I'd rather not fool around with workarounds, and Python over my dead fingers (unless there is no other way).
@JoshLove-msft / @m-nash, this should be nicely addressed with public additional properties. You might want to use it as a validation scenario for the feature.
Is there an equivalent workaround for adding arbitrary properties to a chat completion request (as opposed to OpenAIResponseClient)?
We've switched to gpt-5 and now really need to test its lower verbosity levels through Semantic Kernel.
@jonnermut: The same pattern works across features. Each has the protocol-level overload as an escape hatch.
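For chat completions, a minimal sketch of that same seeding pattern against ChatClient follows. It assumes ChatCompletionOptions participates in the same IJsonModel/ModelReaderWriter machinery and that the Chat Completions API expects the GPT-5 knobs as top-level reasoning_effort and verbosity members; neither is confirmed in this thread, so treat it as a starting point rather than a supported sample.
using System.ClientModel.Primitives;
using OpenAI.Chat;

var chatClient = new ChatClient("gpt-5", "<< YOUR API KEY >>");

// Seed the options from raw JSON so the extra members ride along when the request is serialized.
var chatOptions = ((IJsonModel<ChatCompletionOptions>)new ChatCompletionOptions())
    .Create(BinaryData.FromObjectAsJson(new
    {
        reasoning_effort = "minimal",
        verbosity = "low",
    }),
    ModelReaderWriterOptions.Json);

// The strongly-typed call can then be made as normal.
ChatCompletion completion = await chatClient.CompleteChatAsync(
    new ChatMessage[] { new UserChatMessage("Is it true that you don't know how many 'b's are in blueberry?") },
    chatOptions);

Console.WriteLine(completion.Content[0].Text);
If the seeded members don't survive serialization on ChatCompletionOptions, the protocol-level overload mentioned above (the one accepting raw BinaryContent) remains the fallback, at the cost of leaving the strongly-typed models behind.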
I used the JSON method above and it did have an effect; the responses became faster. But as of today, this method seems to have stopped working.
@ipevo-brooks: Your comment doesn't give enough context to pin down specifics, but I'm able to execute the above workaround without issue.
Request details
{"instructions":"Always talk like a literary scholar when you answer but be brief and make puns.","model":"gpt-5","input":[{"type":"message","role":"user","content":[{"type":"input_text","text":"Is it true that you don\u0027t know how many \u0027b\u0027s are in blueberry?"}]}],"reasoning":{"effort":"minimal"},"text":{"verbosity":"low"}}Response details
{
"id": "resp_023e3f975bbb54ba0068efe9110ac081938de119288bbfd17d",
"object": "response",
"created_at": 1760553233,
"status": "completed",
"background": false,
"billing": {
"payer": "developer"
},
"error": null,
"incomplete_details": null,
"instructions": "Always talk like a literary scholar when you answer but be brief and make puns.",
"max_output_tokens": null,
"max_tool_calls": null,
"model": "gpt-5-2025-08-07",
"output": [
{
"id": "rs_023e3f975bbb54ba0068efe911c8c48193853989eb4cd800eb",
"type": "reasoning",
"summary": []
},
{
"id": "msg_023e3f975bbb54ba0068efe9120eec8193ae287920a437407b",
"type": "message",
"status": "completed",
"content": [
{
"type": "output_text",
"annotations": [],
"logprobs": [],
"text": "Ah, the blueberry\u2019s consonantal bouquet! I do know: there are two b\u2019s\u2014one at the dawn, one mid-fruit. To be brief: b-b, then the berry. Call it a double-bill of bilberry bravura."
}
],
"role": "assistant"
}
],
"parallel_tool_calls": true,
"previous_response_id": null,
"prompt_cache_key": null,
"reasoning": {
"effort": "minimal",
"summary": null
},
"safety_identifier": null,
"service_tier": "default",
"store": true,
"temperature": 1.0,
"text": {
"format": {
"type": "text"
},
"verbosity": "low"
},
"tool_choice": "auto",
"tools": [],
"top_logprobs": 0,
"top_p": 1.0,
"truncation": "disabled",
"usage": {
"input_tokens": 43,
"input_tokens_details": {
"cached_tokens": 0
},
"output_tokens": 58,
"output_tokens_details": {
"reasoning_tokens": 0
},
"total_tokens": 101
},
"user": null,
"metadata": {}
}
@jsquire
Sorry for not providing enough information.
Previously, I used the JSON approach mentioned above to reduce GPT-5’s thinking time and get faster responses. It had been working well for several weeks.
However, yesterday, without any changes to my code, my program suddenly experienced longer response times when interacting with GPT-5, similar to how it was before optimization. That’s why I raised this question here.
Currently, customers have reported slight improvements, so we are monitoring the situation. I’ll provide an update here if there’s any new development.
Thanks again for your response~