Unacceptable latency with 4.1

Question

Opened this issue 2 months ago · 3 comments

Describe the bug

After calling a tool, GPT-4.1 frequently takes up to 30 seconds to respond, even with a small number of tokens. Here are two examples

GPT-4.1-mini has no delay but frequently gets things wrong, so we can't use it.

This started about a month ago. Prior to that, everything worked fine.

This is completely unacceptable for a paid product. I have submitted a support ticket as well and gotten no response.

I don't know how you expect businesses to adopt the Agents SDK with such abysmal performance. We are getting complaints about it constantly, and since we are raising money, VCs are calling it out and saying our product is shit. This needs to be fixed.

Answer 1 · 2025-09-04T21:26:35.000Z

For reference, the 20-second movieglu response had 9,000 tokens in it.
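
For anyone trying to reproduce this, here is a minimal sketch of how each step's latency could be logged next to its token count, so slow calls can be compared on equal terms. `TimedCall`, `timed`, and `count_tokens` are hypothetical names made up for illustration and are not part of the Agents SDK; the callable passed in stands in for whatever model or tool call is being measured.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class TimedCall:
    """One measured call: a label, wall-clock seconds, and a token count."""
    label: str
    seconds: float
    tokens: int

    @property
    def tokens_per_second(self) -> float:
        # Guard against a zero-duration measurement.
        return self.tokens / self.seconds if self.seconds > 0 else float("inf")


def timed(label: str, fn: Callable[[], T],
          count_tokens: Callable[[T], int]) -> Tuple[T, TimedCall]:
    """Run fn(), measure wall-clock time, and count tokens in its result.

    count_tokens is supplied by the caller (an assumption here), for
    example a function that reads a usage field from the response.
    """
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, TimedCall(label, elapsed, count_tokens(result))


if __name__ == "__main__":
    # Stand-in for a real call; whitespace splitting is only a placeholder
    # for a real token counter.
    _, record = timed("demo", lambda: "a b c", lambda r: len(r.split()))
    print(record.label, round(record.seconds, 6), record.tokens)

    # The figures quoted in this comment, expressed as a rate.
    print(TimedCall("movieglu", 20.0, 9000).tokens_per_second)
```

Logging one record per model or tool step would show whether the delay scales with token count or appears even on small responses.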

Answer 2 · 2025-09-04T23:58:21.000Z

We're sorry about the disruption, and thank you for taking the time to report it. I've also escalated this issue along with #1481 to the Responses API team. I will share updates once I get any.

Answer 3 · 2025-09-05T01:52:47.000Z

@seratch Don't take this the wrong way, but I've reported this 3 times now over the last month and nothing has changed. Using OpenAI is becoming an embarrassment for our company and damaging our brand. Something needs to be done.