openai/openai-dotnet

Delta in StreamingResponseFunctionCallArgumentsDeltaUpdate should be a `string`

Closed this issue · 3 comments

Is this change from string to BinaryData intentional?

Originally posted by @gaspardpetit in d893a79

Thank you for reaching out, @gaspardpetit! The change was intentional. It aligns with how we represent raw JSON everywhere else in the library. Here are a couple of usage examples that might be useful:

  1. Streaming function calling with Responses:
    🔗 https://github.com/openai/openai-dotnet/blob/main/examples/Responses/Example04_FunctionCallingStreaming.cs
    Note that this example leverages StreamingResponseOutputItemDoneUpdate to retrieve the fully-formed function arguments as opposed to getting them piece by piece in deltas with StreamingResponseFunctionCallArgumentsDeltaUpdate.

  2. Streaming function calling with Chat Completions:
    🔗 https://github.com/openai/openai-dotnet/blob/main/examples/Chat/Example04_FunctionCallingStreaming.cs
    This one contains some helper code that you could re-use to build the function arguments.
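
As a rough illustration of what "building the function arguments" from deltas involves, here is a minimal sketch. It does not use the library's streaming types: the `deltas` list is a hypothetical stand-in for the `BinaryData` fragments carried by successive argument delta updates, and the JSON payload is invented for the example. It also assumes `BinaryData` is available through the System.Memory.Data package.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical stand-ins for the fragments delivered by successive delta updates.
List<BinaryData> deltas = new()
{
    BinaryData.FromString("{\"city\":"),
    BinaryData.FromString("\"Par"),
    BinaryData.FromString("is\"}"),
};

// Append the raw bytes of each fragment; no single fragment is valid JSON on its own.
using MemoryStream buffer = new();
foreach (BinaryData delta in deltas)
{
    buffer.Write(delta.ToMemory().Span);
}

// Parse only once every fragment has been received.
BinaryData arguments = BinaryData.FromBytes(buffer.ToArray());
using JsonDocument document = JsonDocument.Parse(arguments.ToMemory());
string? city = document.RootElement.GetProperty("city").GetString();

Console.WriteLine(city);
```

If the individual fragments are not needed, waiting for the fully-formed arguments (as the first example does) avoids this bookkeeping entirely.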

Thank you for the quick response. If I understand correctly, when the response is expected to be JSON, the client minimizes transformations (for example, to avoid stripping the quotes from a bare JSON string value), and when the delta is expected to be a plain, unformatted string, it uses string.

What confuses me is that, even in the first example, StreamingResponseOutputTextDeltaUpdate uses a string for the delta, as do all of these:

  • StreamingResponseOutputTextDeltaUpdate
  • StreamingAudioTranscriptionTextDeltaUpdate
  • InputAudioTranscriptionDeltaUpdate
  • InternalRealtimeServerEventResponseAudioTranscriptDelta
  • InternalRealtimeServerEventResponseFunctionCallArgumentsDelta
  • InternalRealtimeServerEventResponseTextDelta
  • InternalResponseCodeInterpreterCallCodeDeltaEvent
  • InternalResponseReasoningSummaryTextDeltaEvent
  • StreamingResponseRefusalDeltaUpdate

While a few others rely on BinaryData:

  • InternalRealtimeServerEventResponseAudioDelta
  • InternalResponseReasoningDeltaEvent
  • InternalResponseReasoningSummaryDeltaEvent
  • StreamingResponseFunctionCallArgumentsDeltaUpdate
  • StreamingResponseMcpCallArgumentsDeltaUpdate

It is about more than the value being JSON: we use BinaryData when the value contains raw data that needs to be processed in some way (versus a string that simply needs to be printed and/or read). For example, function call arguments represented as raw JSON typically need to be parsed or deserialized, an image represented as a base64-encoded string typically needs to be converted into a file (jpeg, png, etc.), and an array of audio bytes typically needs to be converted into a stream to be played.

Using BinaryData in these cases has a few benefits. BinaryData offers several useful methods for handling different types of raw data, such as ToStream(), ToMemory(), ToObjectFromJson(), and ToString(). Additionally, because strings in .NET are encoded using UTF-16, treating raw data as a string could misrepresent the original encoding, which could be important in some cases. When we keep these values as BinaryData, we avoid making any assumptions and preserve the original encoding as it was received from the service.
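
To make the methods mentioned above concrete, here is a small sketch, not taken from the library, that views one payload in several ways. The JSON content is invented for illustration, and the snippet assumes `BinaryData` is available through the System.Memory.Data package.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

// Invented payload: UTF-8 bytes containing one non-ASCII character (U+00E9).
byte[] utf8 = Encoding.UTF8.GetBytes("{\"name\":\"caf\u00e9\",\"count\":2}");
BinaryData data = BinaryData.FromBytes(utf8);

// The same bytes, consumed in whichever form the caller needs.
string text = data.ToString();                   // decodes the bytes as UTF-8 text
ReadOnlyMemory<byte> memory = data.ToMemory();   // the original bytes, unchanged
using Stream stream = data.ToStream();           // read-only stream over the bytes
Dictionary<string, JsonElement> fields =
    data.ToObjectFromJson<Dictionary<string, JsonElement>>()!;

// The byte count and the UTF-16 character count differ because U+00E9
// takes two bytes in UTF-8 but a single char in a .NET string.
Console.WriteLine($"bytes: {memory.Length}, chars: {text.Length}");
Console.WriteLine($"stream length: {stream.Length}");
Console.WriteLine($"count field: {fields["count"].GetInt32()}");
```

The point of the sketch is only that the conversion is chosen by the consumer at the point of use; the value itself keeps the bytes it was created with.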

The Internal* types likely need to be revised before being made public, but for example, in the case of:

  • StreamingResponseOutputTextDeltaUpdate
  • StreamingAudioTranscriptionTextDeltaUpdate
    These represent simple text, which is why we represent them as strings.

But in the case of:

  • StreamingResponseFunctionCallArgumentsDeltaUpdate
  • StreamingResponseMcpCallArgumentsDeltaUpdate
    These represent JSON arguments, which is why we represent them as BinaryData.

Similarly, in the case of:

  • InternalRealtimeServerEventResponseAudioDelta
    The delta represents raw audio data, which is why we also represent it as BinaryData.