microsoft/semantic-kernel

.Net: In IAutoFunctionInvocationFilter setting AutoFunctionInvocationContext.Terminate to true adds an empty assistant message to the chat history

Closed this issue · 7 comments

Adding an AutoFunctionInvocationFilter to the kernel that calls await next(context) (invoking the next filter or the actual function) and then sets AutoFunctionInvocationContext.Terminate to true causes an empty assistant message to be appended to the end of the chat history.

To Reproduce
Steps to reproduce the behavior:

  1. Add the following IAutoFunctionInvocationFilter to the kernel (screenshot of the filter code omitted).
  2. Run auto chat completion with a user prompt that triggers a function invocation.
  3. Observe the returned chat history: the last message (after the function result message) is an empty assistant message (screenshot omitted).
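The filter from the screenshot is not reproduced here; a minimal sketch of a filter matching the description (invoke the function, then request termination), based on the Semantic Kernel 1.x filter API, could look like:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

// Illustrative filter matching the repro description: invoke the function,
// then ask Semantic Kernel to stop the auto-invocation loop.
public sealed class EarlyTerminationFilter : IAutoFunctionInvocationFilter
{
    public async Task OnAutoFunctionInvocationAsync(
        AutoFunctionInvocationContext context,
        Func<AutoFunctionInvocationContext, Task> next)
    {
        // Run the next filter in the pipeline (or the actual function).
        await next(context);

        // Stop further automatic function calls after this one completes.
        context.Terminate = true;
    }
}
```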

Expected behavior
For the integrity of the chat history (even upon termination), it makes sense to have an assistant response after the function response, but an empty one doesn't make much sense.
Even when an informative assistant message is added to the chat history manually (after setting the Terminate flag to true in the IAutoFunctionInvocationFilter code), the empty assistant message is still appended.

Platform
OS: Windows
IDE: Visual Studio
Language: C#
Source: SK NuGet package version 1.10.0

@EdenTanami Could you please share an example of how you configure the execution settings and start the invocation?

I'm trying to reproduce this, and in my case the last message in the chat history is the function result with the tool role (screenshot omitted).

@dmytrostruk thanks for the quick response!
Regarding the execution settings: I set the temperature to 0 and used ToolCallBehavior.EnableFunctions with a subset of the kernel's plugins and autoInvoke set to true (other than that, the default settings).
I am calling AzureOpenAIChatCompletionService.GetStreamingChatMessageContentsAsync with those settings.
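A sketch of that setup, with assumed identifiers (`kernel`, `chatService`, and the plugin name `"MyPlugin"` are placeholders; the settings and extension types are from the SK OpenAI connector):

```csharp
using System;
using System.Linq;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Temperature 0, EnableFunctions over a subset of the kernel's plugins,
// autoInvoke: true — everything else left at defaults.
var settings = new OpenAIPromptExecutionSettings
{
    Temperature = 0,
    ToolCallBehavior = ToolCallBehavior.EnableFunctions(
        kernel.Plugins["MyPlugin"].Select(f => f.Metadata.ToOpenAIFunction()),
        autoInvoke: true)
};

var history = new ChatHistory();
history.AddUserMessage("Please call my function.");

// Streaming call that reproduces the reported behavior.
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(
    history, settings, kernel))
{
    Console.Write(chunk.Content);
}
```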

Tried the non-streaming GetChatMessageContentsAsync and got different results:
not an empty assistant message, but the tool result message appears twice (screenshots omitted).

One last comment:
with the non-streaming GetChatMessageContentsAsync, which adds only tool messages, I get an exception when appending another user message to the history, saying that an assistant response is required after a tool message (screenshot omitted).

> @dmytrostruk thanks for the quick response!
> Regarding the execution settings: I set the temperature to 0 and used ToolCallBehavior.EnableFunctions with a subset of the kernel's plugins and autoInvoke set to true (other than that, the default settings).
> I am calling AzureOpenAIChatCompletionService.GetStreamingChatMessageContentsAsync with those settings.

@EdenTanami I followed the same approach, and the last message in the chat history is still the tool result with the value that my function returned (screenshot omitted).

I also tried with multiple functions to be invoked, and the filter still terminates the operation as needed. I'm wondering if there is anything else that's different in your scenario. For example, can you take the code that I shared, apply it in your environment, and see if it works? If yes, are there any differences from your code? Also, make sure your filter is registered and that you set context.Terminate = true. We emit a LogDebug message when terminating execution, so if you have logging attached at LogLevel.Debug, you should see the termination message in your logs.
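A sketch of registering a filter and wiring up Debug-level console logging so the termination message is visible (`MyTerminatingFilter` is a placeholder for your filter; `AddConsole` assumes the Microsoft.Extensions.Logging.Console package):

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();

// Attach console logging at Debug level so SK's termination
// LogDebug message shows up in the output.
builder.Services.AddLogging(logging => logging
    .AddConsole()
    .SetMinimumLevel(LogLevel.Debug));

// Register the auto function invocation filter with the kernel.
builder.Services.AddSingleton<IAutoFunctionInvocationFilter>(
    new MyTerminatingFilter());

var kernel = builder.Build();
```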

> Tried the non-streaming GetChatMessageContentsAsync and got different results:
> not an empty assistant message, but the tool result message appears twice.

Can you check the 3rd message, where the LLM replied with tool calls, and see whether it requested to call this function twice? If that's not the case, can you remove all filters and see if the issue can still be reproduced?

> One last comment:
> with the non-streaming GetChatMessageContentsAsync, which adds only tool messages, I get an exception when appending another user message to the history, saying that an assistant response is required after a tool message.

If you want to terminate early and reuse the same chat history for subsequent requests, then instead of terminating you can skip function execution by not calling await next(context). In that case your remaining functions won't be invoked, but the chat history will still be populated with empty tool responses, which are required in the chat history for each tool request made by the LLM.
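The skip approach above could be sketched as follows (assumed: setting context.Result to supply a placeholder outcome; the class name and result text are illustrative):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

// Skip execution instead of terminating: not calling next(context)
// prevents the function from running, while the tool-call request made
// by the LLM still receives a response in the chat history, keeping the
// history valid for subsequent requests.
public sealed class SkipFunctionFilter : IAutoFunctionInvocationFilter
{
    public Task OnAutoFunctionInvocationAsync(
        AutoFunctionInvocationContext context,
        Func<AutoFunctionInvocationContext, Task> next)
    {
        // Deliberately do NOT call next(context): the function is skipped.
        // Provide a placeholder result for the tool-call entry.
        context.Result = new FunctionResult(context.Function, "Function skipped.");
        return Task.CompletedTask;
    }
}
```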

@dmytrostruk We found that we had made some assumptions about the chat completion response that were not correct in this use case.
We usually add the returned messages to the chat history that was sent with the call.
In the non-streaming call, upon termination, the tool response comes back in the result and is also added to the chat history, which caused the duplication on our side.
Without termination, the response contains just the assistant message, and the tool messages are only added to the history.

I'm sorry for the confusion and thanks for your help!

@EdenTanami It sounds like it's working as expected. I'm going to close this issue; feel free to open a new one if you notice any problems. Thanks for your insights!