microsoft/semantic-kernel

Starting from Filters to talk about invoke and ChatCompletionService

Closed this issue · 3 comments

Filters rapresents a great value for the framework and I appreciate the effort to extend to auto functions agens and so on. Still I think that there is an issue when. for example you use ChatCompletionServices directly instead use Kernel.Invoke. In my mind the framework should have a preferred way to use it and should have all necessary feature to use it at the best (e.g. history management). Call directly a service or a specific function could be possible but not recomended as you loose consistency. From this point of view all the examples that use ChatCompletitionServices, for example, could generate confusion about if filters are available in those scenario and when to use invoke instead ChatCompletionService

@guru98 Thank you very much for feedback!
The problem with applying filters on ChatCompletionService level is that we can do that for chat completion services that exist today (e.g. OpenAI, Google, Hugging Face), but it's also possible to inject your custom chat completion service. In this case, you will have to apply filtering logic manually, while if you invoke chat completion through kernel, filters will be invoked automatically, no matter which actual chat completion service you use.

It's still possible to achieve similar behavior, but we will have to re-think ChatCompletionService abstraction (e.g. use abstract class and provide some default implementation that will trigger filters), but we should think if this is really the problem we want to resolve.

Maybe we should understand why there is a need to call ChatCompletionService directly instead of Kernel.InvokeAsync, for example if it's for chat history management, then maybe we should improve it from this perspective.

Also, current filters provide a context with information like KernelFunction, KernelArguments, FunctionResult etc., so it is specific to kernel/function invocation. While in chat completion services we operate with different type of data - LLM-related settings, text content, audio content, etc. So, even if we make a decision to apply filtering on chat completion service level, there should be new type of filters added which will receive information specific to chat completion service.

@guru98 That makes sense and thanks a lot for this feedback, it's really helpful! I agree, we should minimize the need to use chat completion service directly. There are other benefits of using kernel instead of chat completion service such as telemetry, prompt templating etc.