cheshire-cat-ai/core

Azure OpenAI LLM - The answers are showing part of the prompt context and instructions

Closed this issue · 13 comments

I'm getting bad answers on a fresh install of version 1.4.3.

The answers always show part of the prompt context and instructions, as you can see from the screenshots.
I started from a fresh Docker installation (reset Docker to factory settings) and deleted all the old Cheshire Cat folders.
I'm using Azure OpenAI as LLM (and embedder).

Screenshot 2023-12-09 195124

Screenshot 2023-12-09 200309

Screenshot 2023-12-09 194930

The LLM response is correct but it continues the prompt. Check if you can set some stop strings on Azure AI; if not, I don't know what to say, because this is the first time I've seen a problem with Azure AI...
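For context, "stop strings" here means stop sequences passed with the completion request so generation halts before the model starts writing the next turn of the prompt template. A minimal sketch with LangChain's AzureOpenAI wrapper (the endpoint, deployment name, and stop strings below are illustrative assumptions, not values from the Cat):

from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    openai_api_key="<key>",
    openai_api_base="https://<resource>.openai.azure.com/",
    openai_api_version="2023-05-15",
    deployment_name="<deployment>",
)
# Generation stops as soon as one of these strings would be emitted,
# so the model cannot keep "continuing" the prompt template.
answer = llm("...the full assembled prompt...", stop=["\nHuman:", "\nAI:"])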

I can assure you I had no problems with older versions of the Cat (not sure which version, but at least one from more than a month ago) and that I didn't change (redeploy) the Azure OpenAI models. Maybe something changed in the prompt creation in the latest versions?

@Roby91 thanks for reporting, I agree with @valentimarco that it is a "stop" token problem.
Let me tag somebody in the community a little more expert on Azure: @zioproto @cristianorevil

@Roby91 can you share your configuration? (Remove the API key, of course.) What model are you using? Are you using LLMAzureOpenAIConfig or LLMAzureChatOpenAIConfig?

@zioproto here is my metadata.json:

metadata.json

I'm using "LLMAzureOpenAIConfig" because my model is completion.

Your configuration is wrong.

With the model gpt-35-turbo you have to select "Azure OpenAI Chat Models" in the menu, because gpt-35-turbo is a chat model.

You are using " "Azure OpenAI Completion models". This one was used in the past with the model text-davinci-003 that is a completion model (now deprecated).

Please fix the config and try again.

Related code:
See the comment # Use only completion models !

# Imports as used in the surrounding module (added here for context;
# LLMSettings is defined earlier in the same file):
from typing import Type
from langchain.chat_models import AzureChatOpenAI
from langchain.llms import AzureOpenAI
from pydantic import ConfigDict


# https://learn.microsoft.com/en-gb/azure/cognitive-services/openai/reference#chat-completions
class LLMAzureChatOpenAIConfig(LLMSettings):
    openai_api_key: str
    model_name: str = "gpt-35-turbo"  # or gpt-4, use only chat models !
    openai_api_base: str
    openai_api_type: str = "azure"
    # Don't mix api versions https://github.com/hwchase17/langchain/issues/4775
    openai_api_version: str = "2023-05-15"
    deployment_name: str
    streaming: bool = True
    _pyclass: Type = AzureChatOpenAI

    model_config = ConfigDict(
        json_schema_extra={
            "humanReadableName": "Azure OpenAI Chat Models",
            "description": "Chat model from Azure OpenAI",
            "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
        }
    )


# https://python.langchain.com/en/latest/modules/models/llms/integrations/azure_openai_example.html
class LLMAzureOpenAIConfig(LLMSettings):
    openai_api_key: str
    openai_api_base: str
    api_type: str = "azure"
    # https://learn.microsoft.com/en-us/azure/cognitive-services/openai/reference#completions
    # Currently supported versions: 2022-12-01, 2023-03-15-preview, 2023-05-15
    # Don't mix api versions: https://github.com/hwchase17/langchain/issues/4775
    api_version: str = "2023-05-15"
    deployment_name: str = "gpt-35-turbo-instruct"  # Model "coming soon" according to Microsoft
    model_name: str = "gpt-35-turbo-instruct"  # Use only completion models !
    streaming: bool = True
    _pyclass: Type = AzureOpenAI

    model_config = ConfigDict(
        json_schema_extra={
            "humanReadableName": "Azure OpenAI Completion models",
            "description": "Configuration for Cognitive Services Azure OpenAI",
            "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
        }
    )
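The practical difference between the two classes: AzureOpenAI calls the plain Completions endpoint, which takes one raw string and simply predicts the next tokens, so a chat-tuned model served this way will happily keep writing the rest of the prompt template; AzureChatOpenAI calls the Chat Completions endpoint, which takes role-tagged messages and returns only the assistant turn. A hedged sketch (credentials and deployment name are placeholders):

from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage

chat = AzureChatOpenAI(
    openai_api_key="<key>",
    openai_api_base="https://<resource>.openai.azure.com/",
    openai_api_version="2023-05-15",
    deployment_name="<gpt-35-turbo deployment>",
)
# The chat endpoint returns a single AIMessage: no prompt text is echoed back.
reply = chat([HumanMessage(content="What time is it?")])
print(reply.content)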

Ok, thanks.
In Azure AI Studio "gpt-35-turbo" is flagged as "Completion", so that's why I was using the completion model setting in the Cat (which worked well in old versions).

Anyway, I tried switching to "Azure OpenAI Chat Models" and it looks like it works for "regular" questions, but it behaves strangely when plugins are involved.

Screenshot 2023-12-12 10 47 11

Screenshot 2023-12-12 10 49 59

@Roby91 sorry for the confusion. The gpt-35-turbo is indeed a "Completion" model that is meant for the "Chat" use case. The Azure Portal says "Completion" to distinguish it from other models that do completely different things, like "Embeddings".

If you think it would help, please feel free to propose a patch that changes the "Azure OpenAI Chat Models" wording to something that contains "Chat Completion" and is easier to understand.
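Such a patch would only need to touch the json_schema_extra of the LLMAzureChatOpenAIConfig shown above; the exact wording below is just one possibility:

model_config = ConfigDict(
    json_schema_extra={
        # was: "Azure OpenAI Chat Models"
        "humanReadableName": "Azure OpenAI Chat Completion Models",
        "description": "Chat completion model from Azure OpenAI",
        "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
    }
)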

Thanks

@pieroit I am not able to comment on the problem with the specific plugin.

If I understand correctly, @Roby91 is activating plugins that alter the prompts, and then the tools stop working correctly.

This should not be Azure specific and requires @pieroit or somebody from the core team to have a closer look.

@zioproto @Roby91 a model is completion/instruction OR chat.

Edit: in the screenshot they are just distinguishing between token prediction (completion) and geometrical projection (embedding).

@zioproto @Roby91 a model is completion OR chat

It is called "Chat Completion API" in the documentation. Because the word completion is always used, this caused the confusion.

https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions
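The naming is visible directly in the openai Python SDK of that era (0.28.x): the chat use case still goes through a class called ChatCompletion. A sketch (resource name, key, and deployment are placeholders):

import openai

openai.api_type = "azure"
openai.api_base = "https://<resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<key>"

# "Chat Completion": chat-formatted input, completion-style generation underneath.
response = openai.ChatCompletion.create(
    engine="<gpt-35-turbo deployment>",
    messages=[{"role": "user", "content": "What time is it?"}],
)
print(response["choices"][0]["message"]["content"])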

Screenshot 2023-12-22 at 10 53 07

Ok thanks for clarifying @zioproto

@Roby91 can you try a fresh install? Pull the image again.
I cannot reproduce the issue.

@pieroit @zioproto

I removed all Docker images, downloaded the 1.4.3 zip release, and started it via "docker compose up".
Configured Azure OpenAI Chat, with the "Streaming" flag off (what's that for, by the way?).
No plugins installed (except the Core CCat one).

After a lot of attempts I can say that the issue happens with random frequency but may be related to the "Clear conversation" action:
when it happens, it happens only on the first answer (at least in all my tests).

To reproduce the issue I do this:
start a new conversation; if the answer is OK, I "clear conversation" and try a random question (from the suggested ones).
Sometimes I get the wrong answer right away, sometimes after 3 attempts, sometimes after 10 attempts.
I cannot find similarities or strange patterns between the questions.
It may be useful to note that in 2 of the 4 screenshots below, the "what time is it" question appears inside the full prompt of the failing answer.
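To automate that question loop, a minimal sketch against the Cat's websocket chat endpoint (the ws://localhost:1865/ws URL and the {"text": ...} payload follow the Cat's usual defaults; the "type"/"content" fields of the reply are assumptions about its message format, and "Clear conversation" still has to be triggered from the admin UI between runs):

import asyncio
import json
import websockets

async def ask(question: str) -> str:
    # Assumed default Cheshire Cat websocket endpoint; adjust host/port if needed.
    async with websockets.connect("ws://localhost:1865/ws") as ws:
        await ws.send(json.dumps({"text": question}))
        while True:
            msg = json.loads(await ws.recv())
            # Wait for the final chat message (streaming chunks, if any, have other types).
            if msg.get("type") == "chat":
                return msg["content"]

print(asyncio.run(ask("What time is it?")))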

screencapture-localhost-1865-admin-2023-12-24-18_55_55

screencapture-localhost-1865-admin-2023-12-24-18_56_46

screencapture-localhost-1865-admin-2023-12-24-19_06_50

screencapture-localhost-1865-admin-2023-12-24-19_03_07