cheshire-cat-ai/core

Azure OpenAI LLM - The answers are showing part of the prompt context and instructions

Closed this issue · 13 comments

I'm getting bad answers on a fresh install of version 1.4.3.

The answers always show part of the prompt context and instructions, as you can see from the screenshots.
I started from a fresh Docker installation (reset Docker to factory settings) and deleted all the old Cheshire Cat folders.
I'm using Azure OpenAI as LLM (and embedder).

Screenshot 2023-12-09 195124

Screenshot 2023-12-09 200309

Screenshot 2023-12-09 194930

The LLM response is correct but it continues the prompt. Check if you can set some stop strings on Azure AI; if not, I don't know what to say, because this is the first time I've seen a problem with Azure AI...
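For context, "stop strings" here means stop sequences passed with the completion request so generation halts before the model starts writing the next turn of the prompt template. A minimal sketch with LangChain's AzureOpenAI wrapper (the endpoint, deployment name, and stop strings below are illustrative assumptions, not values from the Cat):

from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    openai_api_key="<key>",
    openai_api_base="https://<resource>.openai.azure.com/",
    openai_api_version="2023-05-15",
    deployment_name="<deployment>",
)
# Generation stops as soon as one of these strings would be emitted,
# so the model cannot keep "continuing" the prompt template.
answer = llm("...the full assembled prompt...", stop=["\nHuman:", "\nAI:"])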

I can assure you I had no problems with older versions of the Cat (not sure which version, but at least one from more than a month ago) and that I didn't change (redeploy) the Azure OpenAI models. Maybe something changed in the prompt creation in the latest versions?

@Roby91 thanks for reporting, I agree with @valentimarco that it is a "stop" token problem.
Let me tag somebody in the community a little more expert on Azure: @zioproto @cristianorevil

@Roby91 can you share your configuration? (Remove the API key, of course.) What model are you using? Are you using LLMAzureOpenAIConfig or LLMAzureChatOpenAIConfig?

@zioproto here is my metadata.json:

metadata.json

I'm using "LLMAzureOpenAIConfig" because my model is completion.

Your configuration is wrong.

With the model gpt-35-turbo you have to select "Azure OpenAI Chat Models" in the menu, because gpt-35-turbo is a chat model.

You are using " "Azure OpenAI Completion models". This one was used in the past with the model text-davinci-003 that is a completion model (now deprecated).

Please fix the config and try again.

Related code:
See the comment # Use only completion models !

# Imports as used in the surrounding module (added here for context;
# LLMSettings is defined earlier in the same file):
from typing import Type
from langchain.chat_models import AzureChatOpenAI
from langchain.llms import AzureOpenAI
from pydantic import ConfigDict


# https://learn.microsoft.com/en-gb/azure/cognitive-services/openai/reference#chat-completions
class LLMAzureChatOpenAIConfig(LLMSettings):
    openai_api_key: str
    model_name: str = "gpt-35-turbo"  # or gpt-4, use only chat models !
    openai_api_base: str
    openai_api_type: str = "azure"
    # Don't mix api versions https://github.com/hwchase17/langchain/issues/4775
    openai_api_version: str = "2023-05-15"
    deployment_name: str
    streaming: bool = True
    _pyclass: Type = AzureChatOpenAI

    model_config = ConfigDict(
        json_schema_extra={
            "humanReadableName": "Azure OpenAI Chat Models",
            "description": "Chat model from Azure OpenAI",
            "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
        }
    )


# https://python.langchain.com/en/latest/modules/models/llms/integrations/azure_openai_example.html
class LLMAzureOpenAIConfig(LLMSettings):
    openai_api_key: str
    openai_api_base: str
    api_type: str = "azure"
    # https://learn.microsoft.com/en-us/azure/cognitive-services/openai/reference#completions
    # Currently supported versions: 2022-12-01, 2023-03-15-preview, 2023-05-15
    # Don't mix api versions: https://github.com/hwchase17/langchain/issues/4775
    api_version: str = "2023-05-15"
    deployment_name: str = "gpt-35-turbo-instruct"  # Model "coming soon" according to Microsoft
    model_name: str = "gpt-35-turbo-instruct"  # Use only completion models !
    streaming: bool = True
    _pyclass: Type = AzureOpenAI

    model_config = ConfigDict(
        json_schema_extra={
            "humanReadableName": "Azure OpenAI Completion models",
            "description": "Configuration for Cognitive Services Azure OpenAI",
            "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
        }
    )
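The practical difference between the two classes: AzureOpenAI calls the plain Completions endpoint, which takes one raw string and simply predicts the next tokens, so a chat-tuned model served this way will happily keep writing the rest of the prompt template; AzureChatOpenAI calls the Chat Completions endpoint, which takes role-tagged messages and returns only the assistant turn. A hedged sketch (credentials and deployment name are placeholders):

from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage

chat = AzureChatOpenAI(
    openai_api_key="<key>",
    openai_api_base="https://<resource>.openai.azure.com/",
    openai_api_version="2023-05-15",
    deployment_name="<gpt-35-turbo deployment>",
)
# The chat endpoint returns a single AIMessage: no prompt text is echoed back.
reply = chat([HumanMessage(content="What time is it?")])
print(reply.content)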

Ok, thanks.
In Azure AI Studio "gpt-35-turbo" is flagged as "Completion", so that's why I was using the completion model setting in the Cat (which worked well in old versions).

Anyway, I tried switching to "Azure OpenAI Chat Models" and it looks like it works for "regular" questions, but it behaves strangely when plugins are involved.

Screenshot 2023-12-12 10 47 11

Screenshot 2023-12-12 10 49 59

@Roby91 sorry for the confusion. The gpt-35-turbo is indeed a "Completion" model that is meant for the "Chat" use case. The Azure Portal says "Completion" to distinguish it from other models that do completely different things, like "Embeddings".

If you think it would help, please feel free to propose a patch that changes the "Azure OpenAI Chat Models" wording to something that contains "Chat Completion" and is easier to understand.
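Such a patch would only need to touch the json_schema_extra of the LLMAzureChatOpenAIConfig shown above; the exact wording below is just one possibility:

model_config = ConfigDict(
    json_schema_extra={
        # was: "Azure OpenAI Chat Models"
        "humanReadableName": "Azure OpenAI Chat Completion Models",
        "description": "Chat completion model from Azure OpenAI",
        "link": "https://azure.microsoft.com/en-us/products/ai-services/openai-service",
    }
)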

Thanks

@pieroit I am not able to comment on the problem with the specific plugin.

If I understand correctly, @Roby91 is activating plugins that alter the prompts, and then the tools stop working correctly.

This should not be Azure specific and requires @pieroit or somebody from the core team to have a closer look.

@zioproto @Roby91 a model is completion/instruction OR chat.

Edit: in the screenshot they are just distinguishing between token prediction (completion) and geometrical projection (embedding).

@zioproto @Roby91 a model is completion OR chat

It is called "Chat Completion API" in the documentation. Because the word completion is always used, this caused the confusion.

https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python&pivots=programming-language-chat-completions
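The naming is visible directly in the openai Python SDK of that era (0.28.x): the chat use case still goes through a class called ChatCompletion. A sketch (resource name, key, and deployment are placeholders):

import openai

openai.api_type = "azure"
openai.api_base = "https://<resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<key>"

# "Chat Completion": chat-formatted input, completion-style generation underneath.
response = openai.ChatCompletion.create(
    engine="<gpt-35-turbo deployment>",
    messages=[{"role": "user", "content": "What time is it?"}],
)
print(response["choices"][0]["message"]["content"])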

Screenshot 2023-12-22 at 10 53 07

Ok thanks for clarifying @zioproto

@Roby91 can you try a fresh install? Pull the image again.
I cannot reproduce the issue.

@pieroit @zioproto

I removed all Docker images, downloaded the 1.4.3 zip release, and started it via "docker compose up".
Configured Azure OpenAI Chat, with the "Streaming" flag off (what's that for, by the way?).
No plugins installed (except the Core CCat one).

After a lot of attempts I can say that the issue happens with random frequency but may be related to the "Clear conversation" action:
when it happens, it happens only on the first answer (at least in all my tests).

To reproduce the issue I do this:
start a new conversation; if the answer is OK, I "clear conversation" and try a random question (from the suggested ones).
Sometimes I get the wrong answer right away, sometimes after 3 attempts, sometimes after 10 attempts.
I cannot find similarities or strange patterns between the questions.
It may be useful to note that in 2 of the 4 screenshots below, the "what time is it" question appears inside the full prompt of the failing answer.
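To automate that question loop, a minimal sketch against the Cat's websocket chat endpoint (the ws://localhost:1865/ws URL and the {"text": ...} payload follow the Cat's usual defaults; the "type"/"content" fields of the reply are assumptions about its message format, and "Clear conversation" still has to be triggered from the admin UI between runs):

import asyncio
import json
import websockets

async def ask(question: str) -> str:
    # Assumed default Cheshire Cat websocket endpoint; adjust host/port if needed.
    async with websockets.connect("ws://localhost:1865/ws") as ws:
        await ws.send(json.dumps({"text": question}))
        while True:
            msg = json.loads(await ws.recv())
            # Wait for the final chat message (streaming chunks, if any, have other types).
            if msg.get("type") == "chat":
                return msg["content"]

print(asyncio.run(ask("What time is it?")))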

screencapture-localhost-1865-admin-2023-12-24-18_55_55

screencapture-localhost-1865-admin-2023-12-24-18_56_46

screencapture-localhost-1865-admin-2023-12-24-19_06_50

screencapture-localhost-1865-admin-2023-12-24-19_03_07