ggerganov/llama.cpp

Possible bug in the 'deepseek-coder' chat template's system message

jukofyork opened this issue · 1 comment

I noticed this when adding the 'alpaca' chat template (which is very similar to the 'deepseek-coder' chat template):

#7383 (comment)

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:

This doesn't look correct to me for deepseek-coder, since the default system message ends with a newline; the output should look like this instead:

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant
### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:
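
For comparison, here is one way to see what the model's own Jinja template produces for the same history via transformers (just a verification sketch; the history is the one used in the dumps above):

from transformers import AutoTokenizer

history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
    {"role": "user", "content": "Who are you"},
    {"role": "assistant", "content": "   I am an assistant   "},
    {"role": "user", "content": "Another question"},
]

tok = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-33b-instruct")
print(tok.apply_chat_template(history, tokenize=False, add_generation_prompt=True))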

Perhaps the Python script on this page should detect this and move the newline from the default system message into the template itself?
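
The detection could be as simple as splitting any trailing newline off the default system message and making the template responsible for emitting it. This is only a rough sketch; the function and variable names are hypothetical, not the script's real ones:

def split_trailing_newline(default_system):
    # If the default system message carries its own trailing newline, move it
    # out of the message text so the template can emit it after *any* system
    # message, including custom ones like "You are a helpful assistant".
    if default_system.endswith("\n"):
        return default_system.rstrip("\n"), "\n"
    return default_system, ""

message, separator = split_trailing_newline("You are an AI programming assistant\n")
# message == "You are an AI programming assistant", separator == "\n"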

There may be another bug in this:

output = AutoTokenizer.from_pretrained(VARIANTS_TO_TEST[0]).apply_chat_template(history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)

Should this really be VARIANTS_TO_TEST[0] (which now points at the 'teknium/OpenHermes-2.5-Mistral-7B' data), or has the vector been rearranged at some point and 'google/gemma-7b-it' was originally at the start of it?
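
If the intent was to pull the data for 'google/gemma-7b-it', looking the model up by name would make the line robust to any reordering of the list. A sketch, assuming VARIANTS_TO_TEST holds the model-id strings used elsewhere in the script:

gemma_index = VARIANTS_TO_TEST.index("google/gemma-7b-it")  # raises ValueError if it's missing
output = AutoTokenizer.from_pretrained(VARIANTS_TO_TEST[gemma_index]).apply_chat_template(
    history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)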