ggerganov/llama.cpp

Possible bug in the 'deepseek-coder' chat template's system message

jukofyork opened this issue · 1 comment

I noticed this when adding the 'alpaca' chat template (which is very similar to the 'deepseek-coder' chat template):

#7383 (comment)

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:

This doesn't look correct to me for deepseek-coder, since the default system message ends with a newline; the output should look like this instead:

----- deepseek-ai/deepseek-coder-33b-instruct -----
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
<|begin▁of▁sentence|>You are a helpful assistant
### Instruction:
Hello
### Response:
Hi there
<|EOT|>
### Instruction:
Who are you
### Response:
   I am an assistant   
<|EOT|>
### Instruction:
Another question
### Response:
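
For comparison, here is one way to see what the model's own Jinja template produces for the same history via transformers (just a verification sketch; the history is the one used in the dumps above):

from transformers import AutoTokenizer

history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
    {"role": "user", "content": "Who are you"},
    {"role": "assistant", "content": "   I am an assistant   "},
    {"role": "user", "content": "Another question"},
]

tok = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-33b-instruct")
print(tok.apply_chat_template(history, tokenize=False, add_generation_prompt=True))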

Perhaps the Python script on this page should detect this and move the newline from the default system message into the template itself?
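
The detection could be as simple as splitting any trailing newline off the default system message and making the template responsible for emitting it. This is only a rough sketch; the function and variable names are hypothetical, not the script's real ones:

def split_trailing_newline(default_system):
    # If the default system message carries its own trailing newline, move it
    # out of the message text so the template can emit it after *any* system
    # message, including custom ones like "You are a helpful assistant".
    if default_system.endswith("\n"):
        return default_system.rstrip("\n"), "\n"
    return default_system, ""

message, separator = split_trailing_newline("You are an AI programming assistant\n")
# message == "You are an AI programming assistant", separator == "\n"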

There may be another bug in this:

output = AutoTokenizer.from_pretrained(VARIANTS_TO_TEST[0]).apply_chat_template(history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)

Should this really be VARIANTS_TO_TEST[0] (which now points at the 'teknium/OpenHermes-2.5-Mistral-7B' data), or has the vector been rearranged at some point and 'google/gemma-7b-it' was originally at the start of it?
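
If the intent was to pull the data for 'google/gemma-7b-it', looking the model up by name would make the line robust to any reordering of the list. A sketch, assuming VARIANTS_TO_TEST holds the model-id strings used elsewhere in the script:

gemma_index = VARIANTS_TO_TEST.index("google/gemma-7b-it")  # raises ValueError if it's missing
output = AutoTokenizer.from_pretrained(VARIANTS_TO_TEST[gemma_index]).apply_chat_template(
    history, tokenize=False, add_generation_prompt=True, chat_template=GEMMA_TMLP)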