This repository collects proper chat templates (or input formats) for large language models (LLMs), to support `transformers`'s `chat_template` feature.

Different models are trained with different input formats, especially instruction-tuned or chat models, and this is exactly what `transformers`'s new `chat_template` feature addresses. However, popular models on HuggingFace (e.g., `vicuna`, `falcon`) often do not include this parameter in their `tokenizer_config.json` files, which can make it troublesome to run them properly. Also, the `chat_template` feature requires writing a Jinja template, which is not intuitive to do directly inside the JSON files.

So I have collected proper chat templates for several popular models from their official references or implementations and put them under `chat_templates/`. If you would like to contribute more chat templates, feel free to open a pull request.
If you find this repo useful, please cite it:
```
@misc{zheng-2023-chat-templates,
  author = {Zheng, Chujie},
  title = {Chat Templates for HuggingFace Large Language Models},
  year = {2023},
  howpublished = {\url{https://github.com/chujiezheng/chat_templates}}
}
```
Updates:

- [02/2024] Added support for Google's Gemma models.
- [02/2024] Added usage explanation for `generation_configs`.
- [01/2024] Added support for Qwen2 models.
This repo contains two folders:

- `chat_templates/` contains the Jinja files of the collected chat templates, which can directly replace the defaults in HuggingFace tokenizers.
- `generation_configs/` contains the corresponding JSON configs used for controlling the ending of response generation. In particular, the `stop_token_ids` should be passed directly to the `generate` method via the `eos_token_id` argument, as sketched below.
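Here is a minimal sketch of how a generation config might be wired into `generate`. The presence of a `stop_token_ids` key follows the description above; the exact config file name (`llama-2-chat.json`) is an assumption about the folder layout.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
toker = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Apply the matching chat template (strip the indentation/newlines
# that the jinja files use for readability)
chat_template = open('./chat_templates/llama-2-chat.jinja').read()
toker.chat_template = chat_template.replace('    ', '').replace('\n', '')

# Load the corresponding generation config (file name is an assumption)
gen_config = json.load(open('./generation_configs/llama-2-chat.json'))

messages = [{'role': 'user', 'content': 'This is a user input.'}]
input_ids = toker.apply_chat_template(messages, add_generation_prompt=True, return_tensors='pt')

# Pass stop_token_ids to generate via the eos_token_id argument
outputs = model.generate(input_ids, eos_token_id=gen_config['stop_token_ids'], max_new_tokens=256)
print(toker.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```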
The collected templates are listed below:

| Model (Family) | Template File | Reference | Comment |
|---|---|---|---|
| `llama-2-chat` | `llama-2-chat.jinja` | link | Official template |
| `mistral-instruct` | `mistral-instruct.jinja` | link | `Mistral-7B-Instruct-v0.1/0.2`; system message allowed |
| `gemma-it` (new) | `gemma-it.jinja` | link | `gemma-2b/7b-it`; system message allowed |
| `qwen2-chat` (new) | `chatml.jinja` | link | ChatML format; `Qwen1.5-0.5/1.8/4/7/14/72B-Chat` |
| `openchat` | `openchat.jinja` | link | `openchat-3.5` |
| `yi-chat` | `chatml.jinja` | link | ChatML format; `Yi-6/34B-Chat` |
| `zephyr` | `zephyr.jinja` | link | `zephyr-7b-alpha/beta` |
| `orca-2` | `chatml.jinja` | link | ChatML format; `Orca-2-7/13b` |
| `vicuna` | `vicuna.jinja` | link | `vicuna-7/13b-v1.5` |
| `falcon-instruct` | `falcon-instruct.jinja` | link | `falcon-7/40b-instruct` |
| `starling-lm` | `openchat.jinja` | link | `Starling-LM-7B-alpha` |
| `solar-instruct` | `solar-instruct.jinja` | link | `SOLAR-10.7B-Instruct-v1.0` |
| `alpaca` | `alpaca.jinja` | link | `alpaca`-style models, like `Platypus2-13B` |
| `amberchat` | `amberchat.jinja` | link | `AmberChat`, `AmberSafe` |
| `saiga` | `saiga.jinja` | link | `saiga`, a series of Russian models |
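Several entries above share `chatml.jinja`. For reference, the ChatML format wraps every turn in `<|im_start|>` / `<|im_end|>` markers; a prompt rendered with `add_generation_prompt=True` looks roughly like this (the exact whitespace produced by `chatml.jinja` may differ slightly):

```
<|im_start|>system
This is a system prompt.<|im_end|>
<|im_start|>user
This is the first user input.<|im_end|>
<|im_start|>assistant
```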
The following example can be used to check whether a Jinja file is correctly implemented, using `llama-2-chat` as a reference (its tokenizer already ships the correct template, so the default and corrected outputs should match).
```python
from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token="YOUR_OWN_TOKEN")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (yet Correct) Chat Template ######')
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/llama-2-chat.jinja').read()
# Strip the indentation and newlines the jinja file uses for readability
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```
Expected output:

```
###### Default (yet Correct) Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]
###### Corrected Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]
```
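To avoid patching the template on every load, one option is to persist it: in recent `transformers` versions, setting `chat_template` and saving the tokenizer writes the template into its `tokenizer_config.json`. A minimal sketch:

```python
from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token="YOUR_OWN_TOKEN")
chat_template = open('./chat_templates/llama-2-chat.jinja').read()
toker.chat_template = chat_template.replace('    ', '').replace('\n', '')

# Saving also serializes chat_template into tokenizer_config.json, so later
# AutoTokenizer.from_pretrained calls pick up the corrected template.
# The local path below is hypothetical.
toker.save_pretrained('./llama-2-7b-chat-patched')
```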
For `mistral-instruct` (and also `gemma-it`), the official template does not natively support the `system` message, so passing one with the default template raises an error. The template provided here instead merges the `system` message into the first user message, as shown in the expected output below.
```python
from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (but Error-Raising) Chat Template ######')
# The default template raises an error when a system message is passed
#print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/mistral-instruct.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```
Expected output:

```
###### Default (but Error-Raising) Chat Template ######
jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/...
###### Corrected Chat Template ######
<s>[INST] This is a system prompt.
This is the first user input. [/INST] This is the first assistant response. </s>[INST] This is the second user input. [/INST]
```
NOTE: In FastChat, `vicuna` does not add linebreaks between roles' messages, but I found that adding linebreaks leads to slightly better performance (especially for the v1.5 version). Also, I found that `vicuna-7/13/33b-v1.3` may not work well when given a system message different from its default one, so I would recommend using `vicuna-7/13b-v1.5` instead.
```python
from transformers import AutoTokenizer

toker = AutoTokenizer.from_pretrained("lmsys/vicuna-7b-v1.5")
messages = [
    {'role': 'system', 'content': 'This is a system prompt.'},
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]
print('###### Default (but Improper) Chat Template ######')
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))

print('###### Corrected Chat Template ######')
chat_template = open('./chat_templates/vicuna.jinja').read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
toker.chat_template = chat_template
print(toker.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```
Expected output:

```
###### Default (but Improper) Chat Template ######
<s>[INST] <<SYS>>
This is a system prompt.
<</SYS>>

This is the first user input. [/INST] This is the first assistant response. </s><s>[INST] This is the second user input. [/INST]
###### Corrected Chat Template ######
<s>This is a system prompt.
USER: This is the first user input.
ASSISTANT: This is the first assistant response.</s>
USER: This is the second user input.
ASSISTANT:
```