guidance-ai/guidance

Llama-3 Chat Template Support?

ibehnam opened this issue · 1 comment

I know work is being done on simplifying chat template usage (#820). In the meantime, can someone please share the custom chat template they built for Llama-3? Here's the Jinja template:

{{ bos_token }}
{% if messages[0]['role'] == 'system' %}
    {% set loop_messages = messages[1:] %}
    {% set system_message = '<|start_header_id|>' + 'system' + '<|end_header_id|>\n\n' + messages[0]['content'].strip() + '<|eot_id|>' %}
{% else %}
    {% set loop_messages = messages %}
    {% set system_message = '' %}
{% endif %}

{% for message in loop_messages %}
    {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
        {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
    {% endif %}

    {% if loop.index0 == 0 %}
        {{ system_message }}
    {% endif %}

    {{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'].strip() + '<|eot_id|>' }}

    {% if loop.last and message['role'] == 'user' and add_generation_prompt %}
        {{ '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\n\n' }}
    {% endif %}
{% endfor %}
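As a sanity check, the template can be rendered standalone with plain jinja2 (roughly how transformers' apply_chat_template evaluates it) to inspect the exact prompt string the model should receive. Here's a minimal sketch; the LLAMA3_TEMPLATE variable, the example messages, and the raise_exception shim are placeholders, not part of the original post:

from jinja2 import Environment

LLAMA3_TEMPLATE = """..."""  # paste the Jinja template above here

def raise_exception(message):
    # transformers normally injects this helper for chat templates; provide it ourselves
    raise ValueError(message)

env = Environment(trim_blocks=True, lstrip_blocks=True)
env.globals["raise_exception"] = raise_exception
template = env.from_string(LLAMA3_TEMPLATE)

prompt = template.render(
    bos_token="<|begin_of_text|>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hi."},
    ],
    add_generation_prompt=True,
)
print(prompt)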

Update:

Switched from a Q4_1 quantized model to Q4_0, which solved the issue. Now I can use Llama-3-Instruct models with the custom template above.

===

I tried to adapt the example code to this, but it generates nonsense:

from guidance.models import LlamaCpp, LlamaCppChat


class Llama(LlamaCpp):
    pass


class Llama3Chat(LlamaCppChat, Llama):
    # Llama-3 wraps each message in header tokens and terminates it with <|eot_id|>.
    def get_role_start(self, role_name, **kwargs):
        if role_name == "system":
            return "<|start_header_id|>system<|end_header_id|>\n\n"
        elif role_name == "user":
            return "<|start_header_id|>user<|end_header_id|>\n\n"
        else:
            return "<|start_header_id|>assistant<|end_header_id|>\n\n"

    def get_role_end(self, role_name=None):
        return "<|eot_id|>"


lm = Llama3Chat(
# lm = LlamaCppChat(
    model=...,  # path to the Llama-3-Instruct GGUF file
    n_gpu_layers=128,
    seed=42,
    n_ctx=2048,
    use_mlock=True,
    use_mmap=False,  # llama-cpp-python's flag is use_mmap, not no_mmap
    echo=True,
)
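For context, a minimal sketch of how a chat model like this is typically driven with guidance's role blocks; the prompt text, the "answer" capture name, and max_tokens are illustrative placeholders rather than the exact calls from the original run:

from guidance import gen, system, user, assistant

# The role blocks call get_role_start/get_role_end on the class above,
# so the Llama-3 special tokens get inserted around each message.
with system():
    lm += "You are a helpful assistant."
with user():
    lm += "Answer in one short sentence: what is guidance?"
with assistant():
    lm += gen("answer", max_tokens=50)

print(lm["answer"])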

Output:


short<|eot_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|>… (the <|start_header_id|> token repeats)