ITREX needs modification for the new Llama 3 prompt format

Question

redhairerINTEL opened this issue 9 months ago · 3 comments

redhairerINTEL commented 9 months ago

Llama 3 uses a new prompt format:
https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
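
For readers who want to see what the format looks like, here is a minimal sketch that builds a Llama 3 Instruct prompt string by hand. The helper name `build_llama3_prompt` is hypothetical and not part of ITREX or transformers; the special tokens follow the Meta prompt-format page linked above. In practice the tokenizer's own chat template should be preferred, and this is only meant to illustrate the layout.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Illustrative only: assemble a Llama 3 Instruct prompt from chat messages.

    `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write its reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


if __name__ == "__main__":
    print(build_llama3_prompt([
        {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
        {"role": "user", "content": "Who are you?"},
    ]))
```

The string should correspond to what the model's chat template produces before tokenization, which makes it handy for checking a prompt by eye.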

Answer 1 · 2024-04-23T16:53:26.000Z

@kevinintel

Answer 2 · 2024-04-25T09:03:45.000Z

Here is sample code for using the Llama 3 template.
All you need to do is apply the chat template to the messages to produce input_ids.

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)
# Load the model with 4-bit weight-only quantization.
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Apply the Llama 3 chat template; add_generation_prompt=True appends the
# assistant header so the model starts its reply.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generate the reply, streaming tokens to stdout as they are produced.
outputs = model.generate(input_ids, streamer=streamer)

We will also add this to the documentation soon.

Answer 3 · 2024-04-30T06:13:17.000Z

Running the sample code above gives me: AssertionError: Fail to convert pytorch model