Question about Inference
IzzetYoung opened this issue · 2 comments
IzzetYoung commented
I'm having problems using it as a conversation model. My code, which is similar to yours, is as follows:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("./voldemort")
model = AutoModelForCausalLM.from_pretrained("./voldemort").cuda()
meta_prompt = """I want you to act like {character}. I want you to respond and answer like {character}, using the tone, manner and vocabulary {character} would use. You must know all of the knowledge of {character}.
The status of you is as follows:
Status: {status}
Location: {loc_time}
The interactions are as follows:
Harry Potter (speaking): Hey Voldemort!<|eot|>
Voldemort (speaking): """
name = "Voldemort"
status = f'{name} is chatting with Harry Potter.'
loc_time = 'Hogwarts Astronomy Tower'
prompt = meta_prompt.format(character=name, status=status, loc_time=loc_time)
inputs = tokenizer([prompt], return_tensors="pt")
outputs = model.generate(input_ids=inputs['input_ids'].cuda(), attention_mask=inputs['attention_mask'].cuda(), do_sample=True, temperature=0.5, top_p=0.95, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
But I got the following output:
I want you to act like Voldemort. I want you to respond and answer like Voldemort, using the tone, manner and vocabulary Voldemort would use. You must know all of the knowledge of Voldemort.
The status of you is as follows:
Status: Voldemort is chatting with Harry Potter.
Location: Hogwarts Astronomy Tower
The interactions are as follows:
Harry Potter (speaking): Hello!<|eot|>
Voldemort (speaking): ��������������������������������������������������
Do you know what is wrong in my code?
choosewhatulike commented
Due to the license used by LLaMA 1, we only release the weight differences, and you need to recover the full weights by running the conversion command.
Did you follow the README to convert the wdiff to model weights?
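For context, recovering weights from a wdiff release conceptually means adding each released difference tensor back onto the matching base-model tensor. This is only a minimal sketch of that idea, not the repo's actual conversion script; the function name and the assumption that diffs are stored as a per-parameter state dict are mine:

```python
import torch

def recover_weights(base_state: dict, diff_state: dict) -> dict:
    """Add each diff tensor to the matching base tensor.

    Sketch only: assumes both dicts share the same parameter names,
    shapes, and dtypes, as a wdiff-style release typically would.
    """
    recovered = {}
    for name, diff in diff_state.items():
        # Skipping this addition leaves the model with nonsense weights,
        # which is why generation produced garbage tokens above.
        recovered[name] = base_state[name] + diff
    return recovered
```

If this step is skipped, `from_pretrained` still loads the diff tensors without error, so the failure only shows up as garbled generations like the one in the question.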
IzzetYoung commented
OMG, I forgot that step. After converting, it works! Thanks!