datawhalechina/self-llm

LLaMA3-8B-Instruct WebDemo Deployment

Closed this issue · 1 comment

I followed the tutorial to run the demo and it works, but the replies are extremely long and the model keeps asking and answering its own questions. What could be causing this?

Solved. There are two changes that stop LLaMA3-8B-Instruct from generating endlessly.

Option 1: the original call is

```python
outputs = model.generate(
    input_ids=input_ids, max_new_tokens=512, do_sample=True,
    top_p=0.9, temperature=0.5, repetition_penalty=1.1,
    eos_token_id=tokenizer.encode('<|eot_id|>')[0])
```

Remove `eos_token_id=tokenizer.encode('<|eot_id|>')[0]` and keep the default:

```python
outputs = model.generate(
    input_ids=input_ids, max_new_tokens=512, do_sample=True,
    top_p=0.9, temperature=0.5, repetition_penalty=1.1)
```

Option 2: pass both stop tokens explicitly:

```python
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
outputs = model.generate(
    input_ids=input_ids, max_new_tokens=512, do_sample=True,
    top_p=0.9, temperature=0.5, repetition_penalty=1.1,
    eos_token_id=terminators)
```
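A likely reason the original call misbehaved: `tokenizer.encode()` may prepend special tokens such as BOS by default, in which case index `[0]` returns the BOS id rather than the id of `<|eot_id|>`, so generation never sees its real stop token. The sketch below illustrates this pitfall with a hypothetical mock tokenizer (the class and its token ids are invented for illustration, not taken from the real LLaMA3 tokenizer):

```python
class MockTokenizer:
    """Minimal stand-in mimicking a tokenizer that prepends BOS on encode()."""
    bos_token_id = 128000               # illustrative id for a BOS-style token
    eos_token_id = 128001               # illustrative id for the default EOS token
    _vocab = {"<|eot_id|>": 128009}     # illustrative id for <|eot_id|>

    def encode(self, text):
        # Many tokenizers add special tokens by default, so the first
        # element of the result is BOS, not the token you passed in.
        return [self.bos_token_id, self._vocab[text]]

    def convert_tokens_to_ids(self, token):
        # A pure vocabulary lookup: no special tokens are added.
        return self._vocab[token]

tok = MockTokenizer()
# encode(...)[0] grabs the prepended BOS id, not <|eot_id|>:
wrong_id = tok.encode("<|eot_id|>")[0]
# convert_tokens_to_ids returns the actual id of <|eot_id|>:
right_id = tok.convert_tokens_to_ids("<|eot_id|>")
print(wrong_id, right_id)
```

This is why Option 2 uses `convert_tokens_to_ids("<|eot_id|>")`: it looks the token up directly instead of slicing an encoded sequence that may start with a special token.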