Output repeats until the maximum output length is reached

Question

Opened this issue 3 months ago · 2 comments

When I run Openchat-3.5-0106 locally, the model repeats its output until the maximum output length is reached. In other words, generation does not stop on its own.
The model also loads with the following warning:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
How can I solve this?

Answer 1 · 2024-03-27T06:04:15.000Z

Hi @zestaken, can you provide more information about your local model setup?
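
In particular, it would help to know which token ids generation is configured to stop on. Below is a minimal sketch for checking that. It assumes the model is loaded through Hugging Face transformers and that `<|end_of_turn|>` is the end-of-turn token for this model; neither is confirmed in this thread. `stop_token_ids` is a hypothetical helper name, and the repo id in the comments is also an assumption.

```python
def stop_token_ids(tokenizer, end_of_turn="<|end_of_turn|>"):
    """Collect the token ids that generation should stop on.

    Hypothetical helper: works with any object exposing `eos_token_id`
    and `convert_tokens_to_ids`, as Hugging Face tokenizers do.
    """
    ids = []
    if tokenizer.eos_token_id is not None:
        ids.append(tokenizer.eos_token_id)

    # Look up the assumed end-of-turn token. Unknown tokens come back
    # as None or as the unk id, depending on the tokenizer.
    eot_id = tokenizer.convert_tokens_to_ids(end_of_turn)
    unk_id = getattr(tokenizer, "unk_token_id", None)
    if eot_id is not None and eot_id != unk_id and eot_id not in ids:
        ids.append(eot_id)
    return ids


# Possible usage (not run here; requires transformers and the model weights):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   name = "openchat/openchat-3.5-0106"  # assumed repo id
#   tokenizer = AutoTokenizer.from_pretrained(name)
#   model = AutoModelForCausalLM.from_pretrained(name)
#   print(tokenizer.eos_token, stop_token_ids(tokenizer))
#   inputs = tokenizer("...", return_tensors="pt")
#   output = model.generate(
#       **inputs,
#       max_new_tokens=256,
#       eos_token_id=stop_token_ids(tokenizer),  # int or list of ints
#   )
```

If the printed list does not include the token the model was trained to end its turn with, or the prompt does not follow the model's chat template, that could explain output that runs to the length limit.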

Answer 2 · 2024-04-08T11:33:18.000Z

I also encountered this problem.