Stop token during Inference
sankethgadadinni opened this issue · 3 comments
Why are the responses cut off in the middle?
You need to update
generation_config.max_new_tokens = 200
to however many new tokens you want the model to generate.
# Reuse and override the model's default generation settings
generation_config = model.generation_config
generation_config.max_new_tokens = 100          # upper bound on generated tokens
generation_config.temperature = 0.5             # lower = more deterministic
generation_config.top_p = 0.7                   # nucleus sampling cutoff
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id  # stop when EOS is produced
I have this config, but it still stops only after generating the full number of tokens. Is there a way to stop at the end of a sentence, like OpenAI does?
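One way to approximate that is a custom stopping criterion. The sketch below is only an illustration, not from this thread: the SentenceEndCriteria class, the list of terminator characters, and the prompt/usage lines are all assumptions, and it relies on the standard StoppingCriteria / StoppingCriteriaList classes from Hugging Face transformers plus the model and tokenizer already loaded above.

# Minimal sketch: stop once the newly generated text ends a sentence.
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class SentenceEndCriteria(StoppingCriteria):
    """Stop when the text generated after the prompt ends with a terminator."""
    def __init__(self, tokenizer, prompt_length, terminators=(".", "!", "?")):
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length  # number of prompt tokens to skip
        self.terminators = terminators

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the tokens produced after the prompt and check the ending.
        new_text = self.tokenizer.decode(
            input_ids[0, self.prompt_length:], skip_special_tokens=True
        )
        return new_text.rstrip().endswith(self.terminators)

# Hypothetical usage with the generation_config defined above:
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# outputs = model.generate(
#     **inputs,
#     generation_config=generation_config,
#     stopping_criteria=StoppingCriteriaList(
#         [SentenceEndCriteria(tokenizer, inputs["input_ids"].shape[1])]
#     ),
# )

Note that generation may still end early at eos_token_id or run up to max_new_tokens; this criterion only adds an extra stop condition on top of those.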
Increase max_new_tokens to something like 400-500 to get longer replies. Falcon-7B can output at most 2k tokens.
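As a rough sketch of that suggestion (the prompt string and device handling here are illustrative, assuming the model, tokenizer, and generation_config from the earlier comment are already set up):

generation_config.max_new_tokens = 500  # allow longer replies, within Falcon-7B's ~2k-token limit
inputs = tokenizer("Explain transfer learning in simple terms.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))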