Generation gets interrupted
MAKESEB commented
rjmacarthy commented
Please check the max tokens setting for chat and set it to -1 for infinite. The model may be sending an EOS token, or the request from the server may have ended.
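For reference, a minimal sketch of the equivalent request made directly against Ollama, assuming the default `/api/chat` endpoint on port 11434 and that the chat max-token setting maps to Ollama's `num_predict` option (the model name is only an example):

```typescript
// Minimal sketch: send a chat request to a local Ollama server with
// num_predict set to -1 so generation is not cut off by a token limit.
// Assumes the default Ollama address; "codellama" is just an example model.
async function chatWithoutTokenLimit(prompt: string): Promise<void> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama",
      stream: false,
      messages: [{ role: "user", content: prompt }],
      options: {
        num_predict: -1 // -1 lets generation run until the model emits an EOS/stop token
      }
    })
  })
  const data = await response.json()
  console.log(data.message?.content)
}

chatWithoutTokenLimit("Explain how HTTP keep-alive works, in detail.")
```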
MAKESEB commented
rjmacarthy commented
Hmm, not sure, I'll have to check. Does it happen in the terminal when calling Ollama with the same options? You can enable debug mode in settings to see the request and options.
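One way to test this outside the editor is to stream the same request directly at the Ollama API and watch where the output stops. A rough sketch, assuming Node 18+ (global fetch), the default Ollama address, and options copied from the debug output (model name is an example):

```typescript
// Rough reproduction sketch: stream a chat request straight to Ollama and
// print each chunk, so it is visible whether the server ends the response
// early or the model simply emits an EOS token.
async function streamAndWatch(): Promise<void> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama",
      stream: true,
      messages: [{ role: "user", content: "Explain async iterators in detail." }],
      options: { num_predict: -1, temperature: 0.2 }
    })
  })
  const decoder = new TextDecoder()
  // Each streamed line is a JSON object; the final one has "done": true.
  for await (const chunk of response.body as unknown as AsyncIterable<Uint8Array>) {
    process.stdout.write(decoder.decode(chunk, { stream: true }))
  }
  console.log("\n-- stream closed --")
}

streamAndWatch()
```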
rjmacarthy commented
Hey @MAKESEB, in 3.7.0 I updated to the new version of the Ollama API, which supports the OpenAI specification. It may help with this issue; please report back.
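For context, a sketch of a request against the OpenAI-compatible endpoint that recent Ollama versions expose, again assuming the default local address and an example model name; under this spec, omitting `max_tokens` leaves the response length uncapped and `finish_reason` reports why generation stopped:

```typescript
// Sketch of a request against Ollama's OpenAI-compatible endpoint.
// Assumes the default local address; "codellama" is only an example model.
async function openAiStyleChat(): Promise<void> {
  const response = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama",
      messages: [{ role: "user", content: "Summarise what this issue is about." }]
      // no max_tokens: generation ends only when the model produces a stop/EOS token
    })
  })
  const data = await response.json()
  const choice = data.choices[0]
  // finish_reason is "stop" for a natural end and "length" if a token limit was hit
  console.log(choice.message.content, choice.finish_reason)
}

openAiStyleChat()
```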
Smrkc commented
Updating twinny and setting all token limits to -1 solves the incomplete generation issue for me. Thanks
rjmacarthy commented
Thanks!