TextGenerationOptions is totally not used
AsakusaRinne opened this issue · 9 comments
TextGenerationOptions is a parameter of ITextGeneration.GenerateTextAsync. However, it currently does not seem to be used anywhere.
For some hosted API services, such as OpenAI's ChatGPT, a stop sequence is not very important. For local model inference, however, the model will keep generating a response endlessly without a stop sequence.
Could you please expose TextGenerationOptions to the AskAsync API to let users configure the settings themselves? It would help a lot with local LLM inference integration.
If possible, I hope the method used to count tokens can also be made configurable.
Any updates? @dluc I understand that in the early stage of a project there is always a shortage of hands. Please at least let us know whether this will be addressed in the future.
Sorry, we haven't had an opportunity to look into this yet, but we always keep an eye on the list of open issues, so we'll provide an update as soon as possible.
OK, I'm looking forward to it. Thank you for your work anyway.
I noticed that LLama would generate tokens ad infinitum (almost; at some point it throws an exception). SearchClientConfig.AnswerTokens will be passed as TextGenerationOptions.MaxTokens.
I'll look into adding the options to the Ask API, so the behavior can be managed more easily.
Thanks a lot! I'm looking forward to it.
Update: @marcominerva added new settings to SearchClientConfig (see #341) that allow configuring Temperature, TopP, and other LLM request settings. These will soon be used by AskAsync; a PR is coming soon.
@dluc Thank you, Devis! We'll keep track of the next release and apply it in LLamaSharp.