default values of max_generated_tokens, top_k, top_p, and temperature?
JamieVC commented
What are the default values of max_generated_tokens, top_k, top_p, and temperature?
If the user doesn't set these parameters in generate_kwargs, as in the example below, the default values should be used. How do we get them, and in which source file are they defined?
# Use custom LLM in BigDL
from ipex_llm.llamaindex.llms import IpexLLM
# args, messages_to_prompt, and completion_to_prompt are defined in the rag.py example linked below
llm = IpexLLM.from_model_id(
    model_name=args.model_path,
    tokenizer_name=args.tokenizer_path,
    context_window=512,
    max_new_tokens=args.n_predict,
    generate_kwargs={"temperature": 0.7, "do_sample": False},
    model_kwargs={},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    device_map="xpu",
)
Refer to: https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/LlamaIndex/rag.py
ivy-lv11 commented
The default values are determined by the model's configuration. For instance, if you use the Llama-2-chat-hf model, the defaults are temperature=0.6 and top_p=0.9. You can modify these settings in the generation_config.json file located in the model folder.
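For reference, here is a minimal sketch of how you could inspect a model's generation defaults with Hugging Face transformers (the model path below is a placeholder; substitute your local model folder). Attributes not set in the model's generation_config.json fall back to the transformers library's built-in defaults, e.g. temperature=1.0, top_k=50, top_p=1.0.
from transformers import GenerationConfig

model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder: point this at your local model folder

# Loads generation_config.json from the model folder (or the Hub).
gen_config = GenerationConfig.from_pretrained(model_path)

# Values not overridden in generation_config.json fall back to the
# transformers defaults (temperature=1.0, top_k=50, top_p=1.0).
print("temperature:", gen_config.temperature)
print("top_k:", gen_config.top_k)
print("top_p:", gen_config.top_p)
print("max_new_tokens:", gen_config.max_new_tokens)  # typically None unless set explicitly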