intel-analytics/ipex-llm

default values of max_generated_tokens, top_k, top_p, and temperature?

Opened this issue · 1 comment

What are the default values of max_generated_tokens, top_k, top_p, and temperature?
If the user doesn't set these parameters in generate_kwargs, as in the example below, default values should be used. Where do those defaults come from, and in which source file can I find them?

    # Use custom LLM in BigDL
    from ipex_llm.llamaindex.llms import IpexLLM
    llm = IpexLLM.from_model_id(
        model_name=args.model_path,
        tokenizer_name=args.tokenizer_path,
        context_window=512,
        max_new_tokens=args.n_predict,
        generate_kwargs={"temperature": 0.7, "do_sample": False},
        model_kwargs={},
        messages_to_prompt=messages_to_prompt,
        completion_to_prompt=completion_to_prompt,
        device_map="xpu",
    )

Refer to:
https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/LlamaIndex/rag.py

The default values are determined by the model's generation configuration. For instance, the Llama-2-chat-hf model ships with temperature=0.6 and top_p=0.9. You can inspect or modify these settings in the generation_config.json file located in the model folder.
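
For reference, here is a minimal sketch of how to read and edit those defaults with the transformers GenerationConfig API (assuming the checkpoint is a standard Hugging Face model folder; the model_path below is a placeholder):

    # Minimal sketch: inspect and edit the generation defaults of a checkpoint.
    # Assumes a standard Hugging Face model folder; adjust model_path to yours.
    from transformers import GenerationConfig

    model_path = "path/to/Llama-2-7b-chat-hf"  # placeholder path

    # Values shipped in <model_path>/generation_config.json (if present)
    cfg = GenerationConfig.from_pretrained(model_path)
    print(cfg.temperature, cfg.top_p, cfg.top_k, cfg.max_new_tokens)

    # Library-wide fallbacks used when a field is absent from the file:
    # temperature=1.0, top_k=50, top_p=1.0, do_sample=False
    print(GenerationConfig())

    # Override a default and write it back to generation_config.json
    cfg.temperature = 0.7
    cfg.save_pretrained(model_path)

In transformers, arguments passed to model.generate() (which is typically where generate_kwargs end up) take precedence over both the file and the library fallbacks at generation time.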