Script utilizing LLM
jcgeo9 opened this issue · 1 comment
Can you provide a script similar to inference-example.py that uses the run_generation.py file? I.e., instead of command-line execution:
```bash
python src/run_generation.py --model_type llama --model_name_or_path meta-llama/Llama-2-13b-chat-hf \
    --prefix "<s>[INST] <<SYS>>\n You are a helpful assistant. Answer with detailed responses according to the entire instruction or question. \n<</SYS>>\n\n Summarize the following book: " \
    --prompt example_inputs/harry_potter_full.txt \
    --suffix " [/INST]" --test_unlimiformer --fp16 --length 200 --layer_begin 16 \
    --index_devices 1 --datastore_device 1
```
I would like to instead load the model and run inference from a Python script.
Thanks in advance!
You can do this from a script by importing run_generation and calling it with your arguments:
```python
from run_generation import main

main(['--model_type', 'llama', '--model_name_or_path', 'meta-llama/Llama-2-13b-chat-hf', <rest of your args here>])
```
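For reference, here is what the full call might look like with the arguments from your original command translated into the list. This is a minimal sketch: it assumes `run_generation.main` accepts an argv-style list as in the snippet above, and that you run it from a location where `run_generation` is importable (e.g., the `src/` directory, or with `src/` on your `PYTHONPATH`).

```python
# Sketch: calling run_generation.main with the same arguments as the
# command-line example above. Assumes main() parses an argv-style list.
from run_generation import main

args = [
    '--model_type', 'llama',
    '--model_name_or_path', 'meta-llama/Llama-2-13b-chat-hf',
    '--prefix', ('<s>[INST] <<SYS>>\n You are a helpful assistant. '
                 'Answer with detailed responses according to the entire '
                 'instruction or question. \n<</SYS>>\n\n '
                 'Summarize the following book: '),
    '--prompt', 'example_inputs/harry_potter_full.txt',
    '--suffix', ' [/INST]',
    '--test_unlimiformer',
    '--fp16',
    '--length', '200',
    '--layer_begin', '16',
    '--index_devices', '1',
    '--datastore_device', '1',
]
main(args)
```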