Location of downloaded bin file to deploy
nigue3025 opened this issue · 2 comments
Hi,
I want to deploy the model on machine A.
After executing the sh file, Docker seems to work.
Due to machine A's poor internet connection, it could not successfully download all of the bin files (approximately 26 GB) when the Docker container started up.
So I downloaded the bin files from https://huggingface.co/yentinglin/Taiwan-LLaMa-v1.0/tree/main on my own computer
and moved them to machine A.
However, it still did not work (it still attempts to download the bin files from the internet).
I am not sure where in the project folder I should place the bin files.
Any suggestions?
You can specify the absolute path of the model inside the container, e.g.:
# Note that the model path cannot include a dot.
git clone https://huggingface.co/yentinglin/Taiwan-LLaMa-v1.0 TwLLaMAv1
# The following is the script I used in my case.
docker run --gpus 'device=0' \
--shm-size 1g \
-p 8085:80 \
-v $PWD/TwLLaMAv1:/TwLLaMAv1 \
ghcr.io/huggingface/text-generation-inference:latest \
--model-id /TwLLaMAv1 \
--num-shard 1 \
--quantize bitsandbytes \
--max-input-length 1000 \
--max-total-tokens 2000
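Once the container is up, you can check that it is serving the local weights by sending a request to TGI's `/generate` endpoint. This is a minimal sketch: the port matches the `-p 8085:80` mapping above, and the prompt is just a placeholder.

```shell
# A JSON response containing "generated_text" confirms the model
# was loaded from the mounted /TwLLaMAv1 directory rather than
# re-downloaded from the Hub.
curl http://localhost:8085/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "Hello,", "parameters": {"max_new_tokens": 20}}'
```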
It works!
Thanks for the kind help!