Download 7B model seems stuck
guanlinz opened this issue · 9 comments
It has been stuck downloading a file for 3 hours and still hasn't finished:
(llamachat) [ec2-user@ip-172-31-6-66 llamachat]$ python -m llama.download --model_size 7B --folder ./pyllama_data/
❤️ Resume download is supported. You can ctrl-c and rerun the program to resume the downloading
Downloading tokenizer...
✅ ./pyllama_data//tokenizer.model
✅ ./pyllama_data//tokenizer_checklist.chk
tokenizer.model: OK
Downloading 7B
downloading file to ./pyllama_data//7B/consolidated.00.pth ...please wait for a few minutes ...
Hello, are you still stuck at downloading the file? I'm also stuck when downloading the 7B model.
Same here. I checked the bandwidth usage and confirmed that it gets stuck while downloading the 7B model with this script.
Same here.
I'm stuck with 13B. Is it expected to finish in a few minutes?
I also hit the same issue; if I rerun the command, the download continues for about 4-5 minutes and then gets stuck again.
I haven't looked into the code to debug it yet, but for my purposes I just created a bash script that restarts the download process after some time, and it works for me.
Here's my sketchy solution :D (which I borrowed a lot from ChatGPT lol):
#!/bin/bash

# Function to handle stopping the script
function stop_script() {
    echo "Stopping the script."
    exit 0
}

# Register the signal handler
trap stop_script SIGINT

while true; do
    # Run the command with a timeout of 200 seconds
    timeout 200 python -m llama.download --model_size $1 --folder model
    echo "restart download"
    sleep 1  # Wait for 1 second before starting the next iteration

    # Wait for any key to be pressed within a 1-second timeout
    read -t 1 -n 1 -s key
    if [[ $key ]]; then
        stop_script
    fi
done
and you use the script like so:
bash llama_download.sh 7B
I highly recommend downloading each model individually rather than all at once, since each iteration re-checks the checksums of the previously downloaded models, which can take the full 200 seconds.
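A variant that stops looping on its own, instead of waiting for a keypress, is to break out when the command exits cleanly rather than being killed by `timeout` (a sketch, assuming `llama.download` exits with status 0 once everything is downloaded and verified; the function name `retry_until_done` is mine, not part of pyllama):

```shell
#!/bin/bash
# Retry a command under `timeout` until it exits successfully on its own.
# timeout(1) returns 124 when it has to kill the command, so a zero exit
# status means the wrapped command finished by itself.
retry_until_done() {
    local limit="$1"; shift
    until timeout "$limit" "$@"; do
        echo "restart download"
        sleep 1
    done
}

# Hypothetical usage, mirroring the script above:
# retry_until_done 200 python -m llama.download --model_size 7B --folder model
```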
Hi @CuongTranXuan,
Thank you for sharing the shell script. I ran into the same issue and used your script to download the 7B weights. However, it seems like this is also a never-ending loop. I keep running into the following:
❤️ Resume download is supported. You can ctrl-c and rerun the program to resume the downloading
Downloading tokenizer...
✅ model/tokenizer.model
✅ model/tokenizer_checklist.chk
tokenizer.model: OK
Downloading 7B
downloading file to model/7B/consolidated.00.pth ...please wait for a few minutes ...
✅ model/7B/consolidated.00.pth
✅ model/7B/params.json
✅ model/7B/checklist.chk
Checking checksums
consolidated.00.pth: OK
params.json: OK
restart download
I was wondering if I should stop the script manually, but I'm not sure whether the download is complete. Do you by any chance happen to know the file sizes of the weights you downloaded? Mine are the following:
checklist.chk -> 100 bytes
consolidated.00.pth -> 12852.61 MB
params.json -> 101 bytes
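For what it's worth, one way to confirm the download really is complete is to re-run the checksum verification by hand. The `checklist.chk` file appears to be in standard `md5sum -c` format (which is what the script's "Checking checksums" step seems to use), so a small helper like this can check a model folder (a sketch; `verify_weights` is my name, not part of pyllama):

```shell
# Verify downloaded weight files against the checklist shipped with them.
# Assumes checklist.chk is in md5sum's "HASH  FILENAME" format.
verify_weights() {
    (cd "$1" && md5sum -c checklist.chk)
}

# Hypothetical usage:
# verify_weights model/7B    # prints e.g. "consolidated.00.pth: OK" when intact
```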
Hi @z-mahmud22 ,
The download script itself verifies the hash checksums after the model finishes downloading, to check the model's integrity. So you can simply stop the download script once that check is done, either via the bash script or manually. This sketchy script is just a workaround; hopefully the actual download script gets patched soon. Cheers!
It seems that the wget process itself is working correctly. Kill the `python -m xxx` process while keeping wget running.
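If you want to try that, `pkill -f` matches against the full command line, so something like this kills only the python wrapper while any wget child it spawned keeps running (a sketch; it assumes the wrapper's command line contains `llama.download`, and `|| true` just swallows the "no process matched" error):

```shell
# Kill only the python wrapper; a wget child it spawned is re-parented
# and keeps downloading. `-f` matches against the full command line.
pkill -f "llama.download" || true
```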
How can I use wget to download? It seems like I only have the magnet link `magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA`.