Error loading model: is this really a GGML file?
kilkujadek opened this issue · 9 comments
kilkujadek commented
Hello,
Using the one-line install seems to be successful (except for a few warnings):
git clone https://github.com/fredi-python/llama.cpp.git && cd llama.cpp && make -j && cd models && wget -c https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Cloning into 'llama.cpp'...
remote: Enumerating objects: 2390, done.
remote: Counting objects: 100% (867/867), done.
remote: Compressing objects: 100% (77/77), done.
remote: Total 2390 (delta 815), reused 790 (delta 790), pack-reused 1523
Receiving objects: 100% (2390/2390), 2.16 MiB | 3.93 MiB/s, done.
Resolving deltas: 100% (1566/1566), done.
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c llama.cpp -o llama.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c examples/common.cpp -o common.o
llama.cpp: In function 'size_t llama_set_state_data(llama_context*, const uint8_t*)':
llama.cpp:2615:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
2615 | kin3d->data = (void *) in;
| ^~~~~~~~~~~
llama.cpp:2619:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
2619 | vin3d->data = (void *) in;
| ^~~~~~~~~~~
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native pocs/vdot/vdot.cpp ggml.o -o vdot
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o -o main
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize/quantize.cpp ggml.o llama.o -o quantize
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize-stats/quantize-stats.cpp ggml.o llama.o -o quantize-stats
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/perplexity/perplexity.cpp ggml.o llama.o common.o -o perplexity
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/embedding/embedding.cpp ggml.o llama.o common.o -o embedding
==== Run ./main -h for help. ====
--2023-05-15 09:57:30-- https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Resolving huggingface.co (huggingface.co)... 108.138.51.20, 108.138.51.95, 108.138.51.49, ...
Connecting to huggingface.co (huggingface.co)|108.138.51.20|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb%7EAPwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze%7Es7n0hQjX%7EBKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX [following]
--2023-05-15 09:57:31-- https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8''ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb~APwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze~s7n0hQjX~BKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX
Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 18.244.102.114, 18.244.102.76, 18.244.102.9, ...
Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|18.244.102.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9763701888 (9.1G) [application/octet-stream]
Saving to: 'ggml-vic13b-uncensored-q5_1.bin'
ggml-vic13b-uncensored-q5_1.bin 100%[=========================================================================================================>] 9.09G 2.99MB/s in 49m 46s
2023-05-15 10:47:17 (3.12 MB/s) - 'ggml-vic13b-uncensored-q5_1.bin' saved [9763701888/9763701888]
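For reference, the two -Wcast-qual warnings above come from llama_set_state_data() casting its const uint8_t * input buffer to void *; they do not stop the build. A minimal standalone snippet (purely illustrative, not llama.cpp code) reproduces the same warning when compiled with g++ -Wcast-qual:

    #include <cstdint>

    int main() {
        const std::uint8_t buf[1] = {0};
        const std::uint8_t *in = buf;
        // Same pattern as llama.cpp lines 2615/2619: a C-style cast that
        // drops the const qualifier triggers -Wcast-qual.
        void *p = (void *) in;
        (void) p;
        return 0;
    }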
But when I try to run it, it throws an error:
./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36
main: build = 523 (0737a47)
main: seed = 1684152947
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model
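For context: the loader reads the first eight bytes of the model file as a (magic, version) pair. 0x67676a74 spells "ggjt" in hex, and version 2 of that container was, as far as I can tell, introduced in upstream llama.cpp in May 2023 together with the quantization format change, so a binary built from an older tree only knows version 1 and rejects the file. A minimal standalone sketch (not llama.cpp's actual loader code) that prints the pair the loader checks:

    #include <cstdint>
    #include <cstdio>

    // Hypothetical checker: dump the (magic, version) header of a GGML/GGJT model file.
    int main(int argc, char **argv) {
        if (argc != 2) {
            std::fprintf(stderr, "usage: %s <model.bin>\n", argv[0]);
            return 1;
        }
        std::FILE *f = std::fopen(argv[1], "rb");
        if (!f) {
            std::perror("fopen");
            return 1;
        }
        std::uint32_t magic = 0, version = 0;
        if (std::fread(&magic, sizeof magic, 1, f) != 1 ||
            std::fread(&version, sizeof version, 1, f) != 1) {
            std::fprintf(stderr, "file too short\n");
            std::fclose(f);
            return 1;
        }
        std::fclose(f);
        std::printf("magic = %08x, version = %u\n", (unsigned) magic, (unsigned) version);
        return 0;
    }

Run against the downloaded file, this should print magic = 67676a74, version = 2, matching the error message; that would confirm the file itself is fine and the loader is simply too old for it.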
fredi-python commented
navigate to your llama.cpp folder and type:
git pull
kilkujadek commented
navigate to your llama.cpp folder and type:
git pull
Nope, still the same:
(lama2) kilku@debian:~/vicuna/llama.cpp$ git pull
Already up to date.
(lama2) kilku@debian:~/vicuna/llama.cpp$ ./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36
main: build = 523 (0737a47)
main: seed = 1684157106
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model
fredi-python commented
try running make -j again
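(If git pull fetched new sources but main still prints the old build number, stale object files may be the culprit; make clean && make -j forces a full rebuild.)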
kilkujadek commented
make -j
did that as well
ziliangpeng commented
Same error here. I freshly cloned the repo, freshly built it, and freshly downloaded the 7b model.
andreibondarev commented
Same error with the 13b model.
fredi-python commented
OK, I just updated my fork of llama.cpp, it should work now!
navigate to the llama.cpp folder and type
git pull
then:
make -j
Should work!
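To verify, rerun the original command, e.g.:

    ./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36

main should now print a build newer than 523 (0737a47) and get past the model loader instead of failing on the (magic, version) check.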
kilkujadek commented
It is working now, thanks!