vicuna-tools/vicuna-installation-guide

Error loading model: is this really a GGML file?

kilkujadek opened this issue · 9 comments

Hello,

Using the One-line install seems to be successful (except for a few warnings):

git clone https://github.com/fredi-python/llama.cpp.git && cd llama.cpp && make -j && cd models && wget -c https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Cloning into 'llama.cpp'...
remote: Enumerating objects: 2390, done.
remote: Counting objects: 100% (867/867), done.
remote: Compressing objects: 100% (77/77), done.
remote: Total 2390 (delta 815), reused 790 (delta 790), pack-reused 1523
Receiving objects: 100% (2390/2390), 2.16 MiB | 3.93 MiB/s, done.
Resolving deltas: 100% (1566/1566), done.
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:  
I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110

cc  -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native   -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c llama.cpp -o llama.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c examples/common.cpp -o common.o
llama.cpp: In function 'size_t llama_set_state_data(llama_context*, const uint8_t*)':
llama.cpp:2615:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
 2615 |             kin3d->data = (void *) in;
      |                           ^~~~~~~~~~~
llama.cpp:2619:27: warning: cast from type 'const uint8_t*' {aka 'const unsigned char*'} to type 'void*' casts away qualifiers [-Wcast-qual]
 2619 |             vin3d->data = (void *) in;
      |                           ^~~~~~~~~~~
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native pocs/vdot/vdot.cpp ggml.o -o vdot 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/main/main.cpp ggml.o llama.o common.o -o main 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize/quantize.cpp ggml.o llama.o -o quantize 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/quantize-stats/quantize-stats.cpp ggml.o llama.o -o quantize-stats 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/perplexity/perplexity.cpp ggml.o llama.o common.o -o perplexity 
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/embedding/embedding.cpp ggml.o llama.o common.o -o embedding 

====  Run ./main -h for help.  ====

--2023-05-15 09:57:30--  https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/resolve/main/ggml-vic13b-uncensored-q5_1.bin
Resolving huggingface.co (huggingface.co)... 108.138.51.20, 108.138.51.95, 108.138.51.49, ...
Connecting to huggingface.co (huggingface.co)|108.138.51.20|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb%7EAPwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze%7Es7n0hQjX%7EBKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX [following]
--2023-05-15 09:57:31--  https://cdn-lfs.huggingface.co/repos/0a/36/0a36ee786df124a005175a3d339738ad57350a96ae625c2111bce6483acbe34a/6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a?response-content-disposition=attachment%3B+filename*%3DUTF-8''ggml-vic13b-uncensored-q5_1.bin%3B+filename%3D%22ggml-vic13b-uncensored-q5_1.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1684397227&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZG4tbGZzLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzBhLzM2LzBhMzZlZTc4NmRmMTI0YTAwNTE3NWEzZDMzOTczOGFkNTczNTBhOTZhZTYyNWMyMTExYmNlNjQ4M2FjYmUzNGEvNmZjMTI5NGI3MjIwODI2MzFjZDYxYjFiZGUyY2ZlY2QxNTMzZWI5NWIzMzFkYmJkYWNiZWJlNDk0NGZmOTc0YT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2ODQzOTcyMjd9fX1dfQ__&Signature=tbSCc4T3qUsBlw-mQrtcKwBQL0cbfeZe8MH3aGUv4EgOfo0JZibFFetpyqKk88LsDRKNzStyM6epwjbiB11PwEE73JT6ajJnAkArMkNDOmTO4NP6poC1rHlM-XRz3WuSdi3nY0fdDYYYL1gHb~APwILghy-z4-vWRSEPldUQGTuqCZqj2knjmVtIuHSk06fShBYKOWKM7nnzb0-ENQumj6garze~s7n0hQjX~BKTGAD-HI5mMy1I5rwfA5M6eQ9zYavGHKNj104LftBPBLjpvAamO6fGS1L6KQYiKG-t68AuDgBy8TVbdIfTYJbN52vnvcfaiz3E5QB8JrvMv5uETQ__&Key-Pair-Id=KVTP0A1DKRTAX
Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 18.244.102.114, 18.244.102.76, 18.244.102.9, ...
Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|18.244.102.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9763701888 (9.1G) [application/octet-stream]
Saving to: 'ggml-vic13b-uncensored-q5_1.bin'

ggml-vic13b-uncensored-q5_1.bin                  100%[=========================================================================================================>]   9.09G  2.99MB/s    in 49m 46s 

2023-05-15 10:47:17 (3.12 MB/s) - 'ggml-vic13b-uncensored-q5_1.bin' saved [9763701888/9763701888]
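As a sanity check: the long hex string in the cdn-lfs URL above (6fc1294b…) is the Git LFS object ID, which is the SHA-256 of the file itself, so a resumed wget -c download can be verified against it:

# should print 6fc1294b722082631cd61b1bde2cfecd1533eb95b331dbbdacbebe4944ff974a
sha256sum ggml-vic13b-uncensored-q5_1.bin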

But when I try to run it, it throws an error:

./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36

main: build = 523 (0737a47)
main: seed  = 1684152947
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model
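For context: the magic 67676a74 is the four-byte tag 'ggjt', and version 00000002 is the GGJT v2 file format that upstream llama.cpp switched to in May 2023; a binary built from an older checkout only understands GGJT v1, hence the complaint. You can inspect the header yourself; on a little-endian x86 machine this prints the same two words the loader reports:

# dump the first two 32-bit words of the model file;
# a GGJT v2 file shows: 67676a74 00000002
od -A x -t x4 -N 8 models/ggml-vic13b-uncensored-q5_1.bin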

navigate to your llama.cpp folder and type:
git pull

Nope, still same:

(lama2) kilku@debian:~/vicuna/llama.cpp$ git pull
Already up to date.
(lama2) kilku@debian:~/vicuna/llama.cpp$ ./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36
main: build = 523 (0737a47)
main: seed  = 1684157106
llama.cpp: loading model from models/ggml-vic13b-uncensored-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000002; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vic13b-uncensored-q5_1.bin'
main: error: unable to load model

Try running make -j again:

make -j

Did that as well.
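If git pull says "Already up to date." but the magic error persists, the fork itself may simply not contain the new loader yet. One rough way to check (the identifier names follow upstream llama.cpp of that period, so treat them as an assumption about this fork):

# any hits for a v2 loader (e.g. LLAMA_FILE_VERSION_GGJT_V2) mean the
# checkout knows the new format; no hits mean it predates the change
grep -in ggjt llama.h llama.cpp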

I have the same error, OS Linux Mint.

Same here: just freshly cloned the repo, freshly built it, freshly downloaded the 7B model, and I see the same error.

Same error with the 13B model.

OK, I just updated my fork of llama.cpp, it should work now!
Navigate to the llama.cpp folder and type:

git pull

then:

make -j

Should work!
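Putting the whole fix together (same model path, prompt file, and flags as in the original report; the ~/vicuna/llama.cpp path comes from the shell prompt above):

cd ~/vicuna/llama.cpp
git pull
make -j
./main -m models/ggml-vic13b-uncensored-q5_1.bin -f 'prompts/chat-with-vicuna-v1.txt' -r 'User:' --temp 0.36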

It is working now, thanks!