ParisNeo/lollms-webui

Converting models from gpt4all

andzejsp opened this issue · 1 comment

Can someone help me to understand why they are not converting?

The default model that is downloaded by the UI converted with no problem.

I wrote a script based on install.bat: it clones llama.cpp and then runs the conversion command on all the models (roughly as sketched below).

gpt4all-unfiltered  - does not work
ggml-vicuna-7b-4bit  - does not work
vicuna-13b-GPTQ-4bit-128g  - reports as already converted but does not work
LLaMa-Storytelling-4Bit  - does not work

Ignore the .og extension on the models; I renamed them so that I still have the original copy when/if they get converted.
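
For reference, this is roughly what convert-model.sh does (a simplified sketch, not the exact script; it assumes the models live in ./models and llama.cpp gets cloned into ./tmp/llama.cpp, like in my setup):

#!/bin/bash
# convert-model.sh - simplified sketch of my conversion script
# Assumes llama.cpp is cloned into ./tmp/llama.cpp and the models are in ./models

set -e

if [ ! -d ./tmp/llama.cpp ]; then
    git clone https://github.com/ggerganov/llama.cpp ./tmp/llama.cpp
fi

for model in ./models/*; do
    echo ""
    echo "converting .. $(basename "$model")"
    # run the migration script on each model; keep going if one fails
    python3 ./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py "$model" "$model.converted" || true
done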

Let's try to convert:

sd2@sd2:~/gpt4all-ui$ bash convert-model.sh


converting .. story-llama30b-4bit-32g.safetensors
Traceback (most recent call last):
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
    main()
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 272, in main
    tokens = read_tokens(fin, hparams)
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 133, in read_tokens
    word = fin.read(length)
ValueError: read length must be non-negative or -1


converting .. story-llama13b-4bit-32g.safetensors.og
Traceback (most recent call last):
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
    main()
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 272, in main
    tokens = read_tokens(fin, hparams)
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 135, in read_tokens
    (score,) = struct.unpack("f", score_b)
struct.error: unpack requires a buffer of 4 bytes


converting .. ggml-vicuna-13b-1.1-q4_1.bin.og
./models/ggml-vicuna-13b-1.1-q4_1.bin.og: input ggml has already been converted to 'ggjt' magic



converting .. gpt4all-lora-unfiltered-quantized.bin.og
Traceback (most recent call last):
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
    main()
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 272, in main
    tokens = read_tokens(fin, hparams)
  File "/home/sd2/gpt4all-ui/./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 133, in read_tokens
    word = fin.read(length)
ValueError: read length must be non-negative or -1
sd2@sd2:~/gpt4all-ui$

When running the vicuna model I got this error:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Checking discussions database...
llama_model_load: loading model from './models/ggml-vicuna-13b-1.1-q4_1.bin.og' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 5120
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot   = 128
llama_model_load: f16     = 5
llama_model_load: n_ff    = 13824
llama_model_load: n_parts = 2
llama_model_load: type    = 2
llama_model_load: invalid model file './models/ggml-vicuna-13b-1.1-q4_1.bin.og' (bad f16 value 5)
llama_init_from_file: failed to load model
 * Serving Flask app 'GPT4All-WebUI'
 * Debug mode: off
[2023-04-13 12:29:47,313] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9600
 * Running on http://192.168.0.210:9600
[2023-04-13 12:29:47,313] {_internal.py:224} INFO - Press CTRL+C to quit

When running the gpt4all-lora-unfiltered-quantized.bin.og model:

HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-unfiltered-quantized.bin.og' - please wait ...
llama_model_load: invalid model file './models/gpt4all-lora-unfiltered-quantized.bin.og' (too old, regenerate your model files or convert them with convert-unversioned-ggml-to-ggml.py!)
llama_init_from_file: failed to load model
 * Serving Flask app 'GPT4All-WebUI'
 * Debug mode: off
[2023-04-13 12:32:34,593] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9600
 * Running on http://192.168.0.210:9600
[2023-04-13 12:32:34,593] {_internal.py:224} INFO - Press CTRL+C to quit

I hope someone here has more knowledge of how to use other models.

There are different scripts and methods for each model type, as described in the llama.cpp repo. It looks like your script only runs the migrate step, but you need to run the convert script first.

https://github.com/ggerganov/llama.cpp#using-gpt4all
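
For the gpt4all models, the steps from that section look roughly like this (a sketch only; the file names and the tokenizer path are examples, adjust them to wherever your files actually are):

# first convert the old gpt4all format to ggml (needs the original LLaMA tokenizer.model)
python3 ./tmp/llama.cpp/convert-gpt4all-to-ggml.py ./models/gpt4all-lora-unfiltered-quantized.bin ./models/tokenizer.model

# then migrate the result to the new ggjt format
python3 ./tmp/llama.cpp/migrate-ggml-2023-03-30-pr613.py ./models/gpt4all-lora-unfiltered-quantized.bin ./models/gpt4all-lora-unfiltered-quantized.bin.new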

I haven't got the unfiltered gpt4all model to load yet, but I'll update if I have any more revelations.

EDIT: I realized my model file was incomplete (only 13 MB), so that's probably why mine was failing.