ParisNeo/lollms-webui

install.sh and app.py look for different model names

CaptainChemist opened this issue · 3 comments

The install.sh file runs wget to download the model to the following path:

model/gpt4all-lora-quantized-ggml.bin
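
In essence, the download step in install.sh amounts to the following (the URL shown is a placeholder, not the actual one):

    # Sketch of the install.sh download step; <model-download-url>
    # stands in for the real URL used by the script:
    mkdir -p model
    wget -O model/gpt4all-lora-quantized-ggml.bin <model-download-url>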

The app.py file, however, looks for the model in the models folder (not model), and under a slightly different name:

    chatbot_bindings = Model(ggml_model='./models/gpt4all-converted.bin', n_ctx=512)
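
So the mismatch can be worked around by hand, moving the downloaded file into place under the expected name:

    # Manual workaround for the path/name mismatch:
    mkdir -p models
    mv model/gpt4all-lora-quantized-ggml.bin models/gpt4all-converted.bin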

When I move the downloaded file to that location under the new name, though, I see this error:

llama_model_load: loading model from './models/gpt4all-converted.bin' - please wait ...
./models/gpt4all-converted.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
	you most likely need to regenerate your ggml files
	the benefit is you'll get 10-100x faster load times
	see https://github.com/ggerganov/llama.cpp/issues/91
	use convert-pth-to-ggml.py to regenerate from original pth
	use migrate-ggml-2023-03-30-pr613.py if you deleted originals
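
Both magic values decode to four ASCII characters: 0x67676d66 is "ggmf" (an older ggml container format) and 0x67676a74 is "ggjt" (the newer, mmap-friendly format this build of llama.cpp expects). The magic is written little-endian, so the bytes show up reversed in a hex dump of the file header:

    # A "ggmf" file begins with the bytes 66 6d 67 67,
    # a "ggjt" file with 74 6a 67 67:
    xxd -l 4 models/gpt4all-converted.bin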

Is there a missing conversion step that should be happening here, and is that why it fails? I tried to run the two llama.cpp scripts recommended in the error but couldn't quite figure them out. I'm on a Mac, for what it's worth.
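
From the error text, the migration script (the second one) looks like the relevant one here, since the downloaded file is already ggml, just in an older layout. My best guess at the invocation, run from a llama.cpp checkout (the argument order may have changed since; check the script's help):

    # Hypothetical invocation; the output filename is my own choice:
    python3 migrate-ggml-2023-03-30-pr613.py \
        models/gpt4all-lora-quantized-ggml.bin \
        models/gpt4all-lora-quantized-ggjt.bin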

Hi there. My bad. Yesterday the load on the model referenced in install.sh was high, so I switched to a model I had generated locally for testing and forgot to switch back before committing.

The model-name mismatch has been fixed in 3b3bfc6

OK to close the issue?

I can get further now, but the model being downloaded is still in the wrong format. How do you convert it to the version app.py is looking for?

llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
./models/gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
	you most likely need to regenerate your ggml files
	the benefit is you'll get 10-100x faster load times
	see https://github.com/ggerganov/llama.cpp/issues/91
	use convert-pth-to-ggml.py to regenerate from original pth
	use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
Checking discussions database...
Ok
llama_generate: seed = 1680878004

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 
./run.sh: line 43: 42697 Segmentation fault: 11  python app.py