can't run the starcoder-ggml.bin

Question

breakenknife opened this issue 2 years ago · 2 comments

./main -m models/bigcode/starcoder-ggml.bin -p "def fibonnaci(" --top_k 0 --top_p 0.95 --temp 0.2

main: seed = 1685609262
starcoder_model_load: loading model from 'models/bigcode/starcoder-ggml.bin'
starcoder_model_load: n_vocab = 49152
starcoder_model_load: n_ctx   = 8192
starcoder_model_load: n_embd  = 6144
starcoder_model_load: n_head  = 48
starcoder_model_load: n_layer = 40
starcoder_model_load: ftype   = 1
starcoder_model_load: qntvr   = 0
starcoder_model_load: ggml ctx size = 51276.47 MB
GGML_ASSERT: ggml.c:3874: ctx->mem_buffer != NULL
Aborted (core dumped)

Hi, my machine has 38 GB of memory and can run the quantized starcoder-ggml-q4_1.bin, but it cannot run the non-quantized starcoder-ggml.bin. Is this because there is not enough memory?

Answer 1 · 2023-06-02T18:33:01.000Z

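When running inference, the whole model is loaded into memory. If your starcoder-ggml.bin file is larger than your available memory, then: yes :) Your log shows ggml requesting a context of 51276.47 MB, which is more than your 38 GB, so the buffer allocation most likely fails and the `ctx->mem_buffer != NULL` assertion fires.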
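
For a quick sanity check before loading, the comparison can be sketched in a few lines of Python. This is only an illustration, not part of ggml: the helper names `required_mb` and `fits_in_ram` are made up here, the regex assumes the `ggml ctx size = ... MB` log format shown in the question, and 1 GB is taken as 1024 MB.

```python
import re


def required_mb(log_text):
    """Return the 'ggml ctx size' value (in MB) found in a loader log, or None."""
    match = re.search(r"ggml ctx size\s*=\s*([\d.]+)\s*MB", log_text)
    return float(match.group(1)) if match else None


def fits_in_ram(required, ram_gb):
    """True if the requested buffer (MB) is no larger than RAM (GB, 1024 MB each)."""
    return required <= ram_gb * 1024


# Log line copied from the question; 38 GB is the RAM size the asker reported.
log = "starcoder_model_load: ggml ctx size = 51276.47 MB"
need = required_mb(log)
print(need, fits_in_ram(need, 38))
```

This ignores memory already used by the OS and other processes, so a model that only barely "fits" by this check may still fail to load.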

Answer 2 · 2023-06-08T18:00:26.000Z

Also, some systems allow offloading part of memory to disk (for example, swap space), with the warning that inference will be much slower. So you can always try running a model, and if it fails at load time like this, a lack of memory is the likely cause.