Llama.cpp Support

Question

Exploring the possibility of supporting the GGML / GGUF formats so the model can run with Llama.cpp.

Answer 1 · 2023-09-26T16:52:04.000Z

The model is missing some keys and cannot be converted to GGUF format:

'rms_norm_eps'
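
To see which keys a checkpoint lacks before attempting a conversion, you can compare its config.json against the keys the converter expects. This is only a rough sketch: `missing_keys` and `missing_keys_in_file` are hypothetical helpers, not part of Llama.cpp, and the `REQUIRED_KEYS` tuple is an assumption (only 'rms_norm_eps' comes from the error above).

```python
import json

# Assumption: only 'rms_norm_eps' is taken from the error reported above;
# extend this tuple with whatever keys your converter actually reads.
REQUIRED_KEYS = ("rms_norm_eps",)


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent from a config dict."""
    return [key for key in required if key not in config]


def missing_keys_in_file(path, required=REQUIRED_KEYS):
    """Load a config.json file and report which required keys are absent."""
    with open(path, encoding="utf-8") as handle:
        return missing_keys(json.load(handle), required)
```

Calling `missing_keys_in_file("config.json")` on the model directory would list the absent keys; an empty list means none of the assumed keys are missing.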

Answer 2 · 2023-11-11T14:47:40.000Z

A full set of Llama.cpp compatible .gguf files is available at
https://huggingface.co/maddes8cht/adept-persimmon-8b-base-gguf
and
https://huggingface.co/maddes8cht/adept-persimmon-8b-chat-gguf
For the moment, CUDA acceleration does not seem to work, so you need to pass -ngl 0 when using the cuBLAS builds.
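
If you launch Llama.cpp from a script, the workaround amounts to always passing -ngl 0 so that no layers are offloaded to the GPU. A minimal sketch, assuming a Llama.cpp command-line binary that accepts the usual -m (model), -p (prompt) and -ngl (GPU layers) options; `build_llama_cmd` and `run_llama` are hypothetical helper names, and the binary and model paths are placeholders you would replace with your own.

```python
import subprocess


def build_llama_cmd(binary, model_path, prompt, gpu_layers=0):
    """Build the argument list for a Llama.cpp command-line binary.

    gpu_layers defaults to 0, i.e. "-ngl 0", which keeps every layer on the CPU.
    """
    return [binary, "-m", model_path, "-p", prompt, "-ngl", str(gpu_layers)]


def run_llama(binary, model_path, prompt):
    """Run the binary with GPU offloading disabled; raises if it exits non-zero."""
    return subprocess.run(build_llama_cmd(binary, model_path, prompt), check=True)
```

Keeping the command construction separate from the call makes it easy to print and check the exact arguments before running anything.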