Custom GGML outside LlamaCpp scope
su77ungr opened this issue · 6 comments
For MosaicML: haven't tried it yet, feel free to create another issue so that we don't forget after closing this one.
Update: mpt-7b-q4_0.bin doesn't work out of the box; it fails with `what(): unexpectedly reached end of file` and a runtime error.
Originally posted by @hippalectryon-0 in #33 (comment)
Outsource a curated list of supported models; add it to README.md later.
Maybe create a setup.py that fetches models directly from HF.
Edit: this does counteract the air-gapped idea
```python
from huggingface_hub import hf_hub_download

# Download the model into the current directory
hf_hub_download(repo_id="LLukas22/gpt4all-lora-quantized-ggjt", filename="ggjt-model.bin", local_dir=".")
```
Edit: implemented with #61
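For reference, a sketch of what such a fetch helper could look like; the model registry, the model name, and the `models/` target directory are illustrative assumptions, not the actual shape of #61. Skipping the download when the file already exists keeps the air-gapped workflow possible, since users can drop the file in manually instead:

```python
# Hypothetical fetch helper; repo/filename entries and the models/ directory
# are illustrative, not project conventions.
from pathlib import Path

from huggingface_hub import hf_hub_download

MODELS = {
    "gpt4all-lora-quantized-ggjt": (
        "LLukas22/gpt4all-lora-quantized-ggjt",
        "ggjt-model.bin",
    ),
}


def fetch_model(name: str, target_dir: str = "models") -> Path:
    """Download a known model into `target_dir`, unless it's already there."""
    repo_id, filename = MODELS[name]
    path = Path(target_dir) / filename
    if path.exists():
        # Already present (e.g. copied in manually on an air-gapped machine).
        return path
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=target_dir)
    return path
```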
Also @hippalectryon-0, did you test the 4-bit or the 16-bit version from Mosaic?
Only `mpt-7b-q4_0.bin` from https://huggingface.co/LLukas22/mpt-7b-ggml
I feel like this mpt-7b is faster than the existing model here.
You got it running? We should add benchmark runs so everyone can plot and share results.
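As a starting point, a minimal benchmark sketch (a hypothetical harness, not part of this repo) that times a prompt against any callable model wrapper and reports a rough tokens/sec figure, so results can be shared in a comparable format:

```python
# Minimal benchmark sketch; token count is approximated by whitespace
# splitting, since tokenizers differ per model.
import time


def benchmark(generate, prompt: str, runs: int = 3) -> float:
    """Average tokens/sec of `generate(prompt)` over several runs.

    `generate` is assumed to take a prompt string and return generated text.
    """
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        output = generate(prompt)
        elapsed = time.perf_counter() - start
        rates.append(len(output.split()) / elapsed)
    return sum(rates) / len(rates)


# Example wiring with llama-cpp-python (assumed installed; the model path
# is illustrative):
#   from llama_cpp import Llama
#   llm = Llama(model_path="./models/ggjt-model.bin")
#   print(benchmark(lambda p: llm(p)["choices"][0]["text"], "Hello, world"))
```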