PygmalionAI/aphrodite-engine

[Bug]: gguf loading failed. config.json?


Your current environment

I execute the command below:

python -m aphrodite.endpoints.openai.api_server --model /root/.cache/huggingface/hub/gguf/ --quantization gguf --gpu-memory-utilization 0.35 --max-model-len 4096 --port 8000

I get this error:
OSError: /root/.cache/huggingface/hub/gguf/ does not appear to have a file named config.json. Checkout 'https://huggingface.co//root/.cache/huggingface/hub/gguf//tree/None' for available files.

Why does it want a config.json?
As you know, the GGUF format doesn't have a config.json...


You need to point to the file (xxxx.gguf), not the directory containing the file.
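A minimal sketch of the corrected invocation, keeping the flags from the original command; the filename `model.gguf` is a hypothetical placeholder for whatever your actual GGUF file is called:

```shell
# Pass the path to the .gguf file itself, not its parent directory.
python -m aphrodite.endpoints.openai.api_server \
  --model /root/.cache/huggingface/hub/gguf/model.gguf \
  --quantization gguf \
  --gpu-memory-utilization 0.35 \
  --max-model-len 4096 \
  --port 8000
```

With a directory path, the loader falls back to the Hugging Face config discovery that expects a config.json, which is why the OSError above is raised.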

My model consists of 2 GGUF files. How can I do that?

Sharded GGUF (a model split across multiple files) is not currently supported. #420 adds support, but we need to fix something else related to GGUFs first.
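As an aside, one possible workaround until sharded loading lands is to merge the shards into a single file first. This sketch assumes you have llama.cpp built locally with its gguf-split tool available, and the shard filenames are hypothetical examples:

```shell
# Merge sharded GGUF files into one file with llama.cpp's gguf-split tool.
# Pass the first shard; the tool discovers the remaining shards itself.
./gguf-split --merge model-00001-of-00002.gguf model-merged.gguf
```

The merged `model-merged.gguf` can then be passed to `--model` as a single file.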

Experimental support for multiple GGUF files has been added to the dev branch; please test whether it works according to the documentation.