stanford-oval/WikiChat

422 Unprocessable Entity

I get a “422 Unprocessable Entity” error when calling a local LLM service, and I don't know what's causing it.
[screenshot: 422 Unprocessable Entity error]

Hi,

Can you please let us know what server and model you are using (e.g., LLaMA-3 on text-generation-inference)? And what command are you using to run WikiChat?

I use the API deployment code from the chatglm3 repository, which is compatible with the OpenAI API. I run WikiChat with `inv demo --engine local`. The error message in the terminal is as follows:
[screenshot of terminal error log]
In addition, I can successfully call the local LLM service through the litellm library on its own.
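For reference, the direct call that succeeds looks roughly like this (the port and model name are placeholders for my actual deployment):

```python
import litellm

# Sanity check: call the local OpenAI-compatible chatglm3 server directly
# through LiteLLM, bypassing WikiChat. The api_base and model name are
# placeholders for whatever the server was actually started with.
response = litellm.completion(
    model="openai/chatglm3",  # "openai/" prefix = generic OpenAI-compatible endpoint
    api_base="http://localhost:8000/v1",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```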

One thing to check is which port you are serving chatglm3 from. By default, WikiChat expects local models to be served from port 5002. See https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L99-L103 on how to change that if needed.

If that doesn't help, you can enable LiteLLM's verbose logging (https://github.com/stanford-oval/WikiChat/blob/main/llm_config.yaml#L17) and paste the full log here, to help us with troubleshooting.

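As a rough sketch, the relevant entry would point at your server like this. The exact keys are in the linked lines of llm_config.yaml; this just follows LiteLLM's usual config shape, so treat the field names as an approximation:

```yaml
# Hypothetical sketch following LiteLLM's config conventions -- check the
# linked lines for the exact schema WikiChat uses. The important part is
# that api_base matches the host/port your chatglm3 server listens on.
- model_name: local
  litellm_params:
    model: huggingface/local
    api_base: http://localhost:5002  # change if your server uses another port
```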

I use vLLM to deploy the local LLM. How should I modify the `local: huggingface/local` field in the llm_config.yaml file? I tried changing it to the model name I set when deploying with vLLM, but it reported an error that the model does not exist. If I don't modify it, the huggingface provider reports an error.
[screenshot of the error]
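For context, I start vLLM's OpenAI-compatible server with something like the following (the model path, served name, and port are placeholders for my setup):

```bash
# Launch vLLM's OpenAI-compatible server; the model, served name, and port
# below are placeholders for my actual deployment.
python -m vllm.entrypoints.openai.api_server \
    --model THUDM/chatglm3-6b \
    --served-model-name chatglm3 \
    --trust-remote-code \
    --port 5002
```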

I just tested, and it does not seem to work with vLLM. I will need to look into it more closely.
In the meantime, you can use https://github.com/huggingface/text-generation-inference/, which I just tested and works with this code base.
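For example, a typical TGI launch on the port WikiChat expects would look roughly like this (the model id and volume path are placeholders; adjust them to your setup):

```bash
# Example TGI launch; model id and volume path are placeholders. Mapping
# container port 80 to host port 5002 matches WikiChat's default
# expectation for local models, which it reaches via the huggingface/
# provider prefix in llm_config.yaml.
docker run --gpus all --shm-size 1g -p 5002:80 \
    -v $PWD/tgi-data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id meta-llama/Meta-Llama-3-8B-Instruct
```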