LLM chat and autocomplete can not work using Xinference api
Closed this issue · 0 comments
jiusi9 commented
Before submitting your bug report
- I believe this is a bug. I'll try to join the Continue Discord for questions
- I'm not able to find an open issue that reports the same bug
- I've seen the troubleshooting guide on the Continue Docs
Relevant environment info
- OS: ubuntu
- Continue version: 0.8.61
- IDE version: VScode 1.83.0
- Model: Qwen2.5-32B-Instruct, Qwen2.5-coder-32B-Instruct, Qwen2.5-coder-14B
- config.json:
Description
I ran a Xinference as model provider, and model provider is openai
Here are configurations, Qwen2-Instruct model is running well,
but Qwen2.5-32B-Instruct, Qwen2.5-Coder-32B-Instruct, Qwen2.5-Coder-14B can not work.
When chat with them, never print this response, but I'm sure that model had been answered.
Difference configuration is:
Xinference for Qwen2.5 model was upgrade to latest, but I checked API /v1/chat/compeletion is nothing changed.
"models": [
{
"title": "Qwen2.5-Coder-Instruct",
"model": "Qwen2.5-32B-Instruct",
"systemMessage": "You are an expert software developer. You give helpful and concise responses.",
"apiBase": "https://xinference-qwen25-32b-instruct/v1",
"apiKey": "aaaaaa",
"provider": "openai",
"useLegacyCompletionsEndpoint": false
},
{
"title": "Qwen2-Instruct",
"model": "Qwen2-7B-Instruct",
"systemMessage": "You are an expert software developer. You give helpful and concise responses.",
"apiBase": "https://xinference-qwen2/v1",
"apiKey": "aaaaaa",
"provider": "openai",
"useLegacyCompletionsEndpoint": false
}
],
"tabAutocompleteModel": {
"title": "Qwen2.5-coder",
"provider": "openai",
"model": "Qwen2.5-Coder-14B",
"apiBase": "https://xinference-qwen25-coder-14b/v1",
"apiKey": "aaaaaa",
"systemMessage": "You are an expert software developer. You give helpful and concise responses."
},
I can not get output from continue, what maybe the root case?
To reproduce
No response
Log output
==========================================================================
##### Completion options #####
{
"contextLength": 8096,
"model": "Qwen2.5-32B-Instruct",
"maxTokens": 4096
}
##### Request options #####
{}
##### Prompt #####
<system>
You are an expert software developer. You give helpful and concise responses.
<user>
Who are you?
<assistant>
I am a sophisticated AI designed to assist with information, guidance, and problem-solving tasks related to software development and other topics. I'm here to help answer your questions, provide code examples, explain concepts, and much more. How can I assist you today?
<user>
Who are you?
==========================================================================
==========================================================================
##### Completion options #####
{
"contextLength": 8096,
"model": "Qwen2.5-32B-Instruct",
"maxTokens": 4096
}
##### Request options #####
{}
##### Prompt #####
<system>
You are an expert software developer. You give helpful and concise responses.
<user>
Who are you?
==========================================================================