continuedev/continue

LLM chat and autocomplete can not work using Xinference api

Closed this issue · 0 comments

jiusi9 commented

Before submitting your bug report

Relevant environment info

- OS: ubuntu
- Continue version: 0.8.61
- IDE version: VScode 1.83.0
- Model: Qwen2.5-32B-Instruct, Qwen2.5-coder-32B-Instruct, Qwen2.5-coder-14B
- config.json:

Description

I ran a Xinference as model provider, and model provider is openai

Here are configurations, Qwen2-Instruct model is running well,
but Qwen2.5-32B-Instruct, Qwen2.5-Coder-32B-Instruct, Qwen2.5-Coder-14B can not work.
When chat with them, never print this response, but I'm sure that model had been answered.

Difference configuration is:
Xinference for Qwen2.5 model was upgrade to latest, but I checked API /v1/chat/compeletion is nothing changed.

  "models": [
    {
      "title": "Qwen2.5-Coder-Instruct",
      "model": "Qwen2.5-32B-Instruct",
      "systemMessage": "You are an expert software developer. You give helpful and concise responses.",
      "apiBase": "https://xinference-qwen25-32b-instruct/v1",
      "apiKey": "aaaaaa",
      "provider": "openai",
      "useLegacyCompletionsEndpoint": false
    },
    {
      "title": "Qwen2-Instruct",
      "model": "Qwen2-7B-Instruct",
      "systemMessage": "You are an expert software developer. You give helpful and concise responses.",
      "apiBase": "https://xinference-qwen2/v1",
      "apiKey": "aaaaaa",
      "provider": "openai",
      "useLegacyCompletionsEndpoint": false
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen2.5-coder",
    "provider": "openai",
    "model": "Qwen2.5-Coder-14B",
    "apiBase": "https://xinference-qwen25-coder-14b/v1",
    "apiKey": "aaaaaa",
    "systemMessage": "You are an expert software developer. You give helpful and concise responses."
  },

image

I can not get output from continue, what maybe the root case?

To reproduce

No response

Log output

==========================================================================
##### Completion options #####
{
  "contextLength": 8096,
  "model": "Qwen2.5-32B-Instruct",
  "maxTokens": 4096
}

##### Request options #####
{}

##### Prompt #####
<system>
You are an expert software developer. You give helpful and concise responses.

<user>
Who are you?

<assistant>
I am a sophisticated AI designed to assist with information, guidance, and problem-solving tasks related to software development and other topics. I'm here to help answer your questions, provide code examples, explain concepts, and much more. How can I assist you today?

<user>
Who are you?

==========================================================================
==========================================================================
##### Completion options #####
{
  "contextLength": 8096,
  "model": "Qwen2.5-32B-Instruct",
  "maxTokens": 4096
}

##### Request options #####
{}

##### Prompt #####
<system>
You are an expert software developer. You give helpful and concise responses.

<user>
Who are you?

==========================================================================