Autocomplete does not seem to work
Before submitting your bug report
- I've tried using the "Ask AI" feature on the Continue docs site to see if the docs have an answer
- I believe this is a bug. I'll try to join the Continue Discord for questions
- I'm not able to find an open issue that reports the same bug
- I've seen the troubleshooting guide on the Continue Docs
Relevant environment info
- OS: macOS Tahoe
- Continue version: 1.2.4
- IDE version: 1.104.1
- Model: Gemini-2.5 Flash
- config:
    name: Local Agent
    version: 1.0.0
    schema: v1
    models:
      - name: OpenRouter Gemini 2.5 Flash
        provider: openrouter
        model: google/gemini-2.5-flash
        roles:
          - chat
          - edit
          - apply
          - autocomplete
        apiBase: https://openrouter.ai/api/v1
OR link to agent in Continue hub:
Description
Hi there.
I'm new here. I use OpenRouter as the backend; chat and edit work fine, but autocomplete never pops up the candidate code, even though the Continue Debug Console shows that Gemini has already replied with code, as shown in the following image:
I've tried the recommended troubleshooting approaches mentioned in the docs, but they don't help. Hope someone can help me. Thanks in advance.
To reproduce
No response
Log output
I also tried Continue in JetBrains PyCharm and it does not work there either. Here is the log:
Code: undefined
Error number: undefined
Syscall: undefined
Type: aborted
wze: The operation was aborted.
at I (/snapshot/continue/binary/out/index.js:8055:16904)
at AbortSignal.u (/snapshot/continue/binary/out/index.js:8055:17091)
at [nodejs.internal.kHybridDispatch] (node:internal/event_target:645:20)
at AbortSignal.dispatchEvent (node:internal/event_target:587:26)
at abortSignal (node:internal/abort_controller:292:10)
at AbortController.abort (node:internal/abort_controller:322:5)
at suA.cancel (/snapshot/continue/binary/out/index.js:8456:2845)
at auA._createListenableGenerator (/snapshot/continue/binary/out/index.js:8456:3583)
at auA.getGenerator (/snapshot/continue/binary/out/index.js:8456:4055)
at getGenerator.next (<anonymous>)
I've encountered a similar situation. After a few debug runs of the JetBrains plugin, I found that sharing contextLength and maxTokens in defaultCompletionOptions across all roles causes high latency when triggering autocomplete, which should carry far less context and respond much faster than the chat/edit roles. At least that was the case for me. So I'd recommend setting up a separate model, named something like 'Gemini AutoComplete Only', with the single role 'autocomplete' and a smaller contextLength (e.g. 1024) and maxTokens (e.g. 512), which should be enough context input and code output for the autocomplete scenario. A demo config is below; hope it helps 🙂:
- name: DeepSeek Autocomplete
  provider: deepseek
  model: deepseek-chat
  apiBase: https://api.deepseek.com
  apiKey: $MY_API_KEY
  useLegacyCompletionsEndpoint: false
  roles:
    - autocomplete
  defaultCompletionOptions:
    temperature: 0
    stream: true
    contextLength: 1024
    maxTokens: 512
- name: DeepSeek Common
  provider: deepseek
  model: deepseek-chat
  apiBase: https://api.deepseek.com
  apiKey: $MY_API_KEY
  useLegacyCompletionsEndpoint: false
  roles:
    - chat
    - edit
    - summarize
  defaultCompletionOptions:
    temperature: 0
    stream: true
    contextLength: 131072
    maxTokens: 8192
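For the OpenRouter Gemini setup in the original report, the same split could look roughly like this. This is only a sketch: the model name with the "Autocomplete" suffix is made up, the smaller limits simply mirror the demo above and may need tuning, and any apiKey entry is omitted here just as it was in the original config.

# Sketch: apply the role split to the OpenRouter Gemini config from the report above.
- name: OpenRouter Gemini 2.5 Flash Autocomplete   # illustrative name
  provider: openrouter
  model: google/gemini-2.5-flash
  apiBase: https://openrouter.ai/api/v1
  roles:
    - autocomplete
  defaultCompletionOptions:
    temperature: 0
    stream: true
    contextLength: 1024    # small context for fast completions
    maxTokens: 512         # short outputs are enough for inline suggestions
- name: OpenRouter Gemini 2.5 Flash
  provider: openrouter
  model: google/gemini-2.5-flash
  apiBase: https://openrouter.ai/api/v1
  roles:
    - chat
    - edit
    - apply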
@SoLoHiC I see. Thank you very much and I'll check it out later. :D
Just to add more info to what @SoLoHiC reported: it seems that Continue has a low tolerance for high latency. When using Ollama for completion locally, while editing or deleting text I usually get a bunch of these errors in the Ollama log:
set 27 21:29:36 ollama[2595044]: time=2025-09-27T21:29:36.420-03:00 level=ERROR source=server.go:1459 msg="post predict" error="Post \"http://127.0.0.1:34119/completion\": context canceled"
Then Continue stops responding altogether, not only for completion. Only restarting the editor fixes this. Until the editor is restarted, CPU usage stays high and the built-in VS Code inline suggestion icon in the status bar keeps spinning (by the way, this is probably what was reported as "conflicts with Copilot" in #7372?).
In my case, increasing debounceDelay in the LLM configuration helps a bit; this can be useful if the model in use does not respond fast enough.
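To show roughly where that could live, here is a minimal sketch. debounceDelay (in milliseconds) is the option mentioned above; in the config.json format it is documented under tabAutocompleteOptions, and placing it under a model-level autocompleteOptions block in config.yaml is an assumption on my part, so check the current config reference. The Ollama model name is just an example.

# Sketch: throttle autocomplete requests for a slow local model.
- name: Ollama Autocomplete
  provider: ollama
  model: qwen2.5-coder:1.5b    # example local model; use whichever you run for completion
  roles:
    - autocomplete
  # Assumed key location for the YAML format; config.json documents this as
  # tabAutocompleteOptions.debounceDelay.
  autocompleteOptions:
    debounceDelay: 500         # wait 500 ms after the last keystroke before requesting a completion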