huggingface/llm-vscode

Error decoding response body: expected value at line 1 column 1

jalalirs opened this issue · 7 comments

I am trying to use llm-vscode with a locally deployed Text Generation Inference (TGI) server but I keep getting the following error:

Error decoding response body: expected value at line 1 column 1

My settings are as follows, where <host> and <port> correspond to my server address. I tried both with /generate and without it:

{
    "editor.accessibilitySupport": "off",
    "workbench.colorTheme": "Default Dark+",
    "git.openRepositoryInParentFolders": "always",
    "diffEditor.codeLens": true,
    "llm.attributionEndpoint": "http://<host>:<port>/generate",
    "llm.configTemplate": "Custom",
    "llm.modelIdOrEndpoint": "http:// <host>:<port>/generate",
    "llm.fillInTheMiddle.enabled": true,
    "llm.fillInTheMiddle.prefix": "<PRE> ",
    "llm.fillInTheMiddle.middle": " <MID>",
    "llm.fillInTheMiddle.suffix": " <SUF>",
    "llm.temperature": 0.2,
    "llm.contextWindow": 4096,
    "llm.tokensToClear": [
        "<EOT>"
    ],
    "llm.enableAutoSuggest": true,
    "llm.documentFilter": {
 

    },
    "llm.tlsSkipVerifyInsecure": true
}

This issue is stale because it has been open for 30 days with no activity.

Hello, I also get the same error. Did you find a solution for it?

Unfortunately not yet; I am still waiting for an answer or an update.

Hello, did you check the TGI logs? I assume the response body is not formatted correctly; there may be an issue with the way the response is parsed.

The TGI output is fine and can be consumed by both LangChain and chat-ui. TGI serving codellama-34 also responds correctly to a plain Python requests call.
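For reference, a minimal sketch of such a direct call, assuming a TGI server reachable at http://<host>:<port> (the same placeholders as in the settings above) and the standard /generate route; the payload fields follow TGI's /generate API. Printing the raw body makes it easy to compare what the server actually returns with what the extension is trying to decode:

import requests

# Placeholders: reuse the same <host> and <port> as in settings.json.
url = "http://<host>:<port>/generate"

payload = {
    "inputs": "def fibonacci(n):",
    "parameters": {"max_new_tokens": 32, "temperature": 0.2},
}

resp = requests.post(url, json=payload, timeout=60)
print(resp.status_code, resp.headers.get("content-type"))

# A healthy TGI response is JSON of the form {"generated_text": "..."}.
# If the body printed below is not JSON (an HTML error page, an empty body,
# a proxy message), a client that tries to JSON-decode it will fail with an
# error like "expected value at line 1 column 1".
print(resp.text[:500])

If the body looks like valid JSON here, the problem is presumably somewhere between the extension and the server (route, proxy, TLS) rather than in TGI itself.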
