gsuuon/model.nvim

Failing to connect/start the llama.cpp server

mutlusun opened this issue · 4 comments

Thank you for creating this neovim plugin and publishing it online!

I tried the plugin to access a local llama.cpp server. If I start the server manually, everything works fine. However, if I let the plugin start the llama.cpp server (as described here), I get the following error message in Neovim:

curl: (7) Failed to connect to 127.0.0.1 port 8080 after 6 ms: Couldn't connect to server

I tried setting some curl arguments to increase the timeout, but still had no luck.
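
For reference, the connection itself can be checked from inside Neovim with plain curl via vim.fn.system (nothing plugin-specific); it prints the HTTP status code, or 000 if nothing answers on the port:

-- run from the cmdline with :lua; prints 000 when nothing is listening on 127.0.0.1:8080
print(vim.fn.system({ "curl", "-s", "-o", "/dev/null", "-w", "%{http_code}", "http://127.0.0.1:8080/" }))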

My configuration looks like this:

require('llm').setup({
    llamacpp = {
        provider = llamacpp,
        options = {
            server_start = {
                command = "~/Documents/src/llama.cpp/server",
                args = {
                    "-m", "~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
                    "-c", 2048,
                    --"-c", 4096,
                    --"-ngl", 22
                }
            },
        },
        builder = function(input, context)
            return {
                prompt = llamacpp.llama_2_user_prompt({
                    user = context.args or '',
                    message = input
                })
            }
        end,
    },
})

Am I doing something wrong here? I'm sorry if this is explained somewhere, but I couldn't find more information on this problem. Thank you for your help and time!

gsuuon commented

Hi! Can you double-check that the command field points to the server binary? On Windows it should probably be server.exe. Although, if the path were wrong you should be getting a message that starting the server failed, so it may be a bug if you're not seeing that.
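
A quick way to sanity-check the path from inside Neovim (just built-in vim.fn helpers, nothing plugin-specific), using the path from your config:

local path = vim.fn.expand("~/Documents/src/llama.cpp/server")
-- both should print 1 if the file exists and is executable
print(vim.fn.filereadable(path), vim.fn.executable(path))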

Thanks for your feedback! Yes, the path in the command field is correct. It is /Users/***/Documents/src/llama.cpp/server as I'm currently working on macOS. I also tried it with an absolute path, but that didn't change anything.

I get the same error message as above even if I specify a wrong path to the server binary. The binary is executable. I also made sure that I'm on the latest commit.

gsuuon commented

Wow, I missed it again. I just noticed your setup is incorrect: the prompt needs to be in the prompts field, not at the top level, so:

require('llm').setup({
  prompts = {
    llamacpp = {
      provider = llamacpp,
      options = {
        server_start = {
          command = "~/Documents/src/llama.cpp/server",
          args = {
            "-m", "~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
            "-c", 2048,
            --"-c", 4096,
            --"-ngl", 22
          }
        },
      },
      builder = function(input, context)
        return {
          prompt = llamacpp.llama_2_user_prompt({
            user = context.args or '',
            message = input
          })
        }
      end,
    },
  }
})

I'll be improving the docs soonish! Sorry this wasn't clear.

Again, thank you for your quick help! I'm sorry that I missed this point in the README; after you mentioned it, I saw it there. Just for reference, the following config works now:

local llamacpp = require("llm.providers.llamacpp")

require("llm").setup({
    prompts = {
        llamacpp = {
            provider = llamacpp,
            options = {
                server_start = {
                    command = "/Users/***/Documents/src/llama.cpp/server",
                    args = {
                        "-m", "/Users/***/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf",
                        "-c", 2048,
                        --"-c", 4096,
                        --"-ngl", 22
                    }
                },
            },
            builder = function(input, context)
                return {
                    prompt = llamacpp.llama_2_user_prompt({
                        user = context.args or "",
                        message = input
                    })
                }
            end,
        },
    },
})

In short, I had to require llm.providers.llamacpp, and I actually needed absolute paths for both the server binary and the model.
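
If you would rather keep the ~ shorthand in the config, expanding it up front should give the same absolute paths (untested sketch using Neovim's built-in vim.fn.expand):

-- untested: expand ~ into absolute paths before handing them to server_start
local server_bin = vim.fn.expand("~/Documents/src/llama.cpp/server")
local model_path = vim.fn.expand("~/Documents/src/llama.cpp/models/open-llama-7b-v2/ggml-model-q4_0.gguf")
print(server_bin, model_path)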