
AI Copilot with LLaMA.cpp

"VSCode AI coding assistant powered by self-hosted llama.cpp endpoint."

Get started

  • Install Open Copilot from the VSCode marketplace.
  • Set your llama.cpp server's address (for example http://192.168.0.101:8080) in the "Cody » Llama Server Endpoint" setting; see the settings sketch below.
  • Enjoy coding with your locally deployed models.
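
If you prefer editing settings.json directly, the entry looks roughly like the sketch below. The key name cody.llamaServerEndpoint is an assumption inferred from the "Cody » Llama Server Endpoint" UI label, not confirmed from the extension source; check the exact key in the extension's settings page.

// settings.json (key name is assumed, not confirmed)
{
  "cody.llamaServerEndpoint": "http://192.168.0.101:8080"
}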

Screenshots: chat with the llama.cpp server, code completion, code generation, and code explanation.
Quick start: run your model service

Windows

  1. Download a llama.cpp binary release archive from the project's GitHub releases page.

  2. Unzip llama-bxxx-bin-win-cublas-cuxx.x.x-x64.zip to a folder.

  3. Download a GGUF model file, for example wizardcoder-python-13b-v1.0.Q4_K_M.gguf.

  4. Run the server.exe startup command:

# CPU only (-m: model path, -t: CPU threads, -c: context size in tokens)
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024
# GPU offload (-ngl: number of model layers to offload to the GPU)
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -ngl 81 -c 1024
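
To check that the server is reachable before pointing the extension at it, you can send a quick request. This is a minimal sanity check, assuming the server listens on the default port 8080 and exposes the llama.cpp example server's /completion endpoint; the prompt text is arbitrary.

# expects a JSON response containing generated text
curl http://localhost:8080/completion -H "Content-Type: application/json" -d "{\"prompt\": \"def add(a, b):\", \"n_predict\": 16}"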

Linux or macOS

Build the llama.cpp project from source yourself, then follow the same startup steps; a rough sketch is below.
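
A minimal sketch, assuming a Make-based build of an older llama.cpp release that produces a server binary (newer releases use CMake and name the binary llama-server, so adjust accordingly); the model path is only an example.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make            # CPU-only build; older releases supported LLAMA_CUBLAS=1 make for CUDA
./server -m /path/to/wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024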

Contributing

All code in this repository is open source (Apache 2).

Quickstart: run pnpm install && cd vscode && pnpm run dev to launch a local build of the Cody VS Code extension.