
AI Copilot with LLaMA.cpp

"VSCode AI coding assistant powered by self-hosted llama.cpp endpoint."

Get started

  • Install Open Copilot from the VSCode marketplace.
  • Set your llama.cpp server's address (for example http://192.168.0.101:8080) in the "Cody » Llama Server Endpoint" setting; see the settings sketch below.
  • Enjoy coding with your locally deployed models.
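
If you prefer editing settings.json directly, the entry looks roughly like the sketch below. The key name cody.llamaServerEndpoint is an assumption inferred from the "Cody » Llama Server Endpoint" UI label, not confirmed from the extension source; check the exact key in the extension's settings page.

// settings.json (key name is assumed, not confirmed)
{
  "cody.llamaServerEndpoint": "http://192.168.0.101:8080"
}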

Screenshots: chat with the llama.cpp server, code completion, code generation, and code explanation.
Quick start: run your model service

Windows

  1. Download a llama.cpp binary release archive from the project's GitHub releases page.

  2. Unzip llama-bxxx-bin-win-cublas-cuxx.x.x-x64.zip to a folder.

  3. Download a GGUF model file, for example wizardcoder-python-13b-v1.0.Q4_K_M.gguf.

  4. Run the server.exe startup command:

# CPU only (-m: model path, -t: CPU threads, -c: context size in tokens)
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024
# GPU offload (-ngl: number of model layers to offload to the GPU)
D:\path_to_unzip_files\server.exe -m D:\path_to_model\wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -ngl 81 -c 1024
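
To check that the server is reachable before pointing the extension at it, you can send a quick request. This is a minimal sanity check, assuming the server listens on the default port 8080 and exposes the llama.cpp example server's /completion endpoint; the prompt text is arbitrary.

# expects a JSON response containing generated text
curl http://localhost:8080/completion -H "Content-Type: application/json" -d "{\"prompt\": \"def add(a, b):\", \"n_predict\": 16}"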

Linux or macOS

Build the llama.cpp project from source yourself, then follow the same startup steps; a rough sketch is below.
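
A minimal sketch, assuming a Make-based build of an older llama.cpp release that produces a server binary (newer releases use CMake and name the binary llama-server, so adjust accordingly); the model path is only an example.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make            # CPU-only build; older releases supported LLAMA_CUBLAS=1 make for CUDA
./server -m /path/to/wizardcoder-python-13b-v1.0.Q4_K_M.gguf -t 8 -c 1024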

Contributing

All code in this repository is open source (Apache 2).

Quickstart: run pnpm install && cd vscode && pnpm run dev to launch a local build of the Cody VS Code extension.