llama-cpp

There are 189 repositories under the llama-cpp topic.

  • getumbrel/llama-gpt

    A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

    Language: TypeScript
  • SciSharp/LLamaSharp

    A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) on your local device efficiently.

    Language: C#
  • Mobile-Artificial-Intelligence/maid

    Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

    Language: Dart
  • withcatai/node-llama-cpp

    Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

    Language: TypeScript
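Generation-level schema enforcement, as node-llama-cpp's description mentions, generally works by masking out candidate tokens that could no longer be extended into valid output. A toy Python sketch of the idea, using a hypothetical one-field schema — this illustrates the technique only, not node-llama-cpp's actual implementation:

```python
FULL_HEAD = '{"age": '  # toy target format: {"age": <digits>}

def is_valid_prefix(s: str) -> bool:
    """True if s can still be extended into {"age": <digits>} (toy schema)."""
    if len(s) <= len(FULL_HEAD):
        return FULL_HEAD.startswith(s)
    if not s.startswith(FULL_HEAD):
        return False
    body = s[len(FULL_HEAD):]
    if body.endswith("}"):
        return body[:-1].isdigit()  # complete: digits then closing brace
    return body.isdigit()           # still emitting digits

def mask(partial: str, candidates: list[str]) -> list[str]:
    """Keep only candidate tokens that leave the output a valid schema prefix."""
    return [t for t in candidates if is_valid_prefix(partial + t)]

print(mask('{"age": 4', ["2", "}", " years", '"']))  # ['2', '}']
```

A real implementation applies this mask to the model's logits each step, so sampling can only ever produce schema-conforming text.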
  • gotzmann/llama.go

    llama.go is like llama.cpp in pure Golang!

    Language: Go
  • undreamai/LLMUnity

    Create characters in Unity with LLMs!

    Language: C#
  • mybigday/llama.rn

    React Native bindings for llama.cpp

    Language: C
  • docker/compose-for-agents

    Build and run AI agents using Docker Compose. A collection of ready-to-use examples for orchestrating open-source LLMs, tools, and agent runtimes.

    Language: TypeScript
  • the-crypt-keeper/can-ai-code

    Self-evaluating interview for AI coders

    Language: Python
  • withcatai/catai

    Run an AI ✨ assistant locally, with a simple API for Node.js 🚀

    Language: TypeScript
  • mdrokz/rust-llama.cpp

    Rust bindings for llama.cpp

    Language: Rust
  • dipampaul17/KVSplit

    Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.

    Language: Python
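KVSplit's memory claim can be sanity-checked with the standard KV-cache size formula. Raw byte math for 8-bit keys and 4-bit values gives a 62.5% reduction versus FP16; the quoted 59% is consistent with that once per-block quantization overhead (scales/zero points) is counted. A back-of-envelope sketch, assuming a Llama-2-7B-like shape — not KVSplit's own accounting:

```python
def kv_cache_bytes(layers: int, ctx: int, kv_heads: int, head_dim: int,
                   key_bits: int, value_bits: int) -> int:
    """KV cache size for one sequence: one K and one V tensor per layer."""
    elems = layers * ctx * kv_heads * head_dim  # elements in K (same for V)
    return elems * key_bits // 8 + elems * value_bits // 8

# Assumed Llama-2-7B-like shape: 32 layers, 32 KV heads, head_dim 128, 4096 ctx.
fp16 = kv_cache_bytes(32, 4096, 32, 128, 16, 16)
k8v4 = kv_cache_bytes(32, 4096, 32, 128, 8, 4)
print(f"FP16: {fp16 / 2**20:.0f} MiB, K8V4: {k8v4 / 2**20:.0f} MiB, "
      f"raw saving {1 - k8v4 / fp16:.1%}")  # 2048 MiB vs 768 MiB, 62.5%
```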
  • jlonge4/local_llama

    This repo is to showcase how you can run a model locally and offline, free of OpenAI dependencies.

    Language: Python
  • gpustack/gguf-parser-go

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Language: Go
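A memory estimate like gguf-parser-go's starts from parameter count times bits per weight for the quantization type. A minimal Python sketch — the bits-per-weight figures come from llama.cpp's block formats (e.g. Q8_0 stores 32 weights in 34 bytes, hence 8.5 bpw), but the function itself is illustrative, not the tool's actual estimator:

```python
# Approximate bits per weight for some llama.cpp formats (block data + scales).
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q4_0": 4.5}

def weight_bytes(n_params: float, quant: str) -> int:
    """Bytes for the weights alone; excludes KV cache, activations, overhead."""
    return int(n_params * BITS_PER_WEIGHT[quant] / 8)

for q in ("F16", "Q8_0", "Q4_0"):
    print(f"7B model, {q}: {weight_bytes(7e9, q) / 2**30:.1f} GiB")
```

For a 7B model this gives roughly 13 GiB at F16 versus under 4 GiB at Q4_0, which is why 4-bit quantization is the usual starting point on consumer hardware.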
  • lucasjinreal/Crane

    A pure-Rust LLM inference engine (covering any LLM-based MLLM such as Spark-TTS), powered by the Candle framework.

    Language: Rust
  • ptsochantaris/emeltal

    Local ML voice chat using high-end models.

    Language: C++
  • phronmophobic/llama.clj

    Run LLMs locally. A clojure wrapper for llama.cpp.

    Language: Clojure
  • gotzmann/booster

    Booster: an open accelerator for LLMs, with better inference and debugging for AI hackers.

    Language: C++
  • BrutalCoding/shady.ai

    Making offline AI models accessible to all types of edge devices.

    Language: Dart
  • nuance1979/llama-server

    LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

    Language: Python
  • 1038lab/ComfyUI-MiniCPM

    A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.

    Language: Python
  • nrl-ai/CustomChar

    Your customized AI assistant - Personal assistants on any hardware! With llama.cpp, whisper.cpp, ggml, LLaMA-v2.

    Language: C++
  • thushan/olla

    High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backends.

    Language: Go
  • R3gm/InsightSolver-Colab

    InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models.

    Language: Jupyter Notebook
  • vtuber-plan/langport

    Langport is a language model inference service.

    Language: Python
  • robiwan303/babyagi

    BabyAGI-🦙: enhanced for Llama models (running 100% locally) with persistent memory, smart internet search based on BabyCatAGI, and document embedding in LangChain based on privateGPT.

    Language: Python
  • OpenCSGs/llm-inference

    llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.

    Language: Python
  • Abhi5h3k/PrivateDocBot

    📚 Local PDF-Integrated Chat Bot: Secure Conversations and Document Assistance with LLM-Powered Privacy

    Language: Python
  • greynewell/musegpt

    Local LLMs in your DAW!

    Language: C++
  • rbourgeat/ImpAI

    😈 ImpAI is an advanced role-play app using large language and diffusion models.

    Language: JavaScript
  • ystemsrx/code-atlas

    A C++ implementation of Open Interpreter.

    Language: C++
  • fboulnois/llama-cpp-docker

    Run llama.cpp in a GPU accelerated Docker container

    Language: Dockerfile
  • hyparam/hyllama

    llama.cpp GGUF file parser for JavaScript

    Language: JavaScript
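The fixed GGUF header that parsers like hyllama read first is: a 4-byte magic `GGUF`, then a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. A minimal Python sketch parsing a synthetic in-memory header (the counts here are made up for illustration):

```python
import struct

def parse_gguf_header(buf: bytes) -> dict:
    """Parse the fixed GGUF header: magic, version, tensor and metadata counts."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors,
            "metadata_kv_count": n_kv}

# Synthetic header: GGUF v3, 291 tensors, 24 metadata key/value pairs.
header = struct.pack("<4sIQQ", b"GGUF", 3, 291, 24)
print(parse_gguf_header(header))
# {'version': 3, 'tensor_count': 291, 'metadata_kv_count': 24}
```

A full parser then walks the typed metadata key/value pairs and tensor infos that follow this header; reading just these 24 bytes is enough to cheaply validate a file.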
  • iacopPBK/llama.cpp-gfx906

    llama.cpp tuned for AMD gfx906 GPUs (Radeon VII / Instinct MI50 and MI60).

    Language: C++
  • lordmathis/llamactl

    Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.

    Language: Go
  • blueraai/universal-intelligence

    ◉ Universal Intelligence: AI made simple.

    Language: Python