llama-cpp
There are 189 repositories under the llama-cpp topic.
getumbrel/llama-gpt
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
SciSharp/LLamaSharp
A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.
Mobile-Artificial-Intelligence/maid
Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
withcatai/node-llama-cpp
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
gotzmann/llama.go
llama.go is like llama.cpp in pure Golang!
undreamai/LLMUnity
Create characters in Unity with LLMs!
mybigday/llama.rn
React Native binding of llama.cpp
docker/compose-for-agents
Build and run AI agents using Docker Compose. A collection of ready-to-use examples for orchestrating open-source LLMs, tools, and agent runtimes.
the-crypt-keeper/can-ai-code
Self-evaluating interview for AI coders
withcatai/catai
Run an AI ✨ assistant locally, with a simple API for Node.js 🚀
mdrokz/rust-llama.cpp
Rust bindings for llama.cpp
dipampaul17/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
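The memory savings KVSplit describes follow directly from the bit widths: halving key precision and quartering value precision relative to FP16 shrinks the raw cache to 37.5% of its original size, and quantization bookkeeping (scales, zero points) accounts for the gap to the quoted 59%. A back-of-the-envelope sketch, using hypothetical 7B-class model dimensions rather than KVSplit's actual code:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, key_bits, value_bits):
    """Raw bytes for keys + values across all layers at a given context length."""
    elems_per_tensor = n_layers * n_kv_heads * head_dim * ctx_len
    return elems_per_tensor * key_bits // 8 + elems_per_tensor * value_bits // 8

# FP16 baseline vs. 8-bit keys / 4-bit values (quantization overhead ignored)
fp16 = kv_cache_bytes(32, 32, 128, 8192, key_bits=16, value_bits=16)
k8v4 = kv_cache_bytes(32, 32, 128, 8192, key_bits=8, value_bits=4)
print(f"FP16: {fp16 / 2**20:.0f} MiB, K8V4: {k8v4 / 2**20:.0f} MiB")
print(f"raw reduction: {1 - k8v4 / fp16:.1%}")  # 62.5% before per-block overhead
```

The asymmetry (more bits for keys than values) reflects the common observation that attention quality is more sensitive to key precision than to value precision.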
jlonge4/local_llama
This repo is to showcase how you can run a model locally and offline, free of OpenAI dependencies.
gpustack/gguf-parser-go
Review/Check GGUF files and estimate the memory usage and maximum tokens per second.
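Tools like gguf-parser-go work because GGUF opens with a small fixed-layout preamble before the metadata key/value pairs that drive memory estimation. A minimal sketch of reading that preamble, based on the published GGUF layout (little-endian magic, uint32 version, uint64 tensor count, uint64 metadata count); the synthetic header below is made up for illustration:

```python
import io
import struct

def read_gguf_preamble(f):
    """Parse the fixed GGUF preamble; metadata key/value pairs follow it."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version, = struct.unpack("<I", f.read(4))
    tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}

# Usage on a synthetic in-memory header (version 3, 291 tensors, 24 metadata pairs)
fake = b"GGUF" + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_preamble(io.BytesIO(fake)))
```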
lucasjinreal/Crane
A pure-Rust LLM inference engine (including LLM-based MLLMs such as Spark-TTS), powered by the Candle framework.
ptsochantaris/emeltal
Local ML voice chat using high-end models.
phronmophobic/llama.clj
Run LLMs locally. A clojure wrapper for llama.cpp.
gotzmann/booster
Booster: an open accelerator for LLMs, with better inference and debugging for AI hackers.
BrutalCoding/shady.ai
Making offline AI models accessible to all types of edge devices.
nuance1979/llama-server
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
1038lab/ComfyUI-MiniCPM
A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.
nrl-ai/CustomChar
Your customized AI assistant: personal assistants on any hardware! Built with llama.cpp, whisper.cpp, ggml, and LLaMA-v2.
thushan/olla
High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backends.
R3gm/InsightSolver-Colab
InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models.
vtuber-plan/langport
Langport is a language model inference service.
robiwan303/babyagi
BabyAGI-🦙: Enhanced for Llama models (running 100% locally) with persistent memory, smart internet search based on BabyCatAGI, and document embedding in LangChain based on privateGPT
OpenCSGs/llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, RESTful API, auto-scaling, compute resource management, monitoring, and more.
Abhi5h3k/PrivateDocBot
📚 Local PDF-Integrated Chat Bot: Secure Conversations and Document Assistance with LLM-Powered Privacy
greynewell/musegpt
Local LLMs in your DAW!
rbourgeat/ImpAI
😈 ImpAI is an advanced role play app using large language and diffusion models.
ystemsrx/code-atlas
A C++ implementation of Open Interpreter.
fboulnois/llama-cpp-docker
Run llama.cpp in a GPU accelerated Docker container
hyparam/hyllama
llama.cpp gguf file parser for javascript
iacopPBK/llama.cpp-gfx906
llama.cpp-gfx906
lordmathis/llamactl
Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.
blueraai/universal-intelligence
◉ Universal Intelligence: AI made simple.