gguf

There are 176 repositories under the gguf topic.

  • LostRuins/koboldcpp

    Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

    Language:C++8.2k821.1k535
  • menloresearch/cortex.cpp

    Local AI API Platform

    Language:C++2.8k26881179
  • Mobile-Artificial-Intelligence/maid

    Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

    Language:Dart2.2k37186218
  • av/harbor

    Effortlessly run LLM backends, APIs, frontends, and services with one command.

    Language:Python2.1k18155140
  • heshengtao/comfyui_LLM_party

    LLM agent framework in ComfyUI that includes an MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, offers access to Feishu and Discord, and adapts to all LLMs with OpenAI-like or aisuite interfaces, such as o1, ollama, gemini, grok, qwen, GLM, deepseek, kimi, and doubao. Also adapted to local LLMs, VLMs, and GGUF models such as llama-3.3 and Janus-Pro, with graphRAG linkage.

    Language:Python1.9k16143158
  • datawhalechina/handy-ollama

    Hands-on Ollama: deploy and run large models on a CPU. Read online at https://datawhalechina.github.io/handy-ollama/

    Language:Jupyter Notebook1.9k148200
  • withcatai/node-llama-cpp

    Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

    Language:TypeScript1.7k18128143
  • sammcj/gollama

    Go manage your Ollama models

    Language:Go1.4k117877
  • edwko/OuteTTS

    Interface for OuteTTS models.

    Language:Python1.4k2858111
  • kitops-ml/kitops

    An open source DevOps tool from the CNCF for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI Artifact.

    Language:Go1.2k14204135
  • Michael-A-Kuykendall/shimmy

    ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

    Language:Rust1.1k92
  • mukel/llama3.java

    Practical Llama 3 inference in Java

    Language:Java781291793
  • eastriverlee/LLM.swift

    LLM.swift is a simple and readable library that allows you to interact with large language models locally with ease for macOS, iOS, watchOS, tvOS, and visionOS.

    Language:C727163788
  • kelindar/search

    Go library for embedded vector search and semantic embeddings using llama.cpp

    Language:Go4863119
  • withcatai/catai

    Run AI ✨ assistant locally! with simple API for Node.js 🚀

    Language:TypeScript47994533
  • antirez/gguf-tools

    GGUF implementation in C as a library and a tools CLI program

    Language:C29012118
  • OEvortex/Webscout

    Webscout is the all-in-one search and AI toolkit you need. Discover insights with Yep.com, DuckDuckGo, and Phind; access cutting-edge AI models; transcribe YouTube videos; generate temporary emails and phone numbers; perform text-to-speech conversions; and much more!

    Language:Python28741755
  • gpustack/llama-box

    LM inference server implementation based on *.cpp.

    Language:C++274104224
  • ShelbyJenkins/llm_client

    The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes

    Language:Rust2344922
  • mgonzs13/llama_ros

    llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2

    Language:C++2284540
  • gpustack/gguf-parser-go

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Language:Go2056715
  • beehive-lab/GPULlama3.java

    GPU-accelerated Llama3.java inference in pure Java using TornadoVM.

    Language:Java16991818
  • BrutalCoding/shady.ai

    Making offline AI models accessible to all types of edge devices.

    Language:Dart145121015
  • Nexesenex/croco.cpp

    Croco.Cpp is a fork of KoboldCPP for inferring GGML/GGUF models on CPU/CUDA with KoboldAI's UI. It is powered partly by IK_LLama.cpp and is compatible with most of Ikawrakow's quants except Bitnet.

    Language:C++136403
  • sinanuozdemir/oreilly-hands-on-gpt-llm

    Mastering the Art of Scalable and Efficient AI Model Deployment

    Language:Jupyter Notebook1346390
  • calcuis/gguf

    gguf node for comfyui

    Language:Python1204136
  • Mobile-Artificial-Intelligence/llama_sdk

    lcpp is a Dart implementation of llama.cpp used by the Mobile Artificial Intelligence Distribution (maid).

    Language:C++10272824
  • akx/ollama-dl

    Download models from the Ollama library, without Ollama

    Language:Python971211
  • Aesthisia/LLMinator

    Gradio-based tool to run open-source LLM models directly from Hugging Face

    Language:Python954419
  • 1038lab/ComfyUI-MiniCPM

    A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.

    Language:Python91
  • AstraBert/PrAIvateSearch

    Own your AI, search the web with it🌐😎

    Language:Python8911112
  • 1038lab/ComfyUI-JoyCaption

    Joy Caption is a ComfyUI node using the LLaVA model to generate stylized image captions, supporting batch processing and GGUF models.

    Language:Python85
  • tattn/LocalLLMClient

    Swift local LLM client for iOS, macOS, Linux

    Language:Swift8119
  • ADT109119/llamacpp-distributed-inference

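    A distributed LLM inference program based on llama.cpp that lets you use multiple computers on a local network to jointly run distributed inference of large language models, with a cross-platform desktop UI built with Electron.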

    Language:JavaScript61
  • mikecvet/nl-sh

    The Natural Language Shell integrates OpenAI's GPTs, Anthropic's Claude, or local GGUF-formatted LLMs directly into the terminal experience, allowing operators to describe their tasks in either POSIX commands or fluent human language

    Language:Rust61101
  • rbourgeat/ImpAI

    😈 ImpAI is an advanced role play app using large language and diffusion models.

    Language:JavaScript61354