llamacpp

There are 542 repositories under the llamacpp topic.

  • menloresearch/jan

    Jan is an open source alternative to ChatGPT that runs 100% offline on your computer

    Language: TypeScript
  • khoj-ai/khoj

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

    Language: Python
  • llmware-ai/llmware

    Unified framework for building enterprise RAG pipelines with small, specialized models

    Language: Python
  • getumbrel/llama-gpt

    A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

    Language: TypeScript
  • xorbitsai/inference

    Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

    Language: Python
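A minimal sketch of what that one-line change looks like against an OpenAI-compatible endpoint. The local URL, port, and model name are assumptions for illustration, and the request is built but never sent:

```python
import json
import urllib.request

# Hosted endpoint vs. an assumed local endpoint (URL and port are
# hypothetical). Because the API shape is OpenAI-compatible, only the
# base URL (and model name) changes between the two.
OPENAI_URL = "https://api.openai.com/v1/chat/completions"
LOCAL_URL = "http://localhost:9997/v1/chat/completions"  # hypothetical

def build_chat_request(url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

# Swapping providers is a single changed argument:
req = build_chat_request(LOCAL_URL, "my-local-model", "Hello!")
```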
  • reorproject/reor

    Private & local AI personal knowledge management app for high entropy people.

    Language: JavaScript
  • LostRuins/koboldcpp

    Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

    Language: C++
  • serge-chat/serge

    A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API.

    Language: Svelte
  • JohnSnowLabs/spark-nlp

    State of the Art Natural Language Processing

    Language: Scala
  • gptme/gptme

    Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.

    Language: Python
  • gpustack/gpustack

    Simple, scalable AI model deployment on GPU clusters

    Language: Python
  • twinnydotdev/twinny

    The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.

    Language: TypeScript
  • SciSharp/LLamaSharp

    A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

    Language: C#
  • Josh-XT/AGiXT

    AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.

    Language: Python
  • cactus-compute/cactus

    Run AI locally on phones and AI-native devices

    Language: C++
  • SilasMarvin/lsp-ai

    LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.

    Language: Rust
  • menloresearch/cortex.cpp

    Local AI API Platform

    Language: C++
  • Mobile-Artificial-Intelligence/maid

    Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

    Language: Dart
  • containers/ramalama

    RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

    Language: Python
  • floneum/floneum

    Instant, controllable, local pre-trained AI models in Rust

    Language: Rust
  • alexpinel/Dot

    Text-To-Speech, RAG, and LLMs. All local!

    Language: JavaScript
  • alexrozanski/LlamaChat

    Chat with your favourite LLaMA models in a native macOS app

    Language: Swift
  • mostlygeek/llama-swap

    Model swapping for llama.cpp (or any local OpenAI API compatible server)

    Language: Go
  • RahulSChand/gpu_poor

    Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

    Language: JavaScript
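The memory half of that calculation is simple arithmetic. A rough sketch of the weights-only estimate follows; it deliberately ignores KV cache, activations, and runtime overhead, which a real calculator such as this one accounts for on top:

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GPU memory for model weights alone, in decimal GB.

    KV cache, activations, and framework overhead are NOT included;
    they add several more GB in practice.
    """
    return n_params * bits_per_weight / 8 / 1e9

# A 7B-parameter model at 4-bit quantization needs roughly 3.5 GB just
# for weights; the same model in fp16 needs about 14 GB.
q4 = weight_memory_gb(7e9, 4)
fp16 = weight_memory_gb(7e9, 16)
```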
  • intentee/paddler

    Open-source LLMOps platform for hosting and scaling AI in your own infrastructure 🏓🦙

    Language: Rust
  • vercel/modelfusion

    The TypeScript library for building AI applications.

    Language: TypeScript
  • awaescher/OllamaSharp

    The easiest way to use Ollama in .NET

    Language: C#
  • Michael-A-Kuykendall/shimmy

    ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

    Language: Rust
  • benman1/generative_ai_with_langchain

    Build production-ready LLM applications and advanced agents using Python, LangChain, and LangGraph. This is the companion repository for the book on generative AI with LangChain.

    Language: Jupyter Notebook
  • Dicklesworthstone/swiss_army_llama

    A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

    Language: Python
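At the core of semantic search over precomputed embeddings is a similarity measure. A minimal cosine-similarity sketch, using toy 3-d vectors rather than real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank stored embeddings against a query embedding (toy vectors; a real
# system would use model-generated embeddings with hundreds of dimensions).
store = {"doc1": [1.0, 0.0, 0.0], "doc2": [0.7, 0.7, 0.0]}
query = [1.0, 0.0, 0.0]
ranked = sorted(store, key=lambda k: cosine_similarity(store[k], query),
                reverse=True)
```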
  • ngxson/wllama

    WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

    Language: TypeScript
  • Atome-FE/llama-node

    Believe in AI democratization. Llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

    Language: Rust
  • huggingface/llm-ls

    LSP server leveraging LLMs for code completion (and more?)

    Language: Rust
  • mukel/llama3.java

    Practical Llama 3 inference in Java

    Language: Java
  • if-ai/ComfyUI-IF_AI_tools

    ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.

    Language: Python
  • Maximilian-Winter/llama-cpp-agent

    The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output, and it also works with models not fine-tuned for JSON output or function calling.

    Language: Python
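A sketch of the structured-output idea such frameworks automate: ask the model for JSON matching a schema, then parse and type-check the reply before using it. The reply below is canned for illustration, not real model output, and the schema is a made-up example:

```python
import json

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"city": str, "temperature_c": float}

def parse_structured(reply: str) -> dict:
    """Parse a model reply as JSON and validate it against SCHEMA.

    Raises ValueError if a field is missing or has the wrong type,
    which a framework would typically turn into a retry prompt.
    """
    data = json.loads(reply)
    for field, expected in SCHEMA.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"field {field!r} missing or not {expected.__name__}")
    return data

# Canned reply standing in for a model call:
canned = '{"city": "Oslo", "temperature_c": 3.5}'
result = parse_structured(canned)
```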