qwen3

There are 108 repositories under the qwen3 topic.

  • unsloth

    Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

    Language: Python · 45.5k stars
  • MaxKB

    🔥 MaxKB is a powerful, easy-to-use open-source platform for building enterprise-grade agents.

    Language: Python · 18.4k stars
  • sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Language: Python · 17.9k stars
  • ms-swift

    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

    Language: Python · 9.9k stars
  • ART

    Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

    Language: Python · 7.2k stars
  • deep-searcher

    Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

    Language: Python · 6.9k stars
  • Awesome-LLM-Inference

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Language: Python · 4.5k stars
  • Sidekick

    A native macOS app for chatting with a local LLM that can respond with information from files, folders, and websites on your Mac, without installing any other software. Powered by llama.cpp.

    Language: Swift · 3.1k stars
  • papersgpt-for-zotero

    A powerful Zotero AI and MCP plugin with ChatGPT, Gemini, Claude, Grok, DeepSeek, OpenRouter, Kimi, GLM, SiliconFlow, GPT-oss, Gemma 3, Qwen 3

    Language: JavaScript · 1.9k stars
  • kubewall

    kubewall - Single-Binary Kubernetes Dashboard with Multi-Cluster Management & AI Integration. (OpenAI / Claude 4 / Gemini / DeepSeek / OpenRouter / Ollama / Qwen / LMStudio)

    Language: TypeScript · 1.6k stars
  • awesome-llm-and-aigc

    🚀🚀🚀 A collection of awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), and AI Generated Content (AIGC), plus related datasets and applications.

  • tome

    a magical LLM desktop client that makes it easy for *anyone* to use LLMs and MCP

    Language: Svelte · 471 stars
  • qwen600

    Static suckless single batch CUDA-only qwen3-0.6B mini inference engine

    Language: Cuda · 441 stars
  • LLM-TPU

    Run generative AI models on Sophgo BM1684X/BM1688

    Language: C++ · 240 stars
  • hud-python

    OSS RL environment + evals toolkit

    Language: Python · 174 stars
  • Qwen3-Medical-SFT

    Qwen3 Fine-tuning: Medical R1 Style Chat

    Language: Python · 171 stars
  • GPULlama3.java

    GPU-accelerated Llama3.java inference in pure Java using TornadoVM.

    Language: Java · 169 stars
  • Crane

    A pure Rust LLM inference engine (also covering LLM-based MLLMs such as Spark-TTS), powered by the Candle framework.

    Language: Rust · 159 stars
  • grps_trtllm

    Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.

    Language: Python · 153 stars
  • easy-model-deployer

    Deploy open-source LLMs on AWS in minutes — with OpenAI-compatible APIs and a powerful CLI/SDK toolkit.

    Language: Python · 72 stars
  • Automodel

    DTensor-native pretraining and fine-tuning for LLMs/VLMs with day-0 Hugging Face support, GPU-acceleration, and memory efficiency.

    Language: Python · 71 stars
  • Better-Qwen3

    Auto Thinking Mode switch for Qwen3 in Open WebUI

    Language: Python · 67 stars
  • everyday

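    ✨ Give classic quotes a new life! Dynamically generates creative stories with an LLM, using AI to reinterpret the wisdom of Kingsoft's Daily Sentence (金山每日一句).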

    Language: Markdown · 52 stars
  • qwen3-semantic-search

    interactive semantic search demo using Qwen3-0.6B-Embedding in your browser

    Language: TypeScript · 51 stars
  • cot_proxy

    Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and <think> tag filtering. Perfect for using advanced models with apps that lack parameter customization.

    Language: Python · 50 stars
  • qwen3-MoE-from-scratch

    A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch

    Language: Jupyter Notebook · 47 stars
  • Prompt_Maker

    Makes improved prompts from a basic prompt

    Language: Python · 43 stars
  • gLLM

    gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling

    Language: Python · 41 stars
  • qwen_cli_coder

    🤖 Community fork of Google's Gemini CLI for Qwen AI models. A powerful command-line tool that uses Alibaba Cloud's Qwen models to understand your code, automate workflows, and accelerate development. Features multilingual support (EN/CN), model switching, and web search integration.

    Language: TypeScript · 36 stars
  • steadytext

    Deterministic text generation and embeddings with zero configuration

    Language: PLpgSQL · 36 stars
  • awesome-sglang

    Make SGLang go brrr

  • PolyMath

    Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"

    Language: Python · 30 stars
  • qwen3-rs

    An educational Rust project for exporting and running inference on Qwen3 LLM family

    Language: Rust · 28 stars
  • LLM-Quantization

    Notes and summaries on quantizing LLMs.

    Language: Python · 26 stars
  • qwen3-mcp

    An MCP-enabled Qwen3 0.6B demo with adjustable thinking budget, all in your browser!

    Language: JavaScript · 25 stars
  • qwen3.c

    Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.

    Language: C · 18 stars