qwen3

There are 108 repositories under the qwen3 topic.

unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Language: Python · Stars: 45.5k
MaxKB
🔥 MaxKB is a powerful, easy-to-use open-source platform for building enterprise-grade agents.
Language: Python · Stars: 18.4k
sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python · Stars: 17.9k
ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).
Language: Python · Stars: 9.9k
ART
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
Language: Python · Stars: 7.2k
deep-searcher
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Language: Python · Stars: 6.9k
Awesome-LLM-Inference
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Language: Python · Stars: 4.5k
Sidekick
A native macOS app that allows users to chat with a local LLM that can respond with information from files, folders and websites on your Mac without installing any other software. Powered by llama.cpp.
Language: Swift · Stars: 3.1k
papersgpt-for-zotero
A powerful Zotero AI and MCP plugin with ChatGPT, Gemini, Claude, Grok, DeepSeek, OpenRouter, Kimi, GLM, SiliconFlow, GPT-oss, Gemma 3, Qwen 3
Language: JavaScript · Stars: 1.9k
kubewall
kubewall - Single-Binary Kubernetes Dashboard with Multi-Cluster Management & AI Integration. (OpenAI / Claude 4 / Gemini / DeepSeek / OpenRouter / Ollama / Qwen / LMStudio)
Language: TypeScript · Stars: 1.6k
awesome-llm-and-aigc
🚀🚀🚀A collection of some awesome public projects about Large Language Model(LLM), Vision Language Model(VLM), Vision Language Action(VLA), AI Generated Content(AIGC), the related Datasets and Applications.
Stars: 751
tome
a magical LLM desktop client that makes it easy for *anyone* to use LLMs and MCP
Language: Svelte · Stars: 471
qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
Language: Cuda · Stars: 441
LLM-TPU
Run generative AI models on Sophgo BM1684X/BM1688
Language: C++ · Stars: 240
hud-python
OSS RL environment + evals toolkit
Language: Python · Stars: 174
Qwen3-Medical-SFT
Qwen3 Fine-tuning: Medical R1 Style Chat
Language: Python · Stars: 171
GPULlama3.java
GPU-accelerated Llama3.java inference in pure Java using TornadoVM.
Language: Java · Stars: 169
Crane
A pure-Rust LLM inference engine (including any LLM-based MLLM, such as Spark-TTS), powered by the Candle framework.
Language: Rust · Stars: 159
grps_trtllm
Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.
Language: Python · Stars: 153
easy-model-deployer
Deploy open-source LLMs on AWS in minutes — with OpenAI-compatible APIs and a powerful CLI/SDK toolkit.
Language: Python · Stars: 72
Automodel
DTensor-native pretraining and fine-tuning for LLMs/VLMs with day-0 Hugging Face support, GPU-acceleration, and memory efficiency.
Language: Python · Stars: 71
Better-Qwen3
Automatic thinking-mode switch for Qwen3 in Open WebUI
Language: Python · Stars: 67
everyday
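✨ Give classic quotes new life! Dynamically generates creative stories with an LLM, using AI to reinterpret the wisdom of Kingsoft's Daily Sentence.
Language: Markdown · Stars: 52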
qwen3-semantic-search
interactive semantic search demo using Qwen3-0.6B-Embedding in your browser
Language: TypeScript · Stars: 51
cot_proxy
Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and <think> tag filtering. Perfect for using advanced models with apps that lack parameter customization.
Language: Python · Stars: 50
qwen3-MoE-from-scratch
A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch
Language: Jupyter Notebook · Stars: 47
Prompt_Maker
Makes improved prompts from a basic prompt
Language: Python · Stars: 43
gLLM
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling
Language: Python · Stars: 41
qwen_cli_coder
🤖 Community fork of Google's Gemini CLI for Qwen AI models. A powerful command-line tool that uses Alibaba Cloud's Qwen models to understand your code, automate workflows, and accelerate development. Features multilingual support (EN/CN), model switching, and web search integration.
Language: TypeScript · Stars: 36
steadytext
Deterministic text generation and embeddings with zero configuration
Language: PLpgSQL · Stars: 36
awesome-sglang
Make SGLang go brrr
Stars: 30
PolyMath
Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"
Language: Python · Stars: 30
qwen3-rs
An educational Rust project for exporting and running inference on the Qwen3 LLM family
Language: Rust · Stars: 28
LLM-Quantization
Notes and summaries on LLM quantization.
Language: Python · Stars: 26
qwen3-mcp
An MCP-enabled Qwen3 0.6B demo with adjustable thinking budget, all in your browser!
Language: JavaScript · Stars: 25
qwen3.c
Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.
Language: C · Stars: 18