llx-08's Stars
LLMServe/DistServe
Disaggregated serving system for Large Language Models (LLMs).
phidatahq/phidata
Build AI Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
owenliang/qwen-vllm
Tongyi Qianwen (Qwen) vLLM inference deployment demo
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
microsoft/mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
tyler-griggs/melange-release
Paitesanshi/LLM-Agent-Survey
protocolbuffers/protobuf
Protocol Buffers - Google's data interchange format
alibaba/llm-scheduling-artifact
Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"
dutsc/speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Azure/AzurePublicDataset
Microsoft Azure Traces
triton-lang/triton
Development repository for the Triton language and compiler
dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
FasterDecoding/REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
yuweihao/MambaOut
MambaOut: Do We Really Need Mamba for Vision?
coder/code-server
VS Code in the browser
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
zeroQiaoba/MERTools
Toolkits for Multimodal Emotion Recognition
OpenMOSS/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
wangshusen/RecommenderSystem
wangshusen/SearchEngine
Principles of search engines
KnowingNothing/MatmulTutorial
An Easy-to-understand TensorOp Matmul Tutorial
adam-maj/tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
brucefan1983/CUDA-Programming
Sample code for my CUDA programming book
microsoft/DeepSpeedExamples
Example models using DeepSpeed