Cunxiao2002
University of Science and Technology Beijing in Automation
University of Science and Technology Beijing, Beijing
Cunxiao2002's Stars
hzwer/WritingAIPaper
Writing AI Conference Papers: A Handbook for Beginners
KnowingNothing/compiler-and-arch
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
byungsoo-oh/ml-systems-papers
Curated collection of papers in machine learning systems
NAOSI-DLUT/Campus2025
A roundup of internet-industry campus recruitment information for the class of 2025
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
karpathy/build-nanogpt
Video+code lecture on building nanoGPT from scratch
microsoft/AI-System
System for AI Education Resource.
RussWong/LLM-engineering
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
RussWong/CUDATutorial
A CUDA tutorial for learning CUDA programming from scratch
jefferyZhan/Griffon
[ECCV 2024] The official repo of the Griffon series
DefTruth/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
cuda-mode/awesomeMLSys
An ML Systems Onboarding list
conanhujinming/tips_for_interview
Some of my interview takeaways; my journey self-studying CS; job-hunting experience
gzc/CLRS
Solutions to Introduction to Algorithms
liguodongiot/llm-action
This project aims to share the technical principles behind large models, along with hands-on experience.
AmberLJC/LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
kebijuelun/Awesome-LLM-Learning
Learning Large Language Models (LLMs)
triton-lang/triton
Development repository for the Triton language and compiler
l0ngc/hpc-learning
hpc-learning
microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.