Pinned Repositories
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
docs
Documentation for PaddlePaddle
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
llm.c
LLM training in simple, raw C/CUDA
MiniTorch
Paddle
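PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle; high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)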
TensorRT_Tutorial
whale-starry
Countless twinkling stars, boundless radiance
challengewly's Repositories
challengewly/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
challengewly/docs
Documentation for PaddlePaddle
challengewly/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
challengewly/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
challengewly/llm.c
LLM training in simple, raw C/CUDA
challengewly/MiniTorch
challengewly/Paddle
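PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle; high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)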
challengewly/TensorRT_Tutorial
challengewly/whale-starry
Countless twinkling stars, boundless radiance