challengewly

Pinned Repositories

Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
docs
Documentations for PaddlePaddle
Language: Python
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Language: Cuda
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Language: Jupyter Notebook
llm.c
LLM training in simple, raw C/CUDA
Language: Cuda
MiniTorch
Language: Python
Paddle
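PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
Language: C++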
TensorRT_Tutorial
Language: C++
whale-starry
Twinkling stars, boundless radiance
Language: C++

challengewly's Repositories

challengewly/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
challengewly/docs
Documentations for PaddlePaddle
Language: Python
challengewly/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Language: Cuda
challengewly/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Language: Jupyter Notebook
challengewly/llm.c
LLM training in simple, raw C/CUDA
Language: Cuda
challengewly/MiniTorch
Language: Python
challengewly/Paddle
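PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
Language: C++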
challengewly/TensorRT_Tutorial
Language: C++
challengewly/whale-starry
Twinkling stars, boundless radiance
Language: C++