Pinned Repositories
nn-Meter
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
Paddle
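PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)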
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
Detection
Video-based pedestrian flow density detection
Paddle
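PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)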
Torch-Pruning
[CVPR-2023] Towards Any Structural Pruning; LLMs / Diffusion / Transformers / YOLOv8 / CNNs
triton
Development repository for the Triton language and compiler
SaltFish11's Repositories
SaltFish11/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
SaltFish11/Detection
Video-based pedestrian flow density detection
SaltFish11/Paddle
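PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)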
SaltFish11/Torch-Pruning
[CVPR-2023] Towards Any Structural Pruning; LLMs / Diffusion / Transformers / YOLOv8 / CNNs
SaltFish11/triton
Development repository for the Triton language and compiler