Pinned Repositories
BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
caffe
Caffe: a fast open framework for deep learning.
cub
Cooperative primitives for CUDA C++.
G-SLIDE
HashingDeepLearning
Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"
moderngpu
Patterns and behaviors for GPU computing
panzaifeng.github.io
RecFlex
A kernel optimization system for recommendation models
recom
vimrc
My simple Vim configuration
PanZaifeng's Repositories
PanZaifeng/G-SLIDE
PanZaifeng/RecFlex
A kernel optimization system for recommendation models
PanZaifeng/panzaifeng.github.io
PanZaifeng/vimrc
My simple Vim configuration
PanZaifeng/BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
PanZaifeng/caffe
Caffe: a fast open framework for deep learning.
PanZaifeng/cub
Cooperative primitives for CUDA C++.
PanZaifeng/HashingDeepLearning
Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"
PanZaifeng/moderngpu
Patterns and behaviors for GPU computing
PanZaifeng/recom
PanZaifeng/tensorflow
An Open Source Machine Learning Framework for Everyone
PanZaifeng/the-algorithm
Source code for Twitter's Recommendation Algorithm
PanZaifeng/Vitis_Libraries
Vitis Libraries
PanZaifeng/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.