Pinned Repositories
AITemplate_public
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework.
cpuinfo
CPU INFOrmation library (x86/ARM, Linux/Mach/NaCl)
cutlass
CUDA Templates for Linear Algebra Subroutines
dmlc-core
A common bricks library for building scalable and portable distributed machine learning.
onnx
Open Neural Network Exchange
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
hlu1's Repositories
hlu1/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
hlu1/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
hlu1/AITemplate_public
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
hlu1/onnx
Open Neural Network Exchange
hlu1/QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
hlu1/caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework.
hlu1/cpuinfo
CPU INFOrmation library (x86/ARM, Linux/Mach/NaCl)
hlu1/cutlass
CUDA Templates for Linear Algebra Subroutines
hlu1/dmlc-core
A common bricks library for building scalable and portable distributed machine learning.
hlu1/KeepingYouAwake
Prevents your Mac from going to sleep.
hlu1/minGPT
A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer) training
hlu1/models
A repository for storing pre-trained Caffe2 models.
hlu1/TASO
A Tensor Algebra SuperOptimizer for Deep Learning
hlu1/tvm-samples
hlu1/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs