keilsmart

Pinned Repositories

caffe
Ristretto: Caffe-based approximation of convolutional neural networks.
Language: C++
CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
Language: Python
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Language: Jupyter Notebook
llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Language: Python
oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Language: C++
openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hand, and foot estimation
Language: C++
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python
rocMLIR

keilsmart's Repositories

keilsmart/caffe
Ristretto: Caffe-based approximation of convolutional neural networks.
Language: C++
keilsmart/CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
Language: Python
keilsmart/cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
keilsmart/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Language: Jupyter Notebook
keilsmart/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
keilsmart/onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Language: Python
keilsmart/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Language: C++
keilsmart/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hand, and foot estimation
Language: C++
keilsmart/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python
keilsmart/rocMLIR
keilsmart/stable-fast
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Language: Python
keilsmart/Stochastic-Quantization
Training Low-bit DNNs with Stochastic Quantization
Language: Jupyter Notebook
keilsmart/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Language: Python
keilsmart/web-stable-diffusion
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
Language: Jupyter Notebook
keilsmart/YOLOv3-model-pruning
Model pruning for YOLOv3: on the Oxford Hand dataset, the parameter count is reduced by 80% and FLOPs by 70%, inference speed reaches 200% of the original, and mAP remains essentially unchanged.
Language: Python