Pinned Repositories
flux
A fast communication-overlapping library for tensor parallelism on GPUs.
nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
SparTA
brp-nas
Compression-Latency-Predictor
A latency predictor for filter pruning.
nmsparse
Pruning-from-scratch
Pytorch-Visualization
Libraries to visualize network architectures automatically.
zheng-ningxin's Repositories
zheng-ningxin/brp-nas
zheng-ningxin/SparTA
zheng-ningxin/nmsparse
zheng-ningxin/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
zheng-ningxin/SparseOP
zheng-ningxin/compression_exp
zheng-ningxin/cuda-tensorcore-hgemm
zheng-ningxin/CUDALibrarySamples
CUDA Library Samples
zheng-ningxin/CustomizeOP
zheng-ningxin/cutlass
CUDA Templates for Linear Algebra Subroutines
zheng-ningxin/FasterTransformer
Transformer-related optimization, including BERT and GPT.
zheng-ningxin/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
zheng-ningxin/gpu-sparsert
zheng-ningxin/latency_raw_data
zheng-ningxin/LeViT
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
zheng-ningxin/linux
Linux kernel source tree
zheng-ningxin/Mkl-Sparse
zheng-ningxin/MLPruning
MLPruning: structured pruning for BERT NLP models in PyTorch.
zheng-ningxin/nn_pruning
Prune a model while fine-tuning or training.
zheng-ningxin/nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.
zheng-ningxin/pytorch_block_sparse
Fast block-sparse matrices for PyTorch.
zheng-ningxin/sparsednn
Fast sparse deep learning on CPUs
zheng-ningxin/SparTA_Fork
zheng-ningxin/sputnik
A library of GPU kernels for sparse matrix operations.
zheng-ningxin/test_pip
zheng-ningxin/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
zheng-ningxin/TurboTransformers
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
zheng-ningxin/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators.
zheng-ningxin/Utils
zheng-ningxin/zheng-ningxin.github.io