Pinned Repositories
GSparsity
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
BSCA
MATLAB code for block successive convex approximation algorithms
proxsgd
ProxSGD algorithm in TensorFlow
STELA
STELA algorithm for sparsity-regularized linear regression (LASSO)
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
optyang's Repositories
optyang/BSCA
MATLAB code for block successive convex approximation algorithms
optyang/STELA
STELA algorithm for sparsity-regularized linear regression (LASSO)
optyang/proxsgd
ProxSGD algorithm in TensorFlow
optyang/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
optyang/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
optyang/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.