optyang

Machine Learning Engineer

Meta

Kaiserslautern, Germany

Pinned Repositories

GSparsity
Language: Python 5 2 02
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python 9.2k 76 607652
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Language: Python 0 0 00
BSCA
Matlab code for block successive convex approximation algorithms
Language: MATLAB 28 2 010
proxsgd
ProxSGD algorithm in TensorFlow
Language: Python 5 1 01
STELA
STELA algorithm for sparsity-regularized linear regression (LASSO)
Language: Python 6 1 03
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python 0 0 00
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python 0 0 00
