Pinned Repositories
spark
Apache Spark - A unified analytics engine for large-scale data processing
gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
triton
Development repository for the Triton language and compiler
bound-optimization
Linear regression and logistic regression under bound constrained optimization in Python.
Megatron-LM
Ongoing research training transformer models at scale
spark-vlbfgs
Vector-free L-BFGS implementation for Spark MLlib
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
yanboliang's Repositories
yanboliang/spark-vlbfgs
Vector-free L-BFGS implementation for Spark MLlib
yanboliang/bound-optimization
Linear regression and logistic regression under bound constrained optimization in Python.
yanboliang/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
yanboliang/Megatron-LM
Ongoing research training transformer models at scale
yanboliang/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
yanboliang/benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
yanboliang/keras
Deep Learning for humans
yanboliang/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
yanboliang/pytorch-jit-paritybench
yanboliang/spark
Mirror of Apache Spark
yanboliang/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
yanboliang/einops
Deep learning operations reinvented (for PyTorch, TensorFlow, JAX and others)
yanboliang/tensorflow
Computation using data flow graphs for scalable machine learning
yanboliang/triton
Development repository for the Triton language and compiler