Pinned Repositories
ao
Custom data types and layouts for training and inference
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
inference
Reference implementations of MLPerf™ inference benchmarks
inference_results_v1.1
intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension that brings Intel GPU (XPU) support to DeepSpeed.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
training
Reference implementations of MLPerf™ training benchmarks
pai
Resource scheduling and cluster management for AI
dbyoung18's Repositories
dbyoung18/ao
Custom data types and layouts for training and inference
dbyoung18/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
dbyoung18/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
dbyoung18/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
dbyoung18/inference
Reference implementations of MLPerf™ inference benchmarks
dbyoung18/inference_results_v1.1
dbyoung18/intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension that brings Intel GPU (XPU) support to DeepSpeed.
dbyoung18/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
dbyoung18/training
Reference implementations of MLPerf™ training benchmarks
dbyoung18/KE-complex_modifications
Karabiner-Elements complex_modifications rules
dbyoung18/mlx
MLX: An array framework for Apple silicon
dbyoung18/safetensors
Simple, safe way to store and distribute tensors
dbyoung18/TensorRT
TensorRT is a C++ library for high-performance inference on NVIDIA GPUs and deep learning accelerators.
dbyoung18/training_results_v2.1
dbyoung18/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
dbyoung18/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
dbyoung18/warp-transducer
A fast parallel implementation of RNN Transducer.
dbyoung18/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.