Pinned Repositories
tilt
antlr4
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
ava
Automatic virtualization of (general) accelerators.
awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
benchmark
A microbenchmark support library
clusterdata
Cluster data collected from production clusters at Alibaba for cluster management research
cpp-ipc
C++ IPC Library: high-performance inter-process communication using shared memory on Linux/Windows.
cricket
cricket is a virtualization solution for GPUs
fault-tolerent-kv-store
Fault-tolerant distributed key-value store that consists of multiple key-value servers, each of which is responsible for a portion of the key space.
streambox
wzhao18's Repositories
wzhao18/streambox
wzhao18/ava
Automatic virtualization of (general) accelerators.
wzhao18/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
wzhao18/clusterdata
Cluster data collected from production clusters at Alibaba for cluster management research
wzhao18/cpp-ipc
C++ IPC Library: high-performance inter-process communication using shared memory on Linux/Windows.
wzhao18/cricket
cricket is a virtualization solution for GPUs
wzhao18/cuda-graph-with-dynamic-parameters
wzhao18/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
wzhao18/finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
wzhao18/GPU-Virtualization-Benchmarks
wzhao18/hidet
An open-source efficient deep learning framework.
wzhao18/HUVM
wzhao18/iceoryx
Eclipse iceoryx™ - true zero-copy inter-process-communication
wzhao18/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
wzhao18/Lucid
Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs
wzhao18/ml-cvnets
CVNets: A library for training computer vision networks
wzhao18/needle
wzhao18/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
wzhao18/protobuf-messaging
C++ library for sending/receiving protobuf messages over various channels (pipe, socket, kafka, etc.)
wzhao18/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
wzhao18/rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
wzhao18/Saber
Window-Based Hybrid CPU/GPU Stream Processing Engine
wzhao18/streambench
wzhao18/tenset
wzhao18/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
wzhao18/tlp
wzhao18/triton
Development repository for the Triton language and compiler
wzhao18/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
wzhao18/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
wzhao18/yolov5
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite