Pinned Repositories
abides
ABIDES: Agent-Based Interactive Discrete Event Simulation
AutoSched
blislab
BLISlab: A Sandbox for Optimizing GEMM
byteps
A high performance and generic framework for distributed DNN training
Canonical_ES_Atari
Benchmarking Canonical Evolution Strategies for Playing Atari
CROPTD
Dataset for "Cross-Regional Oil Palm Tree Detection"
cs231n.github.io
Public-facing notes page
cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
darknet
Convolutional Neural Networks
vpex
Wenzha0Wu's Repositories
Wenzha0Wu/CROPTD
Dataset for "Cross-Regional Oil Palm Tree Detection"
Wenzha0Wu/mri_score
Wenzha0Wu/vpex
Wenzha0Wu/byteps
A high performance and generic framework for distributed DNN training
Wenzha0Wu/cs231n.github.io
Public-facing notes page
Wenzha0Wu/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
Wenzha0Wu/darknet
Convolutional Neural Networks
Wenzha0Wu/Domain-generalization
All about domain generalization
Wenzha0Wu/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Wenzha0Wu/DissectingTensorCores
Wenzha0Wu/intel-extension-for-pytorch
A Python package that extends official PyTorch to easily gain performance on Intel platforms
Wenzha0Wu/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Wenzha0Wu/llmsys_s24_hw1
Wenzha0Wu/llmsys_s24_hw2
Wenzha0Wu/llmsys_s24_hw3
Wenzha0Wu/llmsys_s24_hw4
Wenzha0Wu/models
Models and examples built with TensorFlow
Wenzha0Wu/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Wenzha0Wu/os_course_exercises
Exercises for OS course
Wenzha0Wu/Paddle
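PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)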
Wenzha0Wu/ppl.llm.kernel.cuda
Wenzha0Wu/PyTorch-GAN
PyTorch implementations of Generative Adversarial Networks.
Wenzha0Wu/ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Wenzha0Wu/SWCompiler
Domain specific end-to-end compiler for heterogeneous HPC systems
Wenzha0Wu/tianshou
An elegant, flexible, and superfast PyTorch deep reinforcement learning platform.
Wenzha0Wu/vimrc
The ultimate Vim configuration: vimrc
Wenzha0Wu/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Wenzha0Wu/voc-inspector
Dash app for voc distribution visualization
Wenzha0Wu/wechaty
WeChat Bot SDK for Individual Account, Powered by TypeScript, Docker, and 💖
Wenzha0Wu/xv6-public
xv6 OS