yzhwang
UC Davis PhD from @owensgroup. Co-Founder, CTO at PolyLabs Inc., ex-@sail-sg, ex-@Tencent and WeChat, ex-@google Brain. Built @gunrock. @tensorflow contributor.
PolyLabs Inc., Beijing
Pinned Repositories
gunrock
Programmable CUDA/C++ GPU Graph Analytics
mini
mini is mini
hloenv
an environment based on XLA for deep learning compiler optimization research.
plato
Plato, Tencent's high-performance distributed graph computing framework
bitstarter
depixelization
Depixelization of pixel art on GPU
jax-multi-gpu-resnet50-example
An example showing how to use jax to train resnet50 on multi-node multi-GPU
moderngpu
Patterns and behaviors for GPU computing
plas
Parallel Linear Algebra Subroutines
yzhwang's Repositories
yzhwang/jax-multi-gpu-resnet50-example
An example showing how to use jax to train resnet50 on multi-node multi-GPU
yzhwang/moderngpu
Patterns and behaviors for GPU computing
yzhwang/plas
Parallel Linear Algebra Subroutines
yzhwang/tensorflow-wheels
Custom built tensorflow wheels.
yzhwang/zhua
Forum crawler using Python Scrapy
yzhwang/awesome-ai-infrastructures
⛺ Infrastructures™ for Machine Learning Training / Inference in Production.
yzhwang/awesome-machine-learning-in-compilers
Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation
yzhwang/bert
TensorFlow code and pre-trained models for BERT
yzhwang/char-rnn-tensorflow
Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow
yzhwang/civ-sim
yzhwang/cub
CUB is a flexible library of cooperative threadblock primitives and other utilities for CUDA kernel programming.
yzhwang/cublas-benchmark
Simple benchmark program for cublas routines
yzhwang/Enterprise
Source Code of Enterprise Project @ GWU
yzhwang/gnn
TensorFlow GNN is a library to build Graph Neural Networks on the TensorFlow platform.
yzhwang/gunrock
High-performance Graph Primitives on GPU
yzhwang/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
yzhwang/llm.c
LLM training in simple, raw C/CUDA
yzhwang/Medusa
Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code
yzhwang/morphic
An AI-powered answer engine with a generative UI
yzhwang/mpi4jax
Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡
yzhwang/openai-gemm
Open single and half precision gemm implementations
yzhwang/pbrt-v4
Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.
yzhwang/plato
Plato, Tencent's high-performance graph computing framework
yzhwang/redsync
implementation of redsync
yzhwang/sprocketnes
NES emulator written in Rust
yzhwang/tensorflow
Computation using data flow graphs for scalable machine learning
yzhwang/tf_weights
A TensorFlow implementation of AlexNet with pretrained weights
yzhwang/ThunderKittens
Tile primitives for speedy kernels
yzhwang/yzhwang.github.io
My personal page.
yzhwang/zero123
Zero-1-to-3: Zero-shot One Image to 3D Object: https://zero123.cs.columbia.edu/