YukeWang96
Ph.D. candidate at the University of California, Santa Barbara | System/Compiler for Deep Learning
University of California, Santa Barbara | Santa Barbara, US
Pinned Repositories
AlCOP_MLSys23
Artifact for MLSys'23: ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
APNN-TC_SC21
Artifact for SC21: APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
CNN-TensorRT
Benchmarking TensorRT on CNN models
CS263-project
UCSB CS263 Project for Spring 2020 Quarter
DSXplore_IPDPS21
Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
GNNAdvisor_OSDI21
Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
QGTC_PPoPP22
Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.
SGQuant
SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization
TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
YukeWang96's Repositories
YukeWang96/GNNAdvisor_OSDI21
Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
YukeWang96/TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
YukeWang96/MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
YukeWang96/QGTC_PPoPP22
Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.
YukeWang96/DSXplore_IPDPS21
Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
YukeWang96/SGQuant
SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization
YukeWang96/AlCOP_MLSys23
Artifact for MLSys'23: ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
YukeWang96/APNN-TC_SC21
Artifact for SC21: APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
YukeWang96/CNN-TensorRT
Benchmarking TensorRT on CNN models
YukeWang96/APNN-TC-kernel
YukeWang96/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
YukeWang96/CUDALibrarySamples
CUDA Library Samples
YukeWang96/cutlass
CUDA Templates for Linear Algebra Subroutines
YukeWang96/dgl_pydirect_internal
dgl_pydirect for multi-GPU full-graph computation
YukeWang96/docker-pytorch
A Docker image for PyTorch
YukeWang96/EL-Rec_SC22
Artifact for SC'22: EL-Rec: Efficient Large-scale Recommendation Model Training via Tensor-Train Embedding Table.
YukeWang96/Faith_ATC22
Artifact for Faith: An Efficient Framework for Transformer Verification on GPUs.
YukeWang96/fast-dpsgd
Code for fast DP-SGD implementations in JAX/TF
YukeWang96/github_page
YukeWang96/llvm-build
Dockerfile for building LLVM LibTooling
YukeWang96/openshmem-examples
Some miscellaneous OpenSHMEM examples
YukeWang96/personal_page
YukeWang96/rosette
The Rosette solver-aided host language, sample solver-aided DSLs, and demos
YukeWang96/sc21_AD
YukeWang96/TCGNN-bSpmm
YukeWang96/TCGNN-trition
YukeWang96/TCGNN-tsparse
YukeWang96/tutorials
PyTorch tutorials.
YukeWang96/tutorials-1
Training material for IPU users: tutorials, feature examples, simple applications
YukeWang96/YukeWang96.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics