GD06
Ph.D. student in the SEAL Lab, Dept. of Electrical and Computer Engineering, UC Santa Barbara. Research interests include computer systems and architecture.
Location: UC Santa Barbara
https://seal.ece.ucsb.edu/
Pinned Repositories
caffe
Caffe: a fast open framework for deep learning.
caffe-tensorflow
Caffe models in TensorFlow
cublas_perf
Testing the performance of cuBLAS
cuda-convnet2
Automatically exported from code.google.com/p/cuda-convnet2
cudnn-tuning
Code for auto-tuning cuDNN conv forward implementations
fathom
Reference workloads for modern deep learning methods.
FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
mkldnn-perf
Testing the performance of MKL-DNN
MPU-ASPLOS-2021
Source code of MPU simulator and compiler for ASPLOS 2021 submission.
mpu-sim_distribution
GD06's Repositories
GD06/mpu-sim_distribution
GD06/MPU-ASPLOS-2021
Source code of MPU simulator and compiler for ASPLOS 2021 submission.
GD06/cudnn-tuning
Code for auto-tuning cuDNN conv forward implementations
GD06/mkldnn-perf
Testing the performance of MKL-DNN
GD06/caffe
Caffe: a fast open framework for deep learning.
GD06/caffe-tensorflow
Caffe models in TensorFlow
GD06/cublas_perf
Testing the performance of cuBLAS
GD06/cuda-convnet2
Automatically exported from code.google.com/p/cuda-convnet2
GD06/fathom
Reference workloads for modern deep learning methods.
GD06/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
GD06/flash-attention
Fast and memory-efficient exact attention
GD06/GD06.github.io
Homepage
GD06/gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.
GD06/Halide
a language for fast, portable data-parallel computation
GD06/leveldb
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
GD06/models
Models and examples built with TensorFlow
GD06/mpu-homepage
Homepage of the MPU project based on the Cayman theme.
GD06/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
GD06/NiftyRec
NiftyRec is a software toolbox for tomographic image reconstruction. NiftyRec is written in C, and computationally intensive functions have a GPU-accelerated version based on NVIDIA CUDA. NiftyRec includes a Matlab Toolbox and a Python package that access the low-level routines, hiding the complexity of the GPU-accelerated algorithms.
GD06/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
GD06/pytorch-cifar
95.16% on CIFAR10 with PyTorch
GD06/torchrec
PyTorch domain library for recommendation systems
GD06/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.