Pinned Repositories
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
asfermi
Assembler for NVIDIA Fermi. Imported from Google Code.
asplos21_ae_script
asplos_2021_ae
benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
CHIPKIT
CHIPKIT: An agile, reusable open-source framework for rapid test chip development
ck-artifact-evaluation
Public CK repository with materials and workflows to reproduce results from published papers or open competitions at ACM, IEEE and NeurIPS conferences and journals
cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of a contemporary GPU running CUDA and/or OpenCL workloads and now includes an integrated (and validated) energy model, GPUWattch.
brad-mengchi's Repositories
brad-mengchi/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
brad-mengchi/asplos21_ae_script
brad-mengchi/asplos_2021_ae
brad-mengchi/benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
brad-mengchi/CHIPKIT
CHIPKIT: An agile, reusable open-source framework for rapid test chip development
brad-mengchi/ck-artifact-evaluation
Public CK repository with materials and workflows to reproduce results from published papers or open competitions at ACM, IEEE and NeurIPS conferences and journals
brad-mengchi/cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
brad-mengchi/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit.
brad-mengchi/gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of a contemporary GPU running CUDA and/or OpenCL workloads and now includes an integrated (and validated) energy model, GPUWattch.
brad-mengchi/echoedit.github.io
brad-mengchi/fairscale
PyTorch extensions for high performance and large scale training.
brad-mengchi/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
brad-mengchi/Galois
Galois: C++ library for multi-core and multi-node parallelization
brad-mengchi/gpgpu-sim_simulations
A repository that complements gpgpu-sim, providing automated regression scripts, simulation launching utilities, and the code and arguments for simulations that complete in a reasonable amount of time on GPGPU-Sim.
brad-mengchi/gpufs
GPUfs - File system support for NVIDIA GPUs
brad-mengchi/ISCA-2021-Script
A collection of redistributable Python scripts to help organize ISCA 2021 (The 48th International Symposium on Computer Architecture).
brad-mengchi/kokkos-top-level
brad-mengchi/llvm-lto
brad-mengchi/llvm-pass-skeleton
Example LLVM pass
brad-mengchi/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
brad-mengchi/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
brad-mengchi/MightyPC
Mighty toolkit for conference Program Chairs.
brad-mengchi/Paraploy
brad-mengchi/pennant
brad-mengchi/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
brad-mengchi/sst-gpgpusim
SST GPGPU Simulation Components
brad-mengchi/thrust
The C++ parallel algorithms library.
brad-mengchi/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
brad-mengchi/torchrec
PyTorch domain library for recommendation systems
brad-mengchi/triton
Development repository for the Triton language and compiler