Pinned Repositories
cutlass
CUDA Templates for Linear Algebra Subroutines
FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
five-letter-words
Experiments with Knuth's 5,757 five-letter words.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
SHARK
Distributed SHARK
nirvedhmeshram's Repositories
nirvedhmeshram/SHARK
Distributed SHARK
nirvedhmeshram/cutlass
CUDA Templates for Linear Algebra Subroutines
nirvedhmeshram/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
nirvedhmeshram/five-letter-words
Experiments with Knuth's 5,757 five-letter words.
nirvedhmeshram/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
nirvedhmeshram/hbc_verification
nirvedhmeshram/iree
👻
nirvedhmeshram/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
nirvedhmeshram/llvm-test-suite
nirvedhmeshram/mmperf
MatMul Performance Benchmarks for a Single CPU Core, comparing both hand-engineered and codegen kernels.
nirvedhmeshram/PI
A lightweight MLIR Python frontend with support for PyTorch
nirvedhmeshram/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
nirvedhmeshram/torch-mlir
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.