Pinned Repositories
lectures
Material for gpu-mode lectures
neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
awesome-profiling
Awesome utilities for performance profiling
C-compiler-optimizations
Description of commonly done compiler optimizations in C
ml-design-patterns
Software Architecture for ML engineers
multiple_dispatch
Why multiple dispatch lets you write composable code
ao
PyTorch native quantization and sparsity for training and inference
examples
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
serve
Serve, optimize and scale PyTorch models in production
msaroufim's Repositories
msaroufim/ml-design-patterns
Software Architecture for ML engineers
msaroufim/C-compiler-optimizations
Description of commonly done compiler optimizations in C
msaroufim/Discord-PDFPreview
Preview PDFs locally within the Discord UI!
msaroufim/torchprep
msaroufim/vscode-pytorch-extension
msaroufim/cmake-experiments
msaroufim/openaitritontutorial
msaroufim/matmul-pad
msaroufim/tutorials
PyTorch tutorials.
msaroufim/when-did-CUDA-add-X-
msaroufim/compiler-explorer
Run compilers interactively from your web browser and interact with the assembly
msaroufim/data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
msaroufim/faster-pytorch-blog
Outlining techniques for improving the training performance of your PyTorch model without compromising its accuracy
msaroufim/funky_dynamo
Potentially hard examples for torchdynamo
msaroufim/importnow
import for chads
msaroufim/kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
msaroufim/Myblog
msaroufim/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
msaroufim/nerf-pytorch
A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.
msaroufim/nvgpu
NVIDIA GPU tools - monitoring on CLI & web app with multiple agents
msaroufim/picotorchGPT
An unnecessarily tiny implementation of GPT-2 in NumPy.
msaroufim/pyscript
msaroufim/singlegpu
msaroufim/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
msaroufim/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
msaroufim/torchdynamo-tests
msaroufim/Transformers-Recipe
🧠 A quick recipe to learn all about Transformers
msaroufim/triton
Development repository for the Triton language and compiler
msaroufim/vision
Datasets, Transforms and Models specific to Computer Vision
msaroufim/xla
Enabling PyTorch on Google TPU