rdspring1
A PhD graduate researching Machine Learning, Locality-Sensitive Hashing, and Deep Learning Compilers.
Rice University; @RUSH-LAB; @Nvidia; Santa Clara
Pinned Repositories
Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
comp450-planning_under_uncertainty
Motion planning for a steerable needle under action uncertainty
comp450-Reachability-Guided-RRT
Use dynamic constraints to sample plausible states for the RRT algorithm, improving robot motion planning
Count-Sketch-Optimizers
A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
lightning-thunder
Source-to-source compiler for PyTorch. It makes PyTorch programs faster on single accelerators and in distributed settings.
LSH-Mutual-Information
Use LSH Sampling for Mutual Information Estimation
LSH_DeepLearning
Scalable and Sustainable Deep Learning via Randomized Hashing
MISSION
MISSION: Ultra Large-Scale Feature Selection using Count-Sketches
PyTorch_GBW_LM
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
LSH_Memory
One-Shot Learning using Nearest-Neighbor Search (NNS) and Locality-Sensitive Hashing (LSH)
rdspring1's Repositories
rdspring1/PyTorch_GBW_LM
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
rdspring1/LSH_DeepLearning
Scalable and Sustainable Deep Learning via Randomized Hashing
rdspring1/Count-Sketch-Optimizers
A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
rdspring1/MISSION
MISSION: Ultra Large-Scale Feature Selection using Count-Sketches
rdspring1/LSH-Mutual-Information
Use LSH Sampling for Mutual Information Estimation
rdspring1/comp450-Reachability-Guided-RRT
Use dynamic constraints to sample plausible states for the RRT algorithm, improving robot motion planning
rdspring1/comp450-planning_under_uncertainty
Motion planning for a steerable needle under action uncertainty
rdspring1/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
rdspring1/RzLinear
A compressed alternative to matrix multiplication using state-of-the-art ROBE-Z compression
rdspring1/lightning-thunder
Source-to-source compiler for PyTorch. It makes PyTorch programs faster on single accelerators and in distributed settings.
rdspring1/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
rdspring1/atari-representation-learning
Code for "Unsupervised State Representation Learning in Atari"
rdspring1/Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
rdspring1/Autodiff-Puzzles
rdspring1/Autopilot-TensorFlow
A TensorFlow implementation of this Nvidia paper: https://arxiv.org/pdf/1604.07316.pdf with some changes
rdspring1/cs231n
Solutions to Stanford CS231n Spring 2018 Course Assignments.
rdspring1/cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
rdspring1/dlrm_ssm
rdspring1/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
rdspring1/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
rdspring1/mongoose
A Learnable LSH Framework for Efficient NN Training
rdspring1/NvFuser
A Fusion Code Generator for NVIDIA GPUs
rdspring1/nvprims-torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
rdspring1/Optimizing-DGEMM-on-Intel-CPUs-with-AVX512F
Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.
rdspring1/Optimizing-DGEMV-on-Intel-CPUs
Highly optimized DGEMV on CPU with both serial and parallel performance better than MKL and OpenBLAS.
rdspring1/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to near-cuBLAS performance.
rdspring1/tutel
Tutel MoE: An Optimized Mixture-of-Experts Implementation
rdspring1/twitter-algorithm-ml
Source code for Twitter's Recommendation Algorithm
rdspring1/vector-search-class-notes
Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 495 @ Princeton Fall 2023
rdspring1/xla
Enabling PyTorch on Google TPU