Pinned Repositories
6.824-labs
labs for MIT 6.824 distributed systems
61c-matrix-tools
Tools for debugging the cs61c classify project
async-recursion-monotonic-list-contrived
contrived example of an async recursive function with a monotonically growing immutable list
brain4d
Note management utility
braind
Utility for managing my markdown notes
cuda-bitonic-merge
dotfiles
flash-attention
Fast and memory-efficient exact attention
softgrep
Code search with tree-sitter + semantic search
speculative-forecasting
Experiments with controlling how many tokens to predict for speculative decoding
skrider's Repositories
skrider/softgrep
Code search with tree-sitter + semantic search
skrider/flash-attention
Fast and memory-efficient exact attention
skrider/6.824-labs
labs for MIT 6.824 distributed systems
skrider/async-recursion-monotonic-list-contrived
contrived example of an async recursive function with a monotonically growing immutable list
skrider/brain4d
Note management utility
skrider/chat-with-gpt
An open-source ChatGPT app with a voice
skrider/crossgrep
Cross-encoding AST queries with LLMs for dense retrieval
skrider/cten
CUDA tensor library from scratch for practice
skrider/cuda-bitonic-merge
skrider/cuda-workshop
cuda-workshop
skrider/dotfiles
skrider/draftsman
CS285 final project
skrider/serverless-model-example
Practice project demonstrating how to serve copies of a single model efficiently and autoscale on demand
skrider/speculative-forecasting
Experiments with controlling how many tokens to predict for speculative decoding
skrider/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
skrider/candle
Minimalist ML framework for Rust
skrider/cs285-project
CS 285 Final Project
skrider/eecs182-midterm-review
eecs182-midterm-review
skrider/flashinfer
FlashInfer: Kernel Library for LLM Serving
skrider/kernel-introspection
skrider/kubernetes-examples
Kubernetes application example tutorials
skrider/labs
labs
skrider/mq-paged-attention
Multi-query paged attention kernel
skrider/paged_flash_attention_inference
Paged flash attention
skrider/pip-prune
Tool for automatically minimizing Python dependencies
skrider/react-flow-onboard
skrider/resume
My resume template in LaTeX
skrider/serverless-sam
Deploying SAM on banana ml serverless
skrider/serverless-scraper
Serverless web scraper + RAG backend
skrider/torch-dynamo-experiments
Profiling a large set of models compiled with PyTorch Dynamo and Inductor on an A10