Notes and code for exploring ML.
- Research Notes collected from various sources.
- Linear Algebra Notes
- Idea: Neural Network in MIT Scratch
- Ideas prompted by developments in machine learning for psychology, philosophy, politics, etc.
Several papers recommended by the alphanerds at GPU Mode, from their ML Systems Onboarding Reading List:
- Attention
- Performance
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems: Wonderful survey, start here
  - Efficiently Scaling Transformer Inference: Introduced many ideas, most notably KV caches
- Making Deep Learning go Brrr from First Principles: One of the best intros to fusions and overhead
- Quantisation
- A White Paper on Neural Network Quantization
  - LLM.int8(): All of Dettmers' papers are great, but this is a natural intro
  - FP8 Formats for Deep Learning: A first-hand look at how new number formats come about
  - SmoothQuant: Balancing rounding errors between weights and activations
- Long Context Length
- RoFormer: Enhanced Transformer with Rotary Position Embedding: The paper that introduced rotary positional embeddings
- YaRN: Efficient Context Window Extension of Large Language Models: Extend base model context lengths with finetuning
- Ring Attention with Blockwise Transformers for Near-Infinite Context: Scale to infinite context lengths as long as you can stack more GPUs
- Sparsity
  - VENOM: Vectorized N:M format for sparse tensor cores
  - MegaBlocks: Efficient sparse training with mixture-of-experts
  - ReLU Strikes Back: Activation sparsity in LLMs
- Sparse Llama
- Simple pruning for LLMs
Colour name classifier without ML library dependencies: Colour Prototype
- Only 2-3 input nodes (RGB, HSL or maybe HL)
- Output nodes on the order of 10
- Prototype implementation in TypeScript
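The spec above (2-3 inputs, ~10 outputs, no ML library) can be sketched as a tiny one-hidden-layer network in plain TypeScript. Everything here is a placeholder: the colour names, the hidden-layer size of 8, and the random weights, which merely stand in for weights a real prototype would learn.

```typescript
// Minimal colour-name classifier sketch: 3 inputs (RGB), one hidden layer,
// ~10 output classes. No ML library dependency, just arrays and Math.
const COLOUR_NAMES = [
  "red", "orange", "yellow", "green", "cyan",
  "blue", "purple", "pink", "brown", "grey",
];

type Matrix = number[][];

// Random placeholder weights in [-1, 1]; a trained model would replace these.
function randMatrix(rows: number, cols: number): Matrix {
  return Array.from({ length: rows }, () =>
    Array.from({ length: cols }, () => Math.random() * 2 - 1));
}

function matVec(m: Matrix, v: number[]): number[] {
  return m.map(row => row.reduce((sum, w, i) => sum + w * v[i], 0));
}

function relu(v: number[]): number[] {
  return v.map(x => Math.max(0, x));
}

// Softmax with max-subtraction for numerical stability.
function softmax(v: number[]): number[] {
  const mx = Math.max(...v);
  const exps = v.map(x => Math.exp(x - mx));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / total);
}

const HIDDEN = 8; // arbitrary; keeps the prototype tiny
const W1 = randMatrix(HIDDEN, 3);
const W2 = randMatrix(COLOUR_NAMES.length, HIDDEN);

function classify(rgb: [number, number, number]): { name: string; probs: number[] } {
  const x = rgb.map(c => c / 255); // normalise channels to [0, 1]
  const probs = softmax(matVec(W2, relu(matVec(W1, x))));
  const best = probs.indexOf(Math.max(...probs));
  return { name: COLOUR_NAMES[best], probs };
}
```

With random weights the predicted name is meaningless, but the shapes match the spec: `classify([255, 0, 0])` returns one of the ten names plus a probability vector summing to 1. Swapping RGB inputs for HSL would only change the normalisation step.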
Music classifier idea: Muzaklassifier (pre-larval stage)