ML Notes

Documentation

  1. Frameworks
  2. Transformers
  3. Detection
  4. Residual Networks
  5. Language models
  6. Normalization layers
  7. Optimizers

Frameworks:

Transformers:

  • Transformer: Attention Is All You Need
  • Reformer: The Efficient Transformer (locality-sensitive hashing (LSH) attention, with hashes from random projections on a sphere)
  • Linformer: Self-Attention with Linear Complexity (attention cost reduced from O(n^2) to O(n) by low-rank projection of keys and values)
  • Vision Transformer: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  • DeiT: Training data-efficient image transformers & distillation through attention (adds a distillation token and learns from a CNN teacher)
  • CaiT: Going deeper with Image Transformers
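The papers above all build on the scaled dot-product attention of "Attention Is All You Need". A minimal NumPy sketch (single head, no masking; shapes and names are my own choices, not from any of the papers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len): the O(n^2) term
    # Row-wise softmax, shifted for numerical stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 4, 8
out = scaled_dot_product_attention(rng.normal(size=(n, d)),
                                   rng.normal(size=(n, d)),
                                   rng.normal(size=(n, d)))
print(out.shape)  # (4, 8)
```

The (seq_len, seq_len) score matrix is exactly the quadratic cost that Reformer and Linformer attack from different directions.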

Detection:

  • DETR: End-to-End Object Detection with Transformers
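DETR frames detection as set prediction: predicted boxes are matched one-to-one against ground truth by minimizing a pairwise cost. A toy sketch of that matching step, using brute force over permutations instead of the Hungarian algorithm the paper uses (the cost matrix here is made up for illustration):

```python
import itertools
import numpy as np

def match(cost):
    """Find the one-to-one assignment of predictions (rows) to ground-truth
    boxes (columns) with minimal total cost, by exhaustive search."""
    n = cost.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(n)):
        c = sum(cost[i, perm[i]] for i in range(n))
        if c < best_cost:
            best_perm, best_cost = perm, c
    return best_perm, best_cost

# Hypothetical 3x3 matching cost (rows: predictions, cols: ground truth)
cost = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.7, 0.9],
                 [0.6, 0.8, 0.1]])
perm, total = match(cost)
print(perm)  # (1, 0, 2): prediction 0 -> gt 1, 1 -> gt 0, 2 -> gt 2
```

Brute force is factorial in the number of boxes; the Hungarian algorithm does the same matching in polynomial time.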

Residual Networks:

  • RevNet: The Reversible Residual Network – Backpropagation Without Storing Activations
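The RevNet block splits channels into two halves and couples them so inputs can be reconstructed exactly from outputs, which is why activations need not be stored for backprop. A minimal sketch (F and G are arbitrary sub-networks; the toy functions here are stand-ins):

```python
import numpy as np

# Stand-ins for the two residual sub-networks of a reversible block
def F(x): return np.tanh(x)
def G(x): return 0.5 * x

def rev_forward(x1, x2):
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_inverse(y1, y2):
    # Invert the forward pass exactly: no activation storage needed
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = np.array([1.0, -2.0]), np.array([0.5, 3.0])
y1, y2 = rev_forward(x1, x2)
r1, r2 = rev_inverse(y1, y2)
print(np.allclose(r1, x1) and np.allclose(r2, x2))  # True
```

Reformer reuses this same construction to cut the memory cost of deep transformer stacks.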

Language models:

  • PEER (Plan, Edit, Explain, Repeat): a collaborative language model from Meta AI

Normalization layers:


Optimizers:

  • Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
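Adan combines exponential moving averages of the gradient, the gradient difference, and a Nesterov-style lookahead gradient. A simplified sketch of the update as I understand it from the paper; the beta conventions are illustrative and weight decay is omitted, so this may differ from the official implementation:

```python
import numpy as np

def adan_step(theta, g, g_prev, m, v, n, lr=0.1,
              b1=0.02, b2=0.08, b3=0.01, eps=1e-8):
    diff = g - g_prev
    m = (1 - b1) * m + b1 * g        # EMA of gradients
    v = (1 - b2) * v + b2 * diff     # EMA of gradient differences
    u = g + (1 - b2) * diff          # Nesterov-style lookahead gradient
    n = (1 - b3) * n + b3 * u**2     # EMA of squared lookahead gradient
    theta = theta - lr * (m + (1 - b2) * v) / (np.sqrt(n) + eps)
    return theta, m, v, n

# Toy usage: minimize f(x) = x^2 from x = 5
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
n = np.zeros_like(theta)
g_prev = np.zeros_like(theta)
for _ in range(200):
    g = 2 * theta
    theta, m, v, n = adan_step(theta, g, g_prev, m, v, n)
    g_prev = g
print(theta)
```

The gradient-difference term is what distinguishes Adan from Adam-style optimizers: it approximates Nesterov momentum without evaluating the gradient at an extrapolated point.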