ML Notes
Documentation
- Transformer: Attention Is All You Need
- Reformer: The Efficient Transformer (locality-sensitive hashing attention: queries/keys are bucketed via random rotations on a sphere; sketch after this list)
- Linformer: Self-Attention with Linear Complexity (O(n^2) => O(n) by projecting keys/values down to a fixed length; sketch after this list)
- Vision Transformer: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- DeiT: Training data-efficient image transformers & distillation through attention (adds a distillation token trained against a strong CNN teacher; loss sketch after this list)
- CaiT: Going deeper with Image Transformers
- DETR: End-to-End Object Detection with Transformers
- RevNet: The Reversible Residual Network: Backpropagation Without Storing Activations (inputs are recomputed from outputs; sketch after this list)
- PEER: Plan, Edit, Explain, Repeat: a collaborative language model from Meta AI
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models (update-rule sketch after this list)
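
A minimal sketch of the angular LSH bucketing behind the Reformer entry: similar vectors on the sphere tend to hash to the same bucket, so full attention can be restricted to within-bucket pairs. The function name and shapes are illustrative assumptions; the real model also sorts and chunks buckets and pairs this with reversible layers.

```python
import torch

def lsh_buckets(x, n_buckets, seed=0):
    # Angular LSH (Reformer-style): project vectors onto random rotations and
    # take the argmax over the concatenated [+proj, -proj] directions.
    g = torch.Generator().manual_seed(seed)
    r = torch.randn(x.shape[-1], n_buckets // 2, generator=g)
    proj = x @ r                                            # (..., n_buckets // 2)
    return torch.cat([proj, -proj], dim=-1).argmax(dim=-1)  # bucket ids
```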
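For the Linformer entry, the O(n^2) => O(n) claim comes from compressing the sequence axis of keys and values to a fixed length before attention. A minimal single-head sketch, assuming learned projection matrices E and F of shape (k_proj, n); names and shapes are illustrative:

```python
import torch

def linformer_attention(q, k, v, E, F):
    # q, k, v: (batch, n, d); E, F: (k_proj, n) learned projections.
    # The score matrix becomes (n x k_proj) instead of (n x n), so the cost
    # is linear in the sequence length n.
    k_low = E @ k                                  # (batch, k_proj, d)
    v_low = F @ v                                  # (batch, k_proj, d)
    scores = q @ k_low.transpose(-2, -1) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v_low          # (batch, n, d)
```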
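For the DeiT entry, "distillation through attention" means a second class token whose output head is supervised by the teacher. A sketch of the hard-distillation variant of the loss; the argument names are illustrative assumptions:

```python
import torch.nn.functional as F

def deit_hard_distill_loss(cls_logits, dist_logits, labels, teacher_logits):
    # Class token learns from ground-truth labels; the extra distillation
    # token learns from the teacher's hard predictions (argmax).
    ce_true = F.cross_entropy(cls_logits, labels)
    ce_teacher = F.cross_entropy(dist_logits, teacher_logits.argmax(dim=-1))
    return 0.5 * (ce_true + ce_teacher)
```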
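The RevNet entry hinges on an additive coupling that makes each block exactly invertible, so activations can be recomputed from outputs during the backward pass instead of being stored. A minimal sketch of the forward/inverse pair; the module names are illustrative:

```python
import torch.nn as nn

class ReversibleBlock(nn.Module):
    # Operates on two feature halves (x1, x2); since inputs are exactly
    # recoverable from outputs, intermediate activations need not be kept.
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):       # recompute inputs from outputs
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2
```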
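For the Adan entry, a per-tensor sketch of my reading of the update rule: Nesterov momentum is estimated from the gradient difference g_k - g_{k-1} rather than an extrapolated parameter point. Moments follow the paper's EMA convention m <- (1 - beta) m + beta x; the default betas, epsilon placement, and decoupled weight decay here are assumptions, so check the official implementation before relying on them.

```python
import torch

def adan_step(p, g, state, lr=1e-3, betas=(0.02, 0.08, 0.01), eps=1e-8, wd=0.0):
    b1, b2, b3 = betas
    zeros = torch.zeros_like(g)
    g_diff = g - state.get("g_prev", g)                 # zero at the first step
    state["m"] = (1 - b1) * state.get("m", zeros) + b1 * g
    state["v"] = (1 - b2) * state.get("v", zeros) + b2 * g_diff
    u = g + (1 - b2) * g_diff                           # corrected gradient estimate
    state["n"] = (1 - b3) * state.get("n", zeros) + b3 * u * u
    state["g_prev"] = g.clone()
    step = (state["m"] + (1 - b2) * state["v"]) / (state["n"].sqrt() + eps)
    return (p - lr * step) / (1 + lr * wd)              # decoupled weight decay
```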