Repository for studying training dynamics of transformer LMs

Primary Language: Python

Saturate

A codebase, developed over time, for studying the training dynamics of transformer language models. It has supported my experiments with training different model architectures, and it makes it easy to collect time series data for optimization-related metrics.
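As a rough illustration of what collecting time series data for optimization metrics can look like, here is a minimal, dependency-free sketch. It is not this repository's API: `MetricRecorder`, `l2_norm`, and `train_toy_model` are hypothetical names, and the "model" is a toy quadratic objective standing in for a transformer.

```python
import math
from collections import defaultdict


class MetricRecorder:
    """Hypothetical helper: keeps one list of (step, value) pairs per metric name."""

    def __init__(self):
        self.series = defaultdict(list)

    def log(self, step, **metrics):
        # Append each named metric to its own time series.
        for name, value in metrics.items():
            self.series[name].append((step, value))


def l2_norm(vector):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vector))


def train_toy_model(steps=5, lr=0.1):
    # Toy stand-in for a model: minimize 0.5 * ||w||^2, whose gradient is w itself.
    weights = [1.0, -2.0, 0.5]
    recorder = MetricRecorder()
    for step in range(steps):
        grads = list(weights)
        loss = 0.5 * sum(w * w for w in weights)
        # Record the metrics before applying the update for this step.
        recorder.log(
            step,
            loss=loss,
            grad_norm=l2_norm(grads),
            param_norm=l2_norm(weights),
        )
        # Plain gradient descent update.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return recorder


recorder = train_toy_model()
for step, value in recorder.series["param_norm"]:
    print(step, round(value, 4))
```

Keeping each metric as its own list of `(step, value)` pairs makes it straightforward to plot or compare series across runs; a real training loop would log from the optimizer step of an actual model instead of this toy objective.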