This codebase grew over time as a tool for studying the training dynamics of transformer language models. It has supported my experiments with training different model architectures, and it makes it easy to collect time series data for optimization-related metrics.
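
To illustrate what collecting a time series of optimization-related metrics can look like, here is a minimal sketch. It is not this repository's API: `MetricLog`, `record`, and `train_toy` are hypothetical names invented for the example, and a toy quadratic objective stands in for a transformer so the snippet runs with only the standard library.

```python
from collections import defaultdict
import math


class MetricLog:
    """Hypothetical helper: stores (step, value) pairs per metric name."""

    def __init__(self):
        self.series = defaultdict(list)

    def record(self, step, **metrics):
        # Append one time-series point for every metric passed in.
        for name, value in metrics.items():
            self.series[name].append((step, float(value)))


def train_toy(steps=50, lr=0.1):
    # Toy stand-in for a real model: minimize f(w) = 0.5 * ||w||^2
    # with plain gradient descent.
    w = [3.0, -4.0]
    log = MetricLog()
    for step in range(steps):
        grad = list(w)  # the gradient of 0.5 * ||w||^2 is w itself
        loss = 0.5 * sum(x * x for x in w)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        log.record(step, loss=loss, grad_norm=grad_norm)
        w = [x - lr * g for x, g in zip(w, grad)]
    return log


log = train_toy()
```

After the run, `log.series["loss"]` and `log.series["grad_norm"]` each hold one `(step, value)` pair per training step, which can be plotted or written to disk. In a real training loop the same pattern would apply, with the loss and gradient norm taken from the model and optimizer instead of the toy objective.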