Train, ribify and analyse modadd with layernorm
danbraunai-apollo opened this issue · 1 comment
danbraunai-apollo commented
We used to have an lm_rib_build config for modular arithmetic with layer norm, but it used some outdated config values (float32, truncation_threshold=1e-5), and the model was trained with an old version of TransformerLens that used IGNORE=-1e5.
To reproduce with the latest updates, the following needs to be done:
- Train a new model with the standard checked-in config, but with normalization_type = "LNPre".
- Run lm_rib_build on the standard config, changing tlens_model_path to the path of the layer-normed model.
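
As a rough sketch of the two overrides above, assuming the configs are loaded into nested Python dicts: only the keys `normalization_type` and `tlens_model_path` come from this issue. The helper names, the surrounding config layout, and the file paths are hypothetical placeholders, not the repo's actual schema.

```python
from copy import deepcopy
from typing import Any


def override_key(config: dict[str, Any], key: str, value: Any) -> int:
    """Set `key` to `value` wherever it already appears in a nested dict (in place).

    Returns the number of entries that were replaced.
    """
    count = 0
    for k, v in config.items():
        if k == key:
            config[k] = value
            count += 1
        elif isinstance(v, dict):
            count += override_key(v, key, value)
    return count


def with_overrides(config: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with each override applied; fail loudly on unknown keys."""
    new_config = deepcopy(config)
    for key, value in overrides.items():
        if override_key(new_config, key, value) == 0:
            raise KeyError(f"{key!r} not found in config")
    return new_config


# Hypothetical stand-ins for the checked-in configs (layout and values are assumptions).
train_config = {"model": {"normalization_type": None}, "train": {"seed": 0}}
rib_config = {"tlens_model_path": "path/to/standard_model.pt"}

# Step 1: same training config, but with layer norm enabled.
ln_train_config = with_overrides(train_config, {"normalization_type": "LNPre"})

# Step 2: same lm_rib_build config, pointed at the layer-normed model (placeholder path).
ln_rib_config = with_overrides(
    rib_config, {"tlens_model_path": "path/to/layernorm_model.pt"}
)
```

Copying the config before overriding keeps the standard configs untouched, so the layer-norm run differs from the baseline only in the two listed fields.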
I (Dan) still think it's quite important to try to understand a simple, overparameterised model trained with layer norm.
danbraunai-apollo commented
Copied over to #118. Closing here.