ApolloResearch/rib

Train, ribify and analyse modadd with layernorm

danbraunai-apollo opened this issue · 1 comment

We used to have an lm_rib_build config for modular arithmetic with layer norm, but it used some outdated config values (float32, truncation_threshold=1e-5, and the model was trained with an old transformerlens version that used IGNORE=-1e5).

To reproduce with the latest updates, the following needs to be done:

  1. Train a new model with the standard config as checked in, but with normalization_type = "LNPre".
  2. Run lm_rib_build with the standard config, changing tlens_model_path to the path of the layer-normed model.
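The two config changes above might look roughly like this. This is only a sketch: the field names `normalization_type` and `tlens_model_path` come from this issue, but the surrounding layout and the example path are assumptions about how the repo's configs are structured.

```yaml
# Step 1: training config (hypothetical layout; only normalization_type is from this issue)
model:
  normalization_type: LNPre  # train with layer norm instead of the default

# Step 2: lm_rib_build config (hypothetical layout; only tlens_model_path is from this issue)
tlens_model_path: path/to/layernorm_modadd_model.pt  # point at the newly trained LN model
```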

I (Dan) still think it's quite important to try to understand a simple, overparameterised model trained with layer norm.

Copied over to #118. Closing here.