Code for "How Transformers Learn Causal Structure with Gradient Descent" by Eshaan Nichani, Alex Damian, and Jason D. Lee (ICML 2024).
single_parent.py: Code for single-parent experiments (Figures 2, 6)
multi_parent.ipynb: Code for multi-parent experiments (Figures 5, 7)
tf_with_mlp.ipynb: Code for Figure 8
many_graphs.ipynb: Code for Figure 9
catformer.py: Model helper code
problems.py: Task helper code
plots.py: Plotting helper code