Can Transformers Learn Optimal Filtering for Unknown Systems?

The code accompanying our paper "Can Transformers Learn Optimal Filtering for Unknown Systems?" by Zhe Du*, Haldun Balim*, Samet Oymak, Necmiye Ozay.

Setup

Training

Creating Plots