There's a reproducibility issue on Task 19 as well. Running the code for 53041 global steps reports "Best step: 37440 with accuracy = 0.412", i.e., an error of about 0.59, while the repo reports 0 error.
Also, the gradient norms seem to be increasing, which is unusual. Is this expected?
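
In case it helps with checking the gradient-norm trend, here is a minimal sketch of how it could be quantified from logged gradients. It is plain Python and does not use the repo's code: `global_grad_norm` and `trend_slope` are hypothetical helper names, and the gradients are assumed to be available as flat lists of floats, one list per parameter tensor.

```python
import math
from typing import Sequence


def global_grad_norm(grads: Sequence[Sequence[float]]) -> float:
    """L2 norm over all gradient entries, treating each tensor as a flat list."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (positive means rising)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical usage: one global norm per logged step, then the overall trend.
logged_grads = [
    [[0.1, 0.2], [0.3]],  # made-up gradients at the first logged step
    [[0.2, 0.3], [0.4]],  # made-up gradients at the second logged step
    [[0.4, 0.5], [0.6]],  # made-up gradients at the third logged step
]
norms = [global_grad_norm(g) for g in logged_grads]
print("slope of gradient norm per logged step:", trend_slope(norms))
```

A clearly positive slope over the whole run would back up the observation above, while a slope near zero would suggest the increase is just noise between steps.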