How to set the hyperparameters of the AdamW optimizer? How to use global vectors?
Yang-Changhui opened this issue · 4 comments
When I train on the ICAR-ENSO dataset, I found that the AdamW hyperparameter settings in the paper are different from the settings in the earthformer_enso_v1.yaml file. Which one should be used? If I want to use global vectors, should I modify those parameters? Thank you.
Thank you for bringing up this issue. Your observation is correct regarding the default config on ICAR-ENSO, which slightly differs from that of SEVIR, N-body MNIST, and Moving MNIST. We have noticed that using `lr=1e-3` and `num_global_vectors=8` for training is rather unstable. Therefore, we have released a more stable config for better reproducibility.
Thank you for your question. Yes, `num_global_vectors=0` indicates not using global vectors.
The training of Earthformer with global vectors (`num_global_vectors=8`) on ICAR-ENSO is rather unstable. To help alleviate this, I would recommend initializing some parameters to zero, similar to what we did in our PreDiff that used Earthformer.
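For readers unsure what zero initialization looks like in practice, here is a minimal PyTorch sketch. It is not taken from the Earthformer or PreDiff code: `ToyGlobalBlock`, `zero_init_`, and the choice to zero only the output projection are assumptions made for illustration, and the thread does not say which Earthformer parameters should be zeroed. The idea shown is that a residual branch whose last layer starts at zero contributes nothing at the first step, so the block begins as an identity mapping.

```python
import torch
from torch import nn


def zero_init_(module: nn.Module) -> nn.Module:
    """Zero the weight and bias of every nn.Linear inside `module`, in place.

    Hypothetical helper for illustration; not part of Earthformer.
    """
    for m in module.modules():
        if isinstance(m, nn.Linear):
            nn.init.zeros_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)
    return module


class ToyGlobalBlock(nn.Module):
    """Hypothetical residual block standing in for a branch that is unstable early on."""

    def __init__(self, dim: int):
        super().__init__()
        self.mix = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: output = input + branch(input)
        return x + self.out_proj(torch.relu(self.mix(x)))


block = ToyGlobalBlock(dim=16)
# Assumption: zero only the last projection, so the branch outputs zero at
# initialization while out_proj still receives gradients and can learn.
zero_init_(block.out_proj)

x = torch.randn(2, 16)
y = block(x)  # equals x at initialization because the branch outputs zero
```

Which layers to zero in the real model is a design decision to check against the PreDiff source; the sketch only demonstrates the mechanism.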
Thank you.