Several questions on the DiffCast code
plaovem opened this issue · 3 comments
Thank you for open-sourcing your code. I have several questions about it:
- I ran the pre-trained model you provided, “diffcast_phydnet_sevir128.pt”, on the SEVIR dataset and got an mCSI of 0.3066, which is much higher than the result reported in the paper (0.2757). Is there any difference between this pre-trained model and the one used in the paper? What does the “128” in the model name mean? Did you use a larger batch size, like 128, during training? (For how I compute mCSI, see the sketch after this list.)
- I ran the SimVP training under the default settings, and the resulting score is much higher than the paper's (0.3167 vs. 0.2662). This SimVP even performs better than your provided model “diffcast_phydnet_sevir128.pt”. Is there a reason for this?
- I tried to reproduce your training of DiffCast+PhyDNet, but I only achieved 0.2660, which is lower than the 0.2757 reported in the paper. Is there anything I need to pay attention to?
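For reference, here is roughly how I compute mCSI — a minimal sketch assuming the commonly used SEVIR VIL thresholds; the repo's actual evaluation code may differ:

```python
import numpy as np

# Assumed SEVIR VIL thresholds (the set commonly used in the literature);
# the repo's exact list may differ.
THRESHOLDS = [16, 74, 133, 160, 181, 219]

def csi(pred, target, thr):
    """Critical Success Index = hits / (hits + misses + false alarms)."""
    hits = np.logical_and(pred >= thr, target >= thr).sum()
    misses = np.logical_and(pred < thr, target >= thr).sum()
    false_alarms = np.logical_and(pred >= thr, target < thr).sum()
    denom = hits + misses + false_alarms
    return hits / denom if denom > 0 else 0.0

def mean_csi(pred, target):
    """mCSI: CSI averaged over all thresholds."""
    return float(np.mean([csi(pred, target, t) for t in THRESHOLDS]))
```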
My guess is that the diffusion model is probabilistic, so the test results might vary a little from run to run.
Actually, maybe not: I trained multiple times and got the same results.
@plaovem @ruibing-jin Thank you for your attention to our work. Here are some answers about our code:
- About the pre-trained checkpoint: we obtained our results on a reduced test set for SEVIR (the training set remained unchanged). The specific time period corresponds to the last two months of the dataset, comprising approximately 5,000 samples, depending on the sliding step size (a rough sketch of this split follows this list). Concerning the discrepancy in results, we hypothesize that the precipitation data from the last three months (Oct, Nov, Dec) may deviate from the overall precipitation distribution, leading to poorer performance on the smaller test set. It is important to note that all our experiments were conducted on the same test set, and the primary objective of our paper was to explore the enhancement effect of DiffCast on different deterministic backbones.
- About the batch size: the end-to-end training is expensive, so we set batch_size < 10 depending on the backbone. That said, there are indications that a larger batch size may contribute to better training results, given the stochastic processes involved.
- About the re-trained SimVP: given the more challenging long-sequence forecasting task (5 → 20 frames) compared to the benchmark, different backbones handle the crucial temporal dependencies in different ways, resulting in diverse performance. As mentioned above, the role of DiffCast is to enhance the performance of these different backbones, so it is possible that SimVP > DiffCast+PhyDNet > PhyDNet.
- About the replication: if you could share the details of your replication process and the loss trends during training, we would be happy to discuss this further with you.
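For clarity, a rough sketch of the test-set reduction described above — the catalog path, column name, and cut-off date here are illustrative assumptions, not the exact values in our scripts:

```python
import pandas as pd

# SEVIR ships a catalog CSV with per-event timestamps; the column name and
# path below are assumptions for illustration.
catalog = pd.read_csv("CATALOG.csv", parse_dates=["time_utc"])

CUTOFF = "2019-11-01"  # hypothetical cut-off for the "last two months"
test_events = catalog[catalog["time_utc"] >= CUTOFF]

# How many evaluation samples this yields then depends on the sliding-window
# step used to cut each event into input/target sequences.
print(len(test_events), "events in the reduced test period")
```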
We sincerely apologize for any confusion this may have caused in your replication efforts. If you encounter any questions or challenges, please don't hesitate to reach out.
@earthpimp Thanks for your reply. Indeed, every prediction may differ in local areas due to the stochastic generation, but this has only a slight influence on the final results over the entire test set because of the DDIM process.
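For anyone who wants to check the run-to-run spread themselves, a minimal sketch — `model.sample` is a placeholder for the repo's actual sampling call, not its real API:

```python
import torch

def sample_with_seeds(model, batch, seeds=(0, 1, 2)):
    """Draw a few DDIM samples under fixed seeds.

    With deterministic DDIM sampling (eta = 0), the only randomness is the
    initial noise, so fixing the seed makes each run reproducible; comparing
    scores across a handful of seeds shows how small the spread over the
    full test set actually is.
    """
    preds = []
    for seed in seeds:
        torch.manual_seed(seed)
        with torch.no_grad():
            preds.append(model.sample(batch))  # placeholder sampling call
    return preds
```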