Zero-shot performance is not reproducible
Dear Authors,
Thanks for your work! Following your zero-shot setting (lookback length 672, forecast length 96), my results do not match those reported in Table 18 of your paper, even though I am running your official code.
Below are the MSEs:
Dataset | Timer (reproduced) | Timer-1B | Timer-16B | Timer-28B |
---|---|---|---|---|
ETTh1 | 0.454 | 0.438 | 0.364 | 0.393 |
traffic | 0.479 | 0.458 | 0.399 | 0.414 |
weather | 0.190 | 0.181 | 0.203 | 0.243 |
electricity | 0.210 | 0.192 | 0.139 | 0.147 |
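
For context, here is a minimal sketch of how an MSE under this setting (lookback 672, forecast 96) could be computed with sliding windows. It is not the official evaluation code: `sliding_window_mse` and `repeat_last_value` are hypothetical names, the forecaster is a naive placeholder rather than Timer, and details that this thread does not specify (normalization, the test split, the window stride) are left as assumptions that can change the resulting number.

```python
LOOKBACK = 672  # lookback length quoted in this issue
HORIZON = 96    # forecast length quoted in this issue


def sliding_window_mse(series, forecast_fn, lookback=LOOKBACK, horizon=HORIZON, stride=1):
    """Average the per-window MSE over all (lookback, horizon) windows of a 1-D series.

    `forecast_fn` is a hypothetical callable: it receives `lookback` values
    and must return `horizon` predicted values.
    """
    n_windows = len(series) - lookback - horizon + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than lookback + horizon")

    window_errors = []
    for start in range(0, n_windows, stride):
        split = start + lookback
        context = series[start:split]
        target = series[split:split + horizon]
        pred = forecast_fn(context)
        if len(pred) != horizon:
            raise ValueError("forecast_fn must return exactly `horizon` values")
        squared = [(p - t) ** 2 for p, t in zip(pred, target)]
        window_errors.append(sum(squared) / horizon)

    return sum(window_errors) / len(window_errors)


def repeat_last_value(context, horizon=HORIZON):
    """Naive placeholder forecaster (NOT Timer): repeats the last observed value."""
    return [context[-1]] * horizon
```

Swapping `repeat_last_value` for a wrapper around the checkpoint, and running it on identically preprocessed data, would make it easier to tell whether a gap comes from the weights or from the evaluation protocol.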
Could you provide more information about your released `Timer_forecast_1.0.ckpt`?
Same question: Could you please clarify the dataset size that the provided checkpoint was pretrained on?
Hi, we have released the model on Hugging Face, where you can evaluate it by following the provided pipeline.
Please refer to the appendix of our paper, where the details of datasets and configurations are provided.
Are there any different hyperparameters for the new checkpoint? Also, could you update the reported metrics?