Running long term forecasting on custom data
FarahSaeed opened this issue · 2 comments
Hi
This is about running GPT4TS for long-term forecasting on custom data. The data is weekly public influenza case counts for one region, taken from Our World in Data; the attached file contains it. I ran the illness.sh script with the dataset path replaced and seq_len = 36; all other parameters are the same as in illness.sh. The training loss decreases, but the validation loss increases every epoch, so early stopping ends the run after a few epochs. I also added dropout with a value of 0.3, changed the batch size to 32, and tried learning rates of 0.001 and 1e-5, but the validation loss still increases. With the same parameters the code gives good results on the ILI dataset.
influenza_.csv
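For reference, the run configuration was roughly the following. This is only a sketch: the argument names mirror typical Time-Series-Library-style scripts such as illness.sh, and the paths and column choices are assumptions, not the exact flags; check the actual script for the real names.

```python
# Hypothetical summary of the run configuration described above.
# Names follow common Time-Series-Library / GPT4TS script arguments;
# see the repository's illness.sh for the exact flag names.
config = dict(
    data="custom",               # generic CSV loader instead of the ILI loader
    root_path="./dataset/",      # assumed location of the CSV
    data_path="influenza_.csv",  # the attached file
    features="S",                # single target series (assumption)
    seq_len=36,
    batch_size=32,
    learning_rate=1e-3,          # 1e-5 was also tried
    dropout=0.3,
)
```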
Any insights are highly appreciated.
Thanks!
We haven't actually tested this dataset, so the following are hypotheses about what could be happening:
- The influenza dataset has only one variable and only 700+ time steps. Although a channel-independent mechanism is used, information from the other channels is still learned on multivariate data; a single-variable dataset offers no such extra signal.
- The data has a wide range of scales and even contains some 0 values, which could lead to suboptimal performance (a quick check is sketched below).
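A quick inspection along these lines would confirm both points. This is a sketch; the target column name "cases" is a placeholder, so substitute the actual column in influenza_.csv.

```python
import pandas as pd

# Sketch: inspect series length, value range, and zero counts.
# "cases" is a hypothetical column name; use the real one.
df = pd.read_csv("influenza_.csv")
s = df["cases"]
print("steps:", len(s))                  # ~700 weekly observations?
print("range:", s.min(), "-", s.max())   # wide range of scales?
print("zeros:", (s == 0).sum())          # exact-zero weeks
```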
Thanks for the insights. For comparison, the dataset was checked against the ILI dataset: ILI has around 900+ time steps, so I thought the model should also work on the influenza data. Performance was satisfactory with a single feature on the ILI dataset.
Also, I would guess the range of values should not matter much, because the input to the network is scaled.
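That said, standardization is a linear transform: it normalizes the range but preserves the shape of the distribution, so zero runs and spikes survive scaling. A toy illustration, assuming the loader uses sklearn-style StandardScaler:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy illustration: standardization fixes the range but is linear,
# so the distribution's shape (zero runs, heavy tail) is preserved.
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(300), rng.lognormal(3.0, 1.0, 400)])
z = StandardScaler().fit_transform(x.reshape(-1, 1)).ravel()
print("scaled range:", z.min(), z.max())          # tail is still long
print("zeros collapsed to:", np.unique(z[:300]))  # one constant value
```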
In any case, thank you for the reply; I will explore it further. No need to respond. Feel free to close this issue.