Prophet 1.1.5: saving and loading a model leads to incorrect historical data points & possibly predictions
vladchestTELUS opened this issue · 2 comments
I am encountering model corruption on save.
My current approach is: fit the model -> model_to_json -> cloud bucket / filesystem -> model_from_json(f.read()). After loading, I run into what looks like corruption: the model is still usable, but its representation of the historical data is wrong (all of the data points are clumped together in 1970, and predictions also land in 1970, one second apart, instead of following the data interval). Before I save the model, the plot function shows the results I expect (a repeating pattern over the span of the training data in 2023/2024). After saving and loading, that is no longer the case.
Also, once the model is loaded, make_future_dataframe does not produce the expected timestamps: instead of being 1 hour or 15 minutes apart and in 2024, the predicted points are in 1970 and 1 second apart. The data is sub-daily (1 hour / 15 minutes).
None of these issues occur if I skip saving and loading and predict right after fitting; then Prophet behaves as expected.
Is this a bug, or should I change my approach?
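For reference, a minimal sketch of the flow (synthetic hourly data and a local file standing in for the cloud bucket; names and values are illustrative):

import pandas as pd
from prophet import Prophet
from prophet.serialize import model_to_json, model_from_json

# synthetic hourly series standing in for the real 2023/2024 data
df = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=24 * 14, freq="h"),
    "y": range(24 * 14),
})

model = Prophet()
model.fit(df)

# serialize to JSON and write it out (a cloud bucket in the real setup)
with open("model.json", "w") as f:
    f.write(model_to_json(model))

# read it back and deserialize
with open("model.json") as f:
    loaded = model_from_json(f.read())

# after this round trip the history and forecast timestamps come out in 1970,
# one second apart, instead of hourly values in 2024
future = loaded.make_future_dataframe(periods=48, freq="h")
forecast = loaded.predict(future)
loaded.plot(forecast)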
Could it be that you're not stating the interval explicitly in the make_future_dataframe call once you reload the model? The freq argument uses the same notation as pd.date_range().
E.g.:
df_future = model.make_future_dataframe(periods=forecast_periods, freq="30min")
I have no issues when I load my model back from json when I do this. Unsure if this is helpful or not.
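For completeness, a sketch of the reload-then-forecast pattern that works for me (30-minute data; the file path and period count here are just placeholders):

from prophet.serialize import model_from_json

with open("model.json") as f:
    model = model_from_json(f.read())

forecast_periods = 48  # e.g. one day of 30-minute steps
# pass freq explicitly, using the same notation as pd.date_range()
df_future = model.make_future_dataframe(periods=forecast_periods, freq="30min")
forecast = model.predict(df_future)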
Hi, it's good to know that this usually isn't an issue.
I tried multiple approaches, including one similar to the suggestion above:
future = prophets[p]["model"].make_future_dataframe(periods=periodsToPredict, include_history=False)  # resulting timestamps land in 1970, one second apart
but even when I use the make_future_dataframe function, I still get this issue.
For context, the next step is predict, and then I run plot_components and plot.
Another approach I attempted was to use pd.date_range to generate a pandas DataFrame with the correct dates.
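Roughly what that looked like (the start date and frequency here are approximate):

import pandas as pd

# build the future frame manually with explicit hourly timestamps;
# prophets[p]["model"] is the model loaded back from JSON, periodsToPredict as above
future = pd.DataFrame({
    "ds": pd.date_range(start="2024-06-01", periods=periodsToPredict, freq="h")
})
forecast = prophets[p]["model"].predict(future)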
In either case, the plotted data shows an unexpected flat vertical line pattern. This includes the historical data, which I don't think should be affected by how the future dataframe is generated.
To write and load models I am using model_to_json / model_from_json from prophet.serialize:
with open(path, "w") as f:
    f.write(model_to_json(model))
with open(path) as f:
    model = model_from_json(f.read())