Wrong forecast results
Hello,
Prophet returns some strange values:
3.1.0 :200 > series
=>
{#<Date: 2022-01-03 ((2459583j,0s,0n),+0s,2299161j)>=>1.639,
#<Date: 2022-01-05 ((2459585j,0s,0n),+0s,2299161j)>=>1.649,
#<Date: 2022-01-06 ((2459586j,0s,0n),+0s,2299161j)>=>1.659,
#<Date: 2022-01-07 ((2459587j,0s,0n),+0s,2299161j)>=>1.669,
#<Date: 2022-01-08 ((2459588j,0s,0n),+0s,2299161j)>=>1.659,
#<Date: 2022-01-10 ((2459590j,0s,0n),+0s,2299161j)>=>1.669,
#<Date: 2022-01-11 ((2459591j,0s,0n),+0s,2299161j)>=>1.689,
#<Date: 2022-01-12 ((2459592j,0s,0n),+0s,2299161j)>=>1.679,
#<Date: 2022-01-13 ((2459593j,0s,0n),+0s,2299161j)>=>1.689,
#<Date: 2022-01-14 ((2459594j,0s,0n),+0s,2299161j)>=>1.699,
#<Date: 2022-01-15 ((2459595j,0s,0n),+0s,2299161j)>=>1.699,
#<Date: 2022-01-18 ((2459598j,0s,0n),+0s,2299161j)>=>1.709,
#<Date: 2022-01-20 ((2459600j,0s,0n),+0s,2299161j)>=>1.719,
#<Date: 2022-01-21 ((2459601j,0s,0n),+0s,2299161j)>=>1.729,
#<Date: 2022-01-22 ((2459602j,0s,0n),+0s,2299161j)>=>1.719,
#<Date: 2022-01-25 ((2459605j,0s,0n),+0s,2299161j)>=>1.729,
#<Date: 2022-01-27 ((2459607j,0s,0n),+0s,2299161j)>=>1.739,
#<Date: 2022-01-29 ((2459609j,0s,0n),+0s,2299161j)>=>1.729}
3.1.0 :201 > Prophet.forecast(series)
=>
{#<Date: 2022-01-30 ((2459610j,0s,0n),+0s,2299161j)>=>5.547226954196092,
#<Date: 2022-01-31 ((2459611j,0s,0n),+0s,2299161j)>=>1.7157371604726062,
#<Date: 2022-02-01 ((2459612j,0s,0n),+0s,2299161j)>=>1.7390083969256263,
#<Date: 2022-02-02 ((2459613j,0s,0n),+0s,2299161j)>=>1.7340095632954138,
#<Date: 2022-02-03 ((2459614j,0s,0n),+0s,2299161j)>=>1.7490078239414186,
#<Date: 2022-02-04 ((2459615j,0s,0n),+0s,2299161j)>=>1.749007500166091,
#<Date: 2022-02-05 ((2459616j,0s,0n),+0s,2299161j)>=>1.7390047264672899,
#<Date: 2022-02-06 ((2459617j,0s,0n),+0s,2299161j)>=>5.557226905370591,
#<Date: 2022-02-07 ((2459618j,0s,0n),+0s,2299161j)>=>1.725737111645919,
#<Date: 2022-02-08 ((2459619j,0s,0n),+0s,2299161j)>=>1.7490083480986351}
Clearly the first value and the value for 2022-02-06 are wrong. I suspect an arm64 bug, but I don't have an amd64 machine to test the code on.
Hey @blackrez, I'm seeing similar results on x86-64, so I don't think it's related to ARM.
It looks like the problem is that series doesn't include any Sundays, but the forecast still tries to predict them. If you need predictions for Sundays, make sure to include them in the input. Otherwise, you can filter them from the output.
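As a rough sketch of the filtering option, the snippet below finds which weekdays never appear in the input and drops those dates from the forecast. It assumes the forecast is a Hash keyed by Date, as in the console output above. The helper names missing_weekdays and drop_unseen_weekdays are made up for illustration and are not part of the prophet gem, and the data is a shortened copy of the values posted in this issue.

```ruby
require "date"

# Hypothetical helper: weekday numbers (0 = Sunday .. 6 = Saturday)
# that never occur among the dates of the input series.
def missing_weekdays(series)
  (0..6).to_a - series.keys.map(&:wday).uniq
end

# Hypothetical helper: drop forecast entries that fall on a weekday
# the input series never covered.
def drop_unseen_weekdays(forecast, series)
  unseen = missing_weekdays(series)
  forecast.reject { |date, _value| unseen.include?(date.wday) }
end

# Shortened copy of the input from this issue (Monday to Saturday only).
series = {
  Date.new(2022, 1, 3)  => 1.639,
  Date.new(2022, 1, 5)  => 1.649,
  Date.new(2022, 1, 6)  => 1.659,
  Date.new(2022, 1, 7)  => 1.669,
  Date.new(2022, 1, 8)  => 1.659,
  Date.new(2022, 1, 11) => 1.689
}

# Shortened copy of the forecast from this issue, including both Sundays.
forecast = {
  Date.new(2022, 1, 30) => 5.547226954196092,
  Date.new(2022, 1, 31) => 1.7157371604726062,
  Date.new(2022, 2, 1)  => 1.7390083969256263,
  Date.new(2022, 2, 6)  => 5.557226905370591
}

p missing_weekdays(series) # => [0]

filtered = drop_unseen_weekdays(forecast, series)
filtered.each { |date, value| puts "#{date} #{value}" }
```

With the data above, only the 2022-01-31 and 2022-02-01 entries remain. The same check on the full series is a quick way to see which days of the week the model has no observations for.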
It looks like the Python library has similar behavior.
import pandas as pd
from prophet import Prophet
df = pd.DataFrame({
'ds': ["2022-01-03", "2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08", "2022-01-10", "2022-01-11", "2022-01-12", "2022-01-13", "2022-01-14", "2022-01-15", "2022-01-18", "2022-01-20", "2022-01-21", "2022-01-22", "2022-01-25", "2022-01-27", "2022-01-29"],
'y': [1.639, 1.649, 1.659, 1.669, 1.659, 1.669, 1.689, 1.679, 1.689, 1.699, 1.699, 1.709, 1.719, 1.729, 1.719, 1.729, 1.739, 1.729]
})
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods=10, include_history=False)
forecast = m.predict(future)
print(forecast[['ds', 'yhat']])
Output
ds yhat
0 2022-01-30 -3.960552
1 2022-01-31 1.720297
2 2022-02-01 1.739000
3 2022-02-02 1.731007
4 2022-02-03 1.749000
5 2022-02-04 1.749000
6 2022-02-05 1.739000
7 2022-02-06 -3.950552
8 2022-02-07 1.730297
9 2022-02-08 1.749000
Thanks for your response and your help. My dataset has a lot of issues, including many missing days.
Another option is to disable weekly seasonality with the advanced API:
require "prophet"
df = Rover::DataFrame.new({
"ds" => ["2022-01-03", "2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08", "2022-01-10", "2022-01-11", "2022-01-12", "2022-01-13", "2022-01-14", "2022-01-15", "2022-01-18", "2022-01-20", "2022-01-21", "2022-01-22", "2022-01-25", "2022-01-27", "2022-01-29"],
"y" => [1.639, 1.649, 1.659, 1.669, 1.659, 1.669, 1.689, 1.679, 1.689, 1.699, 1.699, 1.709, 1.719, 1.729, 1.719, 1.729, 1.739, 1.729]
})
m = Prophet.new(weekly_seasonality: false)
m.fit(df)
future = m.make_future_dataframe(periods: 10, include_history: false)
forecast = m.predict(future)
p forecast[["ds", "yhat"]]
Output
ds yhat
2022-01-30 00:00:00 UTC 1.7360273138060855
2022-01-31 00:00:00 UTC 1.7374329821020085
2022-02-01 00:00:00 UTC 1.7388386503979312
2022-02-02 00:00:00 UTC 1.740244318693854
2022-02-03 00:00:00 UTC 1.7416499869897768
2022-02-04 00:00:00 UTC 1.7430556552856997
2022-02-05 00:00:00 UTC 1.7444613235816226
2022-02-06 00:00:00 UTC 1.7458669918775456
2022-02-07 00:00:00 UTC 1.747272660173468
2022-02-08 00:00:00 UTC 1.7486783284693912
Yeah, that could be the best solution for my very inconsistent dataset. Many thanks for your help.