Schedule retrain an existing prophet model
AyushBhardwaj321 opened this issue · 6 comments
I am using MLflow to log a Prophet model; let's call it model1. Logging works, versioning works, and prediction works fine.
Since model1 was trained I have received one more week of data, so I want a model that reflects both the historical data model1 was trained on and the new data.
One way to do that is to combine all the data and train on it again, which is time consuming.
Since I already have a trained Prophet model:
Is there a way to retrain it weekly with only the new week's data?
Using MLflow I can log and load the model, but I don't know how to retrain it with new data.
Are you sure it is worth logging the Prophet model in the first place?
"Training" the Prophet model actually takes less than 1s, especially if you set interval_width=0 or mcmc_samples=0.
IMO, just retrain the model from scratch every time you have a new data point. No need to use MLflow or any model lifecycle management system.
@imad24 Thanks for replying.
The reason I want to retrain from the previously trained model is that the initial (historical) dataset will have over a million rows, and training on that much data takes longer than 1 second. That is why I want to use the previously trained model as the starting point for a new model that keeps what the old one learned; technically it might take less time than training a whole model from scratch.
I'm not sure I understand. Prophet is not exactly like other "classical" supervised ML models.
When you train a Prophet model, you do it on a single univariate time series.
It doesn't support global modeling, where you can have one model for multiple time series.
So what do you mean by
the dataset will over a million for the first time
@imad24 Thanks for the quick response, I really appreciate it.
What I mean is that I have a large dataset of about 2.5 million rows,
which looks like this:
                                ds    y
0        2020-10-21 12:57:47+00:00  0.0
1        2020-10-21 12:57:48+00:00  0.0
2        2020-10-21 12:57:49+00:00  0.0
3        2020-10-21 12:57:50+00:00  0.0
4        2020-10-21 12:57:51+00:00  0.0
...                            ...  ...
2591996  2020-11-20 12:57:43+00:00  7.0
2591997  2020-11-20 12:57:44+00:00  7.0
2591998  2020-11-20 12:57:45+00:00  7.0
2591999  2020-11-20 12:57:46+00:00  6.0
2592000  2020-11-20 12:57:47+00:00  6.0
[2592001 rows x 2 columns]
I want to train my first model on the historical data; the only issue is that it takes a long time. Once that model is created (let's call it Model_v1), I also have a real-time data source delivering new data, say every minute, so I want to retrain the model on a weekly or monthly basis. If I have to train from scratch each time, it will again take a lot of time. So I want to use Model_v1, which has already learned the historical pattern, as a base model, fit the new values on top of it, and create a new model; let's call it Model_v2.
Oh, I see now: you're working with a high-frequency time series (one observation per second in this case).
In this case you're right, it does make sense to use warm-start training.
Have you tried the approach explained in the documentation?
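For readers who don't have the link handy, here is a minimal sketch of warm starting along the lines of the Prophet documentation's section on updating fitted models. The helper copies the fitted parameters out of the old model and hands them to the new fit through the `init` keyword. The name `retrain_warm` and its arguments are only illustrative. Note that, as far as I understand it, the new fit still runs over the full dataframe (history plus new rows); warm starting just gives the optimizer a better starting point, and it assumes both models share the same specification (changepoints, seasonalities, regressors).

```python
def warm_start_params(m):
    """Collect the fitted parameters of Prophet model `m` as initial values.

    Only reads `m.params` and `m.mcmc_samples`, which a fitted Prophet
    object exposes.
    """
    res = {}
    # Scalar parameters: trend growth rate, trend offset, observation noise.
    for pname in ["k", "m", "sigma_obs"]:
        if m.mcmc_samples == 0:
            res[pname] = m.params[pname][0][0]
        else:
            res[pname] = m.params[pname].mean()
    # Vector parameters: changepoint adjustments and seasonality/regressor weights.
    for pname in ["delta", "beta"]:
        if m.mcmc_samples == 0:
            res[pname] = m.params[pname][0]
        else:
            res[pname] = m.params[pname].mean(axis=0)
    return res


def retrain_warm(old_model, full_df):
    """Fit a fresh Prophet model on `full_df`, initialised from `old_model`.

    `full_df` is assumed to hold the old history plus the new rows,
    with the usual `ds` and `y` columns.
    """
    from prophet import Prophet  # imported here so the helper above has no hard dependency

    return Prophet().fit(full_df, init=warm_start_params(old_model))
```

Usage would be something like `model_v2 = retrain_warm(model_v1, df_all)`, where `model_v1` is the already fitted model and `df_all` is a hypothetical dataframe containing every observation so far.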
@imad24 Hi, thanks a lot for the quick reply.
Yes, I looked at the solution in the link and tried it out as well, but as I mentioned earlier I'm using Prophet with MLflow to manage the end-to-end model lifecycle. I am able to log the model with airflow, but when I load the model back I am unable to use the function described there.
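One possible cause, offered as a guess since the loading code isn't shown in this thread: if the model is loaded through the generic `mlflow.pyfunc.load_model`, the result is a wrapper that mainly exposes `predict`, not a Prophet object, so a helper that reads the fitted parameters has nothing to work with. MLflow's Prophet flavor has its own `mlflow.prophet.load_model`, which should return the native Prophet model. The sketch below wraps that call and checks that the loaded object looks like a fitted Prophet model. The function name, the `loader` argument, and the example URI are hypothetical.

```python
def load_native_prophet(model_uri, loader=None):
    """Load a logged model and make sure it looks like a native Prophet object.

    `loader` is an optional override (an assumption for illustration and
    testing); by default the Prophet flavor loader from MLflow is used.
    """
    if loader is None:
        import mlflow.prophet  # imported lazily; needs MLflow with the Prophet flavor

        loader = mlflow.prophet.load_model
    model = loader(model_uri)
    # A warm-start helper reads these attributes of a fitted Prophet model.
    missing = [attr for attr in ("params", "mcmc_samples") if not hasattr(model, attr)]
    if missing:
        raise TypeError(
            f"Loaded object of type {type(model).__name__} lacks {missing}; "
            "it is probably a generic wrapper rather than a Prophet model."
        )
    return model
```

For example, `model_v1 = load_native_prophet("models:/my_prophet_model/1")`, where the registry URI is made up; substitute whatever URI the model was actually logged under.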