Time dependent recovery and death rate
gergelytakacs opened this issue · 6 comments
Dear @ECheynet
Again, not really an issue, more of a discussion.
Are the time-dependent recovery and death rate functions purely your creations, or have you run into something like that in other papers as well? They do make sense as far as your elaborations go, I'm just wondering whether these approximations are accepted in the literature (I'm not an epidemiologist myself, so I have no idea).
I decided to re-do the model you are using, and I left out the time-dependent death rate to approach the "dogmatic" SEIR structure a bit more. However, I find that your recovery rate produces more stable fits, which could be (partly) explained by the additional parameter. When the "original" model by Peng et al. does fit, it produces nearly the same results, but it elongates the infectious curve after the peak much more than your time-dependent recovery rate does. Have you compared the two model versions? How does this idea fare with countries on the "other side" of the active infection curve?
Hi gergelytakacs,
That is a good question. Yes, they are "purely my creation" in the sense that I was simply curious to know what would happen if I implemented time-dependent parameters. I did not check whether the scientific literature includes similar models. As far as I know, Peng et al. do not describe clearly how they implement their model. They mostly focus on the experimental determination of the model parameters.
Initially, I used only intuition to implement kappa and lambda. The basic requirement is that the death rate and recovery rate should converge toward a fixed value as time increases. Otherwise, we may have numerical instabilities. It seems it worked well for the death rate, but the recovery rate was not always so nice. In particular, the second parameter sometimes reached the upper limit given as input to the function lsqcurvefit.
In the last few weeks, I have tried to see what recovery rate would be best suited for the new releases of the code (v.4). For a better model of the recovery rate, I looked at the "estimated" recovery rate from the Chinese provinces. I also looked at the data from Italian regions. The goal was to see what type of simple, empirical model could work. In the end, I found two types of time-dependent recovery rate. In v.4 and later, these two time-dependent recovery rates are implemented. The function fit_SEIRQDP chooses which one is best. One initial issue I had with these two recovery rates is that the fitting could lead to numerical instabilities. This issue should be solved now.
Hello there Etienne ( @ECheynet ) and sorry for the lag in our communication.
I use the Peng et al. model and your original idea on the recovery rate, but not the death rate. I am not using your repo but the System Identification Toolbox, because it gives me some freedom to switch between solvers, settings etc. I am only running Slovakia through this model, so I have a fairly good heuristic sense of what's happening.
I can tell you the following: your old recovery rate formulation works a lot better than the "orthodox" SEIR model from Peng et al. I'll give you a sense with the latest data:
Your empiric recovery rate (before v4!) but fixed death:
Yes, I do run into numerical issues sometimes, so I see your point there. But they are not too terrible. Also, I get a really different character of estimates when using anything other than least-squares. For example, GNA tends to identify a wilder set of parameters, not as conservative as least-squares. I won't give any % fits or FPE values, since the differences are obvious to the naked eye.
Fixed recovery (e.g. Peng et al. "orthodox") from the same "best day":
I have not tried your new / alternate formulations but will do so. I'm also very much interested in adding the following ideas to the model (it probably has been done, I'm not an epidemiologist):
- Fitted protection rate but with data from mobility given by Google.
- Including ICU/hospitalized cases, since these are available in some countries.
- It would also be very interesting to see which recovery formulation predicts a good fit (into the future) from less data. I'm saying that in addition to evaluating how well the model fits an (almost) complete data set, it would be cool to see which formulation allows for the best long-term predictions made from incomplete data.
A bit of comparison for the different time-dependent recoveries:
Your original formulation gamma=gamma0*(1-exp(-gamma1*t));
(before v 4?) and as seen above, but picking the "best" day in a range producing the best fit. Today that would be day 31 (I call recovery rate gamma):
Producing a fit for each state 85.6%, 88.9%, 76.4%
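For readers unfamiliar with these percentages: assuming they are the normalized root-mean-square (NRMSE) fit that System Identification Toolbox reports when comparing simulated and measured outputs, they can be reproduced with the small Python sketch below. The function name `fit_percent` and the sample data are made up for illustration.

```python
import math

def fit_percent(measured, simulated):
    """NRMSE-style fit in percent: 100 * (1 - ||y - yhat|| / ||y - mean(y)||)."""
    mean_y = sum(measured) / len(measured)
    err = math.sqrt(sum((y - yh) ** 2 for y, yh in zip(measured, simulated)))
    spread = math.sqrt(sum((y - mean_y) ** 2 for y in measured))
    return 100.0 * (1.0 - err / spread)

# Made-up data, only to show the scale of the measure.
measured = [0.0, 1.0, 2.0, 3.0, 4.0]
print(fit_percent(measured, measured))      # exact match
print(fit_percent(measured, [2.0] * 5))     # model that only predicts the mean
print(fit_percent(measured, [10.0] * 5))    # model far from the data
```

With this definition an exact match gives 100%, a model no better than the mean of the data gives 0%, and a model worse than the mean gives a negative value.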
The formulation gamma=gamma0/(1+exp(-gamma1*(t-gammaTau)));
produces
for the given day and dataset, with fits 85.3%, 92%, 76.9%. The recovery rate is better both by the fit % and visually. Infections and deaths remain more or less the same.
And finally the formulation gamma=gamma0+exp(-gamma1*(t+gammaTau))
produces a quite abysmal fit:
with 47%, 71% and -300% for each dataset. So that formulation is not really a winner (for me).
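For reference, here are the three recovery-rate formulations compared above, written as plain Python functions rather than the MATLAB expressions used in the thread. This is only a sketch: the function names and the parameter values are illustrative assumptions, not fitted values from any data set.

```python
import math

def gamma_saturating(t, gamma0, gamma1):
    # gamma = gamma0*(1 - exp(-gamma1*t))
    return gamma0 * (1.0 - math.exp(-gamma1 * t))

def gamma_logistic(t, gamma0, gamma1, gamma_tau):
    # gamma = gamma0/(1 + exp(-gamma1*(t - gammaTau)))
    return gamma0 / (1.0 + math.exp(-gamma1 * (t - gamma_tau)))

def gamma_offset_decay(t, gamma0, gamma1, gamma_tau):
    # gamma = gamma0 + exp(-gamma1*(t + gammaTau))
    return gamma0 + math.exp(-gamma1 * (t + gamma_tau))

# Illustrative (not fitted) parameter values.
gamma0, gamma1, gamma_tau = 0.05, 0.1, 20.0

for t in (0.0, 31.0, 1000.0):
    print(t,
          gamma_saturating(t, gamma0, gamma1),
          gamma_logistic(t, gamma0, gamma1, gamma_tau),
          gamma_offset_decay(t, gamma0, gamma1, gamma_tau))
```

For positive gamma1 all three converge to gamma0 at large t. The first two rise toward it from below, while the third starts above gamma0 and decays toward it.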
Also, keep in mind that I'm weighting the various data with diag([120,25,20])
to get a better match for quarantined infections.
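To make the effect of such a weighting concrete, here is a minimal Python sketch of a weighted sum of squared errors with a diagonal weight matrix. It is an assumption about the general shape of the criterion, not the toolbox's exact cost function (which may, for example, normalize by the number of samples), and the residual values are made up.

```python
def weighted_sse(residuals, weights):
    """Sum over samples of e^T W e, with diagonal W given as a list of weights."""
    return sum(w * e * e for row in residuals for w, e in zip(weights, row))

# Diagonal of the weight matrix, one weight per output channel.
weights = [120.0, 25.0, 20.0]

# Made-up residuals: one row per time sample, one column per output channel.
residuals = [
    [1.0, 1.0, 1.0],
    [2.0, 0.0, -1.0],
]

print(weighted_sse(residuals, weights))  # 665.0
```

An error of the same size costs far more in the first channel than in the other two, so the optimizer is pushed to match that channel first.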
Now after this I'm also wondering which of these formulations is best for predicting from less data. This also may be a question of robustness, numerical and otherwise...
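One way to test this would be a hold-out evaluation: fit on the first part of the data only and score the prediction on the days that were held back. The Python sketch below shows the idea with two toy stand-in models. The helper names (`holdout_mse`, `fit_constant`, `fit_line`) and the data are hypothetical, and a real comparison would plug in the SEIR fit with each recovery formulation instead.

```python
def holdout_mse(fit, t, y, n_train):
    """Fit on the first n_train samples, return the mean squared error on the rest."""
    predict = fit(t[:n_train], y[:n_train])
    test_t, test_y = t[n_train:], y[n_train:]
    return sum((predict(ti) - yi) ** 2 for ti, yi in zip(test_t, test_y)) / len(test_y)

def fit_constant(t, y):
    # Toy model 1: always predict the training mean.
    mean_y = sum(y) / len(y)
    return lambda ti: mean_y

def fit_line(t, y):
    # Toy model 2: ordinary least-squares straight line.
    n = len(t)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    return lambda ti: mean_y + slope * (ti - mean_t)

# Made-up data: ten days of a linearly growing quantity.
t = [float(i) for i in range(10)]
y = [2.0 * ti + 1.0 for ti in t]

for name, fit in (("constant", fit_constant), ("line", fit_line)):
    print(name, holdout_mse(fit, t, y, n_train=6))
```

Repeating this for several values of `n_train` would show how quickly each formulation's forecast degrades as less data is available.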
Hi,
yes, I know that the different formulations of the recovery rate can give different results :). That is why only the best formulation, in terms of mean square error, is chosen. I did implement different formulations because, depending on the country/data set, the recovery rate can have a different time-dependency (see the example with the Italian regions).
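As a rough illustration of that selection step (a sketch only, not the actual code of fit_SEIRQDP), choosing between candidate formulations by mean square error could look like this in Python. The candidate names and series are made up.

```python
def mean_square_error(measured, simulated):
    return sum((y - yh) ** 2 for y, yh in zip(measured, simulated)) / len(measured)

def pick_best(measured, candidates):
    """candidates maps a name to a simulated series; return (name, mse) of the lowest MSE."""
    scores = {name: mean_square_error(measured, sim) for name, sim in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

# Made-up measured series and two hypothetical fitted candidates.
measured = [0.0, 1.0, 2.0, 3.0]
candidates = {
    "form_a": [0.0, 1.5, 2.5, 3.0],
    "form_b": [0.0, 1.0, 2.0, 3.5],
}

print(pick_best(measured, candidates))  # ('form_b', 0.0625)
```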
Sure thing, I understand your reasons. I'll also try and see which recovery model is the best for long term predictions. I'll keep the conversation alive here, if you don't mind.
No problem, I'm fine with keeping the conversation alive here