epiforecasts/EpiNow

Bug in dist_fit?

manulari opened this issue · 1 comment

I am trying to understand the code.

Looking at

target += log(exponential_cdf(up[i] , lambda) - exponential_cdf(low[i] , lambda));

I would expect that the intention is to model each delay as an exponentially distributed random variable that has additionally been rounded or floored.

If this is right, then [low_i,up_i] should be some interval of length one which contains delay_i.
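To make the expectation above concrete, here is a minimal Python sketch of the interval-censored likelihood term, assuming the Stan line computes log(F(up) - F(low)) for an exponential CDF F. The helper names and the rate value are hypothetical and are not taken from EpiNow.

```python
import math

def exponential_cdf(x, rate):
    """CDF of an exponential distribution with the given rate (0 for x <= 0)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def interval_prob(low, up, rate):
    """P(low < delay <= up) for an exponentially distributed delay."""
    return exponential_cdf(up, rate) - exponential_cdf(low, rate)

def interval_log_lik(low, up, rate):
    """Log-likelihood contribution of one interval-censored observation."""
    return math.log(interval_prob(low, up, rate))

rate = 0.5  # hypothetical value, for illustration only

# Unit-length intervals [k, k + 1] tile the positive axis without overlap,
# so their probabilities sum to (almost exactly) 1.
total = sum(interval_prob(k, k + 1, rate) for k in range(40))
```

With intervals of length one, every possible delay falls in exactly one interval, which is the property I would expect if the data were simply floored.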

But the code here

EpiNow/R/dist_fit.R

Lines 27 to 29 in 929fcc0

lows <- delays - 1
lows <- ifelse(lows <=0, 1e-6, lows)
ups <- delays + 1

makes [low_i,up_i] an interval of length 2. So for example the intervals for delay=3 and delay=4 overlap. Is this a bug, or is this intentional for some reason?

Also, is delay=0 a valid input? If yes, then the interval for that case only has length one, which I'd expect to skew things in unintended ways.
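The two observations above can be checked with a small Python port of the three quoted R lines. This is only a sketch: `censoring_interval` is a hypothetical name, and the port assumes the R code is applied elementwise to whole-day delays.

```python
def censoring_interval(delay):
    """Port of the quoted R lines: returns (low, up) for one recorded delay."""
    low = delay - 1
    if low <= 0:
        low = 1e-6  # mirrors ifelse(lows <= 0, 1e-6, lows)
    up = delay + 1
    return low, up

low3, up3 = censoring_interval(3)  # (2, 4)
low4, up4 = censoring_interval(4)  # (3, 5)
low0, up0 = censoring_interval(0)  # (1e-6, 1)

# The intervals for delay=3 and delay=4 share the region between 3 and 4.
overlap = (max(low3, low4), min(up3, up4))

# The interval for delay=0 is only about half as long as the others.
length0 = up0 - low0
length3 = up3 - low3
```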

Hello,

Due to the censoring that occurs when the onset date and the confirmation date are rounded to whole days, you can end up with two situations at the extremes:

  1. Onset at 00:00 on 01/02/20 and confirmation at 23:59 on 04/02/20: this gives a total delay of 3 days, 23 hours and 59 minutes (basically 4 days).
  2. Onset at 23:59 on 01/02/20 and confirmation at 00:00 on 04/02/20: this gives a total delay of 2 days and 1 minute.

So the interval does have length 2: a recorded delay of 3 could plausibly correspond to a true delay of anywhere between 2 and 4 days.
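The two extreme cases can be reproduced with Python's standard `datetime` module. This is an illustrative sketch that assumes the dates are written day/month/year (1 and 4 February 2020) and that the recorded delay is the difference between the two calendar dates; `delays` is a hypothetical helper.

```python
from datetime import datetime

def delays(onset, confirmation):
    """Return (recorded delay in whole days, true delay in fractional days)."""
    recorded = (confirmation.date() - onset.date()).days
    true = (confirmation - onset).total_seconds() / 86400
    return recorded, true

# Case 1: onset at the start of the day, confirmation at the end of the day.
recorded_1, true_1 = delays(datetime(2020, 2, 1, 0, 0), datetime(2020, 2, 4, 23, 59))

# Case 2: onset at the end of the day, confirmation at the start of the day.
recorded_2, true_2 = delays(datetime(2020, 2, 1, 23, 59), datetime(2020, 2, 4, 0, 0))
```

Both cases are recorded as a delay of 3 days, while the true delays sit just below 4 days and just above 2 days respectively.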

The likelihood function we use is explained in: https://www.jstatsoft.org/article/view/2241/0
I am not sure what effect having the interval [0,1], with length 1, has, to be honest with you. The linked article provides an example where the data contain many different interval lengths.

Thanks for getting in touch,
Joel