Confused about the R matrix interpretation
RanaElnaggar opened this issue · 4 comments
Hi,
I am confused about the interpretation of the R matrix returned by the online detection algorithm. In the notebook example, the third plot is R[Nw,Nw:-1], which is described as "the probability at each time step for a sequence length of 0, i.e. the probability of the current time step to be a changepoint."
So why do we choose the indices R[Nw,Nw:-1]? Why not R[Nw,:]?
Also, it was mentioned as an example that R[7,3] means the probability at time step 7 of a sequence of length 3, so does R[Nw,Nw:-1] mean that we are taking all the probabilities at time step Nw?
Any suggestions to help me understand the output R?
Thanks
Same question. Confused about R[Nw,Nw:-1], and it does not work in my code...
To be honest: I'm confused as well. I think I mixed up the order of the dimensions in the description. So R[7, 3] means the probability at timestep 3 that the sequence is 7 timesteps long (which is not a very sensible question to ask, but it keeps the example the same). Now R[Nw, Nw:-1] means: give me, for each time step, the probability that the sequence is already Nw steps long.
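To make that indexing concrete, here is a small sketch with a fabricated run-length matrix (the values are dummy data, not output of the actual detector; only the shape and indexing convention are meant to match the description above):

```python
import numpy as np

# A toy run-length matrix shaped like the notebook's R, purely to
# illustrate the indexing (values here are fabricated, not from BOCD):
# R[r, t] = probability that the run is r steps long at timestep t.
T = 20
Nw = 5
rng = np.random.default_rng(0)
R = np.zeros((T + 1, T + 1))
for t in range(T + 1):
    col = rng.random(t + 1)          # the run can be at most t steps long
    R[: t + 1, t] = col / col.sum()  # normalize each column to a distribution

# Row Nw, columns Nw..T-1: for each timestep, the probability that the
# run is already Nw steps long. Columns before Nw are skipped because
# the run cannot be Nw steps long that early (those entries are zero).
p_run_is_Nw = R[Nw, Nw:-1]
print(p_run_is_Nw.shape)
```

This also suggests why the slice starts at column Nw rather than 0: before timestep Nw the entries in row Nw are identically zero.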
Hi,
Thanks for the clarification. Can you also elaborate on your choice of Nw=10? I am also confused about how the following statement relates to the original algorithm: "Because it's very hard to correctly evaluate a change after a single sample of a new distribution, we instead can "wait" for Nw samples and evaluate the probability of a change happening Nw samples prior."
-
Also, according to your description, in order to get the changepoints for a data stream we need to check for data points with R > 0. Is this true?
-
I was also trying to map the algorithm in the original paper to the implementation, and I was wondering what mu, alpha, beta, and kappa in the code correspond to in the paper?
-
I was also wondering why you chose the value of lambda in the hazard function to be equal to 250?
-
Also, I am confused about the difference between the theory in the paper and the implementation. The theory evaluates the predictive marginal distribution for x_t+1, while the code evaluates the probability that the point x_t was a changepoint. Can you clarify how these two quantities are related?
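For what it's worth, my reading of Adams & MacKay's formulation (not the repo author's answer) is that the two quantities are linked by the run-length recursion: the predictive marginal appears as one factor inside the posterior update, and the changepoint probability is the r_t = 0 slice of the resulting posterior:

```latex
% Run-length recursion (message-passing step); the predictive marginal
% P(x_t | r_{t-1}, x_t^{(r)}) is one factor of the update:
P(r_t, x_{1:t}) = \sum_{r_{t-1}}
    P(x_t \mid r_{t-1}, x_t^{(r)}) \,
    P(r_t \mid r_{t-1}) \,
    P(r_{t-1}, x_{1:t-1})

% The changepoint probability at time t is the r_t = 0 entry of the
% normalized run-length posterior:
P(r_t = 0 \mid x_{1:t}) = \frac{P(r_t = 0, x_{1:t})}{\sum_{r_t} P(r_t, x_{1:t})}
```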
Thanks
I think a better way to check for changepoints is by adding a condition like the one below:
thresh = 100
if abs(maxes[i] - maxes[i+1]) > thresh:
    print("possible changepoint!")
I think this would ensure that the changepoint jump is always greater than a desired threshold whenever it occurs. For a more intuitive understanding of BOCD, I have written a blog post here that could help.
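A runnable sketch of that check, under the assumption that `maxes` holds the most likely run length at each timestep (in practice something like `np.argmax(R, axis=0)`; the values below are fabricated so that exactly one reset occurs):

```python
import numpy as np

# Fabricated run-length trace: grows steadily, then resets to 0 once.
maxes = np.concatenate([np.arange(150), np.arange(30)])

# Flag an index as a possible changepoint whenever the most likely
# run length jumps by more than the threshold between adjacent steps.
thresh = 100
changepoints = [
    i + 1
    for i in range(len(maxes) - 1)
    if abs(int(maxes[i]) - int(maxes[i + 1])) > thresh
]
print(changepoints)  # → [150]: the reset from 149 to 0 exceeds the threshold
```

A detail worth noting: the small consecutive increments of a growing run never trip the threshold, so only the abrupt reset is reported.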