prove something about approximation for full likelihood?
This is harder than the case where we don't condition on the final sequence, as demonstrated by the following two examples:
- TASEP: let `x` be a string of `0`s with a `1` at the left-hand end, and `y` be a string of `0`s with a `1` at the right-hand end. If both these strings are of length `n`, then `p(y[n] | y[k:n-1], x[k:n])` is zero (or small, if entering from the boundary is allowed), while `p(y[n] | y[1:n-1], x[1:n])` is in fact very large: conditioning on there being a particle at position 1 at the start, and on there being no particles at positions `y[1:n-1]` at the end, implies that the particle must be at `y[n]` (or further to the right). So we get a completely wrong answer by conditioning on a shorter sequence, but such situations are also very unlikely. (A small numerical sketch follows this list.)
- See attached photo. Suppose the transition between the two sequences (AGC -> CTT) could have happened through either of the two scenarios shown at the top and bottom, with the rates listed on the right. This is an example where a mutation (the A->C at the leftmost site) affects the probability of an earlier one (the G->C in the top scenario). This means that information is propagating backwards in time as well as forwards, and the conditional probability of the rightmost site can be very different depending on whether the outcome at the leftmost site is included in the conditioning or not. (A toy sketch with made-up rates follows this list.)
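
A minimal numerical sketch of the TASEP example, assuming a closed-boundary TASEP on `n` sites with unit hop rate and a fixed elapsed time `t`; the helper names (`tasep_generator`, `p_last_site_occupied`) are illustrative, not from any codebase. It compares the conditional probability of a particle at the last site under the full chain with the same quantity computed on a truncated all-zero window treated as its own closed system:

```python
import itertools
import numpy as np
from scipy.linalg import expm

def tasep_generator(n, rate=1.0):
    """Generator matrix Q over all 2^n occupation configurations.

    A particle at site i hops to site i+1 with `rate` when site i+1 is
    empty; closed boundaries, so no particles enter or leave.
    """
    states = list(itertools.product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for s in states:
        for i in range(n - 1):
            if s[i] == 1 and s[i + 1] == 0:
                hop = list(s)
                hop[i], hop[i + 1] = 0, 1
                Q[index[s], index[tuple(hop)]] += rate
    Q[np.diag_indices_from(Q)] = -Q.sum(axis=1)
    return Q, index

def p_last_site_occupied(x, t=5.0):
    """p(y[n] = 1 | y[1:n-1] = 0, x) for the chain started at x."""
    n = len(x)
    Q, index = tasep_generator(n)
    p_y = expm(Q * t)[index[tuple(x)]]              # distribution over y given x
    p_hit = p_y[index[tuple([0] * (n - 1) + [1])]]  # y = 0...01
    p_empty = p_y[index[tuple([0] * n)]]            # y = 0...00
    return p_hit / (p_hit + p_empty)

n = 5
# Full conditioning: x has a particle at position 1 and y[1:n-1] is all
# zeros, so (particles being conserved) the particle must be at y[n].
print(p_last_site_occupied([1] + [0] * (n - 1)))    # -> 1.0

# Truncated window x[k:n] = all zeros, treated as its own closed system:
# no particle can ever reach the last site, so the probability is 0.
print(p_last_site_occupied([0] * (n - 2)))          # -> 0.0
```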
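
And a toy sketch of the second example, with a made-up context rule and made-up rates (not the ones in the attached photo): the substitution rate at a site is boosted whenever the site immediately to its left is currently a `C`, so the outcome at the leftmost site carries information about the rightmost one even when the middle site's outcome is already fixed. All names and rate values here are hypothetical.

```python
import itertools
import numpy as np
from scipy.linalg import expm

ALPHABET = "ACGT"

def substitution_matrix(n_sites=3, base_rate=1.0, context_boost=10.0, t=1.0):
    """P(t) over all length-n_sites strings: each site mutates to any other
    letter with base_rate, multiplied by context_boost when the site
    immediately to its left is currently 'C' (hypothetical context rule)."""
    states = ["".join(s) for s in itertools.product(ALPHABET, repeat=n_sites)]
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for s in states:
        for i in range(n_sites):
            rate = base_rate * (context_boost if i > 0 and s[i - 1] == "C" else 1.0)
            for a in ALPHABET:
                if a != s[i]:
                    Q[index[s], index[s[:i] + a + s[i + 1:]]] += rate
    Q[np.diag_indices_from(Q)] = -Q.sum(axis=1)
    return expm(Q * t), states, index

P, states, index = substitution_matrix()
p_y = P[index["AGC"]]                          # distribution over y given x = "AGC"

def prob(pred):
    """Probability that the final sequence y satisfies pred."""
    return sum(p for p, s in zip(p_y, states) if pred(s))

# Leftmost outcome included: p(y[3] = 'T' | y[1] = 'C', y[2] = 'T', x = 'AGC')
with_left = prob(lambda s: s == "CTT") / prob(lambda s: s[:2] == "CT")
# Leftmost outcome marginalised out: p(y[3] = 'T' | y[2] = 'T', x = 'AGC')
without_left = prob(lambda s: s[1:] == "TT") / prob(lambda s: s[1] == "T")
print(with_left, without_left)   # these generally differ once rates are context-dependent
```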