prove something about approximation for full likelihood?
This is harder than the case where we don't condition on the final sequence, as demonstrated by the following two examples:
- TASEP: let `x` be a string of `0`s with a `1` at the left-hand end, and `y` be a string of `0`s with a `1` at the right-hand end. If both these strings are of length `n`, then `p(y[n] | y[k:n-1], x[k:n])` is zero (or small, if entering from the boundary is allowed), while `p(y[n] | y[1:n-1], x[1:n])` is in fact very large: conditioning on there being a particle at position 1 at the start, and on there being no particles at positions `y[1:n-1]` at the end, implies that the particle must be at `y[n]` (or further to the right). So we get a completely wrong answer by conditioning on a shorter sequence, but such situations are also very unlikely. (A small numerical sketch follows this list.)
- See attached photo. Suppose the transition between the two sequences (AGC -> CTT) could have happened through either of the two scenarios shown at the top and bottom, with the rates listed on the right. This is an example where a mutation (the A->C at the leftmost site) affects the probability of an earlier one (the G->C in the top scenario). This means that information is propagating backwards in time as well as forwards, and the conditional probability of the rightmost site can be very different depending on whether the outcome at the leftmost site is included in the conditioning or not. (A toy sketch with made-up rates follows this list.)
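
A minimal numerical sketch of the TASEP example, assuming a closed-boundary TASEP on `n` sites with unit hop rate and a fixed elapsed time `t`; the helper names (`tasep_generator`, `p_last_site_occupied`) are illustrative, not from any codebase. It compares the conditional probability of a particle at the last site under the full chain with the same quantity computed on a truncated all-zero window treated as its own closed system:

```python
import itertools
import numpy as np
from scipy.linalg import expm

def tasep_generator(n, rate=1.0):
    """Generator matrix Q over all 2^n occupation configurations.

    A particle at site i hops to site i+1 with `rate` when site i+1 is
    empty; closed boundaries, so no particles enter or leave.
    """
    states = list(itertools.product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for s in states:
        for i in range(n - 1):
            if s[i] == 1 and s[i + 1] == 0:
                hop = list(s)
                hop[i], hop[i + 1] = 0, 1
                Q[index[s], index[tuple(hop)]] += rate
    Q[np.diag_indices_from(Q)] = -Q.sum(axis=1)
    return Q, index

def p_last_site_occupied(x, t=5.0):
    """p(y[n] = 1 | y[1:n-1] = 0, x) for the chain started at x."""
    n = len(x)
    Q, index = tasep_generator(n)
    p_y = expm(Q * t)[index[tuple(x)]]              # distribution over y given x
    p_hit = p_y[index[tuple([0] * (n - 1) + [1])]]  # y = 0...01
    p_empty = p_y[index[tuple([0] * n)]]            # y = 0...00
    return p_hit / (p_hit + p_empty)

n = 5
# Full conditioning: x has a particle at position 1 and y[1:n-1] is all
# zeros, so (particles being conserved) the particle must be at y[n].
print(p_last_site_occupied([1] + [0] * (n - 1)))    # -> 1.0

# Truncated window x[k:n] = all zeros, treated as its own closed system:
# no particle can ever reach the last site, so the probability is 0.
print(p_last_site_occupied([0] * (n - 2)))          # -> 0.0
```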
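
And a toy sketch of the second example, with a made-up context rule and made-up rates (not the ones in the attached photo): the substitution rate at a site is boosted whenever the site immediately to its left is currently a `C`, so the outcome at the leftmost site carries information about the rightmost one even when the middle site's outcome is already fixed. All names and rate values here are hypothetical.

```python
import itertools
import numpy as np
from scipy.linalg import expm

ALPHABET = "ACGT"

def substitution_matrix(n_sites=3, base_rate=1.0, context_boost=10.0, t=1.0):
    """P(t) over all length-n_sites strings: each site mutates to any other
    letter with base_rate, multiplied by context_boost when the site
    immediately to its left is currently 'C' (hypothetical context rule)."""
    states = ["".join(s) for s in itertools.product(ALPHABET, repeat=n_sites)]
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))
    for s in states:
        for i in range(n_sites):
            rate = base_rate * (context_boost if i > 0 and s[i - 1] == "C" else 1.0)
            for a in ALPHABET:
                if a != s[i]:
                    Q[index[s], index[s[:i] + a + s[i + 1:]]] += rate
    Q[np.diag_indices_from(Q)] = -Q.sum(axis=1)
    return expm(Q * t), states, index

P, states, index = substitution_matrix()
p_y = P[index["AGC"]]                          # distribution over y given x = "AGC"

def prob(pred):
    """Probability that the final sequence y satisfies pred."""
    return sum(p for p, s in zip(p_y, states) if pred(s))

# Leftmost outcome included: p(y[3] = 'T' | y[1] = 'C', y[2] = 'T', x = 'AGC')
with_left = prob(lambda s: s == "CTT") / prob(lambda s: s[:2] == "CT")
# Leftmost outcome marginalised out: p(y[3] = 'T' | y[2] = 'T', x = 'AGC')
without_left = prob(lambda s: s[1:] == "TT") / prob(lambda s: s[1] == "T")
print(with_left, without_left)   # these generally differ once rates are context-dependent
```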