charles-river-analytics/figaro

VE output on disconnected graphs

bruttenberg opened this issue · 6 comments

At the end of VE, all factors are multiplied together and everything is eliminated but the target. When you have a disconnected model though, you end up with an empty factor that could potentially be zero. When this empty factor is multiplied by the non-empty factors, it can zero out the result. A small fix in VE can fix this:

private def makeResultFactor(factorsAfterElimination: MultiSet[Factor[Double]]): Factor[Double] = {
// It is possible that there are no factors (this will happen if there are no queries or evidence).
// Therefore, we start with the unit factor and use foldLeft, instead of simply reducing the factorsAfterElimination.
val nonEmpty = factorsAfterElimination.filter(p => p.variables.nonEmpty) // This line is new
nonEmpty .foldLeft(Factory.unit(semiring))(.product())
}

Not sure I understand this:

  1. Why would we want to multiply disconnected models together? It would seem that you would want to do the elimination only on the one that has the target. The other should have no distribution over the target anyway.
  2. Shouldn't the product method already take care of this? ie shouldn't two factors have variables in common in order to be multiplied? If so, multiplying by an empty factor should be a no-op.

There are definitely other places in the codebase where we compute a product of factors, so a solution that changes the product method is desired.

I don't think we should require factors to have common variables in order to compute their product. But it's unclear to me how an empty factor is treated as the zero factor. Shouldn't we treat it as the unit factor?

I agree about allowing factors to combine without overlapping variables.

The issue of zero factors is, I think, an artifact of our implementation. For dense factors, at least, we default the factor values to 0. Since an empty factor does not set any values, and the cell index is always 0, the probability will always be 0.

I just realized a fix for this never got pushed into Figaro 5. It's sitting on my machine now (the fix described at the top of this issue).

Do we want me to push it into the new 5 dev branch?