mgormley/pacaya

Unclear Algebra semantics in ExplicitFactor

Opened this issue · 1 comments

The constructor for ExplicitFactor suggests that it is a kind of VarTensor that is hard-coded to use RealAlgebra, but the fact that getLogUnormalizedScore directly returns array values suggests that they should be interpreted as log values (and hence implicitly use LogSemiring). Per @mgormley 's suggestion, it would make more sense for ExplicitFactor either to wrap an instance of VarTensor instead extending it or to extend some other multidimensional array class.

Yes, the fact that getLogUnormalizedScore of ExplicitFactor directly returns the value provided by the user makes it very easy for a user to perform BP (belief propagation) with wrong factor values. Here's an example:

Example
A user creates an ExplicitFactor with RealAlgebra. He sets the values in the factor by providing it with numbers in the real domain, such as 1.0. Now, he puts this factor into a factor graph, and runs BP. He is careful, and makes sure that the algebra in the BeliefPropagationPrm is also set as RealAlgebra.
Now an odd thing happens. The user notices that in the initial iteration of BP, the messages sent by the factor contains values like 2.71828, yet he never set any values like that in the factor. It looks like the value 1.0 is replaced by e1.0 for some reason.

A quick step-by-step debugging should reveal that, when the BP code creates the modules for non-global factors, it uses the method edu.jhu.pacaya.gm.inf.BruteForceInferencer.safeNewVarTensor to convert a factor to a VarTensor. This conversion uses the getLogUnormalizedScore method of Factor, and then calls the fromLogProb method of the algebra.

As @drew-reisinger points out, getLogUnormalizedScore of ExplicitFactor directly returns the value provided by the user. In the example above, what the user provided are numbers in the real domain, which does not meet the expectation of safeNewVarTensor.