google-deepmind/PGMax

Thank you for the nice library and some questions

Closed this issue · 4 comments

Dear PGMax Team,

first of all: thank you very much for the library! It works super well, and we found it to be orders of magnitude faster than other libraries. Furthermore, the flattened potentials are super neat :)

If you don't mind, I have a few questions and would be very grateful if you could provide some pointers:

  • prob == 0: we have unary potentials that can be (numerically) zero. This leads to cases where the prediction seems to completely break down for some variables, even though there are variable -> state assignments that would not hit prob == 0. Initially, we replaced the log potentials at -np.inf with your pgmax.utils.NEG_INF = -1e20. While this fixed some issues, we still got random assignments in some cases; these seem to disappear if we use, say, -20 as the smallest unary potential (see the short sketch after this list). I have a hunch that this happens specifically for temperature=0 when getting map_states, but I couldn't investigate further. What is the guidance on this? Could you point us towards a paper with experiments on this? Is there an intuitive explanation for why LBP fails in this case?

  • Initialization: currently, we use bp_arrays = bp.run_bp(bp.init(), num_iters=num_iters, damping=damping) as given in most tutorials. This seems to initialize the log_potentials to 0, right? Is there some way we could use prior knowledge here, e.g. initialize with the most likely states given the unary potentials?

  • If you have any other papers that give good hints on how to squeeze the most out of LBP, I would be happy to read them.
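
For reference, here is roughly the workaround we apply at the moment (just a sketch; probs and clip_val stand in for our actual probability arrays and floor value):

import numpy as np

# Sketch: instead of letting zero probabilities become -inf log potentials,
# we clip the unaries at a finite floor (probs / clip_val are placeholders).
probs = np.array([[0.0, 0.7, 0.3], [0.5, 0.5, 0.0]])  # some entries are exactly 0
clip_val = -20.0  # empirically behaves better for us than -1e20 or -np.inf
with np.errstate(divide="ignore"):
    log_unaries = np.log(probs)
log_unaries = np.clip(log_unaries, a_min=clip_val, a_max=None)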

Best and Thanks
Lukas

Hi Lukas,
Thank you for using PGMax and for your positive feedback. Please find an answer to your questions below:

  • The log potentials are initialized when the factors / factor groups are added to the factor graph. For instance, in the Ising model notebook https://github.com/deepmind/PGMax/blob/main/examples/ising_model.ipynb the log_potential_matrix argument specifies the log potentials of the PairwiseFactorGroup (which, in this case, are shared across all the pairwise factors):
factor_group = fgroup.PairwiseFactorGroup(
    variables_for_factors=variables_for_factors,
    log_potential_matrix=0.8 * np.array([[1.0, -1.0], [-1.0, 1.0]]),
)
fg.add_factors(factor_group)

The log potentials are only set to 0 if no initialization is given: see https://github.com/deepmind/PGMax/blob/54baa4a2c550e396fc1ff946cb09976f078ea6ff/pgmax/fgroup/enum.py#L65 for EnumFactorGroup and https://github.com/deepmind/PGMax/blob/54baa4a2c550e396fc1ff946cb09976f078ea6ff/pgmax/fgroup/enum.py#L240 for PairwiseFactorGroup
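
For completeness, here is a small sketch of passing explicit log potentials to an EnumFactorGroup instead of relying on the zero default (this reuses variables_for_factors and fg from above and assumes one log potential per allowed configuration):

import numpy as np
from pgmax import fgroup

# Sketch: an enumeration factor group over pairs of binary variables where the
# log potentials are given explicitly instead of defaulting to 0.
factor_configs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # allowed joint states
enum_factor_group = fgroup.EnumFactorGroup(
    variables_for_factors=variables_for_factors,
    factor_configs=factor_configs,
    log_potentials=np.array([1.0, -1.0, -1.0, 1.0]),  # one value per configuration
)
fg.add_factors(enum_factor_group)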

  • Regarding prob == 0: one option is to set the evidence (i.e. the unary potentials) directly. For instance, this is how the evidence on the X variables is set in one of our notebooks:

from scipy.special import logit  # logit(p) = log(p / (1 - p))

uX = np.zeros(X_gt.shape + (2,))
uX[..., 0] = (2 * X_gt - 1) * logit(pX)
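
A rough sketch of how such an evidence array can then be passed to inference (X_variables below is a placeholder for the corresponding variable group in the graph):

bp_arrays = bp.init(evidence_updates={X_variables: uX})  # X_variables: placeholder variable group
bp_arrays = bp.run_bp(bp_arrays, num_iters=num_iters, damping=damping)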

Let us know if this helps,
Best,
Antoine

Hi @antoine-dedieu,

thank you for the quick answer, but a few questions are still open:

  1. Is there any way to initialize pgmax so that it converges to a lower energy and/or converges faster? We have seen cases where we can manually provide map_states with a lower energy than the inference result, and we would like to make use of this.

  2. Are there known issues with very small log potentials? I see in your notebook that you use logit(pX) = logit(1e-100) ≈ -230; is this the smallest recommended unary?

  3. Are factors between 3 variables possible?

Thank you :)

  1. Mmmmh, why do you want to run inference if you already know all your variable assignments? Or do you want to fix some variables and infer the remaining ones?
    If the latter, you can set the evidence of the variables that you want to condition on similarly to the uX I shared (see the first sketch after this list).

    Also, when you say that you were manually specifying a solution with a lower energy than PGMax, were you computing the energy with compute_energy? If so, could you please rebase and try again? The sign of the energy returned by this function was flipped; we have fixed this.

    Regarding the initialization, you can try playing with the messages initialization and see if it improves inference -- as an analogy, you can think of it as changing the starting point x_0 when you run gradient descent.

    Finally, the inference speed is determined by the num_iters parameter in run_bp.
    Also note that loopy BP is not guaranteed to converge in undirected models with loops.

  2. Sure, you can set the evidence to NEG_INF.
    To go into the details, using logit(1e-100) here is equivalent to having a noisy channel between the variables X and the pixels, which we found to be useful as it leaves more flexibility during inference (i.e. a very small subset of the X variables may be decoded differently from the pixels, which may lead to a better solution).

  3. You can certainly define factors for as many variables as you want!
    See for instance the ORFactorGroup defined in https://github.com/deepmind/PGMax/blob/main/examples/pmp_binary_deconvolution.ipynb. You can do the same with an EnumFactorGroup -- note that factor_configs would need to be of the corresponding size (see the second sketch after this list).
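
For 1., a rough sketch of conditioning on variables you already know via the evidence, before running BP as usual (X_variables, i, j are placeholders, and the shape reuses X_gt from the snippet above, assuming binary variables):

import numpy as np
from pgmax import infer

# Sketch: clamp variable (i, j) to state 1 by heavily penalizing its state 0
# in the evidence, then run BP and decode the MAP states as usual.
evidence = np.zeros(X_gt.shape + (2,))
evidence[i, j, 0] = -1e20  # effectively forbids state 0 for variable (i, j)
bp_arrays = bp.init(evidence_updates={X_variables: evidence})
bp_arrays = bp.run_bp(bp_arrays, num_iters=num_iters, damping=damping)
map_states = infer.decode_map_states(bp.get_beliefs(bp_arrays))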
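
And for 3., a minimal sketch of an enumeration factor connecting three binary variables (var_a, var_b, var_c are placeholders for variables already added to the factor graph fg):

import itertools
import numpy as np
from pgmax import fgroup

# Sketch: one enumeration factor over three binary variables, with one log
# potential per joint configuration (2**3 = 8 configurations here).
factor_configs = np.array(list(itertools.product([0, 1], repeat=3)))
triple_factor_group = fgroup.EnumFactorGroup(
    variables_for_factors=[[var_a, var_b, var_c]],
    factor_configs=factor_configs,
    log_potentials=np.zeros(len(factor_configs)),
)
fg.add_factors(triple_factor_group)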

Hope it helps,
Antoine

Thank you Antoine for your answers, I'll close the issue.