facebookresearch/beanmachine

Invalid proposal solutions

samvoisin opened this issue · 3 comments

Issue Description

I am trying out Bean Machine for the first time, along with the Newtonian Monte Carlo (NMC) sampler. I am running what I think should be a simple hierarchical model, but the logger emits many messages like these:

Gradient or Hessian is invalid at node sigma_mu().

Node sigma_mu() has invalid proposal solution. Proposer falls back to SingleSiteAncestralProposer.

My model code is:

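import beanmachine.ppl as bm
import torch.distributions as dist
from torch import tensor

# neighborhoods and raw_df are defined elsewhere (not shown in this issue)
J = neighborhoods.size  # number of neighborhoods
nbd_idx = tensor(raw_df.NeighborhoodCode.values.tolist())  # neighborhood index per observation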

@bm.random_variable
def sigma_mu():
    return dist.Exponential(3.)

@bm.random_variable
def mu_y():
    return dist.Normal(12, sigma_mu())

@bm.random_variable
def sigma_alpha():
    return dist.Uniform(0, 10)

@bm.random_variable
def alpha_j():
    return dist.Normal(0, sigma_alpha()).expand((J,))

@bm.random_variable
def sigma_y():
    return dist.Exponential(3.)

@bm.functional
def alpha_j_vec():  # vector of alpha_j as long as y
    return alpha_j()[nbd_idx]

@bm.random_variable
def y():
    return dist.Normal(mu_y() + alpha_j_vec(), sigma_y())

Unless I am missing something, sigma_mu should only be able to sample valid proposals. Could this be caused by sampling values so close to zero that the gradient cannot be computed? If so, that is very disappointing, because my data suggest very small values for sigma_mu. Is there something obviously wrong in the way I've specified my model?

Steps to Reproduce

Here is the code I've used to do my inference:

qrys = [
    mu_y(), sigma_mu(),
    alpha_j(), sigma_alpha(),
    sigma_y()
]

samples = bm.CompositionalInference().infer(
    queries=qrys,
    observations=observations,
    num_adaptive_samples=4000,
    num_samples=2000,
    num_chains=2
)

Thanks for the well-documented issue, @samvoisin! As you suggested, this is likely caused by numerical instability from large gradients near 0 under the exponential prior, and improving the robustness of NMC is still a bit of open research. In general, if all the variables in your model are continuous, I'd recommend using GlobalNoUTurnSampler, as it is more robust to these types of distributions.
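To make the "large gradients near 0" point concrete, here is a minimal pure-Python sketch. It is not Bean Machine's internal computation. It assumes the conditional log density of sigma_mu is the Exponential(3) prior plus the Normal(12, sigma_mu) term for mu_y, taken from the model above, and it uses a hypothetical fixed value mu_y = 12.1. The helper names are made up for illustration. It evaluates the analytic first and second derivatives with respect to sigma_mu at progressively smaller values.

```python
import math

RATE = 3.0    # rate of the Exponential prior on sigma_mu (from the model)
MEAN = 12.0   # mean of the Normal prior on mu_y (from the model)
MU_Y = 12.1   # hypothetical current value of mu_y, chosen for illustration


def log_density(sigma):
    """log Exponential(sigma; RATE) + log Normal(MU_Y; MEAN, sigma)."""
    log_prior = math.log(RATE) - RATE * sigma
    log_normal = (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
                  - (MU_Y - MEAN) ** 2 / (2 * sigma ** 2))
    return log_prior + log_normal


def grad(sigma):
    """First derivative of log_density with respect to sigma."""
    r2 = (MU_Y - MEAN) ** 2
    return -RATE - 1.0 / sigma + r2 / sigma ** 3


def hess(sigma):
    """Second derivative of log_density with respect to sigma."""
    r2 = (MU_Y - MEAN) ** 2
    return 1.0 / sigma ** 2 - 3.0 * r2 / sigma ** 4


# Evaluate both derivatives as sigma approaches zero.
rows = [(s, grad(s), hess(s)) for s in (1.0, 0.1, 0.01, 0.001)]

print(f"{'sigma':>8} {'gradient':>14} {'hessian':>14}")
for s, g, h in rows:
    print(f"{s:>8g} {g:>14.4g} {h:>14.4g}")
```

Both derivatives grow like inverse powers of sigma, so a sampler that builds its proposal from the gradient and Hessian has very large, rapidly changing quantities to work with when sigma_mu is tiny. Whether this is exactly what triggers the fallback message in NMC is an assumption here, not something the thread confirms.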

Thank you for the recommendation, @jpchen. Using GlobalNoUTurnSampler made a big difference. Looking forward to watching this project develop! Feel free to close this issue when ready.

Thanks, feel free to open other issues that may come up, and keep an eye out for faster MCMC in our next few releases :)