Constrained sampling in parameter space
dforero0896 opened this issue · 13 comments
Hi, I'm trying to fit a model using Zeus; however, I don't have an analytic model, which forces me to interpolate my model data. I run into issues with the sampler when the walkers explore the parameter space too far from where I expect the MAP to be. For instance, I define one parameter to have a uniform prior on (0.8, 1.2), but when the walkers explore ~1.5 my model interpolation breaks (above interpolation bounds). I have tried filling the absent values (outside the interpolation range) with NaNs or infs, but those seem to be problematic. Extrapolation also seems problematic, since I get
RuntimeError: Number of expansions exceeded maximum limit!
Make sure that the pdf is well-defined.
Otherwise increase the maximum limit (maxiter=10^4 by default).
But this could be some other bug, and I doubt extrapolating is a good approach anyway.
In short, do you have some way to constrain walkers to a certain domain in the parameter space?
Thanks in advance,
Hi @dforero0896,
Could you share any parts of the code so that I can have a look?
The problem is probably the way that you define the log prior or the log posterior but I can't say more without seeing the code.
Cheers,
Sure!
Here is the definition of the pdfs:
def logprior(self, theta):
    lp = 0
    for i in range(len(theta)):
        if self.prior_types[i] == 'flat':
            lp -= 0 if self.prior_params[i][0] < theta[i] < self.prior_params[i][1] else -np.inf
        elif self.prior_types[i] == 'gauss':
            mu, sigma = self.prior_params[i]
            lp -= 0.5 * ((theta[i] - mu) / sigma)**2
        else:
            raise NotImplementedError
    return lp

def loglike(self, theta, data, s):
    alpha, B, Sigma_nl = theta
    chisq = self.get_chisq(data, alpha, B, Sigma_nl, s)
    return -0.5 * chisq

def logpost(self, theta, data, s):
    # Use Bayes' rule
    return self.logprior(theta) + self.loglike(theta, data, s)
with
prior_types = ('flat', 'flat', 'flat')
prior_params = ((0.8, 1.2), (0, 25), (0, 30))
Cheers,
Try the following and let me know if this solves the issue
def logprior(self, theta):
    lp = 0
    for i in range(len(theta)):
        if self.prior_types[i] == 'flat':
            lp -= 0 if self.prior_params[i][0] < theta[i] < self.prior_params[i][1] else -np.inf
        elif self.prior_types[i] == 'gauss':
            mu, sigma = self.prior_params[i]
            lp -= 0.5 * ((theta[i] - mu) / sigma)**2
        else:
            lp = -np.inf  # I've changed this
    return lp

def loglike(self, theta, data, s):
    alpha, B, Sigma_nl = theta
    chisq = self.get_chisq(data, alpha, B, Sigma_nl, s)
    return -0.5 * chisq

def logpost(self, theta, data, s):
    # Use Bayes' rule
    lp = self.logprior(theta)
    if ~np.isfinite(lp):
        return -np.inf
    return lp + self.loglike(theta, data, s)
@dforero0896 I also found another bug in your code
This line:

lp -= 0 if self.prior_params[i][0] < theta[i] < self.prior_params[i][1] else -np.inf

needs to be changed to:

lp -= 0 if self.prior_params[i][0] < theta[i] < self.prior_params[i][1] else np.inf

since you already account for the minus sign with the -= operator.
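To make the sign issue concrete, here is a minimal standalone demonstration (not part of the fitter code) of what each branch does to the log-prior:

```python
import numpy as np

# Buggy branch: `lp -= -np.inf` subtracts negative infinity,
# so an out-of-bounds point gets log-prior +inf instead of -inf.
lp = 0.0
lp -= -np.inf
assert lp == np.inf

# Corrected branch: `lp -= np.inf` gives -inf, rejecting the point.
lp = 0.0
lp -= np.inf
assert lp == -np.inf
```

With the buggy branch, out-of-bounds points look infinitely probable to the sampler, which is exactly the opposite of the intended hard cut.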
You are right; however, I still encounter the above/below-interpolation-range error.
It works if I put some extremely restrictive priors (say a Gaussian with mu=1, sigma=0.001), which I don't think is very good anyway. But in this case I get a weird error:
Traceback (most recent call last):
  File "src/fit.py", line 350, in <module>
    bao_fitter.get_xi_fit_range(xi_obs, s_obs)
  File "src/fit.py", line 263, in fit
    self.sampler.run_mcmc(start, nsteps) # Run sampling
  File "/home/astro/dforero/.conda/envs/baofit/lib/python3.8/site-packages/zeus/ensemble.py", line 513, in summary
    logging.info('Mean Integrated Autocorrelation Time: ' + str(round(np.mean(self.act),2)))
  File "/home/astro/dforero/.conda/envs/baofit/lib/python3.8/site-packages/zeus/ensemble.py", line 446, in act
    return AutoCorrTime(self.chain[int(self.nsteps/(self.thin*2.0)):,:,:])
  File "/home/astro/dforero/.conda/envs/baofit/lib/python3.8/site-packages/zeus/autocorr.py", line 107, in AutoCorrTime
    taus[i] = _autocorr_time_1d(samples[:,:,i].T, c, method)
  File "/home/astro/dforero/.conda/envs/baofit/lib/python3.8/site-packages/zeus/autocorr.py", line 57, in _autocorr_time_1d
    f = _autocorr_func_1d(y.reshape((-1), order='C'))
  File "/home/astro/dforero/.conda/envs/baofit/lib/python3.8/site-packages/zeus/autocorr.py", line 28, in _autocorr_func_1d
    f = sp.fft.fft(x - np.mean(x), n=2 * n)
AttributeError: module 'scipy' has no attribute 'fft'
Is this something related to the scipy version? I can open another issue if you prefer.
Thanks!
It seems that the problem appears when you try to estimate the autocorrelation time of the chains. Looks like you have an outdated version of scipy installed. Did you try updating to the latest version of scipy?
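For reference, the scipy.fft submodule was introduced in SciPy 1.4.0, so upgrading should resolve the AttributeError. The commands below assume a pip-managed environment; in a conda environment, conda update scipy would be the equivalent:

```shell
# Upgrade scipy so that the scipy.fft submodule (SciPy >= 1.4.0) is available
pip install --upgrade scipy
# Verify that the submodule now imports
python -c "import scipy.fft; print(scipy.__version__)"
```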
That was it for the scipy error, thanks. I tried relaxing the gaussian prior but even just using sigma=0.002 already causes the fit to fail.
Is there any chance the interpolator is stochastic? i.e. calling it for the same parameters returns slightly different results every time?
I don't think so; I'm using scipy's interp1d cubic spline. The issue, I think, is that I'm trying to measure how shifted the data curve is compared to the model, which is controlled by the parameter alpha. My data is defined on the range (0, 200) and my fitting range is (60, 150), so any time alpha is >1.33 the interpolation breaks.
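For reference, a minimal sketch (with made-up toy data, not the actual model) of how scipy's interp1d behaves outside its domain, and the bounds_error=False escape hatch:

```python
import numpy as np
from scipy.interpolate import interp1d

s = np.linspace(0, 200, 201)   # toy grid standing in for the model's s range
xi = np.sin(s / 30.0)          # toy curve standing in for the model

f = interp1d(s, xi, kind='cubic')  # default bounds_error=True
try:
    f(250.0)                       # outside (0, 200): raises ValueError
except ValueError:
    print("out of interpolation range")

# Alternatively, fill out-of-range queries instead of raising; the prior
# must still exclude these regions, since NaN likelihoods confuse the
# sampler's slice expansions.
g = interp1d(s, xi, kind='cubic', bounds_error=False, fill_value=np.nan)
print(np.isnan(g(250.0)))          # True
```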
So if the problem appears due to the interpolation when alpha > 1.33, then the flat (0.8, 1.2) prior should work fine, right? Is the problem only with the Gaussian one?
Indeed, I have now checked and the flat prior works fine; I had only tested the Gaussian after your proposed solution. It now makes sense to me why the Gaussian does not work, and your first solution did indeed fix the problem. To make a Gaussian prior work in this setting, it should then suffice to define it properly, i.e. put hard cuts at the edges:
def logprior(self, theta):
    lp = 0
    for i in range(len(theta)):
        if self.prior_types[i] == 'flat':
            lp -= 0 if self.prior_params[i][0] < theta[i] < self.prior_params[i][1] else np.inf
        elif self.prior_types[i] == 'gauss':
            mu, sigma = self.prior_params[i][:2]  # first two entries: mean and width
            lp -= 0.5 * ((theta[i] - mu) / sigma)**2 if self.prior_params[i][2] < theta[i] < self.prior_params[i][3] else np.inf
        else:
            lp = -np.inf
    return lp

where self.prior_params[i][2] and self.prior_params[i][3] would be the hard cuts of the fitting range for parameter theta[i].
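A standalone sketch of this truncated-Gaussian prior, with illustrative parameter values (the original is a method on the fitter class, so the names and the free-function form here are assumptions):

```python
import numpy as np

def logprior(theta, prior_types, prior_params):
    """Log-prior with flat priors and Gaussians truncated at hard cuts."""
    lp = 0.0
    for i, t in enumerate(theta):
        if prior_types[i] == 'flat':
            lo, hi = prior_params[i][:2]
            lp -= 0 if lo < t < hi else np.inf
        elif prior_types[i] == 'gauss':
            mu, sigma, lo, hi = prior_params[i]
            lp -= 0.5 * ((t - mu) / sigma)**2 if lo < t < hi else np.inf
        else:
            lp = -np.inf
    return lp

# alpha gets a Gaussian prior truncated to the interpolation-safe range;
# the other two parameters keep flat priors.
types = ('gauss', 'flat', 'flat')
params = ((1.0, 0.05, 0.8, 1.2), (0, 25), (0, 30))
print(logprior((1.0, 10, 15), types, params))  # 0.0 at the Gaussian mean
print(logprior((1.5, 10, 15), types, params))  # -inf: alpha outside the hard cut
```

With this, out-of-range alpha values are rejected by the prior before the likelihood (and hence the interpolator) is ever evaluated.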
Thanks a lot for your help! I think the issue can be closed.
I'm glad I could help