Gaussian shells convergence
JohannesBuchner opened this issue · 3 comments
Hi,
I wanted to show a small demo of Diffusive Nested Sampling in comparison to other techniques. I set up two Gaussian shells which act as a spike-and-slab model in 2d. The posterior looks like a diamond ring (the second peak/shell is the knot on the left).
I thought this problem is nice because it mimics some of the difficulties of high dimensionality: MCMC auto-correlation is rather poor, the likelihood has heavy tails, etc. I set up DNest4 code, and it runs, but it converges extremely slowly towards the true lnZ (-22.93).
Output:
[user:/tmp] $ python3 shells.py
# Seeding random number generators. First seed = 1234.
# Generating 5 particles from the prior...done.
# Creating level 1 with log likelihood = -145018.632847.
Sample 10: log(Z) = -84682.88157402459 [exact log(Z) = -22.930943168589735]
# Creating level 2 with log likelihood = -33778.5852309.
Sample 20: log(Z) = -33781.99728541736 [exact log(Z) = -22.930943168589735]
# Creating level 3 with log likelihood = -11858.5812697.
# Creating level 4 with log likelihood = -3808.49545032.
Sample 30: log(Z) = -3814.3049953313503 [exact log(Z) = -22.930943168589735]
# Creating level 5 with log likelihood = -1227.8929285.
# Creating level 6 with log likelihood = -349.311858325.
Sample 40: log(Z) = -357.14489504108184 [exact log(Z) = -22.930943168589735]
# Creating level 7 with log likelihood = -79.4236333158.
Sample 50: log(Z) = -88.45636871469677 [exact log(Z) = -22.930943168589735]
# Creating level 8 with log likelihood = 1.52233642312.
# Creating level 9 with log likelihood = 22.5384315752.
Sample 60: log(Z) = 11.666590645973669 [exact log(Z) = -22.930943168589735]
# Creating level 10 with log likelihood = 25.5607727607.
# Creating level 11 with log likelihood = 25.9572596971.
Sample 70: log(Z) = 13.347869420545218 [exact log(Z) = -22.930943168589735]
# Creating level 12 with log likelihood = 26.0108996749.
Sample 80: log(Z) = 18.03851374899128 [exact log(Z) = -22.930943168589735]
# Creating level 13 with log likelihood = 26.0179440326.
# Creating level 14 with log likelihood = 26.0188429039.
Sample 90: log(Z) = 16.040130539196088 [exact log(Z) = -22.930943168589735]
# Creating level 15 with log likelihood = 26.0189315068.
# Creating level 16 with log likelihood = 28.3170662288.
Sample 100: log(Z) = 14.175528141221353 [exact log(Z) = -22.930943168589735]
# Creating level 17 with log likelihood = 33.4451901087.
# Creating level 18 with log likelihood = 34.1986237598.
Sample 110: log(Z) = 14.620023223478169 [exact log(Z) = -22.930943168589735]
# Creating level 19 with log likelihood = 34.2978862356.
# Done creating levels.
Sample 120: log(Z) = 16.65271062388356 [exact log(Z) = -22.930943168589735]
Sample 130: log(Z) = 16.393998538296625 [exact log(Z) = -22.930943168589735]
Sample 140: log(Z) = 16.185914009662742 [exact log(Z) = -22.930943168589735]
My questions are:
- Did I do anything incorrectly?
- How does DNest4 know the width of the prior distribution? I want to use a uniform prior from -1 to 1. I don't seem to provide that anywhere.
- Can I modify the DNest4 parameters so it behaves better, without making it overly problem-dependent?
Code:
import os
import warnings
import numpy as np
from numpy import exp, log, pi
import scipy.stats
import scipy.special  # needed for scipy.special.gamma below
import dnest4
# analytic solution:
def shell_vol(ndim, r, w):
    # integral along the radius
    #mom = scipy.stats.t.moment(ndim - 1, df=1, loc=r, scale=w)
    mom = scipy.stats.norm.moment(ndim - 1, loc=r, scale=w)
    # integral along the angles is the surface of the hyper-sphere,
    # which is ndim times the volume of the unit ndim-ball
    vol = pi**(ndim / 2.) / scipy.special.gamma(ndim / 2. + 1)
    surf = vol * ndim
    return mom * surf
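# sanity check: for ndim = 2, mom = E[X] = r and surf = 2*pi, so
# shell_vol(2, r, w) = 2*pi*r -- the ring circumference, since the
# Gaussian profile across the ring integrates to 1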
ndim = 2
nlive = 50
r = 0.1e-10
# the shell thickness is
#w = (r**(ndim+1) + C * scipy.special.gamma((ndim+3)/2)*ndim*pi**(-(ndim+1)/2) / (
# scipy.special.gamma((ndim+2)/2) * pi**(-ndim/2)))**(1 / (ndim+1)) - r
w = 0.04e-10 / ndim
r1, r2 = r, r / 40
w1, w2 = w, w / 40
c1, c2 = np.zeros(ndim), np.zeros(ndim)
c2[0] -= r1
N1 = -0.5 * log(2 * pi * w1**2)
N2 = -0.5 * log(2 * pi * w2**2) + log(100)
Z_analytic = log((shell_vol(ndim, r1, w1) + 100 * shell_vol(ndim, r2, w2)) / 2)
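# with shell_vol = 2*pi*r in 2d this evaluates to
# log((2*pi*1e-11 + 100 * 2*pi*2.5e-13) / 2) ~ -22.93,
# the exact value quoted in the output above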
def loglike(theta):
    d1 = ((theta - c1)**2).sum(axis=1)**0.5
    d2 = ((theta - c2)**2).sum(axis=1)**0.5
    L1 = -0.5 * ((d1 - r1)**2) / w1**2 + N1
    L2 = -0.5 * ((d2 - r2)**2) / w2**2 + N2
    return np.logaddexp(L1, L2)

def transform(x):
    return x * 2 - 1

log_dir = 'dnest4-%dd' % (ndim)
os.makedirs(log_dir, exist_ok=True)

class Model(object):
    def __init__(self, ndim=2):
        self.ndim = ndim
        self.ncall = 0

    def from_prior(self):
        # start already close to the peak to make it easier
        return np.random.uniform(-1e-10, 1e-10, size=(self.ndim,))

    def perturb(self, coords):
        i = np.random.randint(self.ndim)
        # perturb with ring width w as scale to make it easier
        coords[i] += w * dnest4.randh()
        # Note: use the return value of wrap, unlike in C++
        coords[i] = dnest4.wrap(coords[i], -1, 1)
        return 0.0

    def log_likelihood(self, coords):
        self.ncall += 1
        return loglike(coords.reshape(1, -1))[0]

model = Model()
sampler = dnest4.DNest4Sampler(model,
                               backend=dnest4.backends.CSVBackend(".", sep=" "))
gen = sampler.sample(20, num_steps=10000, new_level_interval=100000,
                     num_per_step=100000, thread_steps=100,
                     num_particles=5, lam=10, beta=100, seed=1234)

fout = open(log_dir + '/ncall_ess.txt', 'w')
foutZ = open(log_dir + '/zevol.txt', 'w')
for i, sample in enumerate(gen):
    stats = sampler.postprocess()
    ncall = model.ncall
    fout.write('%s %s\n' % (ncall, stats['N_eff']))
    fout.flush()
    foutZ.write('%s %s %s\n' % (ncall, stats['log_Z'], stats['log_Z_std']))
    foutZ.flush()
    if i % 10 == 9:
        print("Sample {0}: log(Z) = {1} [exact log(Z) = {2}]"
              .format(i + 1, stats["log_Z"], Z_analytic))
OK, I made some progress. I realised that I need to draw from the entire prior, because the prior-to-posterior shrinkage is constrained by how particles move up in levels.
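In other words, from_prior has to cover the full uniform(-1, 1) prior rather than a small neighbourhood of the peaks; a minimal sketch of that change:

def from_prior(self):
    # draw from the entire uniform(-1, 1) prior, not just near the shells
    return np.random.uniform(-1, 1, size=(self.ndim,))

I also needed to modify the perturb function to cover more orders of magnitude, with: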
def randh():
    # heavy-tailed step-size distribution: t is Student-t with 2 degrees
    # of freedom, so 10**(1.5 - 15*abs(t)) spans many orders of magnitude;
    # n gives the Gaussian step itself
    a = np.random.randn()
    b = np.random.rand()
    t = a / np.sqrt(-np.log(b))
    n = np.random.randn()
    return 10.0**(1.5 - 15 * np.abs(t)) * n
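In perturb this replaces the w * dnest4.randh() step; a sketch (assuming the custom randh now sets the absolute step scale itself, so the fixed w factor is dropped):

def perturb(self, coords):
    i = np.random.randint(self.ndim)
    # custom randh sets the step scale itself, spanning many decades
    coords[i] += randh()
    coords[i] = dnest4.wrap(coords[i], -1, 1)
    return 0.0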
I also had to increase the maximum number of levels to 100.
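That corresponds to raising the first argument (max_num_levels) of the sampler.sample call above from 20, keeping the other settings, i.e. something like:

gen = sampler.sample(max_num_levels=100, num_steps=10000,
                     new_level_interval=100000, num_per_step=100000,
                     thread_steps=100, num_particles=5,
                     lam=10, beta=100, seed=1234)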