Unable to reproduce figure 2
Hi,
as I am trying to port this method to PyTorch, I stumbled upon a curious error. I was using the Gaussian toy model from Section 4.1 of the paper. I first had a look at the unit tests of this package here. They appear to work great, but I found it curious that the tolerance on the absolute difference between `ecp` and `alpha` was set to a rather high value of 0.25.
So I added two tests that try to reproduce the overconfident and underconfident cases:
```python
import unittest

import numpy as np
from tarp import get_tarp_coverage


def get_test_data(scalefactor=1.0):
    """Gaussian toy data as in section 4.1; scalefactor rescales the width."""
    num_samples = 100
    num_sims = 100
    num_dims = 5
    theta = np.random.uniform(low=-5, high=5, size=(num_sims, num_dims))
    log_sigma = np.random.uniform(low=-5, high=-1, size=(num_sims, num_dims))
    sigma = scalefactor * np.exp(log_sigma)
    # posterior samples around theta
    samples = np.random.normal(
        loc=theta, scale=sigma, size=(num_samples, num_sims, num_dims)
    )
    # "true" parameters, re-drawn with the same (scaled) sigma
    theta = np.random.normal(loc=theta, scale=sigma, size=(num_sims, num_dims))
    return samples, theta

# ...

class ConfidenceTest(unittest.TestCase):
    def test_underdispersed(self):
        samples, theta = get_test_data(0.05)
        ecp, alpha = get_tarp_coverage(
            samples,
            theta,
            references="random",
            metric="euclidean",
            norm=False,
            bootstrap=False,
        )
        print("test_underdispersed", np.max(np.abs(ecp - alpha)))
        self.assertAlmostEqual(np.max(np.abs(ecp - alpha)), 0.0, delta=0.25)

    def test_overdispersed(self):
        samples, theta = get_test_data(20.0)
        ecp, alpha = get_tarp_coverage(
            samples,
            theta,
            references="random",
            metric="euclidean",
            norm=False,
            bootstrap=False,
        )
        print("test_overdispersed", np.max(np.abs(ecp - alpha)))
        self.assertAlmostEqual(np.max(np.abs(ecp - alpha)), 0.0, delta=0.25)
```
To my surprise, the numerical value of `np.max(np.abs(ecp - alpha))` didn't change much. So I went and gave the notebook a spin. I observed that the notebook uses a different way to generate the Gaussian toy data, and with it I was able to reproduce the correct case.
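For reference, this is roughly how I plot the resulting curves to compare against figure 2 (a minimal sketch; `ecp` and `alpha` are the outputs of `get_tarp_coverage` as above):

```python
import matplotlib.pyplot as plt

# ecp vs. alpha: a calibrated posterior should lie on the diagonal,
# miscalibrated ones deviate from it.
fig, ax = plt.subplots(figsize=(4, 4))
ax.plot([0, 1], [0, 1], ls="--", color="k", label="ideal")
ax.plot(alpha, ecp, label="TARP")
ax.set_xlabel("Credibility level")
ax.set_ylabel("Expected coverage probability")
ax.legend()
plt.show()
```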
Then, I changed the data-generating function slightly to produce overconfident and underconfident data:
```python
def generate_correlated_samples(num_samples, num_sims, num_dims, covfactor=1.0):
    """Generate samples and true parameter values."""
    theta = np.random.uniform(low=-5, high=5, size=(num_sims, num_dims))
    # generate_psd_matrix is defined in the accompanying notebook
    cov = [generate_psd_matrix(num_dims) for _ in range(num_sims)]
    cov = covfactor * np.concatenate(cov).reshape(num_sims, num_dims, num_dims)
    samples = [
        np.random.multivariate_normal(mean=theta[i], cov=cov[i], size=num_samples)
        for i in range(num_sims)
    ]
    samples = np.stack(samples)
    samples = samples.transpose(1, 0, 2)  # -> (num_samples, num_sims, num_dims)
    theta = [
        np.random.multivariate_normal(mean=theta[i], cov=cov[i], size=1)
        for i in range(num_sims)
    ]
    theta = np.stack(theta)[:, 0]
    return samples, theta
```
I was again surprised to see that the TARP coverage is unable to differentiate the correct case from the underdispersed or overdispersed ones.
Could you please double-check where the problem is? For convenience, I added the notebook that produced the plots above as a gist here.
Stupid me, I put the `covfactor` in the wrong spot. This dispersed both `theta` and `samples`, which is incorrect. It works like a charm now that I only disperse `samples`, as it should be.
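For the record, here is a sketch of the corrected generator (the function name is mine; `generate_psd_matrix` as in the notebook). `covfactor` now only rescales the covariance used to draw `samples`, while `theta` is still drawn with the unscaled covariance:

```python
def generate_dispersed_samples(num_samples, num_sims, num_dims, covfactor=1.0):
    """Like generate_correlated_samples, but only the samples are dispersed."""
    theta = np.random.uniform(low=-5, high=5, size=(num_sims, num_dims))
    cov = [generate_psd_matrix(num_dims) for _ in range(num_sims)]
    cov = np.concatenate(cov).reshape(num_sims, num_dims, num_dims)
    # posterior samples: covariance rescaled by covfactor
    samples = [
        np.random.multivariate_normal(mean=theta[i], cov=covfactor * cov[i], size=num_samples)
        for i in range(num_sims)
    ]
    samples = np.stack(samples).transpose(1, 0, 2)
    # true parameters: drawn with the unscaled covariance
    theta = [
        np.random.multivariate_normal(mean=theta[i], cov=cov[i], size=1)
        for i in range(num_sims)
    ]
    theta = np.stack(theta)[:, 0]
    return samples, theta
```

With `covfactor < 1` the coverage curve now shows the overconfident behaviour and with `covfactor > 1` the underconfident one.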