dfm/emcee

Problem using 4-byte floats with EnsembleSampler

emprice opened this issue · 4 comments

General information

  • emcee version: 3.1.4
  • platform: Ubuntu Linux, Python 3.11.5
  • installation method: pip

Problem description

For context: I'm doing this because I want to benchmark some code that uses a GPU likelihood function, in vectorized mode, to see whether it gives a performance boost. 4-byte floats are better supported on GPUs and tend to perform better there.

Expected behavior

When the input state is given as a NumPy ndarray with dtype np.float32 (rather than the default np.float64), the parameters passed to the likelihood function on each iteration should also be 4-byte floats.

Actual behavior

On the first call to the likelihood function, everything is as expected: the input parameter array has the same dtype as the initial parameters passed to run_mcmc. On the second call, however, the parameter array has a different shape (half the original along the zeroth axis) and its dtype is np.float64.

What I have tried so far

Inspired by the test for quad precision in the repo, I tried using a Backend(dtype=np.float32), but that doesn't seem to make any difference. I also looked for instances of an explicit dtype=np.float64 in the codebase; it appears in only a few places, and my quick attempt at a fix in the WalkMove class didn't change the observed behavior either.
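
For concreteness, the backend attempt looked roughly like this (a sketch; it runs, but the proposals handed to the likelihood still arrive as float64):

import emcee
import numpy as np

def lhood(p):
    return -0.5 * np.sum(p**2, axis=1)

nwalkers, ndim = 16, 4

# The dtype here controls how the backend stores the chain; it does not
# change the dtype of the proposals passed to the likelihood.
backend = emcee.backends.Backend(dtype=np.float32)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lhood,
    vectorize=True, backend=backend)
p0 = np.random.randn(nwalkers, ndim).astype(np.float32)
sampler.run_mcmc(p0, 10)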

Minimal example

import emcee
import numpy as np

sigma = 1.

def lhood(p):
    if p.dtype != np.float32:
        print(p.dtype, p.shape)
        raise ValueError
    return -0.5 * np.sum(p**2, axis=1) / sigma**2

nwalkers, ndim = 16, 4
sampler = emcee.EnsembleSampler(nwalkers, ndim, lhood, vectorize=True)
p0 = np.random.randn(nwalkers, ndim).astype(np.float32)
sampler.run_mcmc(p0, 100)

Before raising the ValueError, this prints float64 (8, 4): the same total array size, but apparently reinterpreted as a different data type.

dfm commented

Thanks! I expect that what's happening here is that all the calls to numpy.random are producing float64 arrays which can be upcast to float128, but won't be downcast to float32. This means that every time a proposal is made, the coordinates will end up being cast to the higher precision. It would take some work to be explicit about the random dtype throughout the whole code base, so I'm not sure there's much I'd want to change about emcee's operation here. For your use case, would it be sufficient to just downcast the dtype at the top of lhood before evaluating your model on the GPU?
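
Concretely, the downcast I have in mind would be something like this (untested sketch):

import numpy as np

sigma = 1.

def lhood(p):
    # np.asarray only copies when a cast is actually needed, so float32
    # inputs pass through untouched and float64 proposals are downcast once
    p = np.asarray(p, dtype=np.float32)
    return -0.5 * np.sum(p**2, axis=1) / sigma**2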

emprice commented

Okay, having fiddled with this a little more, I found a very simple fix, at least for the Goodman & Weare WalkMove class. I had misunderstood why the shape was different after the first call to the likelihood function: I wrongly assumed the data had been reinterpreted as an array half the size with np.float64 elements, when the shape change is just a consequence of how RedBlueMove works, splitting the ensemble into two halves and updating one half using the positions of the other. Realizing that, I modified my MWE to the following, which runs without error. (Downcasting was an option too, but for a performance-critical application you'd rather eliminate the typecast and work with shallow copies where possible.)

import emcee
import numpy as np

sigma = 1.

def lhood(p):
    assert p.dtype == np.float32
    return -0.5 * np.sum(p**2, axis=1) / sigma**2

nwalkers, ndim = 16, 4

class MyWalkMove(emcee.moves.WalkMove):

    def get_proposal(self, s, c, random):
        # s holds the walkers being updated; c is the complementary ensemble
        c = np.concatenate(c, axis=0)
        Ns, Nc = len(s), len(c)
        ndim = s.shape[1]
        q = np.empty_like(s)   # <-- changed here: inherit the dtype of s instead of hardcoding np.float64
        s0 = Nc if self.s is None else self.s
        for i in range(Ns):
            inds = random.choice(Nc, s0, replace=False)
            cov = np.atleast_2d(np.cov(c[inds], rowvar=0))
            q[i] = random.multivariate_normal(s[i], cov)
        return q, np.zeros(Ns)

sampler = emcee.EnsembleSampler(nwalkers, ndim, lhood,
    vectorize=True, moves=MyWalkMove())
p0 = np.random.randn(nwalkers, ndim).astype(np.float32)
sampler.run_mcmc(p0, 100)
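
For the record, the reason a one-line change is enough: random.multivariate_normal still returns float64 draws, but assigning them into the float32 array allocated by np.empty_like casts on assignment instead of upcasting the array. A quick check of that NumPy behavior:

import numpy as np

rng = np.random.default_rng()
q = np.empty((2, 4), dtype=np.float32)  # stands in for np.empty_like(s)
q[0] = rng.standard_normal(4)  # float64 draw, cast on assignment
print(q.dtype)  # prints float32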

I don't think this breaks anything else in the code, though I haven't run any other tests. Happy to open a pull request if you prefer, although it's literally a one-line fix.

dfm commented

Oh nice! Please do open a PR with that change. One-liners are the best! :D Thanks.

Addressed (for two move types, at least) by PR #484.