theislab/batchglm

Shuffle Assignments doesn't work when generating sample description


The following code:

if shuffle_assignments:
    sample_description = sample_description.isel(
        observations=np.random.permutation(sample_description.observations.values)
    )
return patsy.dmatrix("~1+condition+batch", sample_description), sample_description

is buggy: it raises AttributeError: 'DataFrame' object has no attribute 'isel'.

How to reproduce:

from batchglm.api.models.numpy.glm_nb import Simulator

sim = Simulator()
sim.generate_sample_description(shuffle_assignments=True)

I think this is a remnant of the time when we used xarray.Dataset instead of pandas.DataFrame.
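
If we keep the option, a pandas equivalent of the old isel call could look roughly like the sketch below. The helper name shuffle_rows, the seed argument and the toy sample description are made up for illustration and are not part of batchglm.

```python
import numpy as np
import pandas as pd


def shuffle_rows(sample_description: pd.DataFrame, seed=None) -> pd.DataFrame:
    """Return the rows of the sample description in random order (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(sample_description))
    # iloc selects by integer position, which is what xarray's isel did.
    return sample_description.iloc[order]


# Toy sample description with the two columns used in the formula (assumed layout).
sample_description = pd.DataFrame({
    "condition": ["0", "0", "1", "1"],
    "batch": ["0", "1", "0", "1"],
})
shuffled = shuffle_rows(sample_description, seed=0)
```

The original index labels travel with the rows, so the shuffled frame can still be mapped back to the original order with sort_index().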

The question now is whether we want to fix this or remove it.
The method returns a design matrix built from the sample_description anyway, so shuffling only reorders the rows of that matrix. At the time the sample description is generated, no data has been sampled yet, because we need the sample description to simulate the counts. So as far as I can tell, the shuffle has no real effect.
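
A small sketch of that argument: shuffling the sample description before building the design matrix gives the same matrix with permuted rows. To keep the example free of extra dependencies, design_matrix below is a hand-rolled stand-in for patsy.dmatrix("~1+condition+batch", ...) based on pandas.get_dummies. It is an assumption for illustration, not what batchglm does.

```python
import numpy as np
import pandas as pd


def design_matrix(sample_description: pd.DataFrame) -> pd.DataFrame:
    # Stand-in for patsy.dmatrix("~1+condition+batch", ...):
    # an intercept column plus treatment-coded dummies (first level dropped).
    dm = pd.get_dummies(
        sample_description[["condition", "batch"]], drop_first=True, dtype=float
    )
    dm.insert(0, "Intercept", 1.0)
    return dm


# Toy sample description (assumed layout).
sample_description = pd.DataFrame({
    "condition": ["0", "0", "1", "1"],
    "batch": ["0", "1", "0", "1"],
})

# Shuffle the rows, as shuffle_assignments is meant to do.
order = np.random.default_rng(0).permutation(len(sample_description))
shuffled = sample_description.iloc[order]

original_dm = design_matrix(sample_description)
shuffled_dm = design_matrix(shuffled)

# Undoing the permutation recovers the original design matrix exactly.
same_rows = shuffled_dm.sort_index().equals(original_dm)
```

Here same_rows is expected to be True: the only difference between the two matrices is the row order.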

So is there any reason to keep it?