kevin931/PyCytoData

[BUG] `run_dr_methods` breaks when the PyCytoData's channels have been subsetted

What issues are you experiencing?
After a dataset has been subset, the _lineage_channels_indices private attribute is not updated. As a result, calling the run_dr_methods method raises an out-of-bounds IndexError. This is a critical issue: when the subsetting happens to leave the stale indices within bounds, no error is raised at all, which makes this an insidious bug that won't surface.
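To illustrate the two failure modes, here is a minimal NumPy-only sketch. It does not use PyCytoData; the channel names and the `lineage_idx` variable are made up for the example and only mimic indices that were computed before subsetting and never refreshed.

```python
import numpy as np

# Hypothetical channel layout; names are for illustration only.
channels = np.array(["Time", "CD3", "CD4", "DNA1"])
lineage = np.array(["CD3", "CD4"])

# Indices computed once against the full channel array: [1, 2]
lineage_idx = np.flatnonzero(np.isin(channels, lineage))

# Case 1: subset to the lineage channels only. Two channels remain,
# so the stale index 2 is out of bounds and NumPy raises IndexError.
lineage_only = channels[lineage_idx]
try:
    lineage_only[lineage_idx]
    raised = False
except IndexError:
    raised = True

# Case 2: drop only "Time". The stale indices are still in bounds,
# so nothing fails, but they now point at the wrong channels.
without_time = channels[1:]
wrongly_selected = without_time[lineage_idx]

print(raised)
print(wrongly_selected)
```

The second case is the dangerous one: the selection runs without error but no longer matches the lineage channels.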

To Reproduce

It's easy to reproduce with a built-in example:

from PyCytoData import DataLoader

exprs = DataLoader.load_dataset(dataset = "levine32", preprocess=True)
exprs = exprs.subset(channels = exprs.lineage_channels)
exprs.run_dr_methods(methods = ["UMAP"])

Expected behavior
The indices should be updated internally whenever the channels are subset, so that this kind of bug cannot occur. This should be resolved quickly and released as a patch because it is an actual bug rather than a documentation issue.

Your environment:

  • OS: Linux
  • Python Version: 3.11
  • Python Distribution: Anaconda
  • PyCytoData version: 0.1.2
  • PyCytoData distribution: Conda

Also, this bug persists when we manually set the lineage channels at a later stage. This needs to be fixed in the setter.
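One possible shape for the fix, sketched on a toy class rather than on the real code: `ToyCytoData` and its attributes are hypothetical stand-ins and are not PyCytoData's actual implementation. The idea is to derive the indices from the channel names inside the setter, and to have subsetting go through that same setter, so the indices cannot go stale.

```python
import numpy as np


class ToyCytoData:
    """Hypothetical stand-in used only to illustrate the idea."""

    def __init__(self, expression_matrix, channels, lineage_channels):
        self.expression_matrix = np.asarray(expression_matrix)
        self.channels = np.asarray(channels)
        # Assigning here goes through the setter below.
        self.lineage_channels = lineage_channels

    @property
    def lineage_channels(self):
        return self._lineage_channels

    @lineage_channels.setter
    def lineage_channels(self, names):
        names = np.asarray(names)
        missing = names[~np.isin(names, self.channels)]
        if missing.size > 0:
            raise ValueError(f"Unknown lineage channels: {missing.tolist()}")
        self._lineage_channels = names
        # Always recompute the indices against the current channels.
        self._lineage_channels_indices = np.flatnonzero(
            np.isin(self.channels, names)
        )

    def subset(self, channels):
        keep = np.isin(self.channels, channels)
        kept_channels = self.channels[keep]
        # Keep only the lineage channels that survive the subset.
        kept_lineage = self._lineage_channels[
            np.isin(self._lineage_channels, kept_channels)
        ]
        # The constructor calls the setter, which refreshes the indices.
        return ToyCytoData(
            self.expression_matrix[:, keep], kept_channels, kept_lineage
        )


data = ToyCytoData(
    np.arange(8).reshape(2, 4),
    ["Time", "CD3", "CD4", "DNA1"],
    ["CD3", "CD4"],
)
sub = data.subset(["CD3", "CD4"])
print(sub._lineage_channels_indices)
```

Because both the subsetting path and manual assignment go through the setter, the two scenarios described in this issue would be covered by the same code.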