saezlab/liana-py

ZeroDivisionError: float division by zero

anke-king opened this issue · 10 comments

Describe the bug
Following your tutorial with my own data:
li.mt.rank_aggregate.by_sample(
    adata,
    groupby=groupby,
    sample_key=sample_key,  # sample key by which we wish to loop
    use_raw=False,
    verbose=True,  # use 'full' to show all verbose information
    n_perms=100,  # reduce permutations for speed
    return_all_lrs=True,  # return all LR values
)
I get a zero division error.
To Reproduce
If possible please provide a minimal reproducible example.
For example, a downsampled version of your AnnData object.

Screenshots
File ~/miniconda3/lib/python3.9/site-packages/liana/method/sc/_liana_pipe.py:300, in _get_lr(adata, resource, groupby_pairs, relevant_cols, mat_mean, mat_max, de_method, base, verbose)
298 dedict[label]['zscores'] = temp.layers['scaled'].mean(axis=0)
299 if logfc_flag:
--> 300 dedict[label]['logfc'] = _calc_log2fc(adata, label)
301 if isinstance(mat_max, np.float32): # cellchat flag
302 dedict[label]['trimean'] = _trimean(temp.X / mat_max)

File ~/miniconda3/lib/python3.9/site-packages/liana/method/sc/_liana_pipe.py:342, in _calc_log2fc(adata, label)
340 # subject and rest means
341 subj_means = subject.layers['normcounts'].mean(0).A.flatten()
--> 342 rest_means = rest.layers['normcounts'].mean(0).A.flatten()
344 # log2 + 1 transform
345 subj_log2means = np.log2(subj_means + 1)

File ~/miniconda3/lib/python3.9/site-packages/scipy/sparse/_base.py:1191, in spmatrix.mean(self, axis, dtype, out)
1189 # axis = 0 or 1 now
1190 if axis == 0:
-> 1191 return (inter_self * (1.0 / self.shape[0])).sum(
1192 axis=0, dtype=res_dtype, out=out)
1193 else:
1194 return (inter_self * (1.0 / self.shape[1])).sum(
1195 axis=1, dtype=res_dtype, out=out)

ZeroDivisionError: float division by zero

variables used:
sample_key = 'sample' (sample key)
condition_key = 'cell_type' (2 cats: malignant/healthy)
groupby = 'day' (7 cats: 7 days)

Hi @anke-king,

Is it possible that you have unexpected values in the AnnData object? For example, zeroes or NaNs?

Hi, thanks for your reply.
I checked, and I don't have NaNs or zeros.

@anke-king apologies, I meant negative values, not zeroes.

I do have negative values, as the data is normalized, scaled and transformed. Does your tool need raw data? Because in the tutorial you also do standard preprocessing steps.

Hi @anke-king, notice that in the tutorial I use the log-normalized counts, not the scaled ones :)

Thanks for your help! I changed my code and now use log-transformed values (so no NaNs and no negative values) instead of scaled ones; however, I still get the same error.

I even tried adding 1e-6 to each value and using only highly variable genes, because I thought the algorithm might divide by zero somewhere, but the float division by zero error still occurs.

Hi @anke-king,

If you share a reprex, I can test it on my end.

Ideally, send a subset of your data together with the code you use to run LIANA.

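Hey, giving my two cents here as I ran into this issue today. Make sure that each of your sample_key annotations has more than one groupby annotation (e.g., cell types). Otherwise, when computing log2fc for a sample with a single groupby label, rest will be an empty AnnData, and taking the mean over its zero rows divides by self.shape[0] == 0, which is the ZeroDivisionError in the traceback.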
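
To illustrate the failure mode described above, here is a minimal pure-Python sketch. It is not LIANA's or SciPy's code: column_means is a hypothetical helper that only mimics the `1.0 / self.shape[0]` scaling visible in the traceback, and the toy cells are made up. It also shows why adding 1e-6 to the values cannot help, since the divisor is the number of rows, not an expression value.

```python
def column_means(rows, n_cols):
    """Column means computed by scaling with 1.0 / n_rows, then summing."""
    scale = 1.0 / len(rows)  # raises ZeroDivisionError when rows is empty
    return [sum(row[j] * scale for row in rows) for j in range(n_cols)]


# Toy sample in which every cell carries the same groupby label.
cells = [("day0", [1.0, 2.0]), ("day0", [3.0, 4.0])]
label = "day0"

subject = [values for group, values in cells if group == label]
rest = [values for group, values in cells if group != label]  # empty list

subject_means = column_means(subject, 2)

raised = False
try:
    column_means(rest, 2)
except ZeroDivisionError:
    # Same class of error as in the issue: there are no "rest" cells.
    raised = True

print(subject_means, raised)
```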

Maybe a good idea to check this beforehand and raise an error to the user?
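
A pre-flight check along these lines could look like the sketch below. It is only an illustration under assumptions, not something LIANA provides: samples_with_single_group is a hypothetical helper name, and the toy labels stand in for the per-cell annotation columns.

```python
from collections import defaultdict


def samples_with_single_group(sample_labels, group_labels):
    """Return the samples whose cells all share a single groupby label."""
    groups_per_sample = defaultdict(set)
    for sample, group in zip(sample_labels, group_labels):
        groups_per_sample[sample].add(group)
    return sorted(s for s, groups in groups_per_sample.items() if len(groups) < 2)


# Toy per-cell annotations: sample "s2" only contains cells from "day3".
sample_labels = ["s1", "s1", "s1", "s2", "s2"]
group_labels = ["day0", "day1", "day1", "day3", "day3"]

bad_samples = samples_with_single_group(sample_labels, group_labels)
print(bad_samples)
```

With an AnnData object, the two arguments would be the per-cell columns, e.g. adata.obs[sample_key] and adata.obs[groupby]. Any sample that is returned could then be dropped or reported before calling by_sample.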

Hi @nrclaudio,

Thanks. I will make sure to check for this :)