lilab-bcb/pegasus

[ERROR] ValueError: Buffer dtype mismatch, expected 'const float' but got 'unsigned int'

Closed this issue · 2 comments

Hi!

I've been using Pegasus, and from time to time I run into the following error when calling highly_variable_features:

Traceback (most recent call last):
  File "Pegasus-Pipeline.py", line 249, in <module>
    pg.highly_variable_features(data_pre, batch="Channel", n_top=5000)
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasusio/decorators.py", line 12, in wrapper_timer
    result = func(*args, **kwargs)
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/hvf_selection.py", line 292, in highly_variable_features
    select_hvf_pegasus(data, batch, n_top=n_top, span=span) 
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/hvf_selection.py", line 54, in select_hvf_pegasus
    estimate_feature_statistics(data, batch)
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasusio/decorators.py", line 12, in wrapper_timer
    result = func(*args, **kwargs)
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/hvf_selection.py", line 26, in estimate_feature_statistics
    ncells, means, partial_sum = calc_stat_per_batch(data.X, data.obs[batch].values)
  File "/sc/arion/work/t/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/utils.py", line 114, in calc_stat_per_batch  
    return calc_stat_per_batch_sparse(X.shape[0], X.shape[1], X.data, X.indices, X.indptr, nbatch, codes)
  File "ext_modules/fast_utils.pyx", line 108, in pegasus.cylib.fast_utils.__pyx_fuse_0_0calc_stat_per_batch_sparse
ValueError: Buffer dtype mismatch, expected 'const float' but got 'unsigned int'

After re-running the script a few times, it eventually works. Any idea why this occurs only intermittently?

Note that I submitted pull request #281, which addresses the occasional dtype errors in the calc_stat_per_batch_sparse function by ensuring that the matrix X is float32.
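
For illustration, the kind of cast described above can be sketched as follows. This is not the code from the pull request; the helper name ensure_float32 and the toy matrix are hypothetical, and the sketch only assumes that X is a SciPy sparse matrix whose values may be stored as unsigned integers.

```python
import numpy as np
from scipy.sparse import csr_matrix


def ensure_float32(X):
    """Return X with float32 values, casting only when the dtype differs.

    Hypothetical helper for illustration; not part of Pegasus.
    """
    if X.dtype != np.float32:
        X = X.astype(np.float32)
    return X


# A toy count matrix stored as unsigned integers, as raw counts often are.
counts = csr_matrix(np.array([[0, 2, 0], [1, 0, 3]], dtype=np.uint32))

# After the cast, X.data is float32, which is what the traceback says
# the compiled routine expects ('const float').
X = ensure_float32(counts)
```

A cast like this avoids the crash, but it would not change which matrix ends up being passed in, so it is at best a workaround at the call site.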

Hi @brauliovaldebenitomaturana and @hvbakel ,

Thanks for reporting this issue. Please see my comments in #281.

To be brief, this issue happens only sometimes because the aggregate_matrices() function had no logic for choosing which count matrix should be the default one in the resulting data object when the source contains multiple count matrices. I've added such logic in this PR. Please upgrade your pegasusio package to version 0.8.2.
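
As a rough sketch of what a deterministic choice can look like (this is not the actual pegasusio implementation; the function choose_default_key, the preference order, and the key names "X", "raw.X", and "counts" are assumptions made for illustration), one option is to check a fixed preference list first and fall back to sorted order:

```python
def choose_default_key(matrix_keys, preferred=("X",)):
    """Pick one key deterministically from a collection of matrix keys.

    Hypothetical helper for illustration; not part of pegasusio.
    """
    keys = set(matrix_keys)
    if not keys:
        raise ValueError("no matrices to choose from")
    # Honor the stated preference order first.
    for key in preferred:
        if key in keys:
            return key
    # Fall back to sorted order so the result never depends on
    # the order in which the keys happen to be iterated.
    return sorted(keys)[0]


default_key = choose_default_key({"raw.X", "X"})
fallback_key = choose_default_key({"raw.X", "counts"})
```

With a rule like this, the same input always yields the same default matrix, regardless of how the keys are ordered on a given run.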

Sincerely,
Yiming