morris-lab/CellOracle

Bug or common? How to find the parameters for n_grid and min_mass?

Closed this issue · 2 comments

Dear Kenji,
Thanks for developing this awesome software. I really enjoy it, and the workflow works very well.

But with my own data, I am having trouble finding suitable values for n_grid and min_mass.

As described:

n_grid: Number of grid points. 
min_mass: Threshold value for the cell density. The appropriate values for these parameters depend on the data. Please find appropriate values using the helper functions below.

My commands:

# n_grid = 40 is a good starting value.
n_grid = 40 
oracle.calculate_p_mass(smooth=0.8, n_grid=n_grid, n_neighbors=200)
# Search for best min_mass.
oracle.suggest_mass_thresholds(n_suggestion=12)

[Screenshot: suggested min_mass thresholds from oracle.suggest_mass_thresholds(n_suggestion=12)]

None of the suggested min_mass values is lower than 0.01, which is the default value in the oracle.calculate_mass_filter() function. Is it reasonable to have such high min_mass thresholds? Could you please give me some hints on how to interpret the results?

Thanks in advance!
Zhuang

@Zhuang-Bio

Thank you for using celloracle!

oracle.calculate_p_mass estimates the local cell density at each grid point, and oracle.calculate_mass_filter uses this information to remove grid points that contain no cells. min_mass is the threshold value for this filtering step.

The scale of this value is affected by both the number of cells and the scale of the embedding axes. It can therefore vary with the data and the dimensionality reduction algorithm, so getting a suggested min_mass value of more than 10 is no problem.
I often see such relatively large values.
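
To make the scale dependence concrete, here is a minimal NumPy sketch. It is not CellOracle's implementation: the grid_density function, the toy data, and the 5% threshold are all hypothetical, and the sketch only assumes that the density at a grid point is a sum of Gaussian kernel weights over the nearest cells, with a kernel width tied to the grid spacing. Under that assumption, shrinking the embedding axes tenfold raises every density value about tenfold, so a sensible min_mass shifts with it.

```python
import numpy as np

def grid_density(embedding, n_grid=40, smooth=0.8, n_neighbors=200):
    """Toy cell-density estimate on a regular 2D grid (illustration only)."""
    mins, maxs = embedding.min(axis=0), embedding.max(axis=0)
    axes = [np.linspace(lo, hi, n_grid) for lo, hi in zip(mins, maxs)]
    gx, gy = np.meshgrid(*axes)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    # Kernel width follows the grid spacing, and so the embedding scale.
    sigma = smooth * np.mean([ax[1] - ax[0] for ax in axes])
    # Distance from every grid point to every cell; keep the k nearest.
    dists = np.linalg.norm(grid[:, None, :] - embedding[None, :, :], axis=2)
    k = min(n_neighbors, embedding.shape[0])
    nearest = np.sort(dists, axis=1)[:, :k]
    # Normalized Gaussian weights: the 1/sigma factor carries the axis scale.
    weights = np.exp(-0.5 * (nearest / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return grid, weights.sum(axis=1)

rng = np.random.default_rng(0)
cells = rng.normal(size=(500, 2))  # hypothetical 2D embedding

_, mass = grid_density(cells, n_grid=20, n_neighbors=50)
_, mass_small = grid_density(cells * 0.1, n_grid=20, n_neighbors=50)

# Same cells, axes shrunk tenfold: densities grow by about a factor of ten.
ratio = mass_small.max() / mass.max()

# A threshold chosen relative to the observed range adapts to the scale.
min_mass = 0.05 * mass.max()  # hypothetical choice, for illustration
keep = mass >= min_mass       # grid points that would survive the filter
print(f"ratio={ratio:.2f}, kept {keep.sum()} of {keep.size} grid points")
```

In practice the values shown by oracle.suggest_mass_thresholds already reflect the scale of your own data, so picking min_mass from those plots is the intended route.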

I should have explained this characteristic in the celloracle tutorial. I will add this.
Thank you for your feedback!

Kenji

Hi Kenji, thanks so much for your quick response. That makes it clear to me.