gao-lab/Cell_BLAST

Could we use ImmGen datasets as a reference?

Closed this issue · 1 comment

Dear Developers,
Thank you for Cell BLAST.
I recently used SingleR and found its built-in ImmGen reference very useful. Could we use that dataset in Cell BLAST?

Thank you for your response.

Thanks for your interest in Cell BLAST!

In principle, Cell BLAST can be used with any scRNA-seq reference data. You can read custom reference data with functions such as read_table, from_anndata and from_loom in the Python package, and then train your own model on that reference (this tutorial might help).

However, the ImmGen reference in SingleR (SingleR::ImmGenData) seems to be microarray data, which may not match the probability model used by Cell BLAST. The data size (830 cells) is also too small for the Cell BLAST model to train properly, so using Cell BLAST with ImmGen is not recommended. In this case I would recommend using a larger scRNA-seq dataset (> 3,000 cells) as the reference.
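
If it helps, the two concerns above can be checked before training with a small script. This is only a rough sketch using NumPy, not part of Cell BLAST: the function name check_reference and the MIN_CELLS constant are made up for illustration, the 3,000-cell threshold is just the rule of thumb mentioned above, and the "non-negative integer counts" test assumes that the mismatch with microarray data comes from the model expecting count-like scRNA-seq values.

```python
import numpy as np

# Rule-of-thumb threshold taken from the reply above; not a hard limit in Cell BLAST.
MIN_CELLS = 3000


def check_reference(expr, min_cells=MIN_CELLS):
    """Return a list of problems found in a cells x genes expression matrix.

    Hypothetical helper for illustration only. An empty list means the matrix
    passes both checks (enough cells, values look like raw counts).
    """
    expr = np.asarray(expr, dtype=float)
    if expr.ndim != 2:
        return ["expected a 2-D cells x genes matrix"]

    problems = []
    n_cells = expr.shape[0]
    if n_cells < min_cells:
        problems.append(f"only {n_cells} cells; fewer than {min_cells}")

    # Assumption: raw scRNA-seq data are non-negative integer counts, whereas
    # microarray intensities are continuous values.
    if np.any(expr < 0) or not np.allclose(expr, np.round(expr)):
        problems.append("values are not non-negative integer counts")
    return problems


# Simulated stand-ins, not real data: a small continuous matrix shaped like the
# 830-sample ImmGen reference, and a larger count matrix.
rng = np.random.default_rng(0)
microarray_like = rng.normal(8.0, 2.0, size=(830, 50))
scrna_like = rng.poisson(1.0, size=(3500, 50))

print(check_reference(microarray_like))  # reports both problems
print(check_reference(scrna_like))       # reports none
```

A reference that fails either check is probably better replaced by a larger scRNA-seq dataset than forced through the model.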