Teichlab/celltypist

SCT Object Compatibility with celltypist

katkatrach opened this issue · 2 comments

Hello,

I am currently following a celltypist tutorial that runs on an .h5ad object. My original object is a Seurat object, so I converted it first to h5Seurat and then to h5ad, and loaded the h5ad file with scanpy.

Here are some attributes that print when I look at the object:
object.shape
(139380, 7630)
object.X
[[-0.18042335, 2.62392346, -0.24814961, ... etc
object.raw.X
<Compressed Sparse Row sparse matrix of dtype 'float64' with 177847606 stored elements and shape (139380, 31227)>
object.var_names
Index(['LINC01409', 'SAMD11', 'HES4', 'ISG15', etc

I am trying to annotate with the following
predictions = celltypist.annotate(object, model = 'Immune_All_Low.pkl', majority_voting = True, mode = 'best match')
I have also tried transposing the input. I get the same messages each time. First, a notice that it will use raw.X:
👀 Invalid expression matrix in '.X', expect log1p normalized expression to 10000 counts per cell; will use '.raw.X' instead
⚠️ Warning: invalid expression matrix, expect ALL genes and log1p normalized expression to 10000 counts per cell. The prediction result may not be accurate

I use Seurat SCTransform, which does both normalization and scaling.

and then the following:
ValueError: 🛑 No features overlap with the model. Please provide gene symbols

Is there anything I should do to make sure the object is in the correct format? I am stuck on what to do with my object, and would prefer not to re-scale my matrix.

Thank you!

I fixed the issue by running scanpy normalization and scaling on my object!

@katkatrach, glad you resolved it :) CellTypist needs log1p-normalized expression scaled to 10000 counts per cell as input, which is incompatible with SCT because SCT relies on Pearson residuals for normalization.
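
For anyone landing here with the same warning, here is a minimal sketch of what "log1p normalized expression to 10000 counts per cell" means numerically, and why scaled values like those in the .X shown above do not qualify. It uses plain NumPy on a toy matrix. The helper names `normalize_log1p` and `looks_log1p_normalized` are made up for illustration, and the check only approximates the idea behind the warning; it is not CellTypist's own code. In scanpy, the usual equivalent would be `sc.pp.normalize_total(adata, target_sum=1e4)` followed by `sc.pp.log1p(adata)`, applied to raw counts.

```python
import numpy as np

def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to target_sum total counts, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * target_sum)

def looks_log1p_normalized(X, target_sum=1e4, tol=1.0):
    """Rough check (an assumption, not CellTypist's implementation):
    no negative values, and undoing log1p gives ~target_sum per cell."""
    X = np.asarray(X, dtype=float)
    if X.min() < 0:
        return False
    per_cell = np.expm1(X).sum(axis=1)
    return bool(np.all(np.abs(per_cell - target_sum) <= tol))

# Toy raw counts: 2 cells x 3 genes (illustrative values only)
raw_counts = np.array([[0, 3, 7], [10, 0, 5]])
normalized = normalize_log1p(raw_counts)

# Toy values mimicking a scaled matrix, which contains negatives
scaled_like = np.array([[-0.18, 2.62, -0.25], [0.4, -1.1, 0.9]])

print(looks_log1p_normalized(normalized))   # True
print(looks_log1p_normalized(scaled_like))  # False
```

Scaled or residual-based values are centered around zero, so they contain negatives and cannot be mapped back to 10000 counts per cell, which is consistent with the warning shown above.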