HelenaLC/muscat

pbHeatmap y %in% assayNames(x) is not TRUE

Closed this issue · 8 comments

Dear Helena. The package is very nice to use. May i ask a question about plotting? I have 5 different groups in the dataset and used contrast to select 2 groups for comparation. But when i try to plot the result with heatmap, i got an error.

pbHeatmap(sce, res, g = "IL10RB")
Error in .check_arg_assay(x, assay) : y %in% assayNames(x) is not TRUE

I think it may be due to i have 5 groups in sce but only 2 groups in res? Thank you.

Could you perhaps print the SCE object going into this step? Thanks

Thank you, Helena. Initially the object has only assay 'X', holding gene count. I duplicated this assay, called it 'counts'.

sce
class: SingleCellExperiment
dim: 23648 139333
metadata(12): disease_colors hvg ... umap experiment_info
assays(2): X counts
rownames(23648): MIR1302-2HG AL627309.1 ... AC233755.1 AC240274.1
rowData names(5): n_cells highly_variable means dispersions dispersions_norm
colnames(139333): H1_AAACCTGAGATACACA H1_AAACCTGAGTCGTACT ... TRICOV20_F_TTTGGTTGTAGAGACC
TRICOV20_F_TTTGTTGGTCGCATTA
colData names(3): cluster_id sample_id group_id
reducedDimNames(3): X_pca X_pca_harmony X_umap
mainExpName: NULL
altExpNames(0):

Alright, so this answers it. Q: does X contain transformed (expression-like) values or raw counts? In any case, you wouldn’t nor need to duplicate the data. Instead, you can use assayNames(sce)<-… to rename. Also, pbHeatmap() (& nearly all plotting functions) operate on expression (not counts/intensity for CyTOF/FACS). So the default is using assay=“exprs”, which it cannot find, hence the error. You have two options: rename the corresponding assay to “exprs” or specify argument ‘assay’ when plotting.

Thanks a lot. It works now. BTW, when we use the pbHeatmap function, pbHeatmap(sce, res, g = "'), sometimes the gene is not found. Are only significant genes to be shown or any genes in the limma test?

Also, if i use assay='X', does the heatmap use count data or normalised log transformed data for plotting? I assume the latter shoudl be used.

I don’t know what your X assay contains, so can’t tell. It will use whatever is in there. Checking range(assay(spe, “X”)) should probably answer it. But knowing what the upstream pipeline does is in any case something worth knowing…

Thank you, Helena. The range is between 0 and 9. I directly exported an anndata object with raw count to sce, which were normalized and logarithmized raw gene expression.

For the DS test, the sum of raw counts are defaulted. Is there a step to normalise the raw count to library size before testing?

Yes, that’s correct. The best performant method (we found) uses raw counts. Most visualizations, however, need expression-like data. So, ideally, you’d have two assays: one of “counts” and one of “logcounts”.