Using selected cell type markers can't improve the accuracy

Question


Opened this issue 4 months ago · 0 comments

Hi,
Thank you for your great work!
I have run into a few questions while using BayesPrism and would like your advice. I have thousands of bulk RNA-seq samples and two scRNA-seq samples as the reference. To evaluate how accurately the TME composition is estimated, we also measured the ratio of Cd8+ T and Cd4+ T cells in ~30 samples by immunostaining and use it as the gold standard.

I compared three gene sets: (1) all genes, (2) markers from the Seurat function FindAllMarkers(), and (3) a subset of markers that are expressed at nearly 0 in the other cell types. Using all genes seems to work better than the other two, which is strange. I am confused about why more informative genes don't help. Here are the correlation and RMSE for the different gene sets.
I also noticed that when I use selected genes, BayesPrism computes the expression matrix only for the chosen genes. Is there a parameter that lets me estimate cell proportions from the selected genes while still obtaining the whole expression matrix?
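
For reference, the correlation and RMSE comparison mentioned above can be sketched in base R as follows. This is only a minimal sketch: `gold` and `est` are made-up toy vectors, not values from my data, and `rmse()` and `evaluate_fractions()` are hypothetical helper names, not BayesPrism functions. In practice `est` would be the estimated fraction of one cell type (e.g. Cd8+ T cell) for the ~30 immunostained samples, computed once per gene set.

```r
# Toy values only: hypothetical fractions, NOT results from this dataset.
gold <- c(0.10, 0.25, 0.05, 0.30, 0.15)  # e.g. immunostaining-based ratio
est  <- c(0.12, 0.20, 0.07, 0.33, 0.11)  # e.g. deconvolution estimate, same samples

# Root mean squared error between an estimate and the gold standard.
rmse <- function(estimate, truth) {
  stopifnot(length(estimate) == length(truth))
  sqrt(mean((estimate - truth)^2))
}

# Summarise agreement for one gene set: Pearson and Spearman correlation plus RMSE.
evaluate_fractions <- function(estimate, truth) {
  c(pearson  = cor(estimate, truth, method = "pearson"),
    spearman = cor(estimate, truth, method = "spearman"),
    rmse     = rmse(estimate, truth))
}

res <- evaluate_fractions(est, gold)
print(round(res, 3))
```

Calling `evaluate_fractions()` once per gene set (all genes, FindAllMarkers() markers, strict markers) gives directly comparable numbers. With only ~30 samples, the Spearman value is a useful check that a few extreme samples are not driving the Pearson correlation.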

The cell.type.label values and the number of cells per type are shown below:
```
table(sc$cell.type.label)

        B cell    Cancer cell    Cd4+ T cell    Cd8+ T cell            DCs       DNT cell     Epithelial         M-MDSC
          2057           2062           1921           3590            464            174             28           2153
Macrophage(AM) Macrophage(MM)        NK cell       PMN-MDSC
          1072           1005            770           3436
```
The corr.plot of the cell types:
type_cor_phi.pdf