cozygene/bisque

How to interpret Bisque results

ConcettaDe4 opened this issue · 7 comments

Hi!
I am using Bisque to deconvolve bulk RNA-seq data. I applied the marker-based decomposition approach using a list of marker genes without fold changes, calling the function:

BisqueRNA::MarkerBasedDecomposition(bulk.eset, markers, weighted=F)

I obtained the following result:

              sample1   sample2   sample3
cell_type1     -2.29      1.20     -0.14
cell_type2      1.1       0.22     -3.14

I do not know how to interpret the result I obtained.
For example, what does a negative value mean? If a sample has negative values for both cell types, what does that indicate?

Thank you for your help.

Bye,

Concetta

Hi Concetta,

Thanks for your interest in our method! The marker-based decomposition returns values that indicate relative cell abundances, and these can only be compared across samples within a single cell type, not between cell types. In your example, the method estimates that sample1 has less of cell_type1 than the other two samples but more of cell_type2 than the other two samples.

The numbers themselves don't have an interpretable meaning, but they can be compared to other samples to estimate relative abundance of a specific cell type.
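To make the "compare within a cell type" reading concrete, here is a small Python sketch (not part of Bisque, which is an R package). It takes the values from the table above and orders the samples separately for each cell type. The helper name `rank_samples` is made up for this illustration.

```python
# Values copied from the example table in this thread.
estimates = {
    "cell_type1": {"sample1": -2.29, "sample2": 1.20, "sample3": -0.14},
    "cell_type2": {"sample1": 1.1, "sample2": 0.22, "sample3": -3.14},
}


def rank_samples(row):
    """Order sample names from lowest to highest estimated relative abundance.

    Hypothetical helper: only the ordering within one cell type is used,
    never the raw values and never a comparison between cell types.
    """
    return sorted(row, key=row.get)


rankings = {cell_type: rank_samples(row) for cell_type, row in estimates.items()}

for cell_type, order in rankings.items():
    # Print the samples of this cell type in increasing order of the estimate.
    print(cell_type + ": " + " < ".join(order))
```

Only the ordering within each row is meaningful here; the sign and size of an individual number are not interpreted on their own.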

Please let me know if you have any additional questions.

Thanks,
Brandon

Hi!
Thank you for your reply, now it is clear :-).

I have an additional question about the selection of marker genes. How many marker genes should I use, and what should their expression look like?

Thank you,

Concetta

Hi,

There is no definite answer for how marker genes should be selected. If single-cell data is available, you could start with the FindAllMarkers function in Seurat to get a set of marker genes. If not, there are databases, such as PanglaoDB, that provide marker genes. In general, the genes selected should have the largest variance between cell types but have relatively stable expression across individuals.
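As a rough illustration of that last criterion, the Python sketch below scores each gene by the ratio of its variance between cell types to its variance across individuals. This is only one possible heuristic, not something Bisque or Seurat computes; the data, the gene names, and the `marker_score` function are all invented for the example.

```python
from statistics import mean, pvariance

# Hypothetical toy data: expression[gene][cell_type] holds one average
# expression value per individual. These numbers are made up.
expression = {
    # Differs strongly between cell types, stable across individuals.
    "geneA": {"ct1": [10.0, 11.0, 9.5], "ct2": [1.0, 1.2, 0.9]},
    # Nearly identical in both cell types.
    "geneB": {"ct1": [5.0, 4.8, 5.1], "ct2": [5.2, 5.1, 4.9]},
    # Differs between cell types but is unstable across individuals.
    "geneC": {"ct1": [10.0, 2.0, 15.0], "ct2": [1.0, 6.0, 0.5]},
}


def marker_score(per_cell_type, eps=1e-8):
    """Between-cell-type variance divided by within-cell-type variance.

    Hypothetical score: higher means the gene separates cell types well
    while staying stable across individuals.
    """
    # Variance of the per-cell-type means (spread between cell types).
    between = pvariance([mean(values) for values in per_cell_type.values()])
    # Variance across individuals, averaged over cell types.
    within = mean([pvariance(values) for values in per_cell_type.values()])
    return between / (within + eps)


scores = {gene: marker_score(by_ct) for gene, by_ct in expression.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for gene in ranked:
    print(gene, round(scores[gene], 2))
```

With this toy data, geneA ranks first, while geneC is penalized for its instability across individuals and geneB for barely differing between cell types.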

Thanks,
Brandon

Hi,
Thank you for your help.
I used bisqueMarkers to deconvolve bulk RNA-seq data, using on average 15 marker genes for each cell type. After the deconvolution I got the following message:

Estimating proportions for 2 cell types in 21 samples
Filtered 24727 zero variance genes.
Using 6 genes for cell type 1; 
83% of 6 marker genes correlate positively with PC1 for cell type 1
Using 11 genes for cell type 2; 
100% of 11 marker genes correlate positively with PC1 for cell type 2
Finished estimating cell type proportions using PCA

I was wondering why Bisque selects a subset of the provided genes and how it chooses which marker genes to use.
In addition, what does the sentence “100% of 11 marker genes correlate positively with PC1 for cell type neuron” mean? Does it suggest that the selected markers explain the deconvolved cell types very well?
And if only 83% of the markers correlate positively, is the result still reliable?

Thank you in advance for your help.

Concetta

The marker-based analysis runs principal component analysis on the expression matrix for the provided marker genes and then selects only the genes that correlate positively with the first principal component to estimate relative cell type abundances (under the assumption that the first principal component captures cell type variability). This percentage does not measure the reliability of the estimates; it just indicates how many genes the algorithm considers informative. Note that the first principal component could also capture batch effects or other significant confounding factors if they have not already been addressed when preprocessing the bulk RNA data.
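The idea described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not Bisque's actual R implementation: the toy marker matrix is invented, PC1 is found by power iteration, and the sign of PC1 (which is arbitrary in PCA) is oriented so that most loadings are positive. Zero-variance genes are assumed to have been filtered out already, as in the log message above.

```python
from math import sqrt


def zscore(values):
    """Center and scale one gene across samples (assumes non-zero variance)."""
    m = sum(values) / len(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def first_pc_scores(rows, n_iter=200):
    """PC1 sample scores for z-scored rows (genes x samples), via power iteration."""
    n_genes, n_samples = len(rows), len(rows[0])
    # Gene-by-gene covariance matrix of the standardized data.
    cov = [[sum(a * b for a, b in zip(rows[i], rows[j])) / n_samples
            for j in range(n_genes)] for i in range(n_genes)]
    v = [1.0] * n_genes
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Assumption: flip the arbitrary sign so that most loadings are positive.
    if sum(1 for x in v if x > 0) < n_genes / 2:
        v = [-x for x in v]
    # Project every sample onto the first principal component.
    return [sum(v[i] * rows[i][s] for i in range(n_genes)) for s in range(n_samples)]


# Hypothetical marker expression for one cell type: gene -> value per sample.
markers = {
    "marker1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker2": [2.0, 4.0, 5.0, 8.0, 11.0],
    "marker3": [1.0, 1.5, 3.5, 3.0, 5.5],
    "marker4": [5.0, 4.0, 4.0, 2.0, 1.0],  # moves against the other markers
}

z = {gene: zscore(values) for gene, values in markers.items()}
pc1 = first_pc_scores(list(z.values()))
correlations = {gene: pearson(values, pc1) for gene, values in z.items()}
kept = [gene for gene, r in correlations.items() if r > 0]
percent_positive = 100 * len(kept) / len(markers)

print(f"{percent_positive:.0f}% of {len(markers)} marker genes correlate positively with PC1")
print("kept:", kept)
```

In this toy example three of the four markers move together and correlate positively with PC1, while the fourth does not. The percentage describes how consistently the markers co-vary, which is a different question from whether PC1 reflects cell type abundance rather than, say, a batch effect.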

DRLMQ commented

Hi!

Thanks for your great tool. I'm using it to estimate cell type proportions in bulk-like data, i.e. Spatial Transcriptomics. It works perfectly well on human cell types with ReferenceBasedDecomposition.
In addition, I'm looking for Plasmodium falciparum cells and trying to deconvolve them using the same command. In that case it raises an error because I only have one individual (which is not really true, because here each parasite is a different individual). I decided to try MarkerBasedDecomposition instead, and it works without problems.

Now I have relative abundances of the cell types in every bulk-like sample. I would like to know if there is a way to tell whether one of those samples is truly negative. Obviously I cannot use 0 as a marker of the absence of parasites. Do you think I could somehow use negative values as a marker of the absence of parasites?

I hope I am clear.

Thank you in advance for your help,
DLRMQ.

Hi @DRLMQ,

For your first point, you can use the reference-based decomposition if you edit the ExpressionSet so that it has proper 'individual' labels that separate the parasites.

For your second point, the relative abundances cannot be used to infer the absence of a cell type, only whether some samples have less of it than others.

Let me know if you have any other questions.

Thanks,
Brandon