SydneyBioX/scMerge

Integration with DE analysis

crotoc opened this issue · 7 comments

Thanks for this great package, which lets me combine data sets from completely different sources. I have a question: after correcting for batch effects, how can I do the DE analysis? Or is DE analysis after batch effect correction fundamentally unsuitable? Please give me some advice. Thanks!

Many DE analysis methods require count data, and I noticed that the corrected results contain many negative numbers. It seems this data doesn't fit the input format those methods require.

If I understand correctly, the results after correction are still on the log scale and can be used in limma for the DE analysis? Please let me know if I am wrong. Thanks!

Hi,

Thank you for your interest in scMerge! The output of scMerge can be interpreted as data on a log-transformed scale, with a very small percentage of negative values caused by a scaling step within the algorithm. You could replace the negative values with zeros before performing DE. DE methods designed for log-scale data, such as limma, are suitable for the scMerge output.

Yingxin
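The advice above (set negative values to zero, then use a log-scale DE method such as limma) can be sketched roughly as follows. This is a minimal illustration, not the scMerge authors' workflow: the matrix `corrected` is simulated here as a stand-in for the corrected assay you would extract from your own object, and `group` is a hypothetical condition label. Using `trend = TRUE` in `eBayes` (the limma-trend approach for log-expression values) is an assumption about what suits your data.

```r
library(limma)

set.seed(1)

# Toy stand-in for a batch-corrected, log-scale matrix (genes x cells).
# In practice this would be the corrected assay taken from your own object.
n_genes <- 200
n_cells <- 60
corrected <- matrix(rnorm(n_genes * n_cells, mean = 1, sd = 1),
                    nrow = n_genes,
                    dimnames = list(paste0("gene", seq_len(n_genes)),
                                    paste0("cell", seq_len(n_cells))))

# Hypothetical condition labels for the cells.
group <- factor(rep(c("A", "B"), each = n_cells / 2))

# Spike a shift into the first 10 genes so there is something to detect.
corrected[1:10, group == "B"] <- corrected[1:10, group == "B"] + 2

# Replace the negative values with zeros, as suggested above.
corrected[corrected < 0] <- 0

# Fit a per-gene linear model and moderate the variances.
design <- model.matrix(~ group)
fit <- lmFit(corrected, design)
fit <- eBayes(fit, trend = TRUE)

# Top genes for the B versus A comparison.
top <- topTable(fit, coef = "groupB", number = 10)
print(top)
```

If the design in your experiment is more complex (for example, conditions that are confounded with batch), the design matrix would need to reflect that; the one-factor design here is only for illustration.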

Thanks very much! Will try it right now!
I have another question. I am using prenatal brain scRNA-seq and adult brain scRNA-seq data for the analysis. The two data sets come from two different labs and are very different. After reading your paper, my understanding is that scMerge will identify pseudo-replicates between the two data sets. I don't expect any exactly matching cell types between them, but some may be similar. Can scMerge handle this case correctly? When running scMerge, a window with a plot comes up showing two pairs between the two data sets. Does that mean scMerge makes sense for my application? Thanks!

Hi,

The network plot with two pairs connected indicates that scMerge has identified two pairs of mutual nearest clusters as pseudo-replicates. The current scMerge algorithm is based on the assumption that the clusters in each of these pairs share similar biological signals.

Yingxin
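To make the idea of "mutual nearest clusters" concrete, here is a simplified base R sketch. It is not scMerge's actual implementation: the data are simulated, the helper names (`simulate_cells`, `centroids`) are hypothetical, and plain Euclidean distance between cluster centroids is an assumption made for illustration. Two clusters from different batches are paired only when each is the other's nearest cluster.

```r
set.seed(42)

# Simulate cells (rows) x genes (columns) around the given cluster centres.
simulate_cells <- function(centres, n_per_cluster, sd = 0.3) {
  n_genes <- ncol(centres)
  blocks <- lapply(seq_len(nrow(centres)), function(i) {
    matrix(rnorm(n_per_cluster * n_genes,
                 mean = rep(centres[i, ], each = n_per_cluster), sd = sd),
           nrow = n_per_cluster)
  })
  list(expr = do.call(rbind, blocks),
       cluster = rep(rownames(centres), each = n_per_cluster))
}

# Batch 1 has clusters a, b, c; batch 2 has clusters x, y, z.
# By construction a is close to x and b is close to y;
# c and z have no close counterpart in the other batch.
centres1 <- rbind(a = c(0, 0, 0), b = c(5, 5, 0), c = c(0, 10, 10))
centres2 <- rbind(x = c(0.5, 0, 0), y = c(5, 4.5, 0), z = c(-10, -10, 5))
batch1 <- simulate_cells(centres1, 50)
batch2 <- simulate_cells(centres2, 50)

# Per-cluster mean expression (clusters x genes).
centroids <- function(b) {
  rowsum(b$expr, b$cluster) / as.vector(table(b$cluster))
}
cen1 <- centroids(batch1)
cen2 <- centroids(batch2)

# Distances between every batch-1 centroid (rows) and batch-2 centroid (columns).
d <- as.matrix(dist(rbind(cen1, cen2)))[rownames(cen1), rownames(cen2)]

# Nearest cluster in the other batch, in both directions.
nearest_in_2 <- colnames(d)[apply(d, 1, which.min)]
names(nearest_in_2) <- rownames(d)
nearest_in_1 <- rownames(d)[apply(d, 2, which.min)]
names(nearest_in_1) <- colnames(d)

# Keep a pair only if each cluster is the other's nearest cluster.
is_mutual <- nearest_in_1[nearest_in_2] == names(nearest_in_2)
pairs <- data.frame(batch1 = names(nearest_in_2)[is_mutual],
                    batch2 = unname(nearest_in_2[is_mutual]),
                    stringsAsFactors = FALSE)
print(pairs)
```

In this toy setup cluster c also has y as its nearest neighbour, but y's nearest neighbour is b, so c is left unpaired. This mirrors the situation in the question, where only some clusters across the two data sets are similar enough to be connected.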

That's great to know! I have labels for every cluster. Is it possible to find out which clusters are paired?

Hi,

I am wondering whether you supplied cell_type information when running scMerge (that is, ran semi-supervised scMerge II, https://sydneybiox.github.io/scMerge/articles/scMerge.html#semi-supervised-scmerge-ii) or used the default setting.

Currently, we do not have a very convenient way to check this output, but we will implement it as one of the outputs soon.

For now, one way to check this is to look at the replicate matrix that is stored in the metadata after running scMerge:

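# For each cell, take the name of the replicate-matrix column with the largest entry
scRep <- apply(sce@metadata$scRep_res, 1,
               function(x) colnames(sce@metadata$scRep_res)[which.max(x)])

# Cross-tabulate pseudo-replicate membership against the cell type labels
is_rep <- grep("Replicate", scRep)
table(scRep[is_rep], sce$cellTypes[is_rep])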

Cell types that fall in the same replicate correspond to a pair of mutual nearest clusters. Please let me know if this code works for you!

Cheers,
Yingxin
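For readers without an scMerge result at hand, the same lookup can be tried on a toy replicate matrix. Everything in this sketch is made up for illustration: the matrix layout (one column per pseudo-replicate plus an "Unassigned" column), the column names, and the cell type labels are assumptions, and the real `scRep_res` may be organised differently. It also shows one possible way to turn the table into a list of cell types per pseudo-replicate.

```r
# Hypothetical cells, cell type labels, and pseudo-replicate assignments.
cell_ids <- paste0("cell", 1:8)
cell_types <- c("P1", "P1", "A1", "A1", "P2", "A2", "A2", "P3")
assignment <- c("Replicate 1", "Replicate 1", "Replicate 1", "Replicate 1",
                "Replicate 2", "Replicate 2", "Replicate 2", "Unassigned")

# Toy indicator matrix (cells x replicate sets): 1 if the cell is in the set.
rep_mat <- 1 * outer(assignment, unique(assignment), "==")
dimnames(rep_mat) <- list(cell_ids, unique(assignment))

# Same logic as the snippet above: the column with the largest entry per cell.
scRep <- apply(rep_mat, 1, function(x) colnames(rep_mat)[which.max(x)])

# Cross-tabulate replicate membership against cell type.
in_rep <- grep("Replicate", scRep)
tab <- table(replicate = scRep[in_rep], cell_type = cell_types[in_rep])
print(tab)

# Cell types that share a pseudo-replicate, i.e. the paired clusters.
paired <- lapply(split(cell_types[in_rep], scRep[in_rep]), unique)
print(paired)
```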