prabhakarlab/Banksy

Calculate the correlation of the expression of two genes

Issue closed · 1 comment

It is well known that spatial transcriptome data has characteristics that distinguish it from single-cell transcriptome data. If I want to calculate the correlation between the expression of two genes in spatial transcriptome data, I should presumably give more weight to spatial location. From reading your article, I understand that the software uses spatial location information to produce a new matrix (nbr.expr). Is it more reliable to use this matrix to calculate the correlation between the expression of two genes? Thank you!

It depends on what you mean by reliable. If you use only the neighbour expression matrix to compute gene-gene correlations, what you are actually doing is computing correlations between spatially averaged features (see also this excellent article, which discusses the concept of spatial "lag": https://link.springer.com/article/10.1007/s101090100064). This will be more "robust" in the sense that noise is smoothed out. It will also be spatial in the sense that you are correlating entities that contain information from spatial neighbours, so the relative ordering (indeed, the graph connectivity) of the objects matters. By contrast, the usual nonspatial gene-gene correlation does not depend on any ordering of the objects.

Parameters such as the size of the spatial kernel (which defines the length scale of the spatial smoothing) will matter here.
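
To make the distinction concrete, here is a minimal sketch in Python/NumPy on simulated data. It is not BANKSY's implementation: the helper `neighbour_mean`, the unweighted k-nearest-neighbour average, and the simulated genes are all assumptions made for illustration, whereas the package builds its neighbour matrix with its own kernel weighting. The sketch compares the ordinary correlation of two genes with the correlation of their neighbour-averaged (spatially lagged) versions for several neighbourhood sizes.

```python
import numpy as np

def neighbour_mean(coords, expr, k):
    """Average each gene over the k nearest spatial neighbours of every spot.

    Hypothetical helper: uniform weights, the spot itself excluded.
    coords: (n_spots, 2) positions; expr: (n_spots, n_genes) expression.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)            # pairwise squared distances
    np.fill_diagonal(d2, np.inf)             # a spot is not its own neighbour
    nbr_idx = np.argsort(d2, axis=1)[:, :k]  # (n_spots, k) neighbour indices
    return expr[nbr_idx].mean(axis=1)        # (n_spots, n_genes)

def pearson(x, y):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

# Simulated data (an assumption): two genes share a smooth spatial pattern
# and each carries independent per-spot noise.
rng = np.random.default_rng(0)
n_spots = 400
coords = rng.uniform(0, 10, size=(n_spots, 2))
signal = np.sin(coords[:, 0])
gene_a = signal + rng.normal(0, 1.0, n_spots)
gene_b = signal + rng.normal(0, 1.0, n_spots)
expr = np.column_stack([gene_a, gene_b])

# Ordinary (nonspatial) correlation: invariant to any reordering of the spots.
raw_corr = pearson(expr[:, 0], expr[:, 1])

# Correlation of spatially averaged features, for several neighbourhood sizes.
lag_corr = {}
for k in (5, 15, 50):
    smoothed = neighbour_mean(coords, expr, k)
    lag_corr[k] = pearson(smoothed[:, 0], smoothed[:, 1])

print("raw:", raw_corr)
print("neighbour-averaged:", lag_corr)
```

In this sketch `k` plays the role of the kernel size: a larger `k` averages over a wider area, which suppresses more noise but also blurs any pattern that varies on a shorter length scale than the neighbourhood. The smoothed correlation therefore answers a different question from the raw one, namely whether the two genes co-vary at the scale of the neighbourhood, rather than spot by spot.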