skinnider/dismay

which measure of proportionality is better?

Dear Michael,

I am conducting an analysis in which I aim to rank all known sources of variance (e.g. cell type, donor, technical artifacts) in my single-cell RNA-seq dataset. Among other things, I am computing all pairwise cell-cell distances, which gives me a distance matrix as output. Your article "Evaluating measures of association for single-cell transcriptomics" has been extremely useful in this regard. I also observe a greater signal-to-noise ratio and overall accuracy when using measures of proportionality (phi and rho) as compared to Pearson correlation (as you report in Figure 4).

My question is: which measure of proportionality would you use? I like rho because it's bounded within [-1, 1]. However, I get a great many negative values (e.g. -0.1), which I find hard to interpret. On the other hand, phi is always positive, but is unbounded.

Thanks a lot for your time and help, and for creating this awesome package.

Best,

Ramon

Hey Ramon, sorry for the delay. Glad to hear our paper was useful to you, and that you are seeing similar results. I tend to use rho because, as you say, it’s often useful to have a measure bounded by [-1, 1]. In practice, depending on the application, I’m not sure the choice is that significant; the two are related by a monotonic function and, correspondingly, the differences we saw between them in our paper were quite minor. You might want to take a look at the propr paper (https://www.nature.com/articles/s41598-017-16520-0) for more details. propr is the implementation that dismay provides a fairly shallow wrapper around, and the SI appendix of that paper might be particularly useful.
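
To make the monotonic relationship concrete, here is a minimal sketch in plain Python (not dismay or propr code). It assumes the symmetric variant of phi, phi_s = var(x - y) / var(x + y), and rho_p = 1 - var(x - y) / (var(x) + var(y)), as I understand the definitions in the propr paper, and it assumes the inputs are already log-ratio transformed. The function names and toy vectors are made up for illustration. Under these definitions, phi_s = (1 - rho_p) / (1 + rho_p), so a negative rho simply corresponds to a phi_s above 1.

```python
from statistics import variance


def rho_p(x, y):
    """Proportionality rho: 1 - var(x - y) / (var(x) + var(y))."""
    diff = [a - b for a, b in zip(x, y)]
    return 1.0 - variance(diff) / (variance(x) + variance(y))


def phi_s(x, y):
    """Symmetric phi: var(x - y) / var(x + y)."""
    diff = [a - b for a, b in zip(x, y)]
    total = [a + b for a, b in zip(x, y)]
    return variance(diff) / variance(total)


# Hypothetical toy vectors, assumed to be log-ratio transformed already.
x = [0.5, 1.2, -0.3, 2.0, 0.9]
y = [0.7, 1.0, 0.1, 1.8, 1.1]      # moves with x
z = [-0.4, -1.0, 0.5, -1.9, -1.1]  # moves against x

for name, other in (("y", y), ("z", z)):
    r = rho_p(x, other)
    p = phi_s(x, other)
    # The last column recomputes phi_s from rho_p via the identity above.
    print(name, round(r, 4), round(p, 4), round((1 - r) / (1 + r), 4))
```

Because the mapping is strictly decreasing on (-1, 1], ranking pairs by descending rho_p or by ascending phi_s gives the same order. Note that this identity is for the symmetric phi_s only, not the original asymmetric phi.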
Hope this helps.
Mike

Thanks a lot Mike, this is very useful!

Ramon