Which modality to use in Multiome for metacell inference

Question

Closed this issue 2 years ago · 4 comments

Hi,

In Multiome data, which modality would you recommend for metacell inference: RNA or ATAC?
I'm asking because in this tutorial you use ATAC, but your bioRxiv preprint mentions that ATAC is actually harder:

As a further challenge, we ran Palantir on aggregated RNA from metacells computed on the ATAC modality, since the sparsity of scATAC-seq data renders cell-state identification much more difficult

I know it depends on the biological context, but would you recommend, as a rule of thumb, using RNA instead for metacell inference? What do you think?

Thank you for your time!

Answer 1 · 2022-07-03T21:58:54.000Z

Thank you for your query. For peak-gene associations, gene score computation, etc., we recommend using the ATAC modality for metacell identification.

Answer 2 · 2022-07-04T07:39:10.000Z

Thanks for the response!
Alright, so for state definition it's better to use RNA, but for epigenome analysis it's better to use ATAC. Makes sense.
In any case, are you thinking of adding a joint metacell inference step? Something like an averaged aggregation computed from RNA and ATAC at the same time?
Thank you for your time

Answer 3 · 2022-07-04T21:00:36.000Z

Joint metacell inference is an interesting idea, and we are exploring a few options. We think a kernel representation that captures both modalities might be the best option here. For example, MOFA+ generates a low-dimensional embedding from multiple modalities. One can construct the nearest-neighbor graph from the jointly learned embedding and generate a kernel, which can then serve as input to SEACells.
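
To make the idea concrete, here is a rough NumPy-only sketch of the "joint embedding, then nearest-neighbor graph, then kernel" step. Everything in it is an assumption for illustration: `knn_kernel` is a hypothetical helper, the random `joint_embedding` array stands in for a real MOFA+ (or other joint) embedding, and the adaptive Gaussian kernel shown here is one common choice, not necessarily the kernel SEACells builds internally.

```python
import numpy as np


def knn_kernel(embedding, n_neighbors=15):
    """Hypothetical helper: symmetric adaptive Gaussian kernel on a kNN graph."""
    n_cells = embedding.shape[0]

    # Pairwise squared Euclidean distances (dense, so only for small demos).
    sq_norms = (embedding ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embedding @ embedding.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    np.fill_diagonal(sq_dists, np.inf)       # exclude self when ranking neighbors

    # Indices of the k nearest neighbors of each cell.
    nn_idx = np.argsort(sq_dists, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n_cells), n_neighbors)
    cols = nn_idx.ravel()

    # Per-cell bandwidth: distance to the k-th nearest neighbor.
    sigma = np.sqrt(sq_dists[np.arange(n_cells), nn_idx[:, -1]]) + 1e-12

    # Gaussian affinities on kNN edges only; all other entries stay zero.
    kernel = np.zeros((n_cells, n_cells))
    kernel[rows, cols] = np.exp(-sq_dists[rows, cols] / (sigma[rows] * sigma[cols]))

    # Symmetrize (keep an edge if either cell lists the other) and add self-loops.
    kernel = np.maximum(kernel, kernel.T)
    np.fill_diagonal(kernel, 1.0)
    return kernel


# Placeholder for a joint RNA+ATAC embedding (cells x latent factors).
rng = np.random.default_rng(0)
joint_embedding = rng.normal(size=(200, 10))
K = knn_kernel(joint_embedding, n_neighbors=15)
```

How the resulting kernel is handed to SEACells should be checked against the current SEACells documentation. In the SEACells tutorial, the `build_kernel_on` argument names a key in `ad.obsm`, so storing the joint embedding there may be the simplest route.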

Answer 4 · 2022-07-05T07:37:01.000Z

Thanks for the feedback @ManuSetty! This makes a lot of sense, I'll play around with this idea.