potulabe/symphonypy

`X_pca_reference` versus `X_pca_harmony`

Closed issue · 1 comment

Hi,

Thanks again for the great package.
I have a conceptual question about the following scenario:

  1. I have an `adata_reference` which I have integrated with `sp.pp.harmony_integrate()`, so it now has the `.obsm['X_pca_harmony']` attribute.
  2. I then map `adata_query` onto `adata_reference` using `sp.tl.map_embedding()`.
    `adata_query` now has both `.obsm['X_pca_reference']` and `.obsm['X_pca_harmony']`.
  3. Now I want to concatenate `adata_reference` and `adata_query` and plot them in the integrated PCA space.
    Question: for `adata_query`, should I use the `X_pca_reference` or the `X_pca_harmony` embedding for this?

I tried both and the results look quite similar, but which one is more correct?

Hey! Sorry for the delay in replying.

The `X_pca_reference` slot is the uncorrected representation of the query dataset: the query projected into the reference PCA space using the reference PCA model. `X_pca_harmony` is the corrected representation. So, if you want to plot uncorrected coordinates, concatenate `adata_ref.obsm["X_pca"]` with `adata_query.obsm["X_pca_reference"]`; if you want to plot coordinates after correction, concatenate `adata_ref.obsm["X_pca_harmony"]` with `adata_query.obsm["X_pca_harmony"]`. (The `+` here means stacking the cells of both datasets, not element-wise addition.)
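
The pairing described above could be sketched roughly as follows. This is only an illustration, not part of symphonypy: the helper `stack_embeddings` is a hypothetical name, and the toy objects built with `SimpleNamespace` are stand-ins that mimic only the `.obsm` mapping of an AnnData object, filled with random numbers. The `.obsm` keys themselves are the ones mentioned in this thread.

```python
from types import SimpleNamespace

import numpy as np


def stack_embeddings(adata_ref, adata_query, corrected=True):
    """Row-stack reference and query embeddings that live in the same space.

    Hypothetical helper: works on anything exposing an `.obsm` mapping.
    """
    if corrected:
        # Harmony-corrected coordinates on both sides.
        ref_key, query_key = "X_pca_harmony", "X_pca_harmony"
    else:
        # Uncorrected coordinates: reference PCA plus the query projected
        # into that same reference PCA space.
        ref_key, query_key = "X_pca", "X_pca_reference"

    ref = np.asarray(adata_ref.obsm[ref_key])
    query = np.asarray(adata_query.obsm[query_key])
    if ref.shape[1] != query.shape[1]:
        raise ValueError("reference and query embeddings differ in dimensionality")
    # Reference cells first, then query cells.
    return np.vstack([ref, query])


# Toy stand-ins (assumption: random data, 5 reference cells, 4 query cells, 3 PCs).
rng = np.random.default_rng(0)
adata_ref = SimpleNamespace(obsm={
    "X_pca": rng.normal(size=(5, 3)),
    "X_pca_harmony": rng.normal(size=(5, 3)),
})
adata_query = SimpleNamespace(obsm={
    "X_pca_reference": rng.normal(size=(4, 3)),
    "X_pca_harmony": rng.normal(size=(4, 3)),
})

corrected = stack_embeddings(adata_ref, adata_query, corrected=True)
uncorrected = stack_embeddings(adata_ref, adata_query, corrected=False)
print(corrected.shape, uncorrected.shape)
```

With real AnnData objects, the stacked array could then be assigned to an `.obsm` key of the concatenated object (the key name is up to you) and used for plotting or neighbor computation.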