immunogenomics/harmony

Is it okay to paste two covariates into one when correcting for two covariates?

Closed this issue · 4 comments

Hi,

Sometimes people paste two covariates into one, as in this thread: "Seurat V5 FindVariableFeatures() and HarmonyIntegration() Question".

Given the difference in the theta value (and maybe lambda) between one covariate and two covariates (#100, #24), do you think this approach is okay? I am not sure how it would affect the correction results.

Hi @YiweiNiu,

It depends on your study design. If the design is hierarchical, pasting the covariates into a single one does not make a difference. For example, with obj$tech_sample <- paste0(obj$tech, "_", obj$sample), unless you ran the same biological samples with different technologies, the number of levels of the pasted covariate will be identical to that of obj$sample.
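The hierarchical case can be checked on the metadata alone. Below is a minimal base-R sketch, assuming a toy metadata table in which each sample was run with exactly one technology; the values and the `meta` data frame are hypothetical stand-ins for `obj$tech` and `obj$sample`.

```r
# Hypothetical toy metadata: every sample belongs to exactly one technology,
# i.e. sample is nested within tech (a hierarchical design).
meta <- data.frame(
  tech   = c("10x", "10x", "10x", "smartseq", "smartseq", "smartseq"),
  sample = c("s1",  "s1",  "s2",  "s3",       "s4",       "s4")
)

# Paste the two covariates into one, as in the example above.
meta$tech_sample <- paste0(meta$tech, "_", meta$sample)

n_sample <- length(unique(meta$sample))
n_pasted <- length(unique(meta$tech_sample))

# The pasted covariate can only split sample groups further, never merge them,
# so equal level counts mean both covariates group the cells identically.
same_partition <- n_sample == n_pasted
print(same_partition)
```

If `same_partition` is `FALSE` on real metadata, at least one sample was run with more than one technology and the pasted covariate is no longer equivalent to the sample covariate.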

If you could give me more details about your situation, I can be more helpful.

Hi @pati-ni,

Thank you so much for your reply.

Our experimental design goes like this:

[image: experimental design]

We have several libraries with pooled donors and several donors with replicated libraries, and I want to correct for the variation between libraries and between donors. The design is not fully hierarchical.

Do you think it's okay to paste donor ID and library ID in this case?

Hi @YiweiNiu,

In your case, I would recommend using two independent covariates. Otherwise, you compromise the correction of both sample-library batch effects (Harmony no longer knows which cells come from the same sample library) and donor-specific effects (Harmony no longer knows which cells come from the same donor).

If you can, let me know how well Harmony works in this use case.
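To illustrate why pasting loses information in a crossed design, here is a small base-R sketch. The donor and library labels are made up, and the metadata column names `donor` and `library` are assumptions, not taken from the actual dataset.

```r
# Hypothetical toy metadata: libraries pool several donors, and each donor
# appears in more than one library (a crossed, not nested, design).
meta <- data.frame(
  donor   = c("d1", "d2", "d1", "d3", "d2", "d3"),
  library = c("L1", "L1", "L2", "L2", "L3", "L3")
)
meta$donor_library <- paste0(meta$donor, "_", meta$library)

n_donor   <- length(unique(meta$donor))
n_library <- length(unique(meta$library))
n_pasted  <- length(unique(meta$donor_library))

# How many pasted levels each donor is scattered across. Any value above 1
# means the pasted covariate hides the fact that those cells share a donor.
levels_per_donor <- tapply(meta$donor_library, meta$donor,
                           function(x) length(unique(x)))
print(levels_per_donor)

# With a Seurat object `obj` carrying these two metadata columns (assumed
# names), the two covariates would be passed separately, for example:
# obj <- harmony::RunHarmony(obj, group.by.vars = c("donor", "library"))
```

In this toy table the pasted covariate has more levels than either donor or library, and every donor is split across two pasted levels, which is the loss of grouping information described above.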

Hi, thank you again for your quick reply! I tried to use both Donor ID and Library ID as covariates, and for now it works quite well. Thanks for this great tool!