saeyslab/multinichenetr

Question senders_oi/receivers_oi and p_adj values

chloebazin opened this issue · 1 comments

Hi,

Thank you for this great tool! I'm exploring different parameters in MultiNicheNet and I have two questions:

  1. I ran two analyses using different senders and receivers of interest. In the first analysis, I first specified senders and receivers of interest, then ran the analysis. In the second analysis, I used all-vs-all celltypes as senders and receivers and only determined specific senders and receivers when plotting ligand activities. The plots generated from both approaches are relatively similar, but some interactions are specific to each analysis. What is the difference between these two approaches, and is one recommended over the other?

  2. When defining parameters for the ligand activity analysis, I'm unsure whether to use adjusted or normal p-values. I have a sufficiently high number of DE genes per group-cell type, and each group consists of 8 samples. Is there a recommended number of samples to accurately use the adjusted p-value?

Thank you!

Code: analysis 1

senders_oi <- 'myeloid'
receivers_oi <- 'tumor'
sce <- sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in% c(senders_oi, receivers_oi)]

# Plot scaled ligand activity
group_oi <- "group_1"
prioritized_tbl_oi_M_50 <- get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi)
plot_oi <- make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_M_50)

Code: analysis 2

senders_oi <- SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
receivers_oi <- SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
sce <- sce[, SummarizedExperiment::colData(sce)[,celltype_id] %in% c(senders_oi, receivers_oi)]

# Plot scaled ligand activity
group_oi <- "group_1"
prioritized_tbl_oi_M_50 <- get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi, receivers_oi = "tumor", senders_oi = "myeloid")
plot_oi <- make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_M_50)

Hi @chloebazin

These are good and relevant questions.

What is the difference between these two approaches, and is one recommended over the other?

Prioritization scores and relative rankings of interactions can differ between the two approaches because cell-type specificity of the ligand and of the receptor are two of the prioritization criteria. This means that if you restrict your analysis to only two cell types, some ligands/receptors can appear much more cell-type specific than they do in the analysis with all cell types. For this reason, we recommend running the analysis on all cell types (so analysis 2) and interpreting the output by focusing on the cell types of interest.
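
To make the effect concrete, here is a toy illustration in base R. It is not MultiNicheNet's actual scoring: the expression values are made up, and `specificity_z` is a hypothetical helper that simply z-scores one ligand's average expression across whichever cell types are included in the analysis.

```r
# Made-up average expression of one hypothetical ligand per cell type
ligand_expr <- c(myeloid = 2.0, tumor = 0.5, Tcell = 2.1, Bcell = 1.9, fibroblast = 2.2)

# Hypothetical helper (assumption, not a package function):
# z-score of expression across the cell types present in the analysis
specificity_z <- function(x) (x - mean(x)) / sd(x)

# Analysis 2: all cell types are included
z_all <- specificity_z(ligand_expr)

# Analysis 1: only the sender and receiver of interest are included
z_two <- specificity_z(ligand_expr[c("myeloid", "tumor")])

# The ligand looks more myeloid-specific when only two cell types are compared
z_all["myeloid"]
z_two["myeloid"]
```

In this toy case the ligand is broadly expressed, so it is not particularly myeloid-specific when all cell types are considered, yet it looks specific once only myeloid and tumor cells remain.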

When defining parameters for the ligand activity analysis, I'm unsure whether to use adjusted or normal p-values.

Because the ligand activity analysis is an enrichment analysis, the most important thing is that the ratio of geneset-vs-background is reasonable. There is no strict cutoff for what counts as reasonable. In a parameter robustness analysis that we ran for the revision of the paper, we saw that the acceptable range is quite wide. You just need to avoid extremes, such as only a couple of genes in the geneset versus 10,000 genes in the background, or the opposite: having thousands of genes in the geneset of interest.

In the vignettes of the development branch of this package, you can find some guidelines and code to calculate these ratios and get an indication of whether this is within recommended ranges:
https://github.com/saeyslab/multinichenetr/blob/dev-branch/vignettes/basic_analysis_steps_MISC.knit.md

If your geneset sizes are within the recommended range when using the adjusted p-values, I would use those.
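
As a rough sketch of the geneset-vs-background check described above (the linked vignette has the actual guidelines and code), the following base R example compares geneset sizes under normal and adjusted p-values. Everything here is an assumption for illustration: the DE table is simulated, the column names `logFC`, `p_val` and `p_adj` are assumed, `geneset_ratio` is a hypothetical helper, and the thresholds are placeholders rather than recommended values.

```r
# Simulated DE table for one receiver cell type and one contrast (illustration only)
set.seed(1)
de_tbl <- data.frame(
  gene  = paste0("gene", 1:5000),
  logFC = rnorm(5000, sd = 0.5),
  p_val = runif(5000)
)
de_tbl$p_adj <- p.adjust(de_tbl$p_val, method = "BH")

# Hypothetical helper: size of the upregulated geneset relative to the background,
# where the background is taken to be all genes in the DE table
geneset_ratio <- function(de, p_col, logFC_threshold = 0.5, p_threshold = 0.05) {
  in_geneset <- de$logFC >= logFC_threshold & de[[p_col]] <= p_threshold
  c(n_geneset = sum(in_geneset), n_background = nrow(de), ratio = mean(in_geneset))
}

# Compare the two choices side by side
ratios <- rbind(
  p_val = geneset_ratio(de_tbl, "p_val"),
  p_adj = geneset_ratio(de_tbl, "p_adj")
)
ratios
```

Adjusted p-values can never yield a larger geneset than normal p-values at the same cutoff, so the practical question is whether the `p_adj` row still leaves enough genes to stay away from the "couple of genes" extreme.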

I will close this issue, but don't hesitate to reopen it if anything is still unclear.