fgsea gives many highly significant gene sets for a comparison where I expect only a few
ms-gx opened this issue · 1 comments
I am using fgsea according to this example:
https://crazyhottommy.github.io/scRNA-seq-workshop-Fall-2019/scRNAseq_workshop_3.html#downstream_analysis_of_scrnaseq_data
fgsea version: 1.18.0
I have one treatment (case 1) where I see a strong perturbation of a cell line and another treatment (case 2) where I just see a slight perturbation.
Now I compare each of the two treatments with a control condition (using wilcoxauc) and feed the rankings to fgsea. Interestingly, for case 2 (slight perturbation) I see many highly significant gene sets, whereas for case 1 (strong perturbation) I see only a few highly significant gene sets and then the padj drops drastically.
This is the gsea call for both:
fgsea(homo_sapiens_gene_set, stats = ranks, scoreType = "pos", eps = 0.0)
I am using the auc statistics from wilcoxauc.
It seems to me that the signal for the slight perturbation is much more subtle, so the weak "background" signal becomes much more prominent. Or, put differently: there is no clear, dominant signal for the weak perturbation, and fgsea picks up a lot of noise. At least that's how it looks to me.
Am I doing anything wrong or do I have to adjust something?
Would you suggest something other than the auc statistics?
EDIT: would you recommend logFC instead?
I think I didn't get your experimental design. What are you comparing with what? In any case, first I'd suggest visualizing your enrichment; this will help you understand why a particular pathway is deemed to be enriched. Second, for single-cell RNA-seq, if you compare one cluster vs. the others, there is a recommendation to use logFC as the statistic, see #50
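For reference, a minimal sketch of both suggestions (ranking by logFC and plotting the enrichment). The data frame `de`, the group label "treated" and the two gene sets are made-up stand-ins for the wilcoxauc output and for homo_sapiens_gene_set, not the data from this issue. It also assumes that, because logFC is signed, the two-sided scoreType = "std" is the appropriate choice rather than "pos".

```r
# Toy stand-in for the wilcoxauc() result: one row per gene and group.
# Values are random and only serve to make the sketch runnable.
set.seed(1)
de <- data.frame(
  feature = paste0("gene", 1:200),
  group   = "treated",
  logFC   = rnorm(200),
  stringsAsFactors = FALSE
)

# Named numeric vector of logFC values, sorted from most up-regulated
# to most down-regulated.
de_group <- de[de$group == "treated", ]
ranks <- setNames(de_group$logFC, de_group$feature)
ranks <- sort(ranks, decreasing = TRUE)

# Hypothetical gene sets; in practice this would be homo_sapiens_gene_set.
gene_sets <- list(
  top_genes    = names(ranks)[1:20],
  random_genes = sample(names(ranks), 20)
)

if (requireNamespace("fgsea", quietly = TRUE)) {
  # logFC is signed, so the two-sided score type is used (an assumption,
  # see the note above).
  res <- fgsea::fgsea(pathways = gene_sets, stats = ranks,
                      scoreType = "std", eps = 0.0)
  res_df <- as.data.frame(res)
  print(res_df[order(res_df$pval), c("pathway", "pval", "padj", "NES")])

  # Enrichment plot for one set: shows where its genes fall in the ranking.
  print(fgsea::plotEnrichment(gene_sets[["top_genes"]], ranks))
}
```

Looking at the plot for a few of the top-scoring sets from the weak-perturbation comparison should show whether their genes really cluster at one end of the ranking or are spread across it.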