p_adj.loc greater than p_adj.glb

Question

p_adj.loc greater than p_adj.glb

Closed this issue 3 years ago · 2 comments

Hello!

I have noticed that the p_adj.loc is sometimes greater than the p_adj.glb in my differential expression results (using pbDS on an snRNA-seq dataset, with the edgeR method),

To my understanding the p_adj.loc is produced by correcting for multiple testing across all genes tested within a cluster while the p_adj.glb is produced by correcting for multiple testing across all genes tested in all clusters. Thus, p_adj.loc should be less than p_adj.glb.

Here is a small section of the table produced by resDS as an example (rows 3, 7, 16, 18, and 19):

	gene	cluster_id	logFC	logCPM	F	p_val	p_adj.loc	p_adj.glb	contrast
1	FO538757.2	Inhib_5	0.297907	6.55732	0.851136	0.361205	0.786318	0.804342	ei.group_idS-ei.group_idC
2	RP5-857K21.4	Inhib_5	-0.45668	6.879999	1.461924	0.233008	0.685881	0.715474	ei.group_idS-ei.group_idC
3	HES4	Inhib_5	-0.14578	5.831962	0.146945	0.703297	0.946466	0.94204	ei.group_idS-ei.group_idC
4	SCNN1D	Inhib_5	0.393927	5.304127	0.538035	0.467095	0.849796	0.858966	ei.group_idS-ei.group_idC
5	ACAP3	Inhib_5	1.019394	6.236945	9.745041	0.003155	0.143755	0.196263	ei.group_idS-ei.group_idC
6	ATAD3B	Inhib_5	0.690168	5.472104	2.170819	0.147689	0.589724	0.629304	ei.group_idS-ei.group_idC
7	MIB2	Inhib_5	-0.16557	5.665529	0.11623	0.734764	0.956805	0.950395	ei.group_idS-ei.group_idC
8	SLC35E2B	Inhib_5	0.572459	5.423179	2.145233	0.150042	0.590007	0.632773	ei.group_idS-ei.group_idC
9	GNB1	Inhib_5	-0.21795	7.569159	1.530005	0.222599	0.678797	0.706937	ei.group_idS-ei.group_idC
10	GABRD	Inhib_5	-0.51315	6.940436	5.345291	0.025464	0.321897	0.369583	ei.group_idS-ei.group_idC
11	PRKCZ	Inhib_5	0.349367	6.707143	2.446721	0.124854	0.561587	0.600813	ei.group_idS-ei.group_idC
12	SKI	Inhib_5	0.226541	6.517301	0.675613	0.415483	0.822396	0.833289	ei.group_idS-ei.group_idC
13	PLCH2	Inhib_5	0.47294	6.617706	4.095984	0.049014	0.408241	0.452765	ei.group_idS-ei.group_idC
14	CEP104	Inhib_5	0.295637	6.111008	0.92458	0.34147	0.770297	0.792691	ei.group_idS-ei.group_idC
15	DFFB	Inhib_5	0.555827	5.683095	2.5341	0.118489	0.557515	0.590573	ei.group_idS-ei.group_idC
16	AJAP1	Inhib_5	0.062656	7.534503	0.086387	0.770188	0.965444	0.959339	ei.group_idS-ei.group_idC
17	NPHP4	Inhib_5	0.614426	6.15494	4.304073	0.043834	0.394343	0.437253	ei.group_idS-ei.group_idC
18	KCNAB2	Inhib_5	0.037785	7.168045	0.032431	0.857903	0.986452	0.97939	ei.group_idS-ei.group_idC
19	CHD5	Inhib_5	0.050941	7.044387	0.046852	0.829623	0.979776	0.972831	ei.group_idS-ei.group_idC

I am using version 1.8.0 of muscat.

I would appreciate any information you can provide to help understand this discrepancy.

Thank you!

May be related to #69

Answer 1 · 2022-02-18T07:31:34.000Z

Hi,
while you are correct about the meaning of the global/local distinction, and that this will tend to lead to lower local adjusted p-values, in fact this is not always the case. Consider the following example:

> set.seed(123)
> pvals <- c(0.01,10^-abs(rnorm(9))) # random p-values, with 0.01 at the beginning
> p.adjust(pvals,method="BH")
 [1] 0.09208109 0.45853565 0.73575382 0.09208109 0.85014227 0.82503003
 [7] 0.09208109 0.49429447 0.13579343 0.41131746
> p.adjust(c(0.01,pvals),method="BH")
 [1] 0.05500000 0.05500000 0.43233361 0.71940373 0.07596690 0.85014227
 [7] 0.81677973 0.07066534 0.47575843 0.11949822 0.37704100

Notice that in the second case, I add another 0.01 at the beginning, and that this leads to a decrease in the adjusted p-value of the existing 0.01 (from ~0.09 to 0.055; others are also changed a bit).
The reason for this is that the correction method works under the assumption that a certain proportion of low p-values are expected under the null hypothesis, and looks for true signal in an excess of such p-values. Hence the presence of other such similar p-values decreases the probability that any of them is due to chance.

Answer 2 · 2022-02-18T14:33:46.000Z

Thank you @plger! This makes sense.