Multiple testing correction
Hi Nic, cool stuff.
Having had a quick read through it, I think it could be useful to correct the alpha level for multiple testing (as you may have panels with hundreds of SNVs).
Another question: why don't you calculate $p_{\text{val}} = \text{Bin}(n_v \mid c, \pi/2)$ and then check whether $p_{\text{val}} < \alpha$? This would give you an idea of the magnitude by which the distribution differs from your null hypothesis.
Cheers.
My take on these 2:
- that's possible, and it should be arranged. @nicola-calonaci I would divide the required alpha (model input) by the number of tested mutations, so as to adjust the FWER à la Bonferroni (sketched below);
- that is not the p-value though, no? So $p_{\text{val}} < \alpha$ is not a test. The magnitude point is OK.
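On the first point, a minimal sketch of what that adjustment could look like (`adjust_alpha_bonferroni` is a hypothetical helper, not actual package code):

```r
# Minimal sketch (hypothetical helper, not actual package code):
# run each test at alpha / m to bound the FWER at alpha over m tests
adjust_alpha_bonferroni <- function(alpha, n_mutations) {
  alpha / n_mutations
}

adjust_alpha_bonferroni(0.05, 200)  # e.g. a panel with 200 SNVs
#> [1] 0.00025
```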
Yea no no, sorry, of course the p-value is not that expression, my fault. It's the tail sum $p_{\text{val}} = \sum_{k=0}^{n_v} \text{Bin}(k \mid c, \pi/2)$ (for the left-sided case).
While yea, comparing the p-value to alpha is what you usually do in hypothesis testing, I think.
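For reference, that tail probability is a one-liner in R (the values below are made up for illustration):

```r
# Left-tail p-value P(X <= n_v) under the null X ~ Bin(c, pi/2),
# rather than the point probability Bin(n_v | c, pi/2)
nv     <- 20    # hypothetical variant read count
cov    <- 100   # hypothetical coverage
purity <- 0.8   # hypothetical tumour purity (pi)

p_val <- pbinom(nv, size = cov, prob = purity / 2)
p_val
```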
I think that the quantile-based test $n_v < q_{\alpha/2}$, with $q_{\alpha/2}$ the $\alpha/2$ quantile of $\text{Bin}(c, \pi/2)$, and the p-value-based test $p_{\text{val}} < \alpha/2$ are equivalent (and the same for the right-sided test), and that the former was implemented just to make the plotting function easier. We can definitely switch back to the binom.test R function and implement the Bonferroni or BH corrections. Btw binom.test returns the extrema of the chosen confidence interval too, so we could keep the same format for the plotting function.
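To keep the same plotting format, the relevant fields of the binom.test return object would be (toy numbers):

```r
# binom.test returns an htest object carrying both the p-value and the
# Clopper-Pearson confidence interval for the observed proportion
res <- binom.test(x = 20, n = 100, p = 0.4, conf.level = 0.95)

res$p.value   # p-value of the test against p = pi/2
res$conf.int  # CI extrema, reusable by the plotting function
```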
Definitely, they are exactly the same.
My only point is that having a p-value gives you a more interpretable measure of how far you are from your null hypothesis $H_0$.
For ex. if I tell you that the p-value is 1e-5, you know that this corresponds to a very low likelihood of observing the data under $H_0$ (1 in 100,000 trials), while if you tell me that the difference of $n_v$ from the quantile is some number of reads, that is much harder to interpret on its own.
But the test of the quantiles is correct as it is, of course. My point is just that it would be useful to also report the p-values.
But what we do when we compare $n_v$ to the quantile $q_{\alpha/2}$ is the same thing as inverting a p-value, no? We are asking what is the quantile at which we would observe the data with probability $\alpha/2$ under the null.
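A quick numeric check of this inversion, using R's qbinom convention (smallest $q$ with $P(X \le q) \ge \alpha/2$), under which the two criteria agree exactly (toy values):

```r
# n_v < qbinom(alpha/2, c, pi/2)  <=>  pbinom(n_v, c, pi/2) < alpha/2
alpha <- 0.05
cov   <- 100
p0    <- 0.4  # pi/2 under the null, with purity pi = 0.8

q_low <- qbinom(alpha / 2, size = cov, prob = p0)
all(vapply(0:cov, function(nv) {
  (nv < q_low) == (pbinom(nv, size = cov, prob = p0) < alpha / 2)
}, logical(1)))
#> [1] TRUE
```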
Maybe @nicola-calonaci and @Militeee, can you also associate a p-value to the test? That might be easier to communicate, even though right now the user has to specify the input alpha.
Just made the run_classifier function return p-values for the subclonal and LOH tests, adjusted with the Benjamini-Hochberg correction.
By default, I implemented it such that the BH correction takes into account the whole dataset.
Do you agree on leaving to the user the choice of correcting p-values by sample, or by gene across samples? I think it depends on how one specifically wants to perform the classification.
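For concreteness, the two options would look roughly like this on a hypothetical data frame of per-mutation p-values (the `sample` and `p_value` columns are illustrative, not the actual run_classifier output format):

```r
# Hypothetical input: one row per tested mutation
df <- data.frame(
  sample  = c("P1", "P1", "P1", "P2", "P2"),
  p_value = c(0.001, 0.030, 0.200, 0.004, 0.400)
)

# Option 1: BH across the whole dataset (all samples pooled)
df$p_adj_dataset <- p.adjust(df$p_value, method = "BH")

# Option 2: BH within each sample separately
df$p_adj_sample <- ave(df$p_value, df$sample,
                       FUN = function(p) p.adjust(p, method = "BH"))
```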
But does it make sense to adjust for the whole cohort? I don't think so honestly.
I think we should use an adjustment per sample (if the patient has X mutations, we adjust for X tests). I have no idea what the "genes" you mention are because there can be multiple mutations in the same gene.
I mean patients are independent no?
I am not sure: say you have a cohort of 100 (independent) patients for which you try to classify mutations on gene TP53. If you run the tests simultaneously, you would have a probability of getting a significant result (e.g. LOH class) by chance of $1 - (1 - \alpha)^{100}$, and with $\alpha = 0.05$ this would be as large as 99%.
Wouldn't this happen even if samplings from different patients are independent?
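For the record, the arithmetic behind that 99% figure:

```r
# Probability of at least one false positive across 100 independent
# uncorrected tests at level alpha = 0.05
1 - (1 - 0.05)^100
#> [1] 0.9940795
```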
But your tool:
- is not a tester for a single gene, rather it is a tester for somatic mutations detected by a panel;
- should run on data from a single patient, not on data from multiple patients.
I don't really see why you claim point 2.
Yea, I agree with Giulio on this one, I would correct by sample.
Which means we develop a tool that works only on a single sample, like mobster or others.
Ok guys, many thanks for your help on this. I just pushed the new code with BH correction done per sample.