P-value larger than 1

Question

lingchm opened this issue 3 years ago · 2 comments

Thank you so much for the nicely written code package.
I am using stat_util.pvalue to compare the AUC of two models. However, I obtain a p-value greater than 1. Is this possible? How should I interpret it? I know pred1 has AUC 0.71 and pred2 has AUC 0.58.

Answer 1 · 2022-02-08T09:42:38.000Z

@lingchm This happens because a two-tailed p-value is computed by default, which is simply the one-tailed p-value multiplied by 2.
As a result, the reported value can be higher than 1 whenever the one-tailed p-value is above 0.5.

The p-value is computed by:

1. bootstrapping examples (the same for pred1 and pred2),
2. computing the score (ROC AUC) for pred1 and pred2,
3. calculating the difference,
4. evaluating the percentile of the value 0.0 in the distribution of differences.
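
The steps listed above can be sketched in plain Python. This is a minimal illustration, not the package's actual code: the helper names (`roc_auc`, `bootstrap_auc_differences`, `percentile_of_zero`), the synthetic data, and the choice to count differences less than or equal to zero are all assumptions made for the example.

```python
import random


def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_differences(y_true, pred1, pred2, n_bootstraps=1000, seed=0):
    """Return z: AUC(pred1) - AUC(pred2) on each bootstrap resample."""
    rng = random.Random(seed)
    n = len(y_true)
    z = []
    while len(z) < n_bootstraps:
        # Step 1: draw one set of resampled indices and use it for both models.
        idx = [rng.randrange(n) for _ in range(n)]
        y_b = [y_true[i] for i in idx]
        if len(set(y_b)) < 2:
            continue  # AUC is undefined without both classes, so redraw
        # Steps 2 and 3: score both models on the resample and take the difference.
        score1 = roc_auc(y_b, [pred1[i] for i in idx])
        score2 = roc_auc(y_b, [pred2[i] for i in idx])
        z.append(score1 - score2)
    return z


def percentile_of_zero(z):
    """Step 4: fraction of bootstrap differences that are <= 0 (tie handling is an assumption)."""
    return sum(d <= 0.0 for d in z) / len(z)


# Synthetic data for illustration only: pred1 separates the classes better than pred2.
data_rng = random.Random(42)
y_true = [i % 2 for i in range(100)]
pred1 = [y + data_rng.gauss(0.0, 1.0) for y in y_true]
pred2 = [0.2 * y + data_rng.gauss(0.0, 1.0) for y in y_true]

z = bootstrap_auc_differences(y_true, pred1, pred2, n_bootstraps=300)
one_tailed = percentile_of_zero(z)
two_tailed = 2.0 * one_tailed  # plain doubling, so this can exceed 1
print(one_tailed, two_tailed)
```

Using the same resampled indices for both models keeps the comparison paired, so each entry in z is a difference measured on identical examples.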

Check use case 2 here: https://mateuszbuda.github.io/2019/04/30/stat.html

You can plot the values in z; their mean (np.mean(z)) should be roughly equal to the difference between your original scores for pred1 and pred2, i.e. 0.71 - 0.58 = 0.13 in your case.
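
If no plotting library is at hand, a rough text histogram is enough to see where z sits relative to 0. The sketch below is only an illustration: `text_histogram` is a hypothetical helper, and the z values are placeholder draws centred on 0.13, not real bootstrap output.

```python
import random
import statistics


def text_histogram(values, bins=10, width=40):
    """Return one line per bin; bar length is proportional to the bin count."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        counts[min(int((v - lo) / step), bins - 1)] += 1
    peak = max(counts)
    return [
        f"{lo + i * step:+.3f} | {'#' * round(width * c / peak)} ({c})"
        for i, c in enumerate(counts)
    ]


# Placeholder z for illustration only (assumed shape, not real results).
rng = random.Random(0)
z = [rng.gauss(0.13, 0.06) for _ in range(1000)]

print("mean of z:", round(statistics.mean(z), 3))
print("\n".join(text_histogram(z)))
```

Each line starts with the left edge of its bin, so the bins whose edge is negative show how much of the distribution lies below 0.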

The one-tailed p-value is the probability mass of the values in z that are < 0. A one-tailed p-value cannot be greater than 1.0.
The two-tailed p-value is the one-tailed p-value times 2.
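
The relationship between the two values can be shown on a tiny example. The function names and the z values below are hypothetical. The capped variant is a different, commonly used convention included only for comparison; it is not what the package is described as doing here.

```python
def one_tailed_pvalue(z):
    """Fraction of bootstrap differences below zero."""
    return sum(d < 0.0 for d in z) / len(z)


def two_tailed_doubled(z):
    """One-tailed p-value times 2, as described above; can exceed 1."""
    return 2.0 * one_tailed_pvalue(z)


def two_tailed_capped(z):
    """Alternative convention (assumption, for comparison): double the smaller tail, cap at 1."""
    p = one_tailed_pvalue(z)
    return min(1.0, 2.0 * min(p, 1.0 - p))


# Hypothetical differences: 3 of 4 are negative, so the one-tailed value is 0.75.
z = [-0.2, -0.1, -0.05, 0.1]
print(one_tailed_pvalue(z))   # 0.75
print(two_tailed_doubled(z))  # 1.5
print(two_tailed_capped(z))   # 0.5
```

A doubled value above 1 therefore just signals that more than half of the bootstrap differences fall below zero.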

For you, a p-value of 1.49 means that there is no significant difference between the AUC of pred1 and the AUC of pred2. You probably do not have many examples.

Answer 2 · 2022-02-11T00:17:40.000Z

I see. Plotting the values in z was helpful. Thank you very much for your explanation.