pairing mean_diff with Mann-Whitney test
tmchartrand opened this issue · 5 comments
It seems to me that there's a conceptual issue with the default behavior shown in the tutorial: the documentation there states "By default, DABEST will report the two-sided p-value of the most conservative test that is appropriate for the effect size.", which in the example means that the mean_diff effect size reports a Mann-Whitney test p-value.
However, the Mann-Whitney test is, in general, unrelated to the mean difference; it corresponds to the median difference instead. With the right data, a positive mean difference and a negative median difference could both be valid conclusions!
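To make that concrete, here's a quick toy example (numbers made up by me, plain NumPy/SciPy rather than the DABEST API) where the two effect sizes point in opposite directions:

```python
# Toy example (made-up numbers, not DABEST itself) showing the mean and
# median differences pointing in opposite directions for skewed data.
import numpy as np
from scipy.stats import mannwhitneyu

control = np.array([2.0, 2.0, 2.0, 2.0])
treatment = np.array([1.0, 1.0, 1.0, 10.0])  # right-skewed by one large value

mean_diff = treatment.mean() - control.mean()            # 3.25 - 2.0 = +1.25
median_diff = np.median(treatment) - np.median(control)  # 1.0 - 2.0 = -1.0

# The MW test works on ranks, so it sides with the medians here,
# not with the positive mean difference it would be paired with.
stat, p = mannwhitneyu(treatment, control, alternative="two-sided")
print(mean_diff, median_diff, p)
```

The single large value drags the mean of the treatment group above the control's, while the ranks that MW sees mostly favour the control.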
Maybe the defaults could be restricted to pairing mean-based effect sizes with mean-based tests, and median-based effect sizes with median-based tests?
Actually, now that I think about it, even median_diff doesn't necessarily match the Mann-Whitney conclusion. The median of differences would match MW (as does Cliff's delta), but median_diff calculates the difference of medians (I'm not sure what the corresponding test for that would be).
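For example (again with numbers I made up), the median of all pairwise differences, i.e. the Hodges-Lehmann estimate that is the location estimate tied to the MW test, can even have the opposite sign from the difference of group medians:

```python
# Made-up numbers illustrating the distinction: the median of all
# pairwise differences (the Hodges-Lehmann estimate associated with the
# MW test) can disagree in sign with the difference of group medians.
import numpy as np

x = np.array([1, 1, 1, 1, 8])  # hypothetical "treatment"
y = np.array([2, 2, 2, 0, 0])  # hypothetical "control"

diff_of_medians = np.median(x) - np.median(y)  # 1 - 2 = -1.0
pairwise = (x[:, None] - y[None, :]).ravel()   # all 25 differences x_i - y_j
median_of_diffs = np.median(pairwise)          # +1.0

print(diff_of_medians, median_of_diffs)
```

More than half of the pairwise comparisons favour x, even though the median of x sits below the median of y.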
I guess this sort of decision is always complex and should be left to the user, but the tutorial example should at least be adjusted, in my opinion!
Hi @tmchartrand ,
You are correct in pointing out the incongruity. The overarching intention of estimation plots is to de-emphasise the dichotomous, all-or-nothing nature of hypothesis tests, which current usage of P values exacerbates.
In the webapp at estimationstats.com, we do state below the results that
the P value(s) reported are the likelihood(s) of observing the effect size(s), if the null hypothesis of zero difference is true; they are included here to satisfy a common requirement of scientific journals.
which hopefully serves to inform the reader on what a P value really is...
We will update the tutorial to bring the intent and thinking in line with an estimation framework that de-emphasises P values; thanks for pointing this out!
Thanks for the reply!
I'm not sure my meaning got across fully, though. It's not so much the general tension between p-values and estimation plots that I was trying to raise. As I see it, the two can provide complementary views of the same question, provided the effect size measures the same property of the data as the test statistic does. That is the case when a mean-difference effect size plot is paired with a t-test, but not when it is paired with a MW test.
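Just to sketch what I mean by a matched pairing (plain SciPy rather than the DABEST API; the lognormal parameters are arbitrary, chosen only to give skewed data):

```python
# Sketch of a matched pairing: the Welch t-test statistic is built from
# the same sample means as the mean difference, whereas the MW test is
# built from ranks and so answers a different question about the data.
# (Distribution parameters below are arbitrary illustrative choices.)
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)
control = rng.lognormal(mean=0.0, sigma=1.0, size=50)
treatment = rng.lognormal(mean=0.3, sigma=1.0, size=50)

mean_diff = treatment.mean() - control.mean()
t_res = ttest_ind(treatment, control, equal_var=False)  # tests the means
u_res = mannwhitneyu(treatment, control)                # tests stochastic order

print(mean_diff, t_res.pvalue, u_res.pvalue)
```

The t-test p-value and the mean difference are statements about the same quantity; the MW p-value is not, which is exactly the mismatch in the tutorial's default.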