pairing mean_diff with Mann-Whitney test
tmchartrand opened this issue · 5 comments
It seems to me that there's a conceptual issue with the default behavior shown in the tutorial: the documentation there states "By default, DABEST will report the two-sided p-value of the most conservative test that is appropriate for the effect size.", which in the example means that the mean_diff effect size reports a Mann-Whitney test p-value.
However, the Mann-Whitney test is, in general, unrelated to the mean difference; it corresponds to the median difference instead. With the right data, a positive mean difference and a negative median difference could both be valid conclusions!
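To make that concrete, here's a quick toy example (numbers made up by me, plain NumPy/SciPy rather than the DABEST API) where the two effect sizes point in opposite directions:

```python
# Toy example (made-up numbers, not DABEST itself) showing the mean and
# median differences pointing in opposite directions for skewed data.
import numpy as np
from scipy.stats import mannwhitneyu

control = np.array([2.0, 2.0, 2.0, 2.0])
treatment = np.array([1.0, 1.0, 1.0, 10.0])  # right-skewed by one large value

mean_diff = treatment.mean() - control.mean()            # 3.25 - 2.0 = +1.25
median_diff = np.median(treatment) - np.median(control)  # 1.0 - 2.0 = -1.0

# The MW test works on ranks, so it sides with the medians here,
# not with the positive mean difference it would be paired with.
stat, p = mannwhitneyu(treatment, control, alternative="two-sided")
print(mean_diff, median_diff, p)
```

The single large value drags the mean of the treatment group above the control's, while the ranks that MW sees mostly favour the control.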
Maybe the defaults could be restricted to pairing mean-based effect sizes with mean-based tests, and median-based effect sizes with median-based tests?
Actually, now that I think about it, even median_diff doesn't necessarily match the Mann-Whitney conclusion. The median of differences would match MW (as does Cliff's delta), but median_diff calculates the difference of medians (I'm not sure what the corresponding test for that would be).
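For example (again with numbers I made up), the median of all pairwise differences, i.e. the Hodges-Lehmann estimate that is the location estimate tied to the MW test, can even have the opposite sign from the difference of group medians:

```python
# Made-up numbers illustrating the distinction: the median of all
# pairwise differences (the Hodges-Lehmann estimate associated with the
# MW test) can disagree in sign with the difference of group medians.
import numpy as np

x = np.array([1, 1, 1, 1, 8])  # hypothetical "treatment"
y = np.array([2, 2, 2, 0, 0])  # hypothetical "control"

diff_of_medians = np.median(x) - np.median(y)  # 1 - 2 = -1.0
pairwise = (x[:, None] - y[None, :]).ravel()   # all 25 differences x_i - y_j
median_of_diffs = np.median(pairwise)          # +1.0

print(diff_of_medians, median_of_diffs)
```

More than half of the pairwise comparisons favour x, even though the median of x sits below the median of y.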
I guess this sort of decision is always complex and should be left to the user, but the tutorial example should at least be adjusted, in my opinion!
Hi @tmchartrand ,
You are correct in pointing out the incongruity. The overarching intention of estimation plots is to de-emphasise the dichotomous, all-or-nothing nature of hypothesis tests, which current usage of P values exacerbates.
In the webapp at estimationstats.com, we do state below the results that
the P value(s) reported are the likelihood(s) of observing the effect size(s), if the null hypothesis of zero difference is true; they are included here to satisfy a common requirement of scientific journals.
which hopefully serves to inform the reader on what a P value really is...
We will update the tutorial to bring the intent and thinking in line with an estimation framework that de-emphasises P values; thanks for pointing this out!
Thanks for the reply!
I'm not sure my meaning got across fully, though. It's not so much the general tension between p-values and estimation plots that I was trying to raise. As I see it, the two can provide complementary views of the same question, provided the effect size measures the same property of the data as the test statistic does. That is the case when a mean-difference effect size plot is paired with a t-test, but not when it is paired with a MW test.
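Just to sketch what I mean by a matched pairing (plain SciPy rather than the DABEST API; the lognormal parameters are arbitrary, chosen only to give skewed data):

```python
# Sketch of a matched pairing: the Welch t-test statistic is built from
# the same sample means as the mean difference, whereas the MW test is
# built from ranks and so answers a different question about the data.
# (Distribution parameters below are arbitrary illustrative choices.)
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)
control = rng.lognormal(mean=0.0, sigma=1.0, size=50)
treatment = rng.lognormal(mean=0.3, sigma=1.0, size=50)

mean_diff = treatment.mean() - control.mean()
t_res = ttest_ind(treatment, control, equal_var=False)  # tests the means
u_res = mannwhitneyu(treatment, control)                # tests stochastic order

print(mean_diff, t_res.pvalue, u_res.pvalue)
```

The t-test p-value and the mean difference are statements about the same quantity; the MW p-value is not, which is exactly the mismatch in the tutorial's default.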