How to display p-values adjusted across multiple groups of comparisons with pwpp()?
qdread opened this issue · 7 comments
Hi Russ, I appreciate your great documentation on comparisons, in particular your answer on this stackexchange post about how to define the family for multiple comparisons correction differently than the groups within which the pairwise comparisons are done.
My question is how to interact this with a plot generated with pwpp()
. I have two interacting factors and want to do pairwise comparisons of factor A levels within each level of factor B. But I want to use FDR correction across all comparisons done in this way. For example if there are 3 levels of factor A and 100 levels of factor B, that's 3 pairwise comparisons within each of 100 groups for a total of 300 comparisons.
I have no problem doing this to display in a table, as in this example:
library(emmeans)
data("Alfalfa", package = "nlme")
badmodel <- lm(Yield ~ Variety * Date, data = Alfalfa) # nlme data
emms <- emmeans(badmodel, ~ Variety | Date)
# These two things return the same result which is good
contrast(emms, method = 'pairwise') |> rbind(adjust = 'fdr')
contrast(emms, method = 'pairwise') |> summary(by = NULL, adjust = 'fdr')
However with a pwpp()
plot it only does the FDR adjustment within each Date group which is correcting for fewer comparisons than I want it to. If I switch pwpp()
to by = NULL
this is also not correct because it now does every possible comparison of combinations of Variety and Date, instead of just the Variety comparisons within each Date.
pwpp(emms, method = 'pairwise', adjust = 'fdr') # Adjusts for wrong number of comparisons
pwpp(emms, method = 'pairwise', adjust = 'fdr', by = NULL) + facet_wrap(~ Date) # Does the wrong comparisons
Thanks in advance for your help!
I think as it is currently designed, pwpp()
and pwpm
cannot display a subset of all the comparisons it is capable of showing.
I suppose it is technically possible to do this, but I'm having trouble thinking of what kind of argument to provide. Thinking out loud, maybe it's possible to provide an object that already has the comparisons you want, such as
contrast(emms, method = 'pairwise') |> rbind(adjust = 'fdr') -> what.we.want
and maybe then have something like method = "existing"
which requires a coef
argument (or attribute), like
> coef(pairs(emms))
Variety Date c.1 c.2 c.3 c.4 c.5 c.6 c.7 c.8 c.9 c.10 c.11 c.12
1 Cossack None 1 1 0 0 0 0 0 0 0 0 0 0
2 Ladak None -1 0 1 0 0 0 0 0 0 0 0 0
3 Ranger None 0 -1 -1 0 0 0 0 0 0 0 0 0
4 Cossack S1 0 0 0 1 1 0 0 0 0 0 0 0
5 Ladak S1 0 0 0 -1 0 1 0 0 0 0 0 0
6 Ranger S1 0 0 0 0 -1 -1 0 0 0 0 0 0
7 Cossack S20 0 0 0 0 0 0 1 1 0 0 0 0
8 Ladak S20 0 0 0 0 0 0 -1 0 1 0 0 0
9 Ranger S20 0 0 0 0 0 0 0 -1 -1 0 0 0
10 Cossack O7 0 0 0 0 0 0 0 0 0 1 1 0
11 Ladak O7 0 0 0 0 0 0 0 0 0 -1 0 1
12 Ranger O7 0 0 0 0 0 0 0 0 0 0 -1 -1
But this is pretty shaky because we can't get the coefs directly:
> coef(what.we.want)
No contrast coefficients are available
NULL
The coefs are nulled-out when we do the rbind()
Let me ponder this for a while.
I figured out how to do what you want. Note that the result of
coefs()` above is a data frame, so let
coefs = coefs(pairs(emms))
if we exclude the first two columns, we have a named list of contrast coefficients. Thus, just use them as the contrast method (we have to null-out the by
grouping first):
pwpp(emms, coefs[-(1:2)], by = NULL, adjust = "fdr")
I have a less-hacky way to do this. Note that we can get the combined groups by using the options
argument to contrast()
.
contrast(emms, "pairwise", options = list(by = NULL, adjust = "fdr"))
So, one would hope that
pwpp(emms, "pairwise", options = list(by = NULL, adjust = "fdr"))
would work right. However, it doesn't, because by
is an actual argument to pwpp()
and is used in its code. However, (and luckily), the first thing it does is gets the contrasts. So I added a line of code to change by
to whatever it is after that call. And now it works! You may wait till the next update to CRAN, or wait until I push it up to GitHub and then install it from here.
Hi Russ this is excellent. I would be curious if there is a way to get the pwpp()
plot with facets corresponding to the by groups. Both of the following give less than optimal results, using the example data from before.
pwpp(emms, "pairwise", options = list(by = NULL, adjust = "fdr")) + facet_wrap(~ Date)
pwpp(emms, "pairwise", options = list(by = NULL, adjust = "fdr")) + facet_wrap(~ Date, scales = 'free_y')
Well, I see that the free_y
has some issues with rendering, but the first one is pretty much exactly what I expected. What to you is less than optimal? Just the fact that all the labels occur in each panel?
I believe that the rendering issues in the second one are due to the fact that the line segments are rendered in two parts with a color change halfway between, and the midpoints of those segments don't get included in the grouping as far as scaling is concerned. Offhand, I am not sure that can be fixed, just because in general, the midpoints can be associated with more than one level of a given factor.
Yes, it was just that the extra labels occur in each panel but it's not a huge issue. Thanks so much for working on it!
The issue with scales = 'free_y'
messing things up has to do with the fact that the colors of the comparison bars change color at their midpoints. The idea is that the color at each end is the same as the color of the other mean, supposedly making it easier to identify which mean is being compared. That creates a need for a new grouping factor for use by the ggplot
code. I don't see a way that I can disentangle those groupings. so I think I'm stuck with leaving things as they are.