const-ae/ggsignif

Annotations disappear when using scale_x_discrete() to change order of x-axis

rpmady opened this issue · 7 comments

Hello, when I plot a bar chart, the annotations disappear as soon as I add scale_x_discrete() to switch around the order of the groups on the x-axis.

Here's the data I'm trying to plot:

df <- structure(list(contrast = structure(c(1L, 1L, 1L), .Label = "near / far", class = "factor"), 
    trtmt = structure(1:3, .Label = c("constant", "pulsed_food", 
    "pulsed_nofood"), class = "factor"), ratio = c(5.07099352737315, 
    11.1731879523668, 1.3312513851146), SE = c(1.58712441846575, 
    8.31899028952686, 0.914462455274322), df = c(124, 124, 124
    ), lower.CL = c(2.72933986134215, 2.55958508525054, 0.34181466728037
    ), upper.CL = c(9.42168313989854, 48.773580428445, 5.18476946723847
    )), class = "data.frame", row.names = c(NA, 3L))

And here is the code and the plot without the scale_x_discrete active:

ggplot(df, aes(x=trtmt, y=ratio)) + 
  geom_bar(stat="identity", size=0.7, fill="gray", color="black") +
  geom_errorbar(aes(ymin=lower.CL, ymax=upper.CL), width=.2) +
  labs(y="Difference of Estimated Marginal Mean\nwith 95% Confidence Intervals", x="Treatment") + 
  theme_bw() + 
  theme(panel.border = element_blank(), panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.line = element_line(colour="black")) + 
  theme(text = element_text(size=15)) + 
  scale_y_continuous(limits=c(-2,49), breaks=seq(-2,49,5)) + 
  #scale_x_discrete(limits = c("pulsed_nofood","pulsed_food","constant"), 
   #                labels=c("pulsed_nofood" = "Pulsed (No Food)", "constant" = "Constant","pulsed_food" = "Pulsed (Food)")) + 
  geom_signif(y_position = c(25, 26), xmin = c(1,1), xmax = c(2,3),tip_length = 0, annotation = "*")

producing this plot:
image

When I un-comment the scale_x_discrete(), then the lines and asterisks go away:
image

This is very confusing because on another dataset, almost identical to this one, the scale_x_discrete doesn't seem to interfere:

df <- structure(list(contrast = structure(c(1L, 1L, 1L), .Label = "near / far", class = "factor"), 
    trtmt = structure(1:3, .Label = c("pulsed_nofood", "constant", 
    "pulsed_food"), class = "factor"), ratio = c(1.92122356321414, 
    5.23797787647802, 10.4941901194222), SE = c(0.878061062639006, 
    1.45350877127748, 4.46193848439194), df = c(123, 123, 123
    ), lower.CL = c(0.77747406435774, 3.02421834016281, 4.52311316079063
    ), upper.CL = c(4.74755384013795, 9.07223260639184, 24.3478379487034
    )), class = "data.frame", row.names = c(NA, 3L))

ggplot(df, aes(x=trtmt, y=ratio)) + 
  geom_bar(stat="identity", size=0.7, fill="gray", color="black") +
  geom_errorbar(aes(ymin=lower.CL, ymax=upper.CL), width=.2) +
  labs(y="Estimated Marginal Mean\nwith 95% Confidence Intervals", x="Treatment") + 
  theme_bw() + 
  theme(panel.border = element_blank(), panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.line = element_line(colour="black")) + 
  theme(text = element_text(size=15)) + 
  scale_y_continuous(limits=c(0,26), breaks=seq(0,26,1)) + 
  scale_x_discrete(limits = c("pulsed_nofood","pulsed_food","constant"), 
                   labels=c("pulsed_nofood" = "Pulsed (No Food)", "constant" = "Constant","pulsed_food" = "Pulsed (Food)")) + 
  geom_signif(y_position = c(25, 26), xmin = c(1,1), xmax = c(2,3),tip_length = 0, annotation = "*")

image
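One visible difference between the two data frames is the order of the factor levels of trtmt relative to the limits passed to scale_x_discrete(). The base R sketch below only makes that difference explicit. The variable names are illustrative, and whether this difference is what triggers the missing annotations is an assumption that the thread does not confirm.

```r
# Factor levels of trtmt, copied from the two structure() calls above
levels_failing <- c("constant", "pulsed_food", "pulsed_nofood")
levels_working <- c("pulsed_nofood", "constant", "pulsed_food")

# Axis order requested via scale_x_discrete(limits = ...)
limits <- c("pulsed_nofood", "pulsed_food", "constant")

# Axis position that each factor level ends up at, in level order
pos_failing <- match(levels_failing, limits)
pos_working <- match(levels_working, limits)

print(pos_failing)  # 3 2 1: the first level moves to the last position
print(pos_working)  # 1 3 2: the first level stays at position 1

# Alternative way to reorder the axis (untested against this bug):
# relevel the factor itself instead of passing limits to the scale.
trtmt <- factor(levels_failing, levels = levels_failing)
trtmt_reordered <- factor(trtmt, levels = limits)
print(levels(trtmt_reordered))
```

With the factor releveled this way, the plot could keep only the labels argument in scale_x_discrete() and drop limits, so that the numeric xmin and xmax values refer to the level order directly.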

Hi Rachael,

Yes, you are right, that indeed seems to be a bug. Thank you for the reproducible example, which made it easy to understand what is happening.

I have created a slightly simpler example case, where I see the same pattern:

library(ggplot2)
library(ggsignif)

ggplot(diamonds, aes(x=cut, y=price)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("Good", "Very Good"))) +
  scale_x_discrete(limits = c("Fair", "Good", "Very Good")) +
  NULL
#> Warning: Removed 35342 rows containing missing values (stat_boxplot).
#> Warning: Removed 35342 rows containing non-finite values (stat_signif).

ggplot(diamonds, aes(x=cut, y=price)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("Good", "Very Good"))) +
  scale_x_discrete(limits = c("Good", "Very Good", "Premium")) +
  NULL
#> Warning: Removed 23161 rows containing missing values (stat_boxplot).
#> Warning: Removed 23161 rows containing non-finite values (stat_signif).

Created on 2019-05-30 by the reprex package (v0.2.1)
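The row counts in the warnings above match the number of diamonds whose cut is not listed in limits, which suggests that scale_x_discrete(limits = ...) drops those rows before the stats run. A small check, assuming ggplot2 is installed (it provides the diamonds data); the variable names are illustrative:

```r
library(ggplot2)  # only needed here for the diamonds data set

# The limits used in the two plots above
limits_a <- c("Fair", "Good", "Very Good")
limits_b <- c("Good", "Very Good", "Premium")

# Count the rows whose cut falls outside each set of limits
dropped_a <- sum(!diamonds$cut %in% limits_a)
dropped_b <- sum(!diamonds$cut %in% limits_b)

print(dropped_a)  # 35342, as in the first pair of warnings
print(dropped_b)  # 23161, as in the second pair of warnings
```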

Unfortunately, I currently don't have the capacity to fix the bug immediately. But if by any chance you (or anyone else) are interested in learning more about writing a ggplot extension and fixing the bugs that arise, I would be very happy to give pointers on where to get started and to accept a pull request.

Best, Constantin

That's okay to not have it fixed immediately. Any idea when you might be able to get around to it? I could maybe take a stab at it, but I'm still not very skilled at coding (very new to the whole process!).

Cheers,
Rachael

That's a good question. I am currently finishing something up for the next week, but planned to go on holiday afterwards. So if the other project works smoothly (usually it doesn't), I might get around to it before I leave, otherwise it could take a month or two.

Hm, I think that working on a ggplot extension is a super rewarding project, because you can make stuff that is practically useful. There are a few things that are helpful to know first, but if you know how to write your own functions, you are already a good step along that road :)

If you are interested in debugging the package, you will first need a local copy of my repo. Jenny Bryan has a great book https://happygitwithr.com/ and there in particular chapter 28 Fork and Clone will explain how to get a copy of ggsignif onto your computer.

The next step is actually working with R packages. Here, I would recommend taking a look at Hadley Wickham's R Packages book to understand what the different files and folders actually do. On the other hand, you don't need to read the full thing; the most important steps are:

  1. Change some code
  2. Hit Ctrl+Shift+L to reload the package
  3. Rerun the code and see what changes

The best way to understand how a package works is to simply put a browser() statement somewhere, step through the code, and see how and when the internal functions are called.
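As a rough illustration of the browser() approach, here is a toy function; bracket_midpoint() is a hypothetical helper made up for this example and is not part of ggsignif. In a real session you would put the browser() call inside one of the package's own functions, reload the package (Ctrl+Shift+L in RStudio runs devtools::load_all()), and rerun the plot.

```r
# Hypothetical helper, used only to show where browser() goes
bracket_midpoint <- function(xmin, xmax) {
  # Pause here when run interactively; at the Browse[1]> prompt,
  # "n" runs the next line, "c" continues, and "Q" quits the debugger.
  if (interactive()) browser()
  (xmin + xmax) / 2
}

midpoint <- bracket_midpoint(1, 3)
print(midpoint)
```

The interactive() guard is only there so the snippet also runs in a non-interactive script; while debugging at the console, a bare browser() call works just as well.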

There is also a vignette, Extending ggplot2, that explains the specific structure of a ggplot extension. It is super helpful if you want to start from scratch, but it can probably also help explain why I had to do certain things the way I did.

I hope those resources are helpful for you, if you want to give it a try. Then, if anything else comes up or is unclear, just let me know.

Thank you for all of that! I think it will be a huge hurdle for me, but I will try it.

Great that you are giving it a try. Don't hesitate to ask if anything is unclear :)

Is there any follow up on this open bug? I just encountered the same error.

same question here =)