tidyverse/ggplot2

position_dodge2() should handle both point and interval geoms

frostell opened this issue · 10 comments

Displaying raw data together with box plots can be useful.

When both the 'x' and 'fill' aesthetics are used, geom_boxplot() handles NAs elegantly through the new position_dodge2(). Unfortunately, geom_point() with a matching 'colour' aesthetic is misaligned with the boxes under both position_dodge() and position_dodge2().

This problem only occurs when NAs are present, and position_dodge() and position_dodge2() get it wrong in different ways:

reprex

library(tidyverse)

dat <- data.frame("value" = rnorm(n=30, mean=2, sd=0.5),
                  "group" = LETTERS[1:3],
                  "x" = factor(1:2))

dat$value[dat$group=="A"&dat$x=="1"] <- NA

ggplot(dat, aes(x=x, y=value)) +
  geom_boxplot(aes(fill=group), alpha=0.3) +
  geom_point(aes(colour=group), position=position_dodge(width=0.75), size=3, alpha=0.5)
#> Warning: Removed 5 rows containing non-finite values (stat_boxplot).
#> Warning: Removed 5 rows containing missing values (geom_point).

ggplot(dat, aes(x=x, y=value)) +
  geom_boxplot(aes(fill=group), alpha=0.3) +
  geom_point(aes(colour=group), position=position_dodge2(width=0.75), size=3, alpha=0.5)
#> Warning: Removed 5 rows containing non-finite values (stat_boxplot).
#> Warning: Removed 5 rows containing missing values (geom_point).

Created on 2018-03-13 by the reprex package (v0.2.0).

(What did I miss to not get the reprex output in my comment?)

Did you use the reprex package?

If you didn't, definitely check it out. It's especially awesome for ggplot2/viz stuff since it automatically generates the plots and uploads them (via imgur, I believe, but that's not something the user sees/needs to worry about).

No, I didn't use it. I erroneously thought it was somehow built into GitHub. Now I realise it's an R package that gives me output to copy/paste into GitHub (right?). I will try to use it next time!

Slightly more minimal reprex:

library(ggplot2)

df <- tibble::tribble(
  ~g, ~x, ~y,
  "x", "1", 1,
  "x", "1", 2,
  "x", "1", 3,
  "y", "1", NA,
  "x", "2", 1,
  "x", "2", 2,
  "x", "2", 3,
  "y", "2", 4,
  "y", "2", 5,
  "y", "2", 6
)

ggplot(df, aes(x, y, colour = g)) +
  geom_boxplot() +
  geom_point(position = position_dodge(width = 0.75))
#> Warning: Removed 1 rows containing non-finite values (stat_boxplot).
#> Warning: Removed 1 rows containing missing values (geom_point).

ggplot(df, aes(x, y, colour = g)) +
  geom_boxplot() +
  geom_point(position = position_dodge2(width = 0.75))
#> Warning: Removed 1 rows containing non-finite values (stat_boxplot).
#> Warning: Removed 1 rows containing missing values (geom_point).

Created on 2018-05-09 by the reprex package (v0.2.0).

I think the key problem is that position_dodge2() does not handle point geoms, where grouping is determined based on the group, not the position.

This is going to require more thinking about how dodging works, and it's technically a feature not a bug, so I'm going to move to a future ggplot2 release.

Probably wise! (For anyone in need of an ad hoc solution, see #2481.)
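
For readers who want something to try in the meantime, here is a minimal sketch of one possible workaround. It is not confirmed to be the approach discussed in #2481. It rests on an assumption suggested by the warnings in the reprex: the boxplot layer drops the NA rows in its stat (stat_boxplot), while the point layer drops them only in the geom (geom_point), so the two layers may see different groups at each x when dodging. Filtering the NA rows out before they reach the point layer should give both layers the same groups. The name `df_complete` is made up for this illustration.

```r
library(ggplot2)

# Same data as the minimal reprex above, built with base R.
df <- data.frame(
  g = c("x", "x", "x", "y", "x", "x", "x", "y", "y", "y"),
  x = c("1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  y = c(1, 2, 3, NA, 1, 2, 3, 4, 5, 6)
)

# Hypothetical helper data frame: drop incomplete rows up front, so the
# point layer never sees a group that has no finite y at a given x.
df_complete <- df[!is.na(df$y), ]

# The boxplot layer still uses the full data; only the point layer gets
# the pre-filtered copy. width = 0.75 is the dodge width used in the reprex.
p <- ggplot(df, aes(x, y, colour = g)) +
  geom_boxplot() +
  geom_point(data = df_complete, position = position_dodge(width = 0.75))

print(p)
```

This only sidesteps the mismatch for position_dodge(); it does not change how position_dodge2() treats point geoms, which is the feature request in this issue.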

Closing in favour of #3022, which has more discussion.
