wilkelab/ggridges

scalar quantiles in stat_density_ridges

shabbybanks opened this issue · 10 comments

I have tried to request a single vertical line in stat_density_ridges at, say, the 40th quantile via:

library(dplyr)
library(ggplot2)
library(ggridges)

# no vertical lines plotted at all:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=0.4)

# this gives two vertical lines, but I only want one, at 0.4:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.4,0.6))

I would guess this is a very simple fix.

Please run your example code through the reprex package so we can easily see the output.

reprex dies within the firewalls of this org, but I will do so when I get home.

The reprex would look as follows. (It seems there is currently a hack that achieves what, uh, that other user who is not me wanted.)

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
library(ggridges)
#> 
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#> 
#>     scale_discrete_manual

# no vertical lines plotted at all:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=0.9) + 
  labs(title='I want a vertical line at the 0.90 quantile but do not get one.')
#> Picking joint bandwidth of 0.181

# this gives two vertical lines:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.1,0.9)) + 
  labs(title='I can get two verticals at specific quantiles')
#> Picking joint bandwidth of 0.181

# I guess I could use this ugly hack?
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.9,0.9)) + 
  labs(title='Is this hack the right way to do this? Really?')
#> Picking joint bandwidth of 0.181

Created on 2019-12-03 by the reprex package (v0.3.0)
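The results in this reprex are consistent with a length-one `quantiles` value being read as a count of equal-probability bins rather than as a probability, which would explain why `c(0.9,0.9)` draws a line while `0.9` does not. The base R sketch below illustrates that reading next to one possible way to disambiguate a scalar. Both helper names (`probs_count_only`, `probs_disambiguated`) are hypothetical, and neither function is taken from the ggridges source; the handling of out-of-range values is likewise an assumption.

```r
# Hypothetical helpers for illustration only; not taken from the ggridges source.

# Reading 1: a length-one value is always a count of equal-probability bins.
probs_count_only <- function(quantiles) {
  if (length(quantiles) == 1) {
    if (quantiles > 1) seq_len(quantiles - 1) / quantiles else NA_real_
  } else {
    quantiles
  }
}

# Reading 2: a length-one value >= 1 is a count; anything else is treated
# as a vector of probabilities.
probs_disambiguated <- function(quantiles) {
  if (length(quantiles) == 1 && quantiles >= 1) {
    if (quantiles > 1) seq_len(quantiles - 1) / quantiles else NA_real_
  } else {
    probs <- quantiles
    # Assumption: probabilities outside [0, 1] are dropped rather than rejected.
    probs[probs < 0 | probs > 1] <- NA_real_
    probs
  }
}

probs_count_only(0.9)          # NA: no line can be drawn
probs_count_only(c(0.9, 0.9))  # 0.9 0.9: the duplicated-value workaround
probs_disambiguated(0.9)       # 0.9
probs_disambiguated(4)         # 0.25 0.50 0.75
probs_disambiguated(-0.5)      # NA
```

Under the second reading, integer counts such as `quantiles=4` keep their meaning, and only scalars below 1 change behavior.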

Thanks. As I said, your PR looks good to me, but I'd like to see a few similar examples with the fix in place, to make absolutely sure everything behaves as expected.

I'm not sure I understand what you mean by a few similar examples, but here is another reprex:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
library(ggridges)
#> 
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#> 
#>     scale_discrete_manual

# no vertical lines plotted at all:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=0.95,alpha=0.5) + 
  labs(title='I want a vertical line at the 0.95 quantile but do not get one.')
#> Picking joint bandwidth of 0.181

# this gives two vertical lines:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.05,0.95),alpha=0.5) + 
  labs(title='I can get two verticals at specific quantiles')
#> Picking joint bandwidth of 0.181

# this gives three verticals
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=4,alpha=0.5) + 
  labs(title='I can get three verticals, at the quartile cutoffs')
#> Picking joint bandwidth of 0.181

# three verticals
set.seed(1234)
tibble(yval=sample(letters,1e5,replace=TRUE),xval=rnorm(1e5)) %>%
  ggplot(aes(y=yval,x=xval)) + 
  stat_density_ridges(quantile_lines=TRUE,quantiles=pnorm(c(-1,0,1)),alpha=0.5) + 
  labs(y='random letter',x='standard normal draw',
       title='I expect three verticals, near zero and +/- 1')
#> Picking joint bandwidth of 0.171

# three upper values
data(storms)

storms %>%
  filter(year >= 2000) %>%
  mutate(facyear=factor(year)) %>%
  ggplot(aes(y=facyear,x=lat,color=status,fill=status)) + 
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.90,0.95,0.99),alpha=0.3) + 
  labs(y='storm year',x='latitude',
       title='I expect three verticals')
#> Picking joint bandwidth of 2.65

Created on 2019-12-04 by the reprex package (v0.3.0)
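For the `pnorm(c(-1,0,1))` example, the expectation of lines "near zero and +/- 1" can be checked numerically without a plot. The sketch below uses only base R and a pooled sample; the plot above splits a similar sample across 26 letters, so the per-group quantiles there will be somewhat noisier than these.

```r
# Probabilities passed as `quantiles` in the example above.
probs <- pnorm(c(-1, 0, 1))
print(round(probs, 4))  # 0.1587 0.5000 0.8413

# Pooled standard normal sample (the plot splits a similar sample by letter).
set.seed(1234)
x <- rnorm(1e5)

# Sample quantiles at those probabilities should fall close to -1, 0, and 1.
q <- quantile(x, probs = probs, names = FALSE)
print(round(q, 2))
```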

The point is to run examples with the patched code of the PR, to show that the patch works. I get that the current code has issues. But does it behave as expected after the fix has been applied?

I see, something like a unit test, but which has to be visually inspected? Do you need more examples? Are they supposed to be interesting?

I don't get what the issue is. You proposed a PR, you want me to merge it, so please demonstrate it works as expected.

Here are two example PRs from ggplot2. Maybe this helps:
tidyverse/ggplot2#3546
tidyverse/ggplot2#3494

I apologize, I am not familiar with your dev workflow. (Partly because the results are visual and so are hard to unit test; for another package I would look for unit tests.) Here are the examples with the fix in place:

libd <- '~/.Rlib_ggridge'

# I installed via:
# require(devtools)
# dir.create(libd)
# install_github('shabbybanks/ggridges',ref='patch-1',lib=libd)
# .libPaths(unique(c(libd,.libPaths())))

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
library(ggridges,lib.loc=libd)
#> 
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#> 
#>     scale_discrete_manual

# new behavior: I expect a vertical at 0.95
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=0.95,alpha=0.5) + 
  labs(title='I expect a vertical line at the 0.95 quantile.')
#> Picking joint bandwidth of 0.181

# I expect two lines very close to each other:
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=0.48,alpha=0.5) +  # new behavior
  stat_density_ridges(quantile_lines=TRUE,quantiles=2,alpha=0.5) +   # old behavior
  labs(title='I expect two verticals near the median')
#> Picking joint bandwidth of 0.181
#> Picking joint bandwidth of 0.181

# should this throw an error? not for me to decide.
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=-0.5,alpha=0.5) + 
  labs(title='What should happen for a negative quantile?')
#> Picking joint bandwidth of 0.181

# old behavior should be unchanged:

# two verticals surrounding 0.90 of probability
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=c(0.05,0.95),alpha=0.5) + 
  labs(title='I can get two verticals at specific quantiles')
#> Picking joint bandwidth of 0.181

# this gives three verticals
iris %>%
  ggplot(aes(y=Species,x=`Sepal.Length`)) +
  stat_density_ridges(quantile_lines=TRUE,quantiles=4,alpha=0.5) + 
  labs(title='I can get three verticals, at the quartile cutoffs')
#> Picking joint bandwidth of 0.181

Created on 2019-12-07 by the reprex package (v0.3.0)
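Because the plots have to be inspected by eye, it can help to have the expected line positions as numbers. The sketch below computes per-species sample quantiles of `Sepal.Length` in base R. It assumes the stat places its lines at sample quantiles of the underlying data, which is an assumption about the default behavior and is not shown in this thread.

```r
# Sample quantiles per species, computed directly from the data.
# Assumption: the plotted lines sit at (or very near) these positions.
by_species <- split(iris$Sepal.Length, iris$Species)

q95 <- vapply(by_species, quantile, numeric(1), probs = 0.95, names = FALSE)
q48 <- vapply(by_species, quantile, numeric(1), probs = 0.48, names = FALSE)
med <- vapply(by_species, median, numeric(1))

# One row per statistic, one column per species.
print(rbind(q95 = q95, q48 = q48, median = med))
```

The `q48` and `median` rows correspond to the two lines expected to sit close together in the `quantiles=0.48` versus `quantiles=2` comparison.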

Thanks for the merge, and thanks again for the package! Sorry for being so dense.