Normalized density for facetted plots

Question

seasmith opened this issue 5 years ago · 4 comments

Would it make sense to add a computed statistic that gives the normalized (relative) number of neighbors per group, i.e. each point's neighbor count divided by the maximum neighbor count?

# i.e.
data$r_neighbors <- data$n_neighbors / max(data$n_neighbors)
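
A minimal base R sketch of the difference between dividing by the overall maximum and dividing by each facet's own maximum. The facet column and the counts are made up for illustration, and r_neighbors_facet is a hypothetical name; only n_neighbors and r_neighbors come from the snippet above.

```r
# Made-up neighbor counts for two facets (illustrative values only).
data <- data.frame(
  facet       = c("a", "a", "a", "b", "b", "b"),
  n_neighbors = c(10, 40, 20, 2, 8, 4)
)

# Global normalization: one maximum shared by all facets.
data$r_neighbors <- data$n_neighbors / max(data$n_neighbors)

# Per-facet normalization: each facet is divided by its own maximum,
# so the densest point in every facet gets the value 1.
data$r_neighbors_facet <- ave(data$n_neighbors, data$facet,
                              FUN = function(n) n / max(n))
```

With the global version, a sparse facet never gets close to 1, so its points all land at one end of the color scale. The per-facet version gives every facet the full range.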

Answer 1 · 2019-09-04T13:57:19.000Z

Yes it would make sense in some cases! I wouldn't want to make this the default behavior because I think raw neighbor counts are a bit more intuitive than relative ones, but for facetted plots I see how it can be useful. Maybe something like geom_pointdensity(relative=TRUE) would be worthwhile?

Answer 2 · 2019-09-06T17:33:38.000Z

Returning a computed stat would bring the function's behavior in line with other ggplot2 functions (e.g. stat_density_2d returns density, ndensity, level, and nlevel).

# Example
library(ggplot2)
library(ggpointdensity)

ggplot(diamonds, aes(carat, price)) +
  geom_pointdensity(aes(color = stat(r_neighbors)))

I feel density and ndensity are more in line with ggplot2 and would make the function more extensible (e.g. if the function accepted something like method = "kde2d" for 2D kernel density or method = "bkde2d" for 2D binned kernel density).

# Example

# Default method would be 'nn'
ggplot(diamonds, aes(carat, price)) +
  geom_pointdensity(aes(color = stat(ndensity)), method = "nn")

# kernel-density
ggplot(diamonds, aes(carat, price)) +
  geom_pointdensity(aes(color = stat(ndensity)), method = "kde2d")

# binned kernel-density
ggplot(diamonds, aes(carat, price)) +
  geom_pointdensity(aes(color = stat(ndensity)), method = "bkde2d")
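
As a rough illustration of what a "kde2d" method might compute for each point, the sketch below estimates a 2D kernel density with MASS::kde2d and rescales it so the maximum is 1. point_ndensity is a hypothetical helper, not a ggpointdensity function, and looking up the nearest lower grid cell with findInterval is a simplification (no interpolation between grid nodes).

```r
library(MASS)  # provides kde2d(); a recommended package shipped with R

# Hypothetical helper (an assumption, not part of ggpointdensity):
# returns a density estimate per point, rescaled to a maximum of 1.
point_ndensity <- function(x, y, n = 100) {
  # Kernel density evaluated on an n x n grid spanning the data range.
  grid <- MASS::kde2d(x, y, n = n)
  # Grid cell each point falls into (indices are kept inside the grid).
  ix <- findInterval(x, grid$x, all.inside = TRUE)
  iy <- findInterval(y, grid$y, all.inside = TRUE)
  density <- grid$z[cbind(ix, iy)]
  density / max(density)
}

# Usage on simulated data.
set.seed(1)
x <- rnorm(200)
y <- rnorm(200)
nd <- point_ndensity(x, y)
```

A binned variant could swap in KernSmooth::bkde2D, which returns its grid as x1, x2, and fhat rather than x, y, and z.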

Answer 3 · 2019-09-06T18:05:41.000Z

> Returning a computed stat would bring the function's behavior in line with other ggplot2 functions (e.g. stat_density_2d returns density, ndensity, level, and nlevel).

This is already the case. stat_pointdensity computes a stat called n_neighbors.
I just realized you can even use this stat to plot the density as you originally proposed:

# dat: a data frame with columns x and y
library(viridis)  # provides scale_color_viridis()
ggplot(dat, aes(x = x, y = y, color = stat(n_neighbors / max(n_neighbors)))) +
    geom_pointdensity() +
    scale_color_viridis()

I could tweak stat_pointdensity to return both n_neighbors and the normalized density for convenience.
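
A minimal sketch of a stat-like helper that returns both columns, using a brute-force distance matrix in base R. neighbor_stats and its radius argument r are hypothetical, and excluding the point itself from its own count is a choice made here; none of this is taken from the ggpointdensity source.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# counts the other points within radius r of each point and also returns
# the count divided by the largest count.
neighbor_stats <- function(x, y, r) {
  d <- as.matrix(dist(cbind(x, y)))            # pairwise Euclidean distances
  n_neighbors <- unname(rowSums(d <= r)) - 1   # subtract 1 to drop the point itself
  data.frame(
    x = x,
    y = y,
    n_neighbors = n_neighbors,
    ndensity = n_neighbors / max(n_neighbors)  # NaN if no point has a neighbor
  )
}

# Usage: three points close together and one isolated point.
pts <- neighbor_stats(x = c(0, 0, 1, 10), y = c(0, 1, 0, 10), r = 1)
```

The full distance matrix grows quadratically with the number of points, so this only suits small data sets.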

Regarding your last suggestion with method = "something", I'm experimenting with something like this at the moment, mostly to test different algorithms and find an efficient one that can handle many points (issue #2).

Answer 4 · 2020-02-07T12:41:30.000Z

This was implemented in @bjreisman's recent pull request #8, so I'm closing this issue.