Normalized density for faceted plots
seasmith opened this issue · 4 comments
Would it make sense to add a computed statistic for the normalized (relative) number of neighbors per group, i.e. the neighbor count divided by the maximum neighbor count?
# i.e.
data$r_neighbors <- data$n_neighbors / max(data$n_neighbors)
Yes, it would make sense in some cases! I wouldn't want to make this the default behavior because I think raw neighbor counts are a bit more intuitive than relative ones, but for faceted plots I can see how it would be useful. Maybe something like geom_pointdensity(relative=TRUE) would be worthwhile?
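For illustration, here is a minimal base-R sketch of what a relative neighbor count per facet could look like. It is not how ggpointdensity computes its stats: count_neighbors(), the radius of 0.5, the panel column, and the simulated data are all assumptions made up for this example.

```r
# Hypothetical helper (not part of ggpointdensity): for each point, count how
# many other points lie within `radius` (Euclidean distance). Brute force, O(n^2).
count_neighbors <- function(x, y, radius) {
  d <- as.matrix(dist(cbind(x, y)))
  rowSums(d <= radius) - 1  # subtract 1 so a point does not count itself
}

# Simulated data: one sparse and one dense panel.
set.seed(1)
dat <- data.frame(
  x = c(rnorm(50), rnorm(200)),
  y = c(rnorm(50), rnorm(200)),
  panel = rep(c("sparse", "dense"), times = c(50, 200))
)

# Raw neighbor counts, computed separately within each panel.
dat$n_neighbors <- NA_real_
for (idx in split(seq_len(nrow(dat)), dat$panel)) {
  dat$n_neighbors[idx] <- count_neighbors(dat$x[idx], dat$y[idx], radius = 0.5)
}

# Relative counts: divide by the per-panel maximum so each panel spans 0..1.
dat$r_neighbors <- ave(dat$n_neighbors, dat$panel, FUN = function(n) {
  if (max(n) > 0) n / max(n) else n
})
```

With raw counts, a shared color scale is dominated by whichever panel has the most points; with the relative counts, every panel uses the full range of the scale.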
Returning a computed stat would bring the function's behavior in line with other ggplot2 functions (e.g. stat_density_2d returns density, ndensity, level, and nlevel).
# Example
library(ggplot2)
library(ggpointdensity)
ggplot(diamonds, aes(carat, price)) +
geom_pointdensity(aes(color = stat(r_neighbors)))
I feel density and ndensity are more in line with ggplot2 and would make the function more extensible (e.g. if the function accepted something like method = "kde2d" for 2D kernel density or method = "bkde2d" for 2D binned kernel density).
# Example
# Default method would be 'nn'
ggplot(diamonds, aes(carat, price)) +
geom_pointdensity(aes(color = stat(ndensity)), method = "nn")
# kernel-density
ggplot(diamonds, aes(carat, price)) +
geom_pointdensity(aes(color = stat(ndensity)), method = "kde2d")
# binned kernel-density
ggplot(diamonds, aes(carat, price)) +
geom_pointdensity(aes(color = stat(ndensity)), method = "bkde2d")
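As a rough sketch of what a "kde2d" method could compute per point, the example below uses MASS::kde2d() to estimate the density on a grid and then rescales it to an ndensity-style value. The grid size, the simulated data, and the simple grid lookup without interpolation are assumptions for illustration only, not the package's implementation.

```r
library(MASS)  # provides kde2d()

# Simulated, correlated data (assumption for this example).
set.seed(42)
x <- rnorm(500)
y <- x + rnorm(500, sd = 0.5)

# Estimate the 2D kernel density on a 100 x 100 grid spanning the data range.
kd <- MASS::kde2d(x, y, n = 100)

# Assign each point the density at the grid node just below it
# (a crude lookup; a real implementation might interpolate).
ix <- findInterval(x, kd$x, all.inside = TRUE)
iy <- findInterval(y, kd$y, all.inside = TRUE)
dens <- kd$z[cbind(ix, iy)]

# Rescale so the densest point has value 1, analogous to ndensity.
ndens <- dens / max(dens)
```

The resulting ndens vector could then be mapped to color in the same way as the neighbor counts.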
Returning a computed stat would bring the function's behavior in line with other ggplot2 functions (e.g. stat_density_2d returns density, ndensity, level, and nlevel).
This is already the case. stat_pointdensity computes a stat called n_neighbors.
I just realized you can even use this stat to plot the density as you originally proposed:
ggplot(dat, aes(x = x, y = y, color = stat(n_neighbors) / max(n_neighbors))) +
geom_pointdensity() +
scale_color_viridis()
I could tweak stat_pointdensity to return both n_neighbors and the density for convenience.
Regarding your last suggestion with method = "something", I'm experimenting with something like this at the moment, mostly to test different algorithms and find an efficient one that can handle many points (issue #2).
This was implemented in @bjreisman's recent pull request #8, so I'm closing.