data transformation
sa-lee opened this issue · 8 comments
How can we provide a sugar / dplyr-like interface for the Vega-Lite transformation API?
https://observablehq.com/@uwdata/data-transformation?collection=@uwdata/visualization-curriculum
Just adding one more example here for later discussion: https://bl.ocks.org/amitkaps/a484b94a7e1e0705c5ec865ba31f463c
The `filter` transform takes more types of predicates than just `datum` expressions.
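To make the predicate types concrete, here is a rough sketch of several Vega-Lite `filter` transforms written as plain R lists that mirror the JSON spec. The field names (`station`, `tmin`, `tmax`) and the selection name `timeline` are placeholders for illustration, and the `selection` key follows the Vega-Lite v4 spelling (v5 uses `param` instead).

```r
# Each element mirrors one Vega-Lite `filter` transform as an R list.
# Field and selection names are hypothetical placeholders.
filters <- list(
  # Expression string that refers to the current row via `datum`
  expression   = list(filter = "datum.tmax > 20"),
  # Field predicates: equality, range, and set membership
  field_equal  = list(filter = list(field = "station", equal = "a")),
  field_range  = list(filter = list(field = "tmin", range = c(0, 10))),
  field_one_of = list(filter = list(field = "station", oneOf = c("a", "b"))),
  # Selection predicate: keep rows that fall inside a named selection
  selection    = list(filter = list(selection = "timeline")),
  # Logical composition of predicates
  composed     = list(filter = list(
    and = list(
      list(field = "tmin", gt = 0),
      list(selection = "timeline")
    )
  ))
)

str(filters, max.level = 2)
```

Only the first form needs `datum`; the others are the cases a `filter()` verb would have to cover when it is not given a plain logical expression.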
Thinking out loud, but I think if we want to scale visualisations we would probably need to move to another drawing library, à la https://github.com/visgl/deck.gl
Vega transform -> dplyr verbs
- calculate -> `mutate()`
- filter -> `filter()`, `vega() %>% filter(selection)` (now it's clear that `selection` corresponds to a vector of logicals)
- sample -> `sample_n()`
- lookup -> `left_join()`
- join aggregate -> TBD
- window -> prefixed with `vg_*()` for vector functions like `vg_mean()`
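For reference, the mapping above can be sketched on static data with plain {dplyr}. The `weather` and `stations` tables are made-up toy data, and the grouped `mutate()` shown for join aggregate is only one possible candidate for the item marked TBD, not a decision from this thread.

```r
library(dplyr)

# Hypothetical toy data for illustration only
weather <- tibble(
  station = c("a", "a", "b", "b"),
  tmin = c(5, 7, 2, 4),
  tmax = c(15, 18, 10, 13)
)
stations <- tibble(
  station = c("a", "b"),
  city = c("Auckland", "Wellington")
)

# calculate -> mutate(): derive a new field for every row
calculated <- weather %>% mutate(trange = tmax - tmin)

# filter -> filter(): keep rows where a logical vector is TRUE
filtered <- weather %>% filter(tmin > 4)

# sample -> sample_n(): random subset of rows
set.seed(1)
sampled <- weather %>% sample_n(2)

# lookup -> left_join(): attach fields from a secondary table by key
looked_up <- weather %>% left_join(stations, by = "station")

# join aggregate (TBD above): a grouped mutate() is one candidate, since it
# attaches a group-level summary to every original row
joined <- weather %>%
  group_by(station) %>%
  mutate(mean_tmax = mean(tmax)) %>%
  ungroup()

# window: running calculation within each group, using dplyr's cummean()
windowed <- weather %>%
  group_by(station) %>%
  mutate(cum_tmin = cummean(tmin)) %>%
  ungroup()
```

The point of the comparison is that every verb here returns a table with the same shape of rows as Vega-Lite's transform would, which is what makes the sugar plausible.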
For layer-specific transformation, use the data pronoun `.vega` as the data input, e.g. `.vega %>% filter(selection)`. By default, the `transform` arg in `vega_layer()` uses `filter()` when the input is a `selection`.
```r
timeline <- select_interval("x")
p_avg <- akl_weather %>%
  vega(enc(x = vg_month(date)), width = 600, height = 350) %>%
  # filter(timeline) %>%
  mark_ribbon(
    enc(y = vg_mean(tmin), y2 = vg_mean(tmax)),
    interpolate = "monotone",
    colour = "#fc9272", opacity = 0.3,
    transform = timeline) %>% # filter(.vega, timeline)
  mark_line(enc(y = vg_mean(prcp)), colour = "#3182bd",
    transform = timeline) %>%
  mark_point(enc(y = vg_mean(prcp)), colour = "#3182bd",
    transform = timeline) %>%
  resolve_views(scale = list(y = "independent"))
```
I like the dplyr idea.
Also a quick note on performance: it looks like `altair` caps data at 5000 rows? https://altair-viz.github.io/user_guide/data_transformers.html
I've been developing {dplyr} verbs, and I've come to a point where I don't see much use in transforming static data in an interactive setting. It is useful for transforming selected data, and the current development already handles that functionality. We'll discuss this more later today.
Drop {dplyr} development for transforming
I'm going to reopen this issue, since I can see more use cases for transforming data. But instead of dispatching on the static data, we could define {dplyr} verbs for a selection object.
Useful with `mutate(<selection>)` in particular.