support for base and dplyr joins
brownag opened this issue ยท 4 comments
Cool package! I have been intrigued by subclasses of tbl/data.frame, so I had to try this one out.
Here is a quick little reprex of something I noticed. I could see why this might be intended behavior -- but also bumped into it pretty quickly when dropping reactibble into a typical workflow of mine.
I would be curious to hear what you think!
library(reactibble)
rt1 <- reactibble(a = 1:10, b = ~ rev(a))
rt2 <- reactibble(a = 2:5, c = letters[1:4])
res <- dplyr::left_join(rt1, rt2)
#> Joining, by = "a"
# still reactibble
class(res)
#> [1] "reactibble" "tbl_df" "tbl" "data.frame"
# a/b reversed; b is reactive
rt1$a <- rev(rt1$a)
rt1
#> # A reactibble: 10 x 2
#> a b
#> <int> <~int>
#> 1 10 1
#> 2 9 2
#> 3 8 3
#> 4 7 4
#> 5 6 5
#> 6 5 6
#> 7 4 7
#> 8 3 8
#> 9 2 9
#> 10 1 10
# a/b not reversed; b is not reactive
res$a <- rev(res$a)
res
#> # A reactibble: 10 x 3
#> a b c
#> <int> <int> <chr>
#> 1 10 10 <NA>
#> 2 9 9 a
#> 3 8 8 b
#> 4 7 7 c
#> 5 6 6 d
#> 6 5 5 <NA>
#> 7 4 4 <NA>
#> 8 3 3 <NA>
#> 9 2 2 <NA>
#> 10 1 1 <NA>
Created on 2021-01-07 by the reprex package (v0.3.0)
Thanks a lot for your report.
It came from the fact that dplyr joins call as_tibble
, and I had defined as_tibble
to convert all columns to static. I removed the as.data.frame
, as_tibble
, and as.data.table
methods for now and it works as expected :
library(reactibble)
rt1 <- reactibble(a = 1:10, b = ~ rev(a))
rt2 <- reactibble(a = 2:5, c = letters[1:4])
res <- dplyr::left_join(rt1, rt2)
#> Joining, by = "a"
rt1$a <- rev(rt1$a)
rt1
#> # A reactibble: 10 x 2
#> a b
#> <int> <~int>
#> 1 10 1
#> 2 9 2
#> 3 8 3
#> 4 7 4
#> 5 6 5
#> 6 5 6
#> 7 4 7
#> 8 3 8
#> 9 2 9
#> 10 1 10
res$a <- rev(res$a)
res
#> # A reactibble: 10 x 3
#> a b c
#> <int> <~int> <chr>
#> 1 10 1 <NA>
#> 2 9 2 a
#> 3 8 3 b
#> 4 7 4 c
#> 5 6 5 d
#> 6 5 6 <NA>
#> 7 4 7 <NA>
#> 8 3 8 <NA>
#> 9 2 9 <NA>
#> 10 1 10 <NA>
Additional thoughts :
base::merge
makes all columns static though.
Ultimately we should design methods for merge and dplyr joins, and then we might try to reimplement the as.data.frame and as_tibble methods.
merge and dplyr join will run their data.frame method if the reactibble object is not provided first, in that case we cannot do much, it should return a non reactibble object with static columns (right now left_join(as.data.frame(r1), rt2)
will return a data frame with reactive columns, which we want to avoid).
as.data.frame.reactibble can detect that it has been called by merge.data.frame and trigger a warning.
Our methods should enforce that one cannot join by a reactive column.
Great, thanks for explanation and fix, makes sense.
For what it is worth I frequently use base::merge
over dplyr::left_join
-- and am familiar with the revert-to-data.frame that happens when merge
is called on tibbles -- so would be supportive of methods designed explicitly to handle persistence of reactivity.
I have implemented methods for dplyr joins, they do the same but make sure we don't join by reactive columns.
I decided for now to keep merge consistent with what happens to tibbles, so it creates a data frames with static columns, not what you hoped for @brownag (sorry!) but I believe it is for the best.
Also my current view (see last comment in #5 ) is that tibbles can be out of sync on purpose, while data frames shouldn't, and for consistency I would need to define merge.tbl_df
, which would be bad practice and rude to tidyverse maintainers.
This makes sense. I figured you would wind up going this way after doing some reading and thinking on it.
I think they have explicitly said that merge.tbl_df could not be done properly without multiple dispatch? I remember reading issue on tibble.
Great work and I appreciate you writing out your thought process.