dt_fill raises recycle error all of a sudden
gernophil opened this issue ยท 8 comments
I have data wrangling script, where I just want to fill all NA
s with the value above it. I used to do it this way:
library(data.table)
library(tidyfast)
col1 <- c("A", "B", "C", "D")
col2 <- c("X", NA, "Z", NA)
dt <- data.table(col1,
col2)
dt[, col2 := dt_fill(dt, col2, .direction = 'down')]
And this always worked, but now I get the error:
Error in `[.data.table`(dt, , `:=`(col2, dt_fill(dt, col2, .direction = "down"))) :
Supplied 2 items to be assigned to 4 items of column 'col2'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
going back to 0.2.1
fixes the issue. Would be great, if there was a better solution to this.
Per git bisect
, culprit is 55da55e
Thanks for reporting, will take a look this weekend
Thanks for posting. The update to dt_fill()
makes it possible for tidyfast
to leverage better coding with data.table
. The code that should work for your situation:
library(data.table)
library(tidyfast)
col1 <- c("A", "B", "C", "D", "E")
col2 <- c("X", NA, "X", NA, "Z")
col3 <- c("1", "1", "1", "2", "2")
dt <- data.table(col1, col2, col3)
# dt[, col2 := dt_fill(dt, col2, .direction = 'down')] # errors
# simple use
dt1 = dt_fill(dt, col2, .direction = 'down')
dt1[]
# by group (col3)
dt2 = dt_fill(dt, col2, id = col3, .direction = 'updown')
dt2[]
Simplifies the code too.
Thanks for clarifying and thank for your work. I have two questions for this though:
- It seems that the new
dt_fill()
does not allow in place modification anymore. Is that correct? - I use
dt_fill()
the way it worked before in some scripts. Now I need to change all of those and also tell people that use it, to change this (I know this could be easily done using git, bit not everyone uses that). Would it not have been better to make a new function instead of changing the already establisheddt_fill()
?
Thanks.
- You can set
immutable = FALSE
to mutate in place. - The goal of
dt_fill()
was to matchtidyr::fill()
while taking advantage ofdata.table
. This change was to better meet that goal.
Beyond the annoyance of changing code (sorry about that!) is there another reason to keep the previous behavior? It's possible to keep the previous behavior (potentially with a new argument or a different function).
Thanks :).
I guess there is not or I cannot really make an educated decision here, since I only use it for that case :). Maybe there is a way to announce those changes since I just needed to quickly rerun a script to create an updated plot and it didn't work anymore, which did cost some time to narrow it down.
It works again now and it's a quick fix. Thanks again for the package :).
Yeah I can work on making any breaking changes more obvious. Thanks for highlighting how you were using the function. Didn't realize it was being used that way so this is helpful for me ๐