Data.table scope issue in functions
DavideMessinaARS opened this issue · 2 comments
DavideMessinaARS commented
I'm new to disk.frame so maybe I'm misunderstanding how it works with data.table.
I'm running disk.frame version 0.50 and data.table version 1.14.0.
library(disk.frame)
library(data.table)
setup_disk.frame()
test_dt = as.disk.frame(data.table(x = seq_len(10)), outdir = file.path(tempdir(), "test"), overwrite = TRUE)
test_fun <- function(fun_dt) {
  col_vect <- "x"
  print(fun_dt[, max(get(col_vect))])
}
col_vect <- "x"
test_fun(test_dt)
# returns [1] 5 10
rm(col_vect)
test_fun(test_dt)
# returns an error
The traceback for the error is:
Error in get(col_vect) : object 'col_vect' not found
13. stop(condition)
12. signalConditions(obj, exclude = getOption("future.relay.immediate", "immediateCondition"),
        resignal = resignal, ...)
11. signalConditionsASAP(obj, resignal = FALSE, pos = ii)
10. resolve.list(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE)
9. resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE)
8. value.list(fs)
7. value(fs)
6. future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...),
       get_chunk = `[`, expr = expr, envir = envir, future.globals = future.globals,
       future.packages = future.packages, future.scheduling = future.scheduling,
       future.chunk.size = future.chunk.size, future.stdout = future.stdout, ...
5. future.apply::future_lapply(get_chunk_ids(df, strip_extension = FALSE),
       function(chunk_id) {
           chunk = get_chunk(df, chunk_id, keep = keep_for_future)
           data.table::setDT(chunk) ...
4. `[.disk.frame`(fun_dt, , max(get(col_vect)))
3. fun_dt[, max(get(col_vect))]
2. print(fun_dt[, max(get(col_vect))])
1. test_fun(test_dt)
xiaodaigh commented
There's an issue with disk.frame where it doesn't work within functions. It has to do with the global scope and NSE (non-standard evaluation). I am designing a revamp of how disk.frame handles NSE, but the caveat is that functions are unlikely to compose well.
So this is a "known" issue.
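To illustrate the kind of scoping problem described above, here is a minimal base R sketch. It does not reproduce disk.frame's internals; `eval_in_global()` and `eval_in_caller()` are hypothetical stand-ins for a chunk-wise evaluator, and the only assumption is that the captured expression gets its free variables from the global environment rather than from the calling function's frame.

```r
# Hypothetical evaluator: captures the expression unevaluated and evaluates it
# against the data, using the GLOBAL environment for any other variables.
eval_in_global <- function(data, expr) {
  captured <- substitute(expr)
  eval(captured, envir = data, enclos = globalenv())
}

# Same idea, but free variables are looked up in the caller's frame instead.
eval_in_caller <- function(data, expr) {
  captured <- substitute(expr)
  eval(captured, envir = data, enclos = parent.frame())
}

chunk <- list(x = 1:10)

test_fun_global <- function(d) {
  col_vect <- "x"  # local to this function, invisible from globalenv()
  eval_in_global(d, max(get(col_vect)))
}

test_fun_caller <- function(d) {
  col_vect <- "x"  # found, because the caller's frame is the enclosure
  eval_in_caller(d, max(get(col_vect)))
}
```

Under these assumptions, `test_fun_caller(chunk)` finds the local `col_vect`, while `test_fun_global(chunk)` fails with "object 'col_vect' not found" unless a global `col_vect` happens to exist, which matches the behaviour reported in the original example.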
DavideMessinaARS commented
I found a workaround to the scope issue by sending the objects to the GlobalEnv:
test_fun <- function(fun_dt) {
  col_vect <<- "x"
  print(fun_dt[, max(get(col_vect))])
}
(or using assign)
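A tidier variant of the same idea, as a rough sketch: put the object in the global environment only for the duration of the call and restore the previous state afterwards. `with_global()` is a hypothetical helper written for this comment, not part of disk.frame or data.table, and it assumes the wrapped expression finishes before the helper returns.

```r
# Hypothetical helper: temporarily define `name` in the global environment,
# evaluate `expr`, then restore whatever was there before.
with_global <- function(name, value, expr) {
  had_old <- exists(name, envir = globalenv(), inherits = FALSE)
  if (had_old) old <- get(name, envir = globalenv(), inherits = FALSE)
  on.exit({
    if (had_old) {
      assign(name, old, envir = globalenv())  # put the old value back
    } else {
      rm(list = name, envir = globalenv())    # or remove the temporary object
    }
  })
  assign(name, value, envir = globalenv())
  force(expr)  # the expression is evaluated lazily, after the global exists
}

# Base R demonstration: the global is visible while the expression runs.
seen <- with_global("col_vect", "x", get("col_vect", envir = globalenv()))
```

Inside a function it could then be used as `with_global("col_vect", "x", print(fun_dt[, max(get(col_vect))]))`, which avoids leaving `col_vect` behind in the global environment.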
The problem is that I can't modify the function I'm using, so I'll need to either wait for a fix in disk.frame or write a stopgap solution myself.
In any case, thanks for your help.