HenrikBengtsson/progressr

progressr together with future.apply prevents parallel execution

frederikziebell opened this issue · 10 comments

Here's an adaptation of the progressr-intro vignette:

library(future.apply)
library(progressr)

plan(multisession)

handlers(global = TRUE)
handlers("progress")

my_fcn <- function(xs) {
  p <- progressor(along = xs)
  y <- future_lapply(xs, function(x, ...) {
    p()
    eigen(x)
  })
}

input <- rep(list(matrix(1:(200^2), ncol=200)), 1000)
my_fcn(input)

This code takes 75 seconds on my 12-core MacBook Pro, and importantly, only one R process shows full CPU load:
[Screenshot (2021-07-03 12:03): Activity Monitor showing only one R process under full CPU load]

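(For anyone who wants to reproduce the timing comparison in one go, here is a self-contained sketch that times the same call with and without the progress signal. The `run()` helper, the `signal` argument, and the smaller input size are my own additions for illustration, not part of the original report.)

```r
## Hypothetical helper (not from the original report): time the same
## future_lapply() call with and without the per-element progress update.
library(future.apply)
library(progressr)

plan(multisession)

run <- function(xs, signal = TRUE) {
  p <- progressor(along = xs)
  invisible(future_lapply(xs, function(x, ...) {
    if (signal) p()  # the progress update is the only difference
    eigen(x)
  }))
}

input <- rep(list(matrix(1:(100^2), ncol = 100)), 200)
system.time(run(input, signal = TRUE))   # with progress updates
system.time(run(input, signal = FALSE))  # without progress updates
```

If the first `system.time()` result is dramatically larger than the second, you are seeing the same overhead described above.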
Conversely, commenting out the progress update and running

plan(multisession)

handlers(global = TRUE)
handlers("progress")

my_fcn <- function(xs) {
  p <- progressor(along = xs)
  y <- future_lapply(xs, function(x, ...) {
    # p()
    eigen(x)
  })
}

input <- rep(list(matrix(1:(200^2), ncol=200)), 1000)
my_fcn(input)

finishes the computation in 15 seconds and spawns multiple R processes:
[Screenshot (2021-07-03 12:04): Activity Monitor showing multiple R processes running simultaneously]

Update: Looking at the output of Sys.getpid(), 12 distinct PIDs are reported in both cases, corresponding to my 12 processors. Still, it is odd that I never see multiple R processes running simultaneously in the Activity Monitor.

Update 2: The issue is only present in RStudio. Everything works as expected when running the code from the console via Rscript.

Thanks for the report. Interesting that it's only happening in RStudio. Unfortunately, I cannot reproduce this on my Linux machine.

What's your sessionInfo()?

Also, only just in case it's related to #116, which I doubt since you say it only happens in RStudio, can you try with:

my_fcn2 <- function(n) {
  p <- progressor(n)
  y <- future_lapply(1:n, function(x, ...) {
    p()
    x <- matrix(1:(200^2), ncol=200)
    eigen(x)
  })
}

and

y <- my_fcn2(1000)

Do you still see a difference? If not, then it's probably related to #116.

I don't see a difference, i.e. parallel execution works in the console but not in RStudio. I'm not surprised that it's related to #116, since I've seen this behavior in another project where I wanted parallel computation on a large data object. Here's my sessionInfo():

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] progressr_0.8.0    future.apply_1.7.0 future_1.21.0     

loaded via a namespace (and not attached):
[1] compiler_4.1.0    parallelly_1.26.1 parallel_4.1.0    tools_4.1.0       listenv_0.8.0     codetools_0.2-18  digest_0.6.27     globals_0.14.0  

Okay. Then what if you try the develop version (see #116 (comment)) and then run your original example again?

It still does not parallelize in RStudio. :/

Okay, but let's sort one thing out first. I'm quite certain it does indeed parallelize; rather, I think there's some extreme overhead that makes the total CPU load look like sequential processing. Run the following:

my_fcn <- function(xs) {
  p <- progressor(along = xs)
  y <- future_lapply(xs, function(x, ...) {
    p()
    r <- eigen(x)
    list(pid = Sys.getpid(), eigen = r)
  })
}

and then

size <- 20L ## Choose what you'd like here; I used 20 instead of 200 to speed it up
input <- rep(list(matrix(1:(size^2), ncol=size)), times = 1000)
res <- my_fcn(input)

## Validate that we actually ran in parallel;
## there should be one unique PID per worker.  
pids <- vapply(res, FUN = function(z) z$pid, FUN.VALUE = NA_integer_)
upids <- unique(pids)
message("Number of unique PIDs: ", length(upids))
message("Number of workers: ", nbrOfWorkers())
stopifnot(length(upids) == nbrOfWorkers())

What output do you see?

I get

Number of unique PIDs: 12
Number of workers: 12

This fits with what I see in the Activity Monitor: the PID of the process at 100% CPU load changes over time. In addition to the one 'active' process, I see nbrOfWorkers()-1 'idle' processes. It seems that the input object is passed sequentially to the parallel workers, so that only one worker at a time can work on it.
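(One hedged way to check this hypothesis directly is to record a wall-clock interval per task: if the intervals are pairwise disjoint, the workers are effectively running one at a time. The `my_fcn_timed()` name and the overlap check below are my own sketch, not from the thread; it assumes future.apply and progressr are attached, `plan(multisession)` is set, and `input` is the list of matrices defined earlier.)

```r
## Sketch: each task records its own start/stop time, so afterwards we can
## check whether any task started while an earlier task was still running.
my_fcn_timed <- function(xs) {
  p <- progressor(along = xs)
  future_lapply(xs, function(x, ...) {
    t0 <- Sys.time()
    p()
    r <- eigen(x)
    list(pid = Sys.getpid(), start = t0, stop = Sys.time())
  })
}

res <- my_fcn_timed(input)
starts <- vapply(res, function(z) as.numeric(z$start), numeric(1))
stops  <- vapply(res, function(z) as.numeric(z$stop),  numeric(1))

## Order tasks by start time; a task "overlaps" if it started before the
## latest stop time among all tasks that started earlier.
ord <- order(starts)
overlap <- mean(starts[ord][-1] < cummax(stops[ord])[-length(stops)])
message("Fraction of overlapping task intervals: ", round(overlap, 2))
```

A fraction near 0 would support the "one worker at a time" reading; with healthy parallelization over 12 workers it should be close to 1.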

Issue #118 which I've just posted may be closely related.

Hi again, I think I've figured out what's going on and I've fixed this in progressr (>= 0.8.0-9005). Please try with:

remotes::install_github("HenrikBengtsson/progressr", ref = "develop")

and let me know if this solves the problem.

This solves the problem, thank you!

Thanks for confirming. FYI, progressr 0.9.0 is now on CRAN.