R session crashes if plan is not changed back to sequential after using multisession (on Linux)
MalditoBarbudo opened this issue · 2 comments
Describe the bug
When using future::plan(future::multisession) on Linux, as many R workers are created as specified in the workers argument, and they are not closed until the plan is changed back to sequential or the session is closed/reinitialized. The latter, in RStudio, almost always results in a crash of the R session.
Reproduce example
Simply changing the plan to multisession triggers the issue. R session processes can be monitored with htop, btm, or the console system monitor of your choice:
- Initial
btm screenshot in a fresh session:
- Changing the plan
# changing the plan
future::plan(future::multisession)
btm screenshot after changing plan:
As can be seen, R worker processes appear.
- executing code
furrr::future_map(
c(1:10), \(x) {1e10}
)
btm screenshot showing that memory is still used by each worker process after the computation ends:
- Restarting the session (building and restarting for package development, changing project...) crashes the R session, causing it to restart, which can lead to data loss.
- Changing the plan to sequential before exiting the session successfully removes the R workers:
future::plan(future::sequential)
btm screenshot after changing to sequential:
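Until the root cause is known, the manual switch back to sequential can be made automatic. Below is a minimal sketch, not an official recipe: with_multisession() is a hypothetical helper name of my own, and it assumes the plan in effect before the call is sequential (the default). It relies on plan() returning the previous plan when a new one is set, and restores that plan with on.exit().

```r
# Hypothetical helper (not part of future or furrr): run a furrr map under a
# temporary multisession plan and restore the previous plan afterwards.
with_multisession <- function(.x, .f, workers = 2) {
  # plan() returns the previously set plan when a new one is set
  old_plan <- future::plan(future::multisession, workers = workers)
  # Restore the previous plan even if .f throws an error
  on.exit(future::plan(old_plan), add = TRUE)
  furrr::future_map(.x, .f)
}

res <- with_multisession(1:10, \(x) x^2)
```

If the previous plan was sequential, the workers should be gone once the function returns, the same as when calling future::plan(future::sequential) by hand.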
Expected behavior
When using plan(multisession), I expect those processes to be terminated after the computation finishes and the results are gathered, or at least that RStudio does not crash if the plan is not changed back to sequential before restarting/closing the session.
Session information
sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib/libopenblas.so.0.3; LAPACK version 3.11.0
#>
#> locale:
#> [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
#> [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
#> [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Europe/Madrid
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.31 fastmap_1.1.1 xfun_0.39 glue_1.6.2
#> [5] knitr_1.42 htmltools_0.5.5 rmarkdown_2.21 lifecycle_1.0.3
#> [9] cli_3.6.1 reprex_2.0.2 withr_2.5.0 compiler_4.3.1
#> [13] rstudioapi_0.14 tools_4.3.1 evaluate_0.21 yaml_2.3.7
#> [17] rlang_1.1.1 fs_1.6.2
Created on 2023-06-23 with reprex v2.0.2
Future session info with the sequential plan
future::futureSessionInfo()
#> *** Package versions
#> future 1.32.0, parallelly 1.35.0, parallel 4.3.1, globals 0.16.2, listenv 0.9.0
#>
#> *** Allocations
#> availableCores():
#> system nproc
#> 16 16
#> availableWorkers():
#> $nproc
#> [1] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [7] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [13] "localhost" "localhost" "localhost" "localhost"
#>
#> $system
#> [1] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [7] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [13] "localhost" "localhost" "localhost" "localhost"
#>
#> *** Settings
#> - future.plan=<not set>
#> - future.fork.multithreading.enable=<not set>
#> - future.globals.maxSize=<not set>
#> - future.globals.onReference=<not set>
#> - future.resolve.recursive=<not set>
#> - future.rng.onMisuse=<not set>
#> - future.wait.timeout=<not set>
#> - future.wait.interval=<not set>
#> - future.wait.alpha=<not set>
#> - future.startup.script=<not set>
#>
#> *** Backends
#> Number of workers: 1
#> List of future strategies:
#> 1. sequential:
#> - args: function (..., envir = parent.frame())
#> - tweaked: FALSE
#> - call: NULL
#>
#> *** Basic tests
#> Main R session details:
#> pid r sysname release
#> 1 28914 4.3.1 Linux 6.3.8-arch1-1
#> version nodename machine
#> 1 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> login user effective_user
#> 1 user001 user001 user001
#> Worker R session details:
#> worker pid r sysname release
#> 1 1 28914 4.3.1 Linux 6.3.8-arch1-1
#> version nodename machine
#> 1 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> login user effective_user
#> 1 user001 user001 user001
#> Number of unique worker PIDs: 1 (as expected)
Future session info with the multisession plan
future::plan(future::multisession)
future::futureSessionInfo()
#> *** Package versions
#> future 1.32.0, parallelly 1.35.0, parallel 4.3.1, globals 0.16.2, listenv 0.9.0
#>
#> *** Allocations
#> availableCores():
#> system nproc
#> 16 16
#> availableWorkers():
#> $nproc
#> [1] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [7] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [13] "localhost" "localhost" "localhost" "localhost"
#>
#> $system
#> [1] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [7] "localhost" "localhost" "localhost" "localhost" "localhost" "localhost"
#> [13] "localhost" "localhost" "localhost" "localhost"
#>
#> *** Settings
#> - future.plan=<not set>
#> - future.fork.multithreading.enable=<not set>
#> - future.globals.maxSize=<not set>
#> - future.globals.onReference=<not set>
#> - future.resolve.recursive=<not set>
#> - future.rng.onMisuse=<not set>
#> - future.wait.timeout=<not set>
#> - future.wait.interval=<not set>
#> - future.wait.alpha=<not set>
#> - future.startup.script=<not set>
#>
#> *** Backends
#> Number of workers: 16
#> List of future strategies:
#> 1. multisession:
#> - args: function (..., workers = availableCores(), lazy = FALSE, rscript_libs = .libPaths(), envir = parent.frame())
#> - tweaked: FALSE
#> - call: future::plan(future::multisession)
#>
#> *** Basic tests
#> Main R session details:
#> pid r sysname release
#> 1 29680 4.3.1 Linux 6.3.8-arch1-1
#> version nodename machine
#> 1 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> login user effective_user
#> 1 user001 user001 user001
#> Worker R session details:
#> worker pid r sysname release
#> 1 1 29737 4.3.1 Linux 6.3.8-arch1-1
#> 2 2 29731 4.3.1 Linux 6.3.8-arch1-1
#> 3 3 29730 4.3.1 Linux 6.3.8-arch1-1
#> 4 4 29742 4.3.1 Linux 6.3.8-arch1-1
#> 5 5 29741 4.3.1 Linux 6.3.8-arch1-1
#> 6 6 29740 4.3.1 Linux 6.3.8-arch1-1
#> 7 7 29729 4.3.1 Linux 6.3.8-arch1-1
#> 8 8 29739 4.3.1 Linux 6.3.8-arch1-1
#> 9 9 29738 4.3.1 Linux 6.3.8-arch1-1
#> 10 10 29735 4.3.1 Linux 6.3.8-arch1-1
#> 11 11 29744 4.3.1 Linux 6.3.8-arch1-1
#> 12 12 29736 4.3.1 Linux 6.3.8-arch1-1
#> 13 13 29734 4.3.1 Linux 6.3.8-arch1-1
#> 14 14 29743 4.3.1 Linux 6.3.8-arch1-1
#> 15 15 29732 4.3.1 Linux 6.3.8-arch1-1
#> 16 16 29733 4.3.1 Linux 6.3.8-arch1-1
#> version nodename machine
#> 1 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 2 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 3 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 4 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 5 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 6 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 7 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 8 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 9 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 10 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 11 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 12 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 13 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 14 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 15 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> 16 #1 SMP PREEMPT_DYNAMIC Wed, 14 Jun 2023 20:10:31 +0000 host001 x86_64
#> login user effective_user
#> 1 user001 user001 user001
#> 2 user001 user001 user001
#> 3 user001 user001 user001
#> 4 user001 user001 user001
#> 5 user001 user001 user001
#> 6 user001 user001 user001
#> 7 user001 user001 user001
#> 8 user001 user001 user001
#> 9 user001 user001 user001
#> 10 user001 user001 user001
#> 11 user001 user001 user001
#> 12 user001 user001 user001
#> 13 user001 user001 user001
#> 14 user001 user001 user001
#> 15 user001 user001 user001
#> 16 user001 user001 user001
#> Number of unique worker PIDs: 16 (as expected)
I don't have much knowledge in this area, but one thing you could try, both to narrow in on the root issue of the crash and as an alternative to "multisession" is to install the package "future.callr" and then use the following plan (instead of the "multisession" plan):
future::plan(future.callr::callr)
If you don't switch back to sequential at the end, do things still work well with the above plan?
@scottkosty
Yes, using future.callr::callr as the plan works as intended. R processes are spawned when the computation begins and closed afterwards, freeing the memory. Also, RStudio doesn't crash when restarting/closing the session without changing the plan to sequential.
I can use this workaround for the moment for my own development, but I would like to know why this is happening with plans from future, as multisession is the default recommendation in the docs of furrr and other packages, and probably the one that users of the package I'm developing will use.
Is there any more info I can provide to help narrow down the issue?
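One check that might help is confirming from within R, rather than from btm, that the same worker processes stay alive after a computation. Here is a minimal sketch using only base parallel, on the assumption (mine, not confirmed in this thread) that multisession workers behave like a PSOCK cluster:

```r
library(parallel)

# Start two background R sessions (a PSOCK cluster)
cl <- makePSOCKcluster(2)

# Record the worker PIDs before and after a computation
pids_before <- unlist(clusterEvalQ(cl, Sys.getpid()))
invisible(parLapply(cl, 1:10, function(x) x^2))
pids_after <- unlist(clusterEvalQ(cl, Sys.getpid()))

# TRUE if the same worker processes are still alive after the work is done
same_workers <- identical(pids_before, pids_after)

# The workers only go away once the cluster is stopped explicitly
stopCluster(cl)
```

If same_workers is TRUE, persistent workers are expected for this kind of backend, and the part that needs explaining is the crash on restart rather than the lingering processes.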