parallel::parLapply and nthreads > 1
In certain cases stringdist just hangs forever if invoked inside parallel::parLapply with argument nthreads > 1 or no nthreads at all:
cl <- parallel::makeForkCluster(2)
parallel::parLapply(cl, list(c("a", "b"), c("d", "e")), function(x) stringdist::stringdist("a", x))
This just hangs forever; it works when I add the nthreads=1 argument. (But it seems I have to stop the cluster before I can get it to work again.)
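For reference, a minimal sketch of the workaround described above: the same call with nthreads = 1 passed to stringdist, and the cluster stopped afterwards even if the call fails. Wrapping the call in tryCatch(..., finally = ...) is my own addition for illustration and is not part of the original report.

```r
library(parallel)

# Fork-based cluster with two workers (Unix-alikes only).
cl <- makeForkCluster(2)

# Same call as in the report, but with stringdist forced to a single
# thread inside each worker. The 'finally' clause stops the cluster
# whether or not parLapply succeeds, so no stale workers are left behind.
res <- tryCatch(
  parLapply(
    cl,
    list(c("a", "b"), c("d", "e")),
    function(x) stringdist::stringdist("a", x, nthreads = 1)
  ),
  finally = stopCluster(cl)
)

print(res)
```

If an earlier run has already hung, the old cluster still has to be stopped (or the session restarted) before trying this.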
I guess this is very much system-dependent; I noticed it on Ubuntu 18.04 with R 3.5.3 (see below). I am also not sure whether it is a problem with your code, OpenMP, core R or something else.
Thanks for the package :-)
> R.Version()
$platform
[1] "x86_64-pc-linux-gnu"
$arch
[1] "x86_64"
$os
[1] "linux-gnu"
$system
[1] "x86_64, linux-gnu"
$status
[1] ""
$major
[1] "3"
$minor
[1] "5.3"
$year
[1] "2019"
$month
[1] "03"
$day
[1] "11"
$`svn rev`
[1] "76217"
$language
[1] "R"
$version.string
[1] "R version 3.5.3 (2019-03-11)"
$nickname
[1] "Great Truth"
Hi, thanks for the report. The code you submitted works fine for me on Ubuntu 16.04 (I will try 18.04 later).
Contents of sd.R
cl <- parallel::makeForkCluster(2)
parallel::parLapply(cl, list(c("a", "b"), c("d", "e")), function(x) stringdist::stringdist("a", x))
sessionInfo()
mark@chouffe:~$ R -s -f sd.R
[[1]]
[1] 0 1
[[2]]
[1] 1 1
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.3 parallel_3.5.3
Question though: why would you parallelize something that is already running in parallel? It seems to make sense only when parallelizing over multiple machines.
Thanks for the reply!
I agree that parallelizing something inside a parallel loop is tricky. But it may still make sense if I choose to run with only a little parallelism in the outer loop and want to add some more inside. More importantly: my code, which worked fine earlier, suddenly just hung, and it took an hour to find the culprit. I guess this was related to upgrading R 3.4 -> 3.5, but I am not sure any more. I will try to check on different platforms and see where I can replicate it.
I guess it is not directly related to your code but to something else, maybe the way a particular gcc implements OpenMP or whatever...
Hard to be sure what it is, but please keep me in the loop if you find something. I like stringdist to be safe & fast.
Works correctly (when tested interactively):
- R 3.5.2, ubuntu 16.04, gcc 5.4.0, kernel 4.15.0 (6/12 cores)
- R 3.5.2, debian sid, gcc 8.3.0, kernel 4.19.0 (20/40 cores)
- R 3.5.2, centos 6.10, gcc 4.4.7, kernel 2.6.32 (8 cores)
Sometimes works, sometimes not:
- R 3.5.3, ubuntu 18.04, gcc 7.3.0, kernel 4.15.0 (4/8 cores)
(it works in two interactive sessions and fails in another two...)
Repeatedly failed in a script -- needs further testing
Hi, I'm going to close this now, but perhaps it is good to quote the following from ?makeForkCluster, as a reference.
It is strongly discouraged to use the ‘"FORK"’ cluster with GUI
front-ends or multi-threaded libraries. See ‘mcfork’ for details.
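In line with that advice, one option is to use a socket (PSOCK) cluster instead of a fork cluster. The sketch below assumes that starting fresh R worker processes, rather than forking the current one, avoids the interaction with multi-threaded code. That assumption was not tested on the affected Ubuntu 18.04 system in this thread, so treat it as something to try rather than a confirmed fix.

```r
library(parallel)

# Socket-based cluster: each worker is a freshly started R process,
# not a fork of the current session.
cl <- makePSOCKcluster(2)

# Same call as in the original report. stringdist is loaded on each
# worker on first use through the '::' operator, so it must be
# installed in the library the workers see.
res <- tryCatch(
  parLapply(
    cl,
    list(c("a", "b"), c("d", "e")),
    function(x) stringdist::stringdist("a", x)
  ),
  finally = stopCluster(cl)
)

print(res)
```

PSOCK workers do not share the parent's memory, so objects the function needs must be passed as arguments or exported with clusterExport.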