ck37/varimpact

Parallel computing issues

ahubb40 opened this issue · 4 comments

Hi Chris,

When I use foreach and registerDoParallel for another function before calling varImpact, and then set parallel=T in varImpact, I get an error:

Error in summary.connection(connection) : invalid connection
Calls: ... sendData.SOCKnode -> serialize -> summary -> summary.connection.

My previous code before calling varImpact looks like this:

  cl <- makeCluster(V)
  registerDoParallel(cl)
  fit.test <- origami_SuperLearner(Y = Y, X = Xdat, SL.library = SL.library, method = method.NNLS(), family = gaussian(), V = 2)
  stopCluster(cl)

ck37 commented

It looks to me like the issue is that "cl" is registered as the parallel backend via registerDoParallel, but stopCluster(cl) then shuts that cluster down. So when varImpact() later tries to use "cl", its worker connections are no longer valid. If you comment out the stopCluster() line, does that fix it?
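
For illustration, here is a minimal sketch of the ordering described above, assuming the later step picks up whatever backend is currently registered with foreach (an assumption about varImpact's internals, not something confirmed in this thread). The `first_step`, `second_step`, and `third_step` names are hypothetical stand-ins for the SuperLearner fit, the varImpact call, and any later code; they are not part of either package.

```r
library(doParallel)  # also attaches foreach and parallel

# Hypothetical stand-in for the first parallel step (e.g. the SuperLearner fit).
cl <- makeCluster(2)
registerDoParallel(cl)
first_step <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)

# Keep the cluster alive for any later step that relies on the registered
# backend; this is where a call such as varImpact(..., parallel = TRUE) would go.
second_step <- foreach(i = 1:4, .combine = c) %dopar% (i^2)

# Stop the workers only after all parallel work is finished, then register
# the sequential backend so later %dopar% calls do not reach for the
# stopped cluster.
stopCluster(cl)
registerDoSEQ()
third_step <- foreach(i = 1:4, .combine = c) %dopar% (i + 1)
```

If the cluster has to be stopped between the two steps, creating and registering a fresh cluster before the second step should serve the same purpose.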

I'll try again, but I think that was one of my permutations that didn't
work (I suspected that might be the problem given what I read online, but I
think I had the same issue). Just to confirm, it all works fine if I use
parallel=F, so that must be the issue. I'll give this a try just to make
sure and see if it fails or not.

Alan Hubbard
Division of Biostatistics
UC Berkeley
(510)643-6160
http://hubbard.berkeley.edu

ck37 commented

Ok I tried a quick test case and it does seem to work without error:

library(doParallel)
library(varimpact)
cl <- makeCluster(4)
registerDoParallel(cl)
data(BreastCancer, package="mlbench")
data = BreastCancer
data$Y = as.numeric(data$Class == "malignant")
vim = varImpact(Y = data$Y, data = subset(data, select=-c(Y, Class, Id)))
vim

ck37 commented

Closing this old varImpact issue (feel free to re-open though!)