`expression()` doesn't seem to work with `snssde1d` in `foreach %dopar%`
VMOrca opened this issue · 4 comments
I'm trying to include an `expression()` in `foreach() %dopar%`. This `expression()` will be used to simulate an Ornstein-Uhlenbeck process via the `Sim.DiffProc` package within the `foreach() %dopar%` call.
However, the parallel compute nodes don't seem to recognise the variables specified in `expression()`, and I get the following error: `Error in { : task 1 failed - "object 'OUTheta' not found"`
My code:
library(Sim.DiffProc)
library(doSNOW)
library(foreach)
cl <- makeCluster(2)
registerDoSNOW(cl)
a = foreach(i = 1:2, .packages = c('Sim.DiffProc')) %dopar% {
  OUMu = 1
  OUTheta = 1
  OUSigma = 1
  f = expression(OUTheta * (OUMu - x))
  g = expression(OUSigma)
  sim = Sim.DiffProc::snssde1d(drift = f, diffusion = g, x0 = 0, N = 10, T = 1, method = 'euler', M = 1)
  return(sim$X)
}
R version:
R.Version()
$platform
[1] "x86_64-w64-mingw32"
$arch
[1] "x86_64"
$os
[1] "mingw32"
$system
[1] "x86_64, mingw32"
$status
[1] ""
$major
[1] "4"
$minor
[1] "0.3"
$year
[1] "2020"
$month
[1] "10"
$day
[1] "10"
$`svn rev`
[1] "79318"
$language
[1] "R"
$version.string
[1] "R version 4.0.3 (2020-10-10)"
$nickname
[1] "Bunny-Wunnies Freak Out"
Good morning,
Indeed, in your code the parallel compute nodes do not recognize the variables specified in `expression()`. You must use the `clusterExport()` function, available in the core package `parallel`. It assigns the values, on the master R process, of the variables named in `varlist` to variables of the same names in the global environment of each node; in your case, `varlist = c("OUTheta", "OUMu", "OUSigma")`. See the following R code:
R> library(doSNOW)
R> library(foreach)
R> OUMu = 1; OUTheta = 1; OUSigma = 1
R> f = expression(OUTheta * (OUMu - x))
R> g = expression(OUSigma)
R> cl <- makeCluster(2)
R> registerDoSNOW(cl)
R> parallel::clusterExport(cl, varlist=c("OUTheta","OUMu","OUSigma"),envir = environment())
R> a <- foreach(i = 1:2, .packages = c('Sim.DiffProc'),.combine=list) %dopar% {
+ Sim.DiffProc::snssde1d(drift = f, diffusion = g, x0 = 0, N = 10, T = 1, method = 'euler', M = 1)$X
+}
R> a
[[1]]
Time Series:
Start = c(0, 1)
End = c(1, 1)
Frequency = 10
[,1]
[1,] 0.0000000
[2,] 0.5361421
[3,] 0.6101941
[4,] 0.5210987
[5,] 0.1513371
[6,] 0.6598853
[7,] 1.1226598
[8,] 1.4754373
[9,] 1.5507929
[10,] 1.2577042
[11,] 1.3156085
[[2]]
Time Series:
Start = c(0, 1)
End = c(1, 1)
Frequency = 10
[,1]
[1,] 0.00000000
[2,] -0.04583063
[3,] 0.64412154
[4,] -0.11366549
[5,] 0.09658761
[6,] 0.02255550
[7,] -0.17141103
[8,] 0.45232315
[9,] 0.63692530
[10,] 0.34672216
[11,] 0.25129187
R> parallel::stopCluster(cl)
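One way to see why the export is needed, as a sketch in base R only: the error is consistent with the expressions being evaluated inside the package's own functions, where variables local to the `%dopar%` body are not visible but the worker's global environment is. `evaluate_drift()` and `run_task()` below are hypothetical stand-ins for illustration, not Sim.DiffProc's actual code.

```r
# Hypothetical stand-in for a package function that evaluates a user-supplied
# expression in its own frame. It sees `x` and anything reachable from the
# environment where it was defined, but not the caller's local variables.
evaluate_drift <- function(drift, x) eval(drift[[1]])

run_task <- function() {
  OUTheta <- 1   # local to run_task(), like variables set inside %dopar%
  OUMu <- 1
  f <- expression(OUTheta * (OUMu - x))
  evaluate_drift(f, x = 0)
}

# In a fresh session the lookup fails, because OUTheta and OUMu exist only
# in run_task()'s frame; the error message is captured instead of stopping.
msg <- tryCatch(run_task(), error = function(e) conditionMessage(e))

# Placing the values in the global environment (which is what clusterExport()
# does on each worker) lets the same call succeed.
assign("OUTheta", 1, envir = globalenv())
assign("OUMu", 1, envir = globalenv())
val <- run_task()
```

Here `msg` holds the captured lookup error and `val` holds the evaluated drift at `x = 0`.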
Thanks very much for your prompt reply acguidoum!
Just wondering, if the parameters of the OU process, e.g. `c("OUTheta","OUMu","OUSigma")`, are obtained within the `foreach() %dopar%` call, is there a workaround? So essentially the pseudocode looks like below:
foreach(i = 1:100) %dopar% {
1. Estimate OUTheta, OUMu, OUSigma using existing data
2. Simulate trajectories using the estimated parameters
3. Calculate summary statistics from simulated trajectories
4. Return parameters and summary statistics as list
}
I guess I have to split steps 1 and 2 into two `foreach() %dopar%` calls, i.e. after obtaining the estimated parameters in step 1, use `parallel::clusterExport` and then start a second `foreach() %dopar%` call to simulate the trajectories?
Many thanks in advance!
To estimate the parameters of the SDE by the maximum likelihood method or a least squares estimator, you can use the `qmle()` function available in the `yuima` package; for more information see jss.v057.i04. For your algorithm, you can use a single loop (not necessarily with `foreach() %dopar%`; there are several parallel programming techniques, see the core package `parallel`, jss.v031.i01).
R> library(parallel)
R> ?parLapply
R> ?mclapply
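The single-loop idea could be sketched with `parLapply()` as follows. This is a minimal sketch under stated assumptions, not a tested recipe: `make_ou_coefs()` and `one_task()` are hypothetical helper names, the estimation step is a fixed placeholder rather than a real fit, and the approach assumes `snssde1d()` accepts expressions whose parameters have been replaced by numeric constants. Splicing the estimated values into the expressions with `bquote()` means the workers no longer need to look up `OUTheta`, `OUMu` or `OUSigma`, so no `clusterExport()` of parameters is required.

```r
library(parallel)

# Hypothetical helper: build drift and diffusion expressions with the
# numeric parameter values spliced in, so they only depend on `x`.
make_ou_coefs <- function(theta, mu, sigma) {
  list(drift     = as.expression(bquote(.(theta) * (.(mu) - x))),
       diffusion = as.expression(sigma))
}

# Hypothetical task run on each worker: estimate, simulate, summarise.
one_task <- function(i, make_coefs) {
  # Step 1 (placeholder): a real version would estimate these from data,
  # for example with yuima::qmle() as suggested above.
  est <- list(theta = 1, mu = 1, sigma = 1)
  # Step 2: simulate with the estimated values baked into the expressions.
  co  <- make_coefs(est$theta, est$mu, est$sigma)
  sim <- Sim.DiffProc::snssde1d(drift = co$drift, diffusion = co$diffusion,
                                x0 = 0, N = 10, T = 1, method = "euler", M = 1)
  # Steps 3 and 4: return the parameters with a summary statistic.
  list(params = est, mean_X = mean(sim$X))
}

# The helper can be checked locally without any cluster.
coefs <- make_ou_coefs(theta = 2, mu = 3, sigma = 0.5)

# The parallel run is attempted only if Sim.DiffProc is installed.
if (requireNamespace("Sim.DiffProc", quietly = TRUE)) {
  cl  <- makeCluster(2)
  res <- parLapply(cl, 1:4, one_task, make_coefs = make_ou_coefs)
  stopCluster(cl)
}
```

Passing `make_ou_coefs` as an argument sends the helper to the workers together with the task, which avoids a separate export step.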
Thanks a lot acguidoum - much appreciated!