gvegayon/parallel

Max. number of cores used?

fbittmann opened this issue · 3 comments

Hi there, I have a question about parallel in Stata 16.1, using the newest version from GitHub (v1.20.0). I wrote an ado that internally uses parallel for a kind of bootstrapping. This works very well. The only problem is that not all cores are used when many cores are available. I tested it on a server with 32 vCPUs and requested 31 of them. However, the server load shows that only 15 were actually used, despite setting "parallel setclusters 31". I then tested the "normal" parallel bootstrapping (parallel bs) in a toy example with 31 cores and 1000 replications, which worked fine: all cores were active. So I wonder how I have to specify parallel in my ado so that all cores are used. The resampling ado is called from the main program like so:

cap parallel setclusters `parallel'
quiet parallel, seed(`allseeds'): ///
  dbs_resampling, data(`originaldata') reps1(`reps1') reps2(`reps2') command(`command') ///
  totalstats(`exp_total') expression(`expression') totalinstances(`parallel') dots(0) ///
  `strata' `cluster' `idcluster'

Is this a trivial or a complex problem? What determines how many cores are actually used by parallel? If you want to see the entire code, it's here:
https://github.com/fbittmann/dbs
(dbs.ado) Main prog
(dbs_resampling.ado) Resampling prog

Thanks for advice!

Are you using the force option with the setclusters subcommand?

No, I did not, since the number of available cores (32) is larger than the number of cores requested for parallel (31). Or do I misunderstand the description of what force does here? Would it resolve the issue?

What do you get from parallel numprocessors? I am guessing it could be related to the number of logical vs physical processors.
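
A minimal sketch of the two checks raised in this thread, to be run interactively on the server before calling the ado. It assumes the parallel package is installed. It also assumes that setclusters records the cluster count in the global PLL_CLUSTERS, which is not confirmed anywhere in this thread. Whether force actually changes the outcome is likewise an open question here. The setclusters call is deliberately run without "cap", because capture suppresses any message the command might print.

```stata
* Ask parallel how many processors it detects on this machine,
* then list whatever the command stored in r().
parallel numprocessors
return list

* Request 31 child processes. No -capture- prefix here, so any
* warning or error from setclusters stays visible.
parallel setclusters 31, force

* Assumption: the registered cluster count lives in the global
* PLL_CLUSTERS. An empty value means that assumption is wrong.
display "clusters registered: $PLL_CLUSTERS"
```

If the processor count reported in the first step is well below 32, that would fit the logical vs. physical processor guess above.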