Starting points are identical when running in parallel
ketch opened this issue · 6 comments
rk_opt is designed to run with a number of initial guesses and a number of processes, and the intent is that each parallel process will use an independent (random) set of initial guesses. By default, MATLAB generates the same sequence of random numbers in each new session, so we use rng('shuffle') in order to do searches that are truly independent across different sessions, different machines, etc. However, I just found out (see https://www.mathworks.com/help/parallel-computing/control-random-number-streams-on-workers.html):
Because rng('shuffle') seeds the random number generator based on the current time, do not use this command to set the random number stream on different workers if you want to ensure independent streams. This is especially true when the command is sent to multiple workers simultaneously, such as inside a parfor, spmd, or a communicating job. For independent streams on the workers, use the default behavior; or if that is not sufficient for your needs, consider using a unique substream on each worker using RandStream.
I've read the help on RandStream but haven't understood how to use it for what we want. @Sondar74 @ranocha @abhibsws if you can figure this out, let me know or feel free to submit a PR. Until then, it seems that setting np>1 actually does nothing.
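For reference, here is a minimal sketch of what the quoted advice about RandStream might look like in practice. It is not the rk_opt code: numWorkers, numStartingPoints, numVars, and baseSeed are hypothetical names and values chosen only for illustration. The idea is that every worker builds its own stream from the same seed but with a different stream index, so the streams are independent by construction rather than depending on the clock.

```matlab
% Sketch only: one independent random stream per worker.
% All names and values below are assumptions, not taken from rk_opt.
numWorkers        = 4;      % hypothetical, plays the role of np
numStartingPoints = 10;     % hypothetical number of guesses per worker
numVars           = 14;     % hypothetical number of unknowns
baseSeed          = 12345;  % assumption: any fixed nonnegative integer

guesses = cell(numWorkers, 1);
parfor w = 1:numWorkers
    % Stream number w out of numWorkers independent mrg32k3a streams
    % that share the same seed.
    s = RandStream.create('mrg32k3a', ...
                          'NumStreams', numWorkers, ...
                          'StreamIndices', w, ...
                          'Seed', baseSeed);
    % Draw from this stream explicitly instead of the global stream.
    guesses{w} = rand(s, numStartingPoints, numVars);
end
```

With a fixed baseSeed the guesses are reproducible from run to run. To get different guesses in different sessions, one option would be to pick baseSeed once on the client and pass the same value to all workers.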
@ranocha Let's say you ask for 10 starting points and 4 processes. The idea is that you should get 40 different searches, but instead you get the same 10 searches run on each process.
As far as I understand, the code at lines 126 to 131 in fb10bf4 lets the solver run in parallel using the starting_points created above. Thus, the starting_points are created in serial on the main process, but the individual local optimization problems are distributed across all workers. Am I confusing something?
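To illustrate the pattern described here, a rough sketch follows, assuming the Optimization Toolbox and Global Optimization Toolbox are available. It is not the code from the referenced lines: the objective, bounds, and sizes are hypothetical stand-ins. The starting points are drawn once on the client, and MultiStart then hands the local solves to the workers.

```matlab
% Sketch only: starting points created serially, local solves distributed.
% The objective, bounds, and sizes are hypothetical stand-ins.
objective         = @(x) sum((x - 0.5).^2);
numVars           = 3;
numStartingPoints = 10;

% All random numbers are drawn here, on the client, from a single stream.
startingPoints = rand(numStartingPoints, numVars);

problem = createOptimProblem('fmincon', ...
    'objective', objective, ...
    'x0', startingPoints(1, :), ...
    'lb', -ones(1, numVars), ...
    'ub', 2*ones(1, numVars));

% UseParallel asks MultiStart to distribute the local solves over the pool.
ms = MultiStart('UseParallel', true);
[xBest, fBest] = run(ms, problem, CustomStartPointSet(startingPoints));
```

In a setup like this, the workers never draw random numbers themselves, so the seeding on the workers does not affect which starting points are used.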
What about using parfor instead of MultiStart?
I think parpool together with parfor will generate distinct starting points. I tried running 3 parallel cores with num_starting_points set to 1 (just for the test).
These are the generated matrices of initial guesses for each core:
Core 1:
0.0447 0.1146 0.2423 -0.2628 -0.3163 0.3441 1.2350 -0.1957 -0.0994 0.3793 0.3740 -0.0629 0.1948 -0.0100
...
Core 2:
-0.4962 -0.4020 0.1861 -0.3755 -0.4628 -0.0451 1.8834 0.1881 -0.4331 0.2393 0.1131 0.1847 0.3706 -0.0100
...
Core 3:
-0.1252 0.2262 0.3518 0.3250 0.2517 0.0636 0.3598 0.0360 0.3239 -0.1862 -0.3233 -0.2385 -0.1021 -0.0100
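A quick way to run a similar check without rk_opt is sketched below. The names and sizes are hypothetical, and the values it prints will not match the numbers above. It relies on the default random streams on the workers and does not call rng('shuffle').

```matlab
% Sketch only: check that each parfor iteration draws a different guess
% when the default random streams on the workers are used.
numWorkers = 3;   % hypothetical
numVars    = 5;   % hypothetical
firstGuess = zeros(numWorkers, numVars);

parfor w = 1:numWorkers
    firstGuess(w, :) = rand(1, numVars);
end

disp(firstGuess)
% True when no two rows are identical.
allDistinct = size(unique(firstGuess, 'rows'), 1) == numWorkers
```

This only checks that the rows differ. It says nothing about the statistical independence of the streams.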
Okay, after more careful investigation I agree that the starting points for different processes are distinct. I opened this issue because I noticed that in a particular case I would have either no threads converging to a solution or all threads converging, but I can only assume that it was some (improbable) coincidence. Sorry for the false alarm.