StuntsPT/Structure_threader

structure_threader problem with seed

StuntsPT opened this issue · 5 comments

Via Structure groups

It seems the same random seed is being used on every STRUCTURE run wrapped by Structure_threader. This will cause the Evanno test to fail, because replicate runs that share a seed are identical and show no variability.
This has to be fixed.

Do we even set the seed for structure runs? In any case, it's possible to set the RANDOMIZE option to 1 in the extraparameters to force different seeds in multiple runs.

RANDOMIZE (Boolean) Use a different random number seed for each run, taken from the
system clock. (See also SEED.)

SEED (Integer) If RANDOMIZE==0, then the simulation seed is initialized to SEED. This allows
runs to be repeated exactly. If RANDOMIZE=1 then any value specified in SEED is ignored.
Note that even when RANDOMIZE==1, the program output still indicates the starting seed
value so that it is possible to repeat particular runs if desired.

We currently support this on fastStructure, but not on STRUCTURE.
The thing is, if we want two completely reproducible runs inside Structure_threader, the extraparams method will not work.
I was thinking we could intercept the -D XXXXX option from the --extra_opts argument and use it to "randomly" generate N seeds, where N is the total number of runs. This means we will effectively be using a single seed to deterministically generate N seeds for the multiple runs. This should handle the issue at hand.
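
A rough sketch of the idea, in case it helps the discussion. The function name `derive_seeds` and the seed range are assumptions made for illustration, not what Structure_threader actually does: a private generator is seeded with the value taken from -D, and N distinct per-run seeds are drawn from it.

```python
import random


def derive_seeds(master_seed, n_runs):
    """Deterministically derive n_runs distinct seeds from one master seed.

    Hypothetical helper: the name and the seed range are illustrative only.
    """
    # A private generator keeps the global random state untouched.
    rng = random.Random(master_seed)
    # sample() draws without replacement, so no two runs share a seed.
    return rng.sample(range(1, 2**31 - 1), n_runs)


if __name__ == "__main__":
    # Same master seed in, same list of per-run seeds out.
    print(derive_seeds(1234, 5))
```

Calling it twice with the same master seed gives the same list, so the whole set of runs can be repeated, while each individual run still gets its own seed.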

Fixed via 96f6282

It seems STRUCTURE is ignoring the seed value that Structure_threader is passing it via the command line. This warrants more investigation.
Furthermore, the command line logs report one thing, and the seed.txt file reports another. Logfiles match what is being reported in the seed.txt file, so this is definitely an issue with command line interpretation.

Found it.
There is an option in extraparams:

#define RANDOMIZE      1  // (B) use new random seed for each run 
#define SEED        2245  // (int) seed value for random number generator 

If RANDOMIZE is set to 1, the seed value passed via the command line is ignored.
We must therefore ensure that RANDOMIZE is set to 0 whenever the -D option is passed on the command line.
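
One possible way to enforce this, as a minimal sketch only. The helper name `force_randomize_off` is hypothetical, and it assumes the extraparams content is available as a string using the `#define NAME value` layout shown above. Appending the line when it is missing is also an assumption about how STRUCTURE treats an absent RANDOMIZE entry.

```python
import re


def force_randomize_off(extraparams_text):
    """Return extraparams text with RANDOMIZE set to 0.

    Hypothetical helper: rewrites an existing '#define RANDOMIZE <n>' line,
    or appends one if no such line is present.
    """
    pattern = re.compile(r"^(\s*#define\s+RANDOMIZE\s+)\d+", re.MULTILINE)
    # \g<1>0 keeps the original spacing and swaps only the value.
    new_text, n_subs = pattern.subn(r"\g<1>0", extraparams_text)
    if n_subs == 0:
        # No RANDOMIZE line found: add one explicitly (assumption, see above).
        new_text = new_text.rstrip("\n") + "\n#define RANDOMIZE 0\n"
    return new_text


if __name__ == "__main__":
    example = (
        "#define RANDOMIZE      1  // (B) use new random seed for each run\n"
        "#define SEED        2245  // (int) seed value for random number generator\n"
    )
    print(force_randomize_off(example))
```

The SEED line and the trailing comments are left untouched, so the only change to the file is the RANDOMIZE value.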