BoPeng/simuPOP

SimuPOP and multithreading

Closed this issue · 3 comments

Dear SimuPOP users,

I think this is a very basic question, but I have not been able to figure it out:
I am working on SimuPOP 1.1.10.9, installed from Anaconda 3 using the command conda install simupop, on Windows 10.

I tried to run the fairly simple script below, but I could not get multithreading to work:
no matter how I set numThreads in setOptions (e.g. a fixed integer, or the value of OMP_NUM_THREADS set in my environment variables and read in my script), SimuPOP always seems to use only one thread.

Here is my code:

from simuOpt import setOptions
setOptions(alleleType='binary', numThreads=0) 

import simuPOP as sim
from simuPOP.sampling import drawRandomSample
from simuPOP.utils import export


def exportSample(pop):
    # Draw 200 individuals from each subpopulation and export them in GENEPOP format.
    # k is the replicate index set by the loop below.
    sample = drawRandomSample(pop, sizes=[200, 200])
    sim.utils.Exporter(
        format='GENEPOP', title="SimuPop output", adjust=1,
        output='geneSim_n10000_%s_%s.gen' % (str(k), pop.dvars().gen),
        gui=False).apply(sample)
    return True

num_sim = 2
p = [0.10, 0.50]

for k in range(1, num_sim + 1):

    pop = sim.Population(size=[10000, 10000], ploidy=2, loci=1000,
                         infoFields=['migrate_to', 'age', 'ind_id'])
    M = [[0.00, 0.20],
         [0.20, 0.00]]

    pop.evolve(
        # Sex is randomly assigned to each individual for a balanced sex ratio.
        initOps=[sim.InitSex(maleProp=0.5)] +
                [sim.InitGenotype(prop=[p[i], 1-p[i]], subPops=i) for i in range(2)],
        # preOps defines what happens at each generation:
        preOps=[sim.Migrator(rate=M, mode=sim.BY_PROBABILITY)],
        # Define a mating scheme that takes overlapping generations into account.
        matingScheme=sim.HeteroMating([
            sim.CloneMating(weight=2),
            sim.RandomMating(weight=1, sexMode=(sim.PROB_OF_MALES, 0.5),
                             ops=[sim.MendelianGenoTransmitter()], numOffspring=1)
            ], shuffleOffspring=True, subPopSize=[10000, 10000]
        ),

        postOps=[
            # Mutation rate 1e-8
            #sim.SNPMutator(u=0.00000001,v=0.00000001),
            sim.Stat(popSize=True, structure=range(1000), step=1),
            sim.PyEval(r"'Fst=%.3f' % (F_st)", step=1),
            sim.PyOutput('\n', step=1),
            sim.PyOperator(exportSample, begin=90, step=2)
        ],
        gen=101
    )

And here is what I get when the SimuPOP library is imported:

simuPOP Version 1.1.10.9 : Copyright (c) 2004-2016 Bo Peng
Revision 4603 (Oct 13 2020) for Python 3.8.6 (64bit, 0thread)
Random Number Generator is set to mt19937 with random seed 0x93c4b6ff.
This is the standard binary allele version with 2 maximum allelic states.
For more information, please visit http://simupop.sourceforge.net,
or email simupop-list@lists.sourceforge.net (subscription required).   

So, it seems like numThreads is set to 0 by default, which corresponds to the number of cores available. However, the RAM never goes higher than 11% of its capacity. I tried larger population sizes and more loci, with the same result.
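
To compare what was requested with what the machine offers, the convention described above (0 meaning "all available cores", optionally overridden by OMP_NUM_THREADS) can be sketched in plain Python. The helper `requested_threads` is hypothetical and not part of simuPOP. It only mirrors the convention as stated in this thread, and the precedence between the environment variable and the core count is an assumption.

```python
import os

def requested_threads(num_threads=0, env=None):
    """Resolve a numThreads-style value to a concrete thread count.

    Hypothetical helper, not part of simuPOP. Assumed convention:
    a positive value is used as is; 0 means OMP_NUM_THREADS if it is
    set to a positive integer, otherwise the number of logical cores.
    """
    env = os.environ if env is None else env
    if num_threads > 0:
        return num_threads
    try:
        from_env = int(env.get("OMP_NUM_THREADS", ""))
    except ValueError:
        from_env = 0  # unset or not an integer: ignore it
    if from_env > 0:
        return from_env
    return os.cpu_count() or 1

print("threads requested:", requested_threads(0))
```

If this prints a number greater than 1 while the import banner still reports `0thread`, the limit is more likely in the build than in the script.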

Is this a problem with my script? For instance, I am aware that my num_sim replicates will necessarily run one after another because they are in a loop, but what about multithreading across populations or loci? Otherwise I guess I will have to look into OpenMP, but I could not find out what was wrong in this regard.

Maybe something went wrong during the conda installation and compiling? Do I need to perform an additional step after conda install, to be able to use SimuPOP with OpenMP?
Last question: was memsize deprecated? It is described in the SourceForge FAQ but does not appear in the Manual.

Thank you so much for your help, and for this great SimuPOP library!

Cheers,
Chrystelle

I do not use Windows, and I suspect that the Windows build does not link to OpenMP, so there is simply no multithreading there. My recommendations are:

  1. Use the library as it is; maybe you do not really need multithreading.
  2. If you do, make the code work in single-thread mode and run it on a Linux machine.
  3. If you have to use multithreading under Windows, check the conda recipe and see whether OpenMP is used. You might have to compile simuPOP from source to enable multithreading, which could be tedious.
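
One quick check before rebuilding is to read the thread count that simuPOP itself reports. A minimal sketch: `threads_from_banner` is a hypothetical helper that parses the import banner quoted earlier in this thread, and the `moduleInfo()['threads']` lookup is an assumption based on that banner, so it is accessed defensively.

```python
import re

def threads_from_banner(banner):
    """Return the thread count shown in a simuPOP import banner, or None.

    Hypothetical helper: it only pattern-matches text such as
    '(64bit, 0thread)' and knows nothing about simuPOP internals.
    """
    match = re.search(r"\(\d+bit,\s*(\d+)threads?\)", banner)
    return int(match.group(1)) if match else None

banner_line = "Revision 4603 (Oct 13 2020) for Python 3.8.6 (64bit, 0thread)"
print(threads_from_banner(banner_line))

try:
    import simuPOP as sim  # optional: only runs where simuPOP is installed
    # Assumption: moduleInfo() returns a dict with a 'threads' entry.
    print(sim.moduleInfo().get("threads"))
except ImportError:
    print("simuPOP is not installed in this environment")
```

A reported count of 0 on a multi-core machine would be consistent with a build that was compiled without OpenMP, though only the build recipe can confirm that.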

Thanks a lot Bo!

I will try these options and keep the issue updated.
Have a nice day,

Chrystelle

Dear Bo and SimuPOP users,

I tried to better understand the problem here. On Windows, it does seem (if I am not misinterpreting) that conda-forge needs the Visual C++ 20XX compiler to build libraries with OpenMP support.
(e.g. see https://conda-forge.org/docs/maintainer/knowledge_base.html, "Particularities on Windows")

So I guess that, in the end, this amounts to compiling SimuPOP from source as described in the SimuPOP SourceForge "Download and Installation" page, section 6-6.1 (or 6.4-6.5 if using Intel i++ or MinGW), which would correspond to Bo's point 3 above.

Maybe a fourth option would be to "sublaunch" several SimuPOP runs at the same time (and store the output in separate directories). Is it OK to do that?
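
The "sublaunch" idea can be sketched with the standard library alone: start each replicate as its own Python process, each in its own output directory, so the runs do not share state or overwrite each other's files. In this sketch `launch_replicates`, the script path, and the directory names are hypothetical. The simulation script is assumed to take the replicate index as its first command-line argument and to write its output relative to the current working directory.

```python
import subprocess
import sys
from pathlib import Path

def launch_replicates(script, num_sim, base_dir="runs"):
    """Run `script` once per replicate, all at the same time.

    Each process gets its own working directory (base_dir/rep_<k>)
    and receives k as its first argument. Returns the exit codes.
    """
    script = Path(script).resolve()  # absolute path survives the cwd change
    procs = []
    for k in range(1, num_sim + 1):
        out_dir = Path(base_dir) / f"rep_{k}"
        out_dir.mkdir(parents=True, exist_ok=True)
        # Popen returns immediately, so the replicates run concurrently.
        procs.append(subprocess.Popen(
            [sys.executable, str(script), str(k)], cwd=out_dir))
    # Wait for every replicate to finish before returning.
    return [p.wait() for p in procs]
```

With the loop body of the script above moved into a file that reads `k` from `sys.argv[1]` (say, a hypothetical `simulate.py`), `launch_replicates("simulate.py", num_sim=2)` would start both replicates at once. This assumes the replicates are fully independent and that the machine has enough memory for all of them.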

Thanks a bunch!

Chrystelle