MaxHalford/eaopt

Parallelizing Individual Evaluate() calls?

andrewortman opened this issue · 7 comments

Hi! Thanks for writing gago, it is very useful. I was in the middle of writing my own GA loop when I realized I shouldn't be reinventing the wheel and stumbled across your library.

Per the docs:

Talking about parallelism, there is a reason why the populations are run in parallel and not the individuals. First of all for parallelism at an individual level each individual would have to be assigned a new random number generator, which isn't very efficient. Second of all, even though Golang has an efficient concurrency model, spawning routines nonetheless has an overhead. It's simply not worth using a routine for each individual because operations at an individual level are often not time consuming enough.

This makes sense for GA operations, specifically those that require an RNG - mutate, crossover, et al. Those generally are fairly fast too, so the argument against overhead is totally valid.

With that said, my Evaluate() function is relatively expensive (up to 1 second per call) and already has no access to the population's RNG. Would it make sense, then, to parallelize Evaluate() calls on individuals within a population? This could default to off, so that use cases where Evaluate() is fast don't pay the extra overhead.

Let me know what you think. If this sounds good, I'll be glad to work on it when I get a chance and submit a PR. If you have any implementation suggestions or preferences, let me know too.

Hey!

Glad you like the library.

I think it makes a lot of sense to add an option to parallelise stuff on the individual level if you have expensive Evaluate calls, so you have a definitive 👍 on this from me.

As for implementation details, my intuition would be to add a parameter to the Evaluate method of the Individuals type (note the s). The parameter could be the number of workers to use. Then inside the Evaluate method you can check whether the parameter is higher than 1 and, if so, apply the same logic that is used at the population level (use the g.Go method).
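A rough sketch of what that could look like, not the library's actual code: the `Individual` and `Individuals` types below are hypothetical stand-ins, the fitness function is a placeholder, and the worker pool uses `sync.WaitGroup` from the standard library instead of the `g.Go` approach, so that the example stays self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Individual is a hypothetical stand-in for a candidate solution;
// it is not the library's own type.
type Individual struct {
	X         float64
	Fitness   float64
	Evaluated bool
}

// Evaluate is a placeholder for an expensive fitness function.
func (indi *Individual) Evaluate() {
	indi.Fitness = indi.X * indi.X
	indi.Evaluated = true
}

// Individuals is a slice of Individual (note the s).
type Individuals []Individual

// Evaluate evaluates every individual. With nWorkers <= 1 it runs
// sequentially; otherwise it spreads the calls over nWorkers goroutines.
func (indis Individuals) Evaluate(nWorkers int) {
	if nWorkers <= 1 {
		for i := range indis {
			indis[i].Evaluate()
		}
		return
	}

	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < nWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so no two
			// goroutines ever write to the same individual.
			for i := range jobs {
				indis[i].Evaluate()
			}
		}()
	}
	for i := range indis {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}

func main() {
	indis := Individuals{{X: 1}, {X: 2}, {X: 3}, {X: 4}}
	indis.Evaluate(2) // evaluate with two workers
	for _, indi := range indis {
		fmt.Println(indi.X, indi.Fitness)
	}
}
```

A fixed pool of workers keeps the goroutine count bounded regardless of population size, which addresses the overhead concern from the docs. Because Evaluate does not touch the population's RNG, no per-individual random number generator is needed.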

Feel free to give it a shot! I think it makes sense for you to do the first implementation, as you have a real use case where you can measure the speed-up. Don't hesitate to ask me questions :)

Hey @andrewortman, I have some time today so I'm going to implement this. I hope you haven't started so our work won't overlap!

Hey again, I implemented it. You can now use the ParallelEval field in the GA to choose to evaluate individuals in parallel. Tell me how it goes!

No problem! I'll let you try it out and if you're successful I'll let you close the issue yourself.

@andrewortman is it okay if I close the issue?

Awesome! Thanks! I will close this issue now.