Code and supporting data for the benchmarks
Closed this issue · 3 comments
@DominikRafacz, it was helpful to see the parallel computing benchmarks on slide 9. Would it be possible to access the full code base, including phplE7q6h.csv or an equivalent input file, so I can reproduce them? I think this is a good opportunity to dive deeper into profiling and reduce `drake`'s computational overhead.
Sorry for the delay in replying, but I was away from my computer for a day.
I was supposed to add the code earlier, but, of course, I forgot. I did it today. I want to perform more benchmarks and dig into this subject more carefully, but I'll do that later (probably during the holiday break), since right now I need to prepare for my exams.
Thanks for uploading the code and data. I am running it now. A couple of initial thoughts:

- `make(parallelism = "clustermq")` is usually faster than `make(parallelism = "future")`.
- When you selected `drake`-powered parallelism and `mlr`-powered parallelism together, I suspect `drake` and `mlr` may have been competing for the same pool of resources. Since `drake` starts workers before invoking `mlr`, that may be why `drake` + `mlr` parallelism ran about as fast as `mlr` parallelism alone. In my experience, the gains are small when the grand total number of multicore workers is greater than the number of physical cores. In this case, we have 16 total workers. How many cores does your machine have?
- If you have access to a computing cluster, you can send targets to different nodes and tell `mlr` to use all the cores on each node. That way, `drake` + `mlr` parallelism should outperform either mode of parallelism on its own.
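For reference, a minimal sketch of how the two backends above could be compared side by side. This is not the benchmark code from this repository; the plan and target names are hypothetical placeholders, and it assumes the `drake`, `clustermq`, and `future` packages are installed on a local multicore machine.

```r
# Hypothetical toy plan, only to illustrate the two parallelism backends.
library(drake)

plan <- drake_plan(
  data  = rnorm(1e6),
  stats = c(mean = mean(data), sd = sd(data))
)

# clustermq-powered persistent workers (usually the faster option).
# On a single multicore machine, clustermq needs the "multicore" scheduler:
options(clustermq.scheduler = "multicore")
make(plan, parallelism = "clustermq", jobs = 4)

# future-powered transient workers, for comparison:
future::plan(future::multisession, workers = 4)
make(plan, parallelism = "future", jobs = 4)
```

Keeping the grand total of workers (`drake`'s `jobs` times any `mlr`-level workers) at or below the number of physical cores avoids the oversubscription described above.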
Many thanks for your advice and those thoughts; they will certainly be helpful. To answer your question: I have 8 cores. I don't have access to a cluster yet, but I may gain it soon. Then I'll try out how it works in practice.