Code and supporting data for the benchmarks
Closed this issue · 3 comments
@DominikRafacz, it was helpful to see the parallel computing benchmarks on slide 9. Would it be possible to access the full code base, including phplE7q6h.csv or an equivalent input file, so I can reproduce them? I think this is a good opportunity to dive deeper into profiling and reduce `drake`'s computational overhead.
Sorry for the delay in replying, but I was away from my computer for a day.
I was supposed to add the code earlier, but, of course, I forgot. I did it today. I want to perform more benchmarks and dig into this subject more carefully, but I'll do that later (probably during the holiday break), since right now I need to prepare for my exams.
Thanks for uploading the code and data. I am running it now. A couple of initial thoughts:

- `make(parallelism = "clustermq")` is usually faster than `make(parallelism = "future")`.
- When you selected `drake`-powered parallelism and `mlr`-powered parallelism together, I suspect `drake` and `mlr` may have been competing for the same pool of resources. Since `drake` starts workers before invoking `mlr`, that may be why `drake` + `mlr` parallelism ran about as fast as `mlr` parallelism alone. In my experience, the gains are small when the grand total number of multicore workers is greater than the number of physical cores. In this case, we have 16 total workers. How many cores does your machine have?
- If you have access to a computing cluster, you can send targets to different nodes and tell `mlr` to use all the cores on each node. That way, `drake` + `mlr` parallelism should outperform either mode of parallelism on its own.
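For reference, a minimal sketch of how the two backends above could be compared side by side. This is not the benchmark code from this repository; the plan and target names are hypothetical placeholders, and it assumes the `drake`, `clustermq`, and `future` packages are installed on a local multicore machine.

```r
# Hypothetical toy plan, only to illustrate the two parallelism backends.
library(drake)

plan <- drake_plan(
  data  = rnorm(1e6),
  stats = c(mean = mean(data), sd = sd(data))
)

# clustermq-powered persistent workers (usually the faster option).
# On a single multicore machine, clustermq needs the "multicore" scheduler:
options(clustermq.scheduler = "multicore")
make(plan, parallelism = "clustermq", jobs = 4)

# future-powered transient workers, for comparison:
future::plan(future::multisession, workers = 4)
make(plan, parallelism = "future", jobs = 4)
```

Keeping the grand total of workers (`drake`'s `jobs` times any `mlr`-level workers) at or below the number of physical cores avoids the oversubscription described above.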
Many thanks for your advice and those thoughts; they will certainly be helpful. To answer your question: I have 8 cores. I don't have access to a cluster yet, but I may gain it soon. Then I'll try out how it works in practice.