Running multiple instances of Timeloop in parallel
Nerotos opened this issue · 5 comments
When running multiple instances of Timeloop in parallel, I get this error:
execute:accelergy evaluation/features/features.8/Conv/eyeriss_like.yaml --oprefix timeloop-mapper. -o ./ > timeloop-mapper.accelergy.log 2>&1
ERROR: key not found: ERT, at line: 0
I suspect this is caused by multiple instances of Timeloop writing the timeloop-mapper output files at the same time. Is there a way to prevent this? Running everything sequentially would be very slow.
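One way to test this suspicion is to give every instance its own working directory, since the failing command writes its outputs relative to the current directory (`-o ./`). The following is only a minimal sketch under that assumption, not a tested Timeloop workflow: the helper names `run_isolated` and `run_all` are hypothetical, and the demo uses a stand-in Python command instead of a real Timeloop invocation.

```python
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_isolated(name, command, root):
    """Run one command inside its own directory so outputs cannot collide."""
    workdir = Path(root).resolve() / name
    workdir.mkdir(parents=True, exist_ok=True)
    # Capture stdout and stderr in a per-run log file.
    with open(workdir / "run.log", "w") as log:
        result = subprocess.run(
            command, cwd=workdir, stdout=log, stderr=subprocess.STDOUT
        )
    return name, result.returncode


def run_all(jobs, root, max_parallel=2):
    """Run a {name: command} mapping, at most max_parallel at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [
            pool.submit(run_isolated, name, command, root)
            for name, command in jobs.items()
        ]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Stand-in commands (assumption): each writes a file into its own cwd.
    # A real run would pass the Timeloop command line here instead.
    demo_jobs = {
        name: [sys.executable, "-c", f"open('out.txt', 'w').write('{name}')"]
        for name in ("arch_a", "arch_b")
    }
    with tempfile.TemporaryDirectory() as root:
        print(run_all(demo_jobs, root))
```

Because each command runs with a different current directory, any input files would have to be passed as absolute paths. Whether this avoids the ERT error has not been confirmed in this thread.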
This is plausible. We have not tested multiple instances of Timeloop running in parallel.
That said, Timeloop itself is multi-threaded and the number of mapper threads will expand to fill all available host CPUs. Do you need more parallelism beyond that (e.g., if mapper runs are very short, and/or you are only using timeloop-model)?
I want to evaluate multiple design choices, for example the Eyeriss example with some changes to the architecture parameters. I also want to evaluate a full network, e.g. ResNet-50. I know that Timeloop doesn't support cross-layer optimizations, but that is good enough for me.
So I have two more dimensions of parallelism (architectures and layers), and I want to be able to evaluate multiple architectures in parallel.
Understood, but my point is that each timeloop-mapper invocation already maxes out the parallelism on your host machine in a controlled manner. Adding more parallel work on top of that will only slow things down.
Isn't the number of threads limited by a key of the mapper config? If that is the case, it would be interesting to also parallelize in other directions. To be honest, I have no idea whether that would be faster than just giving the mapper more threads, but I would like to explore that option. Unless the mapper really does fill the host machine with threads by default, in which case it wouldn't make sense to do so.
The mapper's default behavior is indeed to fill the host machine with threads:
timeloop/src/applications/mapper/mapper.cpp, line 145 in 450af79
The key in the mapper config is used to override that default behavior (e.g., for debugging, or for limiting host machine utilization).
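If several mapper runs are launched side by side anyway, the override could in principle be used to split the host CPUs among them so that the total thread count stays close to the number of cores. The sketch below only does that arithmetic; the helper name `threads_per_run` is hypothetical, the result would still have to be written into each run's mapper config by hand, and nothing in this thread shows that such a split is faster than a single run using all threads.

```python
import os


def threads_per_run(parallel_runs, total_cpus=None):
    """Evenly divide host CPUs among concurrent runs (at least 1 each)."""
    if parallel_runs < 1:
        raise ValueError("parallel_runs must be at least 1")
    if total_cpus is None:
        # os.cpu_count() can return None on some platforms.
        total_cpus = os.cpu_count() or 1
    return max(1, total_cpus // parallel_runs)


if __name__ == "__main__":
    # Example: thread budget per run for four concurrent runs on this host.
    print(threads_per_run(4))
```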