Starved workers are busy hunting for a job
art-w opened this issue · 1 comments
art-w commented
@barko reported that this short test can reach 100% CPU activity even though the domains spend their time sleeping. According to perf and a quick investigation, it seems that the issue is this spin in `Multi_channel`: when a worker is starving for a job, it busy-waits for up to 1ms.
A quick fix would be to reduce the 2048 allowed iterations to a much smaller number (it's `* nb_domains` with `recv_poll_loop`, so it also gets worse as the number of cores grows).
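To make the quick fix concrete, here is a minimal sketch of a bounded spin that falls back to blocking once a small budget is exhausted. It assumes OCaml 5 (where `Mutex` and `Condition` live in the standard library) and is not the actual `Multi_channel` code: the `mailbox` type, `max_spins`, `send` and `recv` are hypothetical names, the queue is a single slot for brevity, and the budget of 32 is an arbitrary illustration rather than a tuned value.

```ocaml
(* Hypothetical single-slot mailbox, for illustration only. *)
type 'a mailbox = {
  slot : 'a option Atomic.t;
  mutex : Mutex.t;
  nonempty : Condition.t;
}

let create () =
  { slot = Atomic.make None;
    mutex = Mutex.create ();
    nonempty = Condition.create () }

(* Assumed small spin budget; deliberately not multiplied by the
   number of domains. *)
let max_spins = 32

(* Publish the value under the mutex so that a receiver which is about
   to block cannot miss the wakeup. *)
let send mb v =
  Mutex.lock mb.mutex;
  Atomic.set mb.slot (Some v);
  Condition.signal mb.nonempty;
  Mutex.unlock mb.mutex

let recv mb =
  (* Phase 1: poll a bounded number of times without taking the lock. *)
  let rec spin n =
    if n = 0 then block ()
    else
      match Atomic.exchange mb.slot None with
      | Some v -> v
      | None ->
        Domain.cpu_relax ();
        spin (n - 1)
  (* Phase 2: sleep on the condition variable until a sender signals. *)
  and block () =
    Mutex.lock mb.mutex;
    let rec wait () =
      match Atomic.exchange mb.slot None with
      | Some v -> v
      | None ->
        Condition.wait mb.nonempty mb.mutex;
        wait ()
    in
    let v = wait () in
    Mutex.unlock mb.mutex;
    v
  in
  spin max_spins

let () =
  let mb = create () in
  let consumer = Domain.spawn (fun () -> recv mb) in
  send mb 42;
  assert (Domain.join consumer = 42)
```

With this shape a starved worker burns at most `max_spins` polls before it sleeps, so idle CPU usage no longer scales with the core count. The trade-off is extra wakeup latency when a job arrives just after the worker has blocked.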
(cc @lyrm since you also abhor spinlocks, the array of workstealing queues looks like a good opportunity for a fancy lock'n'lockfree datastructure combination!)
kayceesrk commented
CC @polytypic.