Some tasks go unnoticed in a simple case
wlandau opened this issue · 5 comments
I was revisiting https://github.com/wlandau/crew/blob/main/tests/throughput/test-persistent.R just now, and I noticed a task that did not seem to register with the dispatcher. It was easy to create a quick reproducible example of this behavior with just mirai:
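library(mirai)
# Set up 4 websocket URLs behind a dispatcher, each with a unique token.
daemons(n = 4, url = "ws://127.0.0.1:5000", dispatcher = TRUE, token = TRUE)
# Read the generated URLs from mirai's internal state.
urls <- environment(daemons)$..$default$urls
# Submit 5 tasks before any server has connected.
tasks1 <- replicate(5, mirai(TRUE))
# Launch a single server that dials in to the first URL.
launch_server(url = urls[1])
# Submit 5 more tasks after the server has been launched.
tasks2 <- replicate(5, mirai(TRUE))
Sys.sleep(1)
# Inspect the per-URL status matrix, then reset.
daemons()$daemons
daemons(0)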
When I run the code above, daemons()$daemons tells me that only 7 of the 10 tasks were assigned and completed:
ws://127.0.0.1:5000/1/8b8c503e1c8cbcd077b1b79c7ac8bac55e5e9ced 1 1 7 7
ws://127.0.0.1:5000/2/5bd2d543258c4082e95f470da9aee91b95f4fa06 0 0 0 0
ws://127.0.0.1:5000/3/df141b21c0d541b49d663c6f7a8661a4ebbf1280 0 0 0 0
ws://127.0.0.1:5000/4/8bcff54b5c8e8ca6d917b72b2abc0c792b31fe22 0 0 0 0
Whenever I run this starting with daemons(n = 4), the number of tasks assigned/completed is always 7. And interestingly, if n = 2, then assigned and completed are both 9. Trying a grid of n values from 1 to 10, I always see n + assigned = 11. For n > 10, I see assigned = 1 and completed = 1.
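For reference, the counts reported above can be summarised in one expression. This is only a restatement of the observations for 10 submitted tasks, not a model of how the dispatcher arrives at them, and `observed_assigned` is a hypothetical helper name:

```r
# Summary of the counts reported in this issue (10 tasks submitted in total).
# `observed_assigned` is a hypothetical helper, not part of mirai.
observed_assigned <- function(n) {
  # Reported: n + assigned = 11 for n = 1..10, and assigned = 1 for n > 10.
  pmax(11 - n, 1)
}

observed_assigned(c(2, 4, 10, 12))
# 9 7 1 1
```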
I tried mirai versions 0.8.7.9006, 0.8.7, and 0.8.4, all with the same result. nanonext is at 0.8.3.9001. Findings are the same on both my local Ubuntu machine and my MacBook. nanonext::nng_version() returns c("1.6.0pre", "mbed TLS 3.4.0") on both machines.
A bit of a brainteaser. There may still be some edge cases out there, in which case I can take a look tomorrow.
Awesome, thanks! Looks fixed on my end.
FYI, these tasks were never lost. They were simply pre-assigned to particular servers at the dispatcher, as mirai does not assume the scaling up/down that crew does. However, it was fairly easy to 'plug in' the necessary logic to the dispatcher. Possibly an example in support of #60 (comment).
My initial diagnosis was actually incorrect: the tasks were not 'pre-assigned'. Hence the re-implementation in ce6b92f provides a more elegant and targeted fix. It simply ensures the dispatcher loop is run again whenever a result is returned, if it would not otherwise have run. This is enough to ensure that any waiting tasks are dispatched.
The above eliminates the need for an extra dispatcher loop. All as it should be.
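The idea behind the fix described above (run the dispatcher loop again whenever a result is returned) can be illustrated with a toy model. This is a rough sketch under simplifying assumptions: a single online server, a loop that otherwise runs only when a task arrives, and hypothetical names throughout (`simulate_dispatch`, `dispatch_pass`, `rerun_on_result`). It is not mirai's dispatcher code and does not attempt to reproduce the exact counts reported in this issue.

```r
# Toy model only: NOT mirai's dispatcher. All names here are hypothetical.
simulate_dispatch <- function(n_tasks, rerun_on_result) {
  queue <- seq_len(n_tasks)  # tasks waiting at the dispatcher
  busy <- FALSE              # state of the single online server
  completed <- 0L

  # One pass of the loop: hand the next waiting task to the server if it is free.
  dispatch_pass <- function() {
    if (!busy && length(queue) > 0L) {
      queue <<- queue[-1L]
      busy <<- TRUE
    }
  }

  # Event 1: tasks arrive, and the loop runs once per arrival.
  for (i in seq_len(n_tasks)) dispatch_pass()

  # Event 2: results come back from the server.
  while (busy) {
    busy <- FALSE
    completed <- completed + 1L
    # The fix in this sketch: a returned result also triggers a loop pass,
    # so tasks that were waiting behind a busy server get dispatched.
    if (rerun_on_result) dispatch_pass()
  }
  completed
}

simulate_dispatch(10, rerun_on_result = FALSE)  # waiting tasks are left in the queue
simulate_dispatch(10, rerun_on_result = TRUE)   # every task is eventually dispatched
```

In this sketch, without the extra pass the queue is only examined when a task arrives, so anything still waiting when arrivals stop is never picked up. Triggering a pass on each returned result drains the queue without a separate polling loop.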