When I run poker_ai train start, the training gets stuck at INFO sending sentinel to worker server.py:131
Adnan-annan commented
When I run poker_ai train start, the training gets stuck at INFO sending sentinel to worker server.py:131
martingouy commented
It is probably not stuck, but the message can be misleading.
In the `terminate` method, the server waits for all the workers to be idle before exiting. What I experienced, and what I suspect you experienced too, is that a few workers were still busy with tasks.
You can try using the flags `--sync_cfr --sync_discount --sync_serialise`. With them, the server waits for each step to complete on all the workers before moving to the next one. It will take more time, but the progress bar will be more reliable.
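
To illustrate why a "sending sentinel" message can appear long before the process actually exits, here is a minimal sketch of a generic sentinel-based shutdown. It is not poker_ai's actual code: the names `SENTINEL`, `worker`, and `run` are hypothetical, and threads stand in for the real worker processes. The point is only that a sentinel queued behind unfinished jobs is not consumed until the worker has worked through them.

```python
import queue
import threading
import time

SENTINEL = None  # hypothetical marker that tells a worker to stop


def worker(jobs: queue.Queue, done: list) -> None:
    """Process jobs in order until the sentinel is received."""
    while True:
        job = jobs.get()
        if job is SENTINEL:
            break
        time.sleep(job)   # simulate a slow training step
        done.append(job)  # record the finished job


def run(n_workers: int = 2, durations=(0.05, 0.1, 0.02)) -> list:
    """Start workers, queue some jobs, then shut down with sentinels."""
    jobs: queue.Queue = queue.Queue()
    done: list = []
    threads = [
        threading.Thread(target=worker, args=(jobs, done))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for d in durations:
        jobs.put(d)

    # The "sending sentinel" step: one sentinel per worker, queued
    # behind any jobs that have not been picked up yet.
    for _ in threads:
        jobs.put(SENTINEL)

    # This can look stuck, but it is only waiting for busy workers
    # to finish their remaining jobs and reach their sentinel.
    for t in threads:
        t.join()
    return done


if __name__ == "__main__":
    print(sorted(run()))
```

In this sketch the main thread logs nothing between queuing the sentinels and the final `join`, so from the outside a long-running job looks like a hang even though every worker is still making progress.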