Hsword/Hetu

Get stuck when launching the server in hybrid mode

HPFilter opened this issue · 1 comment

Hello,
I was trying to run Hetu in hybrid mode with multiple nodes.
The command I used to launch the server and scheduler is:
python -m hetu.launcher ${workdir}/../settings/local_s1.yml -n 1 --sched

The content of the yml file is as follows.
image

However, both the scheduler and server processes are stuck in the following loop (Line 357 in the file ps-lite/src/van.cc).
image

Without shutting down the previous program, I started the worker and it could not find the server.
image

My workers and server are located on the same machine.
I wonder whether there is anything wrong with my configuration, or whether I missed a preceding step.

Thank you very much.

It is normal for the scheduler and the servers to wait for the workers.
We now use the heturun command to launch distributed tasks. Please refer to #25, which provides an example of heturun.
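
To illustrate why the scheduler and server look "stuck" until the workers start, here is a toy sketch of a startup barrier. It is not Hetu or ps-lite code: the `node` and `launch` helpers and the three-role setup are hypothetical, and a `threading.Barrier` stands in for the real registration handshake, which may work differently.

```python
import threading

# Hypothetical toy model, NOT Hetu or ps-lite code: one scheduler, one server
# and one worker must all arrive at the barrier before any of them proceeds.
NUM_NODES = 3
barrier = threading.Barrier(NUM_NODES)
started = []
lock = threading.Lock()

def node(role):
    barrier.wait()  # blocks until all NUM_NODES roles have arrived
    with lock:
        started.append(role)

def launch(roles):
    threads = [threading.Thread(target=node, args=(r,), daemon=True) for r in roles]
    for t in threads:
        t.start()
    return threads

# Launch only the scheduler and the server: both block at the barrier.
threads = launch(["scheduler", "server"])
for t in threads:
    t.join(timeout=0.2)
waiting = [t.is_alive() for t in threads]  # True means still waiting

# Launching the worker completes the group and releases everyone.
threads += launch(["worker"])
for t in threads:
    t.join(timeout=5)

print(waiting, sorted(started))
```

In this model the first two roles stay blocked for as long as the worker is missing, which matches the waiting loop described above. It does not explain why a worker might fail to find the server.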