Citi/scaler

Add option --keep-task to turn on/off scheduler persistence mode

Closed this issue · 0 comments

Currently, the scaler scheduler keeps each task object in memory, so if a worker dies, the scheduler can reassign its tasks to other workers. For memory efficiency, we need a mode in which the scheduler does not keep a task once it has been sent to a worker. Behavior then differs between the two modes:

keep task:

  • when a task fails because its worker disconnected, the scheduler reassigns the task to another worker
  • when balancing tasks, the scheduler only needs the task ids from the busy worker, since it already holds the task contents and can send them to other workers itself

do not keep task:

  • when a task fails because its worker disconnected, the scheduler just returns a failed result to the Client
  • when balancing tasks, the scheduler has no task contents at all, so it asks busy workers to return both the task ids and the task contents so that it can reschedule them to other workers
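
The two modes described above can be sketched as a small in-memory model. This is only an illustration of the intended behavior, not scaler's actual code: the `Task`, `Worker`, and `Scheduler` classes, the `keep_task` constructor argument, and all method names are hypothetical, and real message passing between scheduler and workers is replaced by direct method calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    task_id: str
    payload: bytes


class Worker:
    """Hypothetical worker that holds queued tasks by id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: Dict[str, Task] = {}

    def receive(self, task: Task) -> None:
        self.queue[task.task_id] = task

    def release(self, count: int, with_content: bool) -> List[Tuple[str, Optional[Task]]]:
        # Give up to `count` queued tasks back to the scheduler.
        # Contents are only returned when the scheduler asks for them.
        released = []
        for task_id in list(self.queue)[:count]:
            task = self.queue.pop(task_id)
            released.append((task_id, task if with_content else None))
        return released


class Scheduler:
    """Hypothetical scheduler; `keep_task` stands in for the --keep-task option."""

    def __init__(self, keep_task: bool) -> None:
        self.keep_task = keep_task
        self.tasks: Dict[str, Task] = {}       # filled only when keep_task is on
        self.assignment: Dict[str, str] = {}   # task id -> worker name
        self.workers: Dict[str, Worker] = {}
        self.failed: List[str] = []            # task ids reported failed to the client

    def add_worker(self, worker: Worker) -> None:
        self.workers[worker.name] = worker

    def submit(self, task: Task, worker_name: str) -> None:
        if self.keep_task:
            self.tasks[task.task_id] = task
        self.assignment[task.task_id] = worker_name
        self.workers[worker_name].receive(task)

    def on_worker_disconnected(self, worker_name: str) -> None:
        self.workers.pop(worker_name)
        lost = [tid for tid, name in self.assignment.items() if name == worker_name]
        for task_id in lost:
            if self.keep_task and self.workers:
                # keep task: the scheduler still has the content, so reassign it.
                target = next(iter(self.workers))
                self.assignment[task_id] = target
                self.workers[target].receive(self.tasks[task_id])
            else:
                # do not keep task: the content is gone, so report a failure.
                del self.assignment[task_id]
                self.tasks.pop(task_id, None)
                self.failed.append(task_id)

    def balance(self, busy: str, idle: str, count: int) -> None:
        # keep task: ids are enough; otherwise ask the busy worker for contents too.
        released = self.workers[busy].release(count, with_content=not self.keep_task)
        for task_id, content in released:
            task = self.tasks[task_id] if self.keep_task else content
            self.assignment[task_id] = idle
            self.workers[idle].receive(task)
```

In this sketch the only state that differs between the modes is `Scheduler.tasks`: with `keep_task` off it stays empty, which is where the memory saving would come from, at the cost of failing tasks on worker loss and of a heavier balancing exchange.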