Scheduler for running multiple jobs resource-efficiently
wynot12 opened this issue · 0 comments
The current JobServer runs each job on its own partition of resources.
However, we can run jobs more efficiently by sharing resources across jobs.
To do this, we need to coordinate jobs so that they run without contention while maximizing resource utilization.
In detail, we need to do the following:
- change the worker trainer task so that it can be controlled at a finer granularity.
- introduce a component that controls the trainer tasks.
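
As a rough illustration of the two items above (not the actual JobServer design), here is a minimal single-process sketch. It assumes a trainer task can be split into small steps, such as mini-batches, and that a controller grants a fixed number of shared slots per round. `TrainerTask`, `TaskController`, `slots`, and `run_one_step` are hypothetical names made up for this sketch, and the round-robin policy is only a placeholder for whatever policy the scheduler ends up using.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TrainerTask:
    """Hypothetical trainer task that can be advanced one small step at a time."""
    job_id: str
    total_steps: int
    done_steps: int = 0

    def run_one_step(self) -> None:
        # Placeholder for one unit of real training work (e.g. one mini-batch).
        self.done_steps += 1

    @property
    def finished(self) -> bool:
        return self.done_steps >= self.total_steps


class TaskController:
    """Hypothetical component that decides which tasks run their next step."""

    def __init__(self, slots: int) -> None:
        if slots < 1:
            raise ValueError("slots must be at least 1")
        self.slots = slots    # shared resource slots available in each round
        self.queue = deque()  # unfinished tasks waiting for a slot
        self.trace = []       # job ids granted a slot, one list per round

    def submit(self, task: TrainerTask) -> None:
        if not task.finished:
            self.queue.append(task)

    def run(self) -> list:
        while self.queue:
            # Grant slots to the tasks at the head of the queue (round-robin).
            count = min(self.slots, len(self.queue))
            granted = [self.queue.popleft() for _ in range(count)]
            for task in granted:
                task.run_one_step()
            self.trace.append([task.job_id for task in granted])
            # Unfinished tasks go to the back of the queue, so a slot freed
            # by a finished job is picked up by another job in the next round.
            self.queue.extend(task for task in granted if not task.finished)
        return self.trace


controller = TaskController(slots=2)
for job_id, steps in [("A", 3), ("B", 1), ("C", 2)]:
    controller.submit(TrainerTask(job_id, steps))
print(controller.run())
```

In this sketch the slot that job B releases after its single step is reused by job C in the next round, instead of sitting idle in B's partition. A real implementation would also need to handle tasks running concurrently on different workers, which this sketch leaves out.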