CentaurusInfra/alnair

scheduling needs


  1. Prefetch timing: with multiple jobs in the queue, decide when to bring each job's required data into the limited local storage.
  2. Cache eviction policy: decide which job's data to evict when local storage is full.
  3. Spot instances: when a high-priority job arrives, decide which jobs (one or a group) to evict so that the impact is minimized.
  4. Cross-region scheduling: handle the case where compute and storage resources are located in different regions, including network uncertainty.
  5. Simulation model: resources include CPU, memory, and GPU (or other accelerators).
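
To make items 1 and 2 concrete, here is a minimal sketch of one possible policy, not a proposal for the final design. It assumes the queue order is known, prefetches data in that order, and evicts only data belonging to jobs that run later (or are no longer queued). The names `LocalCache`, `plan`, and the GB-based sizes are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LocalCache:
    """Limited local storage holding per-job datasets (sizes in GB)."""
    capacity_gb: float
    resident: dict = field(default_factory=dict)  # job_id -> dataset size in GB

    def used_gb(self) -> float:
        return sum(self.resident.values())

    def plan(self, queue, sizes):
        """Return (prefetch, evict) lists of job ids for the given queue.

        queue: job ids, next-to-run first.
        sizes: job_id -> dataset size in GB.
        """
        position = {job: i for i, job in enumerate(queue)}
        far = float("inf")  # position used for jobs no longer in the queue
        prefetch, evict = [], []
        for job in queue:
            if job in self.resident:
                continue  # data already local, nothing to do
            need = sizes[job]
            # Only data of jobs that run after `job` may be evicted;
            # try the ones needed furthest in the future first.
            victims = sorted(
                (j for j in self.resident if position.get(j, far) > position[job]),
                key=lambda j: position.get(j, far),
                reverse=True,
            )
            free = self.capacity_gb - self.used_gb()
            chosen = []
            for victim in victims:
                if free >= need:
                    break
                chosen.append(victim)
                free += self.resident[victim]
            if free < need:
                break  # no room without hurting earlier jobs; stop planning
            for victim in chosen:
                del self.resident[victim]
                evict.append(victim)
            self.resident[job] = need
            prefetch.append(job)
        return prefetch, evict


cache = LocalCache(capacity_gb=100, resident={"old": 60})
plan = cache.plan(["a", "b", "c"], {"a": 50, "b": 40, "c": 30})
print(plan)
```

In this example the stale dataset `old` is evicted to make room for `a` and `b`, while `c` waits because fetching it would displace data needed sooner. A real policy would also need to account for transfer time and job duration, which this sketch ignores.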