InftyAI/llmaz

Integrate with Kueue for fungibility capacity

kerthcet opened this issue · 1 comments

What would you like to be added:

Kueue is a great project that focuses on job queueing and resource management. It can also support inference services by managing Pods. This is efficient because Kueue has an overview of the cluster and knows up front whether a given GPU type is insufficient, compared to failing over at runtime.

What's more, if Kueue is already one of the components in your cluster, this integration would be especially convenient.

Why is this needed:

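Capacity fungibility: if the preferred GPU type is insufficient, the inference workload can be placed on another GPU type instead.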
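
To make the idea concrete, here is a minimal sketch of what fungibility could look like at admission time: GPU flavors are tried in preference order, and the request lands on the first one with enough free quota. The `Flavor` class, the `admit` function, and the flavor names and quotas are all hypothetical and only for illustration. This is not Kueue's API or its actual admission logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flavor:
    """Hypothetical GPU flavor with a fixed quota (not a Kueue type)."""
    name: str
    quota: int      # total GPUs of this kind available to the queue
    used: int = 0   # GPUs already admitted

    @property
    def free(self) -> int:
        return self.quota - self.used


def admit(flavors: List[Flavor], gpus: int) -> Optional[str]:
    """Admit a request on the first flavor, in preference order, with enough free quota.

    Returns the chosen flavor name, or None if the request has to stay queued.
    """
    for flavor in flavors:
        if flavor.free >= gpus:
            flavor.used += gpus
            return flavor.name
    return None


# Assumed example setup: a preferred flavor and a fallback flavor.
flavors = [Flavor("a100", quota=4), Flavor("t4", quota=8)]
first = admit(flavors, 4)   # fits on the preferred flavor
second = admit(flavors, 2)  # preferred flavor is exhausted, falls back
third = admit(flavors, 8)   # no flavor has enough free GPUs, stays queued
print(first, second, third)
```

The point is that the decision is made before any Pod starts, using the cluster-wide view of quota, rather than discovering the shortage at runtime and failing over.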

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

/kind feature