More complex model management (multiple models, model reloading, etc.)

Question

Closed this issue 3 months ago · 2 comments

bsergean commented 4 months ago

🚀 Feature

Support model reloading (when a new version becomes available) and serving multiple models.

Motivation

Other servers support this, so adding it would make this project more attractive.

Pitch

Right now it's obvious how to serve one model, but what if there are multiple ones, with the request (binary payload or HTTP arguments) indicating which model should be used?
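To make the request concrete, here is a rough sketch of the behavior being asked for, not an API that exists in this project. `ModelRegistry`, `register`, `get`, and the `loader` callback are all hypothetical names; the sketch assumes each model lives in a single file and that a changed modification time means a new version is available.

```python
import os
import threading
from typing import Any, Callable, Dict, Tuple


class ModelRegistry:
    """Hypothetical registry: maps a model name to a loaded model and
    reloads the model when its file's modification time changes."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        # `loader` is an assumed callback that turns a file path into a model.
        self._loader = loader
        self._paths: Dict[str, str] = {}
        self._cache: Dict[str, Tuple[int, Any]] = {}  # name -> (mtime_ns, model)
        self._lock = threading.Lock()

    def register(self, name: str, path: str) -> None:
        """Add (or repoint) a model at runtime, without restarting the server."""
        with self._lock:
            self._paths[name] = path
            self._cache.pop(name, None)  # force a load on next use

    def get(self, name: str) -> Any:
        """Return the model for `name`, loading or reloading it if needed."""
        with self._lock:
            path = self._paths[name]  # raises KeyError for an unknown model
            mtime = os.stat(path).st_mtime_ns
            cached = self._cache.get(name)
            if cached is None or cached[0] != mtime:
                cached = (mtime, self._loader(path))
                self._cache[name] = cached
            return cached[1]
```

A request handler would read the model name from the request (binary header or HTTP argument) and call `registry.get(name)` before running inference. Checking the file on every call keeps the sketch short; a real server would more likely poll or watch for changes in the background.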

Alternatives

Run N instances for the N models present at a given time, but that won't work if a new model appears.

Additional context

We have an internal C++ server that supports this. torch.serve supports it too, with what I believe they call an orchestrator.

Answer 1 · 2024-09-20T16:33:59.000Z

Looks like a duplicate of #271

Answer 2 · 2024-10-07T11:04:50.000Z

Closing as a duplicate of #271.