Lightning-AI/LitServe

More complex model management (multiple models, model reloading, etc.)

Closed this issue · 2 comments

🚀 Feature

Support model reloads (when a new version becomes available) and serving multiple models.

Motivation

Other servers support this, so adding it would make LitServe more attractive.

Pitch

Right now it's obvious how to serve one model, but what if there are multiple models and the request (its binary payload or its HTTP arguments) specifies which one should be used?
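To make the pitch concrete, here is a minimal sketch of per-request routing in plain Python. It does not use any LitServe API: `ModelRouter`, the `"model"` and `"input"` request fields, and the two toy models are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict

# Assumption: a "model" is anything callable on a single input.
Model = Callable[[Any], Any]


class ModelRouter:
    """Holds several models and dispatches each request to the one it names."""

    def __init__(self, models: Dict[str, Model]) -> None:
        self._models = dict(models)

    def predict(self, request: Dict[str, Any]) -> Any:
        # Hypothetical request shape: {"model": <name>, "input": <payload>}.
        name = request.get("model")
        if name not in self._models:
            raise KeyError(f"unknown model: {name!r}")
        return self._models[name](request["input"])


# Toy stand-ins for real models.
router = ModelRouter({
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
})

doubled = router.predict({"model": "double", "input": 3})
squared = router.predict({"model": "square", "input": 3})
```

In a real server the same dispatch could live in whatever hook turns a decoded request into a prediction, and the model name could come from a URL argument instead of the body.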

Alternatives

Run N instances for the N models present at a given time. This breaks down as soon as a new model appears.
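For contrast with the fixed N-instance setup, the sketch below shows one way a single process could pick up new models and new versions at runtime. It is only an illustration under stated assumptions: `ReloadingRegistry`, `discover`, and `load` are hypothetical names, and the in-memory `available` dict stands in for whatever store (a directory, a bucket, a database) would actually list model versions.

```python
import threading
from typing import Any, Callable, Dict, List, Tuple


class ReloadingRegistry:
    """Keeps the newest known version of each model loaded in one process."""

    def __init__(
        self,
        discover: Callable[[], Dict[str, int]],  # hypothetical: name -> latest version
        load: Callable[[str, int], Any],         # hypothetical: builds a model object
    ) -> None:
        self._discover = discover
        self._load = load
        self._lock = threading.Lock()
        self._loaded: Dict[str, Tuple[int, Any]] = {}

    def refresh(self) -> List[str]:
        """Load every model that is new or outdated; return the names that changed."""
        changed = []
        for name, version in self._discover().items():
            with self._lock:
                current = self._loaded.get(name)
            if current is None or current[0] < version:
                model = self._load(name, version)  # slow step, done without the lock
                with self._lock:
                    self._loaded[name] = (version, model)
                changed.append(name)
        return changed

    def get(self, name: str) -> Any:
        with self._lock:
            return self._loaded[name][1]


# Stand-in for a model store; strings stand in for loaded models.
available = {"a": 1}
registry = ReloadingRegistry(lambda: dict(available), lambda n, v: f"{n}-v{v}")
registry.refresh()

# A new version of "a" and a brand-new model "b" appear later.
available["a"] = 2
available["b"] = 1
changed = registry.refresh()
```

Calling `refresh()` from a background timer or an admin endpoint would be one way to trigger it; requests keep being served from the old version until the swap inside the lock.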

Additional context

We have an internal C++ server that supports this. TorchServe supports it too, through what I believe they call an orchestrator.

Looks like a duplicate of #271

Closing as a duplicate of #271.