BerriAI/liteLLM-proxy

Info Endpoint in OpenAI v1/models style


It would be helpful to provide information about which model is running.
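
For reference, OpenAI's `GET /v1/models` returns a list of model objects. Below is a minimal sketch of what an equivalent endpoint on the proxy could return; the FastAPI setup and the placeholder model ids are assumptions for illustration, not litellm's actual implementation.

```python
# Sketch of an OpenAI-style /v1/models endpoint (FastAPI assumed;
# model ids are placeholders, not litellm's real config).
from fastapi import FastAPI

app = FastAPI()

# In practice this list would come from the proxy's configuration.
CONFIGURED_MODELS = ["gpt-3.5-turbo", "gpt-4", "mistral-7b"]

@app.get("/v1/models")
def list_models():
    # Mirrors the shape of OpenAI's model list response,
    # so existing OpenAI clients can consume it unchanged.
    return {
        "object": "list",
        "data": [
            {"id": m, "object": "model", "owned_by": "litellm-proxy"}
            for m in CONFIGURED_MODELS
        ],
    }
```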

Hey @michaelfeil - definitely. Can you tell me more about what you're trying to achieve?

I would like to host multiple instances of e.g. TGI and set up an API gateway (litellm) in the center, which can handle e.g. GPT-3.5, GPT-4 and Mistral-7B under ONE URL. To do that, I need some kind of "info" endpoint / config which can do this dynamically, depending on uptime in k8s.
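
A hedged sketch of the dynamic part of that idea: the gateway only reports a model as available when its upstream deployment answers a health check. The service URLs, the `/health` path, and the model-to-deployment mapping are illustrative assumptions, not litellm configuration.

```python
# Sketch: filter the model list by upstream health. Hosted providers
# (OpenAI, Anthropic) could simply be treated as always available;
# the entries below are hypothetical in-cluster TGI deployments.
import httpx

TGI_DEPLOYMENTS = {
    "mistral-7b": "http://tgi-mistral.default.svc.cluster.local:8080",
    "zephyr-7b": "http://tgi-zephyr.default.svc.cluster.local:8080",
}

def live_models() -> list[str]:
    live = []
    for model, base_url in TGI_DEPLOYMENTS.items():
        try:
            # Treat any 2xx from the deployment's health route as "running".
            resp = httpx.get(f"{base_url}/health", timeout=2.0)
            if resp.status_code < 300:
                live.append(model)
        except httpx.HTTPError:
            pass  # deployment is down or unreachable; omit it from the list
    return live
```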

What does 'which model is running' mean in that context?

E.g. if you can call OpenAI and Anthropic via the server, would that mean both models are 'running'?

And how is 'running' different from 'available'?