replicate/replicate-python

Deployment functionality

vishnubob opened this issue · 2 comments

I would like to be able to spin up and shut down deployments from the API. From looking over the API and the Python client, this doesn't seem possible. Am I missing something, or would it be possible to add this functionality?

Thanks!

Hi, @vishnubob. You're correct that Replicate doesn't currently expose any APIs for managing deployments. However, you can configure your deployment with a min / max number of concurrent predictions to handle, and the autoscaler will spin model instances up and down based on inbound requests.

Hi @mattt, thanks for your response. I am using Replicate for an interactive photobooth, so my use case is a bit unusual. Since the installation is temporary, I only need the deployment while the installation is open. To reduce latency, I stand up a single-node deployment while the installation is available and spin down the nodes when I strike. However, it's a complicated installation, and I sometimes forget to spin down the deployment during strike, so I end up paying for idle instances. Being able to automate the deployment from the software would be a huge win.

For now, I have moved this part of the project to Tailscale, which lets me use my own server at home, but if I could automate the deployment, I would switch back to Replicate.
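
To make the request concrete, here is a rough sketch of the pattern being asked for: scale up when the installation opens and always scale back down at strike, even if the software crashes. The `set_instances` callable is hypothetical and stands in for whatever deployment-management call might eventually exist; as noted above, Replicate does not currently expose one, so the example uses a stub that only records the requested instance counts.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def deployment_window(
    set_instances: Callable[[int], None],  # hypothetical scaling hook, not a Replicate API
    active: int = 1,
    idle: int = 0,
) -> Iterator[None]:
    """Scale up on entry and always scale back down on exit."""
    set_instances(active)
    try:
        yield
    finally:
        # Runs even if the photobooth code raises, so the
        # deployment is never left running by accident.
        set_instances(idle)


# Stub that records the requested instance counts instead of calling a real service.
calls: list[int] = []

with deployment_window(calls.append):
    pass  # run the installation here
```

With a real scaling call in place of the stub, the spin-down step would no longer depend on remembering to do it by hand during strike.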