Allow user to configure autoscaling
ariefrahmansyah commented
Right now, the only setting we use to configure autoscaling is the queue.sidecar.serving.knative.dev/resourcePercentage
annotation, and its value is set globally per environment.
As described here, we also cannot specify separate autoscaling policies for the predictor (the model) and the transformer.
This issue tracks how to enable users to specify the autoscaling configuration for their model.
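
To make the request more concrete, here is a rough sketch of what a per-component autoscaling policy could look like. This is only an illustration, not a proposed implementation: `AutoscalingPolicy` and `to_knative_annotations` are hypothetical names that do not exist in the codebase, and the example values are made up. The annotation keys are the standard Knative `autoscaling.knative.dev/*` annotations; how they would actually be wired into the deployment is left open.

```python
from dataclasses import dataclass

# Knative autoscaler classes: KPA handles request-based metrics,
# HPA handles resource-based metrics.
KPA_CLASS = "kpa.autoscaling.knative.dev"
HPA_CLASS = "hpa.autoscaling.knative.dev"

METRIC_TO_CLASS = {
    "concurrency": KPA_CLASS,
    "rps": KPA_CLASS,
    "cpu": HPA_CLASS,
    "memory": HPA_CLASS,
}


@dataclass(frozen=True)
class AutoscalingPolicy:
    """Hypothetical user-facing policy for one component (predictor or transformer)."""

    metric: str            # one of the keys in METRIC_TO_CLASS
    target: int            # target value for the chosen metric
    min_replicas: int = 1
    max_replicas: int = 1


def to_knative_annotations(policy: AutoscalingPolicy) -> dict:
    """Translate a policy into Knative revision-template annotations."""
    if policy.metric not in METRIC_TO_CLASS:
        raise ValueError(f"unsupported autoscaling metric: {policy.metric!r}")
    if policy.min_replicas < 0 or policy.max_replicas < 1:
        raise ValueError("min_replicas must be >= 0 and max_replicas must be >= 1")
    if policy.min_replicas > policy.max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    return {
        "autoscaling.knative.dev/class": METRIC_TO_CLASS[policy.metric],
        "autoscaling.knative.dev/metric": policy.metric,
        "autoscaling.knative.dev/target": str(policy.target),
        "autoscaling.knative.dev/min-scale": str(policy.min_replicas),
        "autoscaling.knative.dev/max-scale": str(policy.max_replicas),
    }


# Example (made-up values): the predictor and transformer get independent policies.
predictor_policy = AutoscalingPolicy(metric="concurrency", target=10, min_replicas=1, max_replicas=4)
transformer_policy = AutoscalingPolicy(metric="cpu", target=70, min_replicas=1, max_replicas=2)

predictor_annotations = to_knative_annotations(predictor_policy)
transformer_annotations = to_knative_annotations(transformer_policy)
```

With something along these lines, the predictor could scale on request concurrency while the transformer scales on CPU utilization, instead of both sharing one environment-wide setting.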