Allow user to configure autoscaling

Question

Opened this issue 4 years ago · 0 comments

Right now, we configure autoscaling only through the queue.sidecar.serving.knative.dev/resourcePercentage annotation, and its value is set globally per environment.

As described here, we also cannot specify a separate autoscaling policy for the predictor (the model) and the transformer.

This issue tracks how to enable users to specify the autoscaling configuration for their models.
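
As a starting point for discussion, here is a minimal sketch of how a per-component policy could be translated into Knative autoscaling annotations. The `AutoscalingPolicy` class, the `to_knative_annotations` function, and the example values are hypothetical and not part of the current codebase; only the `autoscaling.knative.dev/*` annotation keys come from Knative itself. It assumes the predictor and the transformer are deployed as separate Knative revisions, so each can carry its own annotations.

```python
from dataclasses import dataclass
from typing import Dict

# Metrics handled by each Knative autoscaler class.
KPA_METRICS = {"concurrency", "rps"}
HPA_METRICS = {"cpu", "memory"}


@dataclass(frozen=True)
class AutoscalingPolicy:
    """Hypothetical user-facing policy for one component (predictor or transformer)."""
    min_replicas: int
    max_replicas: int
    metric: str
    target: int


def to_knative_annotations(policy: AutoscalingPolicy) -> Dict[str, str]:
    """Translate a policy into Knative revision annotations (all values are strings)."""
    if policy.min_replicas < 0 or policy.max_replicas < policy.min_replicas:
        raise ValueError("expected 0 <= min_replicas <= max_replicas")
    if policy.metric in KPA_METRICS:
        autoscaler_class = "kpa.autoscaling.knative.dev"
    elif policy.metric in HPA_METRICS:
        autoscaler_class = "hpa.autoscaling.knative.dev"
    else:
        raise ValueError(f"unsupported metric: {policy.metric}")
    return {
        "autoscaling.knative.dev/class": autoscaler_class,
        "autoscaling.knative.dev/metric": policy.metric,
        "autoscaling.knative.dev/target": str(policy.target),
        "autoscaling.knative.dev/min-scale": str(policy.min_replicas),
        "autoscaling.knative.dev/max-scale": str(policy.max_replicas),
    }


# Illustrative values only: the predictor scales on concurrency, the transformer on CPU.
predictor_annotations = to_knative_annotations(
    AutoscalingPolicy(min_replicas=1, max_replicas=4, metric="concurrency", target=10)
)
transformer_annotations = to_knative_annotations(
    AutoscalingPolicy(min_replicas=1, max_replicas=2, metric="cpu", target=70)
)
```

Keeping the policy as a small typed object, rather than exposing raw annotations, would let the platform validate user input and change the underlying autoscaler later without breaking the user-facing configuration.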