Change scoring_uri for ClusterIP inferenceRouterServiceType
When the ML extension is installed with ClusterIP set for inferenceRouterServiceType, the OnlineEndpoint seems to build the scoring URI from the azureml-fe service's cluster IP, which is completely inaccessible from outside the cluster if you use kubenet. This breaks the "Test" functionality on an Azure ML endpoint and the `az ml online-endpoint invoke` command, both of which try to use that internal ClusterIP.
OnlineEndpoint status

```yaml
status:
  ...
  scoringUri: http://10.0.227.236/api/v1/endpoint/dev-rec-ab/score
```
Azure CLI Invoke and Error

```shell
$ az ml online-endpoint invoke --name <endpoint-name> \
    --resource-group <workspace-rg> \
    --workspace-name <workspace-name> \
    --deployment-name "green" \
    --request-file "sample_request.json"
cli.azure.cli.core.azclierror: (<urllib3.connection.HTTPConnection object at 0x7fc1dfebbee0>, 'Connection to 10.0.227.236 timed out. (connect timeout=300)')
```
Is there any way to set or override this URI? I dug through the documentation, but the only relevant text I could find regarding the ClusterIP and NodePort configurations is in Key considerations for AzureML extension deployment:
- Type NodePort. Exposes azureml-fe on each Node's IP at a static port. You'll be able to contact azureml-fe, from outside of cluster, by requesting `<NodeIP>:<NodePort>`. Using NodePort also allows you to set up your own load balancing solution and SSL termination for azureml-fe.
- Type ClusterIP. Exposes azureml-fe on a cluster-internal IP, and it makes azureml-fe only reachable from within the cluster. For azureml-fe to serve inference requests coming from outside the cluster, it requires you to set up your own load balancing solution and SSL termination for azureml-fe.
I've configured an Istio ingress controller to pass external requests along to the azureml-fe service as my own load balancing solution, and that is working fine.
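For reference, my ingress setup looks roughly like this. The host `ml.example.com` is a placeholder, and I'm assuming here that azureml-fe exposes plain HTTP on port 80 of its Service in the azureml namespace (adjust the port if you terminate TLS at azureml-fe instead):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: azureml-fe-gateway
  namespace: azureml
spec:
  selector:
    istio: ingressgateway   # use Istio's default ingress gateway pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "ml.example.com"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: azureml-fe
  namespace: azureml
spec:
  hosts:
    - "ml.example.com"
  gateways:
    - azureml-fe-gateway
  http:
    - route:
        - destination:
            # route everything on this host to the azureml-fe ClusterIP service
            host: azureml-fe.azureml.svc.cluster.local
            port:
              number: 80
```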
The rest of the documentation is written specifically for the LoadBalancer type and ignores these other options. I tried setting a `scoring_uri` parameter in my endpoint.yaml, but it was ignored. I also dug through the ConfigMaps in the azureml namespace and saw no reference to the URI, so I assume it is pulled from the azureml/azureml-fe service directly. Is there really no way to override this? I want the scoring URI to use the address I've configured on the ingress, which is actually reachable, not the internal cluster IP. I find it hard to believe that exposing inference endpoints externally without breaking the "Test" functionality and the CLI invoke command hasn't been accounted for, but I can't track down a solution.
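In the meantime, the only workaround I've found is client-side: take the scoringUri from the endpoint status, swap the cluster-internal host for the ingress address, and call the rewritten URL directly (with the usual `Authorization: Bearer <key>` header) instead of relying on `az ml online-endpoint invoke`. A minimal sketch — `ml.example.com` is a placeholder for whatever host your ingress exposes:

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_scoring_uri(scoring_uri: str, ingress_host: str, scheme: str = "https") -> str:
    """Replace the cluster-internal host in a scoringUri with an externally
    reachable ingress address, keeping the endpoint path intact."""
    parts = urlsplit(scoring_uri)
    return urlunsplit((scheme, ingress_host, parts.path, parts.query, parts.fragment))

# The internal URI from the endpoint status, rewritten to a placeholder ingress host:
uri = rewrite_scoring_uri(
    "http://10.0.227.236/api/v1/endpoint/dev-rec-ab/score",
    "ml.example.com",
)
print(uri)  # https://ml.example.com/api/v1/endpoint/dev-rec-ab/score
```

This keeps the path (which encodes the endpoint name) untouched, so routing inside azureml-fe is unaffected; it just doesn't help the portal's "Test" tab, which still uses the stored scoringUri.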