feat: KServe Inference
sungsoo opened this issue · 20 comments
KServe Inference
- Article Source: First InferenceService
First InferenceService
Run your first InferenceService
In this tutorial, you will deploy a ScikitLearn InferenceService.
This inference service loads a simple iris ML model; you can then send it a list of attributes and get back the predicted class of iris plant.
Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the trained model and it gets some superpowers out of the box.
1. Create test InferenceService
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "sklearn-iris"
spec:
predictor:
sklearn:
storageUri: "gs://kfserving-samples/models/sklearn/iris"
Once you've created your YAML file (named something like "sklearn.yaml"):
kubectl create namespace kserve-test
kubectl apply -f sklearn.yaml -n kserve-test
You can verify the deployment of this inference service as follows.
(base) sungsoo@z840 ~
$ k get pods -A -w
NAMESPACE NAME READY STATUS RESTARTS AGE
... (intermediate output omitted)
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 Pending 0 2s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 Pending 0 3s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 Init:0/1 0 3s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 Init:0/1 0 8s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 Init:0/1 0 41s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 0/2 PodInitializing 0 51s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 1/2 Running 0 97s
kserve-test sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9 2/2 Running 0 98s
2. Check InferenceService status.
kubectl get inferenceservices sklearn-iris -n kserve-test
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
sklearn-iris http://sklearn-iris.kserve-test.example.com True 100 sklearn-iris-predictor-default-47q2g 7d23h
If your URL still contains example.com, please consult your admin about configuring DNS or using a custom domain.
3. Determine the ingress IP and ports
Execute the following command to determine whether your Kubernetes cluster is running in an environment that supports external load balancers:
$ kubectl get svc istio-ingressgateway -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway LoadBalancer 172.21.109.129 130.211.10.121 ... 17h
or, on microk8s with Kubeflow:
(base) sungsoo@sungsoo-HP-Z840 ~
$ kubectl get svc istio-ingressgateway -n kubeflow
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway LoadBalancer 10.152.183.116 10.64.140.43 15020:32267/TCP,80:32425/TCP,443:31890/TCP,15029:31587/TCP,15030:31591/TCP,15031:32223/TCP,15032:32596/TCP,15443:32307/TCP,15011:32504/TCP,8060:32176/TCP,853:30715/TCP 12h
Load Balancer
If the EXTERNAL-IP value is set, your environment has an external load balancer that you can use for the ingress gateway.
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
or, on microk8s with Kubeflow:
export INGRESS_HOST=$(kubectl -n kubeflow get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n kubeflow get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
Node Port
If the EXTERNAL-IP value is none (or perpetually pending), your environment does not provide an external load balancer for the ingress gateway. In this case, you can access the gateway using the service's node port.
# GKE
export INGRESS_HOST=worker-node-address
# Minikube
export INGRESS_HOST=$(minikube ip)
# Other environments (on-prem)
export INGRESS_HOST=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
Port Forward
Alternatively, you can use port forwarding for testing purposes:
INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80
# start another terminal
export INGRESS_HOST=localhost
export INGRESS_PORT=8080
4. Curl the InferenceService
First prepare your inference input request
{
"instances": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
Once you've created your json test input file (named something like "iris-input.json"):
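For example, you could create the file from the shell with a heredoc (this mirrors the snippet used later in this issue):
cat <<EOF > ./iris-input.json
{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}
EOF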
Real DNS
If you have configured DNS, you can curl the InferenceService directly with the URL obtained from the status output, e.g.:
An error occurs at this step; it looks like a DNS problem. Let's dig in.
curl -v http://sklearn-iris.kserve-test.${CUSTOM_DOMAIN}/v1/models/sklearn-iris:predict -d @./iris-input.json
curl -v http://sklearn-iris.kserve-test.example.com/v1/models/sklearn-iris:predict -d @./iris-input.json
Magic DNS
If you don't want to go to the trouble of getting a real domain, you can instead use the "magic" DNS service xip.io. The key is to get the external IP of your cluster.
kubectl get svc istio-ingressgateway --namespace istio-system
Look for the EXTERNAL-IP column's value (in this case 35.237.217.209):
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway LoadBalancer 10.51.253.94 35.237.217.209
The next step is to set up the custom domain:
kubectl edit cm config-domain --namespace knative-serving
Now in your editor, change example.com to {{external-ip}}.xip.io (make sure to replace {{external-ip}} with the IP you found earlier).
With the change applied you can now directly curl the URL
curl -v http://sklearn-iris.kserve-test.35.237.217.209.xip.io/v1/models/sklearn-iris:predict -d @./iris-input.json
From Ingress gateway with HOST Header
If you do not have DNS, you can still curl with the ingress gateway external IP using the HOST Header.
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/sklearn-iris:predict -d @./iris-input.json
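If the request succeeds, the predictor returns a class prediction for each of the two instances; for this sample iris model the documented response looks like the following (your actual output may vary):
{"predictions": [1, 1]}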
From local cluster gateway
If you are calling from inside the cluster, you can curl the internal URL with host {{InferenceServiceName}}.{{namespace}}:
curl -v http://sklearn-iris.kserve-test/v1/models/sklearn-iris:predict -d @./iris-input.json
5. Run Performance Test
# use kubectl create instead of apply because the job template is using generateName which doesn't work with kubectl apply
kubectl create -f https://raw.githubusercontent.com/kserve/kserve/release-0.7/docs/samples/v1beta1/sklearn/v1/perf.yaml -n kserve-test
Expected Output
kubectl logs load-test8b58n-rgfxr -n kserve-test
Requests [total, rate, throughput] 30000, 500.02, 499.99
Duration [total, attack, wait] 1m0s, 59.998s, 3.336ms
Latencies [min, mean, 50, 90, 95, 99, max] 1.743ms, 2.748ms, 2.494ms, 3.363ms, 4.091ms, 7.749ms, 46.354ms
Bytes In [total, mean] 690000, 23.00
Bytes Out [total, mean] 2460000, 82.00
Success [ratio] 100.00%
Status Codes [code:count] 200:30000
Error Set:
Run your first InferenceService
KFServing InferenceService deployment and prediction
KFServing - Deep dive
What is serverless?
Serverless is a cloud-native development model that allows developers to build and run applications without having to manage servers.
Python SDK for building, training, and deploying ML models
Overview of Kubeflow Fairing
Kubeflow Fairing is a Python package that streamlines the process of building, training, and deploying machine learning (ML) models in a hybrid cloud environment. By using Kubeflow Fairing and adding a few lines of code, you can run your ML training job locally or in the cloud, directly from Python code or a Jupyter notebook. After your training job is complete, you can use Kubeflow Fairing to deploy your trained model as a prediction endpoint.
Use Kubeflow Fairing SDK
To install the SDK:
pip install kubeflow-fairing
To get started quickly, you can run the E2E MNIST sample.
Documentation
To learn how Kubeflow Fairing streamlines the process of training and deploying ML models in the cloud, read the Kubeflow Fairing documentation.
To learn the Kubeflow Fairing SDK API, read the HTML documentation.
Getting Started with KServe
Install the KServe "Quickstart" environment
You can get started with a local deployment of KServe by using the KServe quick installation script on Kind:
First, download the quick_install.sh file.
wget https://raw.githubusercontent.com/kserve/kserve/release-0.8/hack/quick_install.sh
Insert the appropriate shell shebang as the first line of the quick_install.sh file. In my case I use zsh, so I inserted the following:
#!/usr/bin/zsh
...
set -e
############################################################
# Help #
############################################################
Help()
...
Then execute the script:
(base) sungsoo@sungsoo-HP-Z840 ~/kubeflow
$ quick_install.sh
You can see the following console outputs.
Downloading istio-1.9.0 from https://github.com/istio/istio/releases/download/1.9.0/istio-1.9.0-linux-amd64.tar.gz ...
Istio 1.9.0 Download Complete!
Istio has been successfully downloaded into the istio-1.9.0 folder on your system.
Next Steps:
See https://istio.io/latest/docs/setup/install/ to add Istio to your Kubernetes cluster.
To configure the istioctl client tool for your workstation,
add the /home/sungsoo/kubeflow/istio-1.9.0/bin directory to your environment path variable with:
export PATH="$PATH:/home/sungsoo/kubeflow/istio-1.9.0/bin"
Begin the Istio pre-installation check by running:
istioctl x precheck
Need more information? Visit https://istio.io/latest/docs/setup/install/
namespace/istio-system unchanged
✔ Istio core installed
✔ Istiod installed
- Processing resources for Ingress gateways. Waiting for Deployment/istio-system/istio-ingressgateway
Notable errors
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/ambassadorinstallations.getambassador.io created
error: .status.conditions accessor error: <nil> is of the type <nil>, expected []interface{}
Warning messages when reinstalling microk8s
Pod Security Policy
PodSecurityPolicy is deprecated as of Kubernetes v1.21 and will be removed in v1.25.
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
Removing Juju
Removal terms
There is a distinction between the similar sounding commands unregister, detach, remove, destroy, and kill. These commands are ordered such that their effect increases in severity:
- Unregister means to decouple a resource from a logical entity for the client. The effect is local to the client only and does not affect the logical entity in any way.
- Detach means to decouple a resource from a logical entity (such as an application). The resource will remain available and the underlying cloud resources used by it also remain in place.
- Remove means to cleanly remove a single logical entity. This is a destructive process, meaning the entity will no longer be available via Juju, and any underlying cloud resources used by it will be freed (however, this can often be overridden on a case-by-case basis to leave the underlying cloud resources in place).
- Destroy means to cleanly tear down a logical entity, along with everything within it. This is a very destructive process.
- Kill means to forcibly tear down an unresponsive logical entity, along with everything within it. This is a very destructive process that does not guarantee associated resources are cleaned up.
These command terms/prefixes do not apply to all commands in a generic way. The explanations above are merely intended to convey how a command generally operates and what its severity level is.
Forcing removals
Juju object removal commands do not succeed when there are errors in the multiple steps that are required to remove the underlying object. For instance, a unit will not remove properly if it has a hook error, or a model cannot be removed if application units are in an error state. This is an intentionally conservative approach to the deletion of things.
However, this policy can also be a source of frustration for users in certain situations (i.e. "I don't care, I just want my model gone!"). Because of this, several commands have a --force option.
Furthermore, even when utilising the --force option, the process may take more time than an administrator is willing to accept (i.e. "Just go away as quickly as possible!"). Because of this, several commands that support the --force option have, in addition, support for a --no-wait option.
Caution: The --force and --no-wait options should be regarded as tools to wield as a last resort. Using them introduces a chance of associated parts (e.g., relations) not being cleaned up, which can lead to future problems.
As of v.2.6.1, this is the state of affairs for those commands that support at least the --force option:
command | --force | --no-wait |
---|---|---|
destroy-model | yes | yes |
detach-storage | yes | no |
remove-application | yes | yes |
remove-machine | yes | yes |
remove-offer | yes | no |
remove-relation | yes | no |
remove-storage | yes | no |
remove-unit | yes | yes |
When a command has --force but not --no-wait, this means that the combination of those options simply does not apply.
Errors when running juju deploy
When the juju deploy command fails with an error like the one below:
(base) sungsoo@sungsoo-HP-Z840 ~
$ juju deploy kubeflow --trust
ERROR The charm or bundle "kubeflow" is ambiguous.
Add the charm store prefix for the bundle and run it again, as follows:
(base) sungsoo@sungsoo-HP-Z840 ~
$ juju deploy cs:kubeflow --trust
Juju uninstallation
# Hard reinstall of clients
snap remove --purge juju
rm -rf ~/.local/share/juju
snap install juju --classic
# Hard re-install of controllers or machines needs a bit more
# Gladly juju leaves a helper to do so
$ sudo /usr/sbin/remove-juju-services
KServe: A Robust and Scalable Cloud-Native Model Server
If you are familiar with Kubeflow, you will know KFServing as the platform's model server and inference engine. In September 2021, the KFServing project was transformed into KServe.
Beyond the name change, KServe is now an independent component that has graduated from the Kubeflow project. This separation lets KServe evolve into a standalone cloud-native inference engine built as an independent model server. Tight integration with Kubeflow will of course continue, but KServe is treated and maintained as an independent open-source project.
KServe was developed jointly by Google, IBM, Bloomberg, Nvidia, and Seldon as an open-source, cloud-native model server for Kubernetes. The latest release, 0.8, focused on transitioning the model server into an independent component, with changes to the taxonomy and naming.
Let's look at KServe's core capabilities.
A model server is to machine learning models what an application runtime is to application binaries: both provide the runtime and execution context for a deployment. As a model server, KServe provides the foundation for serving machine learning and deep learning models at scale.
KServe can be deployed as a traditional Kubernetes Deployment or as a serverless deployment with scale-to-zero support. In serverless mode it leverages Knative Serving, which provides automatic scale-up and scale-down. Istio is used as the ingress that exposes service endpoints to API consumers. The combination of Istio and Knative Serving enables interesting scenarios such as blue/green and canary deployments of models.
RawDeployment Mode, which lets you use KServe without Knative Serving, supports traditional scaling mechanisms such as the Horizontal Pod Autoscaler (HPA), but does not support scale-to-zero.
KServe Architecture
The KServe model server has a control plane and a data plane. The control plane manages and reconciles the custom resources responsible for inference. In serverless mode it coordinates with Knative resources to manage autoscaling.
At the center of the KServe control plane is the KServe controller, which manages the lifecycle of an inference service. It is responsible for creating the service, the ingress resources, the model server container, the model agent container for request/response logging and batching, and for pulling models from the model store. The model store is the repository of models registered with the model server, typically an object storage service such as Amazon S3, Google Cloud Storage, Azure Storage, or MinIO.
The data plane manages the request/response cycle for a specific model. It consists of the predictor, transformer, and explainer components.
An AI application sends a REST or gRPC request to the predictor endpoint. The predictor acts as an inference pipeline that invokes the transformer component. The transformer can run pre-processing on inbound data (the request) and post-processing on outbound data (the response). Optionally, there can be an explainer component that provides AI explainability for the hosted model. KServe encourages the use of the V2 protocol for its interoperability and extensibility.
The data plane also has endpoints for checking a model's readiness and liveness, and it provides an API for retrieving model metadata.
Supported Frameworks and Runtimes
KServe supports a wide range of machine learning and deep learning frameworks. For deep learning, it works with existing serving infrastructure such as TensorFlow Serving, TorchServe, and the Triton Inference Server. Through Triton, KServe can host TensorFlow, ONNX, PyTorch, and TensorRT models.
For classical machine learning models based on SKLearn, XGBoost, Spark MLlib, and LightGBM, KServe relies on Seldon's MLServer.
KServe's extensible framework allows any runtime that complies with the V2 inference protocol to be plugged in.
Multi-Model Serving with ModelMesh
KServe deploys one model per InferenceService, which limits the platform's scalability to the available CPUs and GPUs. This limitation becomes obvious when running inference on GPUs, which are expensive and scarce compute resources.
With multi-model serving, you can overcome infrastructure constraints such as compute resources, maximum pods, and maximum IP addresses.
ModelMesh Serving, developed by IBM, is a Kubernetes-based platform for serving ML/DL models in real time, optimized for high-volume, high-density use cases. Much like an operating system manages processes to make optimal use of the available resources, ModelMesh optimizes the deployed models so that they run efficiently within the cluster.
By intelligently managing in-memory model data across the cluster of deployed pods, and by taking into account how those models are used over time, the system makes maximum use of the available cluster resources.
ModelMesh Serving is based on the KServe v2 data plane API and can be deployed as a runtime similar to the NVIDIA Triton Inference Server. When a request reaches the KServe data plane, it is handed off to ModelMesh Serving.
The integration of ModelMesh Serving with KServe is currently in alpha. As both projects mature, the integration will deepen, making it possible to mix and match the features and capabilities of the two platforms.
As model serving becomes a core building block of MLOps, open-source projects such as KServe have grown in importance. KServe is a distinctive model serving platform whose extensibility lets it work with existing and future runtimes.
https://github.com/kserve/kserve
https://www.kubeflow.org/docs/external-add-ons/kserve/kserve/
https://kserve.github.io/website/0.8/
Bypassing Istio dex authentication for KServe
Article Source
These days I am working on MLOps pieces such as Kubeflow at the company. Originally I planned to handle model deployment the existing way, but the data analysis team wanted to speed up the model deployment process, so we decided to use KServe as well. While testing this in an on-premises environment I ran into a dex authentication problem, and this is a short write-up of how I worked around it.
Kubeflow and Istio configuration, official docs
Deploying Kubeflow also deploys Istio and dex. Istio is used to connect services, and dex is used for authentication. If you port-forward Istio and open the Kubeflow dashboard, the first thing you reach is the dex login screen. In other words, connecting through the Istio gateway requires this authentication information.
To deploy KServe in a serverless configuration, Knative must be deployed as well, and Knative in turn uses Istio to connect everything. This is where the problem arises: API requests pass through the Istio gateway and therefore require the authentication information. It would be fine if authentication were required only for connections from outside the cluster, but it is demanded even when connecting through a service from inside the cluster.
Installation
For the Kubeflow deployment, refer to the "MLOps for Everyone" guide.
For the KServe installation, follow the official documentation. Since Istio is already deployed as part of the Kubeflow deployment, skip the Istio installation step.
Problem
First, create a simple pod inside the cluster that does nothing. We will attach to this pod and curl internal services from it, so deploy a pod from an image that has curl installed.
apiVersion: v1
kind: Pod
metadata:
name: myapp-pod
labels:
app: myapp
spec:
containers:
- name: myapp-container
image: curlimages/curl:7.82.0
command: ['sh', '-c', 'echo Hello k8s! && sleep 3600']
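Assuming the manifest above is saved as myapp-pod.yaml (a hypothetical filename), deploy it with:
kubectl apply -f myapp-pod.yaml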
For KServe, deploy the simple iris prediction model from the example on the official site.
apiVersion: "[serving.kserve.io/v1beta1](http://serving.kserve.io/v1beta1)"
kind: "InferenceService"
metadata:
name: "sklearn-iris"
spec:
predictor:
sklearn:
storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
If you list the services, you can see that a service for this model exists.
kubectl get svc -n kserve-test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
sklearn-iris ExternalName <none> knative-local-gateway.istio-system.svc.cluster.local <none> 133m
Now let's send a request to the service by name. First, attach to the pod created earlier.
kubectl exec --stdin --tty myapp-pod -- /bin/sh
Next, create the JSON file from the example and send a request to the service.
cat <<EOF > "./iris-input.json"
{
"instances": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
EOF
curl -v http://sklearn-iris.kserve-test.svc.cluster.local/v1/models/sklearn-iris:predict -d @./iris-input.json
You will probably get a 302 response code along with dex authentication details.
Strictly speaking, you can solve this by including the required dex authentication information in the request. There is even a friendly example in the official repo. You can wrangle it from the CLI as described there, or even log in to the Kubeflow dashboard, grab the session information used there, and send it along in the request header.
But is that enough? The real problem is that every application using Istio now demands this dex information. If the backend team also uses Istio, should they have to generate keys every time just because the machine learning team uses dex? There seem to be issues from people running into similar problems (#1, #2; the first was filed in 2019, but the second was filed only a few days ago).
Cause
Why does this problem occur? First, check the Istio VirtualService information.
kubectl get virtualservices.networking.istio.io --all-namespaces
This shows the VirtualService for dex and the gateway that service uses. You can see that dex is attached to kubeflow-gateway, which Kubeflow uses for authentication.
Now let's check the information for this gateway.
kubectl get gateways.networking.istio.io -n kubeflow kubeflow-gateway -o yaml
spec:
selector:
istio: ingressgateway
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
The selector shows that it uses the default ingress gateway controller. Every gateway that uses this default controller is therefore affected by dex. Let's also check the Knative gateway:
kubectl get gateways.networking.istio.io -n knative-serving knative-local-gateway -o yaml
spec:
selector:
istio: ingressgateway
Likewise, it uses the default controller.
Solution
We need a way to bypass this authentication. I found a method that uses an EnvoyFilter, but it did not work, perhaps because of a version mismatch. If you want to try it, you may need to modify the patch as below.
patch:
operation: MERGE
value:
name: envoy.ext_authz_disabled
typed_per_filter_config:
envoy.ext_authz:
"@type": [type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthzPerRoute](http://type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthzPerRoute)
disabled: true
An approach I found in a GitHub issue did work.
The Istio documentation describes External Authorization. Since dex is already deployed, there is no need to deploy an additional authorizer. First, declare where auth is required in the mesh configmap. Open the configmap:
kubectl edit configmap istio -n istio-system
Then add the dex-related information to it:
extensionProviders:
- name: dex-auth-provider
envoyExtAuthzHttp:
service: "authservice.istio-system.svc.cluster.local"
port: "8080"
includeHeadersInCheck: ["authorization", "cookie", "x-auth-token"]
headersToUpstreamOnAllow: ["kubeflow-userid"]
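Note that in the istio configmap this block belongs under the mesh key (data.mesh). A rough sketch of where it sits, assuming the stock configmap layout, is shown below; the provider entries are the same as above.
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    # ...existing mesh settings stay as they are...
    extensionProviders:
    - name: dex-auth-provider
      envoyExtAuthzHttp:
        service: "authservice.istio-system.svc.cluster.local"
        port: "8080"
        includeHeadersInCheck: ["authorization", "cookie", "x-auth-token"]
        headersToUpstreamOnAllow: ["kubeflow-userid"]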
In the GitHub issue, only the hosts used by Kubeflow are specified, but the current setup does not use separate hosts, so that approach did not work as-is. Instead, I took the approach of excluding the paths used by KServe. Create the following policy.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: dex-auth
namespace: istio-system
spec:
selector:
matchLabels:
istio: ingressgateway
action: CUSTOM
provider:
# The provider name must match the extension provider defined in the mesh config.
name: dex-auth-provider
rules:
# The rules specify when to trigger the external authorizer.
- to:
- operation:
notPaths: ["/v1*"]
Then delete the pre-existing authn-filter and restart istiod.
kubectl delete -n istio-system envoyfilters.networking.istio.io authn-filter
kubectl rollout restart deployment/istiod -n istio-system
Now, if you send the request again from the pod we attached to earlier, you get a normal response with status code 200.
Admittedly, this approach has the drawback that every new path has to be added manually. I have not yet found an elegant way to require authorization only for Kubeflow; I plan to update this post if I find a better method.
KServe Python Server
KServe's python server libraries implement a standardized library that is extended by model serving frameworks such as Scikit Learn, XGBoost and PyTorch. It encapsulates data plane API definitions and storage retrieval for models.
It provides many functionalities, including among others:
- Registering a model and starting the server
- Prediction Handler
- Pre/Post Processing Handler
- Liveness Handler
- Readiness Handlers
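As an illustration, a minimal custom predictor built on this library might look like the sketch below. The class name, model path, and use of joblib are assumptions made for the example, and the Model/ModelServer API may differ slightly between KServe releases.
from typing import Dict

import joblib
import kserve


class IrisModel(kserve.Model):
    def __init__(self, name: str):
        super().__init__(name)
        self.model = None
        self.ready = False

    def load(self):
        # Load the trained artifact; the mount path is an assumption for this sketch.
        self.model = joblib.load("/mnt/models/model.joblib")
        self.ready = True

    def predict(self, request: Dict) -> Dict:
        # "instances" follows the v1 prediction protocol used elsewhere in this issue.
        instances = request["instances"]
        return {"predictions": self.model.predict(instances).tolist()}


if __name__ == "__main__":
    model = IrisModel("sklearn-iris")
    model.load()
    kserve.ModelServer().start([model])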
It supports the following storage providers:
- Google Cloud Storage with a prefix: "gs://"
- By default, it uses GOOGLE_APPLICATION_CREDENTIALS environment variable for user authentication.
- If GOOGLE_APPLICATION_CREDENTIALS is not provided, anonymous client will be used to download the artifacts.
- S3 Compatible Object Storage with a prefix "s3://"
- By default, it uses S3_ENDPOINT, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY environment variables for user authentication.
- Azure Blob Storage with the format: "https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}"
- By default, it uses anonymous client to download the artifacts.
- For example: https://kfserving.blob.core.windows.net/triton/simple_string/
- Local filesystem either without any prefix or with a prefix "file://". For example:
- Absolute path: /absolute/path or file:///absolute/path
- Relative path: relative/path or file://relative/path
- For the local filesystem, we recommend using a relative path without any prefix.
- Persistent Volume Claim (PVC) with the format "pvc://{$pvcname}/[path]".
- The pvcname is the name of the PVC that contains the model.
- The [path] is the relative path to the model on the PVC.
- For example: pvc://mypvcname/model/path/on/pvc
- Generic URI, over either HTTP, prefixed with http:// or HTTPS, prefixed with https://. For example:
- https://<some_url>.com/model.joblib
- http://<some_url>.com/model.joblib
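For instance, to point the earlier sklearn predictor at a model stored on a PVC instead of GCS, only the storageUri changes (the PVC name and path below are hypothetical, following the pattern above):
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-pvc"
spec:
  predictor:
    sklearn:
      storageUri: "pvc://mypvcname/model/path/on/pvc"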
KServe Client
Getting Started
KServe's Python client interacts with the KServe control plane APIs to execute operations on a remote KServe cluster, such as creating, patching, and deleting an InferenceService instance. See the Sample for Python SDK Client to get started.
Documentation for Client API
Class | Method | Description |
---|---|---|
KServeClient | set_credentials | Set Credentials |
KServeClient | create | Create InferenceService |
KServeClient | get | Get or watch the specified InferenceService or all InferenceServices in the namespace |
KServeClient | patch | Patch the specified InferenceService |
KServeClient | replace | Replace the specified InferenceService |
KServeClient | delete | Delete the specified InferenceService |
KServeClient | wait_isvc_ready | Wait for the InferenceService to be ready |
KServeClient | is_isvc_ready | Check if the InferenceService is ready |
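For example, creating the sklearn-iris service from Python and waiting for it to become ready might look like this sketch (based on the KServe 0.8 SDK samples; exact constant and model-class names can vary between SDK releases):
from kubernetes import client
from kserve import (KServeClient, V1beta1InferenceService,
                    V1beta1InferenceServiceSpec, V1beta1PredictorSpec,
                    V1beta1SKLearnSpec, constants)

namespace = "kserve-test"
isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_GROUP + "/v1beta1",
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(name="sklearn-iris", namespace=namespace),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                storage_uri="gs://kfserving-samples/models/sklearn/iris"))))

kserve_client = KServeClient()
kserve_client.create(isvc)                                           # create the InferenceService
kserve_client.wait_isvc_ready("sklearn-iris", namespace=namespace)   # block until READY is True
print(kserve_client.get("sklearn-iris", namespace=namespace))        # inspect the resulting object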
KServe Installation and Example
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml
Related installation failure case
KServe Installation Log
This document describes the log for KServe installation and testing.
Installation
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml
Check pod status of KServe controller
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7f9c69c78c-tgwrz 1/1 Running 0 25h
... (intermediate output omitted)
cert-manager cert-manager-b4d6fd99b-m6l64 1/1 Running 0 22m
cert-manager cert-manager-cainjector-74bfccdfdf-wp5t4 1/1 Running 0 22m
cert-manager cert-manager-webhook-65b766b5f8-s7lpj 1/1 Running 0 22m
kserve kserve-controller-manager-0 2/2 Running 4 11m
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl get pods -n kserve
NAME READY STATUS RESTARTS AGE
kserve-controller-manager-0 2/2 Running 1 3m46s
KServe Inference Service Example
1. Create test InferenceService
The following YAML file (iris-sklearn.yaml) describes the InferenceService for the sklearn-based iris model.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
name: "sklearn-iris"
spec:
predictor:
sklearn:
storageUri: "gs://kfserving-samples/models/sklearn/iris"
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl apply -f iris-sklearn.yaml -n traindb-ml
inferenceservice.serving.kserve.io/sklearn-iris created
2. Check InferenceService status.
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ k get inferenceservices -A
NAMESPACE NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
traindb-ml sklearn-iris 108s
Knative and microk8s
Article Source
Install multipass
brew install multipass
Install hyperkit or qemu; do not use VirtualBox, as it doesn't allow access from the host network bridge by default.
For qemu, install libvirt and set it as the default driver:
brew install libvirt
sudo multipass set local.driver=qemu
For hyperkit, install hyperkit and set it as the default driver:
brew install hyperkit
sudo multipass set local.driver=hyperkit
Using multipass, create a new Ubuntu VM with 3 CPUs, 2 GB of RAM, and 8 GB of disk:
multipass launch -n knative -c 3 -m 2G -d 8G
Set the primary name to knative to avoid always typing the name of the VM:
multipass set client.primary-name=knative
Login into the vm
multipass shell
Install microk8s (https://microk8s.io/docs/getting-started) or from github/ubuntu/microk8s
sudo snap install microk8s --classic
Join the group microk8s
sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube
Logout to refresh groups
exit
Login into the vm again
multipass shell
Check status
microk8s status --wait-ready
Check access
microk8s kubectl get nodes
Set alias
alias kubectl='microk8s kubectl'
alias k='kubectl'
Enable dns
microk8s enable dns
Install Knative Serving from knative.dev
TLDR;
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-core.yaml
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.2.0/kourier.yaml
kubectl patch configmap/config-network \
--namespace knative-serving \
--type merge \
--patch '{"data":{"ingress.class":"kourier.ingress.networking.knative.dev"}}'
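As a quick sanity check (not part of the original write-up), you can read the key back and confirm it points at Kourier; note the backslash escape for the dotted key in jsonpath:
kubectl get configmap/config-network --namespace knative-serving \
  -o jsonpath='{.data.ingress\.class}'
# should print: kourier.ingress.networking.knative.dev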
Check the status of the knative network layer load balancer
kubectl --namespace kourier-system get service kourier
If the EXTERNAL-IP is stuck in pending, then you need a load balancer in your Kubernetes cluster.
You can use the metallb addon with a small range of IP addresses; use ip a to inspect the IP address currently assigned, and assign IPs on the same subnet:
microk8s enable metallb:192.168.205.250-192.168.205.254
Yes, I know this is a hack, but it allows me to access the cluster from the host macOS.
Check again
kubectl --namespace kourier-system get service kourier
Output should look like this
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kourier LoadBalancer 10.152.183.31 192.168.205.16 80:32564/TCP,443:32075/TCP 7m17s
Check knative is up
kubectl get pods -n knative-serving
Configure Knative DNS
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-default-domain.yaml
Install the kn CLI:
sudo curl -o /usr/local/bin/kn -sL https://github.com/knative/client/releases/download/knative-v1.2.0/kn-linux-amd64
sudo chmod +x /usr/local/bin/kn
Copy the kubeconfig to $HOME/.kube/config
microk8s config > $HOME/.kube/config
Create your first knative service
kn service create nginx --image nginx --port 80
Get the url of your new service
kn service describe nginx -o url
Curl the url
curl $(kn service describe nginx -o url)
You should see the nginx output:
Thank you for using nginx.
List the pods for your service
kubectl get pods
After a minute your pod should be deleted automatically (i.e., scale to zero):
NAME READY STATUS RESTARTS AGE
nginx-00001-deployment-5c94d6d769-ssnc7 2/2 Terminating 0 83s
Access the url again
curl $(kn service describe nginx -o url)
Istio Installation
Attempt 1
Try a simple installation using istioctl.
An error occurred during the Istio installation.
(base) sungsoo@sungsoo-HP-Z840 ~
$ istioctl install
This will install the Istio 1.14.1 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✘ Ingress gateways encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
Deployment/istio-system/istio-ingressgateway (containers with unready status: [istio-proxy])
- Pruning removed resources Error: failed to install manifests: errors occurred during operation
Attempt 2
Remove Istio via microk8s.disable, then reinstall with istioctl.
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ microk8s.disable istio
Disabling Istio
Error from server (NotFound): namespaces "istio-system" not found
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ istioctl install
This will install the Istio 1.14.1 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ Installation complete Making this installation the default for injection and validation.
Thank you for installing Istio 1.14. Please take a few minutes to tell us about your install/upgrade experience! https://forms.gle/yEtCbt45FZ3VoDT5A
Attempt 3
Istio Ingress gateway validation
Let's check whether the installation succeeded.
Check that the Istio objects loaded correctly into the "istio-system" namespace.
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
istiod-6d67d84bc7-dbzbk 1/1 Running 0 5m59s
istio-ingressgateway-778f44479-rq4j4 1/1 Running 0 5m51s
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ kubectl get services -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod ClusterIP 10.152.183.182 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 6m18s
istio-ingressgateway LoadBalancer 10.152.183.49 10.64.140.45 15021:31348/TCP,80:31128/TCP,443:32300/TCP 6m10s
Kubernetes: microk8s with multiple Istio ingress gateways
Article Source
microk8s has convenient out-of-the-box support for MetalLB and an NGINX ingress controller. But microk8s is also perfectly capable of handling Istio operators, gateways, and virtual services if you want the advanced policy, security, and observability offered by Istio.
In this article, we will install the Istio Operator, and allow it to create the Istio Ingress gateway service. We follow that up by creating an Istio Gateway in the default namespace, then create a Deployment and VirtualService projecting unto the Istio Gateway.
To exercise an even more advanced scenario, we will install both a primary and secondary Istio Ingress gateway, each tied to a different MetalLB IP address. This can emulate serving your public customers one set of services, and serving a different set of administrative applications to a private internal network for employees.
This article builds off my previous article where we built a microk8s cluster using Ansible. There are many steps required for Istio setup, so I have wrapped this up into Ansible roles.
Prerequisites
This article builds off my previous article where we built a microk8s cluster using Ansible. If you used Terraform as described to create the microk8s-1 host, you already have an additional 2 network interfaces on the master microk8s-1 host (ens4=192.168.1.141 and ens5=192.168.1.142).
However, a microk8s cluster is not required. You can run the steps in this article on a single microk8s node. But you MUST have an additional two network interfaces and IP addresses on the same network as your host (e.g. 192.168.1.0/24) for the MetalLB endpoints.
Istio Playbook
From the previous article, your last step was running the playbook that deployed a microk8s cluster, playbook_microk8s.yml.
We need to build on top of that and install the Istio Operator, Istio ingress gateway Service, Istio Gateway, and test Virtual Service and Deployment. Run this playbook.
ansible-playbook playbook_metallb_primary_secondary_istio.yml
At the successful completion of this playbook run, you will have Istio installed, two Istio Ingress services, two Istio Gateways, and two independent versions of the sample helloworld deployment served up using different endpoints and certificates.
The playbook does TLS validation using curl as a success criteria. However, it is beneficial for learning to step through the objects created and then execute a smoke test of the TLS endpoints manually. The rest of this article is devoted to these manual validations.
MetalLB validation
View the MetalLB objects.
$ kubectl get all -n metallb-system
NAME READY STATUS RESTARTS AGE
pod/speaker-9xzlc 1/1 Running 0 64m
pod/speaker-dts5k 1/1 Running 0 64m
pod/speaker-r8kck 1/1 Running 0 64m
pod/controller-559b68bfd8-mtl2s 1/1 Running 0 64m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/speaker 3 3 3 3 3 beta.kubernetes.io/os=linux 64m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/controller 1/1 1 1 64m
NAME DESIRED CURRENT READY AGE
replicaset.apps/controller-559b68bfd8 1 1 1 64m
Show the MetalLB configmap with the IP used.
$ kubectl get configmap/config -n metallb-system -o yaml
apiVersion: v1
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 192.168.1.141-192.168.1.142
kind: ConfigMap
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: ....
creationTimestamp: "2021-07-31T10:07:56Z"
name: config
namespace: metallb-system
resourceVersion: "38015"
selfLink: /api/v1/namespaces/metallb-system/configmaps/config
uid: 234ad41d-cfde-4bf5-990e-627f74744aad
Istio Operator validation
View the Istio Operator objects in the โistio-operatorโ namespace.
$ kubectl get all -n istio-operator
NAME READY STATUS RESTARTS AGE
pod/istio-operator-1-9-7-5d47654878-jh5sr 1/1 Running 1 65m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/istio-operator-1-9-7 ClusterIP 10.152.183.120 8383/TCP 65m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/istio-operator-1-9-7 1/1 1 1 65m
NAME DESIRED CURRENT READY AGE
replicaset.apps/istio-operator-1-9-7-5d47654878 1 1 1 65m
The Operator should be "Running"; now check the Istio Operator logs for errors.
$ kubectl logs --since=15m -n istio-operator $(kubectl get pods -n istio-operator -lname=istio-operator -o jsonpath="{.items[0].metadata.name}")
...
- Processing resources for Ingress gateways.
✔ Ingress gateways installed
...
Istio Ingress gateway validation
View the Istio objects in the โistio-systemโ namespace. These are objects that the Istio Operator has created.
$ kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
istiod-1-9-7-656bdccc78-rr8hf 1/1 Running 0 95m
istio-ingressgateway-b9b6fb6d8-d8fbp 1/1 Running 0 94m
istio-ingressgateway-secondary-76db9f9f7b-2zkcl 1/1 Running 0 94m
$ kubectl get services -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-9-7 ClusterIP 10.152.183.198 15010/TCP,15012/TCP,443/TCP,15014/TCP 95m
istio-ingressgateway LoadBalancer 10.152.183.92 192.168.1.141 15021:31471/TCP,80:32600/TCP,443:32601/TCP,31400:32239/TCP,15443:30571/TCP 94m
istio-ingressgateway-secondary LoadBalancer 10.152.183.29 192.168.1.142 15021:30982/TCP,80:32700/TCP,443:32701/TCP,31400:31575/TCP,15443:31114/TCP 94m
Notice we have purposely created two Istio ingress gateways: one for our primary access (such as public customer traffic), and the other to mimic secondary access (perhaps for employee-only management access).
In the services, you will see reference to our MetalLB IP endpoints which is how we will ultimately reach the services projected unto these gateways.
Service and Deployment validation
Istio has an example app called helloworld. Our Ansible created two independent deployments that could be projected unto the two Istio Gateways.
Let's validate these deployments by testing access to the pods and services, without any involvement by Istio.
Service=helloworld, Deployment=helloworld-v1
Service=helloworld2, Deployment=helloworld-v2
To reach the internal pod and service IP addresses, we need to be inside the cluster itself so we ssh into the master before running these commands:
ssh -i tf-libvirt/id_rsa ubuntu@192.168.122.210
Let's view the deployments, pods, and then services for these two independent applications.
$ kubectl get deployments -n default
NAME READY UP-TO-DATE AVAILABLE AGE
helloworld2-v2 1/1 1 1 112m
helloworld-v1 1/1 1 1 112m
$ kubectl get pods -n default -l 'app in (helloworld,helloworld2)'
NAME READY STATUS RESTARTS AGE
helloworld2-v2-749cc8dc6d-6kbh7 2/2 Running 0 110m
helloworld-v1-776f57d5f6-4gvp7 2/2 Running 0 109m
$ kubectl get services -n default -l 'app in (helloworld,helloworld2)'
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
helloworld2 ClusterIP 10.152.183.251 5000/TCP 113m
helloworld ClusterIP 10.152.183.187 5000/TCP 113m
First, let's pull from the private pod IP directly.
# internal ip of primary pod
$ primaryPodIP=$(microk8s kubectl get pods -l app=helloworld -o=jsonpath="{.items[0].status.podIPs[0].ip}")
# internal IP of secondary pod
$ secondaryPodIP=$(microk8s kubectl get pods -l app=helloworld2 -o=jsonpath="{.items[0].status.podIPs[0].ip}")
# check pod using internal IP
$ curl http://${primaryPodIP}:5000/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7
# check pod using internal IP
$ curl http://${secondaryPodIP}:5000/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7
With internal pod IP proven out, move up to the Cluster IP defined at the Service level.
# IP of primary service
$ primaryServiceIP=$(microk8s kubectl get service/helloworld -o=jsonpath="{.spec.clusterIP}")
# IP of secondary service
$ secondaryServiceIP=$(microk8s kubectl get service/helloworld2 -o=jsonpath="{.spec.clusterIP}")
# check primary service
$ curl http://${primaryServiceIP}:5000/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7
# check secondary service
$ curl http://${secondaryServiceIP}:5000/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7
These validations proved out the pod and service independent of the Istio Gateway or VirtualService. Notice all these were using insecure HTTP on port 5000, because TLS is layered on top by Istio.
Exit the cluster ssh session before continuing.
exit
Validate TLS certs
The Ansible scripts created a custom CA and then key+certificates for "microk8s.local" and "microk8s-secondary.local". These are located in the /tmp directory of the microk8s-1 host.
These will be used by the Istio Gateway and VirtualService for secure TLS.
# show primary cert info
$ openssl x509 -in /tmp/microk8s.local.crt -text -noout | grep -E "CN |DNS"
Issuer: CN = myCA.local
Subject: CN = microk8s.local
DNS:microk8s.local, DNS:microk8s-alt.local
# show secondary cert info
$ openssl x509 -in /tmp/microk8s-secondary.local.crt -text -noout | grep -E "CN |DNS"
Issuer: CN = myCA.local
Subject: CN = microk8s-secondary.local
DNS:microk8s-secondary.local
Validate Kubernetes TLS secrets
The keys and certificates will not be used by Istio unless they are loaded as Kubernetes secrets available to the Istio Gateway.
# primary tls secret for 'microk8s.local'
$ kubectl get -n default secret tls-credential
NAME TYPE DATA AGE
tls-credential kubernetes.io/tls 2 10h
# primary tls secret for 'microk8s-secondary.local'
$ kubectl get -n default secret tls-secondary-credential
NAME TYPE DATA AGE
tls-secondary-credential kubernetes.io/tls 2 10h
# if needed, you can pull the actual certificate from the secret
# it requires a backslash escape for 'tls.crt'
$ kubectl get -n default secret tls-credential -o jsonpath="{.data.tls\.crt}" | base64 --decode
Validate Istio Gateway
The Istio Gateway object is the entity that uses the Kubernetes TLS secrets shown above.
$ kubectl get -n default gateway
NAME AGE
gateway-ingressgateway-secondary 3h2m
gateway-ingressgateway 3h2m
Digging into the details of the Gateway object, we can see the host name it will be processing as well as the kubernetes tls secret it is using.
# show primary gateway
$ kubectl get -n default gateway/gateway-ingressgateway -o jsonpath="{.spec.servers}" | jq
[
{
"hosts": [
"microk8s.local",
"microk8s-alt.local"
],
"port": {
"name": "http",
"number": 80,
"protocol": "HTTP"
}
},
{
"hosts": [
"microk8s.local",
"microk8s-alt.local"
],
"port": {
"name": "https",
"number": 443,
"protocol": "HTTPS"
},
"tls": {
"credentialName": "tls-credential",
"mode": "SIMPLE"
}
}
]
# show secondary gateway
$ kubectl get -n default gateway/gateway-ingressgateway-secondary -o jsonpath="{.spec.servers}" | jq
[
{
"hosts": [
"microk8s-secondary.local"
],
"port": {
"name": "http-secondary",
"number": 80,
"protocol": "HTTP"
}
},
{
"hosts": [
"microk8s-secondary.local"
],
"port": {
"name": "https-secondary",
"number": 443,
"protocol": "HTTPS"
},
"tls": {
"credentialName": "tls-secondary-credential",
"mode": "SIMPLE"
}
}
]
Notice the first Gateway uses the "tls-credential" secret, while the second uses "tls-secondary-credential".
Validate VirtualService
The bridge that creates the relationship between the purely Istio objects (istio-system/ingressgateway,default/Gateway) and the application objects (pod,deployment,service) is the VirtualService.
This VirtualService is how the application is projected unto a specific Istio Gateway.
$ kubectl get -n default virtualservice
NAME GATEWAYS HOSTS AGE
hello-v2-on-gateway-ingressgateway-secondary ["gateway-ingressgateway-secondary"] ["microk8s-secondary.local"] 3h14m
hello-v1-on-gateway-ingressgateway ["gateway-ingressgateway"] ["microk8s.local","microk8s-alt.local"] 3h14m
Digging down into the VirtualService, you can see it lists the application's route, port, path, the expected HTTP Host header, and the Istio gateway to project unto.
# show primary VirtualService
$ kubectl get -n default virtualservice/hello-v1-on-gateway-ingressgateway -o jsonpath="{.spec}" | jq
{
"gateways": [
"gateway-ingressgateway"
],
"hosts": [
"microk8s.local",
"microk8s-alt.local"
],
"http": [
{
"match": [
{
"uri": {
"exact": "/hello"
}
}
],
"route": [
{
"destination": {
"host": "helloworld",
"port": {
"number": 5000
}
}
}
]
}
]
}
# show secondary VirtualService
$ kubectl get -n default virtualservice/hello-v2-on-gateway-ingressgateway-secondary -o jsonpath="{.spec}" | jq
{
"gateways": [
"gateway-ingressgateway-secondary"
],
"hosts": [
"microk8s-secondary.local"
],
"http": [
{
"match": [
{
"uri": {
"exact": "/hello"
}
}
],
"route": [
{
"destination": {
"host": "helloworld2",
"port": {
"number": 5000
}
}
}
]
}
]
}
Validate URL endpoints
With the validation of all the dependent objects complete, you can now run the ultimate test which is to run an HTTPS against the TLS secured endpoints.
The Gateway requires that the proper FQDN headers be sent by your browser, so it is not sufficient to do a GET against the MetalLB IP addresses. The ansible scripts should have already created entries in the local /etc/hosts file so we can use the FQDN.
# validate that /etc/hosts has entries for URL
$ grep '\.local' /etc/hosts
192.168.1.141 microk8s.local
192.168.1.142 microk8s-secondary.local
# test primary gateway
# we use '-k' because the CA cert has not been loaded at the OS level
$ curl -k https://microk8s.local/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7
# test secondary gateway
$ curl -k https://microk8s-secondary.local/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7
Notice from the /etc/hosts entries, we have entries corresponding the MetalLB endpoints. The tie between the MetalLB IP addresses and the Istio ingress gateway objects was shown earlier, but for convenience is below.
# tie between MetalLB and Istio Ingress Gateways
$ kubectl get -n istio-system services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-9-7 ClusterIP 10.152.183.198 15010/TCP,15012/TCP,443/TCP,15014/TCP 3h30m
istio-ingressgateway LoadBalancer 10.152.183.92 192.168.1.141 15021:31471/TCP,80:32600/TCP,443:32601/TCP,31400:32239/TCP,15443:30571/TCP 3h30m
istio-ingressgateway-secondary LoadBalancer 10.152.183.29 192.168.1.142 15021:30982/TCP,80:32700/TCP,443:32701/TCP,31400:31575/TCP,15443:31114/TCP 3h30m
Validate URL endpoints remotely
These same requests can be made from your host machine as well, since the MetalLB endpoints are on the same network as your host (all our actions so far have been from inside the microk8s-1 host). But the Istio Gateway expects a proper HTTP Host header, so you have several options:
- Enable DNS lookup from your host upstream (router)
- Add the "microk8s.local" and "microk8s-secondary.local" entries to your local /etc/hosts file
- Or use the curl --resolve flag to specify the FQDN-to-IP mapping, which will send the Host header correctly
I've provided a script that you can run from the host for validation:
./test-istio-endpoints.sh
Conclusion
Using this concept of multiple ingress, you can isolate traffic to different source networks, customers, and services.
REFERENCES
- metallb
- fabianlee github, microk8s-nginx-istio repo
- istio, getting started
- istio, installing
- istio, helloworld source for istio
- dockerhub, helloworldv1 and helloworldv2 (https://hub.docker.com/r/istio/examples-helloworld-v2) images
- rob.salmond.ca, good explanation of Istio ingress gateway versus Istio Gateway and its usage
- kubernetes.io, list of different ingress controllers
- stackoverflow, diagrams of istiod, istio proxy, and ingress and egress controllers
- pavan kumar, weighted routing with istio
- pavan kumar, mtls in istio showing access with kiali
Microk8s puts up its Istio and sails away
Article Source
Istio almost immediately strikes you as enterprise grade software. Not so much because of the complexity it introduces, but more because of the features it adds to your service mesh. Must-have features packaged together in a coherent framework:
- Traffic Management
- Security Policies
- Telemetry
- Performance Tuning
Since microk8s positions itself as the local Kubernetes cluster developers prototype on, it is no surprise that deployment of Istio is made dead simple. Let's start with the microk8s deployment itself:
> sudo snap install microk8s --classic
Istio deployment available with:
> microk8s.enable istio
There is a single question that we need to respond to at this point. Do we want to enforce mutual TLS authentication among sidecars? Istio places a proxy next to your services so as to take control over routing, security, etc. If we know we have a mixed deployment with non-Istio and Istio-enabled services, we would rather not enforce mutual TLS:
> microk8s.enable istio
Enabling Istio
Enabling DNS
Applying manifest
service/kube-dns created
serviceaccount/kube-dns created
configmap/kube-dns created
deployment.extensions/kube-dns created
Restarting kubelet
DNS is enabled
Enforce mutual TLS authentication (https://bit.ly/2KB4j04) between sidecars? If unsure, choose N. (y/N): y
Believe it or not, we are done; Istio v1.0 services are being set up. You can check the deployment progress with:
> watch microk8s.kubectl get all --all-namespaces
We have packaged istioctl in microk8s for your convenience:
> microk8s.istioctl get all --all-namespaces
NAME KIND NAMESPACE AGE
grafana-ports-mtls-disabled Policy.authentication.istio.io.v1alpha1 istio-system 2m
DESTINATION-RULE NAME HOST SUBSETS NAMESPACE AGE
istio-policy istio-policy.istio-system.svc.cluster.local istio-system 3m
istio-telemetry istio-telemetry.istio-system.svc.cluster.local istio-system 3m
GATEWAY NAME HOSTS NAMESPACE AGE
istio-autogenerated-k8s-ingress * istio-system 3m
Do not get scared by the number of services and deployments; everything is under the istio-system namespace. We are ready to start exploring!
Demo Time!
Istio needs to inject sidecars into the pods of your deployment. In microk8s auto-injection is supported, so the only thing you have to do is label the namespace you will be using with istio-injection=enabled:
> microk8s.kubectl label namespace default istio-injection=enabled
Let's now grab the bookinfo example from the v1.0 Istio release and apply it:
> wget https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/platform/kube/bookinfo.yaml
> microk8s.kubectl create -f bookinfo.yaml
The following services should be available soon:
> microk8s.kubectl get svc
NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)
details       ClusterIP   10.152.183.33    <none>        9080/TCP
kubernetes    ClusterIP   10.152.183.1     <none>        443/TCP
productpage   ClusterIP   10.152.183.59    <none>        9080/TCP
ratings       ClusterIP   10.152.183.124   <none>        9080/TCP
reviews       ClusterIP   10.152.183.9     <none>        9080/TCP
We can reach the services using their ClusterIP; for example, we can get to the productpage in the above example by pointing our browser to 10.152.183.59:9080. But let's play by the rules and follow the official instructions on exposing the services via NodePort:
> wget https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/bookinfo-gateway.yaml
> microk8s.kubectl create -f bookinfo-gateway.yaml
To get to the productpage through ingress we shamelessly copy the example instructions:
> microk8s.kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'
31380
And our node is the localhost so we can point our browser to http://localhost:31380/productpage
Show me some graphs!
Of course graphs look nice in a blog post, so here you go.
You will need to grab the ClusterIP of the Grafana service:
microk8s.kubectl -n istio-system get svc grafana
Prometheus is also available in the same way.
microk8s.kubectl -n istio-system get svc prometheus
And for traces you will need to look at the jaeger-query service.
microk8s.kubectl -n istio-system get service/jaeger-query
The servicegraph endpoint is available with:
microk8s.kubectl -n istio-system get svc servicegraph
I should stop here. Go and check out the Istio documentation for more details on how to take advantage of what Istio is offering.
What to keep from this post
- There is great value in Istio. It's a framework for preparing Kubernetes for the enterprise.
- Microk8s can get you up and running quickly. Drop us a line with what you want to see improved.
- Do not be afraid to fail. A shipwreck can have more value than a sailing ship.
Verifying the KServe installation
After installing KServe using the KServe Quick Start (quick_install.sh), verify that the installation succeeded.
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ k get pod -n kserve
NAME READY STATUS RESTARTS AGE
kserve-controller-manager-0 2/2 Running 0 3d21h
Working with microk8s
Resetting microk8s
(pytorch) sungsoo@sungsoo-HP-Z840 ~
$ microk8s reset
Disabling all addons.
Disabling addon : ambassador
Disabling addon : cilium
Disabling addon : dashboard
Disabling addon : dns
Disabling addon : fluentd
Disabling addon : gpu
Disabling addon : helm
Disabling addon : helm3
Disabling addon : host-access
Disabling addon : ingress
Disabling addon : istio
Disabling addon : jaeger
...
Serverless Installation Guide
KServe Serverless installation enables autoscaling based on request volume and supports scale down to and from zero. It also supports revision management and canary rollout based on revisions.
Kubernetes 1.20 is the minimum required version; please check the table below for the recommended Knative and Istio versions for your Kubernetes version.
Recommended Version Matrix
Kubernetes Version | Recommended Istio Version | Recommended Knative Version |
---|---|---|
1.20 | 1.9, 1.10, 1.11 | 0.25, 0.26, 1.0 |
1.21 | 1.10, 1.11 | 0.25, 0.26, 1.0 |
1.22 | 1.11, 1.12 | 0.25, 0.26, 1.0 |
KServe setup and testing (starting from 5 July)
Prerequisites
- Microk8s with Kubeflow: you have an installed version of Kubeflow.
- Installation Guide: Quick start guide to Kubeflow
- Fundamental concepts of Kubeflow, Istio, Knative, and KServe (formerly KFServing)
- You need to understand the following core concepts related to model serving in Kubeflow.
- Since we can't delve deeply into every topic, we would like to provide you with a short list of our favorite primers on Kubeflow, especially on serving topics.
- Kubeflow for Machine Learning - Chapter 8
- Kubeflow Operations Guide - Chapter 8
0. Installing Kubeflow
We assume that you have already installed Kubeflow by using the following guide.
- Installation Guide: Quick start guide to Kubeflow
1. KServe Installation
- Install Istio
  Please refer to the Istio install guide.
- Install Knative Serving
  Please refer to the Knative Serving install guide.
Note If you are looking to use PodSpec fields such as nodeSelector, affinity or tolerations which are now supported in the v1beta1 API spec, you need to turn on the corresponding feature flags in your Knative configuration.
- Install Cert Manager
The minimum required Cert Manager version is 1.3.0; refer to the Cert Manager documentation.
Note: Cert Manager is required to provision webhook certs for a production-grade installation; alternatively, you can run the self-signed certs generation script.
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve-runtimes.yaml
Note ClusterServingRuntimes are required to create InferenceService for built-in model serving runtimes with KServe v0.8.0 or higher.
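As a quick check after applying kserve-runtimes.yaml, you can list the runtimes that were registered (resource name per the v0.8 CRD; the exact set of entries depends on the release):
kubectl get clusterservingruntimes
# expect entries such as kserve-sklearnserver, kserve-xgbserver, kserve-tritonserver, ...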
When errors occur after reinstalling microk8s
After reinstalling microk8s, the following error appears when trying to install Istio.
(base) sungsoo@z840 ~/kubeflow/istio-1.11.0
$ bin/istioctl install
Error: fetch Kubernetes config file: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
Run the following command to refresh the kubeconfig:
(base) sungsoo@z840 ~/kubeflow/istio-1.11.0
$ microk8s config > ~/.kube/config
I wanted to bypass Dex when accessing Inference Services from the outside.
In my case it was necessary to deploy an additional policy, otherwise there was no access:
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: allow-inference-services
namespace: istio-system
spec:
selector:
matchLabels:
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
- to:
- operation:
methods: ["POST"]
paths: ["/v1*"]
Also, these actions seem to lead to future crashes:
kubeflow/manifests#2309 (comment)